Graph RAG Reporting

Learn how to build a Graph RAG reporting application using the Perseus client.

The full code for this example is available on GitHub.

This project uses the Perseus client to build a Graph RAG (Retrieval-Augmented Generation) reporting application. It converts unstructured PDF documents into a structured knowledge graph, then uses that graph to generate context-aware reports.

The Challenge: Extracting Actionable Insights from Complex Documents

Large corporate documents, such as annual reports and press releases, contain a wealth of information. However, extracting specific data points, understanding relationships between entities, and generating summarized reports can be a time-consuming and manual process. Traditional keyword-based search often lacks the semantic understanding required for deep analysis.

This leads to recurring issues for analysts and decision-makers:

  • Information Overload: Sifting through hundreds of pages to find specific details.
  • Lack of Context: Struggling to understand the "why" and "how" behind reported figures.
  • Manual Reporting: Generating comprehensive reports requires significant manual effort and aggregation.

The Solution: Graph RAG with Perseus

This project leverages the Perseus Text-to-Graph engine to transform unstructured and semi-structured documents into a queryable knowledge graph. By combining knowledge graphs with Retrieval Augmented Generation (RAG), we can achieve more accurate, relevant, and context-aware report generation.

Why Graph RAG is crucial:

  • Structured Knowledge: Information is extracted and stored as entities and relationships, making it easily queryable.
  • Semantic Understanding: The graph preserves the context and relationships between data points, allowing for more nuanced retrieval.
  • Enhanced Retrieval: RAG models can query the graph to retrieve highly relevant information, leading to more accurate generations.
  • Dynamic Reporting: Generate reports on demand by asking natural language questions against the knowledge graph.

The Demonstration: L'Oréal Annual Report 2024

We process L'Oréal's 2024 Annual Report (PDF) to build a knowledge graph and then generate financial performance reports.

From Text to Structured Graph: An Example

Here’s how a section from a report is transformed into a structured entity within the knowledge graph, which can then be used for RAG.

📄 Source Text (LOREAL_Rapport_Annuel_2024.md):

L'Oréal s'appuie sur **37 marques internationales** réparties en 4 divisions :
**Produits Professionnels :** Kérastase, L'Oréal Professionnel, Redken, etc. Lancement du sèche-cheveu AirLight Pro.
...

🔍 Extracted Entity (in the Knowledge Graph):

type: BusinessDivision
label: Produits Professionnels
hasBrand -> Kérastase
hasBrand -> Matrix
hasBrand -> Redken
hasBrand -> L’Oréal Professionnel

This structured output forms the basis for rich, context-aware retrieval.
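One way to picture the extracted entity is as a set of subject-predicate-object triples. The sketch below is a plain-Python illustration only: the tuple layout and the `brands_of` helper are assumptions made for this example, not the Perseus output format or API.

```python
# Hypothetical in-memory view of the entity above as (subject, predicate, object) triples.
# The real graph is stored as RDF (.ttl) and in Neo4j; this layout is only illustrative.
TRIPLES = [
    ("Produits Professionnels", "type", "BusinessDivision"),
    ("Produits Professionnels", "hasBrand", "Kérastase"),
    ("Produits Professionnels", "hasBrand", "Matrix"),
    ("Produits Professionnels", "hasBrand", "Redken"),
    ("Produits Professionnels", "hasBrand", "L'Oréal Professionnel"),
]


def brands_of(division, triples):
    """Return the brands linked to a division through hasBrand, in insertion order."""
    return [o for s, p, o in triples if s == division and p == "hasBrand"]


if __name__ == "__main__":
    print(brands_of("Produits Professionnels", TRIPLES))
```

Because relationships are explicit, a question such as "which brands belong to this division?" becomes a filter over edges rather than a text search.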



Powering the RAG: The simple-graph-retriever Package

Once the knowledge graph has been created by the Perseus client and loaded into Neo4j, we need a way to search it for context relevant to a user's query. This is the role of simple-graph-retriever, a package that bridges the graph database and the language model.

simple-graph-retriever is a small, experimental project that demonstrates a basic approach to graph retrieval for RAG. It illustrates the core concepts, but the implementation is naive.

To learn more, visit the simple-graph-retriever repository on GitHub.

It performs two key operations in this workflow:

  • Graph Indexing (index.py): The index.py script uses this package to process the graph in Neo4j. It applies the Leiden algorithm to identify coherent communities of related nodes within the graph. These communities are then broken down into smaller "chunks," which are embedded and indexed as vectors in the Qdrant database. This creates a searchable semantic layer on top of the graph.

  • Graph Retrieval (report.py): The report.py script uses the package to take a natural language query (e.g., "Money KPIs"), search the Qdrant index for the most relevant chunks, and retrieve the corresponding subgraphs from Neo4j. This highly relevant, structured context is then passed to the LLM to generate an accurate report.
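The two operations above can be sketched end to end in plain Python. This is a toy illustration under loud assumptions, not the simple-graph-retriever code: connected components stand in for the Leiden algorithm, a bag-of-words counter stands in for the embedder service, an in-memory list stands in for Qdrant, and the triples (including their predicate names) are made up for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Illustrative triples; predicate names are hypothetical.
TRIPLES = [
    ("Produits Professionnels", "hasBrand", "Kérastase"),
    ("Produits Professionnels", "hasBrand", "Redken"),
    ("L'Oréal", "hasRevenue", "43.48 billion euros"),
    ("L'Oréal", "hasOperatingMargin", "20 percent"),
]


def communities(triples):
    """Group triples by connected component (a simple stand-in for Leiden)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, _, o in triples:
        parent[find(s)] = find(o)

    groups = defaultdict(list)
    for t in triples:
        groups[find(t[0])].append(t)
    return list(groups.values())


def chunk(community, size=2):
    """Split a community into text chunks of at most `size` triples."""
    return [
        " . ".join(f"{s} {p} {o}" for s, p, o in community[i:i + size])
        for i in range(0, len(community), size)
    ]


def embed(text):
    """Bag-of-words vector (a stand-in for a real embedding model)."""
    # Split camelCase so "hasRevenue" matches the query word "revenue".
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(triples):
    """Indexing step: communities -> chunks -> (vector, text) pairs."""
    return [(embed(c), c) for com in communities(triples) for c in chunk(com)]


def retrieve(query, index, top_k=1):
    """Retrieval step: rank chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]


if __name__ == "__main__":
    print(retrieve("operating margin and revenue", build_index(TRIPLES)))
```

In the real workflow the retrieved chunks point back to subgraphs in Neo4j; here the chunk text itself plays that role to keep the sketch self-contained.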


The Workflow at a Glance

Download the SDK

First, get the Perseus SDK and navigate to the example folder for Graph RAG reporting:

git clone https://github.com/Lettria/perseus-client.git
cd perseus-client/examples/advanced/graph-rag-reporting

This folder contains all the necessary templates and scripts to follow along with this tutorial.

Setup Environment

  • Requires Docker, Docker Compose, and Python 3.8+. 🐳🐍
  • Copy template.env to .env and fill in your Perseus API key.
    cp template.env .env
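Before starting the services, it can help to confirm that the key was actually filled in. The sketch below parses a simple KEY=VALUE file with the standard library; the variable name `PERSEUS_API_KEY` is an assumption for illustration, so check template.env for the name the example really uses.

```python
from pathlib import Path

# Assumed variable name; check template.env for the name the example actually uses.
API_KEY_VAR = "PERSEUS_API_KEY"


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env, required=(API_KEY_VAR,)):
    """Return the required variables that are absent or empty."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    env_file = Path(".env")
    env = parse_env(env_file.read_text()) if env_file.exists() else {}
    print("Missing:", missing_keys(env) or "nothing")
```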

Install Dependencies & Start Services

pip install -r requirements.txt
docker compose up -d

⏳ The embedder service may take a few minutes to fully boot on the first run, as it needs to download the underlying model.
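If you script the workflow, a small polling helper avoids running the next step before the containers accept connections. This is a minimal sketch using only the standard library; which host and port each service exposes depends on the example's Docker Compose file, so the values in the comment are assumptions. An open port also does not guarantee that the embedder has finished downloading its model.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (ports are assumptions; check the compose file for the real mappings):
# wait_for_port("localhost", 7687)  # default Neo4j Bolt port
# wait_for_port("localhost", 6333)  # default Qdrant HTTP port
```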

Run the Workflow

  1. Convert PDF to Markdown:
    • Takes a PDF document (e.g., assets/LOREAL_Rapport_Annuel_2024.pdf).
    • Uses an LLM to convert the PDF content into a structured Markdown file, preserving key information and formatting.
      python pdf_to_markdown.py assets/LOREAL_Rapport_Annuel_2024.pdf

See the complete pdf_to_markdown.py file on GitHub.

  2. Build and Index the Knowledge Graph:
    • Takes the generated Markdown document (e.g., assets/LOREAL_Rapport_Annuel_2024.md).
    • Uploads the document to the Perseus platform, where the Text-to-Graph engine extracts structured information.
    • Saves the extracted graph as a local .ttl file (an RDF graph) and loads it into a local Neo4j database.
    • The index.py script then uses the simple-graph-retriever package to index this graph into Qdrant for efficient RAG.
      python index.py assets/LOREAL_Rapport_Annuel_2024.md

See the complete index.py file on GitHub.

  3. Generate Context-Aware Reports:
    • The report.py script uses the simple-graph-retriever package to query the indexed graph in Qdrant and retrieve relevant subgraphs from Neo4j.
    • This retrieved context is then used in a RAG approach to answer natural language queries.
    • Generates a detailed, context-aware report based on the retrieved information.
      python report.py "Money KPIs"

See the complete report.py file on GitHub.
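To make the final step concrete, here is a hedged sketch of how retrieved graph facts might be combined with the user's query before calling the LLM. The function names, prompt wording, and sample triples are all hypothetical; report.py may assemble its prompt quite differently.

```python
def format_context(triples):
    """Render (subject, predicate, object) triples as one fact per line."""
    return "\n".join(f"- {s} {p} {o}" for s, p, o in triples)


def build_prompt(query, triples):
    """Combine the user query with retrieved graph facts into one prompt string."""
    return (
        "Answer using only the facts below. "
        "If the facts are insufficient, say so.\n\n"
        f"Facts:\n{format_context(triples)}\n\n"
        f"Request: write a report on: {query}\n"
    )


if __name__ == "__main__":
    # Hypothetical retrieved subgraph; predicate names are illustrative.
    retrieved = [
        ("L'Oréal", "hasRevenue", "€43.48 Billion"),
        ("L'Oréal", "hasOperatingMargin", "20 %"),
    ]
    print(build_prompt("Money KPIs", retrieved))
```

Restricting the model to the listed facts is one common way to keep a generated report grounded in the graph rather than in the model's prior knowledge.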

Cleaning Up 🧹

When you're done, stop and remove the Docker containers:

docker compose down

Generated Report Example

Here's an example of a report generated by querying the knowledge graph of the L'Oréal 2024 Annual Report with the query "Money KPIs".

Financial Performance Report: L’Oréal Fiscal Year 2024

This report summarizes the key financial performance indicators (KPIs) and monetary insights for L’Oréal based on the 2024 data.

1. Group Financial Overview

L’Oréal demonstrated strong financial health in 2024, characterized by solid revenue growth and high profitability margins. The company continues to deliver value to shareholders through significant dividends and earnings per share.

| KPI | Value |
| --- | --- |
| Total Revenue | €43.48 Billion |
| Comparable Growth | 5.1 % |
| Operating Profit | €8.69 Billion |
| Operating Margin | 20 % |
| Net Profit Per Share | €12.66 |
| Dividend per Share | €7.00 |
| Ecommerce Revenue | €12.3 Billion (~28% of total revenue) |

2. Divisional Performance (Growth Rates)

Growth was observed across all business divisions, with Beauté Dermatologique emerging as the primary growth engine for the group.

| Division | Growth Rate |
| --- | --- |
| Beauté Dermatologique | 9.8 % |
| Produits Grand Public (Consumer Products) | 5.4 % |
| Produits Professionnels | 5.3 % |
| Luxe | 2.7 % |

3. Geographical Revenue Distribution

L’Oréal maintains a diversified global presence. Europe remains the largest contributor to the group's top line, followed by North America and North Asia.

| Region | Share of Total Revenue |
| --- | --- |
| Europe | 33 % |
| Amérique du Nord | 27 % |
| Asie du Nord | 24 % |
| SAPMENA-SSA | 9 % |
| Amérique latine | 8 % |

4. Strategic Financial Investments & Allocations

The 2024 fiscal year was marked by aggressive inorganic growth through acquisitions and a commitment to social impact funding.

  • Acquisitions & Partnerships: The group expanded its portfolio and market reach by acquiring or partnering with several brands, including:

    • Galderma (Major strategic stake)
    • Miu Miu and Jacquemus (Expansion in luxury/fashion beauty)
    • Amouage and Dr.G
  • Social Investment: The Fonds L’Oréal pour les Femmes (L’Oréal Fund for Women) has a dedicated allocation of €70 Million.

  • Innovation Investment: While specific R&D spend is not listed, the filing of 694 patents and the establishment of the CreAItech lab indicate significant capital expenditure in intellectual property and tech-driven beauty.

Key Insights

  • Profitability: An operating margin of 20% indicates high operational efficiency and strong pricing power across its 37 brands.
  • Digital Transformation: Ecommerce has become a vital revenue pillar, now accounting for over €12 billion in sales.
  • Diversification: The balanced revenue share between Europe, North America, and North Asia provides a hedge against regional economic volatility.
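Because the report is generated by an LLM, it is worth cross-checking that its figures are internally consistent. The sketch below recomputes a few ratios from the numbers in the sample report above; the payout ratio is derived here for illustration and does not appear in the report.

```python
# Figures copied from the sample report (billions of euros unless noted).
revenue = 43.48
operating_profit = 8.69
ecommerce_revenue = 12.3
eps = 12.66       # net profit per share, euros
dividend = 7.00   # dividend per share, euros
region_shares = {
    "Europe": 33,
    "Amérique du Nord": 27,
    "Asie du Nord": 24,
    "SAPMENA-SSA": 9,
    "Amérique latine": 8,
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


operating_margin = pct(operating_profit, revenue)   # report states 20 %
ecommerce_share = pct(ecommerce_revenue, revenue)   # report states ~28 %
payout_ratio = pct(dividend, eps)                   # derived here, not in the report
region_total = sum(region_shares.values())          # rounded shares may not sum to 100

if __name__ == "__main__":
    print(operating_margin, ecommerce_share, payout_ratio, region_total)
```

The recomputed operating margin (20.0 %) and e-commerce share (28.3 %) agree with the stated values. The regional shares sum to 101 %, which is plausibly the effect of rounding each share independently; a larger gap would be a signal to check the report against the source document.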

Visualize in the Perseus Interface

The knowledge graph outputs are accessible both locally (.ttl files) and in the console for interactive exploration:

  • Nodes tab: Browse all extracted entities (e.g., companies, financial metrics, divisions).

  • Graph tab: Visualize relationships interactively.

  • Turtle and Cypher tab: View the Turtle (.ttl) and Cypher (.cql) files.

Figure: Perseus interface, Jobs view with extracted climate entities.