The Text-to-Graph Pipeline

A high-level overview of the end-to-end process of turning text into a knowledge graph.

From Text to Structured Knowledge

The core purpose of the Lettria platform is to bridge the gap between unstructured text and the structured data that AI systems need. It does this through a process we call the Text-to-Graph Pipeline.

This pipeline provides a reliable and repeatable way to convert messy, implicit information locked in documents into clean, explicit, and interconnected knowledge.

The Three Stages of the Pipeline

The pipeline consists of three main stages, moving from your raw data to a finished, queryable output.

Input: Unstructured Files

Everything starts with your source documents. These can be articles, reports, emails, or any form of raw text. In the platform, these are represented as Files, which you upload and manage either through the Console or the SDK.

Processing: The Ontology-Driven Engine

This is where the transformation happens. Our engine, Perseus, analyzes the text. Rather than guessing what to extract, it uses a schema that you provide, called an Ontology, to decide which information to identify and how to structure it for your domain.
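To make the idea of schema-guided extraction concrete, here is a minimal sketch in Python. It is not the Lettria ontology format or the Perseus engine: the Ontology class, its allows method, and the candidate facts are all hypothetical, chosen only to show how a schema can decide which extracted facts are kept.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical, simplified ontology; NOT the actual Lettria ontology format.
@dataclass(frozen=True)
class Ontology:
    classes: frozenset[str]
    # Allowed relations as (subject class, predicate, object class) triples.
    relations: frozenset[tuple[str, str, str]]

    def allows(self, subject_class: str, predicate: str, object_class: str) -> bool:
        """Return True if the schema declares this kind of relation."""
        return (subject_class, predicate, object_class) in self.relations


ontology = Ontology(
    classes=frozenset({"Person", "Company"}),
    relations=frozenset({("Person", "worksFor", "Company")}),
)

# Candidate facts an extractor might propose, as
# (subject, subject class, predicate, object, object class).
candidates = [
    ("Ada", "Person", "worksFor", "Acme", "Company"),
    ("Ada", "Person", "likes", "Tea", "Beverage"),  # not covered by the ontology
]

# Only facts that match the schema survive.
accepted = [c for c in candidates if ontology.allows(c[1], c[2], c[4])]
print(accepted)
```

In this sketch the second candidate is dropped because neither the "likes" relation nor the "Beverage" class appears in the schema, which is the sense in which the ontology, and not the extractor alone, determines what ends up in the output.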

Output: The Knowledge Graph

The final product is a Knowledge Graph: a structured, interconnected representation of the information from your source files. It is composed of entities, their relationships, and their properties, all conforming to the schema you defined in your ontology. The graph is then ready for use in your AI applications.
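The three building blocks named above (entities, relationships, and properties) can be illustrated with a small in-memory structure. This is a generic sketch, not the platform's output format: the Entity, Relationship, and KnowledgeGraph classes and the neighbors method are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    type: str                                   # class from the ontology, e.g. "Person"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: str                                 # id of the source entity
    predicate: str                              # relation name, e.g. "worksFor"
    target: str                                 # id of the target entity


@dataclass
class KnowledgeGraph:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_relationship(self, source: str, predicate: str, target: str) -> None:
        # A relationship may only connect entities that already exist.
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must be existing entities")
        self.relationships.append(Relationship(source, predicate, target))

    def neighbors(self, entity_id: str, predicate: str) -> list:
        """Entities reachable from entity_id through the given predicate."""
        return [
            self.entities[r.target]
            for r in self.relationships
            if r.source == entity_id and r.predicate == predicate
        ]


graph = KnowledgeGraph()
graph.add_entity(Entity("ada", "Person", {"name": "Ada"}))
graph.add_entity(Entity("acme", "Company", {"name": "Acme"}))
graph.add_relationship("ada", "worksFor", "acme")

# A simple query: who does Ada work for?
employers = [e.properties["name"] for e in graph.neighbors("ada", "worksFor")]
print(employers)  # ['Acme']
```

The query at the end shows what "queryable" means in practice: once facts are explicit nodes and edges, a question such as "who does Ada work for?" becomes a lookup instead of a text search.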

The entire process runs as an asynchronous Job, so even large documents can be processed without blocking your application.
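The general pattern behind an asynchronous job is to submit the work, get a handle back immediately, and check on it later. The sketch below shows that pattern with Python's standard asyncio module only. It does not call the Lettria SDK: run_extraction is a hypothetical stand-in that sleeps instead of processing a document, and the result fields are invented for illustration.

```python
import asyncio


async def run_extraction(text: str) -> dict:
    """Hypothetical stand-in for a long-running extraction job."""
    await asyncio.sleep(0.05)  # simulate processing time
    return {"status": "completed", "characters_processed": len(text)}


async def main() -> dict:
    # Submitting the job returns a handle (an asyncio.Task) right away.
    job = asyncio.create_task(run_extraction("Ada works for Acme."))

    # The caller is free to do other work while the job runs;
    # here it simply checks the job's state at intervals.
    polls = 0
    while not job.done():
        polls += 1
        await asyncio.sleep(0.01)

    result = job.result()
    result["polls"] = polls
    return result


result = asyncio.run(main())
print(result["status"])
```

The point of the design is that the submitting code never waits inside the extraction itself. It holds a handle, keeps running, and collects the result once the job reports that it is done.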