Manipulate a Graph

Learn how to programmatically modify a knowledge graph by adding new entities and relations in memory.

The full code for this example is available on GitHub.

After building a knowledge graph, you might want to enrich it with new information or modify it programmatically. This example demonstrates how to manipulate a KnowledgeGraph object directly in memory by adding new entities and relations before saving it.

What You'll Learn

  • How to build a knowledge graph from a text file.
  • How to add a new entity to the graph in memory.
  • How to find an existing entity and add a new relation to it.
  • How to save the modified graph to Neo4j and local files.

Example Use Case: Enriching Extracted Data

Imagine you have built a knowledge graph from a biography of Marie Curie. The graph contains entities for "Marie Curie" and "Physics," but it might have missed a specific achievement. This script shows how you can programmatically add new information, such as the fact that she won a "Nobel Prize in Physics," and connect it to the existing entities.

Prerequisites

Before running this example, ensure you have followed the installation guide to set up your environment and obtain the necessary API keys.


Code Walkthrough

This walkthrough covers building a graph, modifying it in memory, and saving the results.

Build the Initial Graph

As in other examples, we start by calling perseus_client.build_graph() to create a KnowledgeGraph object from a source text file. The call returns a list of graphs; this example works with the first one.

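import logging

import perseus_client
from perseus_client.models import Entity, Relation, LiteralValue

# Logger used by the save step at the end of this walkthrough
logger = logging.getLogger(__name__)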

# ... (inside a main function)

# 1. Build the initial KnowledgeGraph from a file
knowledge_graphs = perseus_client.build_graph(
    file_paths=["assets/sample.txt"],
    metadata={"source": "graph_manipulation_example"},
)

if not knowledge_graphs:
    # ... (error handling)
    return

kg = knowledge_graphs[0]

# 2. Save the original, unmodified graph
kg.save_ttl("./output/original_graph.ttl")
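
The save calls in this walkthrough write into ./output. This page does not say whether save_ttl creates missing parent directories, so a cautious script can create the directory first. A minimal sketch using only the standard library; the helper name ensure_parent_dir is hypothetical and not part of perseus_client:

```python
from pathlib import Path


def ensure_parent_dir(file_path: str) -> Path:
    """Create the parent directory of file_path if it does not exist yet."""
    path = Path(file_path)
    # parents=True builds intermediate directories;
    # exist_ok=True makes the call a no-op when the directory is already there.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Hypothetical usage before saving (kg comes from the step above):
# kg.save_ttl(str(ensure_parent_dir("./output/original_graph.ttl")))
```

Calling the helper again on the same path is harmless, so it can wrap every save call in the script.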

Add a New Entity

To add new information, we first create a new Entity. Here, we create an entity for the "Nobel Prize in Physics." We define its URI, its type (Award), and its properties (here, a single rdfs:label). Then, we append it to the kg.entities list.

# Define URIs for our new data
award_uri = "http://example.com/award/NobelPrizeInPhysics"
award_type = "http://example.com/ontology/Award"
award_name_predicate = "http://www.w3.org/2000/01/rdf-schema#label"

# Create the new Entity object
new_entity = Entity(
    uri=award_uri,
    types=[award_type],
    properties={
        award_name_predicate: LiteralValue(value="Nobel Prize in Physics")
    },
)

# Add the entity to the graph
kg.entities.append(new_entity)
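
As far as this example shows, appending to kg.entities does not check whether an entity with the same URI is already present, so rerunning the enrichment step on an already modified graph could add the award twice. A minimal sketch of a guard, assuming only that each entity exposes a uri attribute as in the code above. The helper add_entity_if_absent, the SimpleNamespace stand-ins, and the person URI are hypothetical and not part of perseus_client:

```python
from types import SimpleNamespace


def add_entity_if_absent(entities: list, entity) -> bool:
    """Append entity unless one with the same URI is already in the list.

    Returns True if the entity was appended, False if it was skipped.
    """
    if any(existing.uri == entity.uri for existing in entities):
        return False
    entities.append(entity)
    return True


# Stand-in objects for demonstration only; in the walkthrough these would be
# perseus_client Entity instances and the list would be kg.entities.
entities = [SimpleNamespace(uri="http://example.com/person/MarieCurie")]
award = SimpleNamespace(uri="http://example.com/award/NobelPrizeInPhysics")

first = add_entity_if_absent(entities, award)   # appended
second = add_entity_if_absent(entities, award)  # skipped: URI already present
```

The linear scan is fine for small graphs; for large ones, building a set of known URIs once would avoid rescanning the list on every insert.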

Find an Existing Entity and Add a Relation

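Next, we want to connect our new "Award" entity to the existing "Marie Curie" entity. We iterate through the kg.entities list and take the first entity that has a property value containing the string "Marie Curie." Because this is a substring match, it can also pick up a different entity whose properties merely mention the name.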

Once we find it, we create a new Relation object, specifying the source (Marie Curie's URI), the target (the Award's URI), the predicate (wonAward), and any properties of the relation itself (like the year).

# Find Marie Curie's entity
marie_curie_entity = None
for e in kg.entities:
    for prop in e.properties.values():
        if "Marie Curie" in str(prop.value):
            marie_curie_entity = e
            break
    if marie_curie_entity:
        break

# If found, create and add the new relation
if marie_curie_entity:
    won_award_predicate = "http://example.com/ontology/wonAward"
    new_relation = Relation(
        source_uri=marie_curie_entity.uri,
        target_uri=new_entity.uri,
        predicate=won_award_predicate,
        properties={},  # ... properties like year ...
    )
    kg.relations.append(new_relation)
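
The loop above accepts any property whose text contains "Marie Curie" and stops at the first hit. A stricter variant compares one specific property exactly. The sketch below assumes that entities carry a properties dict keyed by predicate URI, with values exposing a value attribute (as in the code above), and that the name is stored under rdfs:label, which may not hold for every extracted graph. The helper find_entity_by_label and the stand-in objects are hypothetical:

```python
from types import SimpleNamespace

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def find_entity_by_label(entities, label, predicate=RDFS_LABEL):
    """Return the first entity whose label property equals label, else None.

    The comparison ignores case and surrounding whitespace.
    """
    wanted = label.strip().casefold()
    for entity in entities:
        literal = entity.properties.get(predicate)
        if literal is not None and str(literal.value).strip().casefold() == wanted:
            return entity
    return None


def make_entity(uri, label):
    """Build a stand-in that mimics the Entity/LiteralValue shape used above."""
    return SimpleNamespace(
        uri=uri,
        properties={RDFS_LABEL: SimpleNamespace(value=label)},
    )


# Placeholder data: a substring match would stop at the institute first.
entities = [
    make_entity("http://example.com/org/1", "Marie Curie Institute"),
    make_entity("http://example.com/person/1", "Marie Curie"),
]
match = find_entity_by_label(entities, "marie curie")
```

With a real graph, the call would be find_entity_by_label(kg.entities, "Marie Curie"), and a None result signals that the name is stored under a different predicate or spelled differently.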

Save the Modified Graph

Finally, we save the modified KnowledgeGraph object. The save methods will now include the new entity and relation we added in memory.

# Serialize the modified graph to local files
kg.save_ttl("./output/modified_graph.ttl")
kg.save_cql("./output/modified_graph.cql")

# Save the modified graph to Neo4j
logger.info("Saving modified graph to Neo4j...")
kg.save_to_neo4j(strip_prefixes=True)
logger.info("Modified graph saved to Neo4j.")
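
Because entities and relations are edited as plain lists, nothing in this example guarantees that a relation's endpoints exist in kg.entities. A quick consistency check run before the save calls above can catch a mistyped URI. A minimal sketch, assuming only the uri, source_uri, and target_uri attributes used in this walkthrough; find_dangling_relations and the stand-in data are hypothetical:

```python
from types import SimpleNamespace


def find_dangling_relations(entities, relations):
    """Return relations whose source or target URI matches no entity."""
    known_uris = {entity.uri for entity in entities}
    return [
        rel
        for rel in relations
        if rel.source_uri not in known_uris or rel.target_uri not in known_uris
    ]


# Stand-in data; with a real graph, pass kg.entities and kg.relations.
entities = [SimpleNamespace(uri="urn:a"), SimpleNamespace(uri="urn:b")]
relations = [
    SimpleNamespace(source_uri="urn:a", target_uri="urn:b"),
    SimpleNamespace(source_uri="urn:a", target_uri="urn:missing"),
]
dangling = find_dangling_relations(entities, relations)
```

Whether a relation may legitimately point to a URI outside the graph depends on your data model, so treat the result as a list of items to review rather than as errors.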