Query a Graph with SPARQL

Learn how to query a knowledge graph locally using SPARQL by converting it to an rdflib.Graph object.

The full code for this example is available on GitHub.

Once you have built a knowledge graph, you can load it into memory and query it locally using SPARQL, the standard query language for RDF graphs. This example shows how to convert a KnowledgeGraph object into an rdflib.Graph and then execute a SPARQL query against it.

What You'll Learn

  • How to build a knowledge graph from a text file and an ontology.
  • How to convert a KnowledgeGraph object into a standard rdflib.Graph.
  • How to define and execute a SPARQL query to find specific information.
  • How to process and display the query results.

Example Use Case: Finding All People in a Document

After extracting a knowledge graph from a biographical text, you might want to retrieve a list of all the people mentioned. This script demonstrates how to load the graph into memory and run a targeted SPARQL query to find all entities of type Person and extract their names.

Prerequisites

Before running this example, ensure you have followed the installation guide to set up your environment and obtain the necessary API keys.


Code Walkthrough

This walkthrough covers building a graph and then querying it locally.

Build the Knowledge Graph

First, we build a KnowledgeGraph from a text file, using an ontology to ensure the extracted entities have the correct types (like ont:Person).

import perseus_client

# ... (inside a main function)

# Build a graph from a sample text file using the provided ontology
graphs = perseus_client.build_graph(
    file_paths=["./assets/sample.txt"],
    ontology_path="./assets/ontology.ttl",
)
if not graphs:
    print("Could not build graph.")
    return

knowledge_graph = graphs[0]

Convert to an RDFLib Graph

The KnowledgeGraph object has a convenient .to_rdflib() method that converts the graph into an rdflib.Graph object. RDFLib is a popular Python library for working with RDF data, and it includes a full-featured SPARQL query engine.

# Convert to an rdflib graph
rdflib_graph = knowledge_graph.to_rdflib()

Define and Execute the SPARQL Query

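Now we can define a standard SPARQL query as a multi-line string. This query matches every subject (?person) whose type is ont:Person (in SPARQL, a is shorthand for rdf:type) and retrieves its rdfs:label as ?name. The ont: prefix must resolve to the same namespace IRI that your ontology uses, otherwise the query returns no rows. We then execute it using the rdflib_graph.query() method.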

# Define a SPARQL query to find all persons and their names
query = """
PREFIX ont: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?person ?name
WHERE {
    ?person a ont:Person .
    ?person rdfs:label ?name .
}
"""

# Execute the query
results = rdflib_graph.query(query)

Process the Results

The results object is iterable, and each item is one query solution (a row). A row behaves like a tuple whose values follow the order of the SELECT clause, so we can unpack it into the ?person and ?name values and print them.

from typing import cast, Iterable, Tuple
from rdflib.term import Node

# cast() only informs static type checkers; it does not change the results at runtime
typed_results = cast(Iterable[Tuple[Node, Node]], results)

# Print the results
print("\n--- SPARQL Query Results ---")
for person, name in typed_results:
    print(f"Person: {person}, Name: {name}")
print("--------------------------")