Query a Graph with SPARQL
Learn how to query a knowledge graph locally using SPARQL by converting it to an rdflib.Graph object.
The full code for this example is available on GitHub.
Once you have built a knowledge graph, you can load it into memory and query it locally using SPARQL, the standard query language for RDF graphs. This example shows how to convert a KnowledgeGraph object into an rdflib.Graph and then execute a SPARQL query against it.
What You'll Learn
- How to build a knowledge graph from a text file and an ontology.
- How to convert a KnowledgeGraph object into a standard rdflib.Graph.
- How to define and execute a SPARQL query to find specific information.
- How to process and display the query results.
Example Use Case: Finding All People in a Document
After extracting a knowledge graph from a biographical text, you might want to retrieve a list of all the people mentioned. This script demonstrates how to load the graph into memory and run a targeted SPARQL query to find all entities of type Person and extract their names.
Prerequisites
Before running this example, ensure you have followed the installation guide to set up your environment and obtain the necessary API keys.
Code Walkthrough
This walkthrough covers building a graph and then querying it locally.
Build the Knowledge Graph
First, we build a KnowledgeGraph from a text file, using an ontology to ensure the extracted entities have the correct types (like ont:Person).
import perseus_client
# ... (inside a main function)
# Build a graph from a sample text file using the provided ontology
graphs = perseus_client.build_graph(
    file_paths=["./assets/sample.txt"],
    ontology_path="./assets/ontology.ttl",
)
if not graphs:
    print("Could not build graph.")
    return
knowledge_graph = graphs[0]
Convert to an RDFLib Graph
The KnowledgeGraph object has a .to_rdflib() method that converts the graph into an rdflib.Graph object. RDFLib is a widely used Python library for working with RDF data, and it includes a SPARQL 1.1 query engine, so no external triple store is needed.
# Convert to an rdflib graph
rdflib_graph = knowledge_graph.to_rdflib()
Define and Execute the SPARQL Query
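Now we can define a standard SPARQL query as a multi-line string. The query matches every subject (?person) whose type is ont:Person (in SPARQL, a is shorthand for rdf:type) and retrieves its rdfs:label as ?name. Both patterns must match, so an entity without a label is not returned. The ont: prefix has to resolve to the same namespace your ontology uses. We then execute the query with the rdflib_graph.query() method.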
# Define a SPARQL query to find all persons and their names
query = """
PREFIX ont: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?name
WHERE {
    ?person a ont:Person .
    ?person rdfs:label ?name .
}
"""
# Execute the query
results = rdflib_graph.query(query)
Process the Results
The results object is iterable: each item is one query solution (a row) holding the selected variables in SELECT order, here ?person and ?name. We loop over it, unpack each row, and print the values.
from typing import cast, Iterable, Tuple
from rdflib.term import Node
# Provide a specific type hint for clarity
typed_results = cast(Iterable[Tuple[Node, Node]], results)
# Print the results
print("\n--- SPARQL Query Results ---")
for person, name in typed_results:
    print(f"Person: {person}, Name: {name}")
print("--------------------------")