File Operations

Learn the fundamental operations for managing files: uploading, waiting for processing, and retrieving details.

The full code for this example is available on GitHub.

This example walks through the complete lifecycle of a file in the Perseus SDK: uploading it, waiting for the server to process it, and retrieving its details.

What You'll Learn

  • How to upload a local file to the Perseus platform.
  • How to wait for the server to finish processing a file.
  • How to find and retrieve file details using its unique ID.

Example Use Case: Pre-loading Data for Graph Building

Before you can build a knowledge graph, your source documents must be uploaded to the Perseus platform. This script shows the essential functions for that step and confirms that a file is uploaded and ready before you start a graph-building job.

Prerequisites

Before running this example, ensure you have followed the installation guide to set up your environment and obtain the necessary API keys.


Code Walkthrough

This walkthrough covers the three key stages of a file's lifecycle.

Uploading a File

The process begins with perseus_client.upload_file(). This function takes the path to a local file and sends it to the Perseus API. The API returns a File object that includes a unique id and the initial status of the file, which will typically be PENDING.

import perseus_client
from perseus_client.models import FileStatus

# ... (inside a main function)

file_path = "assets/example.txt"
print(f"Uploading file: {file_path}")

uploaded_file = perseus_client.upload_file(file_path)
print(f"File upload initiated. File ID: {uploaded_file.id}, Status: {uploaded_file.status}")
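
Because upload_file() takes a path to a local file, it can help to confirm that the path points at a real file before making the call, so a typo fails with a clear local error. The following is a minimal sketch under that assumption. The helper name check_upload_path is hypothetical and is not part of the Perseus SDK.

```python
from pathlib import Path


def check_upload_path(file_path: str) -> Path:
    """Return the resolved path, or raise if it is not an existing regular file.

    Hypothetical helper for illustration; not part of the Perseus SDK.
    """
    path = Path(file_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No such file to upload: {path}")
    return path
```

The returned path could then be converted with str() and passed to perseus_client.upload_file() in place of the raw string.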

Waiting for the File to be Processed

After uploading, the file is queued for server-side processing, and it should not be used until that processing completes. The perseus_client.wait_for_file_upload() function handles the wait for you: it polls the API and returns only once the file's status has changed to UPLOADED.

if uploaded_file.status == FileStatus.PENDING:
    print(f"Waiting for file {uploaded_file.id} to be processed...")
    processed_file = perseus_client.wait_for_file_upload(uploaded_file.id)
    print(f"File processing complete. Final Status: {processed_file.status}")
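
To make the idea of polling concrete, here is a generic sketch of a poll-until-done loop with a timeout. It is not the SDK's implementation of wait_for_file_upload(), whose internals and parameters are not described here. The names poll_until and fake_status_source, the plain-string statuses, and the interval and timeout values are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterator


def poll_until(
    get_status: Callable[[], str],
    target: str = "UPLOADED",
    interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Call get_status() repeatedly until it returns target or time runs out.

    Hypothetical helper; the status is checked at least once even if
    timeout is zero.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Status still {status!r} after {timeout} seconds")
        time.sleep(interval)


def fake_status_source() -> Callable[[], str]:
    """Simulate a file that reports PENDING twice, then UPLOADED."""
    statuses: Iterator[str] = iter(["PENDING", "PENDING", "UPLOADED"])
    return lambda: next(statuses)
```

Calling poll_until(fake_status_source(), interval=0.0) returns once the simulated status reaches UPLOADED. A status source that never reaches the target raises TimeoutError instead of blocking forever, which is the main design point of carrying a deadline.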

Finding the File

Once a file is uploaded, you can retrieve its details at any time using perseus_client.find_files(). This function takes a list of file IDs and returns a list of File objects with the full information for each file.

print(f"Attempting to find the file with ID: {uploaded_file.id}")
found_files = perseus_client.find_files(ids=[uploaded_file.id])

if found_files:
    print("Successfully found the file:")
    for f in found_files:
        print(f"  - ID: {f.id}, Name: {f.name}, Status: {f.status}, Created At: {f.created_at}")
else:
    print(f"Could not find file with ID: {uploaded_file.id}")
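
When several IDs are requested at once, a list result does not by itself show which IDs came back. One way to handle this, sketched below, is to index the results by ID and report the ones that are absent. FileRecord is a stand-in for the SDK's File object with only the fields this sketch needs, and index_by_id is a hypothetical helper. Treating IDs as strings is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class FileRecord:
    """Stand-in for the SDK's File object; only the fields used here."""
    id: str
    name: str
    status: str


def index_by_id(
    requested_ids: Iterable[str], found: Iterable[FileRecord]
) -> Tuple[Dict[str, FileRecord], List[str]]:
    """Map each returned file by its ID and list requested IDs with no match."""
    by_id = {f.id: f for f in found}
    missing = [file_id for file_id in requested_ids if file_id not in by_id]
    return by_id, missing
```

With real results, the second argument would be the list returned by find_files(), provided its objects expose an id attribute as the example above suggests.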