# Python SDK

`EngramClient` is a lightweight HTTP client for a single Engram miner. Store text, images, PDFs, URLs, and conversations; query with metadata filters, retrieve by CID, delete, and list records. Text ingestion needs no extra dependencies; `pypdf` is required for PDFs.
## Install

```bash
pip install engram-subnet

# For PDF support
pip install engram-subnet pypdf
```
## EngramClient

```python
from engram.sdk import EngramClient

client = EngramClient(
    miner_url="http://72.62.2.34:8091",  # or use from_subnet() for auto-discovery
    timeout=30.0,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `miner_url` | `str` | `"http://127.0.0.1:8091"` | Base URL of the miner's HTTP server |
| `timeout` | `float` | `30.0` | Request timeout in seconds |
| `namespace` | `str \| None` | `None` | Private collection name — enables encryption |
| `namespace_key` | `str \| None` | `None` | Secret key for the namespace (min 16 chars) |
### from_subnet()

Auto-discovers the best available miner from the Bittensor metagraph. Probes the top miners by incentive score in parallel and returns a client pointed at the fastest responsive one.

```python
# One line — no miner URL needed
client = EngramClient.from_subnet(netuid=450)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `netuid` | `int` | `450` | Subnet UID to query |
| `network` | `str` | `"finney"` | Subtensor network — `"finney"`, `"test"`, or a `ws://` endpoint |
| `timeout` | `float` | `30.0` | Timeout for the returned client |
| `probe_timeout` | `float` | `3.0` | Timeout for each health probe during discovery |
| `top_n` | `int` | `5` | Number of top miners to probe (picks by incentive rank) |
Requires `bittensor` to be installed. Raises `RuntimeError` if no miners are reachable.

### Private namespaces

Pass `namespace` and `namespace_key` to store data in an encrypted, private collection. Text is encrypted with AES-256-GCM client-side before being sent to any miner.
```python
private = EngramClient(
    "http://miner:8091",
    namespace="company-docs",
    namespace_key="your-secret-key-min-16-chars",
)

cid = private.ingest("Q4 revenue was $4.2M")  # encrypted before leaving your machine
results = private.query("revenue figures")    # decrypted client-side
```
See Private Namespaces for the full encryption spec and threat model.
## ingest()

```python
cid: str = client.ingest(text: str, metadata: dict = None)
```

Embed and store text on the miner. Returns a CID string.

```python
cid = client.ingest(
    "BERT uses bidirectional encoder representations.",
    metadata={"source": "arxiv", "year": "2018"},
)
print(cid)  # v1::a3f2b1c4d5e6f7...
```
| Parameter | Type | Description |
|---|---|---|
| `text` | `str` | Text to embed and store (max 8192 chars) |
| `metadata` | `dict \| None` | Optional key-value metadata (max 4 KB JSON) |

**Raises:** `MinerOfflineError`, `IngestError`, `InvalidCIDError`
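Transient miner outages surface as `MinerOfflineError`. A minimal retry sketch; the wrapper, its backoff policy, and the name `ingest_with_retry` are illustrative, not part of the SDK:

```python
import time

def ingest_with_retry(ingest_fn, text, retries=3, backoff=0.5):
    """Call ingest_fn(text), retrying transient failures with exponential backoff.

    ingest_fn is any callable that returns a CID string and raises when the
    miner is unreachable (e.g. client.ingest raising MinerOfflineError).
    """
    last_exc = None
    for attempt in range(retries):
        try:
            return ingest_fn(text)
        except Exception as exc:  # narrow this to MinerOfflineError in real code
            last_exc = exc
            time.sleep(backoff * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise last_exc
```

In real code, catch `MinerOfflineError` from `engram.sdk` instead of the bare `Exception`, and pass `client.ingest` as `ingest_fn`.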
## ingest_image()

Describe an image with Grok Vision (xAI) and store the description as a searchable memory. The raw image bytes are never sent to the miner — only the AI-generated description is embedded and stored. A `content_cid` (SHA-256 of the image) is stored as metadata for integrity verification.

```python
result = client.ingest_image(
    "photo.jpg",                    # path, or raw bytes
    xai_api_key="xai-...",          # get one at console.x.ai
    metadata={"user_id": "u_123"},  # optional extra metadata
)
print(result["cid"])          # v1::a3f2b1... — use this for search
print(result["description"])  # "A photograph of a whiteboard showing..."
print(result["content_cid"])  # sha256:abc123... — integrity check
print(result["filename"])     # "photo.jpg"

# Search by what's in the image later:
results = client.query("whiteboard diagram with architecture")
```
| Parameter | Type | Description |
|---|---|---|
| `source` | `str \| Path \| bytes` | Image file path or raw bytes |
| `xai_api_key` | `str` | xAI API key for Grok Vision (required) |
| `mime_type` | `str \| None` | MIME type, e.g. `"image/jpeg"` — auto-detected from extension if omitted |
| `metadata` | `dict \| None` | Optional extra metadata |

**Returns:** dict with `cid`, `description`, `content_cid`, `filename`

**Raises:** `MinerOfflineError`, `IngestError`, `RuntimeError` (Grok API failure)
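Because `content_cid` is just the SHA-256 of the original bytes, integrity can be checked locally without trusting the miner. A sketch assuming the `sha256:<hex digest>` format shown above; `verify_content_cid` is a hypothetical helper, not an SDK method:

```python
import hashlib
from pathlib import Path

def verify_content_cid(source, content_cid: str) -> bool:
    """Check that a file's bytes match a stored sha256:<hex> content CID."""
    data = source if isinstance(source, bytes) else Path(source).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    return content_cid == f"sha256:{digest}"
```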
## ingest_pdf()

Extract text from a PDF and store it as a searchable memory. Requires `pypdf`. The full text (up to 8192 chars) is embedded; the SHA-256 of the raw PDF is stored as `content_cid`.

```bash
pip install pypdf
```

```python
result = client.ingest_pdf(
    "research_paper.pdf",  # path, or raw bytes
    metadata={"category": "research"},
)
print(result["cid"])          # v1::...
print(result["pages"])        # 12
print(result["chars"])        # 48293
print(result["content_cid"])  # sha256:...

# Search the PDF content later:
results = client.query("transformer attention mechanism")
```
| Parameter | Type | Description |
|---|---|---|
| `source` | `str \| Path \| bytes` | PDF file path or raw bytes |
| `metadata` | `dict \| None` | Optional extra metadata |

**Returns:** dict with `cid`, `pages`, `chars`, `content_cid`, `filename`

**Raises:** `MinerOfflineError`, `IngestError`, `ImportError` (pypdf missing), `ValueError` (image-only PDF)
For scanned, image-only PDFs, extract the text with OCR first (e.g. `pytesseract`) or use `ingest_image()` per page.

## ingest_url()
Fetch a web page, strip navigation and boilerplate, and store the readable text as a memory. SSRF protection is built in — private/loopback addresses are blocked.
```python
result = client.ingest_url(
    "https://arxiv.org/abs/1706.03762",
    metadata={"category": "research"},
)
print(result["cid"])    # v1::...
print(result["title"])  # "Attention Is All You Need"
print(result["chars"])  # 6842
print(result["url"])    # final URL after redirects

# Search later:
results = client.query("transformer architecture paper")
```
| Parameter | Type | Description |
|---|---|---|
| `url` | `str` | HTTP or HTTPS URL to fetch |
| `metadata` | `dict \| None` | Optional extra metadata merged with the auto-extracted title/source |

**Returns:** dict with `cid`, `url`, `title`, `chars`

**Raises:** `ValueError` (invalid URL, private address), `RuntimeError` (fetch failure, no readable text)
## ingest_conversation()

Store a conversation thread as individual turn memories. Each message is embedded separately so individual turns are semantically searchable. A shared `session_id` links them.

```python
messages = [
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "Tell me more about Paris."},
]
cids = client.ingest_conversation(
    messages,
    session_id="session_abc123",
    metadata={"user_id": "u_456"},
)
print(cids)
# ["v1::a3f2...", "v1::b2e8...", "v1::c9f4..."]

# Retrieve conversation turns later:
results = client.query("capital city France")
# Returns the turn that mentioned Paris
```
| Parameter | Type | Description |
|---|---|---|
| `messages` | `list[dict]` | List of `{"role": ..., "content": ...}` dicts |
| `session_id` | `str` | Shared ID linking all turns — stored as metadata |
| `metadata` | `dict \| None` | Optional extra metadata added to every turn |

**Returns:** list of CID strings — one per message turn
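Conceptually, each message becomes its own store operation with shared bookkeeping. A local sketch of that expansion; `build_turn_records` is illustrative only, since the SDK performs this internally and exact field values may differ:

```python
import time

def build_turn_records(messages, session_id, extra=None):
    """Expand a message list into per-turn {text, metadata} dicts,
    one per stored memory. Illustrative only."""
    records = []
    for turn, msg in enumerate(messages):
        meta = {
            "role": msg["role"],
            "session_id": session_id,  # links all turns in the thread
            "turn": turn,              # position within the conversation
            "timestamp": int(time.time()),
        }
        if extra:
            meta.update(extra)         # caller-supplied metadata on every turn
        records.append({"text": msg["content"], "metadata": meta})
    return records
```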
Each turn stores `role`, `session_id`, `turn`, and `timestamp` in its metadata.

## query()

```python
results: list[dict] = client.query(text: str, top_k: int = 10, filter: dict = None)
```
Semantic search over stored embeddings — works across text, images, PDFs, URLs, and conversation turns.
```python
# Basic search
results = client.query("how does self-attention work?", top_k=10)
# [
#   {"cid": "v1::a3f2b1...", "score": 0.9821, "metadata": {"source": "arxiv"}},
#   {"cid": "v1::b2e8c1...", "score": 0.8847, "metadata": {"type": "url"}},
# ]

# Filter by metadata — AND semantics (all conditions must match)
results = client.query(
    "revenue figures",
    top_k=5,
    filter={"user_id": "u_123", "type": "text"},
)

# Only conversation turns for a specific session
turns = client.query(
    "Paris",
    filter={"session_id": "session_abc123", "role": "assistant"},
)
```
| Parameter | Type | Description |
|---|---|---|
| `text` | `str` | Natural language query |
| `top_k` | `int` | Maximum results to return (default 10) |
| `filter` | `dict \| None` | AND-match metadata filter — all key/value pairs must match |
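The filter applies exact-match AND semantics. A local sketch of that rule, illustrative only and not the miner's actual implementation:

```python
def matches(metadata: dict, filter: dict) -> bool:
    """True iff every key/value pair in filter appears exactly in metadata."""
    return all(metadata.get(k) == v for k, v in filter.items())

records = [
    {"cid": "v1::a", "metadata": {"user_id": "u_123", "type": "text"}},
    {"cid": "v1::b", "metadata": {"user_id": "u_123", "type": "image"}},
    {"cid": "v1::c", "metadata": {"user_id": "u_999", "type": "text"}},
]
# Both conditions must hold, so only v1::a survives
hits = [r for r in records if matches(r["metadata"], {"user_id": "u_123", "type": "text"})]
```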
## get()

Retrieve a stored record by its CID. Returns the metadata (not the raw embedding vector).

```python
record = client.get("v1::a3f2b1c4d5e6f7...")
if record:
    print(record["cid"])       # v1::a3f2b1...
    print(record["metadata"])  # {"source": "arxiv", "title": "Attention Is All You Need"}
else:
    print("Not found")
```

**Returns:** dict with `cid` and `metadata`, or `None` if not found
## delete()

Remove a stored record by its CID. The operation is idempotent.

```python
deleted = client.delete("v1::a3f2b1c4d5e6f7...")
print(deleted)  # True if it existed, False if not found
```

**Returns:** `bool` — `True` if deleted, `False` if the CID was not found

**Raises:** `MinerOfflineError`
## list()

List stored records with optional metadata filtering and pagination.

```python
# All records (first page)
records = client.list(limit=50, offset=0)

# Filter by type
image_records = client.list(filter={"type": "image"})

# All memories for a user, paginated
page1 = client.list(filter={"user_id": "u_123"}, limit=20, offset=0)
page2 = client.list(filter={"user_id": "u_123"}, limit=20, offset=20)

for r in page1:
    print(r["cid"], r["metadata"].get("title", ""))
```
| Parameter | Type | Description |
|---|---|---|
| `filter` | `dict \| None` | AND-match metadata filter |
| `limit` | `int` | Max records per page (default 50) |
| `offset` | `int` | Records to skip (default 0) |

**Returns:** list of dicts with `cid` and `metadata`
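For collections larger than one page, `limit` and `offset` compose into a simple iterator. A sketch assuming only the `list()` signature documented above; `iter_all` is a hypothetical helper, not an SDK method:

```python
def iter_all(list_fn, filter=None, page_size=50):
    """Yield every record by walking limit/offset pages until a short page.

    list_fn is any callable with the list(filter=, limit=, offset=)
    signature, e.g. client.list.
    """
    offset = 0
    while True:
        page = list_fn(filter=filter, limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:  # short page means we reached the end
            return
        offset += page_size
```

Usage: `for r in iter_all(client.list, filter={"user_id": "u_123"}): ...`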
## batch_ingest_file()

Ingest all records from a JSONL file. Each line must be a JSON object with a `text` key.

```python
# data.jsonl format:
# {"text": "First entry"}
# {"text": "Second entry", "metadata": {"category": "ml"}}

cids = client.batch_ingest_file("data/corpus.jsonl")
print(f"Ingested {len(cids)} records")

# With error tracking
cids, errors = client.batch_ingest_file("corpus.jsonl", return_errors=True)
for err in errors:
    print(f"Skipped: {err}")
```
## health() / is_online()

```python
# Check liveness — raises MinerOfflineError if unreachable
info = client.health()
# {"status": "ok", "vectors": 42156, "uid": 7}

# Safe check — never raises
if client.is_online():
    cid = client.ingest("...")
```
## Multi-miner pattern

For redundancy, ingest to multiple miners. The same text always produces the same CID.

```python
from engram.sdk import EngramClient, MinerOfflineError

miners = [
    EngramClient("http://miner1:8091"),
    EngramClient("http://miner2:8091"),
    EngramClient("http://miner3:8091"),
]

cids = []
for miner in miners:
    try:
        cids.append(miner.ingest("Critical knowledge."))
    except MinerOfflineError:
        print(f"Miner offline: {miner.miner_url}")

print(f"Stored on {len(cids)}/3 miners")
```