The core endpoint. Given a query and token budget, it retrieves relevant evidence, packs it optimally, enforces policy, and returns ready-to-inject prompt text plus a full audit receipt.
Request
The question or topic to retrieve evidence for.
Token budget (1 - 128,000). The output will never exceed this.
Scope retrieval to a workspace.
Scope retrieval to an actor’s permissions.
relevance optimizes for top hits; coverage spreads across more sources.
Access control policy for this request.
Latency constraint (advisory).
Cost constraint (advisory).
Scope retrieval to a specific session.
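Assembled from the parameters above, a request body might look like the following sketch. The field names (`query`, `token_budget`, `workspace_id`, and so on) are assumptions for illustration only; the extracted page describes each field's meaning but does not show its exact key.

```python
import json

# Hypothetical request body for the pack endpoint.
# Field names are assumptions; only the semantics come from the docs above.
payload = {
    "query": "What changed in the Q3 retention policy?",
    "token_budget": 4000,       # 1 - 128,000; output never exceeds this
    "workspace_id": "ws_123",   # scope retrieval to a workspace
    "actor_id": "actor_456",    # scope to an actor's permissions
    "strategy": "coverage",     # "relevance" or "coverage"
    "max_latency_ms": 500,      # advisory latency constraint
    "max_cost_usd": 0.01,       # advisory cost constraint
    "session_id": "sess_789",   # scope to a specific session
}

body = json.dumps(payload)
print(body)
```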
Response
The full prompt string to inject into your LLM call. Contains prefix (pinned) + working set (query-specific) + conflict warnings.
Just the pinned/authoritative section.
Just the query-specific evidence section.
Provenance-anchored citations for each evidence span.
0.0 - 1.0 confidence score based on relevance, coverage, and source diversity.
true if the system determined there is insufficient evidence to answer. Your agent should handle this gracefully.
Human-readable reason for abstention (e.g., "Confidence too low", "Insufficient evidence tokens").
UUID pointing to the full decision receipt. Use GET /v1/receipts/ to fetch it.
Exact token count of context_text.
SHA-256 of the packed output. Same inputs always produce the same hash.
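Because pack_hash is a SHA-256 of the packed output, a client can re-derive it locally. A minimal sketch, assuming the hash is computed over the UTF-8 bytes of context_text; the exact input to the hash is an assumption, since the docs only say it covers "the packed output":

```python
import hashlib

def verify_pack(context_text: str, pack_hash: str) -> bool:
    """Recompute SHA-256 over the packed text and compare.

    Assumes the hash covers the UTF-8 bytes of context_text.
    """
    digest = hashlib.sha256(context_text.encode("utf-8")).hexdigest()
    return digest == pack_hash

# Mock response fields for illustration.
text = "PINNED: policy v3\n---\nEVIDENCE: retention window is 90 days."
expected = hashlib.sha256(text.encode("utf-8")).hexdigest()
print(verify_pack(text, expected))  # True for a matching pair
```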
Example
Response
Abstention
Abstention triggers when evidence quality or quantity is too low to provide a reliable answer. See Confidence and Abstention for details.
Determinism
Memory Runtime is fully deterministic. Given the same query against the same data, it produces byte-identical output — same evidence, same order, same token count, same pack_hash. You can verify this by comparing pack_hash values across runs.
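Putting the abstention flag and pack_hash together, a caller might gate injection on abstained and compare pack_hash values to confirm two runs returned identical packs. This is a sketch: field names mirror the response fields described above, and the API calls themselves are mocked.

```python
from typing import Optional

def choose_context(response: dict) -> Optional[str]:
    """Return the prompt text to inject, or None on abstention."""
    if response.get("abstained"):
        # Handle gracefully: log the reason, fall back, or widen the query.
        print(f"abstained: {response.get('abstain_reason')}")
        return None
    return response["context_text"]

# Two mocked runs of the same query against the same data.
run_a = {"abstained": False, "context_text": "evidence...", "pack_hash": "ab12"}
run_b = {"abstained": False, "context_text": "evidence...", "pack_hash": "ab12"}

# Determinism check: identical inputs should yield an identical pack_hash.
assert run_a["pack_hash"] == run_b["pack_hash"]

abstained_run = {"abstained": True, "abstain_reason": "Confidence too low"}
print(choose_context(abstained_run))  # None
```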