The core endpoint. Given a query and token budget, retrieve relevant evidence, pack it optimally, enforce policy, and return ready-to-inject prompt text plus a full audit receipt.

Request

POST /v1/context-pack
query (string, required)
The question or topic to retrieve evidence for.

max_tokens (integer, required)
Token budget (1-128,000). The output will never exceed this.

workspace_id (string)
Scope retrieval to a workspace.

actor_id (string)
Scope retrieval to an actor's permissions.

mode (string, default "relevance")
relevance optimizes for top hits; coverage spreads across more sources.

policy (object)
Access control policy for this request.

max_latency_ms (integer)
Latency constraint (advisory).

max_cost_usd (float)
Cost constraint (advisory).

session_id (string)
Scope retrieval to a specific session.
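Before sending a request, the documented constraints can be checked client-side. A minimal sketch; `validate_request` is a hypothetical helper (not part of any SDK), and the constraint values come from the parameter descriptions above:

```python
# Sketch: assemble and sanity-check a /v1/context-pack request body.
# validate_request is a hypothetical client-side helper; field names
# and constraints are taken from the reference above.

VALID_MODES = {"relevance", "coverage"}

def validate_request(body: dict) -> dict:
    """Raise ValueError if the body violates documented constraints."""
    if not body.get("query"):
        raise ValueError("query is required")
    max_tokens = body.get("max_tokens")
    if not isinstance(max_tokens, int) or not 1 <= max_tokens <= 128_000:
        raise ValueError("max_tokens must be an integer in 1-128,000")
    if body.get("mode", "relevance") not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    return body

body = validate_request({
    "query": "What is the current enterprise discount cap for Q2?",
    "max_tokens": 2048,
    "workspace_id": "ws_acme",
    "actor_id": "user_alice",
    "mode": "coverage",
})
```

Catching an invalid budget locally avoids a round trip for a request the server would reject anyway.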

Response

context_text (string)
The full prompt string to inject into your LLM call. Contains prefix (pinned) + working set (query-specific) + conflict warnings.

prefix_text (string)
Just the pinned/authoritative section.

working_set_text (string)
Just the query-specific evidence section.

citations (Citation[])
Provenance-anchored citations for each evidence span.

confidence (float)
Confidence score from 0.0 to 1.0, based on relevance, coverage, and source diversity.

abstain_flag (boolean)
true if the system determined there is insufficient evidence to answer. Your agent should handle this gracefully.

abstain_reason (string | null)
Human-readable reason for abstention (e.g., "Confidence too low", "Insufficient evidence tokens").

receipt_id (string)
UUID pointing to the full decision receipt. Use GET /v1/receipts/ to fetch it.

total_tokens (integer)
Exact token count of context_text.

pack_hash (string)
SHA-256 of the packed output. Same inputs always produce the same hash.
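Because each citation carries its provenance fields, a caller can render a simple audit trail from the response. A sketch using only the fields shown above; `format_citation` is an illustrative helper, not part of any SDK:

```python
# Sketch: turn the citations array from a context-pack response into
# human-readable audit lines. format_citation is an illustrative
# helper; the field names match the Citation object above.

def format_citation(c: dict) -> str:
    return (f"[{c['source_type']}] score={c['relevance_score']:.2f} "
            f"chars {c['start_offset']}-{c['end_offset']}: "
            f"{c['content_preview']}")

citations = [{
    "artifact_id": "a1b2c3d4-...",
    "span_id": "s1e2f3g4-...",
    "start_offset": 0,
    "end_offset": 56,
    "content_hash": "sha256:...",
    "source_type": "chat_turn",
    "relevance_score": 0.94,
    "content_preview": "The Q2 enterprise discount cap has been raised to 18%.",
}]

for line in (format_citation(c) for c in citations):
    print(line)
```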

Example

pack = client.context_pack(
    query="What is the current enterprise discount cap for Q2?",
    workspace_id="ws_acme",
    max_tokens=2048,
    actor_id="user_alice",
)

if pack["abstain_flag"]:
    print(f"Cannot answer: {pack['abstain_reason']}")
else:
    # Inject into your LLM
    print(pack["context_text"])
    print(f"Confidence: {pack['confidence']:.0%}")
    print(f"Tokens used: {pack['total_tokens']}")
Response
{
  "context_text": "=== PINNED CONTEXT (Authoritative) ===\n[cite:1] The Q2 enterprise discount cap has been raised to 18%.\n\n=== TOP EVIDENCE (Most Relevant to Query) ===\n[cite:2] Enterprise pricing guidelines updated for Q2...\n\n=== END CONTEXT ===",
  "prefix_text": "=== PINNED CONTEXT (Authoritative) ===\n[cite:1] ...",
  "working_set_text": "=== TOP EVIDENCE ===\n[cite:2] ...",
  "citations": [
    {
      "artifact_id": "a1b2c3d4-...",
      "span_id": "s1e2f3g4-...",
      "start_offset": 0,
      "end_offset": 56,
      "content_hash": "sha256:...",
      "source_type": "chat_turn",
      "relevance_score": 0.94,
      "content_preview": "The Q2 enterprise discount cap has been raised to 18%."
    }
  ],
  "confidence": 0.87,
  "abstain_flag": false,
  "abstain_reason": null,
  "receipt_id": "r1a2b3c4-...",
  "total_tokens": 347,
  "pack_hash": "sha256:..."
}

Abstention

Always check abstain_flag before using context_text. When the system determines it cannot provide reliable evidence, it sets abstain_flag: true. Your agent should decline to answer rather than use low-quality context.
Abstention triggers when evidence quality or quantity is too low to provide a reliable answer. See Confidence and Abstention for details.

Determinism

Memory Runtime is fully deterministic. Given the same query against the same data, it produces byte-identical output — same evidence, same order, same token count, same pack_hash. You can verify this by comparing pack_hash values across runs.
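A sketch of that verification: run the same request twice and compare the packs. It compares the server-returned pack_hash values rather than recomputing the hash, since this reference does not specify exactly which bytes are hashed:

```python
# Sketch: verify determinism by comparing two packs produced from the
# same query against the same data. Compares server-returned fields
# only; it does not recompute the SHA-256 locally.

def packs_identical(a: dict, b: dict) -> bool:
    return (a["pack_hash"] == b["pack_hash"]
            and a["total_tokens"] == b["total_tokens"]
            and a["context_text"] == b["context_text"])

# With a live client, two identical calls should satisfy:
#   packs_identical(client.context_pack(...), client.context_pack(...))
first = {"pack_hash": "sha256:abc", "total_tokens": 347, "context_text": "..."}
second = dict(first)
print(packs_identical(first, second))
```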