V2 endpoint — requires Pro, Team, or Enterprise plan.
The V2 context pack endpoint adds serving profiles on top of everything V1 offers. Pick low_latency for real-time interactions, balanced for general use, or high_recall for compliance and research workloads that need every relevant piece of evidence.

Request

POST /v2/context-pack
query
string
required
The question or topic to retrieve evidence for.
max_tokens
integer
required
Token budget (1–128,000). V2's adaptive budget may use fewer tokens than the limit, but the output never exceeds it.
workspace_id
string
required
Scope retrieval to this workspace.
actor_id
string
Scope retrieval to an actor’s permissions.
profile
string
default:"balanced"
Serving profile. Controls the retrieval strategy and budget behavior.
| Value | When to use |
| --- | --- |
| `low_latency` | Real-time chat, quick lookups. Fastest response; semantic search off. |
| `balanced` | General-purpose. Semantic search on; adaptive budget up to 1.5×. |
| `high_recall` | Compliance, legal, research. Maximum evidence; adaptive budget up to 4×. |
policy
object
Per-request access control. Same as V1.
session_id
string
Scope retrieval to a session.
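
The request body combines three required fields with optional scoping fields. A minimal sketch of assembling and sanity-checking the payload; the `build_payload` helper is illustrative, not part of the official client, and the constraints it enforces come from the parameter descriptions above.

```python
# Illustrative payload builder for POST /v2/context-pack.
# Field names and constraints are taken from the parameter list above;
# this helper is a sketch, not part of the official client library.

VALID_PROFILES = {"low_latency", "balanced", "high_recall"}

def build_payload(query, max_tokens, workspace_id, profile="balanced", **optional):
    """Assemble and sanity-check a /v2/context-pack request body."""
    if not (1 <= max_tokens <= 128_000):
        raise ValueError("max_tokens must be between 1 and 128,000")
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    payload = {
        "query": query,
        "max_tokens": max_tokens,
        "workspace_id": workspace_id,
        "profile": profile,
    }
    # Optional scoping fields: actor_id, session_id, policy.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload

payload = build_payload(
    "What data retention policies apply to EU customers?",
    max_tokens=8192,
    workspace_id="ws_legal",
    profile="high_recall",
)
```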

Response

snippets
object[]
The retrieved evidence spans, ordered by relevance.
pack_hash
string
SHA-256 fingerprint of this pack. Identical inputs always produce the same hash.
serving_profile
string
The profile that was actually used. May differ from your request if the system escalated (e.g., balanced → high_recall) because initial results were sparse.
token_accounting
object
Token budget and usage for this pack (see the `budget`, `used`, and `snippets` fields in the response example below).
citation_count
integer
Number of distinct source artifacts cited.
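
Because `pack_hash` is deterministic, comparing hashes is enough to tell whether the retrieved evidence changed between two runs of the same query. A small illustrative helper; the hand-written packs below stand in for real API responses.

```python
# pack_hash is deterministic: identical inputs always produce the same
# fingerprint, so comparing hashes detects changes in retrieved evidence.
# The packs below are hand-written stand-ins for real API responses.

def evidence_changed(previous_pack, current_pack):
    """Return True when two packs were built from different evidence."""
    return previous_pack["pack_hash"] != current_pack["pack_hash"]

run_1 = {"pack_hash": "sha256:aaa...", "citation_count": 2}
run_2 = {"pack_hash": "sha256:aaa...", "citation_count": 2}  # same inputs
run_3 = {"pack_hash": "sha256:bbb...", "citation_count": 3}  # evidence changed

print(evidence_changed(run_1, run_2))  # False: identical inputs, same hash
print(evidence_changed(run_1, run_3))  # True: underlying evidence differs
```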

Examples

from nined.memory import MemoryClientV2

client = MemoryClientV2(
    base_url="https://api.9dlabs.xyz",
    api_key="your-key",
    workspace_id="ws_legal",
)

pack = client.context_pack(
    "What data retention policies apply to EU customers?",
    max_tokens=8192,
    profile="high_recall",
)

print(f"Profile used: {pack['serving_profile']}")
print(f"Evidence spans: {len(pack['snippets'])}")
print(f"Tokens: {pack['token_accounting']['used']} / {pack['token_accounting']['budget']}")

for snippet in pack["snippets"]:
    print(f"[{snippet['score']:.2f}] {snippet['content'][:120]}...")
Response
{
  "snippets": [
    {
      "content": "EU customer data must be retained for no more than 90 days unless explicitly consented...",
      "artifact_id": "a1b2c3d4-...",
      "span_id": "s1e2f3g4-...",
      "score": 0.96,
      "artifact_type": "document"
    },
    {
      "content": "GDPR Article 5(1)(e): storage limitation principle applies to all personal data...",
      "artifact_id": "a5b6c7d8-...",
      "span_id": "s9a0b1c2-...",
      "score": 0.91,
      "artifact_type": "document"
    }
  ],
  "pack_hash": "sha256:...",
  "serving_profile": "high_recall",
  "token_accounting": {
    "budget": 8192,
    "used": 1847,
    "snippets": 2
  },
  "citation_count": 2
}
The serving_profile in the response can differ from what you requested. If the system detects sparse evidence with balanced, it may automatically escalate to high_recall. Always read this field if you need to know the actual strategy used.
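
A minimal sketch of checking for escalation after a call; the `was_escalated` helper is illustrative, and `requested` would be whatever profile you passed in the request.

```python
# serving_profile in the response may differ from the requested profile
# when the system escalates (e.g., balanced -> high_recall) on sparse
# results. This illustrative helper flags that case; the pack dict below
# is a hand-written stand-in for a real response.

def was_escalated(requested, pack):
    """Return True when the system used a different profile than requested."""
    return pack["serving_profile"] != requested

pack = {"serving_profile": "high_recall", "pack_hash": "sha256:..."}

print(was_escalated("balanced", pack))     # True: system escalated
print(was_escalated("high_recall", pack))  # False: request honored as-is
```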