V2 endpoint — requires Pro, Team, or Enterprise plan.
low_latency for real-time interactions, balanced for general use, or high_recall for compliance and research workloads that need every relevant piece of evidence.
Request
The question or topic to retrieve evidence for.
Token budget (1 – 128,000). The output never exceeds this. V2’s adaptive budget may use less than the limit but will not exceed it.
Scope retrieval to this workspace.
Scope retrieval to an actor’s permissions.
Serving profile. Controls the retrieval strategy and budget behavior.
| Value | When to use |
|---|---|
low_latency | Real-time chat, quick lookups. Fastest response, semantic search off. |
balanced | General-purpose. Semantic search on, adaptive budget up to 1.5×. |
high_recall | Compliance, legal, research. Maximum evidence, adaptive budget up to 4×. |
Per-request access control. Same as V1.
Scope retrieval to a session.
Response
The retrieved evidence spans, ordered by relevance.
SHA-256 fingerprint of this pack. Identical inputs always produce the same hash.
The profile that was actually used. May differ from your request if the system escalated (e.g.,
balanced → high_recall) because initial results were sparse.Number of distinct source artifacts cited.
Examples
Response