LLM
RAG context packer API
Greedily packs the highest-scoring retrieval chunks under a token budget, in relevance order and counting separator tokens against the budget. Returns the assembled context string, the chunks that were included and excluded, and the budget utilization. This is the core assembly step of a RAG pipeline. It answers questions such as 'Which retrieved chunks fit in my context budget?' and 'How do I assemble the best context for RAG?'.
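The packing step described above can be sketched locally as follows. This is a minimal illustration, not the service's implementation: `count_tokens` is a hypothetical whitespace word counter standing in for whatever tokenizer the API uses, and the choice to skip a chunk that does not fit and keep trying lower-scored ones is an assumption.

```python
from typing import Callable, Dict, List


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def pack_chunks(
    chunks: List[Dict],
    budget: float,
    separator: str = "\n\n---\n\n",
    per_chunk_overhead: float = 0,
    token_counter: Callable[[str], int] = count_tokens,
) -> Dict:
    """Greedily pack chunks by descending score without exceeding the budget."""
    sep_cost = token_counter(separator)
    packed: List[Dict] = []
    excluded: List[Dict] = []
    used = 0

    # Highest score first; sorted() is stable, so ties keep input order.
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        tokens = token_counter(chunk["text"])
        cost = tokens + per_chunk_overhead
        if packed:
            # Every chunk after the first also pays for one separator.
            cost += sep_cost
        entry = {**chunk, "tokens": tokens}
        if used + cost <= budget:
            packed.append(entry)
            used += cost
        else:
            # Assumption: skip this chunk and keep trying smaller ones.
            excluded.append(entry)

    return {
        "budget": budget,
        "packedTokens": used,
        "utilizationPct": round(100 * used / budget, 1) if budget > 0 else 0.0,
        "includedCount": len(packed),
        "excludedCount": len(excluded),
        "packed": packed,
        "excluded": excluded,
        "context": separator.join(c["text"] for c in packed),
    }
```

Because the stand-in tokenizer counts words, the token totals it produces will generally differ from those the service reports for the same chunks.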
Price: $0.04 per request
Method: POST
Route: /v1/llm/rag-pack
Status: Live
MIME type: application/json
Rate limit: 120/minute
Cache: 0s public
Tags: llm, rag, context, retrieval, pack, tokens, budget, agent
API URL: https://x402.hexl.dev/v1/llm/rag-pack

Example request
{
  "chunks": [
    {
      "id": "c1",
      "text": "alpha beta gamma",
      "score": 0.9
    },
    {
      "id": "c2",
      "text": "delta epsilon",
      "score": 0.3
    },
    {
      "id": "c3",
      "text": "zeta",
      "score": 0.95
    }
  ],
  "budget": 12
}

Example response
{
  "budget": 12,
  "packedTokens": 11,
  "utilizationPct": 91.7,
  "includedCount": 3,
  "excludedCount": 0,
  "packed": [
    {
      "id": "c3",
      "text": "zeta",
      "tokens": 2,
      "score": 0.95
    },
    {
      "id": "c1",
      "text": "alpha beta gamma",
      "tokens": 3,
      "score": 0.9
    },
    {
      "id": "c2",
      "text": "delta epsilon",
      "tokens": 2,
      "score": 0.3
    }
  ],
  "excluded": [],
  "context": "zeta\n\n---\n\nalpha beta gamma\n\n---\n\ndelta epsilon"
}

Input schema
{
  "type": "object",
  "required": [
    "chunks",
    "budget"
  ],
  "properties": {
    "chunks": {
      "type": "array",
      "items": {
        "type": "object"
      }
    },
    "budget": {
      "type": "number",
      "examples": [
        12
      ]
    },
    "separator": {
      "type": "string",
      "default": "\n\n---\n\n"
    },
    "perChunkOverhead": {
      "type": "number",
      "default": 0
    }
  }
}

Output schema
{
  "type": "object",
  "additionalProperties": true
}
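
Since the output schema is open-ended, a client may want to build the request body from the input schema and sanity-check whatever comes back. The sketch below shows one way to do that with the Python standard library. `build_request` and `check_response` are hypothetical helper names, and the invariants checked are inferred from the example response above rather than guaranteed by the schema. The request is only constructed, not sent: the listing shows a per-request price, so an unpaid call may be rejected, and payment handling is not covered here.

```python
import json
import urllib.request
from typing import Dict, List, Optional

ENDPOINT = "https://x402.hexl.dev/v1/llm/rag-pack"


def build_request(
    chunks: List[Dict],
    budget: float,
    separator: Optional[str] = None,
    per_chunk_overhead: Optional[float] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the input schema."""
    payload: Dict = {"chunks": chunks, "budget": budget}
    # Optional fields are omitted so the service applies its own defaults.
    if separator is not None:
        payload["separator"] = separator
    if per_chunk_overhead is not None:
        payload["perChunkOverhead"] = per_chunk_overhead
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_response(resp: Dict, separator: str = "\n\n---\n\n") -> List[str]:
    """Return a list of inconsistencies; an empty list means none were found.

    The checks are assumptions drawn from the example response.
    """
    problems: List[str] = []
    if resp["includedCount"] != len(resp["packed"]):
        problems.append("includedCount does not match packed length")
    if resp["excludedCount"] != len(resp["excluded"]):
        problems.append("excludedCount does not match excluded length")
    if resp["packedTokens"] > resp["budget"]:
        problems.append("packedTokens exceeds budget")
    if resp["budget"] > 0:
        expected_pct = round(100 * resp["packedTokens"] / resp["budget"], 1)
        if abs(resp["utilizationPct"] - expected_pct) > 0.1:
            problems.append("utilizationPct is not packedTokens / budget")
    scores = [c["score"] for c in resp["packed"]]
    if scores != sorted(scores, reverse=True):
        problems.append("packed chunks are not in descending score order")
    if resp["context"] != separator.join(c["text"] for c in resp["packed"]):
        problems.append("context is not the packed texts joined by separator")
    return problems

# Sending would look like urllib.request.urlopen(build_request(...)),
# followed by json.load() on the response body.
```

Applied to the example response above, `check_response` finds no inconsistencies: 11 of 12 budget tokens gives 91.7%, and the three packed texts joined by the default separator reproduce `context`.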