LLM

Truncate chat to fit budget API

Truncates a message list to fit a token budget using one of three strategies: drop-middle, sliding-window (keep latest), or keep-ends. A leading system message is always pinned. The response returns the kept messages in order, the tokens used, and the indices that were dropped. Answers questions such as 'how do I trim a long conversation to fit the window?' and 'which messages should I drop to stay under budget?'.
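To make the sliding-window strategy concrete, here is a minimal local sketch. It is not the service's implementation: the `estimate_tokens` helper (roughly four characters per token plus a fixed per-message overhead) and the `sliding_window` function are hypothetical names introduced for illustration, and the rule of stopping at the first message that does not fit is an assumption.

```python
import math


def estimate_tokens(text, per_message_overhead=4):
    # Hypothetical estimator (an assumption, not the service's tokenizer):
    # about 4 characters per token, plus a fixed per-message overhead.
    return math.ceil(len(text) / 4) + per_message_overhead


def sliding_window(messages, budget, per_message_overhead=4):
    """Keep the newest messages that fit the budget, pinning a leading system message."""
    costs = [estimate_tokens(m["content"], per_message_overhead) for m in messages]
    kept = set()
    used = 0
    start = 0

    # A leading system message is always kept, even if it alone exceeds the budget.
    if messages and messages[0]["role"] == "system":
        kept.add(0)
        used += costs[0]
        start = 1

    # Walk backwards from the newest message and stop at the first one
    # that would push the total over the budget (assumed behaviour).
    for i in range(len(messages) - 1, start - 1, -1):
        if used + costs[i] > budget:
            break
        kept.add(i)
        used += costs[i]

    kept_indices = sorted(kept)
    dropped = [i for i in range(len(messages)) if i not in kept]
    return {
        "kept": [messages[i] for i in kept_indices],  # original order preserved
        "keptTokens": used,
        "dropped": dropped,
    }
```

Token counts from this estimator will generally not match the counts the service reports, so the same input can produce a different kept set locally than it does through the API.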

Price: $0.03 per request
Method: POST
Route: /v1/llm/truncate-context
Status: Live
MIME type: application/json
Rate limit: 120/minute
Cache: 0s public
Tags: llm, context, truncate, tokens, sliding-window, chat-history, budget, agent
API URL: https://x402.hexl.dev/v1/llm/truncate-context
Integration docs
Example request
{
  "messages": [
    {
      "role": "system",
      "content": "sys"
    },
    {
      "role": "user",
      "content": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    },
    {
      "role": "assistant",
      "content": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
    },
    {
      "role": "user",
      "content": "latest question"
    }
  ],
  "budget": 40,
  "strategy": "sliding-window"
}
Example response
{
  "strategy": "sliding-window",
  "budget": 40,
  "keptTokens": 34,
  "droppedCount": 0,
  "kept": [
    {
      "index": 0,
      "role": "system",
      "content": "sys",
      "tokens": 5
    },
    {
      "index": 1,
      "role": "user",
      "content": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "tokens": 9
    },
    {
      "index": 2,
      "role": "assistant",
      "content": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
      "tokens": 14
    },
    {
      "index": 3,
      "role": "user",
      "content": "latest question",
      "tokens": 6
    }
  ],
  "dropped": []
}
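A response shaped like the example above can be turned back into a plain message list for the next model call. The sketch below assumes the field names shown in the example response (`kept`, `index`, `role`, `content`, `tokens`, `keptTokens`); the output schema itself is open, so these are not guaranteed. The helper names `messages_from_response` and `tokens_match` are hypothetical.

```python
def messages_from_response(response):
    """Rebuild a role/content message list from the kept entries, in index order."""
    ordered = sorted(response["kept"], key=lambda m: m["index"])
    return [{"role": m["role"], "content": m["content"]} for m in ordered]


def tokens_match(response):
    """Check that the per-message token counts add up to the reported total."""
    return sum(m["tokens"] for m in response["kept"]) == response["keptTokens"]
```

For the example response, the per-message counts 5, 9, 14, and 6 sum to the reported `keptTokens` of 34, so `tokens_match` returns `True`.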
Input schema
{
  "type": "object",
  "required": [
    "messages",
    "budget"
  ],
  "properties": {
    "messages": {
      "type": "array",
      "items": {
        "type": "object"
      }
    },
    "budget": {
      "type": "number",
      "examples": [
        40
      ]
    },
    "strategy": {
      "type": "string",
      "enum": [
        "drop-middle",
        "sliding-window",
        "keep-ends"
      ],
      "default": "sliding-window"
    },
    "perMessageOverhead": {
      "type": "number",
      "default": 4
    }
  }
}
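A request body can be checked against this schema on the client before it is sent. The following is a minimal sketch using only the standard library; `build_request` and `ALLOWED_STRATEGIES` are hypothetical names, and the defaults mirror the schema above. Sending the body (a POST with an application/json payload to the API URL) and any payment handling are left out.

```python
import json

# Strategy names copied from the schema's enum.
ALLOWED_STRATEGIES = ("drop-middle", "sliding-window", "keep-ends")


def build_request(messages, budget, strategy="sliding-window", per_message_overhead=4):
    """Validate the inputs against the input schema and return the request body."""
    if not isinstance(messages, list) or not all(isinstance(m, dict) for m in messages):
        raise ValueError("messages must be a list of objects")
    # bool is a subclass of int in Python, so reject it explicitly.
    for name, value in (("budget", budget), ("perMessageOverhead", per_message_overhead)):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"strategy must be one of {ALLOWED_STRATEGIES}")
    return {
        "messages": messages,
        "budget": budget,
        "strategy": strategy,
        "perMessageOverhead": per_message_overhead,
    }


def to_json(body):
    """Serialize the request body for an application/json POST."""
    return json.dumps(body)
```

Only `messages` and `budget` are required by the schema; the sketch always includes `strategy` and `perMessageOverhead` so the request states its defaults explicitly.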
Output schema
{
  "type": "object",
  "additionalProperties": true
}