Token-aware text chunking (RAG) API

Splits text into chunks of at most N tokens, with optional overlap between consecutive chunks, using the GPT tokenizer. This is the everyday RAG primitive agents need before embedding and retrieval. Chunking by character count is unreliable because models measure input in tokens, not characters. Typical requests it answers: 'chunk this document for RAG', 'split this into 500-token pieces', 'token-aware chunking with overlap'.

Price: $0.01 per request

Method: POST

Route: /v1/util/chunk

Status: Live

MIME type: application/json

Rate limit: 120/minute

Cache: No cache

Tags: util, chunk, rag, tokenizer, tokens, embedding, split, nlp

API URL: https://x402.hexl.dev/v1/util/chunk

Integration docs

Example request

{
  "text": "A long document to split…",
  "maxTokens": 256,
  "overlap": 20
}
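
As a rough sketch of how a client might send the request above, the following Python builds the POST using only the standard library. The helper names `build_chunk_request` and `post_chunk` are hypothetical; the URL, method, and MIME type are taken from this page. How the per-request price is settled is not described here, so payment handling is not shown and an unpaid call may simply be rejected.

```python
import json
import urllib.request

API_URL = "https://x402.hexl.dev/v1/util/chunk"  # taken from this page


def build_chunk_request(text, max_tokens=500, overlap=0):
    """Build (but do not send) a POST request for the chunking route.

    The defaults mirror the input schema: maxTokens=500, overlap=0.
    """
    body = json.dumps({"text": text, "maxTokens": max_tokens, "overlap": overlap})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_chunk(request, timeout=30):
    """Send the request and decode the JSON reply.

    Assumption: payment is already arranged. A non-2xx reply makes
    urlopen raise urllib.error.HTTPError, which is left to the caller.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Build the example request from this page without sending it.
req = build_chunk_request("A long document to split…", max_tokens=256, overlap=20)
```

Keeping construction separate from sending makes the payload easy to inspect or log before any paid call is made.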

Example response

{
  "totalTokens": 1024,
  "maxTokens": 256,
  "overlap": 20,
  "chunkCount": 5,
  "chunks": [
    {
      "index": 0,
      "tokens": 256,
      "text": "A long document to split…"
    }
  ]
}
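
The counts in the example response are consistent with a simple sliding window over token IDs, where each new chunk starts `maxTokens - overlap` tokens after the previous one. The sketch below illustrates that idea on a plain list of integers standing in for token IDs. It is an assumption about the windowing, not the service's documented algorithm; the function name `chunk_token_ids` and the `start` field are hypothetical, and no real tokenizer is involved.

```python
def chunk_token_ids(token_ids, max_tokens=500, overlap=0):
    """Split a list of token IDs into windows of at most max_tokens.

    Consecutive windows share `overlap` tokens. Assumed behaviour only;
    the hosted endpoint may treat edge cases differently.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    stride = max_tokens - overlap  # distance between chunk starts
    chunks = []
    start = 0
    while start < len(token_ids):
        window = token_ids[start:start + max_tokens]
        chunks.append({"index": len(chunks), "tokens": len(window), "start": start})
        if start + max_tokens >= len(token_ids):
            break  # this window already reaches the end of the input
        start += stride
    return chunks


# 1024 stand-in token IDs with the settings from the example request.
chunks = chunk_token_ids(list(range(1024)), max_tokens=256, overlap=20)
print(len(chunks))                     # 5
print([c["tokens"] for c in chunks])   # [256, 256, 256, 256, 80]
```

With 1024 tokens, `maxTokens` 256, and `overlap` 20, the stride is 236, so windows start at 0, 236, 472, 708, and 944, which gives the `chunkCount` of 5 shown in the example response.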

Input schema

{
  "type": "object",
  "required": [
    "text"
  ],
  "properties": {
    "text": {
      "type": "string"
    },
    "maxTokens": {
      "type": "number",
      "default": 500
    },
    "overlap": {
      "type": "number",
      "default": 0
    }
  }
}
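
A client may want to check a payload against this schema before paying for a request. The following is a minimal sketch of such a check, written by hand rather than with a JSON Schema library; `normalize_payload` is a hypothetical helper, and it enforces only what the schema states (a required string `text`, numeric `maxTokens` and `overlap`, and their defaults).

```python
def normalize_payload(payload):
    """Validate a request payload and fill in the schema defaults."""
    if not isinstance(payload, dict):
        raise TypeError("payload must be an object")
    if "text" not in payload:
        raise ValueError("'text' is required")
    if not isinstance(payload["text"], str):
        raise TypeError("'text' must be a string")

    result = {"text": payload["text"]}
    # Defaults come from the input schema: maxTokens=500, overlap=0.
    for key, default in (("maxTokens", 500), ("overlap", 0)):
        value = payload.get(key, default)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"'{key}' must be a number")
        result[key] = value
    return result


normalized = normalize_payload({"text": "A long document to split…"})
print(normalized["maxTokens"], normalized["overlap"])  # 500 0
```

The schema does not state bounds such as `overlap` being smaller than `maxTokens`, so the sketch does not enforce any.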

Output schema

{
  "type": "object",
  "additionalProperties": true
}