Retrieval

TF-IDF vectorize corpus API

Builds a TF-IDF model from a corpus: a sorted vocabulary, smoothed IDF weights, and per-document TF-IDF vectors, with optional sublinear TF scaling. Answers questions such as 'How do I TF-IDF vectorize these documents?' and 'What are the IDF weights for this corpus?'.

Price: $0.02 per request
Method: POST
Route: /v1/retrieval/tfidf-vectorize
Status: Live
MIME type: application/json
Rate limit: 120/minute
Cache: 0s public
Tags: tfidf, tf-idf, vectorize, lexical, bag-of-words, corpus, retrieval, rag
API URL: https://x402.hexl.dev/v1/retrieval/tfidf-vectorize
Integration docs
Example request
{
  "corpus": [
    "the cat sat",
    "the dog ran"
  ]
}
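The example request above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library; the helper name `build_request` is hypothetical, and the request is built but not sent. Payment for the per-request price is not covered here, because this page does not describe how it is handled.

```python
import json
import urllib.request

# URL, method, and MIME type are taken from the listing above.
API_URL = "https://x402.hexl.dev/v1/retrieval/tfidf-vectorize"


def build_request(corpus, sublinear_tf=None):
    """Build (but do not send) a POST request for the vectorize route.

    `build_request` is a hypothetical helper, not part of any official client.
    """
    payload = {"corpus": list(corpus)}
    # sublinearTf is optional in the input schema, so only send it when set.
    if sublinear_tf is not None:
        payload["sublinearTf"] = bool(sublinear_tf)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(["the cat sat", "the dog ran"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; any payment or authentication headers the service expects would have to be added first.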
Example response
{
  "vocabulary": [
    "cat",
    "dog",
    "ran",
    "sat",
    "the"
  ],
  "documentCount": 2,
  "idf": {
    "cat": 1.40546511,
    "dog": 1.40546511,
    "ran": 1.40546511,
    "sat": 1.40546511,
    "the": 1
  },
  "vectors": [
    [
      0.46848837,
      0,
      0,
      0.46848837,
      0.33333333
    ],
    [
      0,
      0.46848837,
      0.46848837,
      0,
      0.33333333
    ]
  ],
  "vocabularySize": 5
}
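The numbers in the example response are consistent with a smoothed IDF of ln((1 + N) / (1 + df)) + 1 and a term frequency of count divided by document length, with no vector normalization. The sketch below reproduces them locally under those assumptions. The formulas are inferred from the example values, not confirmed by this page; the lowercase whitespace tokenizer and the function name `tfidf_vectorize` are also assumptions. The `sublinearTf` option is left out because the page does not state how it is applied.

```python
import math
from collections import Counter


def tfidf_vectorize(corpus):
    """Local sketch of the response shape; formulas inferred from the example."""
    # Assumed tokenizer: lowercase, split on whitespace.
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)

    # Sorted vocabulary, as in the "vocabulary" field.
    vocabulary = sorted({term for doc in docs for term in doc})

    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))

    # Smoothed IDF (assumed): ln((1 + N) / (1 + df)) + 1.
    raw_idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocabulary}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        row = []
        for term in vocabulary:
            # Assumed tf: term count divided by document length.
            tf = counts[term] / len(doc) if doc else 0.0
            row.append(round(tf * raw_idf[term], 8))
        vectors.append(row)

    return {
        "vocabulary": vocabulary,
        "documentCount": n,
        "idf": {t: round(v, 8) for t, v in raw_idf.items()},
        "vectors": vectors,
        "vocabularySize": len(vocabulary),
    }


result = tfidf_vectorize(["the cat sat", "the dog ran"])
print(result["idf"])
print(result["vectors"])
```

For the two-document corpus above, "the" appears in both documents, so its IDF is ln(3/3) + 1 = 1, while every other term has ln(3/2) + 1 ≈ 1.40546511. Each document has three tokens, so each present term has tf = 1/3.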
Input schema
{
  "type": "object",
  "required": [
    "corpus"
  ],
  "properties": {
    "corpus": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "sublinearTf": {
      "type": "boolean"
    }
  }
}
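A client may want to check a payload against the input schema before paying for a request. The following is a hand-written sketch of the three rules the schema states (`corpus` is required, it is an array of strings, and `sublinearTf` is a boolean if present). The function name `validate_input` is hypothetical, and the service's own validation may behave differently.

```python
def validate_input(payload):
    """Return a list of problems; an empty list means the payload fits the schema."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    errors = []

    # "corpus" is the only required property.
    if "corpus" not in payload:
        errors.append("missing required property: corpus")
    else:
        corpus = payload["corpus"]
        if not isinstance(corpus, list):
            errors.append("corpus must be an array")
        elif not all(isinstance(item, str) for item in corpus):
            errors.append("corpus items must be strings")

    # "sublinearTf" is optional, but must be a boolean when present.
    if "sublinearTf" in payload and not isinstance(payload["sublinearTf"], bool):
        errors.append("sublinearTf must be a boolean")

    return errors


print(validate_input({"corpus": ["the cat sat", "the dog ran"]}))
print(validate_input({"corpus": "the cat sat", "sublinearTf": "yes"}))
```

The schema sets no limits on corpus size or string length, so this check does not enforce any either.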
Output schema
{
  "type": "object",
  "additionalProperties": true
}