Retrieval
Dedupe vectors by cosine API
Greedily removes near-duplicate vectors whose cosine similarity exceeds a threshold, keeping the first occurrence, and reports the kept ids along with the item each dropped vector duplicated. Answers questions such as 'How do I remove duplicate chunks from retrieval?' and 'Which vectors are near-duplicates?'.
Price: $0.02 per request
Method: POST
Route: /v1/retrieval/dedupe-by-cosine
Status: Live
MIME type: application/json
Rate limit: 120/minute
Cache: 0s public
Tags: dedupe, deduplication, cosine, near-duplicate, retrieval, filter, vector, rag
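
The greedy keep-first pass described above can be approximated locally. The following is a minimal sketch, not the service's actual implementation: the function names `cosine` and `dedupe_by_cosine`, the handling of zero-norm vectors, the default ids, and the use of `>=` rather than a strict `>` comparison are all assumptions. The returned field names mirror the example response on this page.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a zero-norm vector is treated as similar to nothing.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe_by_cosine(
    vectors: Sequence[Sequence[float]],
    threshold: float,
    ids: Optional[Sequence[str]] = None,
) -> dict:
    """Greedy keep-first dedupe: a vector is dropped if it matches a kept one."""
    # Assumption: when ids are omitted, the index (as a string) is the id.
    labels = list(ids) if ids is not None else [str(i) for i in range(len(vectors))]
    kept = []      # indices of vectors retained so far, in input order
    dropped = []   # one record per removed vector

    for i, vec in enumerate(vectors):
        match = None
        for k in kept:
            sim = cosine(vec, vectors[k])
            # Assumption: ">=" is used here; the service may compare strictly.
            if sim >= threshold:
                match = (k, sim)
                break
        if match is None:
            kept.append(i)
        else:
            dropped.append({
                "index": i,
                "id": labels[i],
                "duplicateOf": labels[match[0]],
                "similarity": match[1],
            })

    return {
        "threshold": threshold,
        "keptIndices": kept,
        "keptIds": [labels[k] for k in kept],
        "dropped": dropped,
        "keptCount": len(kept),
        "droppedCount": len(dropped),
        "totalInput": len(vectors),
    }


result = dedupe_by_cosine([[1, 0], [1, 0], [0, 1]], 0.99, ["d1", "d2", "d3"])
print(result["keptIds"])  # ['d1', 'd3']
```

Because each incoming vector is compared only against vectors already kept, the result depends on input order: the earliest member of a group of near-duplicates is the one that survives.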
API URL: https://x402.hexl.dev/v1/retrieval/dedupe-by-cosine

Example request
{
  "vectors": [
    [1, 0],
    [1, 0],
    [0, 1]
  ],
  "threshold": 0.99,
  "ids": ["d1", "d2", "d3"]
}

Example response
{
  "threshold": 0.99,
  "keptIndices": [0, 2],
  "keptIds": ["d1", "d3"],
  "dropped": [
    {
      "index": 1,
      "id": "d2",
      "duplicateOf": "d1",
      "similarity": 1
    }
  ],
  "keptCount": 2,
  "droppedCount": 1,
  "totalInput": 3
}

Input schema
{
  "type": "object",
  "required": ["vectors"],
  "properties": {
    "vectors": {
      "type": "array",
      "items": {
        "type": "array",
        "items": {
          "type": "number"
        }
      }
    },
    "threshold": {
      "type": "number"
    },
    "ids": {
      "type": "array"
    }
  }
}

Output schema
{
  "type": "object",
  "additionalProperties": true
}
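
Since the output schema is permissive, a client may want to build the request body from the input schema and read the response defensively. The sketch below only constructs the POST request and summarises a response shaped like the example on this page; it does not send anything, because this page lists a per-request price but does not describe how payment or authentication is handled. The helper names `build_dedupe_request` and `duplicate_map` are hypothetical, and the check that `ids` matches `vectors` in length is an assumption rather than a documented rule.

```python
import json
import urllib.request
from typing import Optional, Sequence

ENDPOINT = "https://x402.hexl.dev/v1/retrieval/dedupe-by-cosine"


def build_dedupe_request(
    vectors: Sequence[Sequence[float]],
    threshold: Optional[float] = None,
    ids: Optional[Sequence[str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the dedupe route."""
    payload = {"vectors": [list(v) for v in vectors]}  # the only required field
    if threshold is not None:
        payload["threshold"] = threshold
    if ids is not None:
        # Assumption: one id per vector; the schema itself does not say so.
        if len(ids) != len(vectors):
            raise ValueError("ids and vectors must have the same length")
        payload["ids"] = list(ids)
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def duplicate_map(response: dict) -> dict:
    """Map each dropped id to the kept id it duplicated."""
    # .get() is used because the output schema allows arbitrary properties.
    return {d["id"]: d["duplicateOf"] for d in response.get("dropped", [])}


request = build_dedupe_request([[1, 0], [1, 0], [0, 1]], 0.99, ["d1", "d2", "d3"])
example_response = {
    "keptIds": ["d1", "d3"],
    "dropped": [{"index": 1, "id": "d2", "duplicateOf": "d1", "similarity": 1}],
}
print(duplicate_map(example_response))  # {'d2': 'd1'}
```

Keeping request construction separate from sending makes it straightforward to plug in whatever HTTP client and payment handling the integration requires.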