Retrieval
Shingle Jaccard near-dup API
Computes the Jaccard similarity between two texts over their character or word n-gram shingles, for near-duplicate detection. It answers questions such as 'How similar are these two texts by shingling?' and 'Are these documents near-duplicates?'.
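To make the description concrete, here is a minimal local sketch of shingle-based Jaccard similarity. It is not the service's implementation: the helper names `shingles` and `jaccard` are hypothetical, and whitespace tokenization in word mode, the absence of case folding, and the handling of texts shorter than the shingle size are all assumptions, since the listing does not document them.

```python
def shingles(text, size=3, mode="word"):
    """Return the set of n-gram shingles of `text`.

    Assumption: word mode splits on whitespace; char mode uses raw characters.
    """
    units = text.split() if mode == "word" else list(text)
    # Assumption: a text shorter than `size` yields no shingles.
    return {tuple(units[i:i + size]) for i in range(len(units) - size + 1)}


def jaccard(a, b, size=3, mode="word"):
    """Jaccard similarity |A ∩ B| / |A ∪ B| over the two shingle sets."""
    sa, sb = shingles(a, size, mode), shingles(b, size, mode)
    union = sa | sb
    if not union:
        return 0.0  # Assumption: two empty shingle sets score 0.
    return len(sa & sb) / len(union)


if __name__ == "__main__":
    # The two texts share no word 3-grams, so the similarity is 0.
    print(jaccard("the quick brown fox", "the quick red fox", 3, "word"))
```

Under these assumptions, "the quick brown fox" and "the quick red fox" each produce 2 word shingles of size 3 with none in common, which gives an intersection of 0, a union of 4, and a similarity of 0.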
Price: $0.01 per request
Method: POST
Route: /v1/retrieval/shingle-jaccard
Status: Live
MIME type: application/json
Rate limit: 120/minute
Cache: 0s public
Tags: shingle, jaccard, ngram, near-duplicate, dedup, text, similarity, rag
API URL: https://x402.hexl.dev/v1/retrieval/shingle-jaccard
Example request
{
  "a": "the quick brown fox",
  "b": "the quick red fox",
  "shingleSize": 3,
  "mode": "word"
}
Example response
{
  "jaccardSimilarity": 0,
  "mode": "word",
  "shingleSize": 3,
  "shinglesA": 2,
  "shinglesB": 2,
  "intersection": 0,
  "union": 4,
  "isNearDuplicate": false
}
Input schema
{
  "type": "object",
  "required": [
    "a",
    "b"
  ],
  "properties": {
    "a": {
      "type": "string"
    },
    "b": {
      "type": "string"
    },
    "shingleSize": {
      "type": "integer"
    },
    "mode": {
      "type": "string",
      "enum": [
        "char",
        "word"
      ]
    }
  }
}
Output schema
{
  "type": "object",
  "additionalProperties": true
}
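
As a rough illustration of how a client might prepare a call, the sketch below validates a payload against the input schema and builds a POST request with Python's standard library. The helper names `build_payload` and `build_request` are hypothetical. The request is only constructed here, not sent; since the route is priced per request, the service may require a payment step that this listing does not describe and this sketch does not handle.

```python
import json
import urllib.request

API_URL = "https://x402.hexl.dev/v1/retrieval/shingle-jaccard"


def build_payload(a, b, shingle_size=None, mode=None):
    """Build a request body that satisfies the input schema."""
    if not isinstance(a, str) or not isinstance(b, str):
        raise TypeError("'a' and 'b' are required strings")
    payload = {"a": a, "b": b}
    if shingle_size is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(shingle_size, bool) or not isinstance(shingle_size, int):
            raise TypeError("'shingleSize' must be an integer")
        payload["shingleSize"] = shingle_size
    if mode is not None:
        if mode not in ("char", "word"):
            raise ValueError("'mode' must be 'char' or 'word'")
        payload["mode"] = mode
    return payload


def build_request(payload):
    """Wrap the payload in a JSON POST request (constructed, not sent)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        build_payload("the quick brown fox", "the quick red fox", 3, "word")
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Because the output schema only declares an object with additional properties allowed, a client should read response fields such as `jaccardSimilarity` defensively rather than assume they are always present.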