Unicode-aware sentence/word/grapheme segmentation API

Segments text into sentences, words, and extended grapheme clusters using the Unicode-aware Intl.Segmenter, so an emoji or a base character with its combining marks counts as one grapheme. Returns counts at each level plus average words per sentence, average characters per word, code-point and UTF-16 lengths, and estimated reading and speaking time. Segmentation follows Unicode rules rather than a naive split on spaces and periods. Answers 'count the sentences in this', 'how many words and characters', 'segment this text Unicode-aware', 'reading time of this passage'.
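As a rough local illustration of the counts described above, the following TypeScript sketch applies Intl.Segmenter at each granularity. The function name `segmentCounts`, the default `"en"` locale, and the use of `isWordLike` to decide what counts as a word are assumptions made for this example; the service may use different rules.

```typescript
// Sketch only: approximates the documented counts with the built-in Intl.Segmenter.
// segmentCounts, the "en" default locale, and the isWordLike filter are assumptions.
function segmentCounts(text: string, locale: string = "en") {
  const sentenceSegmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  const wordSegmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const graphemeSegmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });

  // Sentence segments keep trailing whitespace, so trim and drop empty ones.
  const sentences = Array.from(sentenceSegmenter.segment(text))
    .map((s) => s.segment.trim())
    .filter((s) => s.length > 0);

  // Word granularity also yields spaces and punctuation; keep word-like segments only.
  const words = Array.from(wordSegmenter.segment(text))
    .filter((s) => s.isWordLike)
    .map((s) => s.segment);

  return {
    sentenceCount: sentences.length,
    wordCount: words.length,
    graphemeCount: Array.from(graphemeSegmenter.segment(text)).length,
    codePointCount: Array.from(text).length, // string iteration walks code points
    utf16Length: text.length,                // .length counts UTF-16 code units
    sentences,
  };
}
```

The three length fields diverge on text such as "e" followed by U+0301 and a thumbs-up emoji with a skin-tone modifier: that string has 2 graphemes, 4 code points, and 6 UTF-16 code units.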

Price: $0.01 per request

Method: POST

Route: /v1/text/sentences

Status: Live

MIME type: application/json

Rate limit: 60/minute

Cache: No cache

text, segmentation, sentences, words, graphemes, unicode, nlp, tokenize

API URL: https://x402.hexl.dev/v1/text/sentences


Example request

{
  "text": "Hello world. How are you?"
}
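A request like the one above could be sent from TypeScript roughly as follows. The URL, method, and body shape come from this listing; the helper names `buildSentencesRequest` and `fetchSentences` are hypothetical. Because the endpoint is priced per request, an unpaid call may be rejected, and this sketch does not attempt any payment step.

```typescript
// Endpoint taken from the listing; everything else here is an illustrative sketch.
const SENTENCES_URL = "https://x402.hexl.dev/v1/text/sentences";

interface SentencesRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Build the fetch arguments separately so they can be inspected without a network call.
function buildSentencesRequest(text: string): { url: string; init: SentencesRequestInit } {
  return {
    url: SENTENCES_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    },
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON body.
// Payment handling is not shown; any non-2xx status is surfaced as an error.
async function fetchSentences(text: string): Promise<unknown> {
  const { url, init } = buildSentencesRequest(text);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Keeping the request builder separate from the network call makes it easy to stay within the 60/minute rate limit in tests, since the payload can be checked without hitting the service.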

Example response

{
  "sentenceCount": 2,
  "wordCount": 5,
  "graphemeCount": 25,
  "codePointCount": 25,
  "utf16Length": 25,
  "avgWordsPerSentence": 2.5,
  "avgCharsPerWord": 3.4,
  "readingTimeSeconds": 1.5,
  "speakingTimeSeconds": 2.31,
  "sentences": [
    "Hello world.",
    "How are you?"
  ]
}

Input schema

{
  "type": "object",
  "required": [
    "text"
  ],
  "properties": {
    "text": {
      "type": "string",
      "examples": [
        "Hello world. How are you?"
      ]
    }
  }
}

Output schema

{
  "type": "object",
  "additionalProperties": true
}
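Since the output schema leaves the response shape open, a client may want to check the fields it relies on before using them. The sketch below derives an interface and a type guard from the example response only; the names `SentencesResponse` and `isSentencesResponse` are hypothetical, and the service may return additional or different fields.

```typescript
// Field list inferred from the example response, not from a published schema.
interface SentencesResponse {
  sentenceCount: number;
  wordCount: number;
  graphemeCount: number;
  codePointCount: number;
  utf16Length: number;
  avgWordsPerSentence: number;
  avgCharsPerWord: number;
  readingTimeSeconds: number;
  speakingTimeSeconds: number;
  sentences: string[];
}

const NUMERIC_FIELDS = [
  "sentenceCount",
  "wordCount",
  "graphemeCount",
  "codePointCount",
  "utf16Length",
  "avgWordsPerSentence",
  "avgCharsPerWord",
  "readingTimeSeconds",
  "speakingTimeSeconds",
] as const;

// Returns true only when every expected field is present with the expected type.
// Extra properties are tolerated, matching "additionalProperties": true.
function isSentencesResponse(value: unknown): value is SentencesResponse {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const numbersOk = NUMERIC_FIELDS.every((key) => typeof record[key] === "number");
  const sentences = record["sentences"];
  const sentencesOk =
    Array.isArray(sentences) && sentences.every((s) => typeof s === "string");
  return numbersOk && sentencesOk;
}
```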