Article / main-content extraction API

Given a URL, returns the clean title, author, published date, site name, and main text, with navigation, ads, and boilerplate stripped (the hard part). Research agents call this constantly to read pages. It answers requests such as 'extract the article', 'readable text of this URL', 'get the main content of this page', and 'who wrote this and when'.

Price: $0.01 per request

Method: POST

Route: /v1/web/article

Status: Live

MIME type: application/json

Rate limit: 30/minute

Cache: 3600s public

web, article, extraction, readability, scraping, content, main-text, boilerplate

API URL: https://x402.hexl.dev/v1/web/article
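The listing states a rate limit of 30 requests per minute but does not say how the server counts the window. A minimal client-side sketch, assuming that spacing calls evenly (one every 2 seconds) is enough to stay under the limit; the class name `MinIntervalThrottle` and its parameters are hypothetical and not part of the API.

```python
import time
from typing import Callable


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart (hypothetical helper)."""

    def __init__(
        self,
        per_minute: int = 30,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the following slot one interval after this call's start.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Calling `throttle.wait()` before each request keeps a single process under the stated limit. It does not coordinate several processes that share one quota.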


Example request

{
  "url": "https://en.wikipedia.org/wiki/Ethereum",
  "maxChars": 500
}
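The request above could be sent from Python roughly as follows. This is a sketch using only the standard library. It assumes the endpoint accepts a plain JSON POST. The listing prices each call at $0.01 but does not describe how payment or authentication is presented, so the sketch sends no such headers, and a real call may be rejected until they are added. The function names `build_request` and `fetch_article` are illustrative.

```python
import json
import urllib.request

API_URL = "https://x402.hexl.dev/v1/web/article"


def build_request(page_url: str, max_chars: int = 5000) -> urllib.request.Request:
    """Build the POST request without sending it (no payment headers included)."""
    body = json.dumps({"url": page_url, "maxChars": max_chars}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fetch_article(page_url: str, max_chars: int = 5000, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body; raises urllib.error.HTTPError on 4xx/5xx."""
    request = build_request(page_url, max_chars)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without spending a paid call.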

Example response

{
  "url": "https://en.wikipedia.org/wiki/Ethereum",
  "title": "Ethereum",
  "author": null,
  "published": null,
  "text": "Ethereum is a decentralized blockchain with smart contract functionality…",
  "truncated": true,
  "length": 48000
}
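A response like the one above can be read into a small typed record. The sketch below covers only the fields shown in the example. The `Article` dataclass and `parse_article` function are hypothetical names. Treating `length` as the size of the full extracted text before truncation is an assumption drawn from the example (48000 against a `maxChars` of 500), not something the listing states. The description mentions a site name, but the example response does not show that field, so it is left out here.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class Article:
    url: str
    title: Optional[str]
    author: Optional[str]      # null in the example when no author is found
    published: Optional[str]   # null in the example when no date is found
    text: str
    truncated: bool
    length: Optional[int]      # assumed: full text length before truncation


def parse_article(payload: Mapping[str, Any]) -> Article:
    """Map a decoded JSON response onto Article, tolerating missing optional keys."""
    return Article(
        url=payload["url"],
        title=payload.get("title"),
        author=payload.get("author"),
        published=payload.get("published"),
        text=payload.get("text", ""),
        truncated=bool(payload.get("truncated", False)),
        length=payload.get("length"),
    )
```

When `truncated` is true, a caller that needs the whole article would repeat the request with a larger `maxChars`.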

Input schema

{
  "type": "object",
  "required": [
    "url"
  ],
  "properties": {
    "url": {
      "type": "string",
      "examples": [
        "https://en.wikipedia.org/wiki/Ethereum"
      ]
    },
    "maxChars": {
      "type": "number",
      "default": 5000
    }
  }
}
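The schema above requires `url` as a string and gives `maxChars` a numeric default of 5000. A minimal sketch of checking those rules on the client before a paid request is sent; `normalize_input` is a hypothetical helper, and the server may apply further checks that the schema does not show.

```python
from typing import Any, Dict, Mapping

DEFAULT_MAX_CHARS = 5000  # default taken from the input schema


def normalize_input(params: Mapping[str, Any]) -> Dict[str, Any]:
    """Check params against the input schema and fill in the maxChars default."""
    if "url" not in params:
        raise ValueError("'url' is required")
    url = params["url"]
    if not isinstance(url, str):
        raise TypeError("'url' must be a string")
    max_chars = params.get("maxChars", DEFAULT_MAX_CHARS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_chars, bool) or not isinstance(max_chars, (int, float)):
        raise TypeError("'maxChars' must be a number")
    return {"url": url, "maxChars": max_chars}
```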

Output schema

{
  "type": "object",
  "additionalProperties": true
}