LLM
System-prompt leak detector API
Detects whether a model response is leaking its hidden system prompt by scanning for tell-tale phrases (you-are-a-helpful-assistant, my-instructions-are, knowledge-cutoff) and, when the real system prompt is supplied, computing verbatim line overlap — guarding against prompt-extraction attacks. Answers 'did the model leak its system prompt?', 'how much of my hidden instructions appear in this response?'.
Price$0.04per request
MethodPOST
Route/v1/llm/system-leak
StatusLive
MIME typeapplication/json
Rate limit120/minute
Cache0s public
llmsystem-promptleaksecurityextractionguardraildetectoragent
API URL
Integration docshttps://x402.hexl.dev/v1/llm/system-leakExample request
{
"response": "My instructions are to be helpful. You are a helpful assistant."
}Example response
{
"leaked": true,
"riskScore": 30,
"matches": [
{
"phrase": "You are a helpful assistant",
"index": 35
},
{
"phrase": "My instructions are",
"index": 0
}
],
"overlapWithSystemPrompt": null,
"verdict": "leak-detected: response appears to disclose system-prompt content — block / regenerate"
}Input schema
{
"type": "object",
"required": [
"response"
],
"properties": {
"response": {
"type": "string",
"examples": [
"My instructions are to be helpful. You are a helpful assistant."
]
},
"systemPrompt": {
"type": "string"
}
}
}Output schema
{
"type": "object",
"additionalProperties": true
}