Quickstart
Get your API key from the Console, then make your first request.
curl
curl https://api.high-snr.com/v1/optimize \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": "Your long document text goes here...",
    "max_output_tokens": 2000
  }'
Python
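import os

import requests

# Read the key from the environment rather than hard-coding it.
API_KEY = os.environ["API_KEY"]

response = requests.post(
    "https://api.high-snr.com/v1/optimize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "document": "Your long document text goes here...",
        "max_output_tokens": 2000,
    },
)
response.raise_for_status()  # surface HTTP errors before parsing the body
chunks = response.json()["selected_chunks"]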
Response
{
  "selected_chunks": [
    "Highest signal passage from your document...",
    "Second highest signal passage..."
  ],
  // present when return_metadata: true
  "metadata": {
    "input_tokens": 1840,
    "output_tokens": 1200,
    "compression_ratio": 0.6522
  },
  // present when return_chunk_metadata: true
  "chunk_metadata": {
    "selected_chunk_indices": [0, 2, 3],
    "discarded_chunks": ["Low-signal passage..."],
    "discarded_chunk_indices": [1]
  }
}
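Because metadata and chunk_metadata appear only when the matching flags are set, client code may want to read them defensively. The following is a minimal sketch, assuming the response shape shown above; the helper name summarize_response is hypothetical and not part of any HighSNR client.

```python
def summarize_response(body: dict) -> dict:
    """Collect the fields shown above, tolerating absent optional objects."""
    # Both objects are optional, so fall back to empty dicts.
    metadata = body.get("metadata") or {}
    chunk_metadata = body.get("chunk_metadata") or {}
    return {
        "chunks": body["selected_chunks"],
        "compression_ratio": metadata.get("compression_ratio"),
        "discarded_indices": chunk_metadata.get("discarded_chunk_indices", []),
    }


# Example body with metadata but without chunk_metadata.
example = {
    "selected_chunks": [
        "Highest signal passage from your document...",
        "Second highest signal passage...",
    ],
    "metadata": {
        "input_tokens": 1840,
        "output_tokens": 1200,
        "compression_ratio": 0.6522,
    },
}
summary = summarize_response(example)
```

With this approach the same parsing code works whether or not return_metadata and return_chunk_metadata were requested.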
Authentication
All API requests require a Bearer token in the Authorization header.
To get an API key:
- Sign up at console.high-snr.com
- Go to API Keys and click Create key
- Copy the key — it is shown only once
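Since the key is shown only once, one common pattern is to store it in an environment variable and build the header from there. This is a sketch under assumptions: the variable name HIGHSNR_API_KEY and the helper auth_headers are illustrative, not names the API requires.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a HighSNR request."""
    if not api_key:
        raise ValueError("API key is empty; create one in the Console first")
    return {"Authorization": f"Bearer {api_key}"}


# HIGHSNR_API_KEY is an assumed variable name, chosen for this example.
api_key = os.environ.get("HIGHSNR_API_KEY", "")
headers = auth_headers(api_key) if api_key else None
```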
Concepts
A chunk is a contiguous passage of text — typically a paragraph or a group of related sentences. When you send a document, HighSNR splits it into chunks automatically. You can also send pre-split chunks directly. The API selects and returns the highest-signal chunks that fit within your token budget, preserving their original order.
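To illustrate the two input modes, here is a minimal sketch of a request body builder. It assumes the pre-split input is sent under a chunks field as a list of strings and that a request carries exactly one of document or chunks; build_payload is a hypothetical helper.

```python
from typing import Optional, Sequence


def build_payload(
    max_output_tokens: int,
    document: Optional[str] = None,
    chunks: Optional[Sequence[str]] = None,
) -> dict:
    """Return a request body with either a document or pre-split chunks."""
    # Assumption: exactly one of document / chunks is sent per request.
    if (document is None) == (chunks is None):
        raise ValueError("pass exactly one of document or chunks")
    payload = {"max_output_tokens": max_output_tokens}
    if document is not None:
        payload["document"] = document
    else:
        payload["chunks"] = list(chunks)
    return payload


payload = build_payload(2000, chunks=["First paragraph.", "Second paragraph."])
```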
The compression ratio is output_tokens / input_tokens — the fraction of the input that was kept.
A ratio of 0.8 means 80% of the input tokens were returned (20% discarded).
Lower is more aggressive compression; higher retains more of the original.
Available when return_metadata: true.
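As a worked example, the ratio can be recomputed from the token counts in the quickstart response. Rounding to four decimal places is an assumption made to match the value shown there.

```python
def compression_ratio(input_tokens: int, output_tokens: int) -> float:
    """Fraction of input tokens kept, i.e. output_tokens / input_tokens."""
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    # Four decimal places is an assumption based on the sample response.
    return round(output_tokens / input_tokens, 4)


# Token counts taken from the quickstart response.
ratio = compression_ratio(input_tokens=1840, output_tokens=1200)
discarded_fraction = 1 - ratio  # share of input tokens that was dropped
```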
Endpoint
One endpoint: POST https://api.high-snr.com/v1/optimize. Pass a document or pre-split chunks and a token budget.
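document: the document text. Pass either document or chunks.
chunks: pre-split chunks. Pass either chunks or document.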
true.
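return_metadata: when true, the response includes the metadata object. Default: false.
return_chunk_metadata: when true, the response includes the chunk_metadata object. Default: false.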
Errors
All errors follow a standard shape:
{
"error": {
"code": "payment_required",
"message": "Token quota exhausted. Email hello@high-snr.com if you need more.",
"reset_at_utc": "2026-03-17T00:00:00Z"
}
}
A payment_required error means one of the following:
- Daily quota used up (resets at UTC midnight)
- Free trial period expired
- Total free tokens exhausted
reset_at_utc gives the time of the next quota reset. Contact hello@high-snr.com for more tokens.
document and chunks supplied).
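The error shape above can be unpacked in a few lines. This is a minimal sketch, assuming reset_at_utc is optional and uses the ISO 8601 format from the example; parse_error is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def parse_error(body: dict) -> Tuple[str, str, Optional[datetime]]:
    """Return (code, message, reset_time) from an error response body."""
    error = body["error"]
    reset_raw = error.get("reset_at_utc")  # assumed to be absent on some errors
    reset_time = None
    if reset_raw:
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        reset_time = datetime.fromisoformat(reset_raw.replace("Z", "+00:00"))
    return error["code"], error["message"], reset_time


# Body copied from the example above.
code, message, reset_time = parse_error({
    "error": {
        "code": "payment_required",
        "message": "Token quota exhausted. Email hello@high-snr.com if you need more.",
        "reset_at_utc": "2026-03-17T00:00:00Z",
    }
})
```

A caller could then sleep until reset_time before retrying, or surface the message to the user.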
Quotas
Token usage is counted on input tokens sent to the API.
Free allocation
2M tokens or 14 days, whichever comes first. 250K tokens/day. No card required.
Daily quota
Resets at UTC midnight every day.
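If you schedule work around the daily quota, the next reset time can be computed locally. The sketch below only does date arithmetic for UTC midnight and does not query the API; next_daily_reset is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Return the next UTC midnight strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


reset = next_daily_reset(datetime(2026, 3, 16, 18, 30, tzinfo=timezone.utc))
```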
API keys
Up to 2 keys. Manage keys from the Console.
Current usage and remaining balances are visible in the Console dashboard. Need more tokens? Email hello@high-snr.com.
Privacy & data retention
HighSNR stores zero document text. Request bodies are never logged, stored, or used for any purpose beyond producing the response.
Only counters and billing metadata are persisted (request count, token counts, timestamps). No content leaves the request/response cycle.
Benchmarks
HighSNR has been evaluated on LongBench v1 using GPT-4o across HotpotQA and Qasper (200 samples each). All benchmarks are fully reproducible — scripts, data, and results are published on GitHub.
HotpotQA · 90% budget · with hint
F1 71.57 vs full-doc baseline 69.71
Exceeds full-context GPT-4o
Qasper · 90% budget · with hint
F1 46.25 vs full-doc baseline 47.22
97.9% of full-context GPT-4o
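The relative figures follow directly from the F1 scores above. A short worked calculation, with relative_f1 as an illustrative helper name:

```python
def relative_f1(score: float, baseline: float) -> float:
    """Express an F1 score as a percentage of the full-document baseline."""
    return round(100 * score / baseline, 1)


hotpotqa = relative_f1(71.57, 69.71)  # above 100, i.e. exceeds the baseline
qasper = relative_f1(46.25, 47.22)    # the 97.9% figure quoted above
```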
Integrations
Use HighSNR directly from your existing LLM stack.
LangChain
Available
HighSNRDocumentCompressor and HighSNRDocumentTransformer let you drop HighSNR into any LangChain pipeline with two lines of code.
Official LangChain docs listing under review.
LlamaIndex
Coming soon
Native TransformComponent and BaseNodePostprocessor for LlamaIndex pipelines. Examples coming soon.
REST API
Works with any HTTP client. See the quickstart above.
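To show that no particular client library is needed, here is a sketch of the quickstart request built with Python's standard library only. The key value is a placeholder, build_request is a hypothetical helper, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_request(api_key: str, document: str, max_output_tokens: int) -> urllib.request.Request:
    """Build (but do not send) a POST request to the optimize endpoint."""
    body = json.dumps(
        {"document": document, "max_output_tokens": max_output_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        "https://api.high-snr.com/v1/optimize",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "YOUR_API_KEY" is a placeholder; substitute a real key before sending.
request = build_request("YOUR_API_KEY", "Your long document text goes here...", 2000)
```

Sending it is then a matter of passing request to urllib.request.urlopen and reading selected_chunks from the decoded JSON body.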