Quickstart

Get your API key from the Console, then make your first request.

curl

POST https://api.high-snr.com/v1/optimize
curl https://api.high-snr.com/v1/optimize \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": "Your long document text goes here...",
    "max_output_tokens": 2000
  }'

Python

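import os

import requests

# Read the key from the environment, matching $API_KEY in the curl example.
API_KEY = os.environ["API_KEY"]

response = requests.post(
    "https://api.high-snr.com/v1/optimize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "document": "Your long document text goes here...",
        "max_output_tokens": 2000,
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of a KeyError below
chunks = response.json()["selected_chunks"]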

Response

{
  "selected_chunks": [
    "Highest signal passage from your document...",
    "Second highest signal passage..."
  ],
  // present when return_metadata: true
  "metadata": {
    "input_tokens": 1840,
    "output_tokens": 1200,
    "compression_ratio": 0.6522
  },
  // present when return_chunk_metadata: true
  "chunk_metadata": {
    "selected_chunk_indices": [0, 2, 3],
    "discarded_chunks": ["Low-signal passage..."],
    "discarded_chunk_indices": [1]
  }
}
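
Because metadata and chunk_metadata appear only when requested, client code should not assume they exist. A minimal sketch of reading a parsed response defensively follows; summarize_response is a hypothetical helper name (not part of any HighSNR SDK), and the field names are taken from the example response above.

```python
from typing import Any


def summarize_response(body: dict[str, Any]) -> dict[str, Any]:
    """Collect fields from a parsed /v1/optimize response.

    Illustrative helper only; tolerates the optional objects being absent.
    """
    metadata = body.get("metadata") or {}            # present with return_metadata: true
    chunk_meta = body.get("chunk_metadata") or {}    # present with return_chunk_metadata: true
    return {
        "text": "\n\n".join(body["selected_chunks"]),
        "compression_ratio": metadata.get("compression_ratio"),
        "kept_indices": chunk_meta.get("selected_chunk_indices", []),
        "dropped_indices": chunk_meta.get("discarded_chunk_indices", []),
    }


example = {
    "selected_chunks": ["First passage.", "Second passage."],
    "metadata": {"input_tokens": 1840, "output_tokens": 1200, "compression_ratio": 0.6522},
}
summary = summarize_response(example)
print(summary["compression_ratio"], summary["kept_indices"])
```

Joining the chunks with blank lines is one choice for feeding the result to a downstream model; since the API preserves original order, no re-sorting is needed.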

Authentication

All API requests require a Bearer token in the Authorization header.

Authorization: Bearer co_...

To get an API key:

  1. Sign up at console.high-snr.com
  2. Go to API Keys and click Create key
  3. Copy the key — it is shown only once

Concepts

A chunk is a contiguous passage of text — typically a paragraph or a group of related sentences. When you send a document, HighSNR splits it into chunks automatically. You can also send pre-split chunks directly. The API selects and returns the highest-signal chunks that fit within your token budget, preserving their original order.

The compression ratio is output_tokens / input_tokens — the fraction of the input that was kept. A ratio of 0.8 means 80% of the input tokens were returned (20% discarded). Lower is more aggressive compression; higher retains more of the original. Available when return_metadata: true.
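
As a worked example, the token counts from the quickstart response (1840 in, 1200 out) reproduce the ratio shown there. The rounding to four decimals is an assumption inferred from that example, and compression_ratio is an illustrative function name.

```python
def compression_ratio(input_tokens: int, output_tokens: int) -> float:
    """Fraction of input tokens kept (output_tokens / input_tokens).

    Rounded to four decimals, an assumption based on the example response.
    """
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    return round(output_tokens / input_tokens, 4)


ratio = compression_ratio(input_tokens=1840, output_tokens=1200)
print(ratio)  # 0.6522
print(f"{1 - ratio:.1%} of input tokens discarded")
```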

Endpoint

One endpoint. Pass a document or pre-split chunks and a token budget.

Request body
Field                  Type      Description
document               string    Full text to compress. Mutually exclusive with chunks.
chunks                 string[]  Pre-split passages. Mutually exclusive with document.
max_output_tokens      integer   Token budget for the response. Controls how much is returned.
context_hint           string    Optional. A query or topic to bias selection toward relevant chunks. Max 2,000 characters.
include_boundaries     boolean   Keep the first and last chunk in the output. Useful when intro/conclusion matter (e.g. summaries). Default: true.
return_metadata        boolean   Return token counts and compression ratio in a metadata object. Default: false.
return_chunk_metadata  boolean   Return selected/discarded chunk indices in a chunk_metadata object. Default: false.

Limit: Maximum input size is ~50K tokens (250K characters). Requests exceeding this are rejected with a 413.
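
The constraints above can be checked client-side before sending a request. The following is a sketch, not an official client: build_payload is a hypothetical helper, treating "neither document nor chunks" as invalid is an assumption, and so is counting the character limit as the sum of chunk lengths when chunks are used.

```python
from typing import Optional

MAX_INPUT_CHARS = 250_000  # documented input limit (~50K tokens)
MAX_HINT_CHARS = 2_000     # documented context_hint limit


def build_payload(
    max_output_tokens: int,
    document: Optional[str] = None,
    chunks: Optional[list[str]] = None,
    context_hint: Optional[str] = None,
) -> dict:
    """Build a request body, rejecting inputs the API is documented to refuse."""
    # document and chunks are mutually exclusive; requiring one is an assumption.
    if (document is None) == (chunks is None):
        raise ValueError("pass exactly one of document or chunks")
    # Assumption: for chunks, the limit applies to the total character count.
    size = len(document) if document is not None else sum(len(c) for c in chunks)
    if size > MAX_INPUT_CHARS:
        raise ValueError("input exceeds 250K characters; the API would return 413")
    if context_hint is not None and len(context_hint) > MAX_HINT_CHARS:
        raise ValueError("context_hint exceeds 2,000 characters")

    payload: dict = {"max_output_tokens": max_output_tokens}
    if document is not None:
        payload["document"] = document
    else:
        payload["chunks"] = chunks
    if context_hint is not None:
        payload["context_hint"] = context_hint
    return payload


payload = build_payload(
    2000,
    chunks=["Intro.", "Details.", "Conclusion."],
    context_hint="pricing",
)
print(payload)
```

The returned dictionary can be passed as the json argument of requests.post, as in the quickstart.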

Errors

All errors follow a standard shape:

{
  "error": {
    "code": "payment_required",
    "message": "Token quota exhausted. Email hello@high-snr.com if you need more.",
    "reset_at_utc": "2026-03-17T00:00:00Z"
  }
}

Status  Code                Description
401     unauthorized        Missing or invalid API key.
402     quota_exhausted     Token quota exhausted. Response includes reset_at_utc. Contact hello@high-snr.com for more tokens. Returned when:
                              • Daily quota used up (resets at UTC midnight)
                              • Free trial period expired
                              • Total free tokens exhausted
403     forbidden           Key lacks the required scope.
413     document_too_large  Input exceeds ~50K tokens (250K characters).
422     validation_error    Invalid request body (e.g. both document and chunks supplied).
429     rate_limited        Too many requests. Limit is 60 requests/minute per key.
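
One way a client might act on these statuses is sketched below. Both function names are hypothetical and the action labels are an illustrative policy, not behaviour prescribed by the API. The sketch branches on the HTTP status and reads reset_at_utc from an error body shaped like the example above.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_reset(error_body: dict, now: Optional[datetime] = None) -> float:
    """Seconds from `now` until reset_at_utc in a 402 error body."""
    now = now or datetime.now(timezone.utc)
    raw = error_body["error"]["reset_at_utc"]
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    reset_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return max(0.0, (reset_at - now).total_seconds())


def next_action(status: int) -> str:
    """Map a documented status code to a coarse client action (illustrative)."""
    if status == 429:
        return "retry_with_backoff"  # rate limit: 60 requests/minute per key
    if status == 402:
        return "wait_for_reset"      # quota: see reset_at_utc
    if status in (401, 403):
        return "fix_credentials"
    if status in (413, 422):
        return "fix_request"
    return "raise"


body = {"error": {"code": "payment_required", "reset_at_utc": "2026-03-17T00:00:00Z"}}
one_hour_before = datetime(2026, 3, 16, 23, 0, tzinfo=timezone.utc)
print(next_action(402), seconds_until_reset(body, now=one_hour_before))
```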

Quotas

Token usage is counted on input tokens sent to the API.

Free allocation

2M tokens or 14 days, whichever comes first. 250K tokens/day. No card required.

Daily quota

Resets at UTC midnight every day.
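
For planning, the arithmetic behind these limits can be sketched as follows: at the full daily rate the 2M-token allocation is used up before the 14-day window ends. The constant and function names are illustrative.

```python
import math
from datetime import datetime, timedelta, timezone

FREE_TOTAL_TOKENS = 2_000_000
DAILY_TOKENS = 250_000
TRIAL_DAYS = 14

# Days needed to consume the whole allocation when using the full daily quota.
days_to_exhaust = math.ceil(FREE_TOTAL_TOKENS / DAILY_TOKENS)
print(days_to_exhaust, days_to_exhaust < TRIAL_DAYS)  # 8 True


def next_daily_reset(now: datetime) -> datetime:
    """Next UTC midnight strictly after `now` (which must be timezone-aware)."""
    today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today + timedelta(days=1)


reset = next_daily_reset(datetime(2026, 3, 16, 18, 30, tzinfo=timezone.utc))
print(reset)  # 2026-03-17 00:00:00+00:00
```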

API keys

Up to 2 keys. Manage keys from the Console.

Current usage and remaining balances are visible in the Console dashboard. Need more tokens? Email hello@high-snr.com.

Privacy & data retention

HighSNR stores zero document text. Request bodies are never logged, stored, or used for any purpose beyond producing the response.

Only counters and billing metadata are persisted (request count, token counts, timestamps). No content leaves the request/response cycle.

Benchmarks

HighSNR has been evaluated on LongBench v1 using GPT-4o across HotpotQA and Qasper (200 samples each). All benchmarks are fully reproducible — scripts, data, and results are published on GitHub.

HotpotQA · 90% budget · with hint

F1 71.57 vs full-doc baseline 69.71

Exceeds full-context GPT-4o

Qasper · 90% budget · with hint

F1 46.25 vs full-doc baseline 47.22

97.9% of full-context GPT-4o

Integrations

Use HighSNR directly from your existing LLM stack.

LangChain

Available

HighSNRDocumentCompressor and HighSNRDocumentTransformer — drop HighSNR into any LangChain pipeline with two lines of code.

Official LangChain docs listing under review.

LlamaIndex

Coming soon

Native TransformComponent and BaseNodePostprocessor for LlamaIndex pipelines. Examples coming soon.

REST API

Works with any HTTP client. See the quickstart above.