Key surface — what Quatarly hides

Six tools for inspecting, testing, and auditing any sl- key. Live tests hit api.slashed.pro and show the raw response. Mocked sections carry an explicit pending-backend badge. The Quatarly comparison is based on documented behavior per /vs-quatarly receipts, not a live call.

SURFACE LIVE
Endpoint LIVE
api.slashed.pro/v1
/models · returns real JSON
Models discovered LIVE
frontier
populated on first fetch
Sections rendered
6tools
vs Quatarly's 1 (key list only)
Data shape
3live3 stubbed
stubs carry BACKEND PENDING badge

AKey Registry

Prefix-only preview, label, scope, cap, status. Generate mints a real sl- key as soon as auth service is live.

BACKEND PENDING — generation stubbed
Label · Prefix Scope Created Last used Monthly cap Status Actions

BPer-key Usage

Prompt + completion tokens split, dollar spend per rate card, call/error/rate-limit counts. Range tabs filter.

BACKEND PENDING — illustrative shape
Tokens consumed shape only
tok
prompt:
completion:
Dollar spend · per rate card shape only
$
list:
saved:
computed from real rate card · /pricing receipts
Calls · errors · 429s shape only
errors:
429s:
success rate over range

Per-model breakdown · this key

CLive test · the killer feature

Paste any sl- key, hit Test. We actually call api.slashed.pro and show the raw response. Quatarly hides this; we surface it.

LIVE — actual fetch to api.slashed.pro
Paste key · never logged · sent only to api.slashed.pro

1. Models discovery GET /v1/models

Calls api.slashed.pro/v1/models with your key as Bearer. Returns the full model list + raw JSON.

2. Chat completion · 1 token POST /v1/chat/completions

Calls /v1/chat/completions with messages:[{role:"user",content:"ping"}], max_tokens:1. Shows tokens, cost (per rate card), response time, model returned.

Deeper proof · 6 tests minting demo key…

No paste needed. Page mints an ephemeral sl-demo-XXX key (60-sec TTL, auto-renews every 50s, rate-gated). Every test below hits the real gateway — what you see is what your code gets.

03Model A/B/C race 3 × POST /v1/chat/completions · Promise.all

gpt-5.4 · claude-opus-4-7 · gpt-5.3-codex

Same prompt, three frontier models, fired in parallel. Pick any model, same API surface, half the list price. The killer demo — drop us into your switch statement and route by latency, cost, or quality without writing three SDKs.

gpt-5.4openai
waiting…
$—
claude-opus-4-7anthropic
waiting…
$—
gpt-5.3-codexopenai · code
waiting…
$—

04Server-sent streaming POST /v1/chat/completions · stream:true

gpt-5.4 · SSE · time-to-first-token measured

Same shape as OpenAI's data: {...}\\ndata: [DONE] wire format. Tokens land as they're generated — no buffering, no rebuffering. Real bytes through the SLASHED proxy.

waiting…

05JSON mode · structured output response_format: json_object

gpt-5.4 · validated parse · pretty-printed

Ask the model for a JSON object, get a JSON object. We pass response_format: {type: "json_object"} through to the upstream, then parse + validate before the bytes ever hit your code. If the model returns malformed JSON, we surface that loudly instead of silently retrying.

waiting…

06Long context · large window ~100k chars · ~25k tokens

gpt-5.4 · needle-in-haystack

Paste a long doc, ask one question, prove the proxy handles full upstream context without truncation. Same envelope as a 200-byte ping — no special endpoint, no payload chunking, no special headers.

Doc length: 0 chars / ~0 tok Cap (gpt-5.4): large (validated up to ~25k prompt tok live)
waiting…

07Code generation · codex models gpt-5.3-codex

syntax-highlighted · ready-to-paste

The gpt-5.3-codex code model, same chat-completions envelope. Pre-filled with a small Python task — proves you can use SLASHED for inline code-gen at half the OpenAI list price.

waiting…

08Cleaning layer · raw vs clean diff debug-mode response

10 receipts applied · documented behavior

This is what you actually buy. Upstream models return inconsistent shapes — prompt-tax preamble ("Sure, here's…"), content: null when there's a tool call, missing usage on streaming, etc. We strip all of it on the wire so your code sees one canonical shape across all 11 frontier models.

RAW upstream as openai returned it
waiting…
SLASHED cleaned what your code sees
waiting…
Side-by-side · SLASHED vs Quatarly · same call DOCUMENTED BEHAVIOR — based on /vs-quatarly receipts, not live call
Surface
SLASHED · live fetch with your key, shown above
Quatarly · ledger-only, no public test affordance
Raw JSON
exposed in pane → JetBrains Mono viewer
hidden behind dashboard summary card
Prompt/completion split
surfaced per-call, both numbers
aggregated · single total only
Per-call cost
computed live · rate card transparent
rolled into monthly bill — no per-call line
Response ID
shown · clickable in audit (Section D)
not exposed to user
Latency
real-time wall-clock + envoy upstream-time header
not surfaced · single column in monthly report
Rate-limit headers
X-RateLimit-* surfaced verbatim (Section E)
stripped from upstream response

DPer-key audit log

Last 100 calls. Timestamp, model, tokens, status, latency, response_id. Filter by status class. CSV / JSON export.

BACKEND PENDING — gateway logging wires soon
Filter
off · 5s tick
Timestamp Key Model Prompt tok Compl tok Status Latency Response ID

ERate-limit dashboard

Token-bucket state per key. The exact X-RateLimit-* headers SLASHED returns. Burst tester to observe backoff.

BACKEND PENDING — bucket state stubbed
X-RateLimit-Limit-Requests: 500 # per-minute · per-key
X-RateLimit-Remaining-Requests: 387 # 113 used in current window
X-RateLimit-Reset-Requests: 42s # until window flush
X-RateLimit-Limit-Tokens: 300000 # per-minute · per-key
X-RateLimit-Remaining-Tokens: 218400
X-RateLimit-Reset-Tokens: 42s
Retry-After: — (only on 429)
Burst tester: 10 rapid GET /v1/models calls with the key from Section C. Watch backoff behavior live. Each call is real; the 429 ladder is what we actually return.

FHealth check · per key

One-button validation. Runs /v1/models, /v1/chat/completions probe, and scope verification against the model registry.

LIVE — actual fetch
Uses the key pasted in Section C. Each probe is shown with status, latency, and the truncated response. Scope verification compares the key's allowlist against the live model registry.
Models endpoint ·
GET /v1/models · validates key is reachable, returns model list.
Chat probe ·
POST /v1/chat/completions · minimal 1-token call. Validates auth + billing path.
Scope verification ·
Cross-checks key's documented scope against live model registry. Flags scope drift.

What this surface exposes · vs Quatarly Per /vs-quatarly receipts · documented Quatarly behavior

SLASHED · /dashboard-key-plug

  • Live test affordance hitting api.slashed.pro
  • Raw JSON viewer for every response
  • Prompt + completion token split per call
  • Per-model cost breakdown per key
  • Audit log w/ response_id, filterable, CSV+JSON export
  • X-RateLimit-* headers surfaced verbatim
  • Burst tester for backoff observation
  • Scope verification against live registry
  • Health check with truncated probe results
  • Generation modal w/ scope + cap + expiry

Quatarly · documented dashboard

  • · Key list view, no test affordance
  • · Hidden raw JSON behind summary cards
  • · Aggregated tokens only — no split
  • · Monthly total only — no per-model break
  • · No audit log surfaced to user
  • · Rate-limit headers stripped upstream
  • · No burst tester
  • · No scope verification UI
  • · Status page only, no per-key probe
  • · Generation gated behind sales call