Sentry for AI costs

Track every LLM call's cost, tokens, and latency with one line of code. Near-zero overhead. Auto-detects OpenAI, Anthropic, Google, and Azure.

```typescript
import { CostKey } from 'costkey'

CostKey.init({ dsn: 'https://ck_...@costkey.dev/proj' })
// That's it. Every AI call is now tracked.
```
View on GitHub →
How it works

From zero to full observability

One init call. CostKey patches fetch, intercepts AI calls, captures stack traces, and ships everything to your dashboard — automatically.

CostKey.init()

Patches globalThis.fetch
Like Sentry — one line

🔍

Auto-detect

Recognizes OpenAI, Anthropic,
Google, Azure by URL

📋

Extract

Tokens, cost, latency, TTFT
Stack trace for attribution

📊

Observable

Every call visible by function,
feature, trace, and model
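Conceptually, the patch-and-detect flow above fits in a few lines of TypeScript. The sketch below is illustrative only, not CostKey's actual source; the `PROVIDERS` table, `detectProvider` helper, and wrapper body are assumptions for the example.

```typescript
// Illustrative sketch only; not CostKey's real implementation.
const PROVIDERS: Record<string, string> = {
  'api.openai.com': 'openai',
  'api.anthropic.com': 'anthropic',
  'generativelanguage.googleapis.com': 'google',
  'openai.azure.com': 'azure', // Azure endpoints end in .openai.azure.com
}

function detectProvider(url: string): string | null {
  const host = new URL(url).hostname
  for (const [domain, name] of Object.entries(PROVIDERS)) {
    if (host === domain || host.endsWith('.' + domain)) return name
  }
  return null
}

const originalFetch = globalThis.fetch

function init(): void {
  globalThis.fetch = (async (input: any, opts?: any) => {
    const url =
      typeof input === 'string' ? input : input instanceof URL ? input.href : input.url
    const provider = detectProvider(url)
    const start = Date.now()
    const res = await originalFetch(input, opts)
    if (provider) {
      // The real SDK would also clone the response, parse token usage,
      // capture a stack trace for attribution, and ship an event upstream.
      console.log(`[costkey] ${provider} call: ${Date.now() - start}ms`)
    }
    return res
  }) as typeof fetch
}
```

Because only `fetch` is wrapped, any client library that calls `fetch` under the hood is covered without provider-specific integrations.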

Dashboard

Make every LLM call observable

Cost by function, auto-detected features, request-level traces — all from stack traces, zero manual tagging.

📊 Cost overview

Total spend, call volume, latency, TTFT — plus cost breakdown by model. See where your money goes at a glance.

*Screenshot: app.costkey.dev — Overview (cost overview with stats and model breakdown)*
🔗 Call chain grouping

Each feature expands to show its AI call sites — which functions, how many tokens, what cost — grouped by shared parent functions.

*Screenshot: app.costkey.dev — Call chains (call chain grouping view)*
⚡ Request-level traces

Every HTTP request gets a trace. See exactly which AI calls happened, their cost, duration, and models — all grouped per request.

*Screenshot: app.costkey.dev — Traces (request-level traces view)*
Features

Everything you need. Nothing you don't.

Auto-Detection

Patches fetch, detects AI providers by URL. OpenAI, Anthropic, Google, Azure — all automatic.

📍

Stack Trace Attribution

See which function costs what. Captures call site on every AI call. Zero manual tagging.
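The capture itself can be as simple as reading a fresh stack trace. A minimal sketch, assuming a V8-style frame format (`at fn (file:line:col)`); `captureCallSite` and `summarizeTickets` are hypothetical names, not CostKey's API.

```typescript
// Hypothetical sketch: read the caller's name off a fresh stack trace.
function captureCallSite(): string | null {
  const stack = new Error().stack?.split('\n') ?? []
  // stack[0] is "Error", stack[1] is captureCallSite itself,
  // stack[2] is the function that made the (AI) call.
  const match = stack[2]?.match(/at (\S+)/)
  return match ? match[1] : null
}

function summarizeTickets(): string | null {
  // In the real SDK this capture happens inside the patched fetch,
  // so this call's cost would be attributed to "summarizeTickets".
  return captureCallSite()
}
```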

🔗

Request Tracing

Group all AI calls per request. See total cost, tokens, and latency per user action.
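One way to implement this grouping in Node is `AsyncLocalStorage`: every AI call recorded inside a request's context lands in that request's trace. The names below (`withTrace`, `recordCall`) are illustrative assumptions, not CostKey's API.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Illustrative sketch of per-request trace grouping.
interface Trace {
  id: string
  calls: { provider: string; costUsd: number }[]
}

const traceStore = new AsyncLocalStorage<Trace>()

function withTrace<T>(id: string, fn: () => T): T {
  // Everything (sync or async) run inside fn sees this trace.
  return traceStore.run({ id, calls: [] }, fn)
}

function recordCall(provider: string, costUsd: number): void {
  // Called from the patched fetch; a no-op outside a request context.
  traceStore.getStore()?.calls.push({ provider, costUsd })
}

// Example: two AI calls inside one request share a trace.
const trace = withTrace('req-1', () => {
  recordCall('openai', 0.002)
  recordCall('anthropic', 0.004)
  return traceStore.getStore()!
})
```

In an HTTP server, `withTrace` would wrap the request handler, so per-request totals come out without any manual tagging.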

🌳

Feature Auto-Grouping

Detects features from call chain analysis. Your search pipeline's cost, broken down automatically.

Streaming Metrics

Time to first token, tokens/sec, chunk timing. Full observability for streaming responses.
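As a sketch, those streaming numbers can be derived from chunk timings alone. `measureStream` below is a hypothetical helper that counts chunks; the real metric would map chunks to actual token counts.

```typescript
// Hypothetical sketch: derive streaming metrics from an async chunk stream.
async function measureStream(stream: AsyncIterable<string>) {
  const start = Date.now()
  let firstChunkAt: number | null = null
  let tokens = 0
  for await (const _chunk of stream) {
    if (firstChunkAt === null) firstChunkAt = Date.now() // time to first token
    tokens++
  }
  const elapsedSec = (Date.now() - start) / 1000
  return {
    tokens,
    ttftMs: firstChunkAt === null ? null : firstChunkAt - start,
    tokensPerSec: elapsedSec > 0 ? tokens / elapsedSec : tokens,
  }
}
```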

🔒

Credential Scrubbing

Never captures API keys. Headers are never read. Auto-redacts secrets from captured bodies.
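A simplified sketch of body scrubbing: redact values whose key looks secret, plus token-shaped strings anywhere in the payload. The key list and token pattern are illustrative assumptions, not CostKey's actual rules.

```typescript
// Illustrative scrubbing sketch; real rules would be broader.
const SECRET_KEYS = /^(api[_-]?key|authorization|token|secret|password)$/i
const SECRET_VALUE = /\b(sk|ck|pk)-[A-Za-z0-9_-]{8,}\b/g

function scrub(value: unknown): unknown {
  if (typeof value === 'string') return value.replace(SECRET_VALUE, '[REDACTED]')
  if (Array.isArray(value)) return value.map(scrub)
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) =>
        SECRET_KEYS.test(k) ? [k, '[REDACTED]'] : [k, scrub(v)],
      ),
    )
  }
  return value
}
```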

Comparison

Why CostKey?

| | CostKey | Portkey | LiteLLM |
| --- | --- | --- | --- |
| Setup | 1 line | Proxy config | Proxy config |
| Approach | SDK (no proxy) | Proxy | Proxy |
| Latency overhead | ~0ms | 20-40ms | Variable |
| Cost | Free | $49+/month | Free (self-host) |
| Code attribution | Automatic (stack traces) | Manual tags | Manual tags |
| TypeScript native | ✓ | | Python |
| Request tracing | Auto | | Manual |
| Feature detection | Auto | | |