Offline Cost Telemetry
ai-core-kit measures the USD cost of a Claude Code run by reading transcript token-usage lines and multiplying them by a versioned pricing map. It attributes spend by model, feature, agent, and session. This is a post-run, offline tool — it is SHIPPED in both layers.
Transcripts under ~/.claude/projects/**/*.jsonl carry a message.usage object on every assistant turn. The aggregator multiplies those token counts by the versioned pricing.json (USD per MTok, divided by 1e6), buckets the result by model / feature / agent / session, and then reconciles: the sum of the buckets must equal the grand total, or the run exits non-zero (MISMATCH). Hooks (PreToolUse / PostToolUse) carry NO token or cost fields (#11008), so live metering is impossible.
The one hard constraint. There is no live cost meter, and there never can be one for Claude Code spend. State this before promising any real-time number.
Why offline only: issue #11008
Claude Code hooks (PreToolUse, PostToolUse, …) receive only session_id,
transcript_path, cwd, permission_mode, and hook_event_name. They carry
no token or cost fields
(anthropics/claude-code#11008,
open). Consequences:
- A hook cannot emit a live cost number; PostToolUse cannot meter spend.
- PostToolUse only fires on tool turns. In a representative transcript, 61 of 98 assistant turns were text-only (no tool call) and therefore invisible to PostToolUse. Apportioning cost by tool activity would silently drop the spend on that majority of turns (about 62%).
The fix the kit implements: compute all cost from the assistant
message.usage lines in the transcript — every assistant turn has one, tool or
not — multiplied by pricing.json. This captures 100% of spend and is fully
reproducible offline.
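As a rough illustration of why reading the transcript captures turns that PostToolUse never sees, the sketch below walks JSONL lines and keeps every assistant record that has a message.usage object, tool call or not. It is not the code in aggregate.py. The record-level "type" field, the content-block type "tool_use", the model id "model-a", and the sample values are assumptions made for the example.

```python
import json

def iter_assistant_turns(lines):
    """Yield parsed records for assistant turns that carry message.usage."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a malformed line is skipped rather than aborting the run
        msg = rec.get("message") if isinstance(rec, dict) else None
        if (
            isinstance(msg, dict)
            and rec.get("type") == "assistant"  # assumed record-type field
            and isinstance(msg.get("usage"), dict)
        ):
            yield rec

def has_tool_call(rec):
    """True if any content block is a tool call (assumed block type 'tool_use')."""
    content = rec["message"].get("content")
    if not isinstance(content, list):
        return False
    return any(isinstance(b, dict) and b.get("type") == "tool_use" for b in content)

# Tiny in-memory stand-in for a transcript file; all values are illustrative.
sample_lines = [
    json.dumps({"type": "assistant", "message": {
        "model": "model-a",
        "usage": {"input_tokens": 10, "output_tokens": 5},
        "content": [{"type": "text", "text": "hello"}]}}),
    json.dumps({"type": "assistant", "message": {
        "model": "model-a",
        "usage": {"input_tokens": 20, "output_tokens": 8},
        "content": [{"type": "tool_use", "name": "Bash", "input": {}}]}}),
    json.dumps({"type": "user", "message": {"content": "hi"}}),
    "{not valid json",
]

turns = list(iter_assistant_turns(sample_lines))
text_only = [t for t in turns if not has_tool_call(t)]
print(f"{len(turns)} assistant turns with usage, {len(text_only)} text-only")
```

Both assistant turns are kept for costing; only one of them would ever have triggered a tool hook.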
The two files
| File | Role |
|---|---|
| telemetry/aggregate.py | stdlib-only offline aggregator (no third-party deps). |
| telemetry/pricing.json | versioned model → USD/MTok map with an as_of date. |
The identical engine lives in the META repo (telemetry/) to measure the
cost of building ai-core-kit itself, and ships to forked CHILD projects under
templates/telemetry/, wired by /ack-init when telemetry.enabled: true.
How cost is computed
For each assistant line, message.usage provides the token counts and
message.model selects the price row:
| usage field | priced at pricing.json key |
|---|---|
| input_tokens | input |
| output_tokens | output |
| cache_read_input_tokens | cache_read |
| cache_creation.ephemeral_5m_input_tokens | cache_write_5m |
| cache_creation.ephemeral_1h_input_tokens | cache_write_1h |
| cache_creation_input_tokens (no split present) | cache_write_5m (default ephemeral) |
Prices are USD per 1,000,000 tokens (MTok); the aggregator divides by 1e6.
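The mapping above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the code in aggregate.py: the function name turn_cost_usd is hypothetical, and the price values are made-up placeholders chosen for easy arithmetic, not real rates.

```python
def turn_cost_usd(usage, prices):
    """Cost of one assistant turn in USD; prices are USD per 1,000,000 tokens."""
    split = usage.get("cache_creation") or {}
    write_5m = split.get("ephemeral_5m_input_tokens")
    write_1h = split.get("ephemeral_1h_input_tokens")
    if write_5m is None and write_1h is None:
        # No 5m/1h split present: price the whole cache write at the 5m rate.
        write_5m = usage.get("cache_creation_input_tokens", 0)
        write_1h = 0
    tokens = {
        "input": usage.get("input_tokens", 0),
        "output": usage.get("output_tokens", 0),
        "cache_read": usage.get("cache_read_input_tokens", 0),
        "cache_write_5m": write_5m or 0,
        "cache_write_1h": write_1h or 0,
    }
    return sum(tokens[key] * prices[key] for key in tokens) / 1e6

# Placeholder price row (NOT real prices), keyed like the table above.
placeholder_prices = {
    "input": 2.0,
    "output": 10.0,
    "cache_read": 0.25,
    "cache_write_5m": 2.5,
    "cache_write_1h": 4.0,
}

unsplit_usage = {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_read_input_tokens": 10000,
    "cache_creation_input_tokens": 2000,  # no split, so priced as cache_write_5m
}
split_usage = {
    "cache_creation_input_tokens": 2000,
    "cache_creation": {
        "ephemeral_5m_input_tokens": 1000,
        "ephemeral_1h_input_tokens": 1000,
    },
}

unsplit_cost = turn_cost_usd(unsplit_usage, placeholder_prices)
split_cost = turn_cost_usd(split_usage, placeholder_prices)
```

With these placeholder numbers the unsplit turn works out to (2000 + 5000 + 2500 + 5000) / 1e6 = 0.0145 USD, and the split turn to (2500 + 4000) / 1e6 = 0.0065 USD.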
Attribution axes
--by selects one or more of model,feature,agent,session (default: all four).
- model: keyed on the exact message.model id. Always exact and reliable.
- session: keyed on sessionId. Always reliable.
- agent: transcripts have no agent name, so the tool uses the one agent-adjacent signal that exists: isSidechain. A non-sidechain turn buckets to main; a sidechain (subagent / Task) turn buckets to subagent:<requestId>. This separates main-session spend from delegated spend.
- feature: transcripts carry no native feature field, so a feature label comes from one of two explicit conventions (below). Anything matching no rule lands in the default bucket (unattributed) and is never silently dropped.
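A minimal sketch of the model, session, and agent axes, assuming each record exposes message.model, sessionId, isSidechain, and requestId as top-level fields as described above. The helper names, the "unknown" fallback, and the sample ids are hypothetical; per-turn costs are passed in precomputed to keep the example short.

```python
from collections import defaultdict

def agent_bucket(rec):
    """main for non-sidechain turns, subagent:<requestId> for sidechain turns."""
    if rec.get("isSidechain"):
        return f"subagent:{rec.get('requestId', 'unknown')}"
    return "main"

def attribute(costed_turns, axes=("model", "session", "agent")):
    """Sum per-turn cost into one bucket map per requested axis."""
    totals = {axis: defaultdict(float) for axis in axes}
    for rec, cost in costed_turns:
        keys = {
            "model": rec["message"]["model"],
            "session": rec["sessionId"],
            "agent": agent_bucket(rec),
        }
        for axis in axes:
            totals[axis][keys[axis]] += cost
    return {axis: dict(buckets) for axis, buckets in totals.items()}

# Illustrative records and costs only.
costed_turns = [
    ({"sessionId": "s1", "isSidechain": False,
      "message": {"model": "model-a"}}, 0.5),
    ({"sessionId": "s1", "isSidechain": True, "requestId": "req_1",
      "message": {"model": "model-b"}}, 0.25),
    ({"sessionId": "s1", "isSidechain": True, "requestId": "req_1",
      "message": {"model": "model-b"}}, 0.25),
]

totals = attribute(costed_turns)
print(totals["agent"])
```

Every turn lands in exactly one bucket per axis, which is what lets each axis sum back to the same grand total.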
Feature attribution: branch_prefix vs sidecar_map
Two mutually exclusive ways to derive a feature label (set the default in the
manifest under telemetry.attribution.mode; CLI flags override):
1. branch_prefix (default; zero extra tooling). Work each feature on its own
branch named <prefix><feature>. With --branch-prefix feat/, the turn’s
gitBranch after the prefix becomes the bucket:
    gitBranch = "feat/order-intake"   --branch-prefix "feat/"   →  bucket "order-intake"
    gitBranch = "main"                                          →  bucket "<default>"
2. sidecar_map (precise; needs a tiny recorder). A JSON file maps time
windows to bucket labels; a turn whose timestamp falls in [from, to) buckets
to that entry. A SessionStart hook can legitimately record a
timestamp → contract_id mapping (it just can’t record cost). Pass it with
--sidecar-map sidecar.json; this overrides branch_prefix.
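The two modes can be sketched as one lookup function. This is an illustration under assumptions, not the shipped logic: the sidecar entry keys "from" and "to" follow the [from, to) wording above, while the "label" key, the ISO-8601 timestamp format, and the function names are hypothetical. The sketch also assumes that a turn outside every sidecar window goes to the default bucket rather than falling back to the branch rule.

```python
from datetime import datetime

def parse_ts(text):
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

def feature_label(rec, branch_prefix=None, sidecar=None, default="unattributed"):
    """Return the feature bucket for one transcript record."""
    if sidecar is not None:
        # sidecar_map mode: the first window with from <= timestamp < to wins.
        ts = parse_ts(rec["timestamp"])
        for window in sidecar:
            if parse_ts(window["from"]) <= ts < parse_ts(window["to"]):
                return window["label"]
        return default
    # branch_prefix mode: strip the prefix from gitBranch.
    branch = rec.get("gitBranch") or ""
    if branch_prefix and branch.startswith(branch_prefix):
        suffix = branch[len(branch_prefix):]
        if suffix:
            return suffix
    return default

# Hypothetical sidecar content; entry shape is an assumption.
sidecar = [
    {"from": "2026-06-01T10:00:00Z", "to": "2026-06-01T11:00:00Z",
     "label": "contract-42"},
]

by_branch = feature_label({"gitBranch": "feat/order-intake"}, branch_prefix="feat/")
on_main = feature_label({"gitBranch": "main"}, branch_prefix="feat/")
in_window = feature_label({"timestamp": "2026-06-01T10:30:00Z"}, sidecar=sidecar)
at_window_end = feature_label({"timestamp": "2026-06-01T11:00:00Z"}, sidecar=sidecar)
```

Here by_branch is "order-intake", on_main is "unattributed", in_window is "contract-42", and at_window_end is "unattributed" because the window is half-open.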
Reconciliation guarantee
For every axis the aggregator proves that the sum of per-bucket costs equals
the grand total (within float epsilon). The human table prints a
reconcile vs total … OK line per axis; the JSON output carries
"reconciled": true|false; a mismatch exits non-zero. The default bucket
guarantees the identity holds even when nothing matches a feature/agent rule. If
an axis shows MISMATCH, the report is not trustworthy — do not quote the numbers.
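The per-axis check amounts to a tolerance comparison. The sketch below is one way to express it, assuming a tolerance of 1e-9; the actual epsilon and the function names here are not taken from aggregate.py.

```python
import math

def reconcile(buckets, grand_total, tol=1e-9):
    """True if the bucket costs sum to the grand total within tolerance."""
    bucket_sum = math.fsum(buckets.values())
    return math.isclose(bucket_sum, grand_total, rel_tol=tol, abs_tol=tol)

def reconcile_all(axes, grand_total):
    """Return (per-axis status, exit code); any MISMATCH yields exit code 1."""
    status = {
        axis: "OK" if reconcile(buckets, grand_total) else "MISMATCH"
        for axis, buckets in axes.items()
    }
    exit_code = 0 if all(s == "OK" for s in status.values()) else 1
    return status, exit_code

# Illustrative numbers: 0.1 + 0.2 is not exactly 0.3 in floating point,
# which is why the comparison uses a tolerance rather than ==.
good_axes = {
    "feature": {"order-intake": 0.1, "unattributed": 0.2},
    "agent": {"main": 0.3},
}
bad_axes = {"feature": {"order-intake": 0.1}}  # a bucket went missing

good_status, good_exit = reconcile_all(good_axes, 0.3)
bad_status, bad_exit = reconcile_all(bad_axes, 0.3)
```

The first call reports OK on both axes with exit code 0; the second reports MISMATCH on feature with exit code 1.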
Fail-loud guarantees
- Unknown model → hard error naming the offending message.model, listing the known ids, exit 1 (unknown_model_policy: error). Cost is never silently under-counted because a new model slipped in. The fix is to add a row to pricing.json (copy a same-tier row, set USD/MTok values, bump as_of).
- Missing / invalid pricing.json → hard error, exit 1.
- Bucket sums don’t reconcile → hard error, exit 1.
- Bad --by axis / missing sidecar file → usage error, exit 2.
- A single malformed JSONL line is skipped (not fatal) so one bad line can’t void an otherwise complete report.
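The unknown-model rule can be sketched as a lookup that raises instead of defaulting to zero. The exception class, the function names, the message wording, and the model ids are hypothetical; only the behaviour (name the offending id, list the known ids, exit 1) comes from the list above.

```python
class UnknownModelError(Exception):
    """Raised when a transcript names a model that has no price row."""

def price_row(model, price_rows):
    """Return the price row for a model id, or fail loudly if it is unknown."""
    try:
        return price_rows[model]
    except KeyError:
        known = ", ".join(sorted(price_rows))
        raise UnknownModelError(
            f"unknown model {model!r}; known ids: {known}"
        ) from None

def check_models(models_seen, price_rows):
    """Return (exit code, error message); exit code 1 on the first unknown model."""
    for model in models_seen:
        try:
            price_row(model, price_rows)
        except UnknownModelError as err:
            return 1, str(err)
    return 0, ""

# Placeholder rows (NOT real prices or real model ids).
price_rows = {"model-a": {"input": 2.0}, "model-b": {"input": 4.0}}

ok_code, ok_msg = check_models(["model-a", "model-b"], price_rows)
err_code, err_msg = check_models(["model-a", "model-x"], price_rows)
print(err_code, err_msg)
```

Raising rather than pricing the unknown model at zero is what keeps a newly released model from silently shrinking the total.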
Running it
# whole machine, default ~/.claude/projects, all four axes, table + JSON:
python3 telemetry/aggregate.py
# this build only, since a date, feature + model + agent, JSON only:
python3 telemetry/aggregate.py \
--project-dir ~/.claude/projects \
--since 2026-06-01 \
--by feature,model,agent \
--branch-prefix feat/ \
--format json
# precise feature attribution via a sidecar timestamp→contract map:
python3 telemetry/aggregate.py --sidecar-map telemetry/sidecar.local.json --by feature

Key flags: --project-dir, --since YYYY-MM-DD (UTC), --pricing PATH, --by,
--branch-prefix, --default-bucket (default unattributed), --sidecar-map,
--manifest (CHILD only — reads telemetry.* defaults; CLI flags win),
--format table|json|both.
Budgets are advisory
pricing.json produces actuals. Budgets (advisory USD caps) live in the
CHILD manifest under telemetry.budgets[] (scope project|feature|contract|agent).
The aggregator’s per-bucket totals are what you compare against the caps — caps
flag overage, they never enforce or block anything live.
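Comparing actuals against caps is a plain lookup. The sketch below assumes each budget entry has a scope (one of the values listed above) plus two hypothetical fields, name and usd_cap; the real shape of telemetry.budgets[] entries may differ.

```python
def budget_report(actuals, budgets):
    """Flag overage per budget entry; this only reports and never blocks."""
    rows = []
    for budget in budgets:
        spent = actuals.get((budget["scope"], budget["name"]), 0.0)
        rows.append({
            "scope": budget["scope"],
            "name": budget["name"],
            "usd_cap": budget["usd_cap"],
            "spent_usd": spent,
            "over": spent > budget["usd_cap"],
        })
    return rows

# Illustrative per-bucket totals, keyed by (scope, bucket name).
actuals = {
    ("feature", "order-intake"): 12.5,
    ("agent", "main"): 3.0,
}
# Hypothetical budget entries; 'name' and 'usd_cap' are assumed field names.
budgets = [
    {"scope": "feature", "name": "order-intake", "usd_cap": 10.0},
    {"scope": "agent", "name": "main", "usd_cap": 5.0},
    {"scope": "feature", "name": "billing", "usd_cap": 1.0},
]

report = budget_report(actuals, budgets)
over = [row["name"] for row in report if row["over"]]
print(over)
```

Only order-intake is flagged here; a budget with no matching bucket simply shows zero spend.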
The two cost skills
Both ship as CHILD payload (MIT):
- cost-telemetry: runs aggregate.py and interprets its output. The single source of truth for the numbers. It confirms telemetry.enabled: true, picks scope, reads attribution defaults from the manifest, runs the aggregator, verifies reconciled: YES, and calls out a large unattributed bucket. Rendered when features.cost_telemetry == true.
- cost-audit: evidence-first investigation of why spend spiked (runaway job/PR creation, quota bypass, premium-model leakage, duplicate fanout, retry burn). It delegates the numbers to cost-telemetry (it never re-derives pricing math) and adds the dimensions telemetry cannot see: provider API spend outside Claude Code, infrastructure cost, and the behavioral root cause.
Locality note
Transcripts live under ~/.claude/projects/<encoded-cwd>/…, so the tool reads
local transcripts only. Aggregating across machines requires first staging the
JSONL into one --project-dir. There is no network/collection step by design.
See also: Skills catalog.