How to Track Per-Agent LLM Costs (2026 Guide)

Most teams running AI agents have zero per-agent cost visibility. They see a monthly Anthropic bill but have no idea which agent generated which charges. One misconfigured agent can burn 60% of your budget before you notice. Here's how to actually fix that — three approaches compared.

The Problem: Your AI Bill Is a Black Box

You ship five agents to production. Sales assistant. Support bot. Data pipeline. Code reviewer. Document summarizer. By end of month, you've spent $840 on Anthropic. But which agent cost $400? Which one cost $12?

If you can't answer that question, you're flying blind.

Real scenario

A dev team at a mid-size SaaS shipped an internal research agent. It worked well in testing, at a few hundred tokens per run. After a quiet code change increased how much context the agent pulled in, it started loading entire document libraries into every prompt. Within 10 days, it had consumed 68% of the month's LLM budget. Nobody noticed until the invoice arrived.

This is the default state for most teams: a single API key, all agents sharing it, one undifferentiated line on the provider invoice at the end of the month.

The problem isn't the cost itself — it's that you have no way to attribute it, no way to see it coming, and no way to stop it.

Why Per-Agent Tracking Actually Matters

Before getting into the how, it's worth being precise about why aggregate cost visibility isn't good enough.

1. You can't optimize what you can't measure

If you know your total monthly LLM spend is $1,200 but don't know which agent is responsible for what, you have no starting point for optimization. Should you compress prompts? Cache responses? Downgrade models for certain tasks? You can't make those calls without per-agent cost data.

2. Agent cost profiles change without warning

An agent that costs $0.002 per run today can cost $0.40 per run tomorrow. Input data grows. A new feature adds context. A dependency pulls in extra tokens. Without per-agent tracking, these regressions are invisible until the bill arrives.

3. Runaway loops are undetectable from aggregate data

Infinite loops, retry storms, and runaway recursive calls all look identical to "normal usage" in aggregate dashboards — just a higher number. Per-agent tracking with anomaly detection can flag a single agent consuming 10x its normal spend within minutes.
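As a rough illustration of the kind of check this implies, the sketch below compares each agent's spend in the current window against its own trailing average and flags anything at or above a multiplier. The function name, the record shapes, and the 10x default are assumptions made for the example, not the behavior of any particular tool.

```javascript
// Hypothetical helper: flag agents whose current-window spend is at least
// `multiplier` times their own trailing average.
// history: { [agentId]: number[] }  past per-window spend in USD
// current: { [agentId]: number }    spend in the window being checked
function findSpendAnomalies(history, current, multiplier = 10) {
  const anomalies = [];
  for (const [agentId, spend] of Object.entries(current)) {
    const past = history[agentId] ?? [];
    if (past.length === 0) continue; // no baseline yet, nothing to compare against
    const baseline = past.reduce((sum, value) => sum + value, 0) / past.length;
    if (baseline > 0 && spend >= baseline * multiplier) {
      anomalies.push({ agentId, baseline, spend, ratio: spend / baseline });
    }
  }
  return anomalies;
}

const flagged = findSpendAnomalies(
  { "support-bot": [0.5, 0.5, 0.5], "research-agent": [0.2, 0.2, 0.2] },
  { "support-bot": 0.6, "research-agent": 4.0 }
);
console.log(flagged.map((a) => a.agentId)); // only research-agent is flagged
```

The same comparison run on account-wide totals would dilute the spike: here the combined spend goes from about 0.7 to 4.6, roughly 6.6x, which stays under a 10x threshold even though one agent jumped 20x.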

4. Billing allocation becomes impossible at scale

Once you're running 10+ agents across multiple teams or customers, you need per-agent cost data to do chargebacks, per-customer billing, and departmental budget allocation. Trying to reconstruct this from aggregate data after the fact is painful and error-prone.
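With per-agent rows in hand, a chargeback is a simple roll-up. The sketch below assumes usage rows carrying an agent_id and a cost_usd, plus a hypothetical agent-to-team mapping that you maintain yourself; none of these names come from a specific product.

```javascript
// Hypothetical roll-up: sum per-agent cost rows into per-team totals.
// Agents missing from the mapping land in "unassigned" so nothing is dropped.
function allocateByTeam(usageRows, agentToTeam) {
  const totals = {};
  for (const row of usageRows) {
    const team = agentToTeam[row.agent_id] ?? "unassigned";
    totals[team] = (totals[team] ?? 0) + row.cost_usd;
  }
  return totals;
}

const teamTotals = allocateByTeam(
  [
    { agent_id: "sales-assistant", cost_usd: 400 },
    { agent_id: "support-bot", cost_usd: 12 },
    { agent_id: "data-pipeline", cost_usd: 28 },
  ],
  { "sales-assistant": "revenue", "support-bot": "support" }
);
console.log(teamTotals); // { revenue: 400, support: 12, unassigned: 28 }
```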

3 Approaches to Track LLM Costs Per Agent

There's no single right answer here — the approach that works depends on your scale, your stack, and how much engineering time you want to spend on infrastructure vs. product.

Manual Logging (DIY)

Wrap every LLM call, capture token counts from the response object, and write cost attribution logic yourself. Works, but doesn't scale.

The pattern looks roughly like this:

JavaScript / Node.js

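import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024, // required by the Messages API
  messages, // array of { role, content } turns, built elsewhere
  metadata: { user_id: agentId }
});

// Capture usage from response
const inputTokens = response.usage.input_tokens;
const outputTokens = response.usage.output_tokens;

// Calculate cost. These per-token rates are illustrative and hardcoded,
// so they break when pricing changes; check the provider's current price list.
const cost = (inputTokens * 0.000015) + (outputTokens * 0.000075);

// Log it somewhere (db stands in for your own storage client)
await db.insert('llm_usage_logs', {
  agent_id: agentId,
  model: response.model,
  input_tokens: inputTokens,
  output_tokens: outputTokens,
  cost_usd: cost,
  timestamp: new Date()
});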

The problems with this approach compound quickly:

Token pricing changes, and your hardcoded rates go stale
Every new model requires updating pricing tables manually
You still need to build dashboards, alerts, and aggregations
Multi-provider setups (OpenAI + Anthropic + Gemini) require separate schemas
With no anomaly detection, you still find out about runaway agents from the invoice
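One way to soften the first two problems, if you do stay with DIY, is to keep rates in a single table keyed by model and to fail loudly on a model the table does not know. This is a minimal sketch: the model names and rates below are placeholders, not real prices, and estimateCostUsd is a hypothetical helper.

```javascript
// Placeholder rates in USD per million tokens. These are NOT real prices;
// fill them in from each provider's published price list.
const PRICING_PER_MTOK = {
  "example-large-model": { input: 15, output: 75 },
  "example-small-model": { input: 1, output: 5 },
};

// Hypothetical helper: estimate the cost of one call from its token counts.
function estimateCostUsd(model, inputTokens, outputTokens, pricing = PRICING_PER_MTOK) {
  const rates = pricing[model];
  if (!rates) {
    // Failing loudly beats silently logging a cost of zero for a new model.
    throw new Error(`No pricing configured for model "${model}"`);
  }
  return (inputTokens * rates.input + outputTokens * rates.output) / 1_000_000;
}

console.log(estimateCostUsd("example-large-model", 2000, 500)); // 0.0675
```

The table still has to be updated by hand when prices change, but the update happens in one place instead of in every call site.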

ⓘ Good for: 1–2 agents, single provider, team with bandwidth to maintain infra. Doesn't scale past that.

Provider Dashboards (OpenAI / Anthropic)

Both OpenAI and Anthropic offer usage dashboards with some filtering capabilities. They're free, require no integration, and show token consumption over time.

What you actually get:

Feature	OpenAI Dashboard	Anthropic Console
Total spend over time	✓	✓
Per-model breakdown	✓	✓
Per-API-key breakdown	~ (Projects)	~ (Workspaces)
Per-agent attribution	✗	✗
Budget alerts	~ (Organization and project level)	~ (Workspace spend limits)
Multi-provider view	✗	✗
Real-time anomaly detection	✗	✗

OpenAI's Projects feature gets you close to per-key isolation, provided you are disciplined about creating a separate project per agent. The downsides: every agent needs its own project, keys, and budget settings; key rotation is manual; and the numbers are per project, so attribution holds only as long as the one-project-per-agent mapping does.

Anthropic's console is cleaner, but its breakdowns stop at the workspace and API key level. It has no concept of an agent.

⚠ Works for: Single-provider teams tracking total spend. Completely blind on per-agent attribution, multi-provider workloads, and real-time anomaly detection.

Dedicated Cost Tracking Tool

Tools built specifically for AI agent cost tracking handle the attribution layer for you — connecting to your providers, normalizing usage data, and surfacing per-agent cost breakdowns without requiring you to build and maintain the plumbing.

The tradeoff is integration overhead upfront (usually 15–30 minutes) vs. ongoing engineering maintenance with the DIY approach. For teams with more than 3 agents or multiple providers, the math typically favors the dedicated tool.

We covered the broader tool landscape in 8 Developer Tools for AI Agent Cost Tracking (2026) — worth reading if you're evaluating options. The rest of this article focuses on what the setup actually looks like with Costline.

✓ Best for: Teams with 3+ agents, multiple providers, or anyone who wants budget alerts and anomaly detection without building it themselves.

How Costline Handles Per-Agent Cost Attribution

Costline is built around a single premise: your LLM costs should be visible at the agent level, in real time, with budget controls per agent — not aggregated across everything and discovered at month-end.

Here's the core model:

One API key per provider — you connect your OpenAI, Anthropic (and other) API keys once. Costline reads usage directly from provider APIs.
Agent tagging via metadata or SDK — each LLM call gets tagged with an agent identifier. Costline maps usage to agents from that tag.
Real-time cost dashboard — per-agent spend updates continuously, not at end-of-month billing cycle.
Per-agent budget thresholds — set a spend limit per agent. Get alerted when it's hit. Optionally hard-cut the agent at limit.

Free tier

Costline's free tier covers 2 providers and unlimited agents. No credit card required. It stays free — the paid plans add team seats, longer retention, and API access for programmatic budget control.

Quick Start: 3 Steps to See Per-Agent Costs

Connect your providers

Sign up at costline.polsia.app/signup — no credit card needed. Add your OpenAI and/or Anthropic API keys in Settings → Providers. Costline uses read-only access to pull usage data. Your keys never proxy traffic.

Tag your agents

Pass a metadata tag on each LLM call — a single field identifying which agent generated the request. For OpenAI, use the user field or custom metadata. For Anthropic, use the metadata.user_id field. Costline maps these to agent identifiers you define in the dashboard.

OpenAI example

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages, // array of { role, content } turns, built elsewhere
  user: "agent:sales-assistant-v2"  // Costline reads this
});

Anthropic example

const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024,  // required by the Messages API
  messages,  // array of { role, content } turns, built elsewhere
  metadata: { user_id: "agent:support-bot" }  // Costline reads this
});
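Tags only help if every call site spells them the same way. The sketch below builds the tag in one place and attaches it to the request parameters for either provider. The helper names and the lowercase-with-hyphens naming rule are assumptions for illustration; the "agent:" prefix simply mirrors the two examples above.

```javascript
// Hypothetical helper: produce the tag string used in the examples above.
function agentTag(agentName) {
  // Assumed naming rule: lowercase letters, digits, and hyphens only.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(agentName)) {
    throw new Error(`Invalid agent name: "${agentName}"`);
  }
  return `agent:${agentName}`;
}

// Hypothetical helper: return a copy of the request params with the tag set
// in the field each provider uses (user for OpenAI, metadata.user_id for Anthropic).
function withAgentTag(provider, params, agentName) {
  const tag = agentTag(agentName);
  if (provider === "openai") {
    return { ...params, user: tag };
  }
  if (provider === "anthropic") {
    return { ...params, metadata: { ...params.metadata, user_id: tag } };
  }
  throw new Error(`Unsupported provider: ${provider}`);
}

const taggedParams = withAgentTag("anthropic", { model: "claude-opus-4-5", max_tokens: 1024 }, "support-bot");
console.log(taggedParams.metadata.user_id); // agent:support-bot
```

The returned object can be passed straight to the SDK call, so a mistyped agent name fails at the call site rather than showing up later as an unattributed line in the cost data.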

Set per-agent budgets and alerts

In the Costline dashboard, define each agent and set a monthly spend threshold. You'll get an alert (email or webhook) when an agent approaches or hits its limit. For production agents, you can configure a hard cutoff — Costline will reject requests once the budget is exhausted.

That's the core loop. After setup, the dashboard shows each agent's daily, weekly, and monthly cost trends — and flags anomalies when an agent's spending pattern deviates from its baseline.
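The "approaches or hits its limit" distinction from the budget step can be pictured as a three-way classification of spend against budget. This is only a sketch of the idea, not Costline's implementation: the function name, the status labels, and the 80% warning ratio are all assumptions.

```javascript
// Hypothetical classifier: where does an agent stand against its monthly budget?
function budgetStatus(spentUsd, budgetUsd, warnRatio = 0.8) {
  if (budgetUsd <= 0) {
    throw new RangeError("budgetUsd must be positive");
  }
  const used = spentUsd / budgetUsd;
  if (used >= 1) return "exceeded"; // a hard cutoff would apply here
  if (used >= warnRatio) return "warning"; // an "approaching limit" alert would fire here
  return "ok";
}

console.log(budgetStatus(10, 100)); // ok
console.log(budgetStatus(85, 100)); // warning
console.log(budgetStatus(100, 100)); // exceeded
```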

What Good Per-Agent Visibility Actually Looks Like

Once you have per-agent cost attribution working, a few things become immediately actionable:

Model right-sizing. You'll likely find that 2–3 agents are using GPT-4o or Claude Opus for tasks that don't require that level of capability. Downgrading those agents to a smaller model is a mechanical cost reduction — often 5–10x cheaper per call with minimal quality impact for structured tasks.
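To put a number on a right-sizing decision, you can project an agent's monthly cost under two sets of rates. The workload figures and the rates below are placeholders chosen so the ratio comes out at 5x; substitute the call volume from your own per-agent data and the prices from your provider's price list.

```javascript
// Hypothetical projection: monthly cost of one agent at given per-million-token rates.
function monthlyCostUsd(workload, ratesPerMTok) {
  const { callsPerMonth, inputTokensPerCall, outputTokensPerCall } = workload;
  const inputTokens = callsPerMonth * inputTokensPerCall;
  const outputTokens = callsPerMonth * outputTokensPerCall;
  return (inputTokens * ratesPerMTok.input + outputTokens * ratesPerMTok.output) / 1_000_000;
}

// Placeholder workload and rates, not measurements or real prices.
const agentWorkload = { callsPerMonth: 10000, inputTokensPerCall: 1000, outputTokensPerCall: 200 };
const largeModelCost = monthlyCostUsd(agentWorkload, { input: 15, output: 75 });
const smallModelCost = monthlyCostUsd(agentWorkload, { input: 3, output: 15 });

console.log(largeModelCost, smallModelCost, largeModelCost - smallModelCost); // 300 60 240
```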

Prompt compression. The agents burning the most tokens are usually doing something inefficient with context — loading full documents, repeating system prompts on every turn, or not caching stable context. Per-agent data makes these targets obvious.

Caching opportunities. If an agent has a high call volume but repetitive inputs, semantic caching can dramatically reduce token consumption. You can't prioritize this without knowing which agents have the right profile for it.
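Semantic caching needs an embedding model and a similarity threshold, which is more than a short example can carry. The sketch below shows its simpler relative, an exact-match cache keyed on the agent and the full request, which is often enough to measure how repetitive an agent's inputs are. The class and its method names are hypothetical.

```javascript
// Exact-match response cache (not semantic): identical requests from the
// same agent reuse the stored response instead of calling the model again.
class ExactMatchCache {
  constructor() {
    this.entries = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  // Note: JSON.stringify is sensitive to property order, so build
  // request objects consistently.
  static keyFor(agentId, request) {
    return JSON.stringify([agentId, request]);
  }

  get(agentId, request) {
    const key = ExactMatchCache.keyFor(agentId, request);
    if (this.entries.has(key)) {
      this.hits += 1;
      return this.entries.get(key);
    }
    this.misses += 1;
    return undefined;
  }

  set(agentId, request, response) {
    this.entries.set(ExactMatchCache.keyFor(agentId, request), response);
  }
}

const responseCache = new ExactMatchCache();
const summaryRequest = { model: "example-small-model", prompt: "Summarize ticket 123" };

if (responseCache.get("support-bot", summaryRequest) === undefined) {
  // A real caller would await the provider SDK here; the string is a stand-in.
  responseCache.set("support-bot", summaryRequest, "stubbed summary");
}
const cachedSummary = responseCache.get("support-bot", summaryRequest);
console.log(cachedSummary, responseCache.hits, responseCache.misses); // stubbed summary 1 1
```

The hit and miss counters double as the prioritization signal: an agent with a high hit rate under exact matching is a strong candidate for a proper caching layer.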

Cost per outcome. Once you know what each agent costs, you can start connecting it to value — cost per ticket resolved, cost per deal qualified, cost per document processed. That's the metric that actually matters for AI ROI conversations.
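The arithmetic is a division, but it helps to decide up front what happens when an agent has spend and no recorded outcomes. In this sketch, which assumes you can export per-agent cost and per-agent outcome counts as plain objects, such agents get null rather than a misleading zero or Infinity.

```javascript
// Hypothetical helper: cost per outcome for each agent.
// costByAgent: { [agentId]: number }      spend in USD for the period
// outcomesByAgent: { [agentId]: number }  e.g. tickets resolved in the same period
function costPerOutcome(costByAgent, outcomesByAgent) {
  const result = {};
  for (const [agentId, cost] of Object.entries(costByAgent)) {
    const outcomes = outcomesByAgent[agentId] ?? 0;
    result[agentId] = outcomes > 0 ? cost / outcomes : null; // null: no outcomes recorded
  }
  return result;
}

const unitCosts = costPerOutcome(
  { "support-bot": 120, "sales-assistant": 400 },
  { "support-bot": 600 }
);
console.log(unitCosts); // { 'support-bot': 0.2, 'sales-assistant': null }
```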

Summary: Which Approach Is Right for You?

Your situation	Recommended approach
1–2 agents, single provider, have engineering bandwidth	Manual logging — total control, maintenance cost acceptable at this scale
Just want to see total spend, no per-agent need	Provider dashboard — free, no setup, good enough for aggregate visibility
3+ agents, multiple providers, or need budget controls	Dedicated tool (Costline) — per-agent attribution without maintaining the infra
Production agents, need anomaly detection + hard cutoffs	Dedicated tool — this is exactly the use case manual logging and provider dashboards can't cover
Free tier requirement, unlimited agents	Costline — 2 providers, unlimited agents, no credit card

Not sure what your fleet costs? Try our free AI agent cost calculator — estimate your monthly LLM spend by agent count, call volume, and model. No signup needed.

See Your Agent Costs in 15 Minutes

Connect your providers, tag your agents, and get per-agent cost breakdowns — free for 2 providers and unlimited agents.

Start for free →

No credit card. 2 providers free. Unlimited agents. Cancel anytime.
