The Problem: Your AI Bill Is a Black Box

You ship five agents to production. Sales assistant. Support bot. Data pipeline. Code reviewer. Document summarizer. By end of month, you've spent $840 on Anthropic. But which agent cost $400? Which one cost $12?

If you can't answer that question, you're flying blind.

Real scenario

A dev team at a mid-size SaaS shipped an internal research agent. It worked great in testing — a few hundred tokens per run. After a quiet code change bumped the context window, it started loading entire document libraries into every prompt. Within 10 days, it had consumed 68% of the month's LLM budget. Nobody noticed until the invoice arrived.

This is the default state for most teams: a single API key, all agents sharing it, one undifferentiated line on the provider invoice at the end of the month.

The problem isn't the cost itself — it's that you have no way to attribute it, no way to see it coming, and no way to stop it.

Why Per-Agent Tracking Actually Matters

Before getting into the how, it's worth being precise about why aggregate cost visibility isn't good enough.

1. You can't optimize what you can't measure

If you know your total monthly LLM spend is $1,200 but don't know which agent is responsible for what, you have no starting point for optimization. Should you compress prompts? Cache responses? Downgrade models for certain tasks? You can't make those calls without per-agent cost data.

2. Agent cost profiles change without warning

An agent that costs $0.002 per run today can cost $0.40 per run tomorrow. Input data grows. A new feature adds context. A dependency pulls in extra tokens. Without per-agent tracking, these regressions are invisible until the bill arrives.

3. Runaway loops are undetectable from aggregate data

Infinite loops, retry storms, and runaway recursive calls all look identical to "normal usage" in aggregate dashboards — just a higher number. Per-agent tracking with anomaly detection can flag a single agent consuming 10x its normal spend within minutes.
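
The baseline comparison described above can be sketched in a few lines. This is a minimal illustration, not any particular tool's detection logic: the function name `isSpendAnomalous`, the shape of `hourlySpend`, and the sample numbers are all assumptions made for the example, and the 10x multiplier simply mirrors the figure in the text.

```javascript
// Hypothetical helper: flags an agent whose current spend far exceeds its own baseline.
// `history` is an array of past per-window costs in USD for one agent (assumed shape).
function isSpendAnomalous(history, currentCost, multiplier = 10) {
  if (history.length === 0) return false; // no baseline yet, nothing to compare against
  const baseline = history.reduce((sum, c) => sum + c, 0) / history.length;
  if (baseline === 0) return currentCost > 0; // any spend is unusual for an idle agent
  return currentCost >= baseline * multiplier;
}

// Made-up hourly spend for two agents
const hourlySpend = {
  "support-bot": { history: [0.4, 0.5, 0.45, 0.55], current: 0.6 },
  "research-agent": { history: [0.2, 0.25, 0.15, 0.2], current: 3.1 },
};

// Collect the agents whose current window is at least 10x their average
const flagged = Object.entries(hourlySpend)
  .filter(([, s]) => isSpendAnomalous(s.history, s.current))
  .map(([agentId]) => agentId);

console.log(flagged);
```

A plain average is the simplest possible baseline. A real detector would likely need to account for daily or weekly usage cycles, which this sketch ignores.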

4. Billing allocation becomes impossible at scale

Once you're running 10+ agents across multiple teams or customers, you need per-agent cost data to do chargebacks, per-customer billing, and departmental budget allocation. Trying to reconstruct this from aggregate data after the fact is painful and error-prone.
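
With per-agent records in hand, chargebacks reduce to a group-and-sum. The sketch below assumes usage rows shaped like the `llm_usage_logs` entries shown later in this article (`agent_id`, `cost_usd`), plus a hypothetical `team` field added for the example. The numbers are invented.

```javascript
// Sums cost_usd per distinct value of `key` (e.g. "agent_id" or "team").
function sumCostBy(records, key) {
  const totals = {};
  for (const r of records) {
    totals[r[key]] = (totals[r[key]] ?? 0) + r.cost_usd;
  }
  return totals;
}

// Made-up usage rows; `team` is an assumed extra column for allocation
const usage = [
  { agent_id: "sales-assistant", team: "revenue", cost_usd: 120 },
  { agent_id: "support-bot", team: "support", cost_usd: 300 },
  { agent_id: "summarizer", team: "support", cost_usd: 20 },
  { agent_id: "sales-assistant", team: "revenue", cost_usd: 60 },
];

const byAgent = sumCostBy(usage, "agent_id");
const byTeam = sumCostBy(usage, "team");

console.log(byAgent);
console.log(byTeam);
```

The same function covers per-customer billing if each row also carries a customer identifier, which is why the attribution has to be captured at call time rather than reconstructed later.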


3 Approaches to Track LLM Costs Per Agent

There's no single right answer here — the approach that works depends on your scale, your stack, and how much engineering time you want to spend on infrastructure vs. product.

1. Manual Logging (DIY)

Wrap every LLM call, capture token counts from the response object, and write cost attribution logic yourself. Works, but doesn't scale.

The pattern looks roughly like this:

JavaScript / Node.js
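import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment
const anthropic = new Anthropic();

const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024, // required by the Messages API
  messages: [...],
  metadata: { user_id: agentId } // tag the request with the calling agent
});

// Capture usage from response
const inputTokens = response.usage.input_tokens;
const outputTokens = response.usage.output_tokens;

// Calculate cost from hardcoded per-token rates ($15 / $75 per million tokens here).
// Illustrative only: confirm the current rates for your model. This breaks when pricing changes.
const cost = (inputTokens * 0.000015) + (outputTokens * 0.000075);

// Log it somewhere (db stands in for your own storage client)
await db.insert('llm_usage_logs', {
  agent_id: agentId,
  model: response.model,
  input_tokens: inputTokens,
  output_tokens: outputTokens,
  cost_usd: cost,
  timestamp: new Date()
});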

The problems with this approach compound quickly:

  • Token pricing changes — your hardcoded rates go stale
  • Every new model requires updating pricing tables manually
  • You still need to build dashboards, alerts, and aggregations
  • Multi-provider setups (OpenAI + Anthropic + Gemini) require separate schemas
  • No anomaly detection: you still find out about runaway agents from the invoice

ⓘ  Good for: 1–2 agents, single provider, team with bandwidth to maintain infra. Doesn't scale past that.
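
If you do go the DIY route, one way to soften the stale-pricing problem is to keep rates in a single lookup table instead of inline constants. The sketch below is an assumption-laden illustration: the model names are placeholders, the rates are not real prices, and `estimateCostUsd` is a hypothetical helper. It centralizes the maintenance burden; it does not remove it.

```javascript
// Placeholder rates in USD per million tokens. These are NOT current prices;
// replace them with values from each provider's pricing page.
const PRICING_PER_MTOK = {
  "example-large-model": { input: 15, output: 75 },
  "example-small-model": { input: 1, output: 5 },
};

// Hypothetical helper: converts token counts to an estimated cost in USD.
function estimateCostUsd(model, inputTokens, outputTokens, pricing = PRICING_PER_MTOK) {
  const rates = pricing[model];
  if (!rates) {
    // Fail loudly so an unpriced model never gets logged as costing $0
    throw new Error(`No pricing entry for model "${model}"`);
  }
  return (inputTokens * rates.input + outputTokens * rates.output) / 1e6;
}

// 2,000 input tokens and 500 output tokens on the large placeholder model
const exampleCost = estimateCostUsd("example-large-model", 2000, 500);
console.log(exampleCost.toFixed(4));
```

Throwing on an unknown model is a deliberate choice here: a missing table entry surfaces immediately instead of silently under-reporting spend.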

2. Provider Dashboards (OpenAI / Anthropic)

Both OpenAI and Anthropic offer usage dashboards with some filtering capabilities. They're free, require no integration, and show token consumption over time.

What you actually get:

Feature OpenAI Dashboard Anthropic Console
Total spend over time
Per-model breakdown
Per-API-key breakdown ~ (Projects)
Per-agent attribution
Budget alerts ~ (Account level)
Multi-provider view
Real-time anomaly detection

OpenAI's Projects feature gets you close to per-key isolation — if you're disciplined about creating a separate project per agent. The downsides: each project needs its own billing setup, key rotation is manual, and cross-project views don't exist.

Anthropic's console is cleaner but offers less granularity. You get account-level usage data. That's it.

⚠  Works for: Single-provider teams tracking total spend. Completely blind on per-agent attribution, multi-provider workloads, and real-time anomaly detection.

3. Dedicated Cost Tracking Tool

Tools built specifically for AI agent cost tracking handle the attribution layer for you — connecting to your providers, normalizing usage data, and surfacing per-agent cost breakdowns without requiring you to build and maintain the plumbing.

The tradeoff is integration overhead upfront (usually 15–30 minutes) vs. ongoing engineering maintenance with the DIY approach. For teams with more than 3 agents or multiple providers, the math typically favors the dedicated tool.

We covered the broader tool landscape in 8 Developer Tools for AI Agent Cost Tracking (2026) — worth reading if you're evaluating options. The rest of this article focuses on what the setup actually looks like with Costline.

✓  Best for: Teams with 3+ agents, multiple providers, or anyone who wants budget alerts and anomaly detection without building it themselves.

How Costline Handles Per-Agent Cost Attribution

Costline is built around a single premise: your LLM costs should be visible at the agent level, in real time, with budget controls per agent — not aggregated across everything and discovered at month-end.

Here's the core model:

Free tier

Costline's free tier covers 2 providers and unlimited agents. No credit card required. It stays free — the paid plans add team seats, longer retention, and API access for programmatic budget control.

Quick Start: 3 Steps to See Per-Agent Costs

1. Connect your providers

Sign up at costline.polsia.app/signup — no credit card needed. Add your OpenAI and/or Anthropic API keys in Settings → Providers. Costline uses read-only access to pull usage data. Your keys never proxy traffic.

2. Tag your agents

Pass a metadata tag on each LLM call — a single field identifying which agent generated the request. For OpenAI, use the user field or custom metadata. For Anthropic, use the metadata.user_id field. Costline maps these to agent identifiers you define in the dashboard.

OpenAI example
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [...],
  user: "agent:sales-assistant-v2"  // Costline reads this
});

Anthropic example
const response = await anthropic.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 1024,  // required by the Messages API
  messages: [...],
  metadata: { user_id: "agent:support-bot" }  // Costline reads this
});

3. Set per-agent budgets and alerts

In the Costline dashboard, define each agent and set a monthly spend threshold. You'll get an alert (email or webhook) when an agent approaches or hits its limit. For production agents, you can configure a hard cutoff — Costline will reject requests once the budget is exhausted.
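
To make the threshold behavior concrete, here is a rough sketch of the decision a budget check has to make. It is not Costline's API or implementation: `budgetStatus`, the 80% warning ratio, and the sample budgets are all assumptions chosen for illustration.

```javascript
// Hypothetical budget check: classifies an agent's month-to-date spend.
// warnRatio = 0.8 means "warn at 80% of budget" (an assumed default).
function budgetStatus(spentUsd, monthlyBudgetUsd, warnRatio = 0.8) {
  if (monthlyBudgetUsd <= 0) {
    throw new Error("monthlyBudgetUsd must be positive");
  }
  const ratio = spentUsd / monthlyBudgetUsd;
  if (ratio >= 1) return "exhausted"; // hard cutoff territory
  if (ratio >= warnRatio) return "warning"; // approaching the limit
  return "ok";
}

// Made-up month-to-date figures
const agents = [
  { id: "sales-assistant", spent: 40, budget: 100 },
  { id: "support-bot", spent: 85, budget: 100 },
  { id: "research-agent", spent: 130, budget: 100 },
];

const statuses = Object.fromEntries(
  agents.map((a) => [a.id, budgetStatus(a.spent, a.budget)])
);

console.log(statuses);
```

In this sketch, "warning" would map to an alert and "exhausted" to whatever cutoff policy you configure for that agent.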

That's the core loop. After setup, the dashboard shows each agent's daily, weekly, and monthly cost trends — and flags anomalies when an agent's spending pattern deviates from its baseline.


What Good Per-Agent Visibility Actually Looks Like

Once you have per-agent cost attribution working, a few things become immediately actionable:

Model right-sizing. You'll likely find that 2–3 agents are using GPT-4o or Claude Opus for tasks that don't require that level of capability. Downgrading those agents to a smaller model is a mechanical cost reduction — often 5–10x cheaper per call with minimal quality impact for structured tasks.

Prompt compression. The agents burning the most tokens are usually doing something inefficient with context — loading full documents, repeating system prompts on every turn, or not caching stable context. Per-agent data makes these targets obvious.

Caching opportunities. If an agent has a high call volume but repetitive inputs, semantic caching can dramatically reduce token consumption. You can't prioritize this without knowing which agents have the right profile for it.

Cost per outcome. Once you know what each agent costs, you can start connecting it to value — cost per ticket resolved, cost per deal qualified, cost per document processed. That's the metric that actually matters for AI ROI conversations.
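
The arithmetic behind cost per outcome is simple once per-agent totals exist. The sketch below uses invented monthly figures and hypothetical field names (`outcomes`, `unit`); the outcome counts would come from your own product data, not from the LLM provider.

```javascript
// Returns cost per outcome in USD, or null when there were no outcomes to divide by.
function costPerOutcome(totalCostUsd, outcomeCount) {
  if (outcomeCount <= 0) return null;
  return totalCostUsd / outcomeCount;
}

// Made-up monthly totals joined with outcome counts from product data
const monthly = [
  { agent_id: "support-bot", cost_usd: 300, outcomes: 1200, unit: "ticket resolved" },
  { agent_id: "sales-assistant", cost_usd: 180, outcomes: 90, unit: "deal qualified" },
];

// One human-readable line per agent
const report = monthly.map((a) => {
  const unitCost = costPerOutcome(a.cost_usd, a.outcomes);
  const label = unitCost === null ? "n/a" : `$${unitCost.toFixed(2)}`;
  return `${a.agent_id}: ${label} per ${a.unit}`;
});

console.log(report.join("\n"));
```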

Further reading

If you're earlier in the evaluation process and want to understand how Costline compares to observability tools like Portkey, Helicone, and LangSmith, see: Costline vs Portkey vs Helicone vs LangSmith. The short version: those tools solve observability and tracing — per-agent cost attribution is not their core model.


Summary: Which Approach Is Right for You?

Your situation | Recommended approach
1–2 agents, single provider, have engineering bandwidth | Manual logging: total control, maintenance cost acceptable at this scale
Just want to see total spend, no per-agent need | Provider dashboard: free, no setup, good enough for aggregate visibility
3+ agents, multiple providers, or need budget controls | Dedicated tool (Costline): per-agent attribution without maintaining the infra
Production agents, need anomaly detection + hard cutoffs | Dedicated tool: this is exactly the use case manual logging and provider dashboards can't cover
Free tier requirement, unlimited agents | Costline: 2 providers, unlimited agents, no credit card

Not sure what your fleet costs? Try our free AI agent cost calculator — estimate your monthly LLM spend by agent count, call volume, and model. No signup needed.

See Your Agent Costs in 15 Minutes

Connect your providers, tag your agents, and get per-agent cost breakdowns — free for 2 providers and unlimited agents.

No credit card. 2 providers free. Unlimited agents. Cancel anytime.