You see the monthly bill from OpenAI: $2,400. You know something happened. You don't know what. Or which agent did it. Or whether to shut it down or optimize it. None of the major LLM observability tools answers those questions.
The Three Incumbents (And Why They're Not Enough)
Portkey
Strengths:
- 250+ models across 12+ providers
- SOC2, HIPAA, GDPR, ISO27001
- Routing, fallbacks, caching, load balancing
- Per-model, per-request cost visibility
Limitations:
- Shows spend by model, not by agent
- Air-gapped deployments add thousands of dollars per month
- No per-agent budgets or spend limits
- Request-level attribution only
Helicone
Strengths:
- Fully open-source (MIT license)
- Cheapest SaaS pricing in the market
- One-line proxy swap integration
- Caching, threat detection, rate limiting
Limitations:
- Cost per request, not per agent
- Shallow evaluation tools
- Limited prompt management on free tier
- No per-agent spend attribution
LangSmith
Strengths:
- Deep LangChain/LangGraph integration
- Best-in-class evaluation tools
- Prompt versioning (Git for prompts)
- Framework-agnostic SDK tracing
Limitations:
- Routes by function call, not agent identity
- Per-seat pricing scales painfully
- Only 14-day or 400-day retention (nothing in between)
- Deeply LangChain-tied UX
The Blind Spot (All Three Miss This)
The question none of them answers: Which of your agents is most expensive?
Portkey can tell you about models.
Helicone can tell you about requests.
LangSmith can tell you about function calls.
None of them can tell you:
- "Agent A spent $500 this month. Agent B spent $200. Agent C spent $50."
- "Agent A's cost per request is 3× Agent B's. Something's wrong."
- "If I shut down Agent C, I save $50/month. The ROI is negative."
- "Agent B spawned 12 sub-agents. What did they cost? Which should I keep?"
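To make the gap concrete, here is a minimal sketch of the roll-up that would produce statements like these from request-level data. It assumes every logged request already carries an `agent_id` tag, which is the piece request-level tools do not give you out of the box. `RequestRecord`, `spend_by_agent`, and `cost_per_request` are hypothetical names, and the dollar amounts are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """One logged LLM request. `agent_id` is an assumed tag, not a standard field."""
    agent_id: str
    model: str
    cost_usd: float


def spend_by_agent(records):
    """Return {agent_id: (total_cost_usd, request_count)}."""
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        totals[record.agent_id][0] += record.cost_usd
        totals[record.agent_id][1] += 1
    return {agent: (total, count) for agent, (total, count) in totals.items()}


def cost_per_request(summary):
    """Average cost per request for each agent."""
    return {agent: total / count for agent, (total, count) in summary.items()}


# Made-up records for illustration only.
records = [
    RequestRecord("agent-a", "model-large", 0.75),
    RequestRecord("agent-a", "model-large", 0.75),
    RequestRecord("agent-b", "model-small", 0.25),
    RequestRecord("agent-b", "model-small", 0.25),
    RequestRecord("agent-b", "model-small", 0.25),
    RequestRecord("agent-b", "model-small", 0.25),
]

summary = spend_by_agent(records)
averages = cost_per_request(summary)
for agent, (total, count) in sorted(summary.items()):
    print(f"{agent}: ${total:.2f} over {count} requests (${averages[agent]:.2f}/request)")
```

The arithmetic is trivial. The hard part is getting a trustworthy `agent_id` onto every request in the first place, including requests made by sub-agents.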
This isn't a small feature gap. It's structural. These tools were designed before autonomous agents became the primary way teams run LLMs. Portkey's gateway architecture assumes humans are making the calls. Helicone's request-based model does too. LangSmith's trace model assumes you care about function-level visibility, not agent-level attribution.
What they all miss: teams now need to budget for agents the way they budget for employees.
Feature Comparison: The Real Picture
| Capability | Portkey | Helicone | LangSmith | What You Need |
|---|---|---|---|---|
| Free Tier | 10K req/mo | 10K–100K req/mo | 5K traces/mo | Generous free tier |
| Cheapest Paid | $99/mo | $20/user or $1/10K | $39/user/mo | Predictable pricing |
| Model Coverage | 250+ ✓ | All providers ✓ | All providers ✓ | Multi-provider |
| Cost per Request | ✓ | ✓ | Traces only | Cost visibility |
| Cost per Agent | ✗✗ | ✗✗ | ✗✗ | UNMET |
| Per-Agent Budgets | ✗ | ✗ | ✗ | UNMET |
| Evaluation Tools | Basic | Basic | Best-in-class | Quality > Cost |
| Caching | ✓ | ✓ | ✗ | Cost optimization |
| Open-Source Option | Enterprise only | ✓ Full | Enterprise only | Sovereignty |
| Proxy Integration | ✓ | ✓ | SDK only | Easy setup |
| LangChain Integration | ✗ | ✗ | ✓✓ | Nice-to-have |
The takeaway: These tools solve the "what happened" problem. None solve the "which agent spent it" problem.
Who Should Use What (Today)
Choose Portkey if:
- You need 250+ model support
- You need enterprise compliance (HIPAA, GDPR, SOC2)
- You need on-premises / air-gapped deployment
- You're fine with $99+/mo for a managed gateway
Choose Helicone if:
- Cost is your primary constraint
- You want open-source (self-host free)
- Request-level tracking is enough for now
- You value fast community iteration
Choose LangSmith if:
- You're 100% committed to LangChain/LangGraph
- You're debugging complex chain behavior
- You have a small team (1–5 people)
- You need best-in-class evaluation tools
The Unmet Market Need
Autonomous agents are moving from research projects to production systems. As teams run 5, 10, 20 agents simultaneously, cost accountability becomes critical.
No incumbent has solved per-agent cost attribution because it wasn't a priority when they launched. Portkey launched as a gateway. Helicone launched for request-level observability. LangSmith launched for LangChain tracing.
Agent-level cost tracking requires:
- Agent identity as a first-class concept — not bolted on to request-level data
- Cost attribution across nested agent calls — agents spawn sub-agents spawn tool calls
- Per-agent budgets and spend limits — like autoscaling guardrails for costs
- Multi-provider spend aggregation — know the cost-per-token across OpenAI + Anthropic + Google
- MCP-first cost tracking — as agents use external tools, track what they cost
Who is asking for this today:
- Cost-conscious startups: "I have 3 agents. Which one is expensive? I don't want to pay $39/user/month."
- AI Ops teams: "I need chargeback cost attribution per agent per team. Hard spend limits that auto-pause agents."
- Multi-agent orchestrators: "My agents call other services, APIs, and MCP tools. I need end-to-end cost tracking."
- Autonomous workflows: "20 agents. Some are profitable. Some cost more than they generate. Tell me which is which."
These needs are real today. No tool solves them.
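To show how the requirements listed above could fit together, here is a minimal sketch rather than a description of any existing product. It treats the agent as a first-class object, rolls sub-agent spend up to the parent, prices calls from more than one provider, and pauses an agent once it crosses a hard limit. The `Agent` class, `BudgetExceeded`, the provider and model names, and the prices in `PRICE_PER_1K_TOKENS` are all hypothetical placeholders, not real provider pricing.

```python
from typing import Optional

# Placeholder prices in USD per 1,000 tokens, invented for this example.
PRICE_PER_1K_TOKENS = {
    ("provider-x", "model-small"): 0.5,
    ("provider-y", "model-large"): 2.0,
}


class BudgetExceeded(Exception):
    """Raised when a paused agent, or a descendant of one, tries to make a call."""


class Agent:
    def __init__(self, agent_id: str, budget_usd: Optional[float] = None,
                 parent: Optional["Agent"] = None):
        self.agent_id = agent_id
        self.budget_usd = budget_usd  # None means no limit
        self.parent = parent
        self.children = []
        self.own_spend_usd = 0.0
        self.paused = False

    def spawn(self, agent_id: str, budget_usd: Optional[float] = None) -> "Agent":
        """Create a sub-agent whose spend also counts against this agent."""
        child = Agent(agent_id, budget_usd, parent=self)
        self.children.append(child)
        return child

    def total_spend_usd(self) -> float:
        """Own spend plus the spend of every descendant."""
        return self.own_spend_usd + sum(c.total_spend_usd() for c in self.children)

    def is_blocked(self) -> bool:
        """An agent is blocked if it or any ancestor has been paused."""
        node = self
        while node is not None:
            if node.paused:
                return True
            node = node.parent
        return False

    def record_call(self, provider: str, model: str, tokens: int) -> float:
        """Price one call, charge it to this agent, then enforce budgets upward."""
        if self.is_blocked():
            raise BudgetExceeded(f"{self.agent_id} is paused")
        cost = PRICE_PER_1K_TOKENS[(provider, model)] * tokens / 1000
        self.own_spend_usd += cost
        # The call that crosses a limit is still recorded; later calls are refused.
        node = self
        while node is not None:
            if node.budget_usd is not None and node.total_spend_usd() > node.budget_usd:
                node.paused = True
            node = node.parent
        return cost


orchestrator = Agent("orchestrator", budget_usd=5.0)
researcher = orchestrator.spawn("researcher", budget_usd=1.0)

orchestrator.record_call("provider-y", "model-large", 1000)
for _ in range(3):
    researcher.record_call("provider-x", "model-small", 1000)

print(researcher.total_spend_usd(), orchestrator.total_spend_usd(), researcher.paused)
```

Charging sub-agent spend to the parent is a design choice: it lets a single limit on an orchestrator cap everything it spawns. A real system would also need to persist spend, handle concurrent calls, and keep prices current.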
The Bottom Line
Portkey, Helicone, and LangSmith are all excellent tools. Use them if their strengths match your needs:
- Portkey for production reliability and model coverage
- Helicone for cost and open-source freedom
- LangSmith for deep debugging and LangChain workflows
But understand what they can't do: track spend per autonomous agent.
That gap exists because the market didn't demand it until now. The tool that owns agent cost attribution wins the next 18 months.