Agent cost simulator

Estimate the real cost of your AI agent

Most calculators model a single API call. Real agents run loops with tool calls, context growth, reasoning tokens, retries, and prompt caching. Pick a workload template, tweak the numbers, and see what it actually costs.

Simulator outputs
  • Estimated monthly cost
  • Per task
  • Annualized

Caching: system + tools become cacheable; cache hit rate 75%
Workload parameters: 10 inputs

Cost breakdown / task
  • Cacheable input
  • Variable input
  • Output
  • Reasoning
  • Subtotal × retry

With vs without caching (monthly)
  • Without caching
  • With caching
  • Caching saves

The stable prefix (system prompt + tools schema) is cacheable. Variable input (user input + tool results) pays the full input price. Output includes generated tokens plus reasoning tokens. The per-task total is scaled by (1 + retry rate). Caching uses a steady-state approximation, with cache writes amortized across hits.
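The model described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the simulator's actual code: the `Workload` and `Prices` types, their field names, and the example prices are all hypothetical, and cache misses are billed at the plain input rate (no separate cache-write premium), which is one way to read "writes amortized across hits."

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical parameter names; the simulator's own inputs may differ.
    turns_per_task: int
    prefix_tokens: int      # system prompt + tools schema, resent every turn
    variable_tokens: int    # user input + tool results, per turn
    output_tokens: int      # tool call args + final generation, per turn
    reasoning_tokens: int   # per turn, billed at the output rate
    retry_rate: float       # 0.10 means 10% of the work is repeated
    cache_hit_rate: float   # share of prefix tokens read from cache
    tasks_per_month: int


@dataclass
class Prices:
    # USD per million tokens. Illustrative numbers, not any provider's price list.
    input: float
    cached_input: float
    output: float


def cost_per_task(w: Workload, p: Prices, caching: bool = True) -> float:
    hit = w.cache_hit_rate if caching else 0.0
    # Steady-state blend: hits pay the cached rate, misses pay the full input rate.
    prefix_rate = hit * p.cached_input + (1.0 - hit) * p.input
    per_turn = (
        w.prefix_tokens * prefix_rate
        + w.variable_tokens * p.input
        + (w.output_tokens + w.reasoning_tokens) * p.output
    ) / 1_000_000
    # Retries scale the whole per-task subtotal.
    return per_turn * w.turns_per_task * (1.0 + w.retry_rate)


def monthly_cost(w: Workload, p: Prices, caching: bool = True) -> float:
    return cost_per_task(w, p, caching) * w.tasks_per_month


example = Workload(
    turns_per_task=10, prefix_tokens=4000, variable_tokens=1000,
    output_tokens=500, reasoning_tokens=1000,
    retry_rate=0.10, cache_hit_rate=0.75, tasks_per_month=1000,
)
prices = Prices(input=3.0, cached_input=0.30, output=15.0)

with_cache = monthly_cost(example, prices)
without_cache = monthly_cost(example, prices, caching=False)
print(f"per task: ${cost_per_task(example, prices):.4f}")
print(f"monthly: ${with_cache:.2f} with caching, ${without_cache:.2f} without")
print(f"caching saves: ${without_cache - with_cache:.2f}")
```

Annualized cost is the monthly figure times 12. Note that in this example output and reasoning tokens account for most of the per-task cost, so caching only trims the prefix share of the bill.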

What this models

Every turn in an agent loop sends a stable prefix plus variable input, and generates output that may include reasoning. Caching changes the math on the prefix. Retries multiply everything.

Per-turn input

  • System prompt: stable, cacheable
  • Tools schema: stable, cacheable
  • User input: varies per turn
  • Tool results: varies per turn

Per-turn output

  • Reasoning tokens: billed as output (o3, Claude thinking, Gemini thinking)
  • Tool call args: small, billed as output
  • Final generation: the model's response

Workload templates

The dropdown above pre-fills representative numbers for common shapes. Adjust freely.

Reality check. Agent costs are dominated by two things: how many turns the loop takes, and how much repeated context you send. If your monthly bill looks scary, look at those two before swapping models. Often the answer is "send less context per turn" or "cap the loop earlier," not "swap GPT-5 for Haiku."
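To see why turns and repeated context dominate, here is a rough sketch of how input tokens add up across one task. It assumes a simple growth pattern that is not taken from the simulator: every turn resends the stable prefix plus the full history so far, and each turn adds a fixed number of new tokens. The function name and the numbers are hypothetical.

```python
def total_input_tokens(turns: int, prefix_tokens: int, new_tokens_per_turn: int) -> int:
    """Input tokens sent over one task when each turn resends the whole history."""
    # Turn t (1-indexed) sends the prefix plus t turns' worth of accumulated context.
    return sum(prefix_tokens + t * new_tokens_per_turn for t in range(1, turns + 1))


baseline = total_input_tokens(turns=20, prefix_tokens=4000, new_tokens_per_turn=1500)
capped = total_input_tokens(turns=10, prefix_tokens=4000, new_tokens_per_turn=1500)
leaner = total_input_tokens(turns=20, prefix_tokens=4000, new_tokens_per_turn=750)

print(f"baseline (20 turns):       {baseline:,} input tokens")
print(f"loop capped at 10 turns:   {capped:,} input tokens")
print(f"half the context per turn: {leaner:,} input tokens")
```

Under this assumption the history term grows with the square of the turn count, so halving the loop length cuts input tokens by more than half, while a model swap only changes the price per token.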

For an explicit caching-vs-no-caching breakdown across providers, see the cache calculator.
Using Claude Code specifically? The general agent calculator above is the right tool for "any agent, any model" workloads. For a Claude-Code-specific view that handles Pro vs Max vs API recommendations and bakes in Claude Code's typical session shape, see the Claude Code cost calculator.