โ† Back to AI Cost Calculator

๐Ÿฆž AI API Pricing Explained

How LLM per-token pricing works in 2026 โ€” and how to estimate your costs before you build.

What Are Tokens?

AI models don't process English words โ€” they process tokens. A token is roughly 4 characters or 0.75 English words. When you send a prompt, the provider counts the tokens in your input (prompt + conversation history) and in the model's response (output). You're billed for both.

Quick reference:
1 token โ‰ˆ 4 characters โ‰ˆ 0.75 English words
250 words (1 page) โ‰ˆ 333 tokens
75,000 words (300-page book) โ‰ˆ 100,000 tokens
1M tokens โ‰ˆ 750,000 words โ‰ˆ 3,000 pages โ‰ˆ 10 books

Input vs Output Pricing

Every AI API charges different rates for input and output. Output is almost always more expensive โ€” typically 3โ€“5ร— the input rate โ€” because generating text requires more compute than reading it.

ModelInput $/1MOutput $/1MOutput multiplier
GPT-4o$2.50$10.004.0ร—
Claude Sonnet 4$3.00$15.005.0ร—
Claude Opus 4$15.00$75.005.0ร—
Gemini 2.5 Flash$0.15$0.604.0ร—
DeepSeek V3$0.27$1.104.1ร—
GLM-4 Air (cheapest)$0.01$0.011.0ร—

This matters because most applications are output-heavy โ€” you send a short prompt, get a long response. A coding assistant that generates 1,500 tokens of code per prompt will spend more on output than input even if input is cheaper.

How Pricing Works: The Math

The formula is straightforward:

Monthly cost = requests/day ร— 30 ร— (input_tokens ร— input_price + output_tokens ร— output_price) รท 1,000,000

Example: 1,000 requests/day to GPT-4o with 2,000 input + 500 output tokens each:
= 1,000 ร— 30 ร— (2,000 ร— $2.50 + 500 ร— $10.00) รท 1,000,000
= 30,000 ร— ($5.00 + $5.00) รท 1,000,000
= $150/month

Same volume on cheapest models (GLM-4 Air at $0.01/$0.01):
= 30,000 ร— ($0.02 + $0.005) รท 1,000,000
= $0.75/month โ€” 200ร— cheaper.

Price Range: What Models Cost in 2026

Across 402 per-token models, pricing spans four orders of magnitude:

Budget tier
$0.01โ€“0.10
GLM-4 Air, Qwen Turbo
Mid-tier
$0.15โ€“2.00
Gemini Flash, DeepSeek V3
Premium
$2.50โ€“15.00
GPT-4o, Claude Sonnet
Frontier
$15.00โ€“75.00
Claude Opus 4, o4-mini

Prompt Caching: The 50% Discount You're Missing

OpenAI and Anthropic support prompt caching โ€” when the model recognizes that some input tokens are identical to a previous request, it reuses the cached computation and charges ~50% less for those tokens.

Caching works best for:

At 80% cache hit rate, your effective input cost drops by 40%. Enable caching in the AI Cost Calculator to see the difference.

Batch Processing: Half Price, 24-Hour Turnaround

Anthropic and OpenAI offer batch APIs at 50% discount. You submit a file of requests and get results within 24 hours. Ideal for:

Subscription Pricing: Ollama Cloud

Not all models charge per token. Ollama Cloud uses flat monthly subscriptions:

PlanPriceConcurrencyUse case
Free$0/mo1 modelDevelopment, personal projects
Pro$20/mo3 modelsFreelance, side projects, prototyping
Max$100/mo10 modelsProduction teams, parallel workloads

If you're running high-volume agent workloads (1M+ tokens/day), subscription pricing is usually cheaper than per-token. Run both sides in the calculator to compare.

OpenRouter: The Aggregator Discount

OpenRouter aggregates 361 community-hosted models โ€” many are open-weight models (Llama, Qwen, Gemma) served at near-cost pricing. The platform adds a small margin but gives you one API key for access to hundreds of models. OpenRouter prices tend to be 30-60% cheaper than direct API for equivalent models because of competitive hosting.

Estimating Your Actual Bill

Don't guess. Use real numbers:

  1. Count your requests. Track daily API calls from your application logs.
  2. Measure your token usage. Every API response includes a usage field with exact input/output token counts.
  3. Plan for growth. Multiply current usage by your projected user growth over 3 months.
  4. Compare providers. Different models at different price points โ€” use the calculator for a side-by-side comparison with your numbers.

๐Ÿฆž Stop guessing. Start calculating.

Enter your usage volume and compare costs across all 461 models โ€” daily, monthly, and annual breakdowns.

Launch AI Cost Calculator โ†’