🦞 Cheapest AI Models in 2026

The 25 cheapest AI model APIs ranked by per-token price. Updated May 2026 from live pricing data across 461 models.

Cheapest input price: $0.01 / 1M tokens

Cheapest combined (in+out): $0.03 / 1M tokens

Free tiers available: Ollama Cloud Free · Tencent Lite (free) · Baidu Speed (free)

Models tracked: 402 per-token + 20 subscription

The cheapest AI APIs cluster around $0.01–$0.10 per million input tokens. Chinese providers (Zhipu, Tencent, Alibaba, ByteDance) dominate the budget tier alongside OpenRouter community-hosted models. Most cheap models are open-weight (Llama, Qwen, Mistral). With them, you trade frontier reasoning for affordability.

All prices below are live. Click any model name to jump to the AI Cost Calculator for a full monthly cost breakdown with your usage volume.

💰 Top 25 Cheapest AI Models by Combined Price

#	Model	Provider	Type	Input $/1M	Output $/1M	Combined $/1M	Context
1	GLM-4 Air	Zhipu AI	Direct	$0.01	$0.01	$0.03	128K
2	GLM-4 Long	Zhipu AI	Direct	$0.01	$0.01	$0.03	1M
3	Hunyuan Standard	Tencent	Direct	$0.01	$0.01	$0.03	32K
4	Mistral Nemo	Mistral	OpenRouter	$0.02	$0.03	$0.05	131K
5	Llama 3.1 8B	Meta	OpenRouter	$0.02	$0.05	$0.07	16K
6	Llama 3 8B	Meta	OpenRouter	$0.03	$0.04	$0.07	8K
7	Qwen Turbo	Alibaba Cloud	Direct	$0.02	$0.06	$0.08	131K
8	MiniMax M2.5	MiniMax	Direct	$0.04	$0.04	$0.08	131K
9	ABAB 6.5	MiniMax	Direct	$0.04	$0.04	$0.08	245K
10	Llama 3 8B Lunaris	Sao10K	OpenRouter	$0.04	$0.05	$0.09	8K
11	Doubao Lite	ByteDance	Direct	$0.04	$0.08	$0.12	131K
12	Yi Medium	01.AI	Direct	$0.04	$0.08	$0.12	32K
13	Mixtral 8x7B (Groq)	Groq	Direct	$0.06	$0.06	$0.12	32K
14	Gemma 3 4B	Google	OpenRouter	$0.04	$0.08	$0.12	131K
15	MythoMax 13B	Gryphe	OpenRouter	$0.06	$0.06	$0.12	4K
16	Granite 4.0 Micro	IBM	OpenRouter	$0.02	$0.11	$0.13	131K
17	Mistral Small 3	Mistral	OpenRouter	$0.05	$0.08	$0.13	32K
18	Step-1V (Vision)	StepFun	Direct	$0.07	$0.07	$0.14	32K
19	Qwen 2.5 7B	Qwen	OpenRouter	$0.04	$0.10	$0.14	32K
20	LFM2-24B-A2B	Liquid AI	OpenRouter	$0.03	$0.12	$0.15	32K
21	MiniMax M2.7	MiniMax	Direct	$0.08	$0.08	$0.16	131K
22	Qwen-Turbo	Qwen	OpenRouter	$0.03	$0.13	$0.16	131K
23	Gemma 3 12B	Google	OpenRouter	$0.04	$0.13	$0.17	131K
24	GPT-OSS-20B	OpenAI	OpenRouter	$0.03	$0.14	$0.17	131K
25	Qwen3 235B A22B	Qwen	OpenRouter	$0.07	$0.10	$0.17	262K
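
The Combined column simply adds the input and output prices, but your real bill depends on how many tokens you send versus receive. Here is a minimal sketch of that arithmetic. The `monthly_cost` helper and the usage volumes are assumptions for illustration; the prices come from rows 4 and 25 of the table.

```python
def monthly_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the monthly cost in USD. Prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumed workload: 500M input tokens and 100M output tokens per month.
usage = {"input_tokens": 500_000_000, "output_tokens": 100_000_000}

# Prices taken from the table above (input $/1M, output $/1M).
nemo = monthly_cost(**usage, input_price=0.02, output_price=0.03)   # Mistral Nemo
qwen3 = monthly_cost(**usage, input_price=0.07, output_price=0.10)  # Qwen3 235B A22B

print(f"Mistral Nemo:    ${nemo:.2f}/mo")
print(f"Qwen3 235B A22B: ${qwen3:.2f}/mo")
```

If your workload is input-heavy (long prompts, short answers), the input price matters far more than the combined figure suggests, so it is worth re-ranking with your own token mix.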

🆓 Free and Subscription Options

Several providers offer free tiers, and Ollama also sells a flat-rate subscription:

Ollama Cloud Free — $0/mo, 1 concurrent cloud model, light usage limits. Access to DeepSeek V4, Kimi K2.6, Qwen3.5 and 17 other models.
Tencent Hunyuan Lite — $0/1M tokens, completely free tier.
Baidu ERNIE Speed — $0/1M tokens, free tier with rate limits.
Ollama Cloud Pro — $20/mo, 3 concurrent models, 50× free tier usage.

⚡ Tips for Keeping AI Costs Low

Use prompt caching. OpenAI and Anthropic discount cached input tokens, by roughly 50% to 90% depending on the provider and model. If your system prompt or context is repetitive, enable caching in the calculator and set a 50–80% cache hit rate.
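
To see what a given cache hit rate is worth, you can price the cached and uncached shares of your input separately. This is a rough sketch, not any provider's billing formula: `cached_input_cost`, the $1.00 price, and the 200M-token volume are hypothetical, and it ignores any surcharge a provider may apply when writing to the cache.

```python
def cached_input_cost(input_tokens, input_price, hit_rate, cache_discount):
    """Input cost in USD when a share of input tokens is served from cache.

    input_price is USD per 1M tokens; hit_rate and cache_discount are
    fractions between 0 and 1.
    """
    cached = input_tokens * hit_rate
    uncached = input_tokens - cached
    cost = uncached * input_price + cached * input_price * (1 - cache_discount)
    return cost / 1_000_000


# Assumed: 200M input tokens per month at a hypothetical $1.00 per 1M tokens,
# with a 50% discount on cached tokens.
baseline = cached_input_cost(200_000_000, 1.00, hit_rate=0.0, cache_discount=0.5)
with_cache = cached_input_cost(200_000_000, 1.00, hit_rate=0.7, cache_discount=0.5)

print(f"No caching:   ${baseline:.2f}")
print(f"70% hit rate: ${with_cache:.2f}")
```

The saving on input cost is the hit rate times the discount, so a 70% hit rate at a 50% discount trims input spend by about 35%.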

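Choose shorter context models. You pay for the tokens you actually send, not for the size of the window, but a massive context window (1M+) makes it easy to send far more than the task needs, and some providers charge higher rates for very long prompts. Match the context to your task.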
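
One way to match context to the task is to cap how much conversation history goes into each request. The sketch below keeps only the most recent messages that fit a token budget. `trim_history` is a hypothetical helper, and the 4-characters-per-token estimate is a rough assumption; use your provider's tokenizer when you need exact counts.

```python
def trim_history(messages, max_tokens, chars_per_token=4):
    """Keep the most recent messages that fit within max_tokens.

    Token counts are estimated as ceil(len(text) / chars_per_token),
    which is only a heuristic.
    """
    kept, used = [], 0
    for text in reversed(messages):  # walk from newest to oldest
        estimate = -(-len(text) // chars_per_token)  # ceiling division
        if used + estimate > max_tokens:
            break
        kept.append(text)
        used += estimate
    return list(reversed(kept)), used  # restore chronological order


# Three messages of roughly 100, 50, and 25 estimated tokens.
history = ["a" * 400, "b" * 200, "c" * 100]
trimmed, tokens = trim_history(history, max_tokens=80)

print(len(trimmed), "messages kept,", tokens, "estimated tokens")
```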

Use batch processing. Anthropic and OpenAI offer a 50% discount on batch requests, with results returned within 24 hours. This is a good fit for eval runs, data labeling, and bulk summarization.
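
Only the share of your workload that can wait qualifies for the batch discount, so the overall saving depends on that share. A small sketch of the arithmetic, assuming a hypothetical `cost_with_batching` helper and a made-up $1,000 monthly bill:

```python
def cost_with_batching(total_cost, batch_share, batch_discount=0.5):
    """Return the bill in USD after moving batch_share of spend to batch pricing.

    batch_share and batch_discount are fractions between 0 and 1.
    """
    realtime = total_cost * (1 - batch_share)
    batched = total_cost * batch_share * (1 - batch_discount)
    return realtime + batched


# Assumed: $1,000/month at standard pricing, 60% of it tolerant of a 24-hour wait.
new_bill = cost_with_batching(1000.0, batch_share=0.6)

print(f"Bill with batching: ${new_bill:.2f}")
```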

Prototype with the cheapest model. Use a free or cheap model for experimentation and prototyping, then switch to a frontier model only for final production runs.

🦞 Find your cheapest model

Enter your actual usage volume and get a personalized monthly cost comparison across all 461 models.
