Stop paying the MCP token tax

35x

fewer tokens per task with OnlyCLI vs MCP

MCP (GitHub, 93 tools)

55,000

tokens loaded on every turn

OnlyCLI (GitHub, 1,107 commands)

~200

tokens for --help discovery

3 MCP services connected

72%

of a 200K-token context window consumed while idle

SKILL.md agent summary

~400

tokens for full command discovery

Real task: "What languages does octocat/Hello-World use?"

Approach	Tokens	Cost (Claude Sonnet)	Ratio
MCP (GitHub server)	44,026	$0.132	baseline
OnlyCLI	1,365	$0.004	32x cheaper
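
The cost column follows from the token counts once a price is fixed. The sketch below assumes a flat rate of $3 per million input tokens. That rate is inferred from the table ($0.132 for 44,026 tokens), not stated on this page, and the calculation ignores output-token pricing.

```python
# Assumed flat input price, inferred from the table above; not an official figure.
USD_PER_MILLION_TOKENS = 3.00


def task_cost_usd(tokens: int, usd_per_million: float = USD_PER_MILLION_TOKENS) -> float:
    """Cost of sending `tokens` input tokens at a flat per-million rate."""
    return tokens * usd_per_million / 1_000_000


mcp_tokens = 44_026  # MCP (GitHub server), from the table
cli_tokens = 1_365   # OnlyCLI, from the table

mcp_cost = task_cost_usd(mcp_tokens)
cli_cost = task_cost_usd(cli_tokens)
token_ratio = mcp_tokens / cli_tokens

print(f"MCP: ${mcp_cost:.3f}  CLI: ${cli_cost:.3f}  ratio: {token_ratio:.1f}x")
# prints: MCP: $0.132  CLI: $0.004  ratio: 32.3x
```

Under that assumed price, the token ratio (about 32x) and the dollar figures in the table line up.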

Cost at scale

Daily requests	MCP overhead / month	CLI overhead / month	Monthly savings
100	$510	~$0	>99%
1,000	$5,100	~$12	99.8%
10,000	$51,000	~$120	99.8%

MCP overhead: schema injection on every completion request (default behavior).
CLI overhead: occasional --help calls at conversation start.
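
The monthly figures can be approximated as daily requests × days × overhead tokens per request × price. The sketch below is a rough check, not the page's own calculation. The 30-day month and the $3 per million token price are assumptions, and the 55,000-token overhead is taken from the headline figure. With those inputs, 100 daily requests come to $495, close to the table's $510 but not identical, so the table presumably uses slightly different inputs. The savings percentage is computed directly from the table's dollar amounts.

```python
def monthly_overhead_usd(daily_requests: int,
                         overhead_tokens_per_request: float,
                         usd_per_million: float = 3.00,   # assumed price
                         days_per_month: int = 30) -> float:  # assumed month length
    """Monthly cost of tokens spent purely on tool discovery overhead."""
    tokens = daily_requests * days_per_month * overhead_tokens_per_request
    return tokens * usd_per_million / 1_000_000


def savings_percent(mcp_usd: float, cli_usd: float) -> float:
    """Share of the MCP overhead cost avoided by the CLI approach."""
    return (1 - cli_usd / mcp_usd) * 100


estimate = monthly_overhead_usd(100, 55_000)  # 100 requests/day, full catalog each time
table_savings = savings_percent(5_100, 12)    # dollar figures from the 1,000/day row

print(f"estimated MCP overhead: ${estimate:.0f}/month")
print(f"savings at 1,000 requests/day: {table_savings:.1f}%")
```

Because the overhead is linear in request volume, multiplying the daily requests by ten multiplies both columns by ten and leaves the savings percentage unchanged.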

Why is MCP so expensive?

Each MCP tool definition costs 550–1,400 tokens. When an agent connects to an MCP server, the host injects every tool’s JSON Schema into the model’s context on each request, whether the model uses zero tools or ten. There is no standard lazy-loading mechanism across providers.

With a CLI, the “schema” is the --help text: roughly 80–150 tokens for one subcommand, read on demand. The agent pulls only what it needs instead of carrying the full catalog on every turn.

	MCP	CLI (OnlyCLI)
Discovery model	Push all schemas every turn	Pull `--help` on demand
Per-tool cost	550–1,400 tokens	80–150 tokens (only when read)
Idle cost	Full catalog in every request	Zero
Scaling	Linear with tool count	Constant (only read what you need)
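
The push and pull rows above can be made concrete by counting discovery tokens over a conversation. This is a simplified sketch: the 10-turn conversation and the two help reads are hypothetical, the per-tool and per-read costs are the ranges quoted in the table, and the pull side counts only tokens newly added to context (help output that stays in the conversation history is not counted again).

```python
def push_tokens(tool_count: int, tokens_per_tool: int, turns: int) -> int:
    """Push model: the full tool catalog is sent again on every turn."""
    return tool_count * tokens_per_tool * turns


def pull_tokens(help_reads: int, tokens_per_read: int) -> int:
    """Pull model: tokens are spent only when help text is actually read."""
    return help_reads * tokens_per_read


TOOLS = 93   # GitHub MCP server tool count quoted on this page
TURNS = 10   # hypothetical conversation length

one_turn_low = push_tokens(TOOLS, 550, 1)         # low end of the per-tool range
push_low = push_tokens(TOOLS, 550, TURNS)
push_high = push_tokens(TOOLS, 1_400, TURNS)      # high end of the per-tool range
pulled = pull_tokens(help_reads=2, tokens_per_read=150)  # hypothetical: two --help reads

print(one_turn_low, push_low, push_high, pulled)
# prints: 51150 511500 1302000 300
```

A single turn at the low end of the range comes to 51,150 tokens, in line with the roughly 55,000 quoted for the 93-tool GitHub server. The push total grows with both tool count and turn count, while the pull total depends only on how many help pages the agent reads.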

The ecosystem agrees

Multiple independent projects have measured the same gap:

mcp2cli: 96–99% token savings via lazy CLI discovery
CLIHub: 92–98% savings converting MCP servers to CLIs
Anthropic Tool Search: ~85% reduction (Anthropic-only, still loads full schemas on use)
Vensas benchmark: MCP 4–32x more expensive per task

OnlyCLI goes further: instead of wrapping MCP at runtime, it generates a native, compiled CLI from your OpenAPI spec. No runtime dependency. No MCP server. Just a binary.

Get started with OnlyCLI