Stop paying the MCP token tax

35x fewer tokens per task with OnlyCLI vs MCP

  • MCP (GitHub, 93 tools): 55,000 tokens loaded on every turn
  • OnlyCLI (GitHub, 1,107 commands): ~200 tokens for --help discovery
  • 3 MCP services connected: 72% of a 200K context window consumed while idle
  • SKILL.md agent summary: ~400 tokens for full command discovery

Real task: "What languages does octocat/Hello-World use?"

Approach Tokens Cost (Claude Sonnet) Ratio
MCP (GitHub server) 44,026 $0.132 baseline
OnlyCLI 1,365 $0.004 32x cheaper

Cost at scale

Daily requests MCP overhead / month CLI overhead / month Monthly savings
100 $510 ~$0 >99%
1,000 $5,100 ~$12 99.8%
10,000 $51,000 ~$120 99.8%

MCP overhead: schema injection on every completion request (default behavior).
CLI overhead: occasional --help calls at conversation start.
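The MCP column is simple arithmetic to reproduce. A back-of-envelope sketch, assuming 55,000 overhead tokens per request, Claude Sonnet input pricing of $3 per million tokens, and a 31-day month (the table's figures are rounded and don't state their exact assumptions):

```python
# Back-of-envelope reproduction of the "Cost at scale" MCP column.
# Assumptions: 55,000 overhead tokens injected per request, $3 per
# million input tokens (Claude Sonnet), 31 days per month.
SONNET_INPUT_PER_TOKEN = 3 / 1_000_000  # $/token, assumed pricing
MCP_OVERHEAD_TOKENS = 55_000            # full tool catalog, every request
DAYS_PER_MONTH = 31

def monthly_mcp_overhead(daily_requests: int) -> float:
    """Dollars spent per month just carrying the MCP tool schemas."""
    return (daily_requests * DAYS_PER_MONTH
            * MCP_OVERHEAD_TOKENS * SONNET_INPUT_PER_TOKEN)

for daily in (100, 1_000, 10_000):
    print(f"{daily:>6} req/day -> ${monthly_mcp_overhead(daily):,.0f}/month")
```

At 100 requests/day this lands at roughly $512/month, matching the ~$510 in the table; the other rows scale linearly from there.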


Why is MCP so expensive?

Each MCP tool definition costs 550–1,400 tokens. When an agent connects to an MCP server, the host injects every tool’s JSON Schema into the system prompt—whether the model uses zero tools or ten. There is no standard lazy-loading mechanism across providers.
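To make that concrete, here is roughly what one tool definition looks like. The shape (name, description, inputSchema carrying a JSON Schema) follows the MCP spec; the specific tool and fields below are a hypothetical, heavily trimmed example, not the real GitHub server's schema:

```python
import json

# A single MCP tool definition. The structure follows the MCP spec;
# this particular tool is a hypothetical, stripped-down illustration.
tool = {
    "name": "list_repository_languages",
    "description": "List the programming languages used in a repository, "
                   "with byte counts per language.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "owner": {"type": "string", "description": "Repository owner"},
            "repo": {"type": "string", "description": "Repository name"},
        },
        "required": ["owner", "repo"],
    },
}

# Rough size check (~4 characters per token heuristic): even this minimal
# definition costs on the order of 100 tokens. Production definitions with
# long descriptions, enums, and nested objects reach the 550-1,400 range.
print(f"~{len(json.dumps(tool)) // 4} tokens")
```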

With a CLI, the “schema” is the --help text: ~80 tokens for one subcommand, read on demand. The agent pulls what it needs instead of carrying everything.

  MCP CLI (OnlyCLI)
Discovery model Push all schemas every turn Pull --help on demand
Per-tool cost 550–1,400 tokens 80–150 tokens (only when read)
Idle cost Full catalog in every request Zero
Scaling Linear with tool count Constant (only read what you need)
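The scaling row is the crux. A toy model of the two discovery strategies makes it visible (the per-tool token figures are the article's; the turn count and usage pattern are illustrative assumptions):

```python
# Push vs pull discovery, using the article's per-tool token figures.
MCP_TOKENS_PER_TOOL = 975   # midpoint of the 550-1,400 range
CLI_TOKENS_PER_HELP = 80    # one subcommand's --help text

def mcp_discovery_tokens(total_tools: int, turns: int) -> int:
    # Push: every schema is injected into every request, used or not,
    # so cost grows with catalog size AND conversation length.
    return total_tools * MCP_TOKENS_PER_TOOL * turns

def cli_discovery_tokens(commands_used: int) -> int:
    # Pull: --help is read once per command the agent actually runs.
    return commands_used * CLI_TOKENS_PER_HELP

# 93 GitHub tools, a 5-turn task that touches 2 commands:
print(mcp_discovery_tokens(93, 5))   # hundreds of thousands of tokens
print(cli_discovery_tokens(2))       # a few hundred tokens
```

Under these assumptions the MCP side burns over 450,000 input tokens on schemas alone, while the CLI side reads 160 tokens of help text, which is the linear-vs-constant distinction in the table.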

The ecosystem agrees

Multiple independent projects have measured the same gap:

  • mcp2cli: 96–99% token savings via lazy CLI discovery
  • CLIHub: 92–98% savings converting MCP servers to CLIs
  • Anthropic Tool Search: ~85% reduction (Anthropic-only, still loads full schemas on use)
  • Vensas benchmark: MCP 4–32x more expensive per task

OnlyCLI goes further: instead of wrapping MCP at runtime, it generates a native, compiled CLI from your OpenAPI spec. No runtime dependency. No MCP server. Just a binary.