Stop paying the MCP token tax
MCP (GitHub, 93 tools)
OnlyCLI (GitHub, 1,107 commands)
3 MCP services connected
SKILL.md agent summary
Real task: "What languages does octocat/Hello-World use?"
| Approach | Tokens | Cost (Claude Sonnet) | Ratio |
|---|---|---|---|
| MCP (GitHub server) | 44,026 | $0.132 | --- |
| OnlyCLI | 1,365 | $0.004 | 32x cheaper |
Cost at scale
| Daily requests | MCP overhead / month | CLI overhead / month | Monthly savings |
|---|---|---|---|
| 100 | $510 | ~$0 | >99% |
| 1,000 | $5,100 | ~$12 | 99.8% |
| 10,000 | $51,000 | ~$120 | 99.8% |
MCP overhead: schema injection on every completion request (default behavior).
CLI overhead: occasional --help calls at conversation start.
Why is MCP so expensive?
Each MCP tool definition costs 550–1,400 tokens. When an agent connects to an MCP server, the host injects every tool’s JSON Schema into the system prompt—whether the model uses zero tools or ten. There is no standard lazy-loading mechanism across providers.
With a CLI, the “schema” is the --help text: ~80 tokens for one subcommand, read on demand. The agent pulls what it needs instead of carrying everything.
| MCP | CLI (OnlyCLI) | |
|---|---|---|
| Discovery model | Push all schemas every turn | Pull --help on demand |
| Per-tool cost | 550–1,400 tokens | 80–150 tokens (only when read) |
| Idle cost | Full catalog in every request | Zero |
| Scaling | Linear with tool count | Constant (only read what you need) |
The ecosystem agrees
Multiple independent projects have measured the same gap:
- mcp2cli: 96–99% token savings via lazy CLI discovery
- CLIHub: 92–98% savings converting MCP servers to CLIs
- Anthropic Tool Search: ~85% reduction (Anthropic-only, still loads full schemas on use)
- Vensas benchmark: MCP 4–32x more expensive per task
OnlyCLI goes further: instead of wrapping MCP at runtime, it generates a native, compiled CLI from your OpenAPI spec. No runtime dependency. No MCP server. Just a binary.