Back to blog

April 7, 2026

Cost Optimization

OpenClaw Cost Optimization: Cut Your Monthly Bill by 90% (Real Numbers)

The default OpenClaw config uses premium models for everything -- file reads, status checks, heartbeats, and actual complex reasoning all hit the same expensive endpoint. Here is how to cut that by 90%.

Where Your Money Actually Goes

Unoptimized OpenClaw setups commonly cost $300-600 per month, with some exceeding $1,000. Before optimizing, you need to know where tokens are consumed. Context history alone accounts for 40-50% of total usage.

Run openclaw usage to see your actual breakdown first. The numbers will tell you exactly where to focus.

40-50%

Context history

Every message includes the full conversation history. Longer conversations burn more tokens per message.

15-25%

System prompt / AGENTS.md

Your bootstrap files get injected into every single API call. A 21KB AGENTS.md adds ~5,250 tokens to every message.

10-20%

Heartbeats

Periodic check-ins that run even when nothing is happening. Default 15-minute intervals with premium models are the most wasteful config.

5-15%

Tool calls + cron jobs

File reads, web searches, and command execution each consume tokens. Cron jobs pay the full bootstrap cost on every isolated run.

Model Routing: The Single Biggest Cost Lever

Model routing assigns different AI models to different task types based on complexity. This single optimization typically reduces costs by 60-80%. The principle: a quick file lookup does not require the same reasoning power as debugging a complex system.

Model Pricing Comparison (April 2026)

ModelInput ($/MTok)Output ($/MTok)Best For
Claude Opus 4.6$15.00$75.00Complex reasoning, coding
Claude Sonnet 4.6$3.00$15.00General tasks, writing
Claude Haiku 4.5$0.25$1.25Simple tasks, classification
GPT-4o-mini$0.15$0.60Bulk operations, formatting
DeepSeek V3$0.27$1.10Coding, analysis

Opus vs Haiku is a 60x price gap on input tokens. Routing 80% of routine tasks to budget models while reserving premium models for complex reasoning cuts API costs by 60-80%.

Recommended Model Routing Config

Use Sonnet as your daily driver. It handles 90% of tasks well at 5x less cost than Opus. Reserve Opus for explicit complex reasoning with /think xhigh, not as the default.

{ "agent": { "model": "anthropic/claude-sonnet-4-6", "fallback": "anthropic/claude-haiku-4-5", "thinking": "high" } }

Cron Job Model Assignment

  • Memory backup, file operations -- Haiku ($0.25/MTok)
  • Morning briefs, content review -- Sonnet ($3/MTok)
  • Blog writing, complex analysis -- Sonnet with high thinking ($3/MTok + extended thinking)
  • Builder tasks, code generation -- Opus or Codex only when actually needed

The complete playbook

Get every config, every optimization, and the reasoning behind each decision in the full KaiShips Guide to OpenClaw.

This post covers cost optimization. The guide covers memory architecture, skill development, multi-agent routing, cron automation, and 4 more chapters of production-tested configs.

Get the KaiShips Guide to OpenClaw -- $29

Heartbeat Optimization: Stop Paying to Do Nothing

Heartbeats are the silent budget killer. A default configuration with 15-minute heartbeats using Opus runs 96 API calls per day, each loading your full system prompt and context. Switching from Claude Opus to DeepSeek V3 alone saves $42/month.

STEP 1

Switch the heartbeat model

Heartbeats do lightweight checks -- scan Reddit, update memory, check for notifications. Sonnet or DeepSeek V3 handles this fine.

"heartbeat": { "model": "anthropic/claude-sonnet-4-6", "intervalMinutes": 30, "activeHours": { "start": "08:00", "end": "23:00" } }
STEP 2

Increase the interval

Going from 15 to 30 minutes cuts heartbeat costs in half. Going to 60 minutes cuts them by 75%. Unless you need sub-minute responsiveness, 30 minutes is the sweet spot.

STEP 3

Set active hours

If you are offline from 11 PM to 8 AM, that is 9 hours of wasted heartbeats. Setting active hours to your waking hours saves 37% of heartbeat costs immediately.

Combined impact: 80-90% reduction in heartbeat costs

These three changes together address model cost, call frequency, and idle waste simultaneously.

Context Management: Shrink Every API Call

Every API call includes your system prompt, conversation history, and tool definitions. Shrinking these payloads reduces costs on every single interaction -- not just some of them.

Keep AGENTS.md Under 4KB

A bloated 21KB AGENTS.md adds 5,250 tokens to every API call. At Sonnet pricing, that is $0.016 per message -- or $16 per 1,000 messages -- just for the bootstrap. AGENTS.md should be a lean router, not a knowledge base.

BEFORE

21KB AGENTS.md with project details, API configs, content calendars, and meeting notes inline.

AFTER

3.5KB AGENTS.md with startup sequence, memory rules, red lines, and pointers to reference/ files loaded on demand.

Enable Prompt Caching

The cache-ttl mode uses Anthropic's prompt caching, which charges only 10% for cached tokens. According to Anthropic's documentation, this alone can reduce per-message costs by 50-80%.

"contextPruning": { "mode": "cache-ttl", "ttl": "1h" }

Compaction Strategy

When context gets too long, OpenClaw can automatically compact it -- summarizing older messages while keeping recent ones intact. Set compaction to safeguard mode to preserve important context while trimming the rest.

"compaction": { "mode": "safeguard" }

Real Cost Breakdown: What We Actually Pay

We run a 24/7 OpenClaw agent (Kai) that handles Discord messages, runs 7 cron jobs, writes SEO blog posts, scouts job listings, and manages memory files -- all for approximately $100/month on Claude Max. Before optimization, this same workload would cost $400+ in API credits.

OptimizationSavingsEffort
Model routing (Sonnet default)50-60%5 minutes
Heartbeat tuning80-90% of heartbeat costs5 minutes
Prompt caching (cache-ttl)50-80% of per-message costs2 minutes
AGENTS.md trimming$12/1,000 messages30 minutes
Cron model assignment40-70% of cron costs10 minutes
Combined85-95% total reduction~1 hour

The 5-Minute Quick Start

If you only have 5 minutes, do these three things. They will cut your bill in half immediately.

1

Switch your default model to Sonnet

In openclaw.json, change claude-opus-4-6 to claude-sonnet-4-6. This alone saves 50-60%.

2

Double your heartbeat interval

Change "intervalMinutes": 15 to "intervalMinutes": 30. Instant 50% heartbeat savings.

3

Enable prompt caching

Add "contextPruning": {"mode": "cache-ttl", "ttl": "1h"} to your config. Cached tokens cost 10% of normal price.

Total time: 5 minutes. Expected savings: 50-70%.

Then work through heartbeats, context management, and cron assignment as you have time. Every optimization compounds.

The complete playbook

Get every config, every optimization, and the reasoning behind each decision in the full KaiShips Guide to OpenClaw.

This post covers cost optimization. The guide covers memory architecture, skill development, multi-agent routing, cron automation, and 4 more chapters of production-tested configs.

Get the KaiShips Guide to OpenClaw -- $29

Advanced: Local Models for Background Tasks

For maximum savings, route low-stakes background tasks to local models via Ollama. Memory backups, file organization, and simple status checks don't need cloud API calls at all.

This approach works best with a Mac with 16GB+ RAM or a dedicated server. Models like Qwen 2.5 7B or Llama 3.2 8B run well locally at zero marginal cost.

GOOD FIT
  • ● Memory backups
  • ● File organization
  • ● Simple status checks
  • ● Text classification
POOR FIT
  • ● Complex reasoning
  • ● Long-form writing
  • ● Multi-step planning
  • ● Code generation

Monitoring: Know Before You Overspend

Set up cost monitoring so you catch runaway spending before it hits your wallet.

  • Daily check. Run openclaw usage --period day to see today's spend.
  • Weekly review. Add a cron job that reports weekly costs to your Discord.
  • Budget alerts. Set a monthly budget cap in your API provider's dashboard (Anthropic Console, OpenAI, etc.)
  • Token logging. Enable verbose logging temporarily to trace exactly which operations consume the most tokens.

Frequently Asked Questions

How much does OpenClaw cost per month?

OpenClaw itself is free and open source (MIT license). All costs come from API token usage and optional hosting. A cost-optimized personal setup runs $6-30 per month. Without optimization, heavy usage can exceed $300-600 per month due to premium model token consumption.

What is the biggest cost driver in OpenClaw?

API token usage accounts for 80-95% of total OpenClaw costs. Context history alone consumes 40-50% of tokens. The single most impactful optimization is switching routine tasks from premium models like Claude Opus ($15/MTok input) to efficient models like Claude Haiku ($0.25/MTok input) -- a 60x price difference per token.

Can I run OpenClaw for free?

You can run OpenClaw with free-tier API credits from providers like Google (Gemini) or local models via Ollama. However, agent quality drops significantly with smaller models. A practical minimum budget is $6-13 per month using cost-optimized model routing with a mix of cheap and premium models.

What is model routing in OpenClaw?

Model routing assigns different AI models to different task types based on complexity. Simple tasks like file reads and status checks use cheap models (Haiku, GPT-4o-mini). Complex reasoning, coding, and content creation use premium models (Opus, Sonnet). This typically reduces costs by 60-80% with minimal quality loss.

How do I reduce OpenClaw heartbeat costs?

Heartbeats are periodic check-ins that consume tokens every cycle. Three optimizations: switch heartbeat model from Opus to a cheaper model like Sonnet or DeepSeek V3 (saves $20-40/month), increase the interval from 15 to 30 or 60 minutes (cuts costs by 50-75%), and set active hours so heartbeats only run when you are awake (saves 33% if set to 16 hours).

Does prompt caching work with OpenClaw?

Yes. Anthropic's prompt caching stores your system prompt and conversation context, charging only 10% for cached tokens on subsequent requests. For OpenClaw agents with large SOUL.md or AGENTS.md files, this alone can reduce per-message costs by 50-80%. Enable it by setting cache-ttl context pruning mode in openclaw.json.

Bottom Line

OpenClaw cost optimization is not about sacrificing quality. It is about matching model capability to task complexity. A $0.25/MTok model reads files just as well as a $15/MTok model.

The optimizations in this guide take about an hour to implement and typically reduce monthly costs by 85-95%. For a tool that runs 24/7, that is the difference between $300/month and $30/month -- or $3,240 saved per year.

Start with the 5-minute quick wins. Model routing alone will cut your bill in half. Every optimization compounds.

The complete playbook

Get every config, every optimization, and the reasoning behind each decision in the full KaiShips Guide to OpenClaw.

This post covers cost optimization. The guide covers memory architecture, skill development, multi-agent routing, cron automation, and 4 more chapters of production-tested configs.

Get the KaiShips Guide to OpenClaw -- $29