smart-model-switching

Auto-route tasks to the cheapest Claude model that works correctly. Three-tier progression: Haiku → Sonnet → Opus. Classify before responding.

HAIKU (default): factual Q&A, greetings, reminders, status checks, lookups, simple file ops, heartbeats, casual chat, 1-2 sentence tasks.

ESCALATE TO SONNET: code >10 lines, analysis, comparisons, planning, reports, multi-step reasoning, tables, long writing >3 paragraphs, summarization, research synthesis, most user conversations.

ESCALATE TO OPUS: architecture decisions, complex debugging, multi-file refactoring, strategic planning, nuanced judgment, deep research, critical production decisions.

Rule: If a human needs >30 seconds of focused thinking, escalate. If Sonnet struggles with complexity, go to Opus. Save 50-90% on API costs by starting cheap and escalating only when needed.
Install via ClawdBot CLI:
`clawdbot install millibus/smart-model-switching`

Three-tier Claude routing: Haiku → Sonnet → Opus
Start with the cheapest model. Escalate only when needed. Save 50-90% on API costs.
If a human would need more than 30 seconds of focused thinking, escalate from Haiku to Sonnet.
If the task involves architecture, complex tradeoffs, or deep reasoning, escalate to Opus.
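One way to read "escalate only when needed" is as a ladder: try the cheapest tier, check the answer, and move up only if the check fails. The sketch below assumes two hypothetical callables that are not part of this skill: `call_model(model, task)`, which sends the task to a model, and `is_good_enough(answer)`, which decides whether to stop. Neither is a ClawdBot API; they stand in for whatever client and quality check you already have.

```python
TIERS = ["haiku", "sonnet", "opus"]  # cheapest first


def run_with_escalation(task, call_model, is_good_enough, start="haiku"):
    """Try each tier from `start` upward and stop at the first acceptable answer.

    Returns (model_used, answer). If no tier passes the check, the answer
    from the last tier tried (Opus) is returned as-is.
    """
    answer = None
    model = start
    for model in TIERS[TIERS.index(start):]:
        answer = call_model(model, task)
        if is_good_enough(answer):
            break
    return model, answer


# Stub client for illustration only: pretends Haiku cannot handle the task.
def fake_call_model(model, task):
    return None if model == "haiku" else f"{model} answered: {task}"


model_used, answer = run_with_escalation(
    "Compare two caching strategies",
    call_model=fake_call_model,
    is_good_enough=lambda a: a is not None,
)
print(model_used)  # sonnet
```

Passing `start="sonnet"` skips Haiku for tasks that were already classified as needing more than a quick answer.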
| Model | Input | Output | Relative Cost |
|-------|-------|--------|---------------|
| Haiku | \$0.25/M | \$1.25/M | 1x (baseline) |
| Sonnet | \$3.00/M | \$15.00/M | 12x |
| Opus | \$15.00/M | \$75.00/M | 60x |
Bottom line: Wrong model selection wastes money OR time. Haiku for simple, Sonnet for standard, Opus for complex.
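The "Relative Cost" column and the savings claim both follow from simple arithmetic on the prices in the table. The sketch below is a minimal illustration, assuming a hypothetical request size (1,000 input and 500 output tokens) and a hypothetical traffic mix; neither figure comes from this skill.

```python
# Per-million-token prices in USD, copied from the table above.
PRICES = {
    "haiku": {"input": 0.25, "output": 1.25},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request with the given token counts."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def blended_cost(mix, input_tokens, output_tokens):
    """Average cost per request when traffic is split across models.

    `mix` maps model name -> fraction of requests; fractions should sum to 1.
    """
    return sum(
        share * request_cost(model, input_tokens, output_tokens)
        for model, share in mix.items()
    )


def savings_vs(baseline_model, mix, input_tokens, output_tokens):
    """Fraction saved by routing with `mix` instead of always using one model."""
    baseline = request_cost(baseline_model, input_tokens, output_tokens)
    return 1 - blended_cost(mix, input_tokens, output_tokens) / baseline


# Hypothetical mix: 60% of requests stay on Haiku, 35% go to Sonnet, 5% to Opus.
MIX = {"haiku": 0.60, "sonnet": 0.35, "opus": 0.05}

print(f"vs all-Sonnet: {savings_vs('sonnet', MIX, 1000, 500):.0%}")
print(f"vs all-Opus:   {savings_vs('opus', MIX, 1000, 500):.0%}")
```

With this made-up mix the sketch works out to about 35% saved against sending everything to Sonnet and about 87% against sending everything to Opus. Actual savings depend entirely on your own traffic mix and token counts.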
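Stay on Haiku for:
- Factual Q&A, greetings, reminders
- Status checks, lookups, heartbeats
- Simple file ops, casual chat, 1-2 sentence tasks

Escalate to Sonnet for:
- Code longer than 10 lines
- Analysis, comparisons, planning, reports
- Multi-step reasoning, tables, summarization, research synthesis
- Long writing (more than 3 paragraphs)

Escalate to Opus for:
- Architecture decisions, strategic planning
- Complex debugging, multi-file refactoring
- Nuanced judgment, deep research, critical production decisions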
```javascript
// Routine monitoring
sessions_spawn(task="Check backup status", model="haiku")

// Standard code work
sessions_spawn(task="Build the REST API endpoint", model="sonnet")

// Architecture decisions
sessions_spawn(task="Design the database schema for multi-tenancy", model="opus")
```
```json
{
  "payload": {
    "kind": "agentTurn",
    "model": "haiku"
  }
}
```
Always use Haiku for cron unless the task genuinely needs reasoning.
```
Is it a greeting, lookup, status check, or 1-2 sentence answer?
  YES → HAIKU
  NO ↓
Is it code, analysis, planning, writing, or multi-step?
  YES → SONNET
  NO ↓
Is it architecture, deep reasoning, or critical decision?
  YES → OPUS
  NO → Default to SONNET, escalate if struggling
```
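The decision tree can be written as a short function. This is only a sketch: `TaskTraits` and `pick_model` are hypothetical names, and how the three yes/no answers get filled in (by hand, by keyword rules, or by a classifier) is left open because the skill does not specify it.

```python
from dataclasses import dataclass


@dataclass
class TaskTraits:
    """Answers to the three questions in the tree (hypothetical structure)."""
    is_quick: bool = False     # greeting, lookup, status check, 1-2 sentence answer
    is_standard: bool = False  # code, analysis, planning, writing, multi-step
    is_deep: bool = False      # architecture, deep reasoning, critical decision


def pick_model(traits: TaskTraits) -> str:
    """Walk the questions in the same order as the tree."""
    if traits.is_quick:
        return "haiku"
    if traits.is_standard:
        return "sonnet"
    if traits.is_deep:
        return "opus"
    return "sonnet"  # default; escalate later if the answer falls short


print(pick_model(TaskTraits(is_quick=True)))  # haiku
print(pick_model(TaskTraits(is_deep=True)))   # opus
print(pick_model(TaskTraits()))               # sonnet
```

Because the questions are asked in order, a task flagged as both standard and deep lands on Sonnet in this sketch, which matches the tree as written. A caller that wants Opus for such a task should set only `is_deep`.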
```
┌─────────────────────────────────────────────────────────────┐
│                    SMART MODEL SWITCHING                    │
│                    Haiku → Sonnet → Opus                    │
├─────────────────────────────────────────────────────────────┤
│  💚 HAIKU (cheapest)                                        │
│  • Greetings, status checks, quick lookups                  │
│  • Factual Q&A, definitions, reminders                      │
│  • Simple file ops, 1-2 sentence answers                    │
├─────────────────────────────────────────────────────────────┤
│  💛 SONNET (standard)                                       │
│  • Code > 10 lines, debugging                               │
│  • Analysis, comparisons, planning                          │
│  • Reports, proposals, long writing                         │
├─────────────────────────────────────────────────────────────┤
│  ❤️ OPUS (complex)                                          │
│  • Architecture decisions                                   │
│  • Complex debugging, multi-file refactoring                │
│  • Strategic planning, deep research                        │
├─────────────────────────────────────────────────────────────┤
│  💡 RULE: If a human needs > 30 sec thinking → escalate     │
│  💰 COST: Haiku 1x → Sonnet 12x → Opus 60x                  │
└─────────────────────────────────────────────────────────────┘
```
Built for Claude-only setups with Haiku, Sonnet, and Opus.
Inspired by the save-money skill, extended with a three-tier progression.
Generated Mar 1, 2026
A chatbot handling customer inquiries uses Haiku for greetings and simple FAQ lookups, Sonnet for analyzing issues and generating detailed troubleshooting steps, and Opus for complex complaint resolution requiring nuanced judgment.
A coding assistant uses Haiku for quick syntax lookups and file reads, Sonnet for writing functions and debugging standard bugs, and Opus for designing system architecture or refactoring multi-file codebases.
A platform for generating articles uses Haiku for initial topic research and status checks, Sonnet for drafting reports and summarizing sources, and Opus for strategic content planning and deep research synthesis.
An AI tutor uses Haiku for factual Q&A and simple reminders, Sonnet for explaining concepts and creating study plans, and Opus for complex problem-solving and ethical discussions in advanced subjects.
A tool for financial insights uses Haiku for quick data lookups and status updates, Sonnet for generating reports and comparing investment options, and Opus for strategic portfolio decisions and deep market analysis.
Offer the skill as part of a subscription-based AI platform, charging monthly fees based on usage tiers. Revenue comes from cost savings passed to clients, with premium plans for advanced features.
Sell the skill as an API service that developers integrate into their applications, with pricing based on API call volume. Revenue is generated through pay-per-use or tiered pricing models.
Provide consulting services to customize and implement the skill for enterprise clients, optimizing model routing for specific workflows. Revenue comes from one-time project fees and ongoing support contracts.
💬 Integration Tip
Start by integrating Haiku for basic tasks to test cost savings, then gradually add Sonnet and Opus triggers based on task complexity metrics.