Self-Improving Agent: The Claude Skill That Learns From Its Own Mistakes
With over 32,000 downloads and 137 stars, the self-improving skill by @ivangdavila is one of the most popular agent-enhancement tools on ClawHub. It gives Claude a persistent, tiered memory system that automatically learns from user corrections, evaluates its own work, and permanently improves over time.
The Problem It Solves
Every developer who has worked with AI assistants for more than a week knows the frustration: you correct the same mistake on Monday, and by Friday the assistant does it again. You prefer 2-space indentation — you've said so three times. You want bullet points, not paragraphs. You always use Tailwind. And yet, every new session starts from zero.
The underlying issue is that most AI assistants treat memory as a single flat file, or worse, don't persist it at all. They have no mechanism to evaluate their own outputs, no structure for distinguishing what's worth remembering long-term versus what's session-specific, and no way to automatically promote a repeated correction into a permanent rule.
The self-improving skill fixes this with a three-tier memory architecture and a disciplined learning protocol.
Core Concept: The HOT / WARM / COLD Memory System
The skill stores everything in a local directory at ~/self-improving/ with three tiers:
~/self-improving/
├── memory.md          ← HOT: always loaded, ≤100 lines
├── corrections.md     ← running correction log
├── index.md           ← memory index across all tiers
├── projects/          ← WARM: project-specific patterns
│   ├── my-app.md
│   └── side-project.md
├── domains/           ← WARM: domain patterns (code, writing, comms)
│   ├── code.md
│   └── writing.md
└── archive/           ← COLD: inactive patterns, preserved
    └── 2025.md
- HOT: The memory.md file is loaded every session. Kept ≤100 lines. Contains only confirmed, high-frequency preferences.
- WARM: Domain or project files, loaded on demand when context is detected. A file for my-app loads when you're working in that project.
- COLD: Archived patterns that haven't been used in 90+ days. Preserved but not loaded automatically.
Think of it like RAM vs SSD vs tape storage — the most-used patterns stay in the fastest tier, unused ones migrate down, and nothing is lost.
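The loading rule for the three tiers can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the skill's own code: the function name `files_to_load` and its `project` and `domain` parameters are hypothetical, and how the skill actually detects context is not covered here.

```python
from pathlib import Path
from typing import List, Optional


def files_to_load(
    root: Path, project: Optional[str] = None, domain: Optional[str] = None
) -> List[Path]:
    """Return the memory files to load for a session, HOT first."""
    selected = [root / "memory.md"]  # HOT: always loaded
    if project:
        selected.append(root / "projects" / f"{project}.md")  # WARM: on demand
    if domain:
        selected.append(root / "domains" / f"{domain}.md")  # WARM: on demand
    # COLD files under archive/ are never selected automatically.
    return [path for path in selected if path.is_file()]
```

Calling `files_to_load(Path.home() / "self-improving", project="my-app")` would return memory.md plus projects/my-app.md if both exist, and would never touch archive/.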
Deep Dive
Learning From Corrections
The skill defines precise triggers for what counts as a learnable correction:
| Trigger | Confidence | Action |
|---|---|---|
| "No, do X instead" | High | Log correction immediately |
| "I told you before..." | High | Bump priority, flag as repeated |
| "Always/Never do X" | Confirmed | Promote to preference |
| Same correction × 3 | Pending | Ask to make permanent |
| User edits your output | Medium | Log as tentative pattern |
Crucially, silence is not confirmation. The skill only learns from explicit corrections or deliberate user affirmations, which prevents garbage from accumulating.
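To make the trigger table concrete, here is a minimal Python sketch that maps a user message to a confidence level and action. The regular expressions are illustrative guesses at how such phrases might be detected; the skill's real matching logic is not documented here, and `classify` is a hypothetical name. The "same correction × 3" and "user edits your output" rows need state or a diff, so they are left out.

```python
import re
from typing import Optional, Tuple

# (pattern, confidence, action) rows mirroring the trigger table.
# The patterns are assumptions for illustration, not the skill's own rules.
TRIGGERS = [
    (re.compile(r"\bI told you\b", re.IGNORECASE), "High", "Bump priority, flag as repeated"),
    (re.compile(r"^\s*(always|never)\b", re.IGNORECASE), "Confirmed", "Promote to preference"),
    (re.compile(r"^\s*no\b.*\binstead\b", re.IGNORECASE), "High", "Log correction immediately"),
]


def classify(message: str) -> Optional[Tuple[str, str]]:
    """Return (confidence, action) for an explicit correction, else None."""
    for pattern, confidence, action in TRIGGERS:
        if pattern.search(message):
            return confidence, action
    return None  # silence or unrelated text is never treated as confirmation
```

Returning `None` for anything that is not an explicit signal is the point of the design: a message like "Looks good, thanks" teaches the skill nothing.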
The Confirmation Flow
When the same correction appears three times, the skill asks before making it permanent:
Agent: "I've noticed you prefer X over Y (corrected 3 times).
Should I always do this?
- Yes, always
- Only in [context]
- No, case by case"
User: "Yes, always"
Agent: → Promoted to Confirmed Preferences in memory.md
→ Correction counter reset
→ Cites source on future use: "Using X (from memory.md:15)"
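The flow above amounts to a counter plus a question. The following Python sketch shows one way it could work, under stated assumptions: `CorrectionTracker` is a hypothetical in-memory stand-in for corrections.md and memory.md, the choice keys `always`, `context`, and `case_by_case` are invented for the example, and resetting the counter for the scoped case is a guess (the article only mentions the reset on promotion).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

THRESHOLD = 3  # the article's three-correction rule


@dataclass
class CorrectionTracker:
    """In-memory stand-in for corrections.md and memory.md (hypothetical)."""

    counts: Dict[str, int] = field(default_factory=dict)
    confirmed: List[str] = field(default_factory=list)
    scoped: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, rule: str) -> Optional[str]:
        """Count one correction; return a confirmation question at the threshold."""
        self.counts[rule] = self.counts.get(rule, 0) + 1
        if self.counts[rule] < THRESHOLD:
            return None
        return (
            f"I've noticed you prefer {rule} (corrected {self.counts[rule]} times). "
            "Should I always do this?"
        )

    def answer(self, rule: str, choice: str, context: str = "") -> str:
        """Apply the user's reply: 'always', 'context', or 'case_by_case'."""
        if choice == "always":
            self.confirmed.append(rule)
            self.counts[rule] = 0  # counter reset on promotion, as in the flow above
            return "promoted"
        if choice == "context":
            self.scoped.setdefault(context, []).append(rule)
            self.counts[rule] = 0  # assumption: scoped rules also reset the counter
            return f"promoted for {context}"
        return "not promoted"  # "No, case by case": nothing is stored
```

The first two calls to `record` return `None`; only the third produces the question, and nothing becomes permanent until `answer` is called with an explicit choice.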
Pattern Evolution Stages
Every learned item goes through five defined stages:
- Tentative — single correction, watching for repetition
- Emerging — 2 corrections, likely pattern
- Pending — 3 corrections, confirmation requested
- Confirmed — user approved, permanent
- Archived — unused 90+ days, moved to COLD
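The five stages can be expressed as a small function of the correction count, the confirmation flag, and the last-used date. This is a sketch, not the skill's implementation; `stage_for` and its parameters are hypothetical. One assumption is worth flagging: the article does not say whether confirmed items can be archived, so this sketch follows the memory template's comment that confirmed preferences never decay.

```python
from datetime import date, timedelta
from enum import Enum


class Stage(Enum):
    TENTATIVE = "Tentative"
    EMERGING = "Emerging"
    PENDING = "Pending"
    CONFIRMED = "Confirmed"
    ARCHIVED = "Archived"


ARCHIVE_AFTER = timedelta(days=90)  # "unused 90+ days"


def stage_for(corrections: int, confirmed: bool, last_used: date, today: date) -> Stage:
    """Derive the stage of a learned item from its history."""
    if confirmed:
        return Stage.CONFIRMED  # assumption: confirmed items never decay
    if today - last_used >= ARCHIVE_AFTER:
        return Stage.ARCHIVED
    if corrections >= 3:
        return Stage.PENDING
    if corrections == 2:
        return Stage.EMERGING
    return Stage.TENTATIVE
```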
Self-Reflection After Work
Beyond corrections, the skill supports proactive self-evaluation. After completing significant work, Claude reviews its own output and logs what it could have done better:
## 2026-02-25 — Flutter UI Build
**What I did:** Built a settings screen with toggle switches
**Outcome:** User said "spacing looks off"
**Reflection:** Focused on functionality, didn't visually check the result
**Lesson:** Always evaluate visual balance before presenting
**Status:** ✅ promoted to domains/flutter.md
These reflections can be promoted to permanent patterns in the relevant domain namespace.
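A reflection entry has a fixed shape, so it is easy to model as a record. The sketch below renders an entry in the same field order as the sample; the `Reflection` class, its field names, and the default status are hypothetical and chosen only to mirror the sample's labels.

```python
from dataclasses import dataclass
from datetime import date

SEPARATOR = "\u2014"  # the dash used in the sample entry's heading


@dataclass
class Reflection:
    day: date
    title: str
    what_i_did: str
    outcome: str
    reflection: str
    lesson: str
    status: str = "pending"  # hypothetical default; the sample shows a promoted entry

    def to_markdown(self) -> str:
        """Render the entry in the same field order as the sample."""
        return "\n".join(
            [
                f"## {self.day.isoformat()} {SEPARATOR} {self.title}",
                f"**What I did:** {self.what_i_did}",
                f"**Outcome:** {self.outcome}",
                f"**Reflection:** {self.reflection}",
                f"**Lesson:** {self.lesson}",
                f"**Status:** {self.status}",
            ]
        )
```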
Security Boundaries Built In
The skill explicitly defines what it never stores:
- Credentials, API keys, passwords
- Financial data (card numbers, bank accounts)
- Medical information
- Biometric or behavioral fingerprints
- Third-party personal information
There's also a kill switch: saying "forget everything" exports current memory to a file (for review), then wipes all learned data.
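Both ideas, refusing to store sensitive text and exporting before wiping, can be sketched in Python. Everything here is an assumption for illustration: the regular expressions are rough heuristics that will miss real secrets and flag harmless text, `is_storable` and `forget_everything` are hypothetical names, and the export location must sit outside the memory directory so the archive is not deleted along with the data.

```python
import re
import shutil
from pathlib import Path

# Rough heuristics only; real detection of sensitive data is much harder.
SENSITIVE = [
    re.compile(r"\b(api[_ -]?key|password|passwd|secret|token)\b", re.IGNORECASE),
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # digit runs shaped like card numbers
]


def is_storable(text: str) -> bool:
    """Return False when the text looks like it contains sensitive data."""
    return not any(pattern.search(text) for pattern in SENSITIVE)


def forget_everything(root: Path, export_to: Path) -> Path:
    """Export memory to a zip for review, then wipe everything under root.

    export_to is the archive path without extension and must be outside root.
    """
    archive = shutil.make_archive(str(export_to), "zip", root_dir=root)
    for child in root.iterdir():
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()
    return Path(archive)
```

Exporting first means a mistaken "forget everything" can still be reviewed or restored by hand from the archive.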
User Commands
"What do you know about X?" → Search all tiers, return matches with sources
"Show my memory" → Display memory.md
"Show [project] patterns" → Load and display specific namespace
"Forget X" → Remove from all tiers, confirm deletion
"Forget everything" → Full wipe with export option
"Memory status" → Show tier sizes, last compaction, health
"Export memory" → Generate downloadable archive
Self-Improving vs. Other Memory Approaches
| Feature | self-improving | Plain CLAUDE.md | MCP memory tools |
|---|---|---|---|
| Learns automatically | ✅ | ❌ (manual) | ⚠️ varies |
| Three-tier (HOT/WARM/COLD) | ✅ | ❌ | ❌ |
| Correction tracking | ✅ | ❌ | ❌ |
| Self-reflection log | ✅ | ❌ | ❌ |
| Project namespacing | ✅ | ⚠️ manual | ⚠️ varies |
| Security boundaries | ✅ explicit | ❌ | ⚠️ varies |
| Audit trail | ✅ | ❌ | ⚠️ varies |
| No external server needed | ✅ | ✅ | ❌ |
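The audit-trail row, together with the "What do you know about X?" command listed earlier, implies a search that reports where each match lives. Because the memory is plain Markdown on disk, that can be approximated with a short Python function. This is a sketch under that assumption; `search_memory` and its output format are hypothetical, loosely modeled on the "memory.md:15" citation style shown in the confirmation flow.

```python
from pathlib import Path
from typing import List


def search_memory(root: Path, query: str) -> List[str]:
    """Return 'relative/path.md:line: text' for every line mentioning query."""
    needle = query.lower()
    hits = []
    for path in sorted(root.rglob("*.md")):  # covers HOT, WARM, and COLD files
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                source = path.relative_to(root).as_posix()
                hits.append(f"{source}:{number}: {line.strip()}")
    return hits
```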
How to Install
clawdbot install self-improving
Then create the memory directory structure:
mkdir -p ~/self-improving/{projects,domains,archive}
touch ~/self-improving/{memory.md,index.md,corrections.md}
Initialize memory.md with the template from skillAssets/memory-template.md:
# Self-Improving Memory
## Confirmed Preferences
<!-- Patterns confirmed by user, never decay -->
## Active Patterns
<!-- Patterns observed 3+ times, subject to decay -->
## Recent (last 7 days)
<!-- New corrections pending confirmation -->
Practical Tips
- Use it with a daily workflow. The skill's value compounds over time: the more you correct it, the better it gets. Expect the first week to feel much like working without it, and the first month to feel significantly different.
- Be explicit with corrections. Say "always use 2-space indentation", not just "indent this." The skill learns from deliberate signals, not implied ones.
- Add a project namespace early. When you start a new project, tell Claude: "For this project, always use Tailwind." The skill will create projects/[name].md and namespace the rule correctly.
- Run "memory status" weekly. This shows you what's been learned, what's stale, and whether anything needs promotion or cleanup.
- Use "forget X" liberally. When a preference changes, explicitly remove the old one. The skill keeps an archive, so nothing is permanently lost; it just won't apply anymore.
- Trust the three-correction threshold. The skill won't promote a pattern permanently until you've corrected it three times. This prevents over-fitting to one-off situations.
Considerations
- Token overhead: Loading memory.md into every session adds up to ~100 lines of context. For users with very long sessions or tight context budgets, this is worth monitoring.
- Setup effort: The first-time directory setup is manual. It's not complex, but it requires following the initialization steps carefully.
- No cloud sync: Memory lives in ~/self-improving/ on your local machine. If you work across multiple devices, you'll need to sync this directory yourself.
- Learning pace: The three-correction threshold is conservative by design. If you want something learned immediately, you need to explicitly confirm it rather than hoping repetition alone will suffice.
- Best for long-term users: The skill delivers the most value to people using Claude daily over weeks and months. Short-term or occasional users may find the overhead outweighs the benefit.
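The token-overhead point is easy to monitor yourself, since each tier is just a set of Markdown files. The Python sketch below counts lines per tier and flags a memory.md that has grown past the 100-line cap. It is not the skill's "memory status" command; `memory_status` and the keys it returns are hypothetical, and it assumes the directory layout shown earlier already exists.

```python
from pathlib import Path
from typing import Dict, Iterable, Union

HOT_LINE_LIMIT = 100  # the article's stated cap for memory.md


def count_lines(paths: Iterable[Path]) -> int:
    """Total number of lines across the given files."""
    return sum(len(p.read_text(encoding="utf-8").splitlines()) for p in paths)


def memory_status(root: Path) -> Dict[str, Union[int, bool]]:
    """Report line counts per tier and whether HOT exceeds its cap."""
    hot = count_lines([root / "memory.md"])
    warm_files = list((root / "projects").glob("*.md")) + list((root / "domains").glob("*.md"))
    warm = count_lines(warm_files)
    cold = count_lines((root / "archive").glob("*.md"))
    return {
        "hot_lines": hot,
        "warm_lines": warm,
        "cold_lines": cold,
        "hot_over_limit": hot > HOT_LINE_LIMIT,
    }
```

Line counts are only a proxy for token cost, but they are enough to notice when the always-loaded tier is drifting past its budget.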
The Bigger Picture
The self-improving skill represents a different philosophy for AI assistance: the assistant should adapt to you, not the other way around. Most AI tools expect you to re-explain your preferences every session, write elaborate system prompts, or manually curate memory files. This skill inverts that relationship — it observes, tracks, confirms, and learns, with you staying in control of what it retains.
With 32,000+ downloads, it's clearly resonating with users who want their AI tools to behave more like a colleague who actually remembers — and one who knows when to ask before assuming something is permanent.
View the skill on ClawHub: self-improving