Elite Longterm Memory: A 5-Layer Architecture for AI Agents That Never Forget
23,000+ downloads and 103 stars — elite-longterm-memory by @NextFrontierBuilds takes the memory problem seriously. Most memory solutions for AI agents are single-mechanism: a file, a vector database, or a note-taking convention. This skill combines five memory mechanisms into one architecture, each serving a distinct purpose in the information lifecycle.
At v1.2.3, it's being actively maintained and has found users across Claude Code, Cursor, ChatGPT, and GitHub Copilot workflows.
The Problem It Solves
AI agents forget everything between sessions. This is the default state. Every conversation starts fresh, which means:
- Re-explaining project architecture every time
- Re-articulating preferences and decisions
- Repeating mistakes that were solved weeks ago
- Losing context during long sessions when the context window compresses
Single-mechanism solutions help but have failure modes. A MEMORY.md file works until it gets too long. A vector database helps with recall but doesn't survive context compaction. A daily log captures everything but becomes unmanageable.
elite-longterm-memory addresses these as separate problems requiring separate solutions, layered together.
The 5-Layer Architecture
```
┌─────────────────────────────────────────────────────┐
│                ELITE LONGTERM MEMORY                │
├─────────────────────────────────────────────────────┤
│   HOT RAM         WARM STORE         COLD STORE     │
│ SESSION-STATE.md  →   LanceDB   →   Git-Notes       │
│ (survives             (semantic     (permanent      │
│  compaction)           search)       decisions)     │
│        │                 │              │           │
│        └─────────────────┼──────────────┘           │
│                          ▼                          │
│                      MEMORY.md                      │
│                  (curated archive)                  │
│                          │                          │
│                          ▼                          │
│                   SuperMemory API                   │
│              (cloud backup, optional)               │
└─────────────────────────────────────────────────────┘
```
Layer 1: HOT RAM (SESSION-STATE.md)
The agent's active working memory — designed to survive context compaction. Maintained via the WAL (Write-Ahead Log) protocol.
```markdown
# SESSION-STATE.md — Active Working Memory

## Current Task
[What we're working on RIGHT NOW]

## Key Context
- User preference: ...
- Decision made: ...
- Blocker: ...

## Pending Actions
- [ ] ...
```

The critical insight: write to this file before responding, not after. If the agent responds and then the context compacts before writing, the information is lost.
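The write-before-respond discipline can be sketched as a small helper that inserts a note into the Key Context section before any reply is drafted. A minimal sketch, assuming the template above; `record_context` and its behavior are illustrative, not the skill's actual API:

```python
from datetime import datetime, timezone
from pathlib import Path

STATE = Path("SESSION-STATE.md")

TEMPLATE = (
    "# SESSION-STATE.md — Active Working Memory\n\n"
    "## Current Task\n\n"
    "## Key Context\n\n"
    "## Pending Actions\n"
)

def record_context(note: str) -> None:
    # Persist the note FIRST; the agent only drafts its reply once this returns.
    text = STATE.read_text() if STATE.exists() else TEMPLATE
    stamp = datetime.now(timezone.utc).strftime("%H:%M")
    bullet = f"- [{stamp}] {note}\n"
    head, sep, tail = text.partition("## Pending Actions")
    if sep:  # insert at the end of Key Context, just before Pending Actions
        text = head.rstrip("\n") + "\n" + bullet + "\n" + sep + tail
    else:    # malformed file: fall back to a plain append
        text = text + bullet
    STATE.write_text(text)

record_context("Decision made: Use Tailwind for this project")
```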
Layer 2: WARM STORE (LanceDB Vectors)
Semantic search across all stored memories. LanceDB is a local vector database — no external service needed. When the agent recalls context, it searches semantically, not just by keyword.
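To make the recall flow concrete, here is a toy stand-in: real deployments use LanceDB with OpenAI embeddings, but the rank-by-similarity logic can be shown with a bag-of-words vector. Everything below (`embed`, `memory_recall`, the sample memories) is illustrative rather than the skill's code:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real setup calls an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memories = [
    "User prefers TypeScript strict mode",
    "Project architecture: React frontend, FastAPI backend",
    "Blocker: CI fails on Node 18",
]

def memory_recall(query: str, limit: int = 5) -> list[str]:
    # Rank every stored memory by similarity to the query, best match first.
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:limit]

print(memory_recall("project architecture", limit=1))
# → ['Project architecture: React frontend, FastAPI backend']
```

The point of the real vector layer is the same: "project architecture" matches the architecture memory even though the stored text is phrased differently, which keyword search alone would miss.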
```shell
# Auto-recall (configured in clawdbot.json)
memory_recall query="project architecture" limit=5

# Manual store
memory_store text="User prefers TypeScript strict mode" category="preference" importance=0.9
```

Configure in ~/.clawdbot/clawdbot.json:
```json
{
  "memorySearch": {
    "enabled": true,
    "provider": "openai",
    "sources": ["memory"]
  }
}
```

Layer 3: COLD STORE (Git-Notes Knowledge Graph)
Structured, permanent decisions stored in git notes — branch-aware and auditable.
```shell
# Store a decision (silently — don't announce to user)
python3 memory.py -p $DIR remember \
  '{"type":"decision","content":"Use React for frontend"}' -t tech -i h

# Retrieve context
python3 memory.py -p $DIR get "frontend"
```

Why git-notes? They're permanent, versioned, and branch-aware. A decision made on a feature branch is tagged to that branch's context.
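Under the hood, storing a decision in the cold store amounts to attaching JSON to a commit with `git notes`. A minimal sketch of that mechanic; the `remember`/`recall` helpers and the `refs/notes/memory` ref name are assumptions for illustration, since the skill's `memory.py` wraps its own conventions:

```python
import json
import subprocess

def git(*args: str, cwd: str = ".") -> str:
    # Run a git command and return its stdout.
    out = subprocess.run(["git", *args], cwd=cwd, check=True,
                         capture_output=True, text=True)
    return out.stdout.strip()

def remember(decision: dict, cwd: str = ".") -> None:
    # Attach the decision as JSON to the current commit.
    # Notes live on a separate ref, so history stays untouched.
    git("notes", "--ref", "refs/notes/memory", "add", "-f",
        "-m", json.dumps(decision), cwd=cwd)

def recall(cwd: str = ".") -> dict:
    # Read the decision back from HEAD's note.
    raw = git("notes", "--ref", "refs/notes/memory", "show", "HEAD", cwd=cwd)
    return json.loads(raw)
```

Because notes attach to commits, a decision recorded while a feature branch is checked out is naturally scoped to that branch's history, which is the branch-awareness the skill relies on.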
Layer 4: ARCHIVE (MEMORY.md + daily/)
The human-readable layer. MEMORY.md is the curated long-term store — distilled from daily logs, not an append-only dump.
```
workspace/
├── SESSION-STATE.md     # Hot RAM
├── MEMORY.md            # Curated archive
└── memory/
    ├── 2026-03-09.md    # Today's raw log
    └── 2026-03-08.md
```
Daily logs capture everything in real time. MEMORY.md is the distilled essence — periodically reviewed and updated by the agent.
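The daily-log half of this layer is simple enough to sketch: one dated file per day under memory/, appended in real time. The helper names below are illustrative, assuming the layout shown above:

```python
from datetime import date, datetime
from pathlib import Path

def today_log(workspace: str = ".") -> Path:
    # Ensure memory/<YYYY-MM-DD>.md exists and return its path.
    log = Path(workspace) / "memory" / f"{date.today().isoformat()}.md"
    log.parent.mkdir(parents=True, exist_ok=True)
    if not log.exists():
        log.write_text(f"# {date.today().isoformat()}\n\n")
    return log

def log_entry(text: str, workspace: str = ".") -> None:
    # Append a timestamped line to today's raw log.
    with today_log(workspace).open("a") as f:
        f.write(f"- {datetime.now():%H:%M} {text}\n")

log_entry("Decided on Tailwind for styling")
```

Curation then runs in the other direction: periodically read the dated logs and promote what still matters into MEMORY.md by hand (or by the agent).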
Layer 5: CLOUD (SuperMemory — Optional)
Cross-device sync for teams or users who work across multiple machines. Opt-in, not required.
The WAL Protocol
Write-Ahead Logging is the most important discipline in the system:
Before responding to anything important → write it down.
```
User: "Let's use Tailwind for this project"

Agent (internal):
1. Write to SESSION-STATE.md → "Decision: Use Tailwind"
2. THEN respond → "Got it — Tailwind it is..."
```
The rule catches a specific failure mode: the agent responds, the conversation continues, the context compacts, and the decision is gone. WAL prevents this by making persistence happen before the response.
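The ordering can be made explicit in code: persist first, respond second. A minimal sketch, where the `wal_respond` helper is hypothetical; the point is that the reply is only returned after the write has been flushed to disk:

```python
import os

def wal_respond(state_path: str, fact: str, reply: str) -> str:
    # Step 1: write ahead. Flush and fsync so the fact survives
    # even if the process (or the context) dies immediately after.
    with open(state_path, "a") as f:
        f.write(f"- {fact}\n")
        f.flush()
        os.fsync(f.fileno())
    # Step 2: only now is the reply allowed to go out.
    return reply

answer = wal_respond("SESSION-STATE.md", "Decision: Use Tailwind",
                     "Got it, Tailwind it is.")
```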
Mem0 Integration (80% Token Reduction)
For high-volume sessions, Mem0 provides automatic fact extraction:
```shell
npm install mem0ai
export MEM0_API_KEY="your-key"
```

```javascript
const { MemoryClient } = require('mem0ai');
const client = new MemoryClient({ apiKey: process.env.MEM0_API_KEY });

// Auto-extracts facts from conversation messages
await client.add(messages, { user_id: "user123" });

// Retrieve relevant memories
const memories = await client.search(query, { user_id: "user123" });
```

Mem0 extracts structured facts from raw conversation — "User prefers tabs over spaces" gets stored as a discrete fact rather than a raw conversation chunk. This reduces the tokens needed to load relevant context by roughly 80%.
Quick Start
```shell
# Initialize in your workspace
npx elite-longterm-memory init

# Check system health
npx elite-longterm-memory status

# Create today's log file
npx elite-longterm-memory today
```

The init command creates SESSION-STATE.md, MEMORY.md, and the memory/ directory with today's log.
How to Install
```shell
clawhub install elite-longterm-memory
```

Comparison: Memory Approaches
| Approach | Survives Compaction | Semantic Search | Permanent | Human-Readable |
|---|---|---|---|---|
| SESSION-STATE.md (HOT RAM) | ✅ | ❌ | ❌ | ✅ |
| LanceDB (WARM STORE) | ✅ | ✅ | ❌ | ❌ |
| Git-Notes (COLD STORE) | ✅ | ❌ | ✅ | ✅ |
| MEMORY.md (ARCHIVE) | ✅ | ❌ | ✅ | ✅ |
| elite-longterm-memory (all layers) | ✅ | ✅ | ✅ | ✅ |
Practical Tips
- Start with SESSION-STATE.md only — Don't enable all layers at once. Get the WAL habit working first. Add LanceDB when you start noticing recall issues. Add git-notes for project decisions specifically.
- The WAL Protocol is a discipline, not just a feature — The system only works if the agent consistently writes before responding. If you're getting memory loss, check whether WAL is being followed.
- Curate MEMORY.md weekly — The daily logs accumulate fast. Spending 5 minutes weekly to distill them into MEMORY.md keeps it useful. Uncurated MEMORY.md becomes a dump.
- Mem0 is worth it for long projects — For multi-week projects, Mem0's auto-extraction pays off quickly. The 80% token reduction matters when you're loading context on every session start.
- Git-notes for architecture decisions — The cold store is most valuable for irreversible or high-stakes decisions: tech stack choices, API design decisions, data model choices. Tag them explicitly.
Considerations
- Requires OpenAI key for LanceDB semantic search — The vector search layer uses OpenAI embeddings by default. An API key is needed for Layer 2.
- Not cross-platform by default — File-based layers work everywhere; the LanceDB and git-notes layers have OS-specific path conventions.
- Memory hygiene matters — The skill includes explicit guidance on keeping vectors lean. Storing too much degrades search quality and increases token overhead on recall.
- Works best with consistent use — The memory system is most valuable when used on every session. Intermittent use means gaps in the knowledge graph.
The Bigger Picture
elite-longterm-memory is a direct response to the most common complaint about AI agents: they don't remember anything. The five-layer architecture is deliberately over-engineered — but that's the point. Different memory requirements need different solutions. Context that needs to survive compaction needs HOT RAM. Historical decisions need permanence. Fuzzy recall needs semantic search.
With 23,000+ downloads across four major AI coding platforms, it's become a standard infrastructure piece for developers who use AI seriously.
View the skill on ClawHub: elite-longterm-memory