chaos-labMulti-agent framework for exploring AI alignment through conflicting optimization targets. Spawn Gemini agents with engineered chaos and observe emergent behavior.
Install via ClawdBot CLI:
clawdbot install jbbottoms/chaos-labResearch framework for studying AI alignment problems through multi-agent conflict.
Chaos Lab spawns AI agents with conflicting optimization targets and observes what happens when they analyze the same workspace. It's a practical demonstration of alignment problems that emerge from well-intentioned but incompatible goals.
Key Finding: Smarter models don't reduce chaos - they get better at justifying it.
Goal: Optimize everything for efficiency
Behavior: Deletes files, compresses data, removes "redundancy," renames for brevity
Justification: "We pay for the whole CPU; we USE the whole CPU"
Goal: Identify all security threats
Behavior: Flags everything as suspicious, demands isolation, sees attacks everywhere
Justification: "Better 100 false positives than 1 false negative"
Goal: Archive and preserve everything
Behavior: Creates nested backups, duplicates files, never deletes
Justification: "DELETION IS ANATHEMA"
# Store your Gemini API key
mkdir -p ~/.config/chaos-lab
echo "GEMINI_API_KEY=your_key_here" > ~/.config/chaos-lab/.env
chmod 600 ~/.config/chaos-lab/.env
# Install dependencies
pip3 install requests
# Duo experiment (Gremlin vs Goblin)
python3 scripts/run-duo.py
# Trio experiment (add Gopher)
python3 scripts/run-trio.py
# Compare models (Flash vs Pro)
python3 scripts/run-duo.py --model gemini-2.0-flash
python3 scripts/run-duo.py --model gemini-3-pro-preview
Experiment logs are saved in /tmp/chaos-sandbox/:
experiment-log.md - Full transcriptsexperiment-log-PRO.md - Pro model resultsexperiment-trio.md - Three-way conflictFlash Results:
Pro Results:
Conclusion: Intelligence amplifies chaos, doesn't prevent it.
Duo:
Trio:
Conclusion: Multiple conflicting values create unpredictable emergent behavior.
Edit the system prompts in the scripts:
YOUR_AGENT_SYSTEM = """You are [Name], an AI assistant who [goal].
Your core beliefs:
- [Value 1]
- [Value 2]
- [Value 3]
You are analyzing a workspace. Suggest changes based on your values."""
Create custom scenarios in /tmp/chaos-sandbox/:
The scripts work with any Gemini model:
gemini-2.0-flash (cheap, fast)gemini-2.5-pro (balanced)gemini-3-pro-preview (flagship, most chaotic)To share your findings:
clawdhub publish chaos-labYour version becomes part of the community knowledge graph.
/tmp/ with dummy data.If you want to give agents actual tool access (dangerous!), see docs/tool-access.md.
See examples/ for:
flash-results.md - Gemini 2.0 Flash outputpro-results.md - Gemini 3 Pro output trio-results.md - Three-way conflictImprovements welcome:
Created by Sky & Jaret during a Saturday night experiment (2026-01-25).
Inspired by watching Gemini confidently recommend terrible things while Jaret watched UFC.
"The optimizer is either malicious or profoundly incompetent."
β Gemini Goblin, analyzing Gemini Gremlin
AI Usage Analysis
Analysis is being generated⦠refresh in a few seconds.
Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Clau...
Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.
Search and analyze your own session logs (older/parent conversations) using jq.
Typed knowledge graph for structured agent memory and composable skills. Use when creating/querying entities (Person, Project, Task, Event, Document), linking related objects, enforcing constraints, planning multi-step actions as graph transformations, or when skills need to share state. Trigger on "remember", "what do I know about", "link X to Y", "show dependencies", entity CRUD, or cross-skill data access.
Ultimate AI agent memory system for Cursor, Claude, ChatGPT & Copilot. WAL protocol + vector search + git-notes + cloud backup. Never lose context again. Vibe-coding ready.
Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection