checkmate

Enforces task completion: turns your goal into pass/fail criteria, runs a worker, judges the output, feeds back what's missing, and loops until every criteri...
Install via ClawdBot CLI:
clawdbot install InsipidPoint/checkmate

A deterministic Python loop (scripts/run.py) calls an LLM for worker and judge roles.
Nothing leaves until it passes, and you stay in control at every checkpoint.

The OpenClaw CLI (openclaw) must be available in PATH. Used for:
- openclaw gateway call sessions.list: resolve session UUID for turn injection
- openclaw agent --session-id: inject checkpoint messages into the live session
- openclaw message send: fallback channel delivery (e.g. Telegram, Signal)

run.py is pure stdlib; no pip packages required.

⚠️ This is a high-privilege skill. Read before using in batch/automated mode.
Spawned workers and judges inherit full host-agent runtime, including:
- exec (arbitrary shell commands)
- web_search, web_fetch
- sessions_spawn (workers can spawn further sub-agents)

This means the task description you provide directly controls what the worker does: treat it like code you're about to run, not a message you're about to send.
Batch mode (--no-interactive) removes all human gates. In interactive mode (default), you approve criteria and each checkpoint before the loop continues. In batch mode, criteria are auto-approved and the loop runs to completion autonomously; only use this for tasks and environments you fully trust.
User-input bridging writes arbitrary content to disk. When you reply to a checkpoint, the main agent writes your reply verbatim to user-input.md in the workspace. The orchestrator reads it and acts on it. Don't relay untrusted third-party content as checkpoint replies.
Use checkmate when correctness matters more than speed: when "good enough on the first try" isn't acceptable.
Good fits:
Trigger phrases (say any of these):
- checkmate: TASK
- keep iterating until it passes
- don't stop until done
- until it passes
- quality loop: TASK
- iterate until satisfied
- judge and retry
- keep going until done

scripts/run.py (deterministic Python while loop: the orchestrator)
  ├─ Intake loop [up to max_intake_iter, default 5]:
  │    ├─ Draft criteria (intake prompt + task + refinement feedback)
  │    ├─ ⏸ USER REVIEW: show draft → wait for approval or feedback
  │    │      approved? → lock criteria.md
  │    │      feedback? → refine, next intake iteration
  │    └─ (non-interactive: criteria-judge gates instead of user)
  │
  ├─ ⏸ PRE-START GATE: show final task + criteria → user confirms "go"
  │      (edit task / cancel supported here)
  │
  └─ Main loop [up to max_iter, default 10]:
       ├─ Worker: spawn agent session → iter-N/output.md
       │    (full runtime: exec, web_search, all skills, OAuth auth)
       ├─ Judge: spawn agent session → iter-N/verdict.md
       ├─ PASS? → write final-output.md, notify user, exit
       └─ FAIL? → extract gaps → ⏸ CHECKPOINT: show score + gaps to user
                    continue? → next iteration (with judge gaps)
                    redirect:X → next iteration (with user direction appended)
                    stop? → end loop, take best result so far
Interactive mode (default): user approves criteria, confirms pre-start, and reviews each FAIL checkpoint.
Batch mode (--no-interactive): fully autonomous; criteria-judge gates intake, no checkpoints.
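The main loop above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/run.py: the `Verdict` type, the `run_main_loop` name, and the worker/judge callables are all hypothetical stand-ins for the spawned agent sessions, and the sketch returns the last output rather than tracking a "best" one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    passed: bool       # True when every criterion is met
    gaps: List[str]    # what the judge found missing (empty on PASS)


def run_main_loop(
    worker: Callable[[str, List[str], int], str],
    judge: Callable[[str], Verdict],
    task: str,
    max_iter: int = 10,
) -> Optional[str]:
    """Run worker -> judge until PASS or max_iter; return the passing output, else the last one."""
    feedback: List[str] = []            # accumulated judge gaps fed to the next worker
    last_output: Optional[str] = None
    for iteration in range(1, max_iter + 1):
        last_output = worker(task, feedback, iteration)
        verdict = judge(last_output)
        if verdict.passed:
            return last_output          # PASS: stop immediately
        feedback.extend(verdict.gaps)   # FAIL: carry the gaps forward
    return last_output                  # budget exhausted without a PASS
```

Keeping the loop a plain `for` with explicit state is what makes the orchestration deterministic: the LLM only ever fills the worker and judge roles.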
When the orchestrator needs user input, it:
- Writes workspace/pending-input.json (kind + workspace path)
- Notifies the user via --recipient and --channel
- Polls workspace/user-input.md every 5s (up to --checkpoint-timeout minutes)

The main agent acts as the bridge: when pending-input.json exists and the user replies, the agent writes their response to user-input.md. The orchestrator picks it up automatically.
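A rough sketch of the file-based handshake, assuming stdlib only. The function name and the JSON field names (`kind`, `workspace`) are assumptions; the source only says the file holds "kind + workspace path". The notification step is left out.

```python
import json
import time
from pathlib import Path
from typing import Optional


def wait_for_user_input(
    workspace: Path,
    kind: str,
    poll_seconds: float = 5.0,
    timeout_minutes: float = 60.0,
) -> Optional[str]:
    """Write pending-input.json, then poll user-input.md until a reply or timeout."""
    pending = workspace / "pending-input.json"
    reply_file = workspace / "user-input.md"
    # Field names are assumed for illustration.
    pending.write_text(json.dumps({"kind": kind, "workspace": str(workspace)}))
    deadline = time.monotonic() + timeout_minutes * 60
    try:
        while True:
            if reply_file.exists():
                reply = reply_file.read_text().strip()
                reply_file.unlink()      # consume the reply so it is not read twice
                return reply
            if time.monotonic() >= deadline:
                return None              # nobody answered in time
            time.sleep(poll_seconds)
    finally:
        pending.unlink(missing_ok=True)  # no longer waiting for input
```

A returned `None` means the checkpoint timed out; what the orchestrator does in that case is not stated in the source.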
Each agent session is spawned via:
openclaw agent --session-id <isolated-id> --message <prompt> --timeout <N> --json
Routes through the gateway WebSocket using existing OAuth; no separate API key.
Workers get full agent runtime: exec, web_search, web_fetch, all skills, sessions_spawn.
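For illustration, the spawn command can be assembled as an argument list in Python. Only the flags come from the command quoted above; the helper name is hypothetical, and treating `--timeout` as a number of seconds is an assumption.

```python
from typing import List


def build_agent_command(session_id: str, prompt: str, timeout: int) -> List[str]:
    """Return the argv for one isolated agent session, using the flags quoted above."""
    return [
        "openclaw", "agent",
        "--session-id", session_id,
        "--message", prompt,
        "--timeout", str(timeout),  # unit assumed to be seconds
        "--json",
    ]
```

Passing this list to `subprocess.run(cmd, capture_output=True, text=True)` avoids shell quoting problems when the prompt contains quotes or newlines.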
When checkmate is triggered:
openclaw gateway call sessions.list --params '{"limit":1}' --json \
| python3 -c "import json,sys; s=json.load(sys.stdin)['sessions'][0]; print(s['sessionId'])"
Also note your --recipient (channel user/chat ID) and --channel as fallback.
bash <skill-path>/scripts/workspace.sh /tmp "TASK"
Prints the workspace path. Write the full task to workspace/task.md if needed.
python3 <skill-path>/scripts/run.py \
--workspace /tmp/checkmate-TIMESTAMP \
--task "FULL TASK DESCRIPTION" \
--max-iter 10 \
--session-uuid YOUR_SESSION_UUID \
--recipient YOUR_RECIPIENT_ID \
--channel <your-channel>
Use exec with background=true. This runs for as long as needed.
Add --no-interactive for fully autonomous runs (no user checkpoints).
Bridge user replies: check for pending-input.json and write their response to workspace/user-input.md.

When a checkpoint message arrives (the orchestrator sent the user a criteria/approval/checkpoint request), bridge their reply:
# Find active pending input
cat <workspace-parent>/checkmate-*/pending-input.json 2>/dev/null
# Route user's reply
echo "USER REPLY HERE" > /path/to/workspace/user-input.md
The orchestrator polls for this file every 5 seconds. Once written, it resumes automatically and deletes the file.
Accepted replies at each gate:
| Gate | Continue | Redirect | Cancel |
|------|----------|----------|--------|
| Criteria review | "ok", "approve", "lgtm" | any feedback text | n/a |
| Pre-start | "go", "start", "ok" | "edit task: NEW TASK" | "cancel" |
| Iteration checkpoint | "continue", (empty) | "redirect: DIRECTION" | "stop" |
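The table can be read as a small classifier. The sketch below is one possible interpretation, not the parser in run.py: the gate labels (`criteria`, `pre-start`, `checkpoint`), the action names, and the case-insensitive matching are all assumptions.

```python
from typing import Tuple

# Replies that mean "continue" at each gate, taken from the table above.
CONTINUE_WORDS = {
    "criteria": {"ok", "approve", "lgtm"},
    "pre-start": {"go", "start", "ok"},
    "checkpoint": {"continue", ""},
}


def classify_reply(gate: str, reply: str) -> Tuple[str, str]:
    """Map a raw reply to (action, payload) for the given gate."""
    text = reply.strip()
    lowered = text.lower()
    if lowered in CONTINUE_WORDS[gate]:
        return ("continue", "")
    if gate == "criteria":
        return ("redirect", text)  # any other text is refinement feedback
    if gate == "pre-start":
        if lowered == "cancel":
            return ("cancel", "")
        if lowered.startswith("edit task:"):
            return ("redirect", text[len("edit task:"):].strip())
    if gate == "checkpoint":
        if lowered == "stop":
            return ("cancel", "")
        if lowered.startswith("redirect:"):
            return ("redirect", text[len("redirect:"):].strip())
    return ("unknown", text)       # not covered by the table
```

For example, `classify_reply("checkpoint", "redirect: focus on tests")` yields `("redirect", "focus on tests")`.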

run.py flags:
| Flag | Default | Notes |
|------|---------|-------|
| --max-intake-iter | 5 | Intake criteria refinement iterations |
| --max-iter | 10 | Main loop iterations (increase to 20 for complex tasks) |
| --worker-timeout | 3600s | Per worker session |
| --judge-timeout | 300s | Per judge session |
| --session-uuid | (none) | Agent session UUID (from sessions.list); used for direct turn injection (primary notification path) |
| --recipient | (none) | Channel recipient ID (e.g. user/chat ID, E.164 phone number); fallback if injection fails |
| --channel | (none) | Delivery channel for fallback notifications (e.g. telegram, whatsapp, signal) |
| --no-interactive | off | Disable user checkpoints (batch mode) |
| --checkpoint-timeout | 60 | Minutes to wait for user reply at each checkpoint |
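As a reading aid, here is how these flags and defaults might map onto `argparse`. This is a sketch only: run.py's real parser is not shown in this document, so the types, the `required` setting on `--workspace`, and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flag table; not copied from run.py."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--workspace", required=True)
    parser.add_argument("--task")
    parser.add_argument("--max-intake-iter", type=int, default=5)
    parser.add_argument("--max-iter", type=int, default=10)
    parser.add_argument("--worker-timeout", type=int, default=3600)    # seconds
    parser.add_argument("--judge-timeout", type=int, default=300)      # seconds
    parser.add_argument("--session-uuid")
    parser.add_argument("--recipient")
    parser.add_argument("--channel")
    parser.add_argument("--no-interactive", action="store_true")
    parser.add_argument("--checkpoint-timeout", type=int, default=60)  # minutes
    return parser
```

Note the mixed units: the worker and judge timeouts are in seconds, while the checkpoint timeout is in minutes.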
memory/checkmate-YYYYMMDD-HHMMSS/
├── task.md               # task description (user may edit pre-start)
├── criteria.md           # locked after intake
├── feedback.md           # accumulated judge gaps + user direction
├── state.json            # {iteration, status}: resume support
├── pending-input.json    # written when waiting for user; deleted after response
├── user-input.md         # agent writes user's reply here; read + deleted by orchestrator
├── intake-01/
│   ├── criteria-draft.md
│   ├── criteria-verdict.md   (non-interactive only)
│   └── user-feedback.md      (interactive: user's review comments)
├── iter-01/
│   ├── output.md         # worker output
│   └── verdict.md        # judge verdict
└── final-output.md       # written on completion
If the script is interrupted, just re-run it with the same --workspace. It reads state.json and skips completed steps. Locked criteria.md is reused; completed iter-N/output.md files are not re-run.
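The resume behaviour can be approximated as follows. This is a sketch under assumptions: the two-digit `iter-NN` naming follows the layout shown for `iter-01`, the initial state values are invented for illustration, and both function names are hypothetical.

```python
import json
from pathlib import Path


def load_state(workspace: Path) -> dict:
    """Read state.json, or return a fresh state when none exists yet."""
    state_file = workspace / "state.json"
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {"iteration": 0, "status": "new"}  # assumed initial values


def first_pending_iteration(workspace: Path, max_iter: int = 10) -> int:
    """Return the first iteration number whose output.md is missing."""
    for n in range(1, max_iter + 1):
        if not (workspace / f"iter-{n:02d}" / "output.md").exists():
            return n
    return max_iter + 1  # every iteration already has output
```

Checking for files on disk as well as reading `state.json` means a crash between writing the output and updating the state does not cause a worker to re-run.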
Active prompts called by run.py:
- prompts/intake.md: converts task → criteria draft
- prompts/criteria-judge.md: evaluates criteria quality (APPROVED / NEEDS_WORK); used in non-interactive mode
- prompts/worker.md: worker prompt (variables: TASK, CRITERIA, FEEDBACK, ITERATION, MAX_ITER, OUTPUT_PATH)
- prompts/judge.md: evaluates output against criteria (PASS / FAIL)

Reference only (not called by run.py):
- prompts/orchestrator.md: architecture documentation explaining the design rationale

Generated Mar 1, 2026
Developers use Checkmate to ensure new code passes all unit tests and meets coding standards before deployment. The skill iterates by running tests, analyzing failures, and adjusting code until all criteria are satisfied, reducing manual review cycles.
Researchers employ Checkmate to produce thorough literature reviews or experimental reports that must cover specific topics and adhere to formatting guidelines. It loops through drafting, judging for completeness, and refining until all research criteria are met.
Marketing teams utilize Checkmate to generate promotional materials like blog posts or social media content that must align with brand voice and include key messaging points. The skill iterates by creating drafts, evaluating against criteria, and revising until quality standards are achieved.
Legal professionals apply Checkmate to draft contracts or compliance reports that must meet regulatory requirements and avoid loopholes. It ensures each section passes scrutiny by iterating through creation, verification, and feedback until all legal criteria are fulfilled.
Support teams use Checkmate to craft detailed and accurate responses to customer inquiries that must resolve issues and maintain a professional tone. The skill loops by generating replies, judging for clarity and completeness, and refining until all support criteria are satisfied.
Offer Checkmate as a monthly subscription for businesses to automate quality checks on tasks like code reviews or content creation. Revenue comes from tiered plans based on usage frequency and integration depth, targeting tech and creative industries.
Provide custom integration of Checkmate into existing corporate workflows, such as legal or research departments, with consulting services for setup and optimization. Revenue is generated through one-time project fees and ongoing support contracts.
Deploy Checkmate as a freemium tool where basic iteration features are free, but advanced capabilities like batch mode or high-privilege access require a paid upgrade. Revenue streams from premium subscriptions and in-app purchases for additional features.
💬 Integration Tip
Ensure the OpenClaw CLI is properly installed and configured in the PATH before use, and always review task descriptions carefully in interactive mode to control high-privilege actions.