
peer-review
Multi-model peer review layer using local LLMs via Ollama to catch errors in cloud model output. Fan out critiques to 2-3 local models, aggregate flags, synthesize consensus (see the sketch after this entry).

Use when: validating trade analyses, reviewing agent output quality, testing local model accuracy, checking any high-stakes Claude output before publishing or acting on it.

Don't use when: simple fact-checking (just search the web), tasks that don't benefit from multi-model consensus, time-critical decisions where 60s latency is unacceptable, reviewing trivial or low-stakes content.

Negative examples:
- "Check if this date is correct" → No. Just web search it.
- "Review my grocery list" → No. Not worth multi-model inference.
- "I need this answer in 5 seconds" → No. Peer review adds 30-60s latency.

Edge cases:
- Short text (<50 words) → Models may not find meaningful issues. Consider skipping.
- Highly technical domain → Local models may lack domain knowledge. Weight flags lower.
- Creative writing → Factual review doesn't apply well. Use only for logical consistency.
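A minimal sketch of the fan-out-and-aggregate flow, assuming a local Ollama server on its default port (11434) and its documented /api/chat endpoint. The model names in REVIEWERS, the critique prompt, the helper names critique() and peer_review(), and the two-of-three consensus rule are all illustrative assumptions, not part of the skill itself.

```python
import concurrent.futures
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"   # default Ollama endpoint
REVIEWERS = ["llama3.1", "qwen2.5", "mistral"]   # hypothetical local models

CRITIQUE_PROMPT = (
    "You are a reviewer. List any factual, logical, or numerical errors in "
    "the text below as short bullet points. Reply 'NO ISSUES' if you find "
    "none.\n\n{text}"
)

def critique(model: str, text: str) -> str:
    """Ask one local model for a critique of the draft."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": model,
            "messages": [
                {"role": "user", "content": CRITIQUE_PROMPT.format(text=text)}
            ],
            "stream": False,  # one complete JSON response per reviewer
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

def peer_review(text: str) -> dict:
    """Fan out to all reviewers in parallel, then aggregate their flags."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {pool.submit(critique, m, text): m for m in REVIEWERS}
        critiques = {
            futures[f]: f.result()
            for f in concurrent.futures.as_completed(futures)
        }
    # A reviewer "flags" the text unless it explicitly reported no issues.
    flags = {m: c for m, c in critiques.items() if "NO ISSUES" not in c.upper()}
    return {
        "critiques": critiques,
        "flagged_by": sorted(flags),
        "consensus_flag": len(flags) >= 2,  # assumed majority rule (2 of 3)
    }
```

Running the critiques in parallel keeps wall-clock latency near the slowest single model rather than the sum of all three, which is where the quoted 30-60s figure comes from.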
reporting
Standardized templates for periodic reports, system audits, revenue tracking, and progress logs. All output goes to the workspace/artifacts/ directory (a minimal writer is sketched after this entry).

Use when: generating periodic reports, system audits, performance reviews, revenue tracking, weekly retrospectives, daily progress logs, full workspace audits.

Don't use when: ad-hoc status updates in chat, quick summaries in Discord, one-off answers to "how's it going?", real-time dashboards.

Negative examples:
- "Give me a quick update" → No. Just answer in chat.
- "What's the weather?" → No. This is for structured reports.
- "Post a status to Discord" → No. Just send a message.

Edge cases:
- Mid-week report requested → Use weekly template but note partial week.
- Audit requested for single subsystem → Use full audit template, mark other sections N/A.
- Revenue snapshot with $0 revenue → Still generate it. Zeros are data.
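A minimal sketch of a report writer under those conventions, assuming workspace/artifacts/ is resolved relative to the working directory; the file naming scheme, the template fields, and the write_weekly_report() name are illustrative, not the skill's actual templates.

```python
from datetime import date
from pathlib import Path

ARTIFACTS = Path("workspace/artifacts")  # all report output lands here

def write_weekly_report(summary: str, revenue: float,
                        partial_week: bool = False) -> Path:
    """Render a weekly report into workspace/artifacts/ and return its path."""
    ARTIFACTS.mkdir(parents=True, exist_ok=True)
    note = " (partial week)" if partial_week else ""
    body = (
        f"Weekly Report - {date.today().isoformat()}{note}\n\n"
        f"Summary:\n{summary}\n\n"
        f"Revenue: ${revenue:.2f}\n"  # zeros are data: always record the figure
    )
    path = ARTIFACTS / f"weekly-{date.today().isoformat()}.md"
    path.write_text(body)
    return path
```

Usage follows the edge cases above: a mid-week request becomes write_weekly_report(summary, revenue, partial_week=True), and a $0 snapshot still produces a file with a "Revenue: $0.00" line rather than being skipped.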