Xrouter

Configure, run, and troubleshoot the Xrouter hardware-aware classifier router (wizard setup, local model, routing, and dashboard).

Install via ClawdBot CLI:

clawdbot install pathemata-mathemata/clawhub-skill-2

Xrouter is an open-source inference router that sits between OpenClaw and your LLM providers. It uses a fast, hardware-aware classifier to route each request to the most cost-effective model that can handle the task.
This project is MIT licensed. See the MIT License.
Core Features
An OpenAI-compatible POST /v1/chat/completions endpoint.
A built-in usage dashboard at /dashboard.

Workflow
flowchart TD
A["Client / OpenClaw request"] --> B["Router (OpenAI-compatible)"]
B --> C{"Classifier enabled?"}
C -->|No| F["Route to Frontier provider"]
C -->|Yes| D["Classifier (0 / 1 / 2)"]
D --> E{"Decision"}
E -->|0| G["Route to Cheap provider"]
E -->|1| M["Route to Medium provider"]
E -->|2| F
G --> H["Provider adapter (auto or explicit)"]
M --> H
F --> H
H --> I["Upstream API call"]
I --> J["Stream/Response back to client"]
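The decision step in the flowchart can be sketched as a small mapping. This is a hypothetical illustration, not the project's actual code (which lives in src/server.js and src/classifier.js); the tier names mirror the X-Xrouter-upstream header values documented below.

```javascript
// Hypothetical sketch of the routing decision shown in the flowchart.
function pickUpstream(classifierEnabled, decisionToken) {
  if (!classifierEnabled) return "frontier"; // classifier off: always frontier
  switch (String(decisionToken).trim()) {
    case "0": return "cheap";    // simple task: cheapest model
    case "1": return "medium";   // moderate task
    case "2": return "frontier"; // hard task: strongest model
    default:  return "frontier"; // unparseable output: fail safe upward
  }
}

console.log(pickUpstream(true, "0"));  // cheap
console.log(pickUpstream(false, "1")); // frontier (classifier disabled)
```

Failing upward to frontier on unparseable classifier output is one plausible safe default; the real fallback behavior is governed by the retry settings described under Environment Summary.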
Repository Layout
src/server.js: router and streaming proxy.
src/classifier.js: classifier call and retry logic.
src/config.js: configuration and env parsing.
src/cache.js: Redis + LRU cache.
src/token_tracker.js: token tracking.
scripts/check_hw.js: hardware detection.
scripts/configure_providers.js: interactive provider setup.

Requirements
Quickstart
npm install
npm run configure
npm run dev
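Once the router is up, any OpenAI-style client can talk to it. A minimal Node sketch that builds the request the router expects; the endpoint path comes from this README, while the helper name and base URL are illustrative:

```javascript
// Build a chat-completion request for the router's OpenAI-compatible endpoint.
function buildChatRequest(baseUrl, userContent) {
  return {
    url: `${baseUrl}/v1/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // "model" can be anything; the router decides the real upstream model.
      body: JSON.stringify({
        model: "any",
        messages: [{ role: "user", content: userContent }],
      }),
    },
  };
}

const req = buildChatRequest("http://localhost:3000", "Say hi");
console.log(req.url); // http://localhost:3000/v1/chat/completions
// To actually send it (requires the router to be running):
// const res = await fetch(req.url, req.options);
```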
How To Use
Example local setup (Ollama):
ollama pull llama3.1
ollama run llama3.1
Run the wizard:
npm run configure
Start the router:
npm run dev
Test a request:
curl -i http://localhost:3000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"any","messages":[{"role":"user","content":"Fix this sentence: I has a apple."}]}'
Look for these headers:
X-Xrouter-decision: 0, 1, or 2.
X-Xrouter-upstream: cheap, medium, or frontier.

Open the dashboard:

http://localhost:3000/dashboard

Raw usage JSON:

http://localhost:3000/usage

Provider Selection (Terminal Wizard)
Run:
npm run configure
The wizard walks you through provider selection, writes upstreams.json, and optionally updates .env.

Quick Start Mode
Routing Behavior
The classifier returns a single token (0, 1, or 2), and that token decides the route.

Compatibility
Provider types: xrouter, openai_compatible, openai, anthropic, gemini, cohere, azure_openai, mistral, groq, together, perplexity, or auto. auto infers the provider adapter from the base URL or API key; endpoints that cannot be identified use the generic openai_compatible adapter.

Token Tracking Dashboard
GET /usage: returns cumulative token usage for cheap, medium, and frontier.
GET /dashboard: UI that displays the token split and totals.

Local-model tokens are counted under cheap when the cheap route uses the local model.

Environment Summary
HOST: bind host, default 0.0.0.0.
PORT: bind port, default 3000.
ROUTER_API_KEY: require Authorization: Bearer <key>.
LOG_LEVEL: log level (debug/info/warn/error).
LOG_TO_FILE: set true to write logs to files.
LOG_DIR: directory for log files (default ./logs).
CLASSIFIER_ENABLED: set false to disable local classification.
CLASSIFIER_BASE_URL: OpenAI-compatible classifier endpoint.
CLASSIFIER_MODEL: classifier model name.
CLASSIFIER_SYSTEM_PROMPT: classifier prompt (single line).
CLASSIFIER_TIMEOUT_MS: classifier timeout.
CLASSIFIER_FORCE_STREAM: force streaming classifier request.
CLASSIFIER_WARMUP: warm the classifier on server start.
CLASSIFIER_WARMUP_DELAY_MS: delay before warmup request (ms).
CLASSIFIER_KEEP_ALIVE_MS: keep-alive interval for classifier warmup (ms).
CLASSIFIER_LOADING_RETRY_MS: delay between retries when the model is loading.
CLASSIFIER_LOADING_MAX_RETRIES: max retries when the model is loading.
CHEAP_BASE_URL: optional, defaults to classifier base URL.
CHEAP_API_KEY: cheap provider API key.
CHEAP_MODEL: optional model override for cheap route.
CHEAP_PROVIDER: provider type for cheap route (auto if empty).
CHEAP_HEADERS: optional JSON headers for cheap provider (stringified object).
CHEAP_DEPLOYMENT: Azure deployment override for cheap route.
CHEAP_API_VERSION: Azure API version override for cheap route.
MEDIUM_BASE_URL: required when classifier is enabled.
MEDIUM_API_KEY: medium provider API key.
MEDIUM_MODEL: optional model override for medium route.
MEDIUM_PROVIDER: provider type for medium route (auto if empty).
MEDIUM_HEADERS: optional JSON headers for medium provider (stringified object).
MEDIUM_DEPLOYMENT: Azure deployment override for medium route.
MEDIUM_API_VERSION: Azure API version override for medium route.
FRONTIER_BASE_URL: OpenAI-compatible frontier endpoint.
FRONTIER_API_KEY: frontier API key.
FRONTIER_MODEL: optional model override for frontier route.
FRONTIER_PROVIDER: provider type for frontier route (auto if empty).
FRONTIER_HEADERS: optional JSON headers for frontier provider (stringified object).
FRONTIER_DEPLOYMENT: Azure deployment override for frontier route.
FRONTIER_API_VERSION: Azure API version override for frontier route.
REDIS_URL: if set, enables Redis cache.

Local Model Installation & Run Guides
Ollama (best for Mac, easiest cross-platform)
ollama pull llama3.1
ollama run llama3.1

Default endpoint: http://localhost:11434

CLASSIFIER_BASE_URL=http://localhost:11434
CLASSIFIER_MODEL=llama3.1
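Putting the Ollama values together with a cloud fallback, a minimal .env might look like the sketch below. The medium and frontier values are illustrative placeholders, not defaults shipped with this project:

```shell
# Local classifier (cheap route defaults to the same base URL)
CLASSIFIER_BASE_URL=http://localhost:11434
CLASSIFIER_MODEL=llama3.1

# Medium and frontier routes (placeholder endpoint and keys)
MEDIUM_BASE_URL=https://api.example.com/v1
MEDIUM_API_KEY=example-medium-key
FRONTIER_BASE_URL=https://api.example.com/v1
FRONTIER_API_KEY=example-frontier-key
```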
vLLM (NVIDIA GPU)
vllm serve NousResearch/Meta-Llama-3-8B-Instruct --dtype auto --api-key token-abc123

Default endpoint: http://localhost:8000

CLASSIFIER_BASE_URL=http://localhost:8000
CLASSIFIER_MODEL=NousResearch/Meta-Llama-3-8B-Instruct
TensorRT-LLM (NVIDIA, max speed)
Point the classifier at your TensorRT-LLM server's OpenAI-compatible endpoint at http://<host>:<port>:

CLASSIFIER_BASE_URL=http://<host>:<port>
CLASSIFIER_MODEL=<model>
llama.cpp (CPU/AMD fallback)
llama-server -m model.gguf --port 8080

Default endpoint: http://localhost:8080

CLASSIFIER_BASE_URL=http://localhost:8080
CLASSIFIER_MODEL=<model>
Docker
Build and run the router with Redis:
docker compose -f deploy/docker-compose.yml up --build
Hardware Detection
Run:
npm run check-hw
This prints the recommended engine:
tensorrt-llm for large NVIDIA GPUs.
vllm for standard NVIDIA GPUs.
mlx for Apple Silicon.
llama.cpp for CPU/AMD fallback.

Model List Fetching
Model lists are fetched from each provider's listing endpoint (/v1/models for OpenAI-compatible APIs, /v1beta/models for Gemini), with a fallback to the bundled catalog in scripts/cloud_model_catalog.json.

Star History

Generated Mar 1, 2026
Example Use Cases

A company implements Xrouter to intelligently route customer queries between local models for simple FAQs and cloud models for complex technical issues. This reduces API costs while maintaining response quality for difficult cases. The dashboard helps track token usage across different model tiers.
A content creation platform uses Xrouter's classifier to determine whether blog posts need basic editing (cheap local model) or creative writing assistance (frontier model). This optimizes costs for high-volume content generation while ensuring premium features use appropriate resources.
An educational app routes student questions through Xrouter's classifier to determine complexity level. Simple factual questions use local models, while complex problem-solving queries get routed to advanced cloud models. This provides cost-effective scaling for large student populations.
A corporation deploys Xrouter to handle employee queries about company data and policies. Routine HR questions use cheap local models, while complex data analysis requests route to frontier models. The OpenAI compatibility allows easy integration with existing chat interfaces.
A translation service uses Xrouter to classify text complexity before routing. Simple phrases use cost-effective local models, while complex technical or literary texts use premium cloud providers. The hardware detection helps optimize local deployment for different server configurations.
Monetization Ideas

Offer Xrouter as a managed service that helps companies reduce LLM API costs by intelligently routing requests. Charge based on percentage of savings achieved or through tiered subscription plans. The token tracking dashboard provides clear ROI metrics for clients.
Sell customized Xrouter deployments with premium support, custom classifier training, and integration services. Target large organizations needing to manage multiple LLM providers across departments. Include consulting on optimal provider selection and configuration.
Operate Xrouter as a public API gateway where developers can send requests that get automatically routed to optimal providers. Monetize through pay-per-token pricing with markup, offering developers simplified access to multiple LLM providers through a single endpoint.
💬 Integration Tip
Start with the configuration wizard to automatically detect hardware and set up providers. Use the dashboard to monitor routing decisions and adjust classifier thresholds based on your specific use case patterns.
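For the monitoring step, the split shown on the dashboard can also be derived from the /usage totals. A sketch assuming a response shaped like per-tier token counts {cheap, medium, frontier}; the exact JSON shape is not specified in this README, so treat the field names as an assumption:

```javascript
// Compute each tier's share of total tokens from /usage-style data.
function usageShares(usage) {
  const tiers = ["cheap", "medium", "frontier"];
  const total = tiers.reduce((sum, t) => sum + (usage[t] || 0), 0);
  const shares = {};
  for (const t of tiers) {
    // Percentage of all tracked tokens handled by this tier (0 when no traffic).
    shares[t] = total === 0 ? 0 : Math.round((100 * (usage[t] || 0)) / total);
  }
  return shares;
}

console.log(usageShares({ cheap: 700, medium: 200, frontier: 100 }));
// { cheap: 70, medium: 20, frontier: 10 }
```

A cheap share that stays near 100% may mean the classifier threshold is too permissive for your workload; a frontier-heavy split suggests the opposite.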