litellm

Call 100+ LLM providers through LiteLLM's unified API. Use it when you need to call a different model than your primary (e.g., use GPT-4 for code review while running on Claude), compare outputs from multiple models, route simple tasks to cheaper models, or access models your runtime doesn't natively support.
Install via ClawdBot CLI:
clawdbot install ishaan-jaff/litellm

Use LiteLLM when you need to call LLMs beyond your primary model.
import litellm

# Call any model with the unified API
response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain this code"}],
)
print(response.choices[0].message.content)
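Remote provider calls fail transiently (rate limits, timeouts), so production code should retry. LiteLLM's `completion` accepts a `num_retries` argument for this; the wrapper below is a generic sketch of the same idea (`call_with_retry` is a hypothetical helper, not part of LiteLLM):

```python
import time

def call_with_retry(call_fn, max_retries=3, base_delay=1.0):
    """Retry a flaky zero-argument callable with exponential backoff.

    In practice call_fn would wrap a completion call, e.g.
        lambda: litellm.completion(model="gpt-4o", messages=msgs)
    """
    for attempt in range(max_retries):
        try:
            return call_fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```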
import litellm

prompt = [{"role": "user", "content": "What's the best approach to X?"}]
models = ["gpt-4o", "claude-sonnet-4-20250514", "gemini/gemini-1.5-pro"]

# Ask each model the same question and compare the openings
for model in models:
    resp = litellm.completion(model=model, messages=prompt)
    print(f"{model}: {resp.choices[0].message.content[:200]}...")
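The loop above queries the models one at a time, so latency stacks up with each provider. A threaded variant (a sketch; `ask_model` is a stand-in for a `litellm.completion` call) issues the requests concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

def compare_models(models, ask_model):
    """Query every model concurrently; ask_model(model) -> response text.

    In practice ask_model would wrap litellm.completion, e.g.
        lambda m: litellm.completion(model=m, messages=prompt)
                         .choices[0].message.content
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        results = pool.map(ask_model, models)  # preserves input order
    return dict(zip(models, results))
```

Total wall time drops to roughly the slowest single provider instead of the sum of all of them.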
import litellm

def smart_call(task_type: str, prompt: str) -> str:
    """Route a prompt to the model best suited to the task type."""
    model_map = {
        "code": "gpt-4o",                       # Strong at code
        "writing": "claude-sonnet-4-20250514",  # Strong at prose
        "simple": "gpt-4o-mini",                # Cheap for simple tasks
        "reasoning": "o1-preview",              # Deep reasoning
    }
    model = model_map.get(task_type, "gpt-4o")
    resp = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
If a LiteLLM proxy is available, point to it for caching, rate limiting, and observability:
import litellm

litellm.api_base = "https://your-litellm-proxy.com"
litellm.api_key = "sk-your-key"

response = litellm.completion(
    model="gpt-4o",  # Proxy routes to the configured provider
    messages=[{"role": "user", "content": "Hello"}],
)
Ensure litellm is installed and API keys are set:
pip install litellm
# Set provider keys (or configure in proxy)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-..."
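A missing key surfaces later as a confusing authentication error at call time. A small preflight check fails fast instead (a sketch; `missing_provider_keys` is a hypothetical helper, and which keys are required depends on the providers you actually call):

```python
import os

def missing_provider_keys(required=("OPENAI_API_KEY", "ANTHROPIC_API_KEY"),
                          env=None):
    """Return the names of any unset or empty provider keys."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]
```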
Common model identifiers:
- OpenAI: gpt-4o, gpt-4o-mini, o1-preview, o1-mini
- Anthropic: claude-sonnet-4-20250514, claude-opus-4-20250514
- Google: gemini/gemini-1.5-pro, gemini/gemini-1.5-flash
- Mistral: mistral/mistral-large-latest

Full list: https://docs.litellm.ai/docs/providers
Generated Feb 24, 2026
A software development tool that uses LiteLLM to route code snippets to specialized models like GPT-4 for analysis, comparing outputs from multiple models to provide comprehensive feedback. This helps developers catch bugs and improve code quality efficiently across different programming languages.
An agency leverages LiteLLM to route writing tasks to models like Claude for prose generation and GPT-4 for editing, optimizing costs by using cheaper models for draft content. This enables scalable content creation with consistent quality and reduced operational expenses.
A customer service platform uses LiteLLM to route simple queries to cost-effective models like GPT-4o-mini and complex issues to advanced models like o1-preview for reasoning. This ensures accurate responses while minimizing API costs and improving user satisfaction.
Researchers utilize LiteLLM to compare outputs from models like Gemini and Claude for literature reviews and data analysis, accessing models not natively supported by their tools. This facilitates cross-model validation and deeper insights in scientific studies.
Offer a platform where users pay a monthly fee to access LiteLLM-powered tools for model comparison and routing, with tiered plans based on usage limits and advanced features. Revenue is generated through recurring subscriptions and potential enterprise contracts.
Provide a managed LiteLLM proxy service with caching and rate limiting, charging customers based on API call volume or data processed. This model targets businesses needing reliable, scalable access to multiple LLMs without infrastructure management.
Deliver custom integration services to help companies implement LiteLLM for specialized routing and cost optimization, with revenue from project-based fees and ongoing support contracts. This caters to enterprises with specific AI workflow needs.
💬 Integration Tip
Start by setting up a LiteLLM proxy for centralized management, which simplifies API key handling and provides caching to reduce costs and latency.
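A minimal proxy config sketch, started with `litellm --config config.yaml` (the model name and key reference here are placeholders; see the LiteLLM proxy docs for the full schema):

```yaml
# config.yaml — maps the name clients request to a configured provider
model_list:
  - model_name: gpt-4o              # name exposed by the proxy
    litellm_params:
      model: openai/gpt-4o          # actual provider route
      api_key: os.environ/OPENAI_API_KEY
```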