gemini-reader
Understand local non-text files (PDF, video, audio) using Gemini API. Use when the user asks to read, summarize, or analyze a PDF document, video file (mp4/m...
Install via ClawdBot CLI:
clawdbot install Shigo-45/gemini-reader
Analyze local PDF, video, and audio files via Gemini API (Python SDK google-genai).
Prerequisites:
google-genai Python package installed (pip install google-genai)
GEMINI_API_KEY environment variable set
Usage:
python3 scripts/gemini_read.py <file> "<prompt>" [--model MODEL] [--output PATH]
# Summarize a PDF
python3 scripts/gemini_read.py paper.pdf "Summarize the key findings of this paper"
# Analyze a video
python3 scripts/gemini_read.py lecture.mp4 "List the main topics covered in this video"
# Transcribe audio
python3 scripts/gemini_read.py recording.m4a "Transcribe this audio verbatim"
# Save output to file
python3 scripts/gemini_read.py report.pdf "Extract all data tables" --output tables.txt
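The script itself is not reproduced on this page, so the following is only a minimal sketch of how a command line with this shape could be wired to the google-genai SDK. The names `build_parser`, `ask_gemini`, and `main` are hypothetical, and the upload-then-generate flow is an assumption about what gemini_read.py does, not its actual source.

```python
import argparse
import os
import time
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented usage: <file> "<prompt>" [--model MODEL] [--output PATH]."""
    parser = argparse.ArgumentParser(prog="gemini_read.py")
    parser.add_argument("file", type=Path, help="local PDF, video, or audio file")
    parser.add_argument("prompt", help="instruction to run against the file")
    # Default taken from the model table on this page.
    parser.add_argument("-m", "--model", default="gemini-3-flash-preview")
    parser.add_argument("--output", type=Path, default=None)
    return parser


def ask_gemini(path: Path, prompt: str, model: str) -> str:
    """Upload a local file and ask one question about it (assumed flow)."""
    # Imported lazily so the parser can be used without the SDK installed.
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    uploaded = client.files.upload(file=str(path))
    # Larger media may still be processing after upload; poll until it is ready.
    while uploaded.state is not None and uploaded.state.name == "PROCESSING":
        time.sleep(2)
        uploaded = client.files.get(name=uploaded.name)
    response = client.models.generate_content(model=model, contents=[uploaded, prompt])
    return response.text or ""


def main(argv=None) -> None:
    """Parse arguments, query the model, then print or save the answer."""
    args = build_parser().parse_args(argv)
    text = ask_gemini(args.file, args.prompt, args.model)
    if args.output:
        args.output.write_text(text, encoding="utf-8")
    else:
        print(text)
```

Calling `main(["paper.pdf", "Summarize the key findings of this paper"])` would correspond to the first example above. It needs network access and a valid key, so only the argument parsing can be checked offline.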
| Alias | Full name | Best for |
|-------|-----------|----------|
| 3-flash (default) | gemini-3-flash-preview | Fast, cheap, everyday use |
| 2.5-flash | gemini-2.5-flash | Stable, good balance |
| 2.5-pro | gemini-2.5-pro | Deep analysis, long docs |
| 3-pro | gemini-3-pro-preview | Advanced reasoning |
| 3.1-pro | gemini-3.1-pro-preview | Latest pro capabilities |
Select a model by passing its alias with -m, for example: python3 scripts/gemini_read.py file.pdf "prompt" -m 2.5-pro
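How the script maps aliases to full model names is not shown here. One plausible approach is a plain lookup table, sketched below. `MODEL_ALIASES`, `DEFAULT_ALIAS`, and `resolve_model` are hypothetical names, and passing unrecognized values through unchanged (so a full model name also works) is an assumption rather than documented behavior.

```python
# Alias table copied from the model list on this page.
MODEL_ALIASES = {
    "3-flash": "gemini-3-flash-preview",
    "2.5-flash": "gemini-2.5-flash",
    "2.5-pro": "gemini-2.5-pro",
    "3-pro": "gemini-3-pro-preview",
    "3.1-pro": "gemini-3.1-pro-preview",
}
DEFAULT_ALIAS = "3-flash"


def resolve_model(name=None):
    """Return the full model name for an alias, or the default when none is given."""
    if not name:
        return MODEL_ALIASES[DEFAULT_ALIAS]
    # Assumption: anything that is not a known alias is treated as a full model name.
    return MODEL_ALIASES.get(name, name)
```

With this table, `resolve_model("2.5-pro")` yields `gemini-2.5-pro`, and calling it with no argument falls back to the default flash model.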
GEMINI_API_KEY env var or google-genai configured auth
Generated Feb 26, 2026
Researchers can upload PDFs of academic papers to quickly summarize key findings, extract methodologies, or identify data tables. This accelerates literature reviews and data extraction for meta-analyses, saving hours of manual reading.
Companies can analyze internal training videos (e.g., mp4 files) to generate summaries of covered topics or create transcripts for accessibility. This helps in onboarding new employees and ensuring compliance with training documentation requirements.
Law firms can use this skill to process PDFs of contracts or case documents, extracting specific clauses or summarizing lengthy legal texts. It aids in due diligence and case preparation by highlighting critical information efficiently.
Media producers and podcasters can upload audio files (e.g., mp3 or wav) to generate verbatim transcripts for subtitles, content repurposing, or archival purposes. This streamlines post-production workflows and enhances content accessibility.
Healthcare providers can analyze audio recordings of patient consultations (e.g., m4a files) to transcribe discussions or summarize key health concerns. This supports accurate record-keeping and follow-up care planning while saving administrative time.
Offer this skill as part of a cloud-based platform where users pay a monthly fee for API access and file processing quotas. Include tiered plans based on usage volume (e.g., number of files or processing minutes) to cater to different business sizes.
Monetize by charging users per file processed or per API call, with pricing based on file type and size. This model appeals to occasional users or businesses with fluctuating needs, providing flexibility without long-term commitments.
Sell customized licenses to large organizations for on-premises or private cloud deployment, including enhanced security features and priority support. This targets industries like legal or healthcare that handle sensitive data and require compliance with regulations.
💬 Integration Tip
Ensure the GEMINI_API_KEY is securely stored as an environment variable and test with sample files to verify MIME type detection before full deployment.
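A pre-deployment check along these lines could be sketched as follows. The extension-to-MIME table, `detect_mime`, and `preflight` are hypothetical and written for illustration. The skill's own detection logic may differ, and the `audio/mp4` mapping for .m4a is an assumption.

```python
import os
from pathlib import Path

# Assumed extension-to-MIME table covering the formats mentioned on this page.
MIME_BY_SUFFIX = {
    ".pdf": "application/pdf",
    ".mp4": "video/mp4",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".m4a": "audio/mp4",
}


def detect_mime(path):
    """Return the MIME type for a path based on its (case-insensitive) extension."""
    suffix = Path(path).suffix.lower()
    try:
        return MIME_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or path}") from None


def preflight(paths, env=None):
    """Collect configuration problems without calling the API."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    for path in paths:
        try:
            detect_mime(path)
        except ValueError as exc:
            problems.append(str(exc))
    return problems
```

Running `preflight(["paper.pdf", "lecture.mp4"])` before a rollout returns an empty list when the key is present and every sample file has a recognized extension. The check inspects file names only, not file contents.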