openai-whisper

Local speech-to-text with the Whisper CLI (no API key).
Install via ClawdBot CLI:
clawdbot install steipete/openai-whisper

Install OpenAI Whisper (brew):
brew install openai-whisper

Requires:
Use whisper to transcribe audio locally.
Quick start
whisper /path/audio.mp3 --model medium --output_format txt --output_dir .
whisper /path/audio.m4a --task translate --output_format srt

Notes
Models are cached in ~/.cache/whisper on first run.
--model defaults to turbo on this install.

Generated Feb 25, 2026
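The quick-start commands can also be driven from a script. Below is a minimal Python sketch that builds and runs the same CLI invocations via subprocess; it assumes the `whisper` binary is on PATH, and the audio file names are placeholders.

```python
import subprocess


def whisper_cmd(audio, model="turbo", task="transcribe",
                output_format="txt", output_dir="."):
    """Build the argument list for a whisper CLI call.

    Defaults mirror this install: --model turbo, txt output in
    the current directory.
    """
    return [
        "whisper", str(audio),
        "--model", model,
        "--task", task,
        "--output_format", output_format,
        "--output_dir", str(output_dir),
    ]


def transcribe(audio, **opts):
    """Run the CLI; raises CalledProcessError if whisper fails."""
    subprocess.run(whisper_cmd(audio, **opts), check=True)


# Mirrors the quick-start commands above (placeholder paths):
# transcribe("audio.mp3", model="medium")
# transcribe("audio.m4a", task="translate", output_format="srt")
```

Keeping the argument list in one helper makes it easy to swap model, task, or output format per call without string-splicing shell commands.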
Use cases

Content creators can transcribe podcast episodes locally for accessibility and SEO. This enables generating text versions for websites and subtitles without relying on cloud services.
Researchers transcribe qualitative interviews for analysis while keeping sensitive data offline. This supports data privacy in fields like sociology or market research.
Businesses transcribe internal meetings to create accurate minutes and action items. This helps teams review discussions without manual note-taking.
Video producers translate and transcribe audio for subtitles in multiple languages. This aids in making content globally accessible without API costs.
Legal professionals transcribe depositions and hearings locally for case documentation. It ensures confidentiality and reduces reliance on external services.
Monetization ideas

Offer the CLI tool for free while charging for advanced features like batch processing or custom model training. Revenue comes from subscriptions for enterprise support and updates.
Provide APIs or plugins that integrate Whisper into existing software platforms. Revenue is generated through licensing fees and per-use charges for high-volume clients.
Offer consulting services to help organizations implement and optimize Whisper for specific needs. Revenue comes from project-based fees and ongoing maintenance contracts.
💬 Integration Tip
Ensure the Whisper CLI is installed via brew and models are cached locally for offline use; test with sample audio files to verify setup before integration.
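That pre-integration check can be automated. The sketch below looks for the `whisper` binary on PATH and for cached model files; the `.pt` glob and the default cache location match the note above, but treat them as assumptions about this install rather than guarantees.

```python
import shutil
from pathlib import Path


def check_whisper_setup(cache_dir=None):
    """Report whether the whisper CLI and model cache look usable offline."""
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "whisper"  # default cache (see Notes)
    cache_dir = Path(cache_dir)

    cli = shutil.which("whisper")  # None if not on PATH
    models = list(cache_dir.glob("*.pt")) if cache_dir.is_dir() else []
    return {
        "cli_found": cli is not None,
        "cli_path": cli,
        "cached_models": [m.name for m in models],
    }


# report = check_whisper_setup()
# if not report["cli_found"]:
#     print("whisper CLI missing -- run: brew install openai-whisper")
# if not report["cached_models"]:
#     print("no cached models -- first transcription will download one")
```

Run this once before wiring the tool into an agent so a missing binary or empty cache fails fast instead of mid-transcription.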
Related skills

- Transcribe audio via the OpenAI Audio Transcriptions API (Whisper).
- ElevenLabs text-to-speech with macOS-style `say` UX.
- Text-to-speech conversion using the node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) the user requests audio/voice output with the "tts" trigger or keyword; (2) content needs to be spoken rather than read (multitasking, accessibility, driving, cooking); (3) the user wants a specific voice, speed, pitch, or format for TTS output.
- End-to-end encrypted agent-to-agent private messaging via Moltbook dead drops. Use when agents need to communicate privately, exchange secrets, or coordinate without human visibility.
- Text-to-speech via the OpenAI Audio Speech API.
- Control Amazon Alexa devices and smart home via the `alexacli` CLI. Use when a user asks to speak/announce on Echo devices, control lights/thermostats/locks, send voice commands, or query Alexa.