whisper-transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, and aac formats.
Install via ClawdBot CLI:
clawdbot install JosunLP/whisper-transcribe
Grade: Fair. Based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 20, 2026
Transcribe podcast episodes to create show notes, improve SEO, and generate subtitles for video platforms. This enables content repurposing and accessibility for hearing-impaired audiences.
Convert recorded university lectures into text for study materials, closed captions, and archiving. Supports multiple languages for international student accessibility and research documentation.
Transcribe audio from corporate meetings to produce accurate minutes, action items, and compliance records. Facilitates remote collaboration and knowledge sharing across teams.
Generate subtitles in SRT or VTT formats for videos to reach global audiences. Enables quick translation workflows and enhances viewer engagement on platforms like YouTube.
Transcribe audio recordings of legal proceedings for official records, evidence preparation, and accessibility. Timestamped output lets reviewers check each passage against the recording; automated transcripts should be proofread before they serve as a verbatim record.
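The subtitle and timestamp use cases above depend on how SRT and VTT write cue times: SRT separates milliseconds with a comma, WebVTT with a period. The following is a minimal sketch of that formatting, not the skill's own code; `format_timestamp` and `segments_to_srt` are hypothetical helper names, and the `(start, end, text)` tuple shape is an assumption about how segments might be represented.

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Format an offset in seconds as an SRT or WebVTT cue timestamp."""
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    # SRT writes HH:MM:SS,mmm while WebVTT writes HH:MM:SS.mmm
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def segments_to_srt(segments):
    """Render (start, end, text) tuples (an assumed shape) as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Calling `format_timestamp(t, "vtt")` gives the WebVTT form of the same offset, so one segment list can feed both subtitle formats.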
Offer free basic transcription with limited features and charge for advanced options like batch processing, high-accuracy models, or API access. Targets small businesses and individual creators.
Sell customized packages to companies for internal use, such as meeting transcription or content localization. Includes support, integration services, and volume discounts.
Build a platform where freelancers use this tool to provide transcription services to clients. Charge a commission on transactions or offer premium listings for service providers.
💬 Integration Tip
Make sure ffmpeg is installed, since Whisper relies on it to decode audio, and pass multiple files in one batch rather than transcribing them one at a time.
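The tip above could be sketched roughly as follows. This assumes the upstream `openai-whisper` command-line tool (`whisper`, with its `--model`, `--output_format`, and `--output_dir` options); the skill's own wrapper and flags are not shown on this page and may differ. The function names and the `transcripts` default directory are hypothetical.

```python
import shutil
from pathlib import Path

# Extensions taken from the skill description on this page.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm", ".opus", ".aac"}


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def find_audio_files(folder):
    """List supported audio files directly inside a folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def build_whisper_command(files, model="base", output_format="srt",
                          output_dir="transcripts"):
    """Build a single `whisper` invocation that covers the whole batch.

    Assumes the upstream openai-whisper CLI; passing every file to one
    process means the model is loaded once instead of once per file.
    """
    return [
        "whisper", *map(str, files),
        "--model", model,
        "--output_format", output_format,
        "--output_dir", output_dir,
    ]
```

A caller would typically check `ffmpeg_available()` first, then hand the list from `build_whisper_command(find_audio_files("recordings"))` to `subprocess.run(..., check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.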
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.