voice-recognition
Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.
Install via ClawdBot CLI:
clawdbot install gykdly/voice-recognition
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 20, 2026
Transcribe and translate international business meetings from audio recordings into English text. Useful for teams with diverse language speakers to create accessible minutes and summaries.
Convert recorded university lectures or seminars into text for note-taking and summarization. Helps students and educators generate study materials and quick overviews in multiple languages.
Transcribe podcast episodes for subtitles, show notes, or translation into English. Enables creators to repurpose audio content into written formats for broader audience reach.
Analyze recorded customer service calls by transcribing them and generating summaries. Assists in identifying common issues and improving service quality across different languages.
Transcribe legal depositions or interviews from audio files into accurate text records. Supports multiple languages and can translate to English for international cases.
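The transcribe and translate-to-English use cases above correspond to the Whisper CLI's `--task` option. The following is a minimal sketch, assuming the `whisper` executable from the openai-whisper package is on your PATH. The helper name `build_whisper_command` and the file name `meeting.m4a` are hypothetical and not part of the skill; the code only builds the command line and does not run it.

```python
import shlex
from typing import List, Optional


def build_whisper_command(
    audio: str,
    translate: bool = False,
    language: Optional[str] = None,
    model: str = "base",
    output_dir: str = ".",
) -> List[str]:
    """Build an argv list for the Whisper CLI (hypothetical helper)."""
    cmd = ["whisper", audio, "--model", model]
    # "translate" produces English text; "transcribe" keeps the spoken language.
    cmd += ["--task", "translate" if translate else "transcribe"]
    if language:
        # Optional hint, e.g. "zh"; Whisper auto-detects the language when omitted.
        cmd += ["--language", language]
    # Write a plain-text transcript into output_dir.
    cmd += ["--output_format", "txt", "--output_dir", output_dir]
    return cmd


if __name__ == "__main__":
    # Print the command for a Chinese recording translated into English.
    argv = build_whisper_command("meeting.m4a", translate=True, language="zh")
    print(shlex.join(argv))
```

Returning a list rather than a single string means the result can be passed straight to `subprocess.run` without shell quoting problems for file names that contain spaces.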
Offer basic transcription for free with local processing, then charge for advanced features like batch processing, API integration, or premium support. Targets individual users and small teams.
License the skill to businesses for embedding into their platforms, such as video conferencing tools or CRM systems. Provides scalable transcription and translation services with custom branding.
Provide tailored solutions for specific industries, like healthcare or education, with custom scripts, training, and integration support. Focuses on high-value clients needing specialized workflows.
💬 Integration Tip
Install the Whisper CLI with Homebrew and make sure Python 3.10+ is available, then add the alias to your shell config for quick command-line access.
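The prerequisites in this tip can be checked before the first transcription. The following is a rough sketch, assuming the CLI is exposed as an executable named `whisper`. The function name `missing_prerequisites` and its injectable parameters are hypothetical, added here only so the check is easy to exercise.

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 10)  # minimum version named in the tip


def missing_prerequisites(
    version: Tuple[int, ...] = tuple(sys.version_info[:2]),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable problems; an empty list means ready (hypothetical helper)."""
    problems: List[str] = []
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    # Assumption: the Whisper CLI is installed under the name "whisper".
    if which("whisper") is None:
        problems.append("whisper executable not found on PATH")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    print("ready" if not issues else "\n".join(issues))
```

The shell alias mentioned in the tip is not covered, since the listing does not say what it should be.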
Scored Apr 15, 2026
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.