local-whisper-cpp
Local speech-to-text using whisper-cli (whisper.cpp).
Install via ClawdBot CLI:
clawdbot install wuxxin/local-whisper-cpp
Transcribe audio files locally using whisper-cli and the large-v3-turbo model.
You can use the wrapper script:
scripts/whisper-local.sh <audio-file>
Or call the binary directly:
whisper-cli -m /usr/share/whisper.cpp-model-large-v3-turbo/ggml-large-v3-turbo.bin -f <audio-file> -l auto -nt
Wrapper script: scripts/whisper-local.sh (inside skill folder)
Model: /usr/share/whisper.cpp-model-large-v3-turbo/ggml-large-v3-turbo.bin
Binary: whisper-cli
Download the model to /usr/share/whisper.cpp-model-large-v3-turbo/:
mkdir -p /usr/share/whisper.cpp-model-large-v3-turbo
wget "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin?download=true" -O /usr/share/whisper.cpp-model-large-v3-turbo/ggml-large-v3-turbo.bin
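With the model in place, the direct whisper-cli call can be wrapped in a small function. The sketch below is only an illustration: the contents of scripts/whisper-local.sh are not shown here, so the function name transcribe and the WHISPER_BIN and MODEL_PATH variables are assumptions, not the skill's actual code. It reuses only the flags from the direct invocation (-m, -f, -l auto, -nt).

```shell
#!/bin/sh
# Hypothetical wrapper sketch; not the skill's own scripts/whisper-local.sh.
# Both variables can be overridden from the environment.
WHISPER_BIN="${WHISPER_BIN:-whisper-cli}"
MODEL_PATH="${MODEL_PATH:-/usr/share/whisper.cpp-model-large-v3-turbo/ggml-large-v3-turbo.bin}"

# transcribe FILE: run whisper-cli on FILE with automatic language
# detection (-l auto) and no timestamps (-nt).
transcribe() {
    if [ ! -r "$1" ]; then
        echo "cannot read audio file: $1" >&2
        return 1
    fi
    "$WHISPER_BIN" -m "$MODEL_PATH" -f "$1" -l auto -nt
}

# Only transcribe when a file argument was given on the command line.
if [ "$#" -gt 0 ]; then
    transcribe "$1"
fi
```

Saved under a name of your choosing (for example transcribe.sh, a hypothetical name), it would be run as: sh transcribe.sh recording.wav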
Generated Feb 23, 2026
Doctors and healthcare professionals can use this skill to transcribe patient consultations and medical notes locally. Because audio never leaves the machine, this keeps patient data private and can support HIPAA compliance, although local processing alone does not guarantee it. It enables quick documentation without relying on cloud services, improving workflow efficiency in clinics and hospitals.
Researchers in fields like linguistics or social sciences can transcribe interviews and focus group discussions locally for qualitative analysis. This avoids data transfer to external servers, protecting sensitive participant information and reducing costs associated with cloud transcription services.
Law firms and legal professionals can transcribe court proceedings, client meetings, and depositions on-premises to maintain confidentiality and help meet legal data protection standards. It provides an offline way to create records without internet dependency.
Content creators and journalists can transcribe audio recordings from interviews or podcasts locally for subtitling, editing, and archiving. This skill allows for fast processing without uploading files to external platforms, enhancing privacy and control over media assets.
Companies can use this skill to transcribe customer service calls locally for quality assurance and training purposes. It helps analyze interactions without exposing sensitive customer data to third-party services, supporting internal compliance and improvement initiatives.
Sell licenses for the skill as part of a bundled AI toolkit to businesses requiring local speech-to-text, such as healthcare or legal firms. Revenue comes from one-time purchases or annual subscriptions, with support for custom integrations and updates.
Offer consulting services to help organizations deploy and customize the skill within their existing infrastructure, ensuring compatibility and optimal performance. Revenue is generated through project-based fees and ongoing maintenance contracts.
Partner with hardware vendors to pre-install the skill on dedicated transcription devices or servers for industries like media or education. Revenue comes from sales of bundled solutions, including hardware and software support.
💬 Integration Tip
Ensure the whisper-cli binary is installed and the model file is correctly placed in /usr/share/whisper.cpp-model-large-v3-turbo/ for seamless script execution.
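The tip above can be turned into a quick preflight check. This is a minimal sketch, assuming a POSIX shell; the function name check_whisper_setup and the WHISPER_BIN and MODEL_PATH variables are hypothetical and not part of the skill.

```shell
#!/bin/sh
# Hypothetical preflight check; not shipped with the skill.
WHISPER_BIN="${WHISPER_BIN:-whisper-cli}"
MODEL_PATH="${MODEL_PATH:-/usr/share/whisper.cpp-model-large-v3-turbo/ggml-large-v3-turbo.bin}"

# check_whisper_setup BIN MODEL: return 0 if BIN is on PATH and MODEL is
# a readable file, otherwise print what is missing and return 1.
check_whisper_setup() {
    status=0
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing binary: $1" >&2
        status=1
    fi
    if [ ! -r "$2" ]; then
        echo "missing or unreadable model: $2" >&2
        status=1
    fi
    return "$status"
}

if check_whisper_setup "$WHISPER_BIN" "$MODEL_PATH"; then
    echo "whisper setup looks OK"
else
    echo "whisper setup incomplete"
fi
```

Running it before the wrapper script separates setup problems (missing binary or model) from problems with the audio input itself.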
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
End-to-end encrypted agent-to-agent private messaging via Moltbook dead drops. Use when agents need to communicate privately, exchange secrets, or coordinate without human visibility.
Text-to-speech via OpenAI Audio Speech API.