voice-transcribe
Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).
Install via ClawdBot CLI:
clawdbot install darinkishore/voice-transcribe
transcribe audio files using openai's gpt-4o-mini-transcribe model.
when receiving voice memos (especially via whatsapp), just run:
uv run /Users/darin/clawd/skills/voice-transcribe/transcribe <audio-file>
then respond based on the transcribed content.
if darin says a word was transcribed wrong, add it to vocab.txt (as a hint to the model) or to replacements.txt (for a guaranteed fix). see sections below.
# transcribe a voice memo
transcribe /tmp/voice-memo.ogg
# pipe to other tools
transcribe /tmp/memo.ogg | pbcopy
/Users/darin/clawd/skills/voice-transcribe/.env:
OPENAI_API_KEY=sk-...
add words to vocab.txt (one per line) to help the model recognize names/jargon:
Clawdis
Clawdbot
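a minimal sketch of how a one-word-per-line vocab.txt could be turned into a hint string. the skill's actual script is not shown here, so the function names `load_vocab` and `build_prompt` and the comma-separated format are assumptions for illustration only:

```python
from pathlib import Path


def load_vocab(path: Path) -> list[str]:
    """Return the non-empty, stripped lines of a vocab file (one term per line)."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_prompt(terms: list[str]) -> str:
    """Join vocabulary terms into a single hint string."""
    # Assumption: a plain comma-separated list is used as the hint text.
    return ", ".join(terms)
```

a string like this could plausibly be passed as the `prompt` argument of OpenAI's transcription endpoint, which accepts free-text context. hints only nudge the model, which is why the replacements file exists as a fallback.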
if the model still gets something wrong, add a replacement to replacements.txt:
wrong spelling -> correct spelling
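a rough sketch of how the `wrong -> correct` lines could be parsed and applied to a finished transcript. the helper names and the matching rules (case-insensitive, whole-word) are assumptions, not the skill's confirmed behavior:

```python
import re


def parse_replacements(text: str) -> list[tuple[str, str]]:
    """Parse 'wrong -> correct' lines; lines without '->' are skipped."""
    pairs = []
    for line in text.splitlines():
        if "->" not in line:
            continue
        wrong, correct = (part.strip() for part in line.split("->", 1))
        if wrong:
            pairs.append((wrong, correct))
    return pairs


def apply_replacements(transcript: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each replacement in file order."""
    for wrong, correct in pairs:
        # Assumption: match case-insensitively and only on whole words,
        # so 'clawd is' does not rewrite the start of 'clawd island'.
        pattern = re.compile(rf"(?<!\w){re.escape(wrong)}(?!\w)", re.IGNORECASE)
        # A function replacement keeps backslashes in `correct` literal.
        transcript = pattern.sub(lambda _m, c=correct: c, transcript)
    return transcript
```

because the substitution runs on the output text rather than inside the model, it fixes a known misspelling every time it appears, which is the sense in which a replacement is "guaranteed" and a vocab hint is not.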
Generated Mar 1, 2026
Transcribe customer voice memos from WhatsApp to analyze feedback and complaints. Enables quick response by converting audio to text for ticket creation and sentiment analysis.
Transcribe doctor's voice notes for patient records using custom medical vocabulary. Ensures accuracy with replacements for specialized terms, aiding in documentation and billing.
Transcribe audio recordings from legal interviews or depositions. Uses vocabulary hints for legal jargon and replacements for accurate transcriptions to support case preparation.
Transcribe lectures or training sessions from audio files for note-taking and accessibility. Custom vocabulary helps with subject-specific terms, enhancing study materials.
Transcribe field interviews or voice memos for news articles. Enables quick text extraction with replacements for names and technical terms, streamlining content creation.
Offer free basic transcription with limited features, then charge for advanced options like custom vocabulary and replacements. Targets small businesses and individuals needing occasional transcription.
Integrate the skill into enterprise systems like CRM or healthcare software via API. Charge licensing fees based on usage volume and provide support for industry-specific vocabularies.
License the transcription tool to marketing or legal agencies as a branded service. Includes customization options and bulk processing, generating revenue through setup and maintenance fees.
💬 Integration Tip
Integrate with messaging apps like WhatsApp via webhooks to automate transcription of incoming voice memos, reducing manual file handling.
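As a hedged sketch of the step a webhook handler might run once an incoming voice memo has been saved to disk: the snippet below shells out to the `uv run` command shown earlier and returns its output. Receiving the webhook and downloading the audio are out of scope here, and the helper names `build_command` and `transcribe_file` are hypothetical.

```python
import subprocess
from pathlib import Path

# Script path taken from the skill's own usage instructions.
TRANSCRIBE = "/Users/darin/clawd/skills/voice-transcribe/transcribe"


def build_command(audio_path: Path) -> list[str]:
    """Build the argument list for the documented `uv run` invocation."""
    return ["uv", "run", TRANSCRIBE, str(audio_path)]


def transcribe_file(audio_path: Path, run=subprocess.run) -> str:
    """Run the transcribe script and return the text it prints to stdout."""
    # `run` is injectable so the function can be exercised without uv installed.
    result = run(
        build_command(audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

A handler would call `transcribe_file(Path("/tmp/voice-memo.ogg"))` after saving the attachment, then pass the returned text on to whatever produces the reply.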
Turn your AI into JARVIS. Voice, wit, and personality — the complete package. Humor cranked to maximum.
Local Voice Input/Output for Agents using the AI Voice Agent API.
Generate Telegram voice messages locally, with automatic text cleaning, segmentation, and temporary file management.
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
Send voice messages to a specified Telegram group.
Generate Russian male voice audio using ComfyUI with Qwen3 TTS node and save as MP3 for voice messages.