speech-to-text
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
Install via ClawdBot CLI:
clawdbot install okaris/speech-to-text
Grade: Good, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
https://inference.sh
Audited Apr 16, 2026 · audit v1.0
Generated Mar 1, 2026
Transcribe recorded meetings for businesses to create searchable archives and minutes. This enables efficient review and compliance documentation, especially useful for remote teams and legal proceedings.
Generate accurate transcripts for podcast episodes to improve accessibility and SEO. This helps content creators reach wider audiences, including those with hearing impairments, and enhances discoverability through text-based search.
Create synchronized subtitles for educational videos to support diverse learners and language accessibility. This aids in comprehension for non-native speakers and complies with accessibility standards in online courses.
Transcribe doctors' voice notes into structured text for electronic health records. This streamlines documentation, reduces manual entry errors, and improves patient record management in clinical settings.
Transcribe qualitative interviews for academic or market research projects. This facilitates data analysis, coding, and reporting by converting audio recordings into text for detailed review and insights extraction.
Offer a monthly subscription for API access to transcription services, with tiered pricing based on usage volume. This provides predictable revenue and caters to businesses needing regular transcription for meetings or content creation.
Provide free limited transcription with paid upgrades for higher accuracy, faster processing, or additional features like translation. This attracts individual users and small teams, converting them to paying customers as needs grow.
License the transcription technology to enterprises for integration into their own platforms, such as video conferencing tools or content management systems. This generates high-value contracts through customization and support services.
💬 Integration Tip
Use the provided CLI examples to test basic transcription on a single file first. Once that works, automate the workflow by scripting the same commands for batch processing or by triggering them from webhooks.
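As a rough illustration of the batch-processing idea, the sketch below loops over a folder of audio files and runs one command per file. It is a sketch under stated assumptions, not the skill's documented workflow: the function names, the `command_prefix` parameter, and the list of file extensions are all hypothetical. It also assumes the transcription command takes the audio path as its last argument and prints the transcript to standard output, so copy the actual invocation from the skill's own CLI examples.

```python
import subprocess
from pathlib import Path

# Assumption: which extensions count as audio; adjust to what your CLI accepts.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".ogg", ".wav", ".webm"}


def find_audio_files(folder):
    """Return the audio files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_batch(folder, command_prefix):
    """Run `command_prefix + [audio_path]` once per audio file.

    `command_prefix` is a placeholder for the real transcription command,
    given as a list of arguments. Returns a dict mapping each file name to
    the command's stdout, or to None if the command exited with an error.
    """
    results = {}
    for audio in find_audio_files(folder):
        completed = subprocess.run(
            [*command_prefix, str(audio)],
            capture_output=True,
            text=True,
        )
        results[audio.name] = completed.stdout if completed.returncode == 0 else None
    return results
```

Passing the command as a list of arguments avoids shell quoting problems with file names that contain spaces. Failed files map to `None` instead of raising, so one bad recording does not stop the rest of the batch.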
Scored Apr 16, 2026
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.