local-vosk
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
Install via ClawdBot CLI:
clawdbot install sfkiwi/local-vosk
Grade: Fair (based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals).
Calls external URL not in known-safe list
https://alphacephei.com/vosk/models
Audited Apr 17, 2026 · audit v1.0
Generated Mar 1, 2026
Automatically transcribe Telegram voice messages (.ogg format) into text for users who prefer reading over listening. This is particularly useful in noisy environments or for users with hearing impairments, providing instant text conversion without cloud dependency.
Transcribe business meetings or interviews recorded as audio files in various formats (mp3, wav, m4a) without requiring internet connectivity. Ideal for confidential discussions where cloud-based services pose security risks or in areas with poor internet access.
Convert lecture recordings or educational audio materials into text for students who benefit from written content. Lets students create study materials offline and supports learners who prefer reading to listening, without recurring API costs.
Transcribe interviews or field recordings made by journalists in remote locations with limited internet. Provides quick text drafts for article preparation while maintaining source confidentiality through local processing.
Convert personal voice memos and recordings into searchable text archives. Helps users organize thoughts, reminders, or creative ideas captured verbally without relying on cloud storage or subscription services.
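The transcription use cases above share one pipeline: decode the input with ffmpeg, then feed the audio to a Vosk recognizer. The following is a minimal Python sketch, assuming the vosk Python package and ffmpeg are installed and a model has been unpacked to a local directory. The function names, the 16 kHz sample rate, and the file paths are illustrative assumptions, not the skill's actual implementation.

```python
import json
import subprocess

SAMPLE_RATE = 16000  # assumption: a rate commonly used with Vosk models


def build_ffmpeg_cmd(audio_path, sample_rate=SAMPLE_RATE):
    """Build an ffmpeg command that decodes any input to raw 16-bit mono PCM on stdout."""
    return [
        "ffmpeg", "-loglevel", "quiet", "-i", str(audio_path),
        "-ar", str(sample_rate), "-ac", "1", "-f", "s16le", "-",
    ]


def transcribe(audio_path, model_dir, sample_rate=SAMPLE_RATE):
    """Transcribe one audio file offline and return the recognized text."""
    # Imported lazily so build_ffmpeg_cmd works even without vosk installed.
    from vosk import Model, KaldiRecognizer

    recognizer = KaldiRecognizer(Model(model_dir), sample_rate)
    pieces = []
    with subprocess.Popen(build_ffmpeg_cmd(audio_path, sample_rate),
                          stdout=subprocess.PIPE) as proc:
        while True:
            chunk = proc.stdout.read(4000)
            if not chunk:
                break
            # AcceptWaveform returns True once a full utterance has been decoded.
            if recognizer.AcceptWaveform(chunk):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Flush whatever audio is still buffered in the recognizer.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

Calling transcribe("voice.ogg", "model") with a hypothetical Telegram voice file and model directory would return the transcript as a single string. Because ffmpeg does the decoding, the same call should handle mp3, wav, and m4a inputs.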
Sell the skill package as a standalone offline transcription tool with lifetime usage rights. Customers pay once for perpetual access to local STT capabilities without recurring fees, appealing to privacy-conscious users and organizations.
Offer custom integration services to businesses wanting to embed offline speech-to-text into their existing applications (e.g., customer service platforms, internal tools). Charge for implementation, customization, and technical support.
Package the skill with optimized hardware (single-board computers, dedicated devices) for specific use cases like offline transcription kiosks or secure recording devices. Target industries requiring turnkey offline solutions.
💬 Integration Tip
Ensure ffmpeg is installed on the target system so that non-WAV formats can be decoded, and point users to the official Vosk model list (https://alphacephei.com/vosk/models) to download a model for their language.
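One way to act on this tip is a small preflight check that runs before any transcription. The sketch below uses only the Python standard library. The function name and the messages are hypothetical, and the model directory is whatever location the user unpacked a downloaded model to.

```python
import shutil
from pathlib import Path


def missing_prerequisites(model_dir, ffmpeg_name="ffmpeg"):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(ffmpeg_name) is None:
        problems.append(f"{ffmpeg_name} not found on PATH")
    # A Vosk model is distributed as an archive that unpacks to a directory.
    if not Path(model_dir).is_dir():
        problems.append(f"Vosk model directory not found: {model_dir}")
    return problems
```

Reporting every problem at once, instead of failing on the first, lets a user fix the ffmpeg install and the model download in a single pass.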
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.