voice-message
Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...
Install via ClawdBot CLI:
clawdbot install xmanrui/voice-message
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 20, 2026
AI agents can convert text-based support replies into voice messages for platforms like Telegram or WhatsApp, providing a more personal touch. This is especially useful for visually impaired users or in scenarios where reading text is inconvenient, such as while driving.
Educational platforms can use this skill to generate voice messages with correct pronunciation in multiple languages, giving learners material for listening practice. It supports voices in several languages, including Chinese and English, which makes it suitable for language apps with a global audience.
Companies using Feishu/Lark for collaboration can send voice messages instead of text for quick updates or instructions, and this skill ensures they are displayed as voice bubbles. It works around the fact that Feishu's message tool does not support asVoice=true.
Community managers on Discord can send voice messages with embedded waveforms for enhanced engagement, such as announcements or interactive content. This requires generating waveforms to meet Discord's API requirements for voice messages.
News or content creators can broadcast voice updates in different languages to Signal groups, leveraging edge-tts for text-to-speech conversion. This allows for automated, timely dissemination of information without manual recording.
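For the Discord use case above, the waveform step can be sketched roughly as follows. This is an illustration, not the skill's actual code: the helper names `encode_waveform` and `voice_attachment_metadata` are hypothetical, and the `waveform` / `duration_secs` field names, the 256-point cap, and the base64 encoding reflect a general understanding of Discord's voice-message attachments that should be checked against the current API documentation.

```python
import base64

# Assumed upper bound on the number of waveform points Discord accepts.
MAX_POINTS = 256


def encode_waveform(samples, max_points=MAX_POINTS):
    """Reduce PCM samples in [-1.0, 1.0] to at most `max_points` peak
    levels (one byte each, 0-255) and return them base64-encoded."""
    if not samples:
        return base64.b64encode(b"\x00").decode("ascii")
    n_points = min(max_points, len(samples))
    bucket = len(samples) / n_points
    levels = bytearray()
    for i in range(n_points):
        start = int(i * bucket)
        end = max(start + 1, int((i + 1) * bucket))
        # Peak absolute amplitude within this bucket.
        peak = max(abs(s) for s in samples[start:end])
        levels.append(min(255, int(peak * 255)))
    return base64.b64encode(bytes(levels)).decode("ascii")


def voice_attachment_metadata(samples, sample_rate, filename="voice.ogg"):
    """Build the attachment fields assumed to accompany a voice message."""
    return {
        "filename": filename,
        "duration_secs": len(samples) / sample_rate,
        "waveform": encode_waveform(samples),
    }
```

The samples would normally come from decoding the generated audio (for example, with ffmpeg writing raw PCM); that step is left out here so the sketch stays self-contained.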
Offer a cloud-based service where businesses pay a monthly fee to integrate this skill into their AI agents for automated voice messaging across multiple platforms. Revenue comes from tiered plans based on usage volume and supported channels.
Provide consulting and development services to customize this skill for specific enterprise needs, such as integrating with proprietary chat systems or adding custom voices. Revenue is generated through one-time project fees or ongoing support contracts.
Distribute the skill as open-source with basic functionality, while offering premium features like advanced voice options, higher quality audio, or priority support for a fee. Revenue streams include upgrades and donations from users.
💬 Integration Tip
Ensure ffmpeg and edge-tts are installed, and test token handling against Feishu's API to avoid file attachment issues.
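A minimal preflight sketch for the tip above, assuming both tools are expected on `PATH` as the `ffmpeg` and `edge-tts` executables. The helper names are hypothetical, and the OGG/Opus output format and 32k bitrate are assumptions chosen for illustration, not settings confirmed by the skill.

```python
import shutil


def missing_tools(tools=("ffmpeg", "edge-tts")):
    """Return the names of required executables not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def tts_command(text, voice, mp3_path):
    """Build an edge-tts CLI invocation that writes speech to an MP3 file."""
    return ["edge-tts", "--voice", voice, "--text", text,
            "--write-media", mp3_path]


def opus_command(mp3_path, ogg_path):
    """Build an ffmpeg invocation that re-encodes the MP3 as OGG/Opus
    (assumed target format; the bitrate is illustrative)."""
    return ["ffmpeg", "-y", "-i", mp3_path,
            "-c:a", "libopus", "-b:a", "32k", ogg_path]
```

Both command builders return argument lists suitable for `subprocess.run(cmd, check=True)`. Calling `missing_tools()` first lets an agent report a clear error instead of failing partway through a send.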
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.