kokoro-tts
Generate spoken audio from text using the local Kokoro TTS engine. Use when the user asks to "say" something, requests a voice message, or wants text converted to speech.
Install via ClawdBot CLI:
clawdbot install edkief/kokoro-tts
This skill allows you to generate high-quality AI speech using a local or remote Kokoro-TTS instance.
The skill uses the KOKORO_API_URL environment variable to locate the API.
Default: http://localhost:8880/v1/audio/speech
Custom: add KOKORO_API_URL=http://your-server:port/v1/audio/speech to your .env file or environment.
To generate speech, run the included Node.js script.
node skills/kokoro-tts/scripts/tts.js "<text>" [voice] [speed]
voice is optional and defaults to af_heart.
speed is optional and defaults to 1.0.
Example:
node skills/kokoro-tts/scripts/tts.js "Hello Ed, this is Theosaurus speaking." af_nova
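When the script is invoked from other Node.js code rather than typed by hand, the positional form "<text>" [voice] [speed] can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: buildTtsArgs and runTts are hypothetical helper names, and the sketch assumes the script accepts exactly the three positional arguments shown above, with af_heart as the default voice.

```javascript
const { execFile } = require("node:child_process");

// Script path and default voice as documented for this skill.
const TTS_SCRIPT = "skills/kokoro-tts/scripts/tts.js";
const DEFAULT_VOICE = "af_heart";

// Build the positional argument list: "<text>" [voice] [speed].
// Because the arguments are positional, a speed can only be passed
// when a voice precedes it, so the default voice is filled in if needed.
function buildTtsArgs(text, voice, speed) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text must be a non-empty string");
  }
  const args = [TTS_SCRIPT, text];
  if (voice !== undefined || speed !== undefined) {
    args.push(voice ?? DEFAULT_VOICE);
  }
  if (speed !== undefined) {
    args.push(String(speed));
  }
  return args;
}

// Hypothetical wrapper: runs the script and resolves with its stdout.
// execFile passes arguments without a shell, so quotes in the text
// do not need escaping.
function runTts(text, voice, speed) {
  return new Promise((resolve, reject) => {
    execFile("node", buildTtsArgs(text, voice, speed), (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}
```

Using execFile instead of a shell command string is a deliberate choice here: the text to speak often contains quotes or punctuation that a shell would otherwise interpret.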
The script will output a single line starting with MEDIA: followed by the path to the generated MP3 file. OpenClaw will automatically pick this up and send it as an audio attachment.
Example Output:
MEDIA: media/tts_1706745000000.mp3
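A caller that captures the script's output can recover the file path by looking for the MEDIA: prefix. This is a small sketch under the assumption that the line looks exactly like the example above; parseMediaLine is a hypothetical helper, not something the skill ships.

```javascript
// Extract the file path from a "MEDIA:" line in the script's output.
// Returns null when no such line is present.
function parseMediaLine(output) {
  for (const line of output.split(/\r?\n/)) {
    if (line.startsWith("MEDIA:")) {
      return line.slice("MEDIA:".length).trim();
    }
  }
  return null;
}
```

For the example output above, parseMediaLine("MEDIA: media/tts_1706745000000.mp3\n") returns "media/tts_1706745000000.mp3".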
Common choices:
af_heart (Default, Female, Warm)
af_nova (Female, Professional)
am_adam (Male, Deep)
bf_alice (British Female)
For a full list, see references/voices.md or query the API.
Generated Mar 1, 2026
Integrate Kokoro TTS into chatbots or IVR systems to generate natural-sounding voice responses for customer inquiries. This enhances user experience by providing clear, consistent audio feedback without relying on pre-recorded clips, reducing operational costs.
Use the skill to convert textbooks, articles, or study materials into spoken audio for e-learning platforms. This aids accessibility for visually impaired students and supports auditory learners, enabling scalable production of audio resources.
Generate voiceovers for podcasts, audiobooks, or video content using customizable voices and speeds. This allows creators to quickly produce high-quality audio without hiring voice actors, streamlining content workflows.
Implement TTS in healthcare apps to provide medication reminders, appointment notifications, or health tips in spoken form. This improves patient engagement and accessibility, especially for elderly or disabled individuals.
Embed Kokoro TTS into smart devices like speakers or home assistants to deliver voice alerts, weather updates, or news summaries. This enhances user interaction by offering personalized, real-time audio feedback.
Offer Kokoro TTS as a cloud-based API service with tiered pricing based on usage volume or features like custom voices. This generates recurring revenue from developers and businesses integrating speech synthesis into their applications.
License the TTS technology to other companies for embedding into their products, such as call center software or educational tools. This provides upfront or ongoing licensing fees while expanding market reach through partnerships.
Provide a free tier with basic TTS features and limited usage, then charge for advanced options like higher-quality voices, faster speeds, or priority support. This attracts a broad user base and converts a portion to paid plans.
💬 Integration Tip
Ensure the KOKORO_API_URL is correctly set in your environment variables and test with sample scripts to verify audio output before full deployment.
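One way to catch a misconfigured endpoint before deployment is to resolve and validate the URL up front. The sketch below is illustrative only: resolveApiUrl is a hypothetical helper, and it assumes that http://localhost:8880/v1/audio/speech is the fallback when KOKORO_API_URL is unset. It checks only that the value is a well-formed http(s) URL, not that a server is listening there.

```javascript
// Assumed fallback endpoint when KOKORO_API_URL is not set.
const DEFAULT_API_URL = "http://localhost:8880/v1/audio/speech";

// Resolve the endpoint from an environment object, falling back to the
// default above, and fail early if the value is not an http(s) URL.
function resolveApiUrl(env = process.env) {
  const raw = (env.KOKORO_API_URL || "").trim() || DEFAULT_API_URL;
  let parsed;
  try {
    parsed = new URL(raw);
  } catch {
    throw new Error(`KOKORO_API_URL is not a valid URL: ${raw}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`KOKORO_API_URL must use http or https: ${raw}`);
  }
  return parsed.href;
}
```

A placeholder copied verbatim from the documentation, such as http://your-server:port/v1/audio/speech, fails this check because "port" is not a valid port number.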