openai-tts
Text-to-speech via OpenAI Audio Speech API.
Install via ClawdBot CLI:
clawdbot install pors/openai-tts

Requires: OPENAI_API_KEY
Generate speech from text via OpenAI's /v1/audio/speech endpoint.
{baseDir}/scripts/speak.sh "Hello, world!"
{baseDir}/scripts/speak.sh "Hello, world!" --out /tmp/hello.mp3
Defaults:
- Model: tts-1 (fast) or tts-1-hd (quality)
- Voice: alloy (neutral); also: echo, fable, onyx, nova, shimmer
- Format: mp3

| Voice | Description |
|-------|-------------|
| alloy | Neutral, balanced |
| echo | Male, warm |
| fable | British, expressive |
| onyx | Deep, authoritative |
| nova | Female, friendly |
| shimmer | Female, soft |
{baseDir}/scripts/speak.sh "Text" --voice nova --model tts-1-hd --out speech.mp3
{baseDir}/scripts/speak.sh "Text" --format opus --speed 1.2
Options:
--voice : alloy|echo|fable|onyx|nova|shimmer (default: alloy)
--model : tts-1|tts-1-hd (default: tts-1)
--format : mp3|opus|aac|flac|wav|pcm (default: mp3)
--speed : 0.25-4.0 (default: 1.0)
--out : output file (default: stdout or auto-named)

Set OPENAI_API_KEY, or configure in ~/.clawdbot/clawdbot.json:
{
  "skills": {
    "entries": {
      "openai-tts": {
        "apiKey": "sk-..."
      }
    }
  }
}
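For reference, the call that speak.sh presumably makes can be sketched directly against the endpoint. The snippet below is a minimal sketch, not the skill's actual implementation: the endpoint, parameter names, and defaults follow OpenAI's documented /v1/audio/speech interface, while the function names are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/audio/speech"

def build_tts_request(text, voice="alloy", model="tts-1",
                      response_format="mp3", speed=1.0):
    """Assemble the JSON body for a /v1/audio/speech call."""
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }

def speak(text, api_key, out_path="speech.mp3", **opts):
    """POST the payload and write the binary audio response to disk."""
    body = json.dumps(build_tts_request(text, **opts)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

Note that the speed bound (0.25-4.0) matches the `--speed` range above, so invalid values fail before any billable request is sent.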
Very affordable for short responses!
Generated Mar 1, 2026
Authors and publishers can convert written manuscripts into high-quality audio versions using different voices for characters or narration. This reduces production costs and time compared to hiring voice actors, making audiobook creation more accessible.
Developers integrate this skill into apps and websites to provide text-to-speech functionality for visually impaired users or those preferring audio content. It enhances user experience by converting articles, notifications, or instructions into speech in real-time.
Language learning platforms use the skill to generate pronunciation examples and listening exercises with various accents and speeds. Students can practice by hearing correct intonations, improving their auditory comprehension and speaking skills.
Businesses implement this skill in IVR systems or chatbots to provide spoken responses to customer inquiries, reducing wait times and operational costs. It allows for natural-sounding voice prompts in multiple languages and tones.
Marketing teams create audio ads, promotional videos, or social media content by converting scripts into speech with customizable voices and emotions. This streamlines production without needing professional recording studios.
Offer this skill as part of a subscription-based platform where users pay monthly for API access to generate speech. Include tiered pricing based on usage volume, such as characters processed, with premium features like HD voices.
Provide a free web app with basic TTS features and limited monthly usage, then charge for advanced options like custom voices, higher speed limits, or bulk processing. Monetize through in-app purchases or enterprise plans.
License the skill to companies for embedding into their own products, such as educational software or customer service tools. Charge a one-time setup fee plus ongoing support or usage-based royalties.
💬 Integration Tip
Ensure the OPENAI_API_KEY is securely stored in environment variables or configuration files, and test voice outputs with different formats to match application requirements.
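One way to apply that tip is to resolve the key from the environment first and fall back to the config file. This is a hypothetical helper, not part of the skill itself; the lookup path mirrors the `skills.entries.openai-tts.apiKey` structure shown above.

```python
import json
import os
from pathlib import Path

def resolve_api_key(config_path="~/.clawdbot/clawdbot.json"):
    """Prefer OPENAI_API_KEY from the environment; fall back to clawdbot.json."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    path = Path(config_path).expanduser()
    if path.exists():
        config = json.loads(path.read_text())
        return (config.get("skills", {})
                      .get("entries", {})
                      .get("openai-tts", {})
                      .get("apiKey"))
    return None
```

Checking the environment first keeps secrets out of files on shared machines while still letting local setups use the config fallback.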