voice-ai-voices
High-quality voice synthesis with 9 personas, 11 languages, and streaming using the Voice.ai API.
Install via ClawdBot CLI:
clawdbot install gizmoGremlin/voice-ai-voices
Requires:
Set your API key as an environment variable:
export VOICE_AI_API_KEY="your-api-key"
Get your API key: Voice.ai Dashboard
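A missing key usually only surfaces as an authentication failure on the first request. The sketch below checks the environment up front instead. `requireApiKey` is a hypothetical helper, not part of the bundled SDK; it only assumes the `VOICE_AI_API_KEY` variable name used above.

```javascript
// Hypothetical helper (not part of the bundled SDK): fail fast when the
// key is missing or still set to the placeholder from the docs.
function requireApiKey(env = process.env) {
  const key = (env.VOICE_AI_API_KEY || '').trim();
  if (!key || key === 'your-api-key') {
    throw new Error(
      'VOICE_AI_API_KEY is not set; export it before running the CLI or SDK.'
    );
  }
  return key;
}
```

Calling `requireApiKey()` once at startup keeps the error message close to its cause.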
No install step is required. This skill bundles a Node.js CLI and SDK (no external npm dependencies).
- scripts/tts.js - CLI entrypoint
- voice-ai-tts-sdk.js - Node.js SDK used by the CLI
- voices.json - Voice definitions used by the CLI
- voice-ai-tts.yaml - API specification
- package.json - Skill metadata for tooling

See SECURITY.md for the full security and privacy overview.
This skill:
- https://dev.voice.ai
- voices.json
- --output path (default output.mp3)

The SDK and spec use https://dev.voice.ai, which is the official Voice.ai production API domain.
OpenClaw can invoke the CLI script directly if your environment exposes VOICE_AI_API_KEY. Use the /tts commands as configured by your OpenClaw installation.
These chat commands work with OpenClaw:
| Command | Description |
|---------|-------------|
| /tts | Generate speech with default voice |
| /tts --voice ellie | Generate speech with specific voice |
| /tts --stream | Generate with streaming mode |
| /voices | List available voices |
Examples:
/tts Hello, welcome to Voice.ai!
/tts --voice oliver Good morning, everyone.
/tts --voice lilith --stream This is a long story that will stream as it generates...
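To make the command syntax above concrete, here is a minimal sketch of how a `/tts` message could be split into flags and text. `parseTtsCommand` is hypothetical and is not how OpenClaw itself parses commands; it only covers the `--voice` and `--stream` flags listed in the table, and it collapses repeated whitespace in the spoken text.

```javascript
// Hypothetical parser for the /tts chat syntax; not OpenClaw's own code.
function parseTtsCommand(message) {
  const tokens = message.trim().split(/\s+/);
  if (tokens.shift() !== '/tts') return null; // not a /tts command

  const options = { voice: null, stream: false };
  // Consume leading flags; everything after them is the text to speak.
  while (tokens.length > 0 && tokens[0].startsWith('--')) {
    const flag = tokens.shift();
    if (flag === '--voice' && tokens.length > 0) {
      options.voice = tokens.shift();
    } else if (flag === '--stream') {
      options.stream = true;
    } else {
      throw new Error(`Unsupported or incomplete flag: ${flag}`);
    }
  }
  return { ...options, text: tokens.join(' ') };
}
```

A `voice` of `null` means no voice was requested, so the caller would fall back to whatever default its installation configures.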
| Voice | ID | Gender | Persona | Best For |
|---------|-----|--------|-------------|----------------------------|
| ellie | d1bf0f33-8e0e-4fbf-acf8-45c3c6262513 | female | youthful | Vlogs, social content |
| oliver | f9e6a5eb-a7fd-4525-9e92-75125249c933 | male | british | Narration, tutorials |
| lilith | 4388040c-8812-42f4-a264-f457a6b2b5b9 | female | soft | ASMR, calm content |
| smooth | dbb271df-db25-4225-abb0-5200ba1426bc | male | deep | Documentaries, audiobooks |
| shadow | 72d2a864-b236-402e-a166-a838ccc2c273 | male | distinctive | Gaming, entertainment |
| sakura | 559d3b72-3e79-4f11-9b62-9ec702a6c057 | female | anime | Character voices |
| zenith | ed751d4d-e633-4bb0-8f5e-b5c8ddb04402 | male | deep | Gaming, dramatic content |
| flora | a931a6af-fb01-42f0-a8c0-bd14bc302bb1 | female | cheerful | Kids content, upbeat |
| commander | bd35e4e6-6283-46b9-86b6-7cfa3dd409b9 | male | heroic | Gaming, action content |
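The SDK calls take a `voice_id`, while the CLI and chat commands use short names. A small lookup built from the table above bridges the two. This is a sketch: `VOICE_IDS` and `resolveVoiceId` are hypothetical names, and the layout of `voices.json` is not assumed here, so the IDs are simply copied from the table.

```javascript
// IDs copied from the voice table; the helper itself is hypothetical.
const VOICE_IDS = {
  ellie: 'd1bf0f33-8e0e-4fbf-acf8-45c3c6262513',
  oliver: 'f9e6a5eb-a7fd-4525-9e92-75125249c933',
  lilith: '4388040c-8812-42f4-a264-f457a6b2b5b9',
  smooth: 'dbb271df-db25-4225-abb0-5200ba1426bc',
  shadow: '72d2a864-b236-402e-a166-a838ccc2c273',
  sakura: '559d3b72-3e79-4f11-9b62-9ec702a6c057',
  zenith: 'ed751d4d-e633-4bb0-8f5e-b5c8ddb04402',
  flora: 'a931a6af-fb01-42f0-a8c0-bd14bc302bb1',
  commander: 'bd35e4e6-6283-46b9-86b6-7cfa3dd409b9'
};

const UUID_PATTERN = /^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/;

function resolveVoiceId(nameOrId) {
  const key = String(nameOrId).trim().toLowerCase();
  if (Object.prototype.hasOwnProperty.call(VOICE_IDS, key)) {
    return VOICE_IDS[key];
  }
  // Pass through anything that already looks like a voice UUID.
  if (UUID_PATTERN.test(key)) return key;
  throw new Error(`Unknown voice: ${nameOrId}`);
}
```

With this in place, `resolveVoiceId('ellie')` can stand in wherever the SDK examples show a placeholder such as `'ellie-voice-id'`.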
| Code | Language |
|------|------------|
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| it | Italian |
| pt | Portuguese |
| pl | Polish |
| ru | Russian |
| nl | Dutch |
| sv | Swedish |
| ca | Catalan |
Use the multilingual model for non-English languages:
const audio = await client.generateSpeech({
  text: 'Bonjour le monde!',
  voice_id: 'ellie-voice-id',
  model: 'voiceai-tts-multilingual-v1-latest',
  language: 'fr'
});
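The model choice can be derived from the language code instead of being set by hand. The sketch below builds a request object for `generateSpeech` and adds the multilingual model only for non-English input. `buildSpeechRequest` is a hypothetical helper, and leaving `model` unset for English assumes the API then falls back to its default model, which the source does not state.

```javascript
// Language codes from the table above.
const SUPPORTED_LANGUAGES = [
  'en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'ru', 'nl', 'sv', 'ca'
];
const MULTILINGUAL_MODEL = 'voiceai-tts-multilingual-v1-latest';

// Hypothetical helper: returns a plain options object for generateSpeech.
function buildSpeechRequest({ text, voiceId, language = 'en' }) {
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    throw new Error(`Unsupported language code: ${language}`);
  }
  const request = { text, voice_id: voiceId };
  if (language !== 'en') {
    // Assumption: English requests rely on the API's default model.
    request.model = MULTILINGUAL_MODEL;
    request.language = language;
  }
  return request;
}
```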
Customize voice output with these parameters:
| Parameter | Range | Default | Description |
|-----------|-------|---------|-------------|
| temperature | 0-2 | 1.0 | Higher = more expressive, lower = more consistent |
| top_p | 0-1 | 0.8 | Controls randomness in speech generation |
Example:
const audio = await client.generateSpeech({
  text: 'This will sound very expressive!',
  voice_id: 'ellie-voice-id',
  temperature: 1.8,
  top_p: 0.9
});
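Out-of-range values are listed in the error table as a cause of `ValidationError`, so it can help to check them locally before spending a request. This is a minimal sketch using the ranges and defaults from the parameter table; `checkSamplingParams` is a hypothetical name, and the API may enforce its limits differently.

```javascript
// Hypothetical pre-flight check using the documented ranges and defaults.
function checkSamplingParams({ temperature = 1.0, top_p = 0.8 } = {}) {
  if (!Number.isFinite(temperature) || temperature < 0 || temperature > 2) {
    throw new RangeError(`temperature must be between 0 and 2, got ${temperature}`);
  }
  if (!Number.isFinite(top_p) || top_p < 0 || top_p > 1) {
    throw new RangeError(`top_p must be between 0 and 1, got ${top_p}`);
  }
  return { temperature, top_p };
}
```

The returned object can be spread into the request, for example `{ text, voice_id, ...checkSamplingParams({ temperature: 1.8 }) }`.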
Generate audio with real-time streaming (recommended for long texts):
# Stream audio as it generates
node scripts/tts.js --text "This is a long story..." --voice ellie --stream
# Streaming with custom output
node scripts/tts.js --text "Chapter one..." --voice oliver --stream --output chapter1.mp3
SDK streaming:
const fs = require('fs');

const stream = await client.streamSpeech({
  text: 'Long text here...',
  voice_id: 'ellie-voice-id'
});
// Pipe to file
stream.pipe(fs.createWriteStream('output.mp3'));
// Or handle chunks
stream.on('data', chunk => {
  // Process audio chunk
});
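A bare `.pipe()` does not report errors from either side of the transfer. The sketch below uses Node's `pipeline` from `stream/promises` so that a failed download or write rejects the promise. It assumes `streamSpeech` resolves to a standard Node.js readable stream, which the `.pipe()` call above suggests but the source does not state outright; `saveAudioStream` is a hypothetical helper.

```javascript
const fs = require('fs');
const { pipeline } = require('stream/promises');

// Hypothetical helper: write a readable audio stream to disk and
// resolve with the number of bytes written once the file is closed.
async function saveAudioStream(readable, filePath) {
  await pipeline(readable, fs.createWriteStream(filePath));
  return fs.statSync(filePath).size;
}
```

Inside an async function this would be used as `await saveAudioStream(await client.streamSpeech({ text, voice_id }), 'output.mp3')`.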
| Format | Description | Use Case |
|--------|-------------|----------|
| mp3 | Standard MP3 (32kHz) | General use |
| wav | Uncompressed WAV | High quality |
| pcm | Raw PCM audio | Processing |
| opus_48000_128 | Opus 128kbps | Streaming |
| mp3_44100_192 | High-quality MP3 | Professional |
See voice-ai-tts-sdk.js for all format options.
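When the format is chosen at runtime, the output file name should follow it. The sketch below derives a file extension from the format string. It assumes the naming pattern suggested by the two longer entries in the table (codec first, then sample rate and bitrate separated by underscores) and only knows the codecs listed there; `extensionForFormat` and `outputFileName` are hypothetical helpers.

```javascript
// Codecs that appear in the format table; the SDK may accept others.
const CODEC_EXTENSIONS = new Map([
  ['mp3', 'mp3'],
  ['wav', 'wav'],
  ['pcm', 'pcm'],
  ['opus', 'opus']
]);

// Assumption: the codec is the part of the format name before the first "_".
function extensionForFormat(audioFormat) {
  const codec = String(audioFormat).split('_')[0];
  const extension = CODEC_EXTENSIONS.get(codec);
  if (!extension) {
    throw new Error(`Unrecognized audio format: ${audioFormat}`);
  }
  return extension;
}

function outputFileName(baseName, audioFormat = 'mp3') {
  return `${baseName}.${extensionForFormat(audioFormat)}`;
}
```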
# Set API key
export VOICE_AI_API_KEY="your-key-here"
# Generate speech
node scripts/tts.js --text "Hello world!" --voice ellie
# Choose different voice
node scripts/tts.js --text "Good morning!" --voice oliver --output morning.mp3
# Use streaming for long texts
node scripts/tts.js --text "Once upon a time..." --voice lilith --stream
# Show help
node scripts/tts.js --help
voice-ai-tts/
├── SKILL.md              # This documentation
├── README.md             # Quick start
├── CHANGELOG.md          # Version history
├── LICENSE.md            # MIT license
├── SECURITY.md           # Security & privacy notes
├── voices.json           # Voice definitions
├── voice-ai-tts.yaml     # OpenAPI specification
├── voice-ai-tts-sdk.js   # JavaScript/Node.js SDK
├── package.json          # OpenClaw metadata
└── scripts/
    └── tts.js            # CLI tool
Voice.ai uses a credit-based system. Check your usage:
// The SDK tracks usage via API responses
const voices = await client.listVoices();
// Check response headers for rate limit info
Tips to reduce costs:
- metadata.clawdbot so ClawHub shows required env vars
- VOICE_AI_API_KEY as primary env var in metadata
- VOICE_AI_API_KEY via environment variable only
- SECURITY.md and LICENSE.md for provenance and transparency
- voices.json for voice data

const VoiceAI = require('./voice-ai-tts-sdk');
const client = new VoiceAI(process.env.VOICE_AI_API_KEY);

// `await` needs an async context in CommonJS, so the calls are wrapped in main()
async function main() {
  // List voices
  const voices = await client.listVoices({ limit: 10 });

  // Get voice details
  const voice = await client.getVoice('voice-id');

  // Generate speech
  const audio = await client.generateSpeech({
    text: 'Hello, world!',
    voice_id: 'voice-id',
    audio_format: 'mp3'
  });

  // Generate to file
  await client.generateSpeechToFile(
    { text: 'Hello!', voice_id: 'voice-id' },
    'output.mp3'
  );

  // Stream speech
  const stream = await client.streamSpeech({
    text: 'Long text...',
    voice_id: 'voice-id'
  });

  // Delete voice
  await client.deleteVoice('voice-id');
}

main().catch(console.error);
| Error | Cause | Solution |
|-------|-------|----------|
| AuthenticationError | Invalid API key | Check your VOICE_AI_API_KEY |
| PaymentRequiredError | Out of credits | Add credits at voice.ai/dashboard |
| RateLimitError | Too many requests | Wait and retry, or upgrade plan |
| ValidationError | Invalid parameters | Check text length and voice_id |
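The "wait and retry" advice for `RateLimitError` can be sketched as a wrapper with exponential backoff. It assumes that errors thrown by the SDK expose a `name` property matching the table above, which the source does not confirm; `withRateLimitRetry` and its options are hypothetical. All other errors are rethrown immediately, since retrying an invalid key or an empty credit balance would not help.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: retry only rate-limit failures, doubling the wait
// after each attempt (baseDelayMs, 2x, 4x, ...).
async function withRateLimitRetry(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      // Assumption: SDK errors carry the names listed in the error table.
      const retryable = Boolean(err) && err.name === 'RateLimitError';
      if (!retryable || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call would then look like `await withRateLimitRetry(() => client.generateSpeech({ text, voice_id }))`.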
Made with ❤️ by Nick Gill
Generated Mar 1, 2026
Vloggers and influencers can use the skill to generate voiceovers for videos in multiple languages, leveraging the 9 personas for different tones. The streaming mode allows real-time audio generation for long-form content, enhancing production efficiency.
Educational platforms can integrate the skill to produce multilingual audio for courses, using voices like Oliver for clear tutorials. Customizable parameters like temperature help tailor the narration to match educational content styles.
Game developers can utilize voices like Shadow or Commander for character dialogues and in-game announcements, with support for 11 languages to reach global audiences. Streaming mode enables dynamic audio generation during gameplay.
Publishers can generate high-quality audiobooks using voices like Smooth for deep narration, with multilingual capabilities for international distribution. The skill's audio formats like MP3 and WAV ensure compatibility with various platforms.
Businesses can deploy the skill for automated voice responses in customer support systems, using cheerful voices like Flora for upbeat interactions. Integration with OpenClaw allows easy command-based triggering for real-time use.
Offer the skill as a service with tiered subscriptions based on usage limits, targeting developers and businesses needing high-quality TTS. Revenue streams include monthly fees and pay-per-use options for scalable access.
License the skill to e-learning or gaming platforms as an embedded TTS solution, providing custom branding and voice options. Revenue is generated through licensing fees and revenue-sharing agreements with partners.
Provide basic voice synthesis for free to attract individual creators, while charging for advanced features like streaming mode, additional languages, or premium voices. Revenue comes from upgrades and in-app purchases.
Integration Tip
Ensure the VOICE_AI_API_KEY is set as an environment variable and test with simple commands like /tts before scaling to complex streaming scenarios.