audio-gen
Generate audiobooks, podcasts, or educational audio content on demand. User provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content, make a podcast, generate an audiobook, or produce educational audio. Returns an MP3 audio file via MEDIA token.
Install via ClawdBot CLI:
clawdbot install udiedrichsen/audio-gen
Grade: Fair. Based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
https://github.com/clawdbot/clawdbot
Audited Apr 16, 2026 · audit v1.0
Generated Mar 20, 2026
Authors can generate audiobook chapters from their written content or new ideas, enabling rapid production without hiring voice actors. This reduces costs and time-to-market for indie authors looking to expand into audio formats.
Educators and e-learning platforms can create audio lessons, tutorials, or summaries on complex topics like science or history. This supports auditory learners and provides supplementary materials for online courses.
Podcasters can quickly produce episodes on trending topics or fill gaps in their content calendar. The tool helps maintain consistent output with scripted, high-quality audio in conversational styles.
Businesses can generate audio guides for employee training, onboarding processes, or compliance updates. This allows for scalable, engaging content that employees can listen to on-the-go.
Organizations can convert written materials like articles, reports, or newsletters into audio formats. This improves accessibility for visually impaired audiences and complies with inclusivity standards.
Offer monthly subscriptions for users to generate a set number of audio files per month, such as audiobook chapters or podcast episodes. This provides recurring revenue and caters to regular content creators.
Charge users per minute of audio generated, with tiered pricing based on length or quality. This model appeals to occasional users or those with variable content needs, like event organizers.
License the skill to companies for internal use, such as in e-learning platforms or media agencies. Customize features like voice branding or integration with existing workflows for a premium fee.
💬 Integration Tip
Store the Anthropic and ElevenLabs API keys securely and make sure the skill can read them at runtime. Test script formatting before full deployment to avoid TTS errors.
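As a rough illustration of the tip above, the sketch below reads both keys from environment variables and strips Markdown markup from a script before it would be sent to a TTS engine. It is not the skill's actual implementation. The variable names `ANTHROPIC_API_KEY` and `ELEVENLABS_API_KEY` and the helpers `load_api_keys` and `clean_script_for_tts` are assumptions made for this example, and the cleanup rules cover only a few common Markdown patterns.

```python
import os
import re


def load_api_keys(env=os.environ):
    """Read both keys from the environment and fail early if one is missing."""
    # Assumed variable names; adjust to however the keys are actually stored.
    names = ("ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {name: env[name] for name in names}


def clean_script_for_tts(script):
    """Strip common Markdown markup that a TTS engine might read aloud."""
    text = re.sub(r"^#{1,6}\s*", "", script, flags=re.MULTILINE)  # heading markers
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)  # keep link text only
    text = re.sub(r"[*_`]+", "", text)  # emphasis and inline-code markers
    text = re.sub(r"[ \t]+", " ", text)  # collapse runs of spaces and tabs
    return text.strip()
```

Calling `load_api_keys()` once at startup surfaces a missing key before any script is generated, and running `clean_script_for_tts` on a short sample script is a cheap way to check formatting before a full-length run.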
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.