alicloud-ai-audio-tts
Generate human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text to speech,...
Install via ClawdBot CLI:
clawdbot install cinience/alicloud-ai-audio-tts
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
https://dashscope-intl.aliyuncs.com/api/v1
Audited Apr 16, 2026 · audit v1.0
Generated Mar 1, 2026
Generate voiceovers for short social media videos on platforms such as TikTok or Instagram Reels, where quick, engaging narration is needed. Useful for creators producing drama skits, news recaps, or educational snippets without recording equipment.
Convert written course materials or training scripts into audio for online learning platforms. Helps create accessible content for auditory learners or multilingual audiences by adjusting language and tone.
Produce pre-recorded audio responses for IVR systems or chatbots to enhance user interactions. Can be used for announcements, instructions, or feedback prompts in call centers.
Automate the generation of audio summaries from text reports or logs, such as converting daily news articles or technical documentation into spoken format for hands-free consumption.
Create audio versions of websites, books, or documents to assist individuals with visual impairments. Supports multiple languages and customizable voices for better user experience.
Offer a cloud-based platform where users pay a monthly fee to access TTS generation with advanced features like custom voices and high-volume usage. Targets influencers, marketers, and small businesses needing regular audio content.
License the TTS technology to large companies for integration into their internal systems, such as e-learning platforms or customer service tools. Includes support and customization based on usage volume.
Provide basic TTS generation for free to attract individual users, then upsell premium features like faster processing, exclusive voices, or advanced streaming capabilities. Focuses on building a user base and converting to paid plans.
💬 Integration Tip
Set the DASHSCOPE_API_KEY environment variable so requests can authenticate, and cache audio outputs so that repeated text does not trigger repeated API calls, which reduces both cost and latency.
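The tip above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation. The helper names (`require_api_key`, `cache_path`, `cached_tts`), the `.audio` file suffix, and the injected `synthesize` callable are all hypothetical. The real DashScope request is deliberately left to whatever function you pass in, since its exact call shape is not given here.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable, Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return DASHSCOPE_API_KEY, or fail early with a clear message."""
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return key


def cache_path(cache_dir: Path, model: str, voice: str, text: str) -> Path:
    """Derive a stable file name from every input that affects the audio."""
    joined = "\x00".join([model, voice, text]).encode("utf-8")
    digest = hashlib.sha256(joined).hexdigest()
    # ".audio" is a placeholder suffix; the real format depends on the API response.
    return cache_dir / f"{digest}.audio"


def cached_tts(
    text: str,
    voice: str,
    model: str,
    synthesize: Callable[[str, str, str], bytes],
    cache_dir: Path,
) -> bytes:
    """Return cached audio if present; otherwise call `synthesize` and store the result.

    `synthesize` is a hypothetical stand-in for the function that actually
    calls the TTS service and returns raw audio bytes.
    """
    path = cache_path(cache_dir, model, voice, text)
    if path.exists():
        return path.read_bytes()

    audio = synthesize(text, voice, model)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crash never leaves a partial cache entry.
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(audio)
    tmp.replace(path)
    return audio
```

Keying the cache on model, voice, and text together means that changing any one of them produces a fresh request, while identical requests are served from disk. Calling `require_api_key()` once at startup surfaces a missing key before any text is processed.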
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.