alicloud-ai-audio-tts-voice-clone
Voice cloning workflows with Alibaba Cloud Model Studio Qwen TTS VC models. Use when creating cloned voices from sample audio and synthesizing text with clon...
Install via ClawdBot CLI:
clawdbot install cinience/alicloud-ai-audio-tts-voice-clone
Category: provider
Use voice cloning models to replicate timbre from enrollment audio samples.
Use one of these exact model strings:
- qwen3-tts-vc-2026-01-22
- qwen3-tts-vc-realtime-2026-01-15

Install the SDK in a virtual environment:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.

Request fields:
- text (string, required)
- voice_sample (string | bytes, required): enrollment sample
- voice_name (string, optional)
- stream (bool, optional)

Response fields:
- audio_url (string) or streaming PCM chunks
- voice_id (string)
- request_id (string)

Keep voice_id and reuse it for future synthesis requests.

Prepare a normalized request JSON and validate response schema:
.venv/bin/python skills/ai/audio/alicloud-ai-audio-tts-voice-clone/scripts/prepare_voice_clone_request.py \
--text "欢迎来到语音复刻演示" \
--voice-sample "https://example.com/voice-sample.wav"
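The request and response fields listed on this page can be checked in plain Python before anything is sent. The following is a minimal sketch, not the skill's own prepare_voice_clone_request.py script: the helper names `build_voice_clone_request` and `validate_voice_clone_response`, the `model` key in the request, and the `streamed` flag are all hypothetical and chosen for illustration. It also assumes that audio_url is only expected when the response is not streamed. Note that a bytes sample would need encoding before the dict can be serialized to JSON.

```python
from typing import Any, Dict, Optional, Union

# Model strings quoted from this page.
ALLOWED_MODELS = (
    "qwen3-tts-vc-2026-01-22",
    "qwen3-tts-vc-realtime-2026-01-15",
)


def build_voice_clone_request(
    text: str,
    voice_sample: Union[str, bytes],
    model: str = ALLOWED_MODELS[0],
    voice_name: Optional[str] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Return a normalized request dict (hypothetical helper)."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    if not isinstance(voice_sample, (str, bytes)) or not voice_sample:
        raise ValueError("voice_sample is required (string or bytes)")

    request: Dict[str, Any] = {
        "model": model,  # assumption: the model is carried in the request
        "text": text,
        "voice_sample": voice_sample,
        "stream": bool(stream),
    }
    # Optional field: only include it when the caller supplied one.
    if voice_name is not None:
        request["voice_name"] = voice_name
    return request


def validate_voice_clone_response(
    response: Dict[str, Any], streamed: bool = False
) -> Dict[str, Any]:
    """Check that the documented response fields are non-empty strings."""
    required = ["voice_id", "request_id"]
    if not streamed:
        # Assumption: streamed responses deliver PCM chunks instead of a URL.
        required.append("audio_url")
    for key in required:
        value = response.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"response field {key!r} missing or not a string")
    return response
```

A caller might build the request with the same text and sample URL as the command above, then run the validator on whatever dict the service wrapper returns before storing voice_id.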
Output: output/ai-audio-tts-voice-clone/audio/
Related setting: OUTPUT_DIR
References: references/sources.md
Generated Mar 1, 2026
Enables creation of custom voice assistants for smart home devices or mobile apps, allowing users to clone their own voice or a preferred voice for interactions. This enhances user engagement by providing a familiar and personalized auditory experience.
Allows authors or publishers to clone a narrator's voice from sample recordings, enabling efficient synthesis of audiobook content without requiring continuous studio sessions. This reduces production costs and time while maintaining consistent voice quality.
Integrates cloned voices into automated customer service systems to provide a more human-like and brand-consistent interaction. This improves customer satisfaction by using a recognizable voice for IVR systems or chatbots.
Supports educators in generating voiceovers for online courses or tutorials by cloning their voice from lecture samples. This ensures a consistent teaching presence across multimedia materials, enhancing learning accessibility.
Assists individuals with speech impairments by cloning their pre-recorded voice samples to synthesize new speech for communication devices. This helps restore a natural-sounding voice for daily use in healthcare settings.
Monetizes the voice cloning functionality by offering API access to developers and businesses on a pay-per-use or subscription basis. This model scales with usage and targets enterprises needing custom voice solutions.
Provides branded voice cloning tools to media companies or software vendors for integration into their own products. This generates revenue through licensing fees and customization services.
Offers a web-based platform where creators can upload voice samples and generate synthesized audio for podcasts, videos, or ads. Revenue comes from premium features, storage, or ad-supported free tiers.
💬 Integration Tip
Ensure voice samples are high-quality and free of background noise to optimize cloning accuracy, and manage API keys securely in environment variables.
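One way to keep the key out of source code is to resolve it at runtime, first from the DASHSCOPE_API_KEY environment variable and then from ~/.alibabacloud/credentials. The sketch below is illustrative only: `resolve_dashscope_api_key` is a hypothetical helper, and it assumes the credentials file is INI-formatted with the dashscope_api_key entry under a section such as `[default]`, which this page does not confirm.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_dashscope_api_key(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> str:
    """Return the API key from the environment, else from the credentials file."""
    env = os.environ if env is None else env

    # 1. The environment variable takes priority.
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    # 2. Fall back to the credentials file.
    #    Assumption: INI format with named sections, e.g. [default].
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if path.is_file():
        parser = configparser.ConfigParser()
        parser.read(path, encoding="utf-8")
        for section in parser.sections():
            value = parser.get(section, "dashscope_api_key", fallback="").strip()
            if value:
                return value

    raise RuntimeError(
        "No API key found: set DASHSCOPE_API_KEY or add dashscope_api_key "
        "to ~/.alibabacloud/credentials"
    )
```

Passing `env` and `credentials_path` explicitly keeps the lookup easy to test without touching the real environment or home directory.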