telegram-offline-voice
Generate Telegram voice messages locally, with automatic text cleaning, segmentation, and temporary-file management.
Install via ClawdBot CLI:
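clawdbot install sanwecn/telegram-offline-voice
Local generation in a one-step wrapper: uses Microsoft Edge-TTS to produce high-quality Chinese speech, with all processing done offline.
Native TTS solutions usually produce only MP3 attachments and cannot handle Markdown markup or very long text. This project wraps the whole pipeline so that "speech synthesis" becomes "voice interaction":
Strips **, #, [link] and other Markdown symbols so the AI does not read this "code noise" aloud.
The voice_gen.py script automatically runs the entire text -> MP3 -> OGG process.
# Requires a Python environment and FFmpeg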
sudo apt update && sudo apt install ffmpeg python3-pip -y
# Installing uv is recommended so the wrapper script runs as fast as possible
curl -LsSf https://astral.sh/uv/install.sh | sh
Call the wrapper script directly to generate the path to a Telegram-native voice bubble in one step:
uv run {baseDir}/scripts/voice_gen.py --text "您的待播报内容"
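--text / -t: the text to synthesize (required).
--voice: voice selection; defaults to zh-CN-XiaoxiaoNeural (Xiaoxiao).
--rate: speaking-rate adjustment; defaults to +5%.
--outdir: directory for temporary files; defaults to /tmp.
The script automatically removes the following so the text reads smoothly:
Markdown symbols such as *, _, ` and #
Markdown link syntax, and all links beginning with http or https
Separators such as --- and ***
Tuned and maintained by @sanwe.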
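The removal rules listed above could be approximated with a few regular expressions. The sketch below is an assumption about how such cleaning might work, not the script's real implementation: the function name clean_for_tts is hypothetical, and keeping a Markdown link's visible label while dropping its target is a design choice made here for illustration.

```python
import re


def clean_for_tts(text: str) -> str:
    """Strip Markdown markup and URLs so a TTS engine does not read them aloud."""
    # Markdown links: keep the visible label, drop the target (assumed behavior)
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Bare links beginning with http:// or https://
    text = re.sub(r"https?://\S+", "", text)
    # Separator lines made of three or more -, * or _ characters
    text = re.sub(r"^\s*([-*_])\1{2,}\s*$", "", text, flags=re.MULTILINE)
    # Remaining inline markup characters: *, _, ` and #
    text = re.sub(r"[*_`#]", "", text)
    # Tidy up the whitespace left behind by the removals
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r" ?\n ?", "\n", text)
    text = re.sub(r"\n{2,}", "\n", text)
    return text.strip()


sample = "# Title\n**bold** and `code`, see [docs](https://example.com) or https://x.com/a\n---\nend"
print(clean_for_tts(sample))
```

Note that removing every underscore also alters identifiers such as voice_gen, so a production version would likely need narrower patterns.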
Follow me on Twitter for more advanced OpenClaw techniques: https://x.com/sanwe
Generated Mar 1, 2026
AI-powered Telegram bots can use this skill to generate voice messages from text responses, enhancing user engagement with natural-sounding audio. It's ideal for news updates, educational content, or customer service bots where privacy and cost efficiency are priorities.
Educational platforms can integrate this skill to create offline voice exercises for Chinese language learners, providing clear pronunciation without internet dependency. It supports custom voice and rate settings for tailored learning experiences.
Developers can build tools that convert text-based content, such as articles or messages, into voice messages for visually impaired users on Telegram. The offline nature ensures data privacy and reduces reliance on cloud services.
Companies can deploy this skill in internal Telegram groups to generate voice summaries from reports or announcements, improving accessibility and engagement. The concurrent safety feature allows multiple teams to use it simultaneously without conflicts.
Content creators can automate the generation of voiceovers for podcasts or audio snippets shared on Telegram, using the cleaning features to remove markdown from scripts. This saves time and resources compared to manual recording.
Offer a free tier with basic voice generation and limited features, then charge for advanced options like custom voices, higher rate limits, or premium support. This model attracts users from cost-sensitive sectors like education or small businesses.
License the skill to other companies or developers who want to integrate offline voice generation into their own products, such as chatbot platforms or communication tools. Provide customization and technical support as part of the package.
Provide consulting services to help organizations implement this skill into their existing systems, such as corporate Telegram bots or accessibility solutions. Offer training, maintenance, and custom development for specific use cases.
💬 Integration Tip
Ensure FFmpeg and Python are installed on the target system, and use the provided script with minimal configuration for quick setup.
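A small pre-flight check can confirm those prerequisites before the script is invoked. The helper below is a hypothetical sketch (the name missing_tools and the default tool list are assumptions); it uses only the standard-library shutil.which lookup.

```python
import shutil


def missing_tools(tools=("ffmpeg", "python3")):
    """Return the names of required executables that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Running the check at bot startup surfaces a missing FFmpeg install immediately, instead of at the first voice request.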