Text-to-speech conversion using the node-edge-tts npm package for generating audio from text.
Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation.
Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Text-to-speech using the macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages, including Chinese (Mandarin), English, and Japanese.
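As a rough illustration of how a skill like this might drive `say`, here is a minimal Python sketch. The helper names `build_say_command` and `speak` are hypothetical and not taken from the skill itself. The `-v` (voice), `-r` (rate in words per minute), and `-o` (output file) options belong to the macOS `say` command. The voice name in the usage note is an assumption, since installed voices vary by machine (`say -v '?'` lists them).

```python
import shutil
import subprocess
from typing import List, Optional


def build_say_command(
    text: str,
    voice: Optional[str] = None,
    rate_wpm: Optional[int] = None,
    output_file: Optional[str] = None,
) -> List[str]:
    """Build the argument list for the macOS `say` command (hypothetical helper)."""
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]          # pick a specific installed voice
    if rate_wpm:
        cmd += ["-r", str(rate_wpm)]  # speaking rate in words per minute
    if output_file:
        cmd += ["-o", output_file]    # write audio to a file instead of the speakers
    cmd.append(text)
    return cmd


def speak(text: str, **options) -> bool:
    """Speak the text if `say` is available; return False on non-macOS systems."""
    if shutil.which("say") is None:
        return False
    subprocess.run(build_say_command(text, **options), check=True)
    return True
```

For example, `speak("Build finished", voice="Samantha", rate_wpm=180)` would announce a message through the speakers on a Mac where that voice is installed, and simply return `False` elsewhere.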
Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium speaker voices, and instruction-based voice control (emotion, tone, style). Alternative to cloud-based TTS services like ElevenLabs. Runs entirely offline after initial model download.
Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required.
Use when the user asks for a voice reply, audio response, or spoken answer, or wants to hear something read aloud.
Supports multiple languages including German (thorsten) and English (ryan) voices.
Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.
ElevenLabs TTS - the best ElevenLabs integration for OpenClaw. ElevenLabs Text-to-Speech with emotional audio tags, ElevenLabs voice synthesis for WhatsApp,...
Lets you send voice messages to your AI assistant and have it reply by voice.
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
End-user guide for running and configuring the `translate` CLI across text/stdin/file/glob inputs, provider selection, presets, custom prompt templates, and...
Translates articles and documents between languages with three modes - quick (direct), normal (analyze then translate), and refined (analyze, translate, revi...
Japanese-English translator and language tutor. Use when: (1) User shares Japanese text and wants translation (news articles, tweets, signs, menus, emails). (2) User asks "what does X mean" for Japanese words/phrases. (3) User wants to learn Japanese grammar, vocabulary, or cultural context. (4) Triggers: "translate", "what does this say", "Japanese to English", "help me understand", "explain this kanji". Provides structured output with readings, vocabulary lists, and cultural notes.
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
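The listing does not show how this skill invokes Whisper, so the following is only a sketch under the assumption that it wraps the open-source `whisper` command-line tool from the `openai-whisper` package. The helper names are hypothetical. The `--model`, `--output_format`, `--output_dir`, and `--language` flags are those of that CLI, and omitting `--language` leaves language detection to Whisper.

```python
from typing import Iterable, List, Optional

# Output formats named in the description above.
SUPPORTED_FORMATS = {"txt", "srt", "vtt", "json"}


def build_whisper_command(
    audio_path: str,
    model: str = "tiny",
    output_format: str = "txt",
    language: Optional[str] = None,
    output_dir: str = ".",
) -> List[str]:
    """Build a `whisper` CLI invocation for one audio file (hypothetical helper)."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    cmd = [
        "whisper", audio_path,
        "--model", model,
        "--output_format", output_format,
        "--output_dir", output_dir,
    ]
    if language:
        # Leaving this flag out lets Whisper auto-detect the language.
        cmd += ["--language", language]
    return cmd


def build_batch(audio_paths: Iterable[str], **options) -> List[List[str]]:
    """Build one command per file for batch processing."""
    return [build_whisper_command(path, **options) for path in audio_paths]
```

Each returned list can be passed to `subprocess.run` on a machine where the `whisper` CLI is installed; building the commands separately keeps the sketch testable without the model download.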
Translate text, files, and conversations between any languages. Auto-detects source language. Preserves formatting (markdown, code blocks, tables). Use when...
AI-agent Skill for PPTX OOXML localization workflows. Use it to unpack PPTX, extract and apply text translations, normalize terminology, enforce language-specific fonts, validate XML integrity, and repack outputs with machine-readable JSON interfaces for automation.
Text-to-speech generation on Volcengine audio services. Use when users need narration, multi-language speech output, voice selection, or TTS troubleshooting.
When the user wants to translate content, create translation workflows, manage terminology, or optimize translation quality. Also use when the user mentions...
Translate text via LibreTranslate API with offline dictionary fallback. Use when translating content, detecting languages, managing glossaries, or setting up...
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...