🎤 Speech & Audio AI Skills

178 AI agent skills for Speech & Audio. Part of the 🤖 AI & Agents category.

Speech & Audio Skills — Page 7

🎤Speech & Audio

VEED UGC

veed-ugc

v1.0.1

Generate UGC-style promotional videos with AI lip-sync. Takes an image (a person with a product, from Morpheus/Ad-Ready) and a script (pure dialogue), and creates a video of the person speaking. Uses ElevenLabs for voice synthesis.

609

today

🎤Speech & Audio

Elevenlabs Pro

openclaw-skill-elevenlabs-pro

v0.1.0

Advanced ElevenLabs TTS for converting text to speech, listing voices, and managing credits.

today

🎤Speech & Audio

Local Whisper

whisper-cpp

v1.0.2

Install and use whisper.cpp (local, free/offline speech-to-text) with OpenClaw. Supports downloading different ggml model sizes (tiny/base/small/medium/large...

2d ago

🎤Speech & Audio

PodcastIndex

podcastindex

v1.0.0

Search and retrieve podcast and episode details from the Podcast Index API by keyword, title, feed ID, URL, or featured person, using authenticated requests.

today

🎤Speech & Audio

yap

yap

tobihagemann

v1.0.1

Fast on-device speech-to-text transcription on macOS 26+ using Apple Speech.framework, supporting multiple languages and output formats without model downloads.

142

3d ago

🎤Speech & Audio

ComfyUI TTS

comfyui-tts

v1.0.0

Convert text to speech audio via ComfyUI's Qwen-TTS API, supporting customizable voice, style, model, and output options.

322

today

🎤Speech & Audio

Venice Transcribe

venice-transcribe

v1.0.1

Transcribe audio to text using Venice AI's Whisper-based speech recognition. Supports WAV, MP3, FLAC, M4A, and AAC formats, with optional timestamps.

3d ago

🎤Speech & Audio

hotbutter voice chat

hotbutter

v1.0.6

Enables local voice chat by embedding the Hotbutter relay server and PWA, providing speech-to-text and text-to-speech over a secure, self-hosted connection.

today

🎤Speech & Audio

AssemblyAI advanced speech transcription

assemblyai-transcribe

v1.0.0

Transcribe audio/video with AssemblyAI (local upload or URL), with subtitle and paragraph/sentence exports.

1.2k

3d ago

🎤Speech & Audio

Valtec Vietnamese TTS

valtec-tts

v1.0.2

Local Vietnamese text-to-speech via VITS2 (offline, no cloud). Supports 5 built-in speaker voices and zero-shot voice cloning from reference audio.

10d ago

🎤Speech & Audio

Local Voice (FluidAudio TTS/STT)

local-voice

v1.0.1

Local text-to-speech (TTS) and speech-to-text (STT) using FluidAudio on Apple Silicon. Sub-second voice synthesis and transcription running entirely on-device via the Apple Neural Engine. Use when setting up local voice capabilities, voice assistant integration, or replacing cloud TTS/STT services.

952

today

🎤Speech & Audio

Deepgram ASR / Deepgram 语音转写

deepgram-asr

v0.2.2

Transcribe audio via the Deepgram Nova-3 API. Fast, accurate, and cost-effective speech-to-text for 50+ languages. Fast and accurate audio transcription. Trans...

4d ago

🎤Speech & Audio

Yandex Speechkit STT via Telegram Gateway

yandex-speechkit-stt

v1.0.0

Speech recognition via the Yandex SpeechKit API for voice messages in Telegram. Use when the user sends voice messages and wants...

yesterday

🎤Speech & Audio

Inworld TTS

inworld-tts

v1.0.0

Text-to-speech via Inworld.ai API. Use when generating voice audio from text, creating spoken responses, or converting text to MP3/audio files. Supports multiple voices, speaking rates, and streaming for long text.

879

3d ago

🎤Speech & Audio

Truly Local Piper Multilang TTS (secure)

local-piper-tts-multilang-secure

szafranski

v1.1.0

Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...

3d ago

🎤Speech & Audio

Whisper Transcribe

whisper-transcribe

v1.0.0

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

751

yesterday

🎤Speech & Audio

mmVoiceMaker

mm-voice-maker

v1.0.0

Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creat...

2d ago

🎤Speech & Audio

Speech is Cheap Transcribe

asr

v1.2.0

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

1.1k

3d ago

🎤Speech & Audio

SAM TTS

sam-tts

v1.0.0

Generate retro robotic speech audio using SAM (Software Automatic Mouth), the classic C64 text-to-speech synthesizer. Use for the /sam command to generate voice messages. Supports a /sam on/off toggle mode in which all responses are spoken in the SAM voice, and pitch, speed, mouth, and throat parameters for voice customization.

203

3d ago

🎤Speech & Audio

ton

ton

v0.1.0

Ton namespace for Netsnek e.U. audio and media processing tools. Handles audio transcription, format conversion, waveform analysis, and podcast production wo...

107

3d ago

🎤Speech & Audio

Audio Processing

audio-processing

v1.1.0

Audio ingestion, analysis, transformation, and generation (Transcribe, TTS, VAD, Features).

12d ago

🎤Speech & Audio

Audiomind

audiomind

v2.1.7

One skill for all AI audio: TTS, music, SFX, and voice cloning. Routes your requests to 17+ models (ElevenLabs, fal.ai) via a single proxy. Free tier include...

132

today

🎤Speech & Audio

Whispers from the Star CN

whispers-from-the-star-cn

v1.0.0

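Whispers from the Star: a sci-fi survival adventure game. The player takes the role of Stella Chen, an astronaut who has crashed on an unfamiliar planet and must explore, survive, and solve puzzles on the planet Gaia to find a way home. Supports multi-scene exploration, resource management, and interaction with alien creatures. Suited to sci-fi adventure, survival simulation, and interactive narrative scenarios.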

615

3d ago

🎤Speech & Audio

Podcast Generation from PDF, Text, and Links

ai-podcast

v1.0.11

Generate AI podcast episodes from PDFs, text, notes, and links using MagicPodcast in OpenClaw. Creates natural two-person dialogue audio, supports custom lan...

346

today