🎤 Speech & Audio AI Skills

829 AI agent skills for Speech & Audio. Part of the 🤖 AI & Agents category.

🎤Speech & Audio

Openai Whisper Api

openai-whisper-api

steipete

Av1.0.0

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

968

16.9k

today

🎤Speech & Audio

Openai Whisper

openai-whisper

steipete

Av1.0.0

Local speech-to-text with the Whisper CLI (no API key).

356

13.4k

1mo ago

🎤Speech & Audio

Sag

sag

steipete

Av1.0.0

ElevenLabs text-to-speech with mac-style say UX.

272

6.8k

1mo ago

🎤Speech & Audio

Voice Wake Say

voice-wake-say

xadenryan

Av1.0.1

Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").

5.9k

today

🎤Speech & Audio

ElevenLabs Voices

elevenlabs-voices

robbyczgw-cla

Sv2.1.6

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

5.6k

today

🎤Speech & Audio

Elevenlabs Tts

elevenlabs-tts

shaharsha

Sv2.4.0

ElevenLabs TTS - the best ElevenLabs integration for OpenClaw. ElevenLabs Text-to-Speech with emotional audio tags, ElevenLabs voice synthesis for WhatsApp,...

+12

5.2k

today

🎤Speech & Audio

Faster Whisper

faster-whisper

theplasmak

Av1.5.1

Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...

5.1k

today

🎤Speech & Audio

Jarvis Voice

jarvis-voice

globalcaos

Av2.2.1

Turn your AI into JARVIS. Voice, wit, and personality — the complete package. Humor cranked to maximum.

4.4k

1mo ago

🎤Speech & Audio

Edge TTS

edge-tts

i3130002

Av2.0.0

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

3.8k

1mo ago

🎤Speech & Audio

Voice Transcribe

voice-transcribe

darinkishore

Av1.0.1

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

3.7k

1mo ago

🎤Speech & Audio

Alexa CLI

alexa-cli

buddyh

Av1.3.0

Control Amazon Alexa devices and smart home via the `alexacli` CLI. Use when a user asks to speak/announce on Echo devices, control lights/thermostats/locks, send voice commands, or query Alexa.

3.6k

1mo ago

🎤Speech & Audio

OpenAI TTS

openai-tts

pors

Av1.0.0

Text-to-speech via OpenAI Audio Speech API.

3.5k

1mo ago

🎤Speech & Audio

Discord Voice

discord-voice

avatarneil

Av0.1.6

Real-time voice conversations in Discord voice channels with Claude AI

3.3k

1mo ago

🎤Speech & Audio

Kokoro TTS

kokoro-tts

edkief

Av0.1.0

Generate spoken audio from text using the local Kokoro TTS engine. Use when the user asks to "say" something, requests a voice message, or wants text converted to speech.

3.1k

1mo ago

🎤Speech & Audio

Voice Agent

voice-agent

ricardotrevisan

Av1.1.0

Local Voice Input/Output for Agents using the AI Voice Agent API.

today

🎤Speech & Audio

Audio Cog

audio-cog

nitishgargiitd

Av1.0.11

AI audio generation and text-to-speech powered by CellCog. Voiceover, narration, voice cloning, avatar voices, sound effects, music, podcasts, dialogue. Thre...

2.9k

today

🎤Speech & Audio

ElevenLabs Speech-to-Text

elevenlabs-stt

clawdbotborges

Av1.0.0

Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).

2.9k

1mo ago

🎤Speech & Audio

Local Whisper

local-whisper

araa47

Av1.0.0

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.

2.8k

1mo ago

🎤Speech & Audio

AudioPod

audiopod

Rakesh1002

Bv1.2.3

Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API_KEY env var or pass api_key directly.

2.8k

1mo ago

🎤Speech & Audio

Local Whisper

whisper-mlx-local

ImpKind

Av1.5.0

Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.

2.7k

1mo ago

🎤Speech & Audio

AssemblyAI advanced speech transcription

assemblyai-transcribe

tristanmanchester

Av1.0.1

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

2.7k

today

🎤Speech & Audio

Vocal Chat

vocal-chat

rubenfb23

Av1.0.0

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

2.6k

1mo ago

🎤Speech & Audio

Voice Reply

voice-reply

stolot0mt0m

Av1.0.0

Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.

2.6k

1mo ago

🎤Speech & Audio

MLX STT

mlx-stt

guoqiao

Bv1.0.7

Local speech-to-text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).

2.6k

1mo ago

…
