ClawHub Skills Lib

Browse 50,000+ community-built AI agent skills for OpenClaw. Updated daily from clawhub.ai.

Explore

  • Home
  • Categories
  • Use Cases
  • Trending
  • Blog

Categories

  • Development
  • AI & Agents
  • Productivity
  • Communication
  • Data & Research
  • Business
  • Platforms
  • Lifestyle
  • Education
  • Design

Use Cases

  • AI Code Generation
  • Code Review & Testing
  • DevOps & Cloud
  • Security & Compliance
  • Build an AI Agent
  • Agent Memory & RAG
  • Multi-Agent Orchestration
  • Browser & Web Automation
  • Financial & Market Data
  • Crypto & Web3
  • Real-Time Web Search
  • News & Media Monitoring
  • Academic Research
  • Data & Analytics
  • AI Image Generation
  • Voice & Audio AI
  • AI Video Creation
  • Content Writing
  • Task & Project Management
  • Knowledge Management
  • Email & Messaging
  • SEO & Content Marketing
  • Sales & CRM
  • Workflow Automation
  • Social Media
  • Chinese Platforms
  • E-Commerce
  • Education & Tutoring
  • HR & Recruiting
  • Legal & Compliance
© 2026 ClawHub Skills Lib. All rights reserved. Built with Next.js · Neon · Prisma

🎤 Speech & Audio AI Skills

829 AI agent skills for Speech & Audio. Part of the 🤖 AI & Agents category.

Speech & Audio Skills

829 skills found

Page 1 of 35

🎤 Speech & Audio

Openai Whisper Api

openai-whisper-api
steipete
A v1.0.0

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

968
16.9k
38
today
🎤 Speech & Audio

Openai Whisper

openai-whisper
steipete
A v1.0.0

Local speech-to-text with the Whisper CLI (no API key).

356
13.4k
70
1mo ago
🎤 Speech & Audio

Sag

sag
steipete
A v1.0.0

ElevenLabs text-to-speech with a macOS-style `say` UX.

272
6.8k
8
1mo ago
🎤 Speech & Audio

Voice Wake Say

voice-wake-say
xadenryan
A v1.0.1

Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").

36
5.9k
4
today
🎤 Speech & Audio

ElevenLabs Voices

elevenlabs-voices
robbyczgw-cla
S v2.1.6

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

20
5.6k
16
today
🎤 Speech & Audio

Elevenlabs Tts

elevenlabs-tts
shaharsha
S v2.4.0

ElevenLabs TTS - the best ElevenLabs integration for OpenClaw. ElevenLabs Text-to-Speech with emotional audio tags, ElevenLabs voice synthesis for WhatsApp,...

+12
24
5.2k
6
today
🎤 Speech & Audio

Faster Whisper

faster-whisper
theplasmak
A v1.5.1

Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...

22
5.1k
4
today
🎤 Speech & Audio

Jarvis Voice

jarvis-voice
globalcaos
A v2.2.1

Turn your AI into JARVIS. Voice, wit, and personality: the complete package. Humor cranked to maximum.

27
4.4k
3
1mo ago
🎤 Speech & Audio

Edge TTS

edge-tts
i3130002
A v2.0.0

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

25
3.8k
6
1mo ago
🎤 Speech & Audio

Voice Transcribe

voice-transcribe
darinkishore
A v1.0.1

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

13
3.7k
10
1mo ago
🎤 Speech & Audio

Alexa CLI

alexa-cli
buddyh
A v1.3.0

Control Amazon Alexa devices and smart home via the `alexacli` CLI. Use when a user asks to speak/announce on Echo devices, control lights/thermostats/locks, send voice commands, or query Alexa.

22
3.6k
14
1mo ago
🎤 Speech & Audio

OpenAI TTS

openai-tts
pors
A v1.0.0

Text-to-speech via OpenAI Audio Speech API.

22
3.5k
4
1mo ago
🎤 Speech & Audio

Discord Voice

discord-voice
avatarneil
A v0.1.6

Real-time voice conversations in Discord voice channels with Claude AI

+1
14
3.3k
3
1mo ago
🎤 Speech & Audio

Kokoro TTS

kokoro-tts
edkief
A v0.1.0

Generate spoken audio from text using the local Kokoro TTS engine. Use when the user asks to "say" something, requests a voice message, or wants text converted to speech.

12
3.1k
1mo ago
🎤 Speech & Audio

Voice Agent

voice-agent
ricardotrevisan
A v1.1.0

Local Voice Input/Output for Agents using the AI Voice Agent API.

23
3k
today
🎤 Speech & Audio

Audio Cog

audio-cog
nitishgargiitd
A v1.0.11

AI audio generation and text-to-speech powered by CellCog. Voiceover, narration, voice cloning, avatar voices, sound effects, music, podcasts, dialogue. Thre...

15
2.9k
2
today
🎤 Speech & Audio

ElevenLabs Speech-to-Text

elevenlabs-stt
clawdbotborges
A v1.0.0

Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).

7
2.9k
4
1mo ago
🎤 Speech & Audio

Local Whisper

local-whisper
araa47
A v1.0.0

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.

7
2.8k
5
1mo ago
🎤 Speech & Audio

AudioPod

audiopod
Rakesh1002
B v1.2.3

Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API_KEY env var or pass api_key directly.

1
2.8k
3
1mo ago
🎤 Speech & Audio

Local Whisper

whisper-mlx-local
ImpKind
A v1.5.0

Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.

+2
2
2.7k
9
1mo ago
🎤 Speech & Audio

AssemblyAI advanced speech transcription

assemblyai-transcribe
tristanmanchester
A v1.0.1

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

7
2.7k
3
today
🎤 Speech & Audio

Vocal Chat

vocal-chat
rubenfb23
A v1.0.0

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

+1
12
2.6k
8
1mo ago
🎤 Speech & Audio

Voice Reply

voice-reply
stolot0mt0m
A v1.0.0

Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.

7
2.6k
4
1mo ago
🎤 Speech & Audio

MLX STT

mlx-stt
guoqiao
B v1.0.7

Local speech-to-text with MLX (Apple Silicon) and open-source models (default: GLM-ASR-Nano-2512).

6
2.6k
1mo ago
…

More in 🤖 AI & Agents

  • 🛡️ Agent Security: 0 skills
  • 🧠 LLMs & Model APIs: 801 skills
  • 🤖 Agent Frameworks: 3542 skills
  • 🧠 Agent Memory: 743 skills
  • 🔄 Agent Self-Improvement: 325 skills
  • ⚙️ AI Tools & Utilities: 293 skills
  • 🖼️ Image Generation: 1181 skills
  • 🎬 Video Generation: 286 skills
  • ⚡ Automation & Workflows: 283 skills
  • 💬 Chatbots & Assistants: 764 skills
  • 📝 Prompt & Config: 318 skills