ClawHub Skills Lib

Browse 27,000+ community-built AI agent skills for OpenClaw. Updated daily from clawhub.ai.

© 2026 ClawHub Skills Lib. All rights reserved. Built with Next.js · Neon · Prisma

🎚️ Audio Processing AI Skills

Clean audio, remove noise, separate vocals, and process audio files at scale.

29 skills (part of 🎙️ Voice & Audio AI)

Page 1 of 2

🎤Speech & Audio

ElevenLabs Voices

elevenlabs-voices
robbyczgw-cla
v2.1.6

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

20
5.6k
16
7d ago
🎤Speech & Audio

ElevenLabs Music

elevenlabs-music
clawdbotborges
v1.0.1

Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated lyrics, instrumental tracks, and multiple genres/styles. Requires paid ElevenLabs plan.

13
2.5k
1
25d ago
🎤Speech & Audio

Vocal Chat

vocal-chat
rubenfb23
v1.0.0

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

12
2.6k
8
18d ago
🎤Speech & Audio

Whisper Transcribe

whisper-transcribe
JosunLP
v1.0.0

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

7
1.1k
2
23d ago
🎤Speech & Audio

macOS Local Voice

macos-local-voice
STRRL
v1.0.0

Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice quality detection and smart voice selection.

6
1.1k
7d ago
🎤Speech & Audio

Podcast Generation from PDF, Text, and Links

ai-podcast
mogens9
v1.0.11

Generate AI podcast episodes from PDFs, text, notes, and links using MagicPodcast in OpenClaw. Creates natural two-person dialogue audio, supports custom lan...

5
745
2
22d ago
🎤Speech & Audio

AssemblyAI advanced speech transcription

assemblyai-transcribe
tristanmanchester
v1.0.1

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

5
2.6k
3
7d ago
🎤Speech & Audio

Parakeet STT

parakeet-stt
carlulsoe
v1.1.0

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

3
1.9k
26d ago
🎤Speech & Audio

Voice Message

voice-message
xmanrui
v1.0.4

Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...

3
467
1
7d ago
🎤Speech & Audio

ElevenLabs

elevenlabs-api
byungkyu
v1.0.0

ElevenLabs API integration with managed authentication. AI-powered text-to-speech, voice cloning, sound effects, and audio processing. Use this skill when users want to generate speech from text, clone voices, create sound effects, or process audio. For other third-party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).

3
567
1
25d ago
🎤Speech & Audio

Audio

audio
ivangdavila
v1.0.1

Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.

2
832
2
7d ago
🎤Speech & Audio

Whisper Transcriber

whisper-transcriber
vvusu
v1.0.0

Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT/JSON outputs, and model download. Cr...

2
158
1
14d ago
🎤Speech & Audio

Eachlabs Voice Audio

eachlabs-voice-audio
eftalyurtseven
v0.1.0

Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the user needs TTS, transcription, or voice conversion.

2
814
7d ago
🎤Speech & Audio

Audio Mastering CLI

audio-mastering-cli
alesys
v1.0.2

CLI audio mastering without a reference track using ffmpeg; accepts audio or video inputs and outputs mastered WAV/MP3 or remuxed MP4.

1
471
7d ago
🎤Speech & Audio

Voice Note to MIDI

voice-note-to-midi
DanBennettUK
v0.1.0

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing.

1
1.6k
7d ago
🎤Speech & Audio

AudioPod

audiopod
Rakesh1002
v1.2.3

Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API_KEY env var or pass api_key directly.

1
2.8k
3
24d ago
🎤Speech & Audio

ton

ton
kleberbaum
v0.1.0

Ton namespace for Netsnek e.U. audio and media processing tools. Handles audio transcription, format conversion, waveform analysis, and podcast production wo...

1
416
25d ago
🎤Speech & Audio

Podcastifier

agents-skill-podcastifier
cerbug45
v0.1.0

Turn incoming text (email/newsletter) into a short TTS podcast with chunking + ffmpeg concat.

1
486
25d ago
🎤Speech & Audio

Telegram Voice to Voice macOS

telegram-voice-to-voice-macos
Fiberian1981
v0.1.3

Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.

1
1.2k
25d ago
🎤Speech & Audio

TTS AutoPlay with Wake Word

tts-autoplay
WangZjhz
v2.0.1

Auto-play TTS voice files with wake word detection. Only plays audio when user message contains wake words like "语音", "念出来", "voice", etc. Perfect for Webcha...

1
245
7d ago
🎤Speech & Audio

Webchat Voice Proxy

webchat-voice-proxy
neldar
v0.2.2

⚠️ DEPRECATED — This skill has been split into two separate skills for better modularity: **webchat-https-proxy** (HTTPS/WSS reverse proxy) and **webchat-voi...

0
654
17d ago
🎤Speech & Audio

Generate Protoss-style (StarCraft) voice effects using SoX and FFmpeg.

protoss-voice
vemec
v1.1.1

Apply Protoss-style (StarCraft) psionic effects to ANY audio file. Use as a post-processing layer for TTS or user recordings.

0
1.7k
3
25d ago
🎤Speech & Audio

SpeakNotes: YouTube, Audio & Document Summaries

speaknotes-youtube-audio-document-summarizer
JackLillie
v1.0.1

Use when OpenClaw needs to call SpeakNotes API routes directly using an API key and generate transcripts/summaries from YouTube URLs, media files, or documen...

0
158
18d ago
🎤Speech & Audio

Qwen ASR (C-based Offline)

rightister-qwen-asr
rightister
v1.0.0

Offline Chinese and mixed Chinese-English speech-to-text recognition in pure C without Python or FFmpeg dependencies, suitable for edge devices.

0
26
4d ago

Other 🎙️ Voice & Audio AI Phases

🔊 Text-to-Speech
Convert text to natural-sounding speech with AI voice models and custom voice styles.

📝 Transcription & STT
Transcribe audio and video files to text with speaker labels and timestamps.

🌍 Audio Translation
Translate spoken content across languages: transcribe, translate, and re-synthesize.