Audio Processing AI Agent Skills — Voice & Audio AI | ClawHub

🎤Speech & Audio

macOS Local Voice

macos-local-voice

strrl

v1.0.0

Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice quality detection and smart voice selection.

105

2.8k

1mo ago

🎤Speech & Audio

Audio

audio

ivangdavila

v1.0.1

Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.

2.5k

1mo ago

🎤Speech & Audio

Voice Message

voice-message

xmanrui

v1.0.4

Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...

1.8k

1mo ago

🎤Speech & Audio

Audio Mastering CLI

audio-mastering-cli

alesys

v1.0.2

CLI audio mastering without a reference track using ffmpeg; accepts audio or video inputs and outputs mastered WAV/MP3 or remuxed MP4.

1.2k

1mo ago

🎤Speech & Audio

feishu-audio

feishu-audio

tianyn1990

v1.0.1

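Converts audio files into voice messages that can be played in Feishu. The audio is first converted to Opus with ffmpeg, then uploaded to Feishu, and finally sent as an audio message. Use when the user wants to receive a playable voice message in Feishu.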

1.1k

1mo ago

🎤Speech & Audio

ElevenLabs Voices

elevenlabs-voices

robbyczgw-cla

v2.1.6

High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.

1mo ago

🎤Speech & Audio

Byted Mediakit Voiceover Editing

byted-mediakit-voiceover-editing

volc-ai-mediakit

v1.0.8

Volcano Engine AI MediaKit talking-head video editing Skill: a one-stop workflow from environment setup through media management, audio processing, talking-h...

690

2mo ago

🎤Speech & Audio

Vocal Chat

vocal-chat

rubenfb23

v1.0.0

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

3mo ago

🎤Speech & Audio

ElevenLabs Music

elevenlabs-music

clawdbotborges

v1.0.1

Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated lyrics, instrumental tracks, and multiple genres/styles. Requires paid ElevenLabs plan.

3.8k

3mo ago

🎤Speech & Audio

speech-translation

speech-translation

decin

v1.0.0

Build, adapt, or run an audio-processing workflow that takes spoken audio, transcribes it with Whisper or faster-whisper, translates the transcript using the...

544

1mo ago

🎤Speech & Audio

Bilibili Audio Transcribe

bilibili-audio-transcribe

yizh4ng

v0.1.0

Download audio from Bilibili or b23.tv links and transcribe it into txt, srt, and segment JSON with yt-dlp, ffmpeg, and faster-whisper. Use when a user asks...

486

1mo ago

🎤Speech & Audio

Virtual voice builder

virtual-voice-ai

suhas12345685-pro

v1.0.0

Wires a real microphone through an AI brain (STT → LLM → TTS) and routes the output to a virtual audio cable so apps like Google Meet hear the processed voic...

473

1mo ago

🎤Speech & Audio

Audio Command Executor

audio-command-executor

sirkovz

v1.0.1

Processes inbound audio files, transcribes them, and responds to the resulting text. Converts non-WAV inputs to WAV before transcription.

402

1mo ago

🎤Speech & Audio

Multimodal Base

yuyonghao-multimodal-base

yuyonghao-123

v0.1.0

Supports image understanding, OCR, speech-to-text, and text-to-speech synthesis with multi-voice and multimodal unified processing using OpenAI and Edge TTS.

400

2mo ago

🎤Speech & Audio

Whisper Transcribe

whisper-transcribe

JosunLP

v1.0.0

Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.

1.8k

3mo ago

🎤Speech & Audio

ElevenLabs

elevenlabs-api

byungkyu

v1.0.0

ElevenLabs API integration with managed authentication. AI-powered text-to-speech, voice cloning, sound effects, and audio processing. Use this skill when users want to generate speech from text, clone voices, create sound effects, or process audio. For other third party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).

2.5k

3mo ago

🎤Speech & Audio

AssemblyAI advanced speech transcription

assemblyai-transcribe

tristanmanchester

v1.0.1

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

3.5k

1mo ago

🎤Speech & Audio

Podcast Generation from PDF, Text, and Links

ai-podcast

mogens9

v1.0.11

Generate AI podcast episodes from PDFs, text, notes, and links using MagicPodcast in OpenClaw. Creates natural two-person dialogue audio, supports custom lan...

2.5k

3mo ago

🎤Speech & Audio

Parakeet Stt

parakeet-stt

carlulsoe

v1.1.0

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

2.8k

1mo ago

🎤Speech & Audio

Telegram Voice To Voice Macos

telegram-voice-to-voice-macos

fiberian1981

v0.1.3

Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.

2.2k

1mo ago

🎤Speech & Audio

Music Generation

music-gen

scikkk

v1.0.0

SenseAudio Music Generation API for creating AI-generated lyrics and songs. Supports lyrics generation, song generation with style/vocal control, and async t...

615

1mo ago

🎤Speech & Audio

Evolink Music — AI Music Generation (Suno v4/v4.5/v5)

evolink-music

EvoLinkAI

v2.0.0

AI music generation with Suno v4, v4.5, v5. Text-to-music, custom lyrics, instrumental, vocal control. 5 models, one API key.

1.2k

1mo ago

🎤Speech & Audio

tencent-tts-podcast

tencent-tts-podcast

islinxu

v1.0.0

Convert text to podcast audio using Tencent Cloud TTS. Supports both short and long text processing, generates up to 30-minute long audio with automatic chun...

840

1mo ago

🎤Speech & Audio

AudioPod

audiopod

Rakesh1002

v1.2.3

Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API_KEY env var or pass api_key directly.

3.6k

3mo ago
