ClawHub Skills Lib

🎤 Speech & Audio AI Skills

180 AI agent skills for Speech & Audio. Part of the 🤖 AI & Agents category.

Speech & Audio Skills — Page 4

Faster Whisper Local Service

faster-whisper-local-service
neldar
v0.1.7

OpenClaw local speech-to-text backend using faster-whisper over HTTP on 127.0.0.1:18790. Use when you want voice transcription without external APIs, without...

+9
1
565
today

AI Podcast Pipeline

ai-podcast-pipeline
jeong-wooseok
v0.1.5

Create Korean AI podcast packages from QuickView trend notes. Use for dual-host script writing (Callie × Nick), Gemini multi-speaker TTS audio generation, subtitle timing and render fixes, thumbnail and MP4 packaging, and YouTube title and description output. Supports both full (15-20 min) and compressed (5-7 min) editions.

404
3d ago

Local Whisper

whisper-mlx-local
v1.5.0

Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.

1.4k
7
3d ago

Piper TTS

beware-piper-tts
v1.0.1

Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or...

195
today

Gettr Transcribe

gettr-transcribe
v1.0.1

Download audio from a GETTR post or streaming page and transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT). Use when given a GE...

+1
111
today

Whisnap

whisnap
v1.0.0

macOS CLI for transcribing audio and video files using local Whisper models or Whisnap Cloud.

95
3d ago

PullThatUpJamie

pullthatupjamie
v1.5.2

PullThatUpJamie — Podcast Intelligence. A semantically indexed podcast corpus (109+ feeds, ~7K episodes, ~1.9M paragraphs) that works as a vector DB for podc...

+2
144
1
3d ago

Speech to Text Skill (Yandex SpeechKit) for OpenClaw

sergei-mikhailov-stt
v1.1.2

Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...

173
yesterday

Audio Content Generator

audio-gen
v1.0.0

Generate audiobooks, podcasts, or educational audio content on demand. User provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content, make a podcast, generate an audiobook, or produce educational audio. Returns MP3 audio file via MEDIA token.

1.9k
1
today

Any Whisper API

any-whisper-api
v1.3.0

Transcribe audio through a Whisper-compatible API on any compatible local server.

75
2
today

Speech to Text Transcription

speech-to-text-transcription
v1.0.0

Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.

72
3d ago

Qwen3 TTS MLX

qwen3-tts-mlx
v2.1.0

Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS.

10
yesterday

Audio Transcribe

audio-transcribe
AKTheKnight
v1.0.0

Auto-transcribe voice messages locally using faster-whisper with selectable Whisper models, no API key required.

191
3d ago

SAPI TTS

sapi-tts
v1.1.0

Windows SAPI5 text-to-speech with Neural voices. Lightweight alternative to GPU-heavy TTS - zero GPU usage, instant generation. Auto-detects best available voice for your language. Works on Windows 10/11.

713
3d ago

Whisper STT

openclaw-skill-whisper-stt
v0.1.0

Speech to text: transcribe audio files into text using OpenAI Whisper.

3d ago

Voice Assistant

openclaw-voice-assistant
v1.0.4

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds.

211
3d ago

ElevenLabs AI

elevenlabs-ai
codedao12
v1.0.0

Access ElevenLabs APIs for text-to-speech, speech-to-speech, realtime speech-to-text, voice/model management, and dialogue workflows with direct HTTP calls.

633
yesterday

Podcast Chaptering Highlights

podcast-chaptering-highlights
codedao12
v1.0.0

Create chapters, highlights, and show notes from podcast audio or transcripts. Use when a user wants chapter markers, highlight clips, or show-note drafts without publishing or distribution actions.

580
yesterday

Sound FX

sound-fx
v0.1.1

Generate short sound effects via ElevenLabs SFX (text-to-sound). Use when you need SFX clips like applause, canned laughter, whooshes, ambience, or short stingers, and optionally convert to WhatsApp-friendly .ogg/opus.

890
3d ago

Pocket TTS Complete Documentation

lb-pocket-tts-skill
v0.1.0

Generate speech from text using Kyutai Pocket TTS - lightweight, CPU-friendly, streaming TTS with voice cloning. English only. ~6x real-time on M4 MacBook Air.

+3
545
today

Browser Audio Capture

browser-audio-capture
v1.1.0

Capture audio from any browser tab (meetings, YouTube, podcasts, courses, webinars) and stream it to any AI agent. Requires no API keys and works with any framework.

54
today

Audio

audio
v1.0.1

Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.

483
2
today

AssemblyAI Transcriber

assemblyai-transcriber
v1.1.0

Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.

878
today

Webchat Audio Notifications

webchat-audio-notifications
v1.2.0

Add browser audio notifications to Moltbot/Clawdbot webchat with 5 intensity levels, from whisper to impossible-to-miss. Notifications play only when the tab is backgrounded.

901
3d ago

Data sourced from clawhub.ai · Built with Next.js, Supabase, Prisma