🎤 Speech & Audio AI Skills

🎤Speech & Audio

Faster Whisper Local Service

faster-whisper-local-service

neldar

v0.1.7

OpenClaw local speech-to-text backend using faster-whisper over HTTP on 127.0.0.1:18790. Use when you want voice transcription without external APIs, without...

565

today

🎤Speech & Audio

Ai Podcast Pipeline

ai-podcast-pipeline

jeong-wooseok

v0.1.5

Create Korean AI podcast packages from QuickView trend notes. Use for dual-host script writing (Callie × Nick), Gemini multi-speaker TTS audio generation, subtitle timing/render fixes, thumbnail and MP4 packaging, and YouTube title/description output. Supports both full (15-20 min) and compressed (5-7 min) editions.

404

3d ago

🎤Speech & Audio

Local Whisper

whisper-mlx-local

v1.5.0

Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.

1.4k

3d ago

🎤Speech & Audio

Piper TTS

beware-piper-tts

v1.0.1

Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or...

195

today

🎤Speech & Audio

Gettr Transcribe

gettr-transcribe

v1.0.1

Download audio from a GETTR post or streaming page and transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT). Use when given a GE...

111

today

🎤Speech & Audio

Whisnap

whisnap

v1.0.0

macOS CLI for transcribing audio and video files using local Whisper models or Whisnap Cloud.

3d ago

🎤Speech & Audio

Pullthatupjamie

pullthatupjamie

v1.5.2

PullThatUpJamie — Podcast Intelligence. A semantically indexed podcast corpus (109+ feeds, ~7K episodes, ~1.9M paragraphs) that works as a vector DB for podc...

144

3d ago

🎤Speech & Audio

Speech to Text Skill (Yandex SpeechKit) for OpenClaw

sergei-mikhailov-stt

v1.1.2

Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...

173

yesterday

🎤Speech & Audio

Audio Content Generator

audio-gen

v1.0.0

Generate audiobooks, podcasts, or educational audio content on demand. User provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content, make a podcast, generate an audiobook, or produce educational audio. Returns MP3 audio file via MEDIA token.

1.9k

today

🎤Speech & Audio

ANY WHISPER API

any-whisper-api

v1.3.0

Transcribe audio through the Whisper API using any compatible local server.

today

🎤Speech & Audio

Speech to Text Transcription

speech-to-text-transcription

v1.0.0

Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.

3d ago

🎤Speech & Audio

Qwen3 Tts Mlx

qwen3-tts-mlx

v2.1.0

Local Qwen3-TTS speech synthesis on Apple Silicon via MLX. Use for offline narration, audiobooks, video voiceovers, and multilingual TTS.

yesterday

🎤Speech & Audio

Audio Transcribe

audio-transcribe

AKTheKnight

v1.0.0

Auto-transcribe voice messages locally using faster-whisper with selectable Whisper models, no API key required.

191

3d ago

🎤Speech & Audio

Sapi Tts

sapi-tts

v1.1.0

Windows SAPI5 text-to-speech with Neural voices. Lightweight alternative to GPU-heavy TTS - zero GPU usage, instant generation. Auto-detects best available voice for your language. Works on Windows 10/11.

713

3d ago

🎤Speech & Audio

Whisper Stt

openclaw-skill-whisper-stt

v0.1.0

Speech to text: transcribe audio files to text using OpenAI Whisper.

3d ago

🎤Speech & Audio

Voice Assistant

openclaw-voice-assistant

v1.0.4

Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds. Supports multi-turn conversation with automatic follow-up listening, mic suppression to prevent feedback, and a system tray with pause/resume. Recommended voices: Matilda (XrExE9yKIg1WjnnlVkGX, free tier) or Ivy (MClEFoImJXBTgLwdLI5n, paid tier). Fully customizable wake word, voice, hotkey, and silence thresholds.

211

3d ago

🎤Speech & Audio

Elevenlabs AI

elevenlabs-ai

codedao12

v1.0.0

Access ElevenLabs APIs for text-to-speech, speech-to-speech, realtime speech-to-text, voice/model management, and dialogue workflows with direct HTTP calls.

633

yesterday

🎤Speech & Audio

Podcast Chaptering Highlights

podcast-chaptering-highlights

codedao12

v1.0.0

Create chapters, highlights, and show notes from podcast audio or transcripts. Use when a user wants chapter markers, highlight clips, or show-note drafts without publishing or distribution actions.

580

yesterday

🎤Speech & Audio

Sound FX

sound-fx

v0.1.1

Generate short sound effects via ElevenLabs SFX (text-to-sound). Use when you need SFX clips like applause, canned laughter, whooshes, ambience, or short stingers, and optionally convert to WhatsApp-friendly .ogg/opus.

890

3d ago

🎤Speech & Audio

Pocket TTS Complete Documentation

lb-pocket-tts-skill

v0.1.0

Generate speech from text using Kyutai Pocket TTS - lightweight, CPU-friendly, streaming TTS with voice cloning. English only. ~6x real-time on M4 MacBook Air.

545

today

🎤Speech & Audio

Browser Audio Capture

browser-audio-capture

v1.1.0

Capture audio from any browser tab — meetings, YouTube, podcasts, courses, webinars — and stream to any AI agent. Zero API keys, works with any framework.

today

🎤Speech & Audio

Audio

audio

v1.0.1

Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.

483

today

🎤Speech & Audio

AssemblyAI Transcriber

assemblyai-transcriber

v1.1.0

Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.

878

today

🎤Speech & Audio

Webchat Audio Notifications

webchat-audio-notifications

v1.2.0

Add browser audio notifications to Moltbot/Clawdbot webchat with 5 intensity levels - from whisper to impossible-to-miss (only when tab is backgrounded).

901

3d ago

Speech & Audio Skills — Page 4
