ClawHub Skills Lib

Browse 50,000+ community-built AI agent skills for OpenClaw. Updated daily from clawhub.ai.

Explore

  • Home
  • Categories
  • Use Cases
  • Trending
  • Blog

Categories

  • Development
  • AI & Agents
  • Productivity
  • Communication
  • Data & Research
  • Business
  • Platforms
  • Lifestyle
  • Education
  • Design

Use Cases

  • AI Code Generation
  • Code Review & Testing
  • DevOps & Cloud
  • Security & Compliance
  • Build an AI Agent
  • Agent Memory & RAG
  • Multi-Agent Orchestration
  • Browser & Web Automation
  • Financial & Market Data
  • Crypto & Web3
  • Real-Time Web Search
  • News & Media Monitoring
  • Academic Research
  • Data & Analytics
  • AI Image Generation
  • Voice & Audio AI
  • AI Video Creation
  • Content Writing
  • Task & Project Management
  • Knowledge Management
  • Email & Messaging
  • SEO & Content Marketing
  • Sales & CRM
  • Workflow Automation
  • Social Media
  • Chinese Platforms
  • E-Commerce
  • Education & Tutoring
  • HR & Recruiting
  • Legal & Compliance
© 2026 ClawHub Skills Lib. All rights reserved. Built with Next.js · Neon · Prisma

🎙️ Voice & Audio AI Agent Skills

Convert text to natural speech, transcribe audio to text, and translate spoken content with AI.

These skills handle the full audio pipeline — synthesizing voices with ElevenLabs and OpenAI TTS, transcribing recordings with Whisper, translating spoken content across languages, and processing audio files. Used by podcasters, developers, and accessibility teams.

459 skills · 4 types

Browse by Type

Convert text to natural-sounding speech with AI voice models and custom voice styles.

🔧speech-audio

Openai Whisper

openai-whisper
steipete
v1.0.0

Local speech-to-text with the Whisper CLI (no API key).

2.1k
84k
321
3mo ago
🔧speech-audio

Sag

sag
steipete
v1.0.0

ElevenLabs text-to-speech with mac-style say UX.

1.3k
26.7k
26
3mo ago
🔧speech-audio

Edge TTS

edge-tts
i3130002
v2.0.0

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

267
20.6k
31
3mo ago
🔧speech-audio

Sherpa ONNX TTS

sherpa-onnx-tts
danielsinewe
v0.1.0

Local text-to-speech via sherpa-onnx (offline, no cloud)

+3
137
3.9k
1
1mo ago
🔧speech-audio

Voice Call

voice-call
danielsinewe
v0.1.0

Start voice calls via the OpenClaw voice-call plugin.

+3
134
3.8k
2
1mo ago
🔧speech-audio

macOS Local Voice

macos-local-voice
strrl
v1.0.0

Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice quality detection and smart voice selection.

105
2.8k
1
1mo ago
Browse all 653 Text-to-Speech skills →

Quick install for the most popular Voice & Audio AI skill:

clawdbot install steipete/openai-whisper

All Voice & Audio AI Skills


459 skills found

Page 1 of 20

🎤Speech & Audio

Openai Whisper

openai-whisper
steipete
v1.0.0

Local speech-to-text with the Whisper CLI (no API key).

2.1k
84k
321
3mo ago
🎤Speech & Audio

Sag

sag
steipete
v1.0.0

ElevenLabs text-to-speech with mac-style say UX.

1.3k
26.7k
26
3mo ago
🎤Speech & Audio

Edge TTS

edge-tts
i3130002
v2.0.0

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

267
20.6k
31
3mo ago
🎤Speech & Audio

Sherpa ONNX TTS

sherpa-onnx-tts
danielsinewe
v0.1.0

Local text-to-speech via sherpa-onnx (offline, no cloud)

+3
137
3.9k
1
1mo ago
🎤Speech & Audio

Voice Call

voice-call
danielsinewe
v0.1.0

Start voice calls via the OpenClaw voice-call plugin.

+3
134
3.8k
2
1mo ago
🇨🇳Chinese Platforms

Aliyun Asr

aliyun-asr
jixsonwang
v1.0.10

Pure Aliyun ASR skill for voice message transcription, supports multiple channels including Feishu

+2
111
2.9k
2
1mo ago
🎤Speech & Audio

Local Whisper

local-whisper
araa47
v1.0.0

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.

79
12.2k
12
3mo ago
🎤Speech & Audio

Faster Whisper Transcription

faster-whisper-transcribe
kalmuraee
v1.0.0

Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.

66
1.8k
1mo ago
🎤Speech & Audio

OpenAI TTS

openai-tts
pors
v1.0.0

Text-to-speech via OpenAI Audio Speech API.

57
7.2k
6
3mo ago
🎤Speech & Audio

Openai Whisper 1.0.0

openai-whisper-1-0-0
czubi1928
v1.0.0

Local speech-to-text with the Whisper CLI (no API key).

52
1.3k
3mo ago
🤖Agent Frameworks

Pixel Lobster Skill

pixel-lobster
joeproai
v1.2.1

Pixel art desktop lobster that lip-syncs to OpenClaw TTS speech. Use when: (1) user wants a visual avatar for their AI agent, (2) user wants a desktop overla...

+2
51
1.4k
1mo ago
📱Social Media

YouTube Shorts Auto-Generation

youtube-shorts
kangjjang
v2.0.1

Automatic generation of AI/DevOps YouTube Shorts. Trend collection → script → images → Veo video → TTS narration → Remotion compositing → YouTube upload.

+6
51
1.4k
1mo ago
🎤Speech & Audio

SiliconFlow TTS Gen

siliconflow-tts-gen
lilei0311
v1.0.0

Text-to-Speech using SiliconFlow API (CosyVoice2). Supports multiple voices, languages, and dialects.

+1
51
1.4k
1mo ago
🎤Speech & Audio

Mac TTS

mac-tts
kalijason
v1.0.0

Text-to-speech using macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages including Chinese (Mandarin), English, Japanese, etc.

49
6.8k
2
3mo ago
💰Finance & Accounting

Invoice Engine

afrexai-invoice-engine
1kalin
v1.0.0

Generate, manage, and track professional invoices with client onboarding, customizable payment terms, recurring billing, automated overdue reminders, and fin...

+1
47
1.3k
1mo ago
🎤Speech & Audio

Tts

tts
AMSTKO
v1.0.0

Convert text to speech using Hume AI (or OpenAI) API. Use when the user asks for an audio message, a voice reply, or to hear something "of vive voix".

46
4.4k
1
3mo ago
🎥Video Processing Tools

Video News Downloader

video-news-downloader
cyberpsychosissss
v1.0.0

Automated daily news video downloader with AI subtitle proofreading. Downloads CBS Evening News and BBC News at Ten from YouTube, extracts and proofreads sub...

+5
42
1.1k
1
1mo ago
🎤Speech & Audio

feishu-audio

feishu-audio
tianyn1990
v1.0.1

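Converts audio files into voice messages that can be played in Feishu. It first converts the file to Opus format with ffmpeg, then uploads it to Feishu, and finally sends an audio message. Suited to cases where the user wants to receive a playable voice message in Feishu.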

41
1.1k
1
1mo ago
🎤Speech & Audio

Voice Transcribe

voice-transcribe
darinkishore
v1.0.1

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

40
6.3k
13
1mo ago
🎤Speech & Audio

Qwen3-tts

qwen-tts
paki81
v1.0.0

Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium speaker voices, and instruction-based voice control (emotion, tone, style). Alternative to cloud-based TTS services like ElevenLabs. Runs entirely offline after initial model download.

38
3.8k
9
1mo ago
🎤Speech & Audio

Audio Cog

audio-cog
nitishgargiitd
v1.0.12

AI audio generation and text-to-speech powered by CellCog. Voiceover, narration, voice cloning, avatar voices, sound effects, music, podcasts, dialogue. Thre...

38
5.8k
4
1mo ago
🎤Speech & Audio

TTS AutoPlay with Wake Word

tts-autoplay
wangzjhz
v2.0.1

Auto-play TTS voice files with wake word detection. Only plays audio when user message contains wake words like "语音", "念出来", "voice", etc. Perfect for Webcha...

+5
37
986
1mo ago
🎤Speech & Audio

Kokoro TTS

kokoro-tts
edkief
v0.1.0

Generate spoken audio from text using the local Kokoro TTS engine. Use when the user asks to "say" something, requests a voice message, or wants text converted to speech.

36
7k
1
3mo ago
🎤Speech & Audio

Voice Reply

voice-reply
stolot0mt0m
v1.0.0

Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.

35
4.6k
6
1mo ago
…

Frequently Asked Questions

Which TTS providers do these skills support?

Skills integrate with ElevenLabs, OpenAI TTS, Murf, Play.ht, Azure Cognitive Services, and Google Cloud TTS — covering natural voices, voice cloning, and custom voice styles.

Can these skills transcribe a 2-hour audio file accurately?

Yes. Whisper-based skills handle long-form audio by chunking and processing in segments, with speaker diarization and timestamp output. Cloud-based skills handle files up to several hours.
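
To make the chunking idea concrete, here is a minimal Python sketch of how a long recording might be split into overlapping windows, and how per-chunk timestamps could be mapped back onto the full recording. It is an illustration only, not the implementation used by any skill listed here. The names `Chunk`, `plan_chunks`, and `to_global`, as well as the window length and overlap values, are assumptions chosen for the example. It performs no transcription or speaker diarization.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    index: int
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def plan_chunks(duration: float, chunk_len: float = 30.0, overlap: float = 2.0) -> list[Chunk]:
    """Split [0, duration] into windows of at most chunk_len seconds.

    Consecutive windows share `overlap` seconds so that words cut at a
    boundary appear whole in at least one window. (Hypothetical helper.)
    """
    if duration <= 0:
        return []
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be non-negative and smaller than chunk_len")
    step = chunk_len - overlap
    chunks: list[Chunk] = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        chunks.append(Chunk(len(chunks), start, end))
        if end >= duration:
            break  # the last window already reaches the end of the audio
        start += step
    return chunks


def to_global(chunk: Chunk, local_segments):
    """Shift (start, end, text) tuples from chunk-relative to recording-relative time."""
    return [(chunk.start + s, chunk.start + e, text) for s, e, text in local_segments]


if __name__ == "__main__":
    # A 2-hour file (7200 s) cut into 10-minute windows with 10 s of overlap.
    for c in plan_chunks(7200, chunk_len=600, overlap=10):
        print(c.index, c.start, c.end)
```

With these example settings, a 2-hour file produces 13 windows. Each window would be passed to whichever transcription backend a skill uses, and `to_global` then places the returned segment timestamps on the timeline of the original recording. Deduplicating text that falls inside the overlap is left out of this sketch.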

Related Use Cases

🖼️
AI Image Generation
Generate, edit, and transform images with AI — from DALL-E and Stable Diffusion to FLUX and Midjourney.
🎬
AI Video Creation
Generate, edit, and automate video content with AI — from scripts and avatars to clips and captions.