discord-voice
Real-time voice conversations in Discord voice channels with Claude AI
Install via ClawdBot CLI:
clawdbot install avatarneil/discord-voice
Real-time voice conversations in Discord voice channels. Join a voice channel, speak, and have your words transcribed, processed by Claude, and spoken back.
- ffmpeg (audio processing)
- @discordjs/opus and sodium-native
# Ubuntu/Debian
sudo apt-get install ffmpeg build-essential python3
# Fedora/RHEL
sudo dnf install ffmpeg gcc-c++ make python3
# macOS
brew install ffmpeg
clawdhub install discord-voice
Or manually:
cd ~/.clawdbot/extensions
git clone <repository-url> discord-voice
cd discord-voice
npm install
{
  plugins: {
    entries: {
      "discord-voice": {
        enabled: true,
        config: {
          sttProvider: "local-whisper",
          ttsProvider: "openai",
          ttsVoice: "nova",
          vadSensitivity: "medium",
          allowedUsers: [], // Empty = allow all users
          silenceThresholdMs: 1500,
          maxRecordingMs: 30000,
          openai: {
            apiKey: "sk-...", // Or use OPENAI_API_KEY env var
          },
        },
      },
    },
  },
}
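The plugin config can be modeled as a typed object with defaults filled in. This is a minimal TypeScript sketch, not the plugin's actual code: `DiscordVoiceConfig`, `withDefaults`, and `isUserAllowed` are hypothetical names, and the default values are copied from the options table in this README.

```typescript
// Hypothetical config shape; field names and defaults are taken from this README.
type SttProvider = "whisper" | "deepgram" | "local-whisper";
type TtsProvider = "openai" | "elevenlabs";
type VadSensitivity = "low" | "medium" | "high";

interface DiscordVoiceConfig {
  sttProvider: SttProvider;
  streamingSTT: boolean;
  ttsProvider: TtsProvider;
  ttsVoice: string;
  vadSensitivity: VadSensitivity;
  bargeIn: boolean;
  allowedUsers: string[];
  silenceThresholdMs: number;
  maxRecordingMs: number;
  heartbeatIntervalMs: number;
  autoJoinChannel?: string;
}

const DEFAULTS: DiscordVoiceConfig = {
  sttProvider: "local-whisper",
  streamingSTT: true,
  ttsProvider: "openai",
  ttsVoice: "nova",
  vadSensitivity: "medium",
  bargeIn: true,
  allowedUsers: [],
  silenceThresholdMs: 1500,
  maxRecordingMs: 30000,
  heartbeatIntervalMs: 30000,
};

// Fill in any option the user left out.
function withDefaults(partial: Partial<DiscordVoiceConfig>): DiscordVoiceConfig {
  return { ...DEFAULTS, ...partial };
}

// An empty allowedUsers list means every user is allowed.
function isUserAllowed(config: DiscordVoiceConfig, userId: string): boolean {
  return config.allowedUsers.length === 0 || config.allowedUsers.includes(userId);
}
```

For example, `withDefaults({ allowedUsers: ["111"] })` keeps the default `ttsVoice` of `"nova"` while restricting access to user `111`.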
Ensure your Discord bot has these permissions:
Add these to your bot's OAuth2 URL or configure in Discord Developer Portal.
| Option | Type | Default | Description |
| --------------------- | -------- | ----------------- | ----------------------------------------------- |
| enabled | boolean | true | Enable/disable the plugin |
| sttProvider | string | "local-whisper" | "whisper", "deepgram", or "local-whisper" |
| streamingSTT | boolean | true | Use streaming STT (Deepgram only, ~1s faster) |
| ttsProvider | string | "openai" | "openai" or "elevenlabs" |
| ttsVoice | string | "nova" | Voice ID for TTS |
| vadSensitivity | string | "medium" | "low", "medium", or "high" |
| bargeIn | boolean | true | Stop speaking when user talks |
| allowedUsers | string[] | [] | User IDs allowed (empty = all) |
| silenceThresholdMs | number | 1500 | Silence before processing (ms) |
| maxRecordingMs | number | 30000 | Max recording length (ms) |
| heartbeatIntervalMs | number | 30000 | Connection health check interval |
| autoJoinChannel | string | undefined | Channel ID to auto-join on startup |
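One plausible reading of `silenceThresholdMs` and `maxRecordingMs` is sketched below: a recording is finalized either when the speaker has been silent for the threshold or when the recording reaches the maximum length. The function name, the `StopReason` type, and the order of the two checks are assumptions for illustration; the plugin's real segmentation logic is not shown in this README.

```typescript
interface RecordingLimits {
  silenceThresholdMs: number;
  maxRecordingMs: number;
}

type StopReason = "silence" | "max-length" | null;

// All timestamps are milliseconds on the same clock.
// Returns why the recording should stop, or null to keep recording.
function shouldStopRecording(
  limits: RecordingLimits,
  startedAt: number,
  lastSpeechAt: number,
  now: number,
): StopReason {
  if (now - startedAt >= limits.maxRecordingMs) return "max-length";
  if (now - lastSpeechAt >= limits.silenceThresholdMs) return "silence";
  return null;
}
```

Raising `silenceThresholdMs` tolerates longer pauses mid-sentence at the cost of a slower response; lowering it does the opposite.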
{
  openai: {
    apiKey: "sk-...",
    whisperModel: "whisper-1",
    ttsModel: "tts-1",
  },
}
{
  elevenlabs: {
    apiKey: "...",
    voiceId: "21m00Tcm4TlvDq8ikWAM", // Rachel
    modelId: "eleven_multilingual_v2",
  },
}
{
  deepgram: {
    apiKey: "...",
    model: "nova-2",
  },
}
Once registered with Discord, use these commands:
- /discord_voice join - Join a voice channel
- /discord_voice leave - Leave the current voice channel
- /discord_voice status - Show voice connection status
# Join a voice channel
clawdbot discord_voice join <channelId>
# Leave voice
clawdbot discord_voice leave --guild <guildId>
# Check status
clawdbot discord_voice status
The agent can use the discord_voice tool:
Join voice channel 1234567890
The tool supports actions:
- join - Join a voice channel (requires channelId)
- leave - Leave voice channel
- speak - Speak text in the voice channel
- status - Get current voice status
When using Deepgram as your STT provider, streaming mode is enabled by default. The options table lists it as roughly 1 second faster.
To use streaming STT:
{
  sttProvider: "deepgram",
  streamingSTT: true, // default
  deepgram: {
    apiKey: "...",
    model: "nova-2",
  },
}
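Because the options table marks `streamingSTT` as Deepgram-only, the flag presumably has no effect with the other STT providers. A small sketch of that rule, where `usesStreamingStt` is a hypothetical helper and not part of the plugin:

```typescript
// Streaming is only in effect when the provider is Deepgram and the flag is on.
// The flag defaults to true, per the options table in this README.
function usesStreamingStt(config: { sttProvider: string; streamingSTT?: boolean }): boolean {
  const streaming = config.streamingSTT ?? true;
  return config.sttProvider === "deepgram" && streaming;
}
```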
When enabled (default), the bot will immediately stop speaking if a user starts talking. This creates a more natural conversational flow where you can interrupt the bot.
To disable (let the bot finish speaking):
{
  bargeIn: false,
}
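The barge-in behavior can be pictured as a small controller that stops playback when a user starts talking. This is an illustrative sketch only: the `Playback` interface and `BargeInController` class are hypothetical and do not correspond to the plugin's or discord.js's actual API.

```typescript
// Hypothetical minimal playback interface, for illustration only.
interface Playback {
  isPlaying(): boolean;
  stop(): void;
}

class BargeInController {
  private readonly playback: Playback;
  private readonly bargeIn: boolean;

  constructor(playback: Playback, bargeIn: boolean = true) {
    this.playback = playback;
    this.bargeIn = bargeIn;
  }

  // Call when voice activity from a user is detected.
  // Returns true if the bot's speech was interrupted.
  onUserSpeechStart(): boolean {
    if (this.bargeIn && this.playback.isPlaying()) {
      this.playback.stop();
      return true;
    }
    return false;
  }
}
```

With `bargeIn` set to `false`, `onUserSpeechStart` never stops playback, so the bot finishes its sentence.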
The plugin includes automatic connection health monitoring, with a health check every heartbeatIntervalMs (default 30000 ms).
If the connection drops, you'll see logs like:
[discord-voice] Disconnected from voice channel
[discord-voice] Reconnection attempt 1/3
[discord-voice] Reconnected successfully
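The log lines above suggest a bounded retry loop. The sketch below shows one way such a loop could look. It is not the plugin's implementation: `reconnectWithRetries` and its parameters are hypothetical, the limit of 3 attempts is inferred from the `1/3` in the sample log, and the linear backoff delay is an assumption.

```typescript
// `connect` is a caller-supplied function that resolves on success and rejects on failure.
async function reconnectWithRetries(
  connect: () => Promise<void>,
  maxAttempts: number = 3,
  baseDelayMs: number = 1000,
  log: (line: string) => void = console.log,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    log(`[discord-voice] Reconnection attempt ${attempt}/${maxAttempts}`);
    try {
      await connect();
      log("[discord-voice] Reconnected successfully");
      return true;
    } catch {
      // Wait a little longer after each failure (linear backoff) before retrying.
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  return false; // All attempts failed; the caller decides what to do next.
}
```

The function returns `false` instead of throwing once all attempts fail, so the caller can choose between leaving the channel and surfacing an error.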
Ensure the Discord channel is configured and the bot is connected before using voice.
Install build tools:
npm install -g node-gyp
npm rebuild @discordjs/opus sodium-native
DEBUG=discord-voice clawdbot gateway start
| Variable | Description |
| -------------------- | ------------------------------ |
| DISCORD_TOKEN | Discord bot token (required) |
| OPENAI_API_KEY | OpenAI API key (Whisper + TTS) |
| ELEVENLABS_API_KEY | ElevenLabs API key |
| DEEPGRAM_API_KEY | Deepgram API key |
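The config comment "Or use OPENAI_API_KEY env var" implies that a key can come from either the plugin config or the environment. The sketch below shows one way to resolve it. `resolveApiKey` is a hypothetical helper, and two things are assumptions: that the config value takes precedence over the environment variable, and that ElevenLabs and Deepgram follow the same pattern as OpenAI.

```typescript
type Provider = "openai" | "elevenlabs" | "deepgram";

// Environment variable names as listed in the table above.
const ENV_VAR_FOR: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  elevenlabs: "ELEVENLABS_API_KEY",
  deepgram: "DEEPGRAM_API_KEY",
};

type ProviderConfig = Partial<Record<Provider, { apiKey?: string }>>;
type Env = Record<string, string | undefined>;

// Assumed precedence: explicit config value first, then the environment variable.
function resolveApiKey(provider: Provider, config: ProviderConfig, env: Env): string {
  const key = config[provider]?.apiKey ?? env[ENV_VAR_FOR[provider]];
  if (!key) {
    throw new Error(
      `Missing API key for ${provider}: set ${provider}.apiKey or ${ENV_VAR_FOR[provider]}`,
    );
  }
  return key;
}
```

In a Node.js process, `process.env` can be passed as the `env` argument.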
MIT
Generated Mar 1, 2026
Integrates with Discord voice channels to provide real-time transcription and AI-driven summaries during team meetings. It can answer questions based on the conversation and generate action items, enhancing productivity for remote teams.
Uses speech-to-text and text-to-speech to simulate conversations in Discord voice channels for language practice. Learners can speak in a target language, receive corrections, and engage in interactive dialogues with Claude AI.
Deploys in Discord communities to handle voice-based customer inquiries in real-time. It transcribes user questions, processes them with Claude AI for responses, and speaks back solutions, reducing wait times for support.
Joins Discord voice channels during gaming sessions to provide live assistance, such as strategy tips, lore explanations, or social moderation. It uses barge-in to respond quickly without interrupting gameplay.
Enables visually impaired users to participate in Discord voice chats by transcribing spoken content into text and reading back responses aloud. It offers customizable TTS voices and low-latency streaming for seamless interaction.
Offers basic voice conversation features for free with limited API calls, while charging for advanced features like custom TTS voices, higher transcription accuracy, or priority support. Targets small Discord communities and indie developers.
Provides customized deployments for large organizations, including on-premise hosting, enhanced security, and integration with existing CRM or project management tools. Includes dedicated support and SLAs for reliability.
Charges based on usage metrics such as minutes of transcription, TTS characters processed, or number of voice channel interactions. Appeals to developers building scalable applications with variable demand.
💬 Integration Tip
Ensure system dependencies like ffmpeg are installed and test with different STT/TTS providers to balance cost and latency based on your use case.