phone-agent
Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.
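The call flow described above (transcribe, generate, speak) can be sketched as a single conversational turn. This is a minimal illustration, not the skill's actual code: `handle_turn` and its three callables are hypothetical names standing in for the Deepgram, LLM, and ElevenLabs stages, and no real API calls are made.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical stage signatures; the real skill's interfaces are not documented here.
Transcriber = Callable[[Iterable[bytes]], str]
Responder = Callable[[str, list], str]
Synthesizer = Callable[[str], Iterator[bytes]]


def handle_turn(
    audio_chunks: Iterable[bytes],
    transcribe: Transcriber,
    generate_reply: Responder,
    synthesize: Synthesizer,
    history: list,
) -> Iterator[bytes]:
    """Run one caller turn and yield synthesized audio chunks as they arrive."""
    # 1. Speech-to-text on the caller's audio.
    text = transcribe(audio_chunks)
    history.append(("caller", text))

    # 2. Ask the language model for a reply, passing the conversation so far.
    reply = generate_reply(text, history)
    history.append(("agent", reply))

    # 3. Stream the reply back as audio instead of waiting for the full clip.
    yield from synthesize(reply)
```

Because `handle_turn` is a generator, nothing runs until the caller iterates over it, which matches the streaming shape of the pipeline: audio can be forwarded to the phone line chunk by chunk.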
Install via ClawdBot CLI:
clawdbot install kesslerio/phone-agent
Grade: Good. Based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Potentially destructive shell commands in tool definitions
exec(
Calls external URL not in known-safe list
https://console.deepgram.com/
Uses known external API (expected, informational)
api.openai.com
AI Analysis
The skill requires users to expose a local server to the internet via ngrok and configure multiple third-party API keys, creating a significant attack surface for potential credential interception or server compromise. While the external API usage (Deepgram, OpenAI, ElevenLabs) is consistent with the stated purpose, the setup process introduces operational security risks if not properly secured.
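One way to narrow that attack surface is to reject webhook requests that were not signed by Twilio. The sketch below follows Twilio's documented scheme for form-encoded webhooks (HMAC-SHA1 over the request URL plus the sorted POST parameters, Base64-encoded, compared against the `X-Twilio-Signature` header). Whether this skill already performs such a check is not stated in the listing, and the function names here are illustrative.

```python
import base64
import hashlib
import hmac


def compute_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Build the signature Twilio is expected to send for this request."""
    # Append each POST parameter name and value, sorted by name, with no separators.
    payload = url + "".join(name + params[name] for name in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(
    auth_token: str, url: str, params: dict[str, str], header_signature: str
) -> bool:
    """Return True only if the header matches the locally computed signature."""
    expected = compute_signature(auth_token, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), header_signature.encode("utf-8")
    )
```

The URL passed in must be the public URL Twilio called (for example the ngrok address), not the local one, or the signatures will not match.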
Generated Mar 1, 2026
Deploy the phone agent to handle routine customer inquiries, such as account balance checks or service status updates, reducing wait times and freeing human agents for complex issues. It can provide 24/7 support in multiple languages by integrating with different ElevenLabs voices.
Use the agent to manage phone-based appointment bookings for clinics or salons, transcribing caller requests and confirming details via LLM-generated responses. It streamlines scheduling without manual input, improving operational efficiency.
Implement the agent to answer inbound sales calls, ask qualifying questions based on a customized system prompt, and log lead information for follow-up. This helps prioritize high-potential leads and reduces sales team workload.
Set up the agent to provide automated updates during crises, such as weather alerts or service disruptions, by delivering pre-programmed or real-time information via TTS. It ensures reliable communication when human operators are overwhelmed.
Replace traditional IVR systems with this AI agent to handle complex menu navigation and natural language queries, offering more intuitive customer interactions. Because callers can state a request in plain language instead of stepping through menus, this may reduce call abandonment.
Offer the phone agent as a cloud-based service with tiered pricing based on call volume or features, such as advanced LLM models or custom voices. Revenue comes from monthly subscriptions, targeting small businesses needing affordable automation.
Provide professional services to customize and deploy the agent for specific industries, including system prompt tuning and API integration. Revenue is generated through one-time project fees and ongoing maintenance contracts.
Monetize the agent by exposing its capabilities via an API, charging per minute of call time or per transaction processed. This model suits developers building voice applications without managing infrastructure.
💬 Integration Tip
Ensure all API keys are securely stored and test the WebSocket connection with Twilio before going live to avoid call drops.
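A startup check along these lines can catch a missing key before the first call arrives. It is a sketch under assumptions: the environment variable names are hypothetical and should be replaced with whatever the skill's configuration actually reads.

```python
from typing import Mapping, Sequence

# Hypothetical variable names; substitute the ones the skill actually uses.
REQUIRED_KEYS = (
    "TWILIO_AUTH_TOKEN",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
)


def missing_keys(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS
) -> list[str]:
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


def require_keys(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS
) -> None:
    """Raise before the server starts if any key is missing."""
    missing = missing_keys(env, required)
    if missing:
        # Report names only; never print the key values themselves.
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
```

Calling `require_keys(os.environ)` at startup keeps the keys out of source files and fails fast rather than dropping a live call.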
Scored Apr 19, 2026
Audited Apr 16, 2026 · audit v1.0
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.