voice-agent
Local Voice Input/Output for Agents using the AI Voice Agent API.
Install via ClawdBot CLI:
clawdbot install ricardotrevisan/voice-agent
Grade: Good, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
https://github.com/ricardotrevisan/ai-conversational-skill
Audited Apr 16, 2026 · audit v1.0
Generated Feb 24, 2026
Enables automated customer support via voice interactions, allowing users to speak queries and receive spoken responses. Ideal for call centers or help desks to handle common inquiries without human agents, improving efficiency and availability.
Facilitates language practice by transcribing student speech and generating audio feedback for pronunciation and conversation. Useful for educational apps or tutoring services to provide immersive, interactive learning experiences.
Allows patients to verbally log symptoms or medication adherence, with the system transcribing and synthesizing reminders or summaries. Supports telehealth platforms by enhancing accessibility for users with mobility or literacy challenges.
Integrates with home automation systems to process voice commands for controlling devices like lights or thermostats, responding with audio confirmations. Enhances user convenience in residential IoT applications by enabling hands-free operation.
Converts text-based content like documents or websites into audio output and transcribes user voice inputs for navigation. Serves assistive technology providers to improve digital accessibility and independence for users with visual impairments.
Offers the voice agent as a cloud service with tiered pricing based on usage volume or features, such as higher-quality TTS or faster transcription. Generates recurring revenue from businesses integrating it into their applications for scalable voice capabilities.
Charges customers per transaction, such as each audio transcription or synthesis request, with volume discounts for high usage. Attracts developers and startups needing flexible, low-cost access without long-term commitments, driving revenue from variable demand.
Licenses the skill to enterprises for customization and branding within their own products, such as call center software or educational tools. Provides upfront licensing fees and ongoing support contracts, targeting large organizations seeking proprietary voice solutions.
💬 Integration Tip
Ensure the local backend API is running on port 8000 and follow the provided documentation for setup; test health checks before deployment to avoid connection issues.
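As a rough illustration of the health check mentioned in the tip, the sketch below probes a backend on port 8000 using only the Python standard library. The function name `backend_is_healthy` and the `/health` path are assumptions made for this example; the listing does not name the actual health endpoint, so check the skill's documentation and adjust the path accordingly.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url="http://localhost:8000", path="/health", timeout=2.0):
    """Return True if the backend answers `path` with a 2xx status.

    NOTE: "/health" is a hypothetical path chosen for illustration;
    replace it with the endpoint given in the skill's documentation.
    """
    url = base_url.rstrip("/") + path
    # Bypass any system proxy settings, since the backend runs locally.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and non-2xx HTTP errors.
        return False
```

Calling `backend_is_healthy()` before starting the agent gives a simple yes/no answer: `False` means the backend is not reachable on port 8000 or the assumed path does not return a success status.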
Scored Apr 16, 2026
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.