senseaudio-asr
Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1...
Install via ClawdBot CLI:
clawdbot install scikkk/senseaudio-asr
Grade: Limited (based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals).
Sends data to undocumented external endpoint (potential exfiltration)
POST → https://api.senseaudio.cn/v1/audio/transcriptions
Calls external URL not in known-safe list
https://senseaudio.cn
Audited Apr 17, 2026 · audit v1.0
Generated Mar 22, 2026
Transcribe recorded customer service calls for quality assurance and training. Use HTTP file transcription with diarization to separate agent and customer speech, enabling analysis of conversation flow and compliance.
Provide real-time transcription and translation for international business meetings via WebSocket ASR. Use the ASR model to support multiple languages and generate live captions, facilitating cross-lingual collaboration.
Transcribe audio content for podcast episodes or video productions with timestamps and sentiment analysis. Use the Pro model for precise diarization to identify different speakers and edit content efficiently.
Transcribe educational lectures for accessibility and study materials. Analyze audio quality first to confirm the recording is clear enough, then use HTTP transcription with the Lite model for cost-effective basic transcription.
Transcribe patient consultations for medical records using diarization to distinguish between healthcare providers and patients. Ensure compliance by handling sensitive audio data securely and querying recognition records for audit trails.
Charge customers based on audio duration or number of requests processed. Offer tiered pricing for different models (e.g., Lite for basic, Pro for advanced features), appealing to startups and enterprises with variable usage needs.
Provide monthly or annual subscriptions with capped usage limits and premium support. Include features like real-time WebSocket ASR and advanced diarization, targeting businesses with consistent transcription demands such as call centers.
License the ASR technology to other companies for integration into their own products, such as video conferencing tools or e-learning platforms. Offer customization options and dedicated API keys, generating revenue through licensing agreements.
💬 Integration Tip
Always validate model-parameter compatibility before sending requests and handle different response formats (JSON, text, SSE) robustly to avoid integration errors.
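The tip above can be sketched in Python as a pre-flight check plus a response dispatcher. This is a minimal illustration, not the SenseAudio API. The model ids (`lite`, `pro`), the feature names, and the capability table are hypothetical placeholders. The mapping of JSON, text, and SSE responses to the `application/json`, `text/*`, and `text/event-stream` content types is likewise an assumption to check against the service's documentation.

```python
import json

# Hypothetical capability table: these model ids and feature names are
# placeholders for illustration, not confirmed SenseAudio values.
MODEL_FEATURES = {
    "lite": {"timestamps"},
    "pro": {"timestamps", "diarization", "sentiment"},
}


def validate_request(model, features):
    """Raise ValueError if the model is unknown or lacks a requested feature."""
    if model not in MODEL_FEATURES:
        raise ValueError(f"unknown model: {model!r}")
    unsupported = set(features) - MODEL_FEATURES[model]
    if unsupported:
        raise ValueError(f"{model!r} does not support: {sorted(unsupported)}")


def parse_sse(body):
    """Return the data payload of each SSE event; events end at a blank line."""
    events, data_lines = [], []
    for line in body.splitlines():
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            events.append("\n".join(data_lines))
            data_lines = []
    # Lenient choice: keep a trailing event that lacks its closing blank line.
    if data_lines:
        events.append("\n".join(data_lines))
    return events


def parse_response(content_type, body):
    """Dispatch on the Content-Type header; return a dict with kind and payload."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return {"kind": "json", "payload": json.loads(body)}
    if media_type == "text/event-stream":
        return {"kind": "sse", "payload": parse_sse(body)}
    if media_type.startswith("text/"):
        return {"kind": "text", "payload": body}
    raise ValueError(f"unexpected content type: {content_type!r}")
```

Calling `validate_request("lite", ["diarization"])` raises before any network traffic happens, and `parse_response` fails loudly on an unrecognized content type instead of guessing, so a mismatch surfaces as one clear error rather than a downstream parsing bug.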
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.