zhipu-asr
Automatic Speech Recognition (ASR) using Zhipu AI (BigModel) GLM-ASR model. Use when you need to transcribe audio files to text. Supports Chinese audio trans...
Install via ClawdBot CLI:
clawdbot install franklu0819-lang/zhipu-asr
Grade: Good, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list:
https://bigmodel.cn/usercenter/proj-mgmt/apikeys
Audited Apr 17, 2026 · audit v1.0
Generated Mar 1, 2026
Transcribe business meetings or conference calls in Chinese, using context prompts to link multiple audio segments for continuity. Ideal for capturing discussions on project updates, decisions, and action items in corporate settings.
Convert Chinese-language lectures, seminars, or online course audio into text, with hotwords for technical terms or names to improve accuracy. Useful for creating study materials or subtitles in education and training.
Transcribe patient-doctor conversations in Chinese, leveraging hotwords for medical terminology like symptoms or treatments to ensure precise documentation. Supports compliance and record-keeping in healthcare.
Process audio from customer support calls in Chinese, using context prompts to maintain conversation flow and hotwords for product names or issues. Helps in quality assurance and feedback analysis for service improvement.
Transcribe Chinese audio from podcasts, interviews, or broadcasts, with hotwords for names and brands to enhance transcription quality. Facilitates content creation, subtitling, and archiving in media industries.
Offer tiered subscription plans for developers or businesses to access the ASR service via API, with limits on requests or features like hotwords. Revenue comes from monthly or annual fees based on usage volume.
Charge users per audio file or minute transcribed, with optional add-ons for advanced features like context prompts or bulk processing. Targets occasional users or small businesses needing flexible pricing.
License the ASR technology to large organizations for integration into their internal systems, such as call centers or compliance tools, with customization and support. Revenue is generated through licensing fees and service contracts.
💬 Integration Tip
Set ZHIPU_API_KEY in the environment before running the skill, then call the provided shell script with the path to an existing audio file.
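The two prerequisites in the tip (an API key in the environment and a valid audio path) can be checked before the skill is invoked. The following is a minimal sketch, not part of the skill itself: the function name `check_asr_prerequisites` is hypothetical, and only the `ZHIPU_API_KEY` variable name comes from this listing.

```python
import os
from pathlib import Path


def check_asr_prerequisites(audio_path, env=None):
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper for illustration; it does not call the skill
    or the Zhipu API.
    """
    env = os.environ if env is None else env
    problems = []

    # ZHIPU_API_KEY is the variable named in the integration tip.
    if not env.get("ZHIPU_API_KEY", "").strip():
        problems.append("ZHIPU_API_KEY is not set")

    # The path must point at an existing regular file.
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {audio_path}")

    return problems
```

Calling `check_asr_prerequisites("meeting.wav")` reads from the real environment. Passing a dictionary as `env` keeps the check easy to exercise without touching real credentials.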
Scored Apr 23, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.