polyphone
Fix Chinese polyphone (多音字) mispronunciation in TTS by auto-detecting ambiguous characters and applying pinyin annotations. Use when users complain about wro...
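The detect-then-annotate idea in the description can be illustrated with a minimal sketch. This is not the skill's actual implementation: the `POLYPHONES` and `WORD_READINGS` tables, the function names, and the `字(pinyin)` output format are all hypothetical and chosen only for illustration. A real system would need a much larger dictionary and better context handling.

```python
# Hypothetical, hand-written table: character -> common candidate readings.
POLYPHONES = {
    "行": ["xíng", "háng"],
    "好": ["hǎo", "hào"],
    "重": ["zhòng", "chóng"],
    "长": ["cháng", "zhǎng"],
}

# Hypothetical word-level table: word -> one reading per character.
WORD_READINGS = {
    "银行": ["yín", "háng"],
    "行长": ["háng", "zhǎng"],
}
MAX_WORD_LEN = max(len(word) for word in WORD_READINGS)


def find_polyphones(text):
    """Return (index, char, candidate readings) for each ambiguous character."""
    return [(i, ch, POLYPHONES[ch]) for i, ch in enumerate(text) if ch in POLYPHONES]


def annotate(text):
    """Greedy longest-match: annotate polyphones that a known word resolves.

    Characters that no known word covers are passed through unchanged.
    """
    out = []
    i = 0
    while i < len(text):
        matched = False
        for n in range(MAX_WORD_LEN, 1, -1):
            word = text[i:i + n]
            readings = WORD_READINGS.get(word)
            if readings:
                for ch, pinyin in zip(word, readings):
                    # Only ambiguous characters get an annotation.
                    out.append(f"{ch}({pinyin})" if ch in POLYPHONES else ch)
                i += n
                matched = True
                break
        if not matched:
            out.append(text[i])
            i += 1
    return "".join(out)
```

For example, `annotate("银行行长")` resolves both occurrences of 行 to `háng` and 长 to `zhǎng`, while `annotate("你好")` is returned unchanged because no word in the table disambiguates 好.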
Install via ClawdBot CLI:
clawdbot install scikkk/polyphone
Grade: Limited (based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals).
Sends data to undocumented external endpoint (potential exfiltration)
POST → https://api.senseaudio.cn/v1/t2a_v2
Calls external URL not in known-safe list
https://senseaudio.cn
Audited Apr 17, 2026 · audit v1.0
Generated Mar 22, 2026
Used by e-learning platforms to generate accurate audio for Chinese language lessons, textbooks, and pronunciation guides, ensuring polyphones like 行 and 好 are correctly pronounced based on context. This improves learning outcomes by providing clear, context-aware audio examples for students.
Applied in audiobook studios to synthesize narration for Chinese literature, where polyphones such as 了 and 得 must match the intended meaning to preserve narrative flow and avoid listener confusion. It automates pronunciation fixes during post-production, saving time for voice actors and editors.
Deployed in call centers or IVR systems to generate natural-sounding speech for automated responses in Chinese, correcting mispronunciations in phrases like 银行行长 to enhance professionalism and user trust. It ensures clarity in financial or service-related announcements.
Integrated into assistive technologies like screen readers for visually impaired users, providing precise TTS output for Chinese text by handling polyphones like 重 and 中 correctly. This improves accessibility in digital content such as websites, documents, and apps.
Utilized by marketing agencies to create audio ads or promotional videos in Chinese, where accurate pronunciation of brand names or slogans containing polyphones like 发 and 参 is crucial for brand image. It streamlines audio production for campaigns targeting Chinese-speaking audiences.
Monetized by offering the TTS service via API calls, charging based on usage tiers such as per-character or per-minute of audio generated. This model targets developers and businesses integrating precise Chinese TTS into their applications, with revenue from subscription plans or pay-as-you-go fees.
Provides customized solutions for large organizations like educational institutions or media companies, offering dedicated support, higher usage limits, and integration assistance. Revenue is generated through annual licensing contracts tailored to specific client needs and scale.
Offers a free tier with basic TTS capabilities and limited polyphone corrections, while premium features like advanced dictionary overrides, higher-quality voices, and priority support are paid. This attracts individual users and small businesses, with revenue from upgrades and add-ons.
💬 Integration Tip
Store SENSEAUDIO_API_KEY as an environment variable rather than hard-coding it. For dictionary functionality to work properly, use a cloned voice with the SenseAudio-TTS-1.5 model.
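A minimal sketch of the tip, assuming a JSON body and Bearer-token authentication. Only the endpoint URL, the `SENSEAUDIO_API_KEY` variable name, and the model name come from this page. The payload field names (`model`, `voice_id`, `text`), the `Authorization` scheme, and the `build_request` helper are hypothetical, so check the provider's documentation before relying on them. The sketch only builds the request and does not send it.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.senseaudio.cn/v1/t2a_v2"  # endpoint listed in the audit above
MODEL = "SenseAudio-TTS-1.5"                      # model named in the tip


def build_request(text, voice_id, env=os.environ):
    """Build (but do not send) a TTS request, reading the key from the environment."""
    api_key = env.get("SENSEAUDIO_API_KEY")
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")

    # Hypothetical payload shape; the real field names may differ.
    payload = {"model": MODEL, "voice_id": voice_id, "text": text}

    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing `env` explicitly keeps the key out of source code and makes the helper easy to exercise without touching the real environment, for example `build_request("银行行长", "my-cloned-voice", env={"SENSEAUDIO_API_KEY": "test"})`.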
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.