meeting-assistant
Build and troubleshoot SenseAudio meeting assistants, covering live meeting transcription, speaker differentiation, real-time translation, meeting-minutes generation, action-item extraction, and transcript export.
music-gen
SenseAudio Music Generation API for creating AI-generated lyrics and songs. Supports lyrics generation, song generation with style/vocal control, and async t...
voice-clone
Guide users through SenseAudio platform voice cloning, then generate TTS with cloned `voice_id` values. Use when users want to clone voices, manage cloned vo...
video-narrator
Generate SenseAudio TTS narration tracks for videos, including timestamped segments, style variants, and editor-ready voiceover exports. Use when users need...
chatbot
Build real-time voice chatbot applications with natural conversation flow and customizable personalities. Use when users want to create voice assistants, con...
meme
Create funny voice memes with various styles, effects, and templates. Use when users want to make humorous audio content, voice memes, or entertaining sound...
subtitle
Generate synchronized subtitles (SRT/VTT/ASS) from video audio with precise timestamps. Use when users need subtitles, captions, or video transcription with...
text2speech
SenseAudio Text-to-Speech (TTS) API for converting text to natural speech. Supports synchronous and SSE streaming modes, multiple voices, emotion control, sp...
segment-anything
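Use SAM (Segment Anything Model) to remove image backgrounds and extract the foreground subject as a transparent PNG. Suited to background removal, cutouts, foreground-subject extraction, or image segmentation.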
quick-tts
Zero-config text-to-speech — give text, get an mp3 file. Handles natural-language voice selection ("用女声", "撒娇语气", "生气一点") and auto-inserts pacing breaks for...
language-tutor
Create language learning audio with SenseAudio TTS, including pronunciation drills, bilingual lessons, slowed speech practice, and dialogue exercises. Use wh...
songmaker
Generate a complete song from a text description — AI writes lyrics then composes music. Use when users want to create a song, turn a description into audio,...
senseaudio
Integration guide for SenseAudio Open Platform APIs, including TTS (sync/SSE/WebSocket), ASR (HTTP/WebSocket), realtime Agents, video generation/storyboard,...
audio-quality-checker
Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio quality, validate recordings, or ident...
realtime-agent
Manage SenseAudio realtime agents by listing agents, starting or continuing sessions, querying status, and leaving sessions with proper error handling.
senseaudio-audio-quality-checker
senseaudio-tts
Build and debug SenseAudio text-to-speech integrations on `/v1/t2a_v2` and `/ws/v1/t2a_v2`, including sync HTTP, SSE stream, WebSocket event sequencing, hex...
meetingsummarizer
Transcribe meetings with SenseAudio ASR speaker diarization, timestamps, and meeting-note extraction workflows. Use when users need meeting transcription, me...
senseaudio-asr
Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1...
memo
Transcribe and organize voice memos with automatic categorization and information extraction. Use when users have voice notes, audio memos, or spoken notes t...
bgm
Generate original background music for short videos from a natural language description. Use when creators need royalty-free BGM, video background music, or...
bedtime-radio
Generate a complete bedtime story audio program from a keyword — with intro, narration, character voices, and a sleepy outro. Use when parents or caregivers...
weather-broadcast
Fetch weather data and generate a spoken weather broadcast using SenseAudio TTS.
elderly-assistant
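Voice assistant for seniors: older users just speak to their phone to send messages, check the weather, set alarms, and listen to traditional opera, with no operations to learn.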
voice-translator
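Speak Chinese, get foreign-language speech: hold to speak Chinese, and English, Japanese, or Korean audio plays within 2-3 seconds. Supports scenario modes, two-way conversation, and saving frequently used phrases.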
audiobook-generator
Generate audiobooks from novels and long-form text with chapter management and character voices. Use when users mention audiobooks, narrating books, or conve...
jingle-forge
Create a brand jingle (5–15 seconds) from a brand name and tone keywords. Use when users want a brand sound logo, audio identity, jingle, intro music, or sho...
clone-wizard
Guided voice cloning workflow — from recording tips to first playback. Use when users want to clone their voice, create a custom voice, or ask "怎么克隆声音", "我想用...
senseaudio-voice
Guide for SenseAudio voice selection, plan-level voice entitlement checks, and cloned voice usage constraints in TTS calls. Use this whenever user asks why a...
polyphone
Fix Chinese polyphone (多音字) mispronunciation in TTS by auto-detecting ambiguous characters and applying pinyin annotations. Use when users complain about wro...
rapper
Create and debug SenseAudio rap, hip-hop, or vocal song generation workflows using the `/v1/song/lyrics/create`, `/v1/song/lyrics/pending/:task_id`, `/v1/son...
pronunciation
Foreign language pronunciation coach — listen to standard TTS pronunciation, record yourself, get word-by-word feedback on what was wrong, then practice targ...
voice-picker
Recommend the best SenseAudio voice for any scenario or emotion. Use when users ask which voice to use — e.g. "儿童故事播客用什么音色", "电商直播带货适合哪个声音", "我需要撒娇感的女声", "有没...
lyric-flip
Rewrite a song with a new theme while preserving the original's rhyme scheme, line structure, and rhythmic skeleton. Use when users want to parody a song, wr...
sam
Use SAM (Segment Anything Model) to remove image backgrounds and extract foreground subjects as transparent PNGs. Use when users want to remove backgrounds,...