doubao-asr
Transcribe recorded audio files to text via Doubao Seed-ASR 2.0 (豆包录音文件识别模型2.0) from ByteDance/Volcengine. Best-in-class Chinese speech recognition with spea...
Install via ClawdBot CLI:
clawdbot install vahnxu/doubao-asr
Grade: Good (based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals).
Calls external URL not in known-safe list:
https://www.volcengine.com/docs/6561/1354868
Audited Apr 17, 2026 · audit v1.0
Generated Mar 20, 2026
Transcribes recorded meetings from audio files (e.g., m4a, mp3) into text with speaker identification, enabling teams to review discussions and assign action items. Ideal for corporate environments where tracking who said what is crucial for follow-ups and documentation.
Converts voice memos or recorded interviews into text for journalists, researchers, or students to analyze content and extract quotes. Supports various audio formats like wav and flac, making it useful for field recordings or personal notes.
Transcribes customer service calls to text with speaker separation, helping companies monitor interactions, identify common issues, and train staff. Enhances quality assurance by providing searchable transcripts for compliance and improvement.
Transcribes audio recordings of legal proceedings or medical consultations into accurate text records, aiding in documentation and case management. Ensures precise transcripts for archival and reference purposes in regulated industries.
Converts podcast or video audio tracks into text for subtitles, show notes, or content repurposing. Streamlines production workflows by providing editable transcripts that can be used for SEO and audience engagement.
Offers monthly or annual plans for businesses to transcribe a set number of audio files, with tiered pricing based on usage volume. Generates recurring revenue by catering to regular needs like meeting recordings and customer calls.
Provides API access for developers to integrate transcription into their apps, charging per minute of audio processed. Attracts tech companies and startups needing scalable, on-demand speech recognition without upfront costs.
Sells customized licenses to corporations for unlimited transcription within their infrastructure, including support and compliance features. Targets industries like legal or healthcare with high-volume, sensitive audio processing needs.
💬 Integration Tip
Ensure all required environment variables are set correctly, especially the API key and TOS bucket details, to avoid upload and authentication errors during transcription.
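As a rough illustration of the tip above, the following sketch checks for missing settings before any upload or transcription is attempted. The variable names in `REQUIRED_VARS` are hypothetical placeholders, not names confirmed by the skill; substitute whatever names the skill's own documentation specifies.

```python
import os

# Hypothetical variable names, for illustration only.
# Replace them with the names the skill's documentation actually requires.
REQUIRED_VARS = (
    "DOUBAO_ASR_API_KEY",
    "TOS_BUCKET",
    "TOS_ACCESS_KEY",
    "TOS_SECRET_KEY",
)


def missing_env_vars(names, env=None):
    """Return the names that are unset or blank in the given mapping.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


def describe_missing(names, env=None):
    """Build a one-line, human-readable summary of the check."""
    missing = missing_env_vars(names, env)
    if missing:
        return "Missing environment variables: " + ", ".join(missing)
    return "All required environment variables are set."
```

Calling `describe_missing(REQUIRED_VARS)` before starting a transcription run would surface a configuration gap early, rather than as an upload or authentication error partway through.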
Scored Apr 19, 2026