toby-audio-transcribe
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
Install via ClawdBot CLI:
clawdbot install tobeyrebecca/toby-audio-transcribe
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list: https://www.assemblyai.com/docs
Audited Apr 16, 2026 · audit v1.0
Generated May 6, 2026
Transcribe team meetings or interviews with speaker labels and optional name mapping. The skill produces agent-friendly Markdown and JSON outputs, enabling downstream AI workflows such as automated meeting summaries or action item extraction.
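The speaker-label-to-name mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function names `format_timestamp` and `utterances_to_markdown` are hypothetical, and the input is assumed to be a list of utterance dictionaries with `speaker`, `start` (milliseconds), and `text` keys, similar in shape to diarised transcript output.

```python
from typing import Dict, List, Optional


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to MM:SS, or H:MM:SS once the hour mark is passed."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def utterances_to_markdown(
    utterances: List[dict],
    speaker_names: Optional[Dict[str, str]] = None,
) -> str:
    """Render diarised utterances as Markdown, one paragraph per utterance.

    Assumed input shape (hypothetical): each utterance has "speaker",
    "start" (milliseconds), and "text" keys. Labels without an entry in
    speaker_names fall back to "Speaker <label>".
    """
    names = speaker_names or {}
    lines = []
    for utt in utterances:
        label = utt["speaker"]
        display = names.get(label, f"Speaker {label}")
        stamp = format_timestamp(utt["start"])
        lines.append(f"**{display}** [{stamp}]: {utt['text']}")
    return "\n\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"speaker": "A", "start": 0, "text": "Shall we start?"},
        {"speaker": "B", "start": 65000, "text": "Yes, go ahead."},
    ]
    # Only speaker "A" is mapped; "B" keeps its generic label.
    print(utterances_to_markdown(sample, {"A": "Alice"}))
```

Keeping the name mapping optional means a transcript still renders with generic labels when the speakers are not yet known, and the names can be filled in later without transcribing again.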
Translate or transcribe audio/video in multiple languages using AssemblyAI's language detection and translation features. Content creators can generate subtitles, translated transcripts, and structured exports for international audiences.
Analyze customer service calls by transcribing with speaker diarization, then extracting topics, entities, and sentiment. The structured JSON output can feed CRM systems or analytics dashboards to improve service quality.
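As a rough sketch of how the structured output might feed a dashboard, the snippet below counts sentiment labels per speaker. The helper name `summarize_sentiment` is hypothetical, and the input shape is an assumption: a list of dictionaries with `speaker` and `sentiment` keys, where sentiment values are labels such as `POSITIVE`, `NEUTRAL`, or `NEGATIVE`. The JSON the skill actually emits may be structured differently.

```python
import json
from collections import Counter, defaultdict
from typing import Dict, List


def summarize_sentiment(results: List[dict]) -> Dict[str, Dict[str, int]]:
    """Count sentiment labels per speaker.

    Assumed input shape (hypothetical): each item has a "sentiment" label
    and an optional "speaker" label. Items with a missing or empty speaker
    are grouped under "unknown".
    """
    per_speaker = defaultdict(Counter)
    for item in results:
        speaker = item.get("speaker") or "unknown"
        per_speaker[speaker][item["sentiment"]] += 1
    # Convert to plain dicts so the result serialises cleanly to JSON.
    return {speaker: dict(counts) for speaker, counts in per_speaker.items()}


if __name__ == "__main__":
    sample = [
        {"speaker": "A", "sentiment": "POSITIVE"},
        {"speaker": "A", "sentiment": "NEGATIVE"},
        {"speaker": "B", "sentiment": "NEUTRAL"},
    ]
    print(json.dumps(summarize_sentiment(sample), indent=2))
```

A per-speaker breakdown like this separates agent and customer sentiment, which is usually the distinction a service-quality report needs.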
Convert recorded lectures into searchable, timestamped transcripts with speaker names. Educators can produce Markdown notes for students and NLP-friendly JSON for automated quiz generation or content indexing.
Transcribe legal proceedings with high accuracy, speaker labels, and optional translation. The agent-friendly output formats facilitate integration with legal document management systems for review and discovery.
Offer pay-as-you-go or monthly subscription plans based on audio hours processed, with higher tiers unlocking advanced features like speaker identification, translation, and LLM Gateway extraction.
Package the skill as a composable microservice in AI agent marketplaces (e.g., ClawdBot). Charge per API call or via subscription, enabling other agents to invoke transcription and post-processing on demand.
Provide a white-label transcription solution with dedicated support, custom integrations, and SLAs for enterprises that need high accuracy and compliance, including on-premises or VPC deployment options.
💬 Integration Tip
To maximize agent compatibility, always use the --bundle-dir flag to generate manifest files and multiple output formats (Markdown, JSON) for easy downstream consumption.
Scored May 6, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.