speech-to-text-transcription
Transcribe audio and video files to text with speaker detection, timestamps, and format conversion.
Install via ClawdBot CLI:
clawdbot install ivangdavila/speech-to-text-transcription
Requires:
Grade: Fair — based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Accesses sensitive credential files or environment variables: $OPENAI
Calls external URL not in known-safe list: https://clawic.com/skills/speech-to-text-transcription
Uses known external API (expected, informational): api.openai.com
AI Analysis
The skill's external API usage (OpenAI, AssemblyAI, Deepgram) is consistent with its stated transcription purpose, and it explicitly recommends local processing with Whisper for privacy. The primary risk is potential credential access via environment variables, but this is a standard pattern for optional cloud services.
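The "standard pattern" the analysis describes can be sketched as a guarded lookup: read the key from the environment only when cloud transcription is requested, and fall back to local Whisper otherwise. This is an illustrative sketch, not the skill's actual code; `OPENAI_API_KEY` is the conventional variable name for the OpenAI SDK, but the variable this skill reads may differ (the listing shows only a truncated `$OPENAI`).

```python
import os

def get_openai_key():
    """Read the OpenAI API key from the environment; return None when it is
    unset so callers can fall back to local processing instead of failing."""
    key = os.environ.get("OPENAI_API_KEY")
    # Treat an empty string the same as unset.
    return key if key else None

def choose_backend():
    """Use the cloud API only when a key is present; otherwise stay local."""
    return "openai-api" if get_openai_key() else "local-whisper"
```

Keeping the credential read behind a single function makes the "only when needed" behavior auditable: the key is never touched on the local-Whisper path.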
Audited Apr 16, 2026 · audit v1.0
Generated Mar 21, 2026
Transcribes university lectures from video recordings into text with timestamps, enabling students to create searchable notes and study materials. Supports long durations and speaker diarization to distinguish between professor and student interactions.
Converts podcast audio files into transcripts for subtitles, show notes, and content repurposing. Uses speaker detection to label hosts and guests, and outputs formats like SRT for video platforms.
Transcribes business meetings and interviews, extracting action items and summaries for team collaboration. Handles multi-speaker content with diarization and ensures privacy by using local processing for sensitive discussions.
Transcribes voice memos from healthcare professionals into structured text for patient records. Requires high accuracy and can use local Whisper to maintain data privacy and compliance with regulations.
Transcribes audio recordings of legal depositions with precise timestamps and speaker identification for court documentation. Supports batch processing of long files and outputs in JSON for easy integration with case management systems.
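Several of the use cases above depend on rendering timestamped, speaker-labeled segments as SRT. A minimal sketch of that conversion, assuming segments shaped like `{"start", "end", "speaker", "text"}` (the segment schema and function names here are illustrative, not the skill's API):

```python
def fmt_timestamp(seconds):
    """Format a float of seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render a list of segment dicts as an SRT document, prefixing each
    cue with its speaker label when diarization provided one."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        if seg.get("speaker"):
            text = f"{seg['speaker']}: {text}"
        cues.append(
            f"{i}\n{fmt_timestamp(seg['start'])} --> {fmt_timestamp(seg['end'])}\n{text}"
        )
    return "\n\n".join(cues) + "\n"
```

The same segment list can be dumped with `json.dumps` for the JSON output mentioned in the deposition use case, so one internal representation serves both formats.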
Offers basic transcription with local Whisper for free, while charging for premium features like cloud provider integrations (e.g., OpenAI Whisper API for higher accuracy) and advanced diarization. Revenue comes from subscription tiers based on usage limits and support.
Licenses the skill package to businesses for internal use, such as in corporate training or media production. Includes custom integrations, priority support, and volume discounts for large-scale transcription needs.
Operates as a transcription service agency using this skill to process client audio files efficiently. Charges per minute of audio transcribed, with added fees for rush jobs, multiple output formats, and data preprocessing.
💬 Integration Tip
Ensure ffmpeg is installed and configured for audio preprocessing, and set up environment variables for cloud API keys only when needed to avoid unnecessary data exposure.
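The ffmpeg preprocessing step typically means converting the input to mono 16 kHz PCM, the format Whisper models expect. A hedged sketch of building that command (run it with `subprocess.run`; the helper name is illustrative and the exact flags your pipeline needs may vary):

```python
def ffmpeg_preprocess_args(src, dst):
    """Build an ffmpeg command that extracts mono 16 kHz 16-bit PCM audio
    from an audio or video file, suitable as Whisper input."""
    return [
        "ffmpeg",
        "-i", src,           # input audio or video file
        "-ar", "16000",      # resample to 16 kHz
        "-ac", "1",          # downmix to mono
        "-c:a", "pcm_s16le", # 16-bit little-endian PCM
        "-y",                # overwrite the output without prompting
        dst,
    ]
```

Building the argument list as a Python list (rather than a shell string) avoids quoting bugs with filenames that contain spaces.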
Scored Apr 18, 2026
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with macOS-style `say` UX.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <device>").
Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.
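The dispatch rule above ("use when receiving ... .mp3, .m4a, .ogg, .wav, .webm") amounts to an extension check. A small sketch, using the extensions listed in the description (the helper name is hypothetical):

```python
from pathlib import Path

# Extensions the skill description says it handles.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".ogg", ".wav", ".webm"}

def is_transcribable(path):
    """Return True when the file's extension (case-insensitive) matches
    one of the audio formats the skill accepts."""
    return Path(path).suffix.lower() in AUDIO_EXTENSIONS
```

Matching on `suffix.lower()` keeps the check case-insensitive, so voice memos named `MEMO.M4A` route to transcription too.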