⚠️ Install with caution. This skill has very few installs. Always review the source and verify it on clawhub.ai before installing. Community-built skills run with agent permissions, so only install ones you trust.

🎤 Speech & Audio

Speech to Text (Yandex SpeechKit) v1.1.8

Name: Speech to Text (Yandex SpeechKit)
Author: bzSega

sergei-mikhailov-stt


Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...

stt · transcription · tts


Installs (all time)

Installs (current)

Downloads: 1.0K

Stars

Created: Feb 23, 2026

Updated: Mar 7, 2026

Install & Quick Start

Install via ClawdBot CLI:

clawdbot install bzSega/sergei-mikhailov-stt

Skill Package: 13 files


Quality Score

B (58/100)

Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.

Market Validation: 2/35

· No tracked installs (may still have manual users)
· 485 downloads (low demand)

Documentation: 20/25

· SKILL.md present
· Detailed documentation (≥3000 chars)
· Contains usage examples or trigger description
· Detailed summary

Package Completeness: 11/15

· skillAssets present (12 files)

Security Analysis

💙 Low Risk

UNKNOWN_DATA_SINK (high)

Sends data to undocumented external endpoint (potential exfiltration)

POST → https://stt.api.cloud.yandex.net/speech/v1/stt:recognize?folderId=${CHECK_FOLDER

UNDOCUMENTED_EXTERNAL (low)

Calls external URL not in known-safe list

https://www.python.org/downloads/

AI Analysis

The skill sends audio data to Yandex SpeechKit API for speech recognition, which is consistent with its stated purpose and documented in the skill definition. While this involves external data transmission, it uses a legitimate provider and the skill explicitly warns against exposing API keys. No credential harvesting, hidden instructions, or obfuscation were found.
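To make the flagged data flow concrete, here is a minimal Python sketch of the kind of request the audit describes: raw audio bytes POSTed to the `stt:recognize` endpoint with a `folderId` query parameter. The endpoint and `folderId` come from the finding above. The `lang` parameter, the `Api-Key` authorization header, and the function name `build_recognize_request` are assumptions for illustration based on SpeechKit's public v1 API, not code taken from the skill. The request is only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint host and path as reported in the audit finding.
STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_recognize_request(audio: bytes, folder_id: str, api_key: str,
                            lang: str = "ru-RU") -> Request:
    """Build (but do not send) a POST request carrying raw audio bytes."""
    if not audio:
        raise ValueError("audio payload is empty")
    # `lang` is an assumed parameter; only `folderId` appears in the listing.
    query = urlencode({"folderId": folder_id, "lang": lang})
    return Request(
        f"{STT_URL}?{query}",
        data=audio,
        # Assumed auth scheme; the listing does not show the skill's headers.
        headers={"Authorization": f"Api-Key {api_key}"},
        method="POST",
    )


# Placeholder values only; nothing leaves the machine until urlopen() is called.
req = build_recognize_request(b"\x00\x01", "example-folder", "example-key")
```

Inspecting `req.full_url` and `req.data` before sending is one way to confirm exactly what would be transmitted to the external provider.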

Audited Apr 17, 2026 · audit v1.0

💡 Usage Guide

Generated Mar 21, 2026

Businesses using OpenClaw for messaging automation · Developers building voice-enabled applications · intermediate

💡 Application Scenarios

Customer Support Automation (Customer Service)

Automatically transcribe customer voice messages from messaging apps like WhatsApp or Telegram into text for ticketing systems. This enables faster response times by converting spoken queries into actionable text data that support agents can prioritize and address efficiently.

Meeting Transcription for Remote Teams (Technology)

Transcribe voice messages shared in team collaboration tools such as Slack or Microsoft Teams into text summaries. This helps remote teams capture meeting notes, action items, and decisions without manual note-taking, improving documentation and follow-up.

Language Learning Assistance (Education)

Convert student voice recordings in language learning apps to text for pronunciation analysis and feedback. Educators can use the transcriptions to assess fluency, correct errors, and track progress over time, enhancing personalized learning experiences.

Healthcare Patient Intake (Healthcare)

Transcribe patient voice messages describing symptoms or medical history from telehealth platforms into structured text for electronic health records. This streamlines intake processes, reduces manual data entry errors, and ensures accurate patient information for healthcare providers.

Legal Deposition Documentation (Legal)

Convert audio recordings from legal depositions or client interviews into text transcripts for case management systems. This aids lawyers in reviewing evidence, preparing documents, and maintaining organized records, saving time on manual transcription.

💼 Business Models

Subscription-Based SaaS: Recurring monthly fees from $50 to $500+ per user

Offer the skill as part of a monthly or annual subscription plan for businesses using OpenClaw, with tiered pricing based on usage volume (e.g., number of transcriptions per month). This provides recurring revenue and scales with customer demand for automated speech-to-text services.

Pay-Per-Use API: Transaction-based earnings, e.g., $0.01 to $0.10 per minute of audio

Charge users per transcription request, with fees based on audio duration or provider costs (e.g., Yandex SpeechKit pricing). This model appeals to occasional users or small businesses, allowing flexible usage without long-term commitments and generating revenue from variable demand.

Enterprise Licensing: One-time or annual licensing fees from $10,000 to $100,000+

Sell custom licenses to large organizations for on-premises deployment or integration with existing systems, including premium support, customization, and multi-provider setups. This targets industries like healthcare or legal with high compliance needs, yielding high-value contracts.

💬 Integration Tip

Ensure API keys are securely configured via OpenClaw's JSON file and test with sample audio files to verify provider compatibility before deployment.
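As a rough illustration of that tip, the Python sketch below checks that a key is actually present in a JSON config file before any audio is processed. The listing does not show the layout of OpenClaw's JSON file, so the field name `YANDEX_API_KEY`, the file name `config.json`, and the helper `load_api_key` are all hypothetical.

```python
import json
import os
import tempfile


def load_api_key(config_path: str, field: str = "YANDEX_API_KEY") -> str:
    """Return a non-empty API key from a JSON config file, or raise KeyError."""
    with open(config_path, encoding="utf-8") as fh:
        config = json.load(fh)
    key = config.get(field)  # hypothetical field name, not OpenClaw's schema
    if not isinstance(key, str) or not key.strip():
        raise KeyError(f"{field} is missing or empty in {config_path}")
    return key.strip()


# Demo against a throwaway file so no real credentials are involved.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"YANDEX_API_KEY": "example-key"}, fh)
    loaded = load_api_key(path)
```

Failing fast on a missing key keeps a misconfigured deployment from silently dropping voice messages.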