sergei-mikhailov-stt
Speech recognition from voice messages using Yandex SpeechKit (with an extensible architecture for other providers). Use when you need to convert a voice mes...
Install via ClawdBot CLI:
clawdbot install bzSega/sergei-mikhailov-stt
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Sends data to undocumented external endpoint (potential exfiltration)
POST → https://stt.api.cloud.yandex.net/speech/v1/stt:recognize?folderId=${CHECK_FOLDER
Calls external URL not in known-safe list
https://www.python.org/downloads/
AI Analysis
The skill sends audio data to Yandex SpeechKit API for speech recognition, which is consistent with its stated purpose and documented in the skill definition. While this involves external data transmission, it uses a legitimate provider and the skill explicitly warns against exposing API keys. No credential harvesting, hidden instructions, or obfuscation were found.
Audited Apr 17, 2026 · audit v1.0
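For readers who want to see what the flagged POST request looks like, here is a minimal sketch in Python. It is not the skill's own code. The endpoint and the `folderId` query parameter come from the audit line above; the `lang` and `format` parameters, the `Api-Key` authorization header, and the `result` response field follow the public SpeechKit v1 synchronous recognition API as commonly documented. The function names `build_stt_request` and `parse_stt_response` are hypothetical. The sketch only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as shown in the audit finding above.
STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_stt_request(audio: bytes, folder_id: str, api_key: str,
                      lang: str = "ru-RU",
                      audio_format: str = "oggopus") -> urllib.request.Request:
    """Build (but do not send) a recognition request.

    Hypothetical helper: the skill's real implementation may differ.
    """
    query = urllib.parse.urlencode(
        {"folderId": folder_id, "lang": lang, "format": audio_format}
    )
    return urllib.request.Request(
        f"{STT_URL}?{query}",
        data=audio,  # raw audio bytes are the request body
        headers={"Authorization": f"Api-Key {api_key}"},
        method="POST",
    )


def parse_stt_response(body: bytes) -> str:
    """Extract the recognized text, assuming a JSON body with a `result` field."""
    payload = json.loads(body)
    return payload.get("result", "")
```

Passing the returned object to `urllib.request.urlopen` would send the audio to Yandex. Keeping construction separate from sending makes it easy to inspect exactly what leaves the machine, which is the concern the audit raises.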
Generated Mar 21, 2026
Automatically transcribe customer voice messages from messaging apps like WhatsApp or Telegram into text for ticketing systems. This enables faster response times by converting spoken queries into actionable text data that support agents can prioritize and address efficiently.
Transcribe voice messages shared in team collaboration tools such as Slack or Microsoft Teams into text summaries. This helps remote teams capture meeting notes, action items, and decisions without manual note-taking, improving documentation and follow-up.
Convert student voice recordings in language learning apps to text for pronunciation analysis and feedback. Educators can use the transcriptions to assess fluency, correct errors, and track progress over time, enhancing personalized learning experiences.
Transcribe patient voice messages describing symptoms or medical history from telehealth platforms into structured text for electronic health records. This streamlines intake processes, reduces manual data entry errors, and ensures accurate patient information for healthcare providers.
Convert audio recordings from legal depositions or client interviews into text transcripts for case management systems. This aids lawyers in reviewing evidence, preparing documents, and maintaining organized records, saving time on manual transcription.
Offer the skill as part of a monthly or annual subscription plan for businesses using OpenClaw, with tiered pricing based on usage volume (e.g., number of transcriptions per month). This provides recurring revenue and scales with customer demand for automated speech-to-text services.
Charge users per transcription request, with fees based on audio duration or provider costs (e.g., Yandex SpeechKit pricing). This model appeals to occasional users or small businesses, allowing flexible usage without long-term commitments and generating revenue from variable demand.
Sell custom licenses to large organizations for on-premises deployment or integration with existing systems, including premium support, customization, and multi-provider setups. This targets industries like healthcare or legal with high compliance needs, yielding high-value contracts.
💬 Integration Tip
Ensure API keys are securely configured via OpenClaw's JSON file and test with sample audio files to verify provider compatibility before deployment.
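A rough pre-deployment check along the lines of this tip might look like the sketch below. Everything about the configuration layout is an assumption: the listing does not show the schema of OpenClaw's JSON file, so the flat layout and the key name `YANDEX_API_KEY` are hypothetical placeholders. The audio check relies on two facts about the Ogg Opus format (pages begin with `OggS`, and the first page carries an `OpusHead` header). The 1 MB ceiling is an assumed limit for synchronous recognition and should be verified against the provider's current documentation.

```python
import json

# Assumed size ceiling for synchronous recognition; verify with the provider.
MAX_SYNC_BYTES = 1024 * 1024


def parse_api_key(config_text: str, key_name: str = "YANDEX_API_KEY") -> str:
    """Read an API key from JSON text.

    The flat layout and the key name are hypothetical, not OpenClaw's schema.
    """
    config = json.loads(config_text)
    key = config.get(key_name, "")
    if not key:
        raise ValueError(f"missing or empty {key_name!r} in configuration")
    return key


def check_sample_audio(data: bytes) -> list[str]:
    """Return a list of problems found in a sample audio file (empty if none)."""
    problems = []
    if not data.startswith(b"OggS"):
        problems.append("not an Ogg container")
    elif b"OpusHead" not in data[:64]:
        # The Opus identification header sits in the first Ogg page.
        problems.append("Ogg stream is not Opus")
    if len(data) > MAX_SYNC_BYTES:
        problems.append("file exceeds the assumed size limit")
    return problems
```

In practice the inputs would come from disk, for example `check_sample_audio(Path("sample.ogg").read_bytes())` with `pathlib.Path`. An empty list means the sample passed these basic checks, not that the provider will accept it.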
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud)
Start voice calls via the OpenClaw voice-call plugin.