iyeque-audio-processing

Audio ingestion, analysis, transformation, and generation (Transcribe, TTS, VAD, Features).
Install via ClawdBot CLI:

clawdbot install iyeque/iyeque-audio-processing

A comprehensive toolset for audio manipulation and analysis.
Perform audio operations like transcription, text-to-speech, and feature extraction.

Parameters:

action (string, required): One of transcribe, tts, extract_features, vad_segments, transform.
file_path (string, optional): Path to input audio file.
text (string, optional): Text for TTS.
output_path (string, optional): Path for output file (default: auto-generated).
model (string, optional): Whisper model size (tiny, base, small, medium, large). Default: base.

Usage:
# Transcribe
uv run --with "openai-whisper" --with "pydub" --with "numpy" skills/audio-processing/tool.py transcribe --file_path input.wav
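The transcribe action runs on the openai-whisper package locally. A minimal sketch of the equivalent Python call (the function name `transcribe_file` is illustrative, not part of the skill; requires `openai-whisper` to be installed):

```python
def transcribe_file(file_path: str, model_size: str = "base") -> str:
    """Local speech-to-text with openai-whisper; no API key required."""
    import whisper  # lazy import: `pip install openai-whisper`

    # Model sizes mirror the tool's --model flag: tiny, base, small, medium, large.
    model = whisper.load_model(model_size)
    result = model.transcribe(file_path)
    return result["text"]
```

Larger models trade speed for accuracy; `base` is a reasonable default for short clips.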
# TTS
uv run --with "gTTS" skills/audio-processing/tool.py tts --text "Hello world" --output_path hello.mp3
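The tts action's `--with "gTTS"` dependency suggests it wraps gTTS. A minimal sketch, assuming the gTTS package and network access (gTTS calls Google's TTS endpoint); `synthesize` is an illustrative name:

```python
def synthesize(text: str, output_path: str = "hello.mp3", lang: str = "en") -> str:
    """Render `text` to an MP3 file with gTTS and return the output path."""
    from gtts import gTTS  # lazy import: `pip install gTTS`

    gTTS(text=text, lang=lang).save(output_path)
    return output_path
```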
# Features
uv run --with "librosa" --with "numpy" --with "soundfile" skills/audio-processing/tool.py extract_features --file_path input.wav
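For extract_features, a sketch of what librosa-based extraction might look like (assumes librosa, numpy, and soundfile are installed; the exact feature set `tool.py` returns may differ):

```python
def extract_features(file_path: str) -> dict:
    """Basic audio features with librosa: sample rate, duration, mean MFCCs."""
    import librosa
    import numpy as np

    y, sr = librosa.load(file_path, sr=None)  # sr=None keeps the native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return {
        "sample_rate": sr,
        "duration_s": float(librosa.get_duration(y=y, sr=sr)),
        "mfcc_mean": np.mean(mfcc, axis=1).tolist(),  # one mean per coefficient
    }
```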
Related skills:

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
End-to-end encrypted agent-to-agent private messaging via Moltbook dead drops. Use when agents need to communicate privately, exchange secrets, or coordinate without human visibility.
Text-to-speech via OpenAI Audio Speech API.