🎤 Speech & Audio

Multimodal Base v0.1.0

Name: Multimodal Base
Author: yuyonghao-123

yuyonghao-multimodal-base

Supports image understanding, OCR, speech-to-text, and text-to-speech synthesis with multi-voice and multimodal unified processing using OpenAI and Edge TTS.

latest

Downloads: 400

Created: Mar 27, 2026

Updated: Mar 27, 2026

Install & Quick Start

Install via ClawdBot CLI:

clawdbot install yuyonghao-123/yuyonghao-multimodal-base

Skill Package: 9 files

📋 SKILL.md (markdown)

Quality Score

B (53/100)

Grade: Fair. Based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.

Market Validation: 4/35

· 218 downloads (low demand)
· No tracked installs (may still have real users via manual install)

Documentation: 16/25

· SKILL.md present
· Detailed documentation (≥3000 chars)
· Detailed summary

Package Completeness: 8/15

· skillAssets present (8 files)
· Includes scripts or config files

Security Analysis

💙 Low Risk

CREDENTIAL_ACCESS (high)

Accesses sensitive credential files or environment variables

process.env.OPENAI

UNDOCUMENTED_EXTERNAL (low)

Calls external URL not in known-safe list

https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz

KNOWN_EXTERNAL (low)

Uses known external API (expected, informational)

api.openai.com

AI Analysis

The skill's external API usage (OpenAI GPT-4V, Whisper) is consistent with its stated multimodal purpose and requires explicit user-provided API keys. No hidden instructions, credential harvesting, or obfuscation are evident in the provided definition. The primary risk is the standard data-sharing inherent to using third-party AI services.
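
The analysis above notes that the skill depends on an API key supplied by the user, and the credential finding points at an environment variable. The sketch below illustrates that general pattern and is not the skill's actual code. The listing shows only the truncated reference `process.env.OPENAI`, so the variable name `OPENAI_API_KEY` and the helper `resolveApiKey` are assumptions made for illustration.

```typescript
// Hypothetical helper, not taken from the skill's source.
// Looks up an API key in an environment-like object and fails
// early with a clear message when the key is missing or blank.
function resolveApiKey(
  env: Record<string, string | undefined>,
  name: string = "OPENAI_API_KEY" // assumed variable name
): string {
  const key = env[name]?.trim();
  if (!key) {
    throw new Error(`Missing ${name}: set it before running the skill`);
  }
  return key;
}
```

In a Node.js process this would typically be called as `resolveApiKey(process.env)`. Taking the environment as a parameter keeps the lookup easy to test and keeps the key out of source files.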
