music-gen
SenseAudio Music Generation API for creating AI-generated lyrics and songs. Supports lyrics generation, song generation with style/vocal control, and async t...
Install via ClawdBot CLI:
clawdbot install scikkk/music-gen
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Sends data to undocumented external endpoint (potential exfiltration)
POST → https://api.senseaudio.cn/v1/song/lyrics/create
Calls external URL not in known-safe list
https://senseaudio.cn/docs/song/lyrics_create
AI Analysis
The skill interacts with a documented, specialized music generation API (SenseAudio) consistent with its stated purpose. While it sends user prompts to an external server, this is an expected function for the service, and no hidden instructions, credential harvesting, or obfuscation are present. The primary risk is the standard data-sharing inherent to using any third-party AI service.
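To make the data flow described in the analysis concrete, here is a minimal sketch of how a client might build the flagged POST request. Only the endpoint URL comes from the audit above. The `prompt` body field, the Bearer-token `Authorization` header, and the helper name `build_lyrics_request` are assumptions for illustration, so check the SenseAudio documentation for the actual request schema before relying on them.

```python
import json
import urllib.request

# Endpoint taken from the audit finding above.
LYRICS_URL = "https://api.senseaudio.cn/v1/song/lyrics/create"


def build_lyrics_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a lyrics-creation request.

    Assumptions, not confirmed by the listing: the body is JSON with a
    "prompt" field, and the key is passed as a Bearer token.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        LYRICS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request object is only constructed here. Passing it to `urllib.request.urlopen` would transmit the prompt to the external server, which is the data sharing the analysis refers to.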
Audited Apr 16, 2026 · audit v1.0
Generated Mar 20, 2026
Influencers and content creators can generate custom background music for videos, podcasts, or reels, enhancing engagement without licensing issues. This allows for unique, on-brand audio that matches specific moods or themes, such as upbeat tracks for travel vlogs or ambient sounds for tutorials.
Marketing agencies can produce original jingles or soundtracks for commercials, reducing costs and time compared to traditional music production. By tailoring lyrics and styles to target demographics, campaigns can achieve higher emotional resonance and brand recall across platforms like TV, radio, and online ads.
Educational institutions and e-learning platforms can create mnemonic songs or background music for courses, making learning more engaging and accessible. For example, generating songs with lyrics about historical events or scientific concepts aids memory retention in students of all ages.
Health and wellness apps can offer custom relaxation or motivational music based on user preferences, such as calming instrumental tracks for meditation or energizing pop songs for workouts. This personalization enhances user experience and supports mental health initiatives through tailored audio content.
Game developers can dynamically generate soundtracks that adapt to in-game scenarios, such as changing music styles during action sequences or creating ambient scores for different environments. This reduces reliance on pre-composed libraries and allows for more immersive, responsive gameplay experiences.
Charge businesses a monthly or annual fee based on usage tiers (e.g., number of API calls or generated tracks). This model provides predictable revenue and scales with client needs, appealing to startups and enterprises integrating music generation into their products, such as apps or marketing tools.
Offer a free tier with limited features (e.g., basic lyrics generation) and charge for advanced capabilities like high-quality song generation or faster processing. This attracts individual users and small creators, converting them to paid plans as their needs grow, with revenue from microtransactions and premium upgrades.
License the technology to other companies for embedding into their own platforms under their brand, with custom pricing based on scale and exclusivity. This generates high-margin revenue from partnerships in industries like education or media, where clients seek turnkey solutions without developing in-house AI capabilities.
💬 Integration Tip
Keep the SENSEAUDIO_API_KEY environment variable out of source control and logs, and give async polling a timeout, a delay between attempts, and explicit handling of failed tasks so that slow or broken jobs do not hang the caller.
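The tip above can be sketched roughly as follows. This is an illustrative pattern, not the skill's actual implementation: the status values `"succeeded"` and `"failed"`, the `status` field, and the helper names `load_api_key`, `poll_task`, and `TaskFailed` are all assumptions. The caller supplies `fetch_status`, a function that queries whatever task-status endpoint the SenseAudio documentation specifies.

```python
import os
import time
from typing import Callable, Mapping


class TaskFailed(Exception):
    """Raised when the remote task reports a terminal failure."""


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the key from the environment instead of hard-coding it."""
    key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    return key


def poll_task(
    fetch_status: Callable[[], dict],
    *,
    timeout: float = 300.0,
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task finishes, fails, or the timeout would be exceeded.

    The "status" field and its values are hypothetical placeholders.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        try:
            result = fetch_status()
        except OSError:
            # Treat network-level errors as transient and retry after a delay.
            result = {}
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise TaskFailed(str(result))
        if time.monotonic() + delay > deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

Doubling the delay up to a cap keeps early responses fast while avoiding a tight request loop on long-running generations. The injectable `sleep` makes the loop easy to test without waiting.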
Scored Apr 19, 2026
Local speech-to-text with the Whisper CLI (no API key).
ElevenLabs text-to-speech with mac-style say UX.
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Local text-to-speech via sherpa-onnx (offline, no cloud).
Start voice calls via the OpenClaw voice-call plugin.