ai-media
Generate photorealistic images, videos, talking heads, and natural TTS audio using GPU-accelerated AI models and scripts on a remote server.
Install via ClawdBot CLI:
clawdbot install bowen31337/ai-media
Full-stack AI media generation powered by a GPU server (RTX 3090/3080/2070S).
GPU server: ${GPU_USER}@${GPU_HOST}
SSH key: ~/.ssh/id_ed25519_gpu
ComfyUI: /data/ai-stack/comfyui/ComfyUI/ (port 8188)
SadTalker: /data/ai-stack/sadtalker/
Whisper: /data/ai-stack/whisper/
Output: /data/ai-stack/output/
./scripts/image.sh "lady on beach at sunset" realistic
./scripts/image.sh "cyberpunk cityscape" artistic
Arguments:
$1: Prompt text
$2: Style (realistic|artistic), optional, default: realistic
Output: Path to generated image (e.g., /data/ai-stack/output/image_001.png)
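The optional style argument and its default could be handled along these lines. This is a minimal sketch, not the actual contents of image.sh; the function name resolve_style is hypothetical.

```shell
# Hypothetical helper: image.sh itself may handle its arguments differently.

# Print the style to use, falling back to "realistic" when the argument
# is empty or missing. Fails for anything other than the two documented styles.
resolve_style() {
  local style="${1:-realistic}"
  case "$style" in
    realistic|artistic) printf '%s\n' "$style" ;;
    *) printf 'unknown style: %s\n' "$style" >&2; return 1 ;;
  esac
}
```

Inside a script, this would be called with the script's second argument, for example style="$(resolve_style "$2")" || exit 1.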
./scripts/video.sh "waves crashing on shore" animatediff 4
./scripts/video.sh "city traffic timelapse" ltx2 8
Arguments:
$1: Prompt text
$2: Model (animatediff|ltx2), optional, default: animatediff
$3: Duration in seconds, optional, default: 4
Output: Path to generated video (e.g., /data/ai-stack/output/video_001.mp4)
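A sketch of how the model and duration defaults might be applied and checked before any GPU work starts. The function name resolve_video_args is hypothetical, and the whole-number restriction on the duration is an assumption; video.sh may accept other values.

```shell
# Hypothetical helper: video.sh itself may parse its arguments differently.

# Print "<model> <duration>" after applying the documented defaults
# (animatediff, 4). Assumes the duration is a whole number of seconds.
resolve_video_args() {
  local model="${1:-animatediff}"
  local duration="${2:-4}"
  case "$model" in
    animatediff|ltx2) ;;
    *) printf 'unknown model: %s\n' "$model" >&2; return 1 ;;
  esac
  case "$duration" in
    # Any non-digit character makes the value invalid.
    *[!0-9]*) printf 'duration must be a whole number: %s\n' "$duration" >&2; return 1 ;;
  esac
  printf '%s %s\n' "$model" "$duration"
}
```

In a script, the helper would receive the second and third arguments: resolve_video_args "$2" "$3".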
./scripts/talking-head.sh "Hello, I'm Agent" gentle input.jpg
./scripts/talking-head.sh "Welcome to the future" neutral photo.png
Arguments:
$1: Speech text
$2: Voice style (gentle|neutral|energetic), optional, default: gentle
$3: Avatar image path, optional, generates default if not provided
Output: Path to talking head video (e.g., /data/ai-stack/output/talking_001.mp4)
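Because the avatar image is optional, a caller may want to distinguish "no avatar given" from "avatar given but missing". The sketch below shows one way to do that; resolve_avatar is a hypothetical name, and talking-head.sh may behave differently.

```shell
# Hypothetical helper: talking-head.sh itself may handle the avatar differently.

# Print the avatar path when one is given and the file exists.
# Print nothing (and succeed) when no avatar is given, so the caller can
# fall back to a generated default. Fail when the given file is missing.
resolve_avatar() {
  local avatar="${1:-}"
  if [ -z "$avatar" ]; then
    return 0
  fi
  if [ ! -f "$avatar" ]; then
    printf 'avatar image not found: %s\n' "$avatar" >&2
    return 1
  fi
  printf '%s\n' "$avatar"
}
```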
./scripts/audio.sh "This is a test message" en male
./scripts/audio.sh "Bonjour le monde" fr female
Arguments:
$1: Text to speak
$2: Language code (en|fr|es, etc.), optional, default: en
$3: Voice gender (male|female), optional, default: male
Output: Path to audio file (e.g., /data/ai-stack/output/audio_001.wav)
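Each script reports a path under /data/ai-stack/output/. Assuming that path refers to the GPU server's filesystem and the file is not already copied back (the source does not say either way), a result could be retrieved over SSH roughly as follows. The function fetch_output and the GPU_KEY and DRY_RUN variables are hypothetical; GPU_USER and GPU_HOST follow the placeholders used in this document.

```shell
# Hypothetical helper, not part of the skill's scripts.
# Copies a remote result file into a local directory (default: current directory).
# Set DRY_RUN=1 to print the scp command instead of running it.
fetch_output() {
  local remote_path="$1"
  local dest="${2:-.}"
  local key="${GPU_KEY:-$HOME/.ssh/id_ed25519_gpu}"
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'scp -i %s %s@%s:%s %s\n' "$key" "$GPU_USER" "$GPU_HOST" "$remote_path" "$dest"
    return 0
  fi
  # -i selects the identity (private key) file used for authentication.
  scp -i "$key" "${GPU_USER}@${GPU_HOST}:${remote_path}" "$dest"
}
```

If a script prints only the output path on stdout (an assumption), the two steps can be chained: fetch_output "$(./scripts/audio.sh "This is a test message" en male)".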
All dependencies are pre-installed on the GPU server.
Scripts will:
Status: Active development
Maintainer: Agent
GPU Server: ${GPU_USER}@${GPU_HOST}