
Nano Banana Pro Skill: Generate and Edit 4K Images From Claude Code

March 8, 2026·7 min read

With 50,000+ downloads and 163 stars, nano-banana-pro by @steipete (Peter Steinberger) is the top image generation skill on ClawHub. It gives Claude Code direct access to Google's Gemini 3 Pro Image model (also known as Nano Banana Pro) for both text-to-image generation and image-to-image editing, at 1K, 2K, or 4K resolution.

The skill is a thin, well-designed Python wrapper. No web UI, no credits dashboard to manage in a browser — just a command your AI agent can run directly, with results saved to disk.

What Nano Banana Pro (Gemini 3 Pro Image) Is

Gemini 3 Pro Image is Google's flagship image generation model. It generates images from text prompts and can also take an existing image as input and apply editing instructions to it. It produces images at three resolution tiers:

1K (1024×1024) — fast drafts, social media, web use
2K (2048×2048) — high-quality content creation
4K (4096×4096) — professional design, print, large displays

Pricing scales with resolution: approximately $0.13 per image at 1K-2K, and $0.24 per image at 4K. The API is accessed via a GEMINI_API_KEY — the same key used for Gemini text generation.

Core Features

Text-to-Image Generation

Generate an image from a text description:

uv run ~/.openclaw/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "A dramatic aerial view of a futuristic city at dawn, photorealistic" \
  --filename "2026-03-08-14-30-00-city-dawn.png" \
  --resolution 1K

The script handles model initialization, API call, and saving the result to disk. Output is always PNG.

Image-to-Image Editing

Pass an existing image as input along with editing instructions:

uv run ~/.openclaw/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "Make the sky more dramatic, add storm clouds" \
  --filename "2026-03-08-14-32-00-city-stormy.png" \
  --input-image "2026-03-08-14-30-00-city-dawn.png" \
  --resolution 2K

The model receives both the original image and the edit prompt, producing a modified version. Common uses: background replacement, style changes, object removal (partial), lighting adjustments, color palette shifts.

Auto-resolution detection: If you don't specify --resolution when using --input-image, the script automatically matches the output resolution to the input image dimensions:

Input ≥ 3000px → 4K output
Input ≥ 1500px → 2K output
Input < 1500px → 1K output
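The thresholds above can be sketched in a few lines of Python. This is an illustration, not the skill's actual code: the function name pick_resolution is hypothetical, and it assumes the longer side of the image decides the tier, which the documentation quoted here does not spell out.

```python
def pick_resolution(width: int, height: int) -> str:
    """Map input dimensions to a resolution tier using the thresholds listed above."""
    # Assumption: the longer side of the input image decides the tier.
    longest = max(width, height)
    if longest >= 3000:
        return "4K"
    if longest >= 1500:
        return "2K"
    return "1K"


print(pick_resolution(4032, 3024))  # a typical phone photo
print(pick_resolution(1920, 1080))  # a Full HD screenshot
print(pick_resolution(800, 600))    # a small web image
```

If you want a specific tier regardless of the input size, passing --resolution explicitly sidesteps the detection entirely.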

The Draft → Iterate → Final Workflow

The skill's documentation is explicit about this workflow, and it matters for cost:

1. Draft (1K)   — verify composition, colors, subject matter
2. Iterate (1K) — adjust prompt until the result is right
3. Final (4K)   — only when the prompt is locked

4K costs roughly 2× the 1K price. Running 10 drafts at 1K costs about $1.30, and one 4K final adds $0.24, for a total of about $1.54 and a high-quality result. Running all 11 images at 4K would cost about $2.64.
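To budget a session before you start, the arithmetic is easy to script. The sketch below uses the approximate per-image prices quoted in this article; the names PRICE_PER_IMAGE and workflow_cost are made up for illustration, and actual billing may differ.

```python
# Approximate USD prices per image, as quoted in this article (not live pricing).
PRICE_PER_IMAGE = {"1K": 0.13, "2K": 0.13, "4K": 0.24}


def workflow_cost(drafts: int, draft_res: str = "1K", final_res: str = "4K") -> float:
    """Cost in USD of `drafts` draft images plus one final image."""
    total = drafts * PRICE_PER_IMAGE[draft_res] + PRICE_PER_IMAGE[final_res]
    return round(total, 2)


print(workflow_cost(10))                  # ten 1K drafts, then one 4K final
print(workflow_cost(10, draft_res="4K"))  # all eleven images at 4K
```

With these prices, ten 1K drafts plus a 4K final come to about $1.54, while rendering all eleven images at 4K comes to about $2.64.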

The filename convention yyyy-mm-dd-hh-mm-ss-descriptive-name.png keeps draft iterations organized:

2026-03-08-14-30-00-city-dawn-draft1.png
2026-03-08-14-31-00-city-dawn-draft2.png
2026-03-08-14-45-00-city-dawn-final.png
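If you generate filenames programmatically, a small helper keeps them consistent. This is a sketch only: timestamped_filename is a hypothetical name, and the slug rule (lowercase, with runs of non-alphanumeric characters collapsed to hyphens) is an assumption rather than something the skill prescribes.

```python
import re
from datetime import datetime


def timestamped_filename(description: str, when: datetime) -> str:
    """Build a yyyy-mm-dd-hh-mm-ss-descriptive-name.png filename."""
    # Assumption: lowercase the description and collapse anything that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{when:%Y-%m-%d-%H-%M-%S}-{slug}.png"


print(timestamped_filename("City dawn draft1", datetime(2026, 3, 8, 14, 30, 0)))
```

Passing datetime.now() as the second argument stamps each draft with the moment it was requested, so files sort chronologically in a directory listing.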

Resolution Guide

`--resolution`	Output Size	Best For	Cost/image
`1K` (default)	1024×1024	Drafts, social media, web	~$0.13
`2K`	2048×2048	Marketing assets, presentations	~$0.13
`4K`	4096×4096	Print, large displays, professional design	~$0.24

Resolution options must be uppercase: 1K, 2K, 4K. Lowercase will fail.
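If the resolution comes from user input or a config file, it may be worth normalizing it before it reaches the script. A minimal sketch, assuming only the three tiers listed above are valid; normalize_resolution is a hypothetical helper, not part of the skill.

```python
VALID_RESOLUTIONS = ("1K", "2K", "4K")


def normalize_resolution(value: str) -> str:
    """Uppercase a resolution string and reject anything outside the known tiers."""
    candidate = value.strip().upper()
    if candidate not in VALID_RESOLUTIONS:
        raise ValueError(
            f"Unsupported resolution {value!r}; expected one of {VALID_RESOLUTIONS}"
        )
    return candidate


print(normalize_resolution("4k"))
```

Failing early with a clear message is cheaper than discovering the problem after the command has been launched.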

Setup

Install the Skill

clawhub install nano-banana-pro

Get a Gemini API Key

Go to Google AI Studio and create an API key
Set it as an environment variable:

export GEMINI_API_KEY="your-key-here"

Or add it to your ~/.openclaw/openclaw.json so it persists across sessions.

Dependency Check

The script uses uv for dependency management (handles google-genai and pillow automatically). Verify your setup:

command -v uv        # must exist
[ -n "$GEMINI_API_KEY" ] && echo "GEMINI_API_KEY is set" # must print; avoids echoing the key itself
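The same two checks can be run from Python, for example at the start of an automation script. This is a rough sketch: preflight is a hypothetical helper, and it only confirms that uv is on the PATH and that the variable is non-empty, not that the key is valid.

```python
import os
import shutil


def preflight(env=None, which=shutil.which) -> list[str]:
    """Return a list of setup problems; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []
    if which("uv") is None:
        problems.append("uv not found on PATH")
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is empty or unset")
    return problems


for problem in preflight():
    print("Setup problem:", problem)
```

The env and which parameters exist only so the function can be exercised without touching the real environment.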

Practical Use Cases

Product Mockup Generation

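Create product images for e-commerce without a photography session. The commands in this section abbreviate the script path to generate_image.py; unless you run them from the skill's scripts directory, use the full ~/.openclaw/skills/nano-banana-pro/scripts/generate_image.py path shown earlier.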

# Draft: get the composition right
uv run generate_image.py --prompt "Minimalist skincare bottle on white marble, studio lighting, product photo" --filename "product-draft.png" --resolution 1K
 
# Final: once composition is confirmed
uv run generate_image.py --prompt "Minimalist skincare bottle on white marble, studio lighting, product photo, 4K ultra detailed" --filename "product-final.png" --resolution 4K

Marketing Asset Variations

Generate multiple variations of a scene for A/B testing:

# Variation A: outdoor
uv run generate_image.py --prompt "Person using laptop at outdoor cafe, sunny day, lifestyle photo" --filename "ad-outdoor.png" --resolution 2K
 
# Variation B: indoor
uv run generate_image.py --prompt "Person using laptop in modern home office, morning light, lifestyle photo" --filename "ad-indoor.png" --resolution 2K

Iterative Image Editing

Edit an existing photo in stages:

# Step 1: change the background
uv run generate_image.py --prompt "Replace background with a clean white studio" --filename "step1.png" --input-image "original.jpg"
 
# Step 2: adjust the subject
uv run generate_image.py --prompt "Make the subject look more formal, professional attire" --filename "step2.png" --input-image "step1.png"
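The staged approach, where each step's output becomes the next step's input, can also be scripted. The sketch below only builds the command lines, using the flags and script path shown in this article; build_command, chain_edits, and the stepN.png naming are illustrative choices, not part of the skill.

```python
from pathlib import Path

# Script location as shown earlier in this article.
SCRIPT = Path.home() / ".openclaw/skills/nano-banana-pro/scripts/generate_image.py"


def build_command(prompt, filename, input_image=None, resolution=None):
    """Assemble the argument list for one generate_image.py call."""
    cmd = ["uv", "run", str(SCRIPT), "--prompt", prompt, "--filename", filename]
    if resolution:
        cmd += ["--resolution", resolution]
    if input_image:
        cmd += ["--input-image", input_image]
    return cmd


def chain_edits(original, prompts):
    """Build one command per prompt, feeding each output into the next step."""
    commands, current = [], original
    for i, prompt in enumerate(prompts, start=1):
        output = f"step{i}.png"
        commands.append(build_command(prompt, output, input_image=current))
        current = output
    return commands


steps = chain_edits(
    "original.jpg",
    ["Replace background with a clean white studio",
     "Make the subject look more formal, professional attire"],
)
for cmd in steps:
    print(" ".join(cmd))
```

To execute the steps rather than print them, each list can be passed to subprocess.run(cmd, check=True), which stops the chain as soon as one edit fails. Because no --resolution is given, each step relies on the auto-detection described earlier.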

AI-Assisted Design Workflow

Integrate into a larger design pipeline where Claude handles both writing and visual asset creation for a document, slide deck, or web page.

Comparison: nano-banana-pro vs Alternatives

Feature	nano-banana-pro	DALL-E 3 skill	Stable Diffusion	Midjourney
Resolution options	1K / 2K / 4K	1K (1024px)	Varies	Up to 2K
Image-to-image	✅	❌ (generate only)	✅	✅ (remix)
Native OpenClaw integration	✅	✅	⚠️ (self-hosted)	❌
Auto resolution from input	✅	❌	❌	❌
API key required	✅ (Gemini)	✅ (OpenAI)	❌	✅
Script-based (no UI)	✅	✅	⚠️ (depends on setup)	❌

Common Failures and Fixes

Error: No API key provided. → Set GEMINI_API_KEY as an environment variable or pass --api-key your-key

Error loading input image: → Verify the --input-image path is correct and the file is a readable image format

Quota or 403 errors → The API key may lack image generation permissions, or your quota is exhausted. Check Google AI Studio.

uv: command not found → Install uv: curl -LsSf https://astral.sh/uv/install.sh | sh

Considerations

Cost is real — Unlike text generation where per-token costs are negligible at small scale, image generation at 4K ($0.24/image) adds up quickly. Use the draft workflow to keep costs predictable.
Image editing is approximate — Gemini's image-to-image is generative, not surgical. It may change things you wanted to keep, especially in complex compositions. For precise edits, a tool like Photoshop with specific masking is still better.
Output is always PNG — The script converts all model output to PNG. If you need JPEG for file size, convert after generation.
Text in images is good, not perfect — Gemini 3 Pro Image has notably strong text rendering compared to earlier models, but long text passages in images can still have errors.
Prompt sensitivity — Small prompt changes can produce dramatically different outputs. The iterative workflow helps manage this.

The Bigger Picture

The traditional workflow for generating images in a development or content process involves switching contexts — stopping what you're doing, opening a browser, typing into a web UI, downloading results, organizing files. nano-banana-pro collapses that into a single command your AI agent can run as part of a larger task.

This matters for automation: a Claude Code agent working on a presentation or product page can generate the images it needs without any human handoff to a separate tool. The agent proposes image ideas, generates drafts, iterates based on your feedback, and produces the final assets — all within the same conversation.

With 50,000+ downloads, it's clearly a workflow a lot of people are finding valuable.

View the skill on ClawHub: nano-banana-pro
