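nanobanana-ppt-skills
An AI-based skill that analyzes document content, plans the slides, and generates high-resolution PPT images in multiple styles, with optional transition videos and an interactive playback experience.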
Install via ClawdBot CLI:
clawdbot install ITRocker/nanobanana-ppt-skills
Required:
GEMINI_API_KEY: Google AI API key (used to generate PPT images)
Optional (for video features):
KLING_ACCESS_KEY: Kling AI Access Key
KLING_SECRET_KEY: Kling AI Secret Key
pip install google-genai pillow python-dotenv
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt-get install ffmpeg
/ppt-generator-pro
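Or tell Claude directly:
I want to generate a 5-page PPT from the following document, using the gradient glass style.
[Document content...]
Option A: Document path
User: Generate a PPT based on my-document.md
→ Use the Read tool to read the file content
Option B: Direct text
User: I want to generate a PPT about AI product design
Main content:
1. Current state analysis
2. Design principles
3. Case studies
Option C: Ask proactively
If the user has not provided content, ask:
"Please provide a document path or paste the document content directly"
Scan the styles/ directory and list the available styles:
# Automatically detect style files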
styles = ['gradient-glass.md', 'vector-illustration.md']
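If more than one style is available, use AskUserQuestion:
Question: Please choose a PPT style
Options:
- Gradient glass card style (tech feel, business presentations)
- Vector illustration style (warm, education and training)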
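The style scan described above could look something like the following. This is a minimal sketch, not the skill's actual code: the function name `list_styles` is hypothetical, and it assumes every `.md` file directly inside `styles/` is a style definition.

```python
from pathlib import Path


def list_styles(styles_dir: str = "styles") -> list[str]:
    """Return the sorted file names of all .md files directly inside styles_dir."""
    root = Path(styles_dir)
    if not root.is_dir():
        return []  # no styles directory: nothing to offer
    return sorted(p.name for p in root.glob("*.md") if p.is_file())
```

When the returned list has more than one entry, the style question is asked; with a single entry the question can presumably be skipped.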
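Use AskUserQuestion to ask:
Question: How many pages should the PPT have?
Options:
- 5 pages (5-minute talk)
- 5-10 pages (10-15 minute talk)
- 10-15 pages (20-30 minute talk)
- 20-25 pages (45-60 minute talk)
Question: Choose the image resolution
Options:
- 2K (2752x1536) - recommended, fast generation
- 4K (5504x3072) - high quality, suitable for printing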
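The page-count and resolution options above can be kept as small lookup tables. The sketch below is illustrative only: `PAGE_OPTIONS`, `RESOLUTIONS`, `suggest_pages`, and `megapixels` are hypothetical names, and the rounding rule for talk lengths that fall between two options is an assumption, not something the skill specifies.

```python
# (min_pages, max_pages) paired with the (min, max) talk length in minutes.
PAGE_OPTIONS = [
    ((5, 5), (5, 5)),
    ((5, 10), (10, 15)),
    ((10, 15), (20, 30)),
    ((20, 25), (45, 60)),
]

# Resolution label -> (width, height) in pixels.
RESOLUTIONS = {"2K": (2752, 1536), "4K": (5504, 3072)}


def suggest_pages(talk_minutes: float) -> tuple[int, int]:
    """Pick the first option whose upper talk length covers talk_minutes.

    Assumption: lengths between two options round up to the larger one,
    and anything beyond 60 minutes gets the largest option.
    """
    for pages, (_, max_minutes) in PAGE_OPTIONS:
        if talk_minutes <= max_minutes:
            return pages
    return PAGE_OPTIONS[-1][0]


def megapixels(resolution: str) -> float:
    """Return the pixel count of a resolution label in millions of pixels."""
    width, height = RESOLUTIONS[resolution]
    return width * height / 1_000_000
```

Since 4K doubles both dimensions of 2K, each 4K image carries four times as many pixels, which fits 2K being listed as the faster option.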
If the Kling AI keys are configured, ask:
Question: Generate transition videos?
Options:
- Images only (fast)
- Images + transition videos (full experience)
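Plan the content of each page according to the page range:
5-page version:
5-10 page version:
2-3. Introduction/background
4-7. Core content (3-4 key points)
8-9. Cases or supporting data
10-15 page version:
2-3. Introduction/table of contents
4-6. Chapter 1 (3 pages)
7-9. Chapter 2 (3 pages)
10-12. Chapter 3/case study
13-14. Data visualization
20-25 page version:
3-4. Introduction and background
5-8. Part 1 (4 pages)
9-12. Part 2 (4 pages)
13-16. Part 3 (4 pages)
17-19. Case studies
20-22. Data analysis and insights
23-24. Key findings and recommendations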
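One way to hold such an outline in code is a mapping from page ranges to section names. The sketch below encodes only the 10-15 page outline exactly as listed above; `OUTLINE_10_15` and `section_for` are hypothetical names, and pages the outline does not list simply return `None`.

```python
from typing import Optional

# Page ranges of the 10-15 page outline: (first_page, last_page) -> section.
OUTLINE_10_15 = {
    (2, 3): "Introduction/table of contents",
    (4, 6): "Chapter 1",
    (7, 9): "Chapter 2",
    (10, 12): "Chapter 3/case study",
    (13, 14): "Data visualization",
}


def section_for(page: int, outline: dict) -> Optional[str]:
    """Return the section a page belongs to, or None if the outline omits it."""
    for (first, last), name in outline.items():
        if first <= page <= last:
            return name
    return None
```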
Create the JSON file:
{
  "title": "Document title",
  "total_slides": 5,
  "slides": [
    {
      "slide_number": 1,
      "page_type": "cover",
      "content": "Title: AI Product Design Guide\nSubtitle: Building user-centered intelligent experiences"
    },
    {
      "slide_number": 2,
      "page_type": "content",
      "content": "Core principles\n- Simple and intuitive\n- Fast response\n- Transparent and controllable"
    },
    {
      "slide_number": 3,
      "page_type": "content",
      "content": "Design process\n1. User research\n2. Prototyping\n3. Testing and iteration"
    },
    {
      "slide_number": 4,
      "page_type": "data",
      "content": "User satisfaction\nBefore: 65%\nAfter: 92%\nImprovement: +27%"
    },
    {
      "slide_number": 5,
      "page_type": "content",
      "content": "Summary\n- User-centered\n- Continuous iteration\n- Data-driven decisions"
    }
  ]
}
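Before handing the plan to the generator, a quick consistency check can catch a mismatched page count or a misnumbered slide. This is a sketch under assumptions: `validate_plan` is a hypothetical helper, and the set of page types contains only the three values that appear in the example (`cover`, `content`, `data`); the real script may accept others.

```python
# Only the page types seen in the example plan; the real set may be larger.
KNOWN_PAGE_TYPES = {"cover", "content", "data"}


def validate_plan(plan: dict) -> list[str]:
    """Return a list of problems found in a slides plan; empty means consistent."""
    problems = []
    slides = plan.get("slides", [])
    if plan.get("total_slides") != len(slides):
        problems.append(
            f"total_slides is {plan.get('total_slides')} but {len(slides)} slides are listed"
        )
    for position, slide in enumerate(slides, start=1):
        if slide.get("slide_number") != position:
            problems.append(
                f"slide at position {position} has slide_number {slide.get('slide_number')}"
            )
        if slide.get("page_type") not in KNOWN_PAGE_TYPES:
            problems.append(f"slide {position} has unknown page_type {slide.get('page_type')!r}")
        if not str(slide.get("content", "")).strip():
            problems.append(f"slide {position} has empty content")
    return problems
```

Loading the file with `json.load` and passing the result to `validate_plan` gives an empty list for the example plan.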
Important: save this file to:
./slides_plan.json
.claude/skills/ppt-generator/slides_plan.json
Standalone mode:
cd /path/to/ppt-generator
Skill mode:
cd ~/.claude/skills/ppt-generator
python generate_ppt.py \
--plan slides_plan.json \
--style styles/gradient-glass.md \
--resolution 2K
Or use uv run (recommended):
uv run python generate_ppt.py \
--plan slides_plan.json \
--style styles/gradient-glass.md \
--resolution 2K
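Parameters:
--plan: path to the slides plan JSON file
--style: path to the style file
--resolution: resolution (2K or 4K)
--template: path to the HTML template (optional)
The script prints progress information:
✅ Environment variables loaded: /path/to/.env
📊 Starting PPT image generation...
Total pages: 5
Resolution: 2K (2752x1536)
Style: Gradient glass card style
🎨 Generating page 1 (cover page)...
Prompt generated
Calling Nano Banana Pro API...
✅ Page 1 generated successfully (32.5 s)
🎨 Generating page 2 (content page)...
✅ Page 2 generated successfully (28.3 s)
...
✅ All pages generated!
📁 Output directory: outputs/20260112_143022/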
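When the generator is driven from another Python program, the invocation shown earlier can be assembled as an argument list. The sketch below uses only the four flags documented above; `build_command` is a hypothetical helper, and nothing here reflects how `generate_ppt.py` parses its arguments internally.

```python
import shlex


def build_command(plan, style, resolution="2K", template=None, use_uv=False):
    """Build the generate_ppt.py argument list from the documented flags."""
    if resolution not in ("2K", "4K"):
        raise ValueError("resolution must be '2K' or '4K'")
    command = ["uv", "run"] if use_uv else []
    command += [
        "python", "generate_ppt.py",
        "--plan", plan,
        "--style", style,
        "--resolution", resolution,
    ]
    if template:
        command += ["--template", template]  # optional HTML template
    return command


# Render the list as a single shell-safe string for logging.
print(shlex.join(build_command("slides_plan.json", "styles/gradient-glass.md")))
```

The list can be passed to `subprocess.run(command, check=True)` from the skill directory; keeping it as a list avoids shell quoting problems with paths that contain spaces.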
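This is the core advantage of the Skill: I (Claude Code) analyze the generated PPT images and write a precise video prompt for each transition.
I read all generated images:
# Automatically read all images in the output directory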
slides = ['slide-01.png', 'slide-02.png', ...]
For each pair of adjacent images, I:
Example output:
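{
  "preview": {
    "slide_path": "outputs/.../slide-01.png",
    "prompt": "The frame holds the static composition of the cover while the 3D glass ring in the center rotates slowly..."
  },
  "transitions": [
    {
      "from_slide": 1,
      "to_slide": 2,
      "prompt": "The shot starts from the cover; the glass ring gradually deconstructs and splits into transparent fragments..."
    }
  ]
}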
I save the generated prompts to:
outputs/TIMESTAMP/transition_prompts.json
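The shape of this file follows from the slide list: one preview entry for the first slide and one transition per adjacent pair. The sketch below builds that skeleton with empty prompts, which are then filled in after looking at the images; `build_prompt_skeleton` is a hypothetical helper, and the field names are taken from the example output above.

```python
import json


def build_prompt_skeleton(slide_paths: list[str]) -> dict:
    """Return a preview entry plus one transition per adjacent slide pair."""
    if not slide_paths:
        raise ValueError("at least one slide is required")
    return {
        "preview": {"slide_path": slide_paths[0], "prompt": ""},
        "transitions": [
            {"from_slide": number, "to_slide": number + 1, "prompt": ""}
            for number in range(1, len(slide_paths))
        ],
    }


skeleton = build_prompt_skeleton(["slide-01.png", "slide-02.png", "slide-03.png"])
# ensure_ascii=False keeps non-ASCII prompt text readable in the saved file.
print(json.dumps(skeleton, ensure_ascii=False, indent=2))
```

A deck of N slides yields N - 1 transitions.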
Key advantages:
If the user chooses to generate videos, use the prompts file generated in stage 4:
python generate_ppt_video.py \
--slides-dir outputs/20260112_143022/images \
--output-dir outputs/20260112_143022_video \
--prompts-file outputs/20260112_143022/transition_prompts.json
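Generated content:
Looping preview video for the first page (preview.mp4)
Transition videos (transition_01_to_02.mp4, etc.)
Interactive player (video_index.html)
Full video (full_ppt_video.mp4)
✅ PPT generated successfully!
📁 Output directory: outputs/20260112_143022/
🖼️ PPT images: outputs/20260112_143022/images/
🎬 Player page: outputs/20260112_143022/index.html
Open the player page: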
open outputs/20260112_143022/index.html
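Player shortcuts:
- ← → keys: switch pages
- ↑ Home: go to the first page
- ↓ End: jump to the last page
- Space: pause/resume autoplay
- ESC: toggle fullscreen
- H: hide/show controls
✅ PPT video generated successfully!
📁 Output directory: outputs/20260112_143022_video/
🖼️ PPT images: outputs/20260112_143022/images/
🎬 Transition videos: outputs/20260112_143022_video/videos/
🎮 Interactive player: outputs/20260112_143022_video/video_index.html
🎥 Full video: outputs/20260112_143022_video/full_ppt_video.mp4
Open the interactive player: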
open outputs/20260112_143022_video/video_index.html
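Playback logic:
1. First page: play the looping preview video
2. Press the right arrow key → play the transition video → show the target page image (2 seconds)
3. Press the right arrow key again → play the next transition → show the next page image
4. And so on...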
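The playback steps above amount to a fixed sequence of events. The sketch below enumerates that sequence for a viewer who keeps pressing the right arrow key from the first page. It is a model of the described behavior, not the player's code: `playback_events` is a hypothetical name, and the file names follow the patterns used elsewhere in this document.

```python
def playback_events(total_slides: int, hold_seconds: float = 2.0):
    """Yield (event, file, seconds) tuples in the order the viewer sees them."""
    # The first page loops its preview until the viewer moves on.
    yield ("loop_preview", "preview.mp4", None)
    for page in range(1, total_slides):
        # Each key press plays the transition into the next page...
        yield ("play_transition", f"transition_{page:02d}_to_{page + 1:02d}.mp4", None)
        # ...then shows that page's image for hold_seconds.
        yield ("show_image", f"slide-{page + 1:02d}.png", hold_seconds)


for event in playback_events(3):
    print(event)
```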
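Video player shortcuts:
- ← → keys: previous/next page (with transition)
- Space: play/pause the current video
- ESC: toggle fullscreen
- H: hide/show controls
The Skill looks for the .env file in the following order:
./ppt-generator/.env
The directory containing .git or .env
~/.claude/skills/ppt-generator/.env
# Google AI API key (required)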
GEMINI_API_KEY=your_gemini_api_key_here
# Kling AI API keys (optional, for video features)
KLING_ACCESS_KEY=your_kling_access_key_here
KLING_SECRET_KEY=your_kling_secret_key_here
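A lookup like the one described above can be sketched as "first existing file wins", followed by a check of which keys are present. The helpers below (`first_existing`, `parse_env_text`, `check_keys`) are hypothetical and use only the standard library; the parser handles plain `KEY=value` lines only, whereas python-dotenv, which the install step already pulls in, also copes with quoting.

```python
from pathlib import Path

REQUIRED_KEYS = ("GEMINI_API_KEY",)
VIDEO_KEYS = ("KLING_ACCESS_KEY", "KLING_SECRET_KEY")


def first_existing(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_keys(values):
    """Return (missing required keys, whether both video keys are present)."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    video_enabled = all(values.get(key) for key in VIDEO_KEYS)
    return missing, video_enabled
```

Video features need both Kling keys, so a file that sets only one of them still reports video as disabled.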
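1. API key not set
Error: ⚠️ .env file not found, falling back to system environment variables
GEMINI_API_KEY environment variable is not set
Solution:
1. Create a .env file
2. Add GEMINI_API_KEY=your_key_here
2. Missing Python dependencies
Error: ModuleNotFoundError: No module named 'google.genai'
Solution: pip install google-genai pillow python-dotenv
3. FFmpeg not installed
Error: ❌ FFmpeg is not available!
Solution: brew install ffmpeg # macOS
sudo apt-get install ffmpeg # Ubuntu
4. API call failed
Error: API call timed out or failed
Solution:
1. Check the network connection
2. Confirm the API key is valid
3. Retry later
5. Video generation failed
Error: Kling AI keys not configured
Solution:
1. If you only need images, skip the video generation step
2. If you need videos, configure KLING_ACCESS_KEY and KLING_SECRET_KEY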
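Most of the problems above can be detected before any API call is made. The following preflight check is a sketch, not part of the skill: `module_available` and `preflight` are hypothetical helpers, and the assumption that FFmpeg and the Kling keys matter only when videos are requested comes from the descriptions in this document.

```python
import importlib.util
import os
import shutil


def module_available(name: str) -> bool:
    """Return True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. "google") is missing.
        return False


def preflight(env=None, want_video: bool = False) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    for module in ("google.genai", "PIL", "dotenv"):
        if not module_available(module):
            problems.append(f"Python module {module} is missing")
    if want_video:
        if shutil.which("ffmpeg") is None:
            problems.append("ffmpeg is not on PATH")
        for key in ("KLING_ACCESS_KEY", "KLING_SECRET_KEY"):
            if not env.get(key):
                problems.append(f"{key} is not set")
    return problems
```

Running `preflight(want_video=True)` before the video stage reports every missing piece at once instead of failing on the first one.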
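Gradient glass card style (gradient-glass.md)
Visual characteristics:
Use cases:
Vector illustration style (vector-illustration.md)
Visual characteristics:
Use cases:
Create a new .md file in the styles/ directory
Nano Banana Pro (image generation):
gemini-3-pro-image-preview
16:9
IMAGE
Kling AI (video generation):
FFmpeg (video composition):
Generation speed:
File size: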
Images-only mode:
outputs/20260112_143022/
├── images/
│ ├── slide-01.png
│ ├── slide-02.png
│ └── ...
├── index.html # Image player
└── prompts.json # Prompt record
Video mode:
outputs/20260112_143022_video/
├── videos/
│ ├── preview.mp4 # Looping preview for the first page
│ ├── transition_01_to_02.mp4
│ ├── transition_02_to_03.mp4
│ └── ...
├── video_index.html # Interactive player
└── full_ppt_video.mp4 # Full video
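The video-mode layout above can be turned into a completeness check for a finished run. This is a sketch with hypothetical helper names (`expected_video_files`, `missing_files`); it assumes one transition file per adjacent slide pair, which is what the `transition_01_to_02.mp4`, `transition_02_to_03.mp4` pattern suggests.

```python
from pathlib import Path


def expected_video_files(total_slides: int) -> list[str]:
    """List the relative paths a complete video-mode output should contain."""
    files = ["videos/preview.mp4"]
    files += [
        f"videos/transition_{page:02d}_to_{page + 1:02d}.mp4"
        for page in range(1, total_slides)
    ]
    files += ["video_index.html", "full_ppt_video.mp4"]
    return files


def missing_files(output_dir: str, total_slides: int) -> list[str]:
    """Return the expected files that are absent from output_dir."""
    root = Path(output_dir)
    return [rel for rel in expected_video_files(total_slides) if not (root / rel).is_file()]
```

For example, `missing_files("outputs/20260112_143022_video", 5)` returns an empty list when all four transitions, the preview, the player page, and the full video are present.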
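prompts.json shows the generation logic; it can be adjusted manually before regenerating.
User input:
I need a 5-page PPT based on these meeting minutes, using the vector illustration style.
Meeting topic: Q1 product roadmap planning
Participants: product team
Discussion items:
1. User feedback summary
2. New feature priorities
3. Technical feasibility assessment
4. Q1 milestones
5. Next action items
Skill execution:
User input:
Generate a 15-page PPT from the AI-Product-Design.md document, using the gradient glass style, with transition videos.
Skill execution: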
MIT License
ARCHITECTURE.md
API_MANAGEMENT.md
ENV_SETUP.md
SECURITY.md
README.md
Generated Mar 1, 2026
HR departments can use this skill to create engaging training materials on company policies, software tutorials, or compliance topics. The AI analyzes training documents and generates structured slides with professional visuals, making onboarding and continuous education more effective.
Educators and professors can transform lecture notes or research papers into visually appealing presentations for classrooms or conferences. The skill's content planning adapts to different page ranges, supporting concise summaries or detailed academic breakdowns.
Marketing teams can quickly generate pitch decks for product launches, client proposals, or investor meetings. The gradient glass style adds a modern, tech-savvy look, while video transitions create dynamic demos that stand out in competitive pitches.
Event organizers or speakers can use this to prepare keynote speeches with high-impact visuals. The skill handles large presentations (20-25 pages) with structured sections, and optional video features enable seamless transitions for live or recorded events.
Managers and analysts can convert data-heavy reports into digestible presentations for stakeholder meetings. The AI extracts key points from documents, and the vector illustration style is suitable for warm, internal communications with clear data visualization.
Offer a free tier with basic image generation and limited styles, then charge subscriptions for advanced features like video transitions, 4K resolution, and premium templates. Target individual professionals, educators, and small teams needing quick presentation tools.
Sell annual licenses to corporations, universities, or government agencies for bulk usage. Include custom integrations, dedicated support, and enhanced security features. This model leverages the skill's ability to handle large-scale, structured presentations.
Provide API access to developers who want to embed presentation generation into their own applications, such as e-learning platforms or content management systems. Charge based on API calls, with tiers for different volumes and features like video generation.
💬 Integration Tip
Ensure environment variables like GEMINI_API_KEY are securely configured, and consider automating dependency installation in CI/CD pipelines for smoother deployment in team environments.