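pdf-ocr
Converts scanned PDFs to Word documents. Supports Chinese OCR, automatically crops headers and footers, preserves illustrations, and keeps colored chapter cover pages as images. Uses the Baidu OCR API (free quota of 1,000 calls per month). Triggered when the user asks to convert a scanned PDF to text or Word.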
Install via ClawdBot CLI:
clawdbot install dadaniya99/pdf-ocr
Grade: Good, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 20, 2026
Researchers can convert scanned PDFs of historical or printed academic papers into editable Word documents for easier citation, analysis, and sharing. This is useful for digitizing archives or processing conference proceedings with mixed content like charts and images.
Law firms and legal departments can transform scanned legal documents, such as contracts or case files, into searchable and editable formats. This aids in document review, editing, and compliance checks, though manual verification may be needed for complex layouts.
Companies can digitize scanned business reports, financial statements, or meeting minutes into Word for data extraction, updates, and distribution. It streamlines workflows by making old documents accessible and editable, with compression for efficient sharing.
Publishers and libraries can convert scanned books or manuscripts into editable formats for reprinting, translation, or digital archiving. This handles mixed content like covers and illustrations, with compression to reduce file sizes for distribution.
Government agencies can digitize scanned public records, forms, or historical documents into Word for improved accessibility, editing, and long-term preservation. Manual checks may be required for complex pages to ensure accuracy.
Offer a free tier of 1,000 pages per month, sized to the Baidu OCR API's free quota (1,000 calls per month), then charge for additional pages or premium features such as faster processing or enhanced accuracy. This attracts users with basic needs and converts them to paying customers.
License the skill to businesses, such as legal firms or publishers, for bulk document processing with custom integrations, support, and higher usage limits. This targets organizations with regular digitization needs and offers scalable solutions.
Package the skill as a white-label tool for other software providers, like document management systems or educational platforms, to embed OCR functionality into their products. This generates revenue through partnership agreements or royalties.
💬 Integration Tip
Ensure proper API key management and test with sample PDFs to adjust cropping ratios for optimal OCR accuracy before full deployment.
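As a rough illustration of the tip above, the sketch below reads the API key from an environment variable and turns header/footer cropping ratios into a pixel box. It is not the skill's actual code: the variable name BAIDU_OCR_API_KEY, the function names, and the default ratios of 0.08 are all assumptions made for this example.

```python
import os


def load_api_key(env_var="BAIDU_OCR_API_KEY"):
    """Read the OCR API key from the environment instead of hard-coding it.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running OCR")
    return key


def crop_box(width, height, top_ratio=0.08, bottom_ratio=0.08):
    """Return a (left, upper, right, lower) pixel box without header and footer.

    top_ratio and bottom_ratio are the fractions of the page height to cut
    from the top and bottom; the defaults are illustrative, not tuned values.
    """
    if not (0 <= top_ratio < 1 and 0 <= bottom_ratio < 1):
        raise ValueError("ratios must be in [0, 1)")
    if top_ratio + bottom_ratio >= 1:
        raise ValueError("ratios leave no page content")
    upper = round(height * top_ratio)
    lower = height - round(height * bottom_ratio)
    return (0, upper, width, lower)


if __name__ == "__main__":
    # Try a few ratios on a sample page size before processing a full PDF.
    for ratio in (0.05, 0.08, 0.12):
        print(ratio, crop_box(1240, 1754, ratio, ratio))
```

The returned tuple follows the (left, upper, right, lower) order that Pillow's Image.crop accepts, so it can be passed to a rendered page image if the pipeline uses Pillow. Testing a handful of ratios on sample pages first helps avoid cutting body text on documents with unusually tall headers.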
Scored Apr 18, 2026
Edit PDFs with natural-language instructions using the nano-pdf CLI.
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
Generate PPT with Baidu AI. Smart template selection based on content.
Create, inspect, and edit Microsoft Word documents and DOCX files with reliable styles, numbering, tracked changes, tables, sections, and compatibility check...
Create, inspect, and edit Microsoft Excel workbooks and XLSX files with reliable formulas, dates, types, formatting, recalculation, and template preservation...
Feishu Doc Manager (飞书文档管理器): Seamlessly publish Markdown content to Feishu Docs with automatic formatting. Solves key pain points: Markdown table conversion, permission management, batch writing.