pdf-parser-mineru
PDF document parsing tool based on local MinerU. It converts PDF to Markdown, JSON, and other machine-readable formats.
Install via ClawdBot CLI:
clawdbot install baokui/pdf-parser-mineru
Grade Fair: based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
https://opendatalab.github.io/MinerU/
Audited Apr 18, 2026 · audit v1.0
Generated Mar 21, 2026
Researchers can extract structured content from PDF papers, including formulas, tables, and images, into Markdown or JSON for analysis, citation, or dataset creation. This supports literature reviews and automated metadata extraction.
Engineering teams convert PDF manuals, specifications, or reports to Markdown for version control, collaborative editing, or web publishing. It preserves complex elements like tables and formulas, streamlining documentation workflows.
Businesses process scanned or garbled PDFs in multiple languages using OCR, converting them to machine-readable formats for archiving, translation, or content management systems. The tool supports 109 languages for global operations.
Law firms or compliance departments parse PDF contracts, regulations, or reports into JSON to extract structured data like clauses, tables, and metadata for review, auditing, or automated compliance checks.
Organizations automate the conversion of large volumes of PDF invoices, forms, or records to Markdown or JSON, enabling integration with databases or AI pipelines for data extraction and analysis without manual input.
Offer a cloud-based service where users upload PDFs and receive converted Markdown or JSON outputs via API, with tiered pricing based on volume or features like OCR and formula recognition. Targets businesses needing scalable document processing.
Sell on-premise or self-hosted licenses to large organizations for integrating the tool into their internal workflows, with customization, support, and training services. Ideal for industries with strict data privacy requirements.
Provide a free basic version for individual users or small teams with limited conversions, and charge for advanced features like batch processing, GPU acceleration, or priority support. Drives adoption and upsells to paid plans.
💬 Integration Tip
Use absolute file paths and make sure the system meets the memory requirements; start with the default hybrid-auto-engine backend for balanced performance before optimizing.
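As a rough illustration of the tip above, the following Python sketch resolves the input and output locations to absolute paths before assembling a MinerU command line. The helper name `build_mineru_command` and the example paths are hypothetical, and the `-p`, `-o`, and `-b` flags are assumptions about the local MinerU CLI; check them against `mineru --help` for the installed version. The sketch only builds the argument list and does not run anything.

```python
from pathlib import Path


def build_mineru_command(pdf_path, output_dir, backend="hybrid-auto-engine"):
    """Return an argument list for a local MinerU run, using absolute paths.

    Hypothetical helper: the flag names below are assumed, not confirmed
    by this listing.
    """
    # Resolve to absolute paths so the call does not depend on the
    # current working directory.
    pdf = Path(pdf_path).expanduser().resolve()
    out = Path(output_dir).expanduser().resolve()

    if pdf.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {pdf.name}")

    # Assumed flags: -p (input path), -o (output directory), -b (backend).
    return ["mineru", "-p", str(pdf), "-o", str(out), "-b", backend]


# Example with placeholder paths; prints the command instead of executing it.
cmd = build_mineru_command("docs/manual.pdf", "out")
print(" ".join(cmd))
```

Once the flags are confirmed, the returned list could be passed to `subprocess.run(cmd, check=True)`. Keeping the backend as a parameter makes it easy to start from the default and switch later.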
Scored Apr 19, 2026
Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.
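Skill Finder. Helps discover and install ClawHub Skills. Answers 'what skill can X', 'find a skill'. Triggers...
📰 RSS AI Reader: automatically fetches subscriptions, generates summaries with an LLM, and pushes them to multiple channels! Supports Claude/OpenAI for generating Chinese summaries, with delivery to Feishu/Telegram/Email. Trigger conditions: the user asks to subscribe to RSS, monitor a blog, fetch news, generate summaries, or set up scheduled fetching; "subscribe for me", "monitor this website", "push news daily"; anything related to RSS/Atom feeds.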
AI-powered PDF generator for legal docs, pitch decks, and reports. SAFEs, NDAs, term sheets, whitepapers. npx ai-pdf-builder. Works with Claude, Cursor, GPT, Copilot.
Monitor RSS and Atom feeds for content research. Track blogs, news sites, newsletters, and any feed source. Use when monitoring competitors, tracking industr...
Query, design, migrate, and optimize SQL databases. Use when working with SQLite, PostgreSQL, or MySQL — schema design, writing queries, creating migrations, indexing, backup/restore, and debugging slow queries. No ORMs required.