mineru
Parse PDF/Word/PPT/images into Markdown with the MinerU API, with support for formulas, tables, and OCR. Suited to paper parsing and document extraction.
Install via ClawdBot CLI:
clawdbot install EasonAI-5589/mineru
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Sends data to undocumented external endpoint (potential exfiltration)
POST → https://mineru.net/api/v4/extract/task
Calls external URL not in known-safe list
https://mineru.net/
Uses known external API (expected, informational)
arxiv.org
AI Analysis
The skill interacts with a documented external API (MinerU) for its stated purpose of document parsing, which is consistent with its description. While it sends user documents to an external service, this is expected functionality for a document parsing tool, and the API endpoint is clearly documented as part of the service. No credential harvesting, hidden instructions, or obfuscation patterns were detected.
Generated Feb 23, 2026
Researchers can automatically parse arXiv PDFs into structured Markdown with LaTeX formulas and tables preserved, enabling quick literature reviews and data extraction without manual copying. This is ideal for summarizing papers, building knowledge bases, or preparing annotated bibliographies.
Law firms can convert scanned contracts, Word documents, or PDFs into searchable Markdown text while retaining complex layouts and tables, streamlining document review and analysis for cases or compliance checks. OCR support handles mixed-language content in legal materials.
Companies can extract data from financial reports, PowerPoint presentations, and Word documents to create structured summaries or integrate content into databases, improving efficiency in reporting and decision-making processes. Batch processing allows handling multiple quarterly reports at once.
Libraries or museums can digitize historical documents, books, and images by converting them into Markdown with OCR, preserving formulas and tables for digital archives or online publications. This supports heritage preservation and accessibility initiatives.
Engineering teams can parse technical manuals, diagrams in PDFs, or PPT slides into Markdown to update documentation, extract specifications, or feed into knowledge management systems, ensuring accurate retention of complex tables and formulas.
Offer tiered subscription plans based on usage quotas, such as number of pages or files processed per month, with premium tiers for higher concurrency or advanced features like VLM model access. Revenue comes from recurring payments from businesses and researchers.
Provide custom licenses to large organizations for on-premise deployment or dedicated API instances, including support, training, and integration services. This targets industries like legal or finance with high-volume, secure document processing needs.
Implement a usage-based pricing model where users pay per document or page processed, appealing to occasional users or small projects. Integrate with platforms like OpenClaw for seamless billing and low-barrier access to document parsing capabilities.
💬 Integration Tip
Set the MINERU_TOKEN environment variable in OpenClaw config for easy authentication, and use batch processing to optimize API quota when handling multiple files.
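As a rough illustration of the tip above, the sketch below reads MINERU_TOKEN from the environment and prepares a POST to the endpoint named in the audit findings. It only builds the request and does not send it. The Bearer authorization scheme, the JSON body with a single `url` field, and the helper name `build_task_request` are assumptions made for illustration; this page does not confirm them, so check the MinerU API documentation before relying on any of it.

```python
import json
import os
import urllib.request

# Endpoint as listed in the audit findings on this page.
MINERU_TASK_URL = "https://mineru.net/api/v4/extract/task"


def build_task_request(document_url, token=None):
    """Build (but do not send) a POST request for one document.

    Hypothetical helper: the Bearer header and the "url" payload
    field are assumptions, not confirmed by this page.
    """
    # Fall back to the MINERU_TOKEN environment variable from the tip.
    token = token or os.environ.get("MINERU_TOKEN")
    if not token:
        raise RuntimeError("MINERU_TOKEN is not set")
    payload = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        MINERU_TASK_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would actually transmit the document URL to the external MinerU service, which is the behavior the audit findings flag. Keeping request construction separate from sending makes it easy to inspect what will leave the machine first.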
Scored Apr 18, 2026
Audited Apr 16, 2026 · audit v1.0
Connect to 100+ APIs (Google Workspace, Microsoft 365, GitHub, Notion, Slack, Airtable, HubSpot, etc.) with managed OAuth. Use this skill when users want to...
Skill Finder. Discover and install ClawHub Skills. Answers 'what skill can X', 'find a skill'. Trigger...
Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.
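Web content fetching tool | When regular crawlers are filtered, uses alternative services to fetch web page content. Supports: 1) r.jina.ai - most stable 2) markdown.new - Cloudflare-specific 3) defuddle.md - fallback option. Trigger phrases: get web page content, web page to markdown, content scraping, fetch webpage, bypas...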
Web content extraction via Jina AI Reader API. Three modes: read (URL to markdown), search (web search + full content), ground (fact-checking). Extracts clea...
Extract text from PDFs with OCR support. Perfect for digitizing documents, processing invoices, or analyzing content. Zero dependencies required.