photo-ocr
OCR for photos and images using MinerU. Extract text from photographs, screenshots, camera captures, and image files with high accuracy. Features: image OCR...
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list: https://mineru.net
Audited Apr 17, 2026 · audit v1.0
Generated May 22, 2026
Automatically extract text from photos of receipts and invoices, converting them into structured data for expense tracking or accounting. Ideal for small business owners and freelancers who need to digitize paper receipts quickly.
Use the skill to capture notes from whiteboard photos or text from signs and posters, converting them into editable Markdown or text. Useful for meeting notes, brainstorming sessions, or translating signage.
Extract text from screenshots of web pages, presentations, or error messages. Developers and researchers can quickly capture code snippets, quotes, or data without manual typing.
Digitize photos of paper documents, such as contracts, letters, or forms, for storage or search. Archives and libraries can use this to preserve and index physical documents.
Extract text from images containing multiple languages, such as multilingual menus or international signs. Supports English, Chinese, and other languages, making it useful for travel and global business.
Offer basic OCR through flash-extract for free, with no token required but with limits on file size and page count. Charge for the premium extract mode, which adds higher accuracy, VLM mode, and no size limits, sold through token purchases.
Provide mineru-open-api as a service, charging developers per API call or by subscription to integrate OCR into their apps. The CLI tool can be used directly or via API wrappers.
License the skill to enterprises for automated document processing, such as invoice scanning or data entry. Offer custom integrations, on-premises deployment, and dedicated support.
💬 Integration Tip
Start with flash-extract for quick testing without a token. For production, authenticate with MINERU_TOKEN and use extract for higher accuracy on complex images.
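The tip above can be sketched as a small helper that picks the mode based on whether a token is configured. This is a minimal sketch, not the skill's documented invocation: the names `mineru-open-api`, `flash-extract`, `extract`, and `MINERU_TOKEN` come from this listing, but the helper `build_ocr_command` is hypothetical, and passing the image path as a single positional argument is an assumption that should be checked against the CLI's own help output.

```python
import os
import shlex


def build_ocr_command(image_path, env=None):
    """Build an argv list for an OCR call, choosing the mode by token presence.

    Assumption: both subcommands accept the image path as one positional
    argument. Verify the real argument shape before relying on this.
    """
    env = os.environ if env is None else env
    # With a token set, prefer the higher-accuracy mode; otherwise fall back
    # to the token-free quick mode described in the tip.
    mode = "extract" if env.get("MINERU_TOKEN") else "flash-extract"
    return ["mineru-open-api", mode, image_path]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_ocr_command("receipt.jpg")))
```

Returning the argument list rather than executing it keeps the mode choice easy to inspect; the list can be handed to `subprocess.run` once the argument shape has been confirmed.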
Scored Jun 19, 2026