jina-readerWeb content extraction via Jina AI Reader API. Three modes: read (URL to markdown), search (web search + full content), ground (fact-checking). Extracts clea...
Install via ClawdBot CLI:
clawdbot install ericsantos/jina-readerRequires:
Extract clean web content via Jina AI — without exposing your server IP.
{baseDir}/scripts/reader.sh "https://example.com/article"
{baseDir}/scripts/reader.sh --mode search "latest AI news 2025"
{baseDir}/scripts/reader.sh --mode ground "OpenAI was founded in 2015"
| Flag | Description | Default |
|------|-------------|---------|
| --mode | read, search, ground | read |
| --selector | CSS selector to extract specific region | — |
| --wait | CSS selector to wait for before extraction | — |
| --remove | CSS selectors to remove (comma-separated) | — |
| --proxy | Country code for geo-proxy (br, us, etc.) | — |
| --nocache | Force fresh content (skip cache) | off |
| --format | markdown, html, text, screenshot | markdown |
| --json | Raw JSON output | off |
# Extract article content
{baseDir}/scripts/reader.sh "https://blog.example.com/post"
# Extract specific section via CSS selector
{baseDir}/scripts/reader.sh --selector "article.main" "https://example.com"
# Remove nav and ads before extraction
{baseDir}/scripts/reader.sh --remove "nav,footer,.ads" "https://example.com"
# Search with JSON output
{baseDir}/scripts/reader.sh --mode search --json "AI enterprise trends"
# Read via Brazil proxy
{baseDir}/scripts/reader.sh --proxy br "https://example.com.br"
# Fact-check a claim
{baseDir}/scripts/reader.sh --mode ground "Tesla is the most valuable car company"
export JINA_API_KEY="jina_..."
Free tier: 10M tokens (no signup needed). Get key at https://jina.ai/reader/
Generated Mar 1, 2026
Businesses can use Jina Reader to extract clean content from competitor websites, news articles, and industry reports. The search mode helps gather the latest trends, while the fact-checking mode verifies claims about market data or product features, enabling data-driven strategic decisions.
Media companies and content platforms can automate the extraction of articles from various sources using the read mode with CSS selectors to focus on specific sections. The proxy option allows accessing geo-restricted content, and the JSON output facilitates integration into content management systems for real-time updates.
Researchers and educational institutions can leverage the ground mode to verify statements or claims by cross-referencing multiple web sources. The ability to extract clean markdown without ads or navigation aids in compiling references and analyzing data from diverse online materials efficiently.
E-commerce businesses can use Jina Reader to scrape product details, reviews, and pricing from competitor sites with the read mode and CSS selectors for specific elements. The IP protection ensures anonymity, and the JSON output supports automated data feeds for price comparison or inventory management.
Legal firms and compliance departments can extract and monitor regulatory updates, legal documents, or public statements from websites. The fact-checking mode helps verify compliance claims, while the search mode tracks changes in laws or industry standards, aiding in risk assessment and reporting.
Offer a free tier with limited tokens (e.g., 10M tokens) to attract users, then charge based on usage tiers for higher volumes or premium features like ReaderLM-v2. Revenue comes from pay-per-page fees for read mode and token-based pricing for search and ground modes, targeting developers and small businesses.
Provide customized solutions for large organizations needing web content extraction at scale, with features like dedicated support, enhanced security, and integration into existing workflows. Revenue is generated through annual or monthly subscriptions, with pricing based on data volume, custom selectors, and priority processing.
License the Jina Reader technology to other companies, such as market research firms or content platforms, allowing them to rebrand and use the extraction capabilities in their own products. Revenue comes from licensing fees, which can be flat rates or revenue-sharing agreements based on client usage.
💬 Integration Tip
Ensure the JINA_API_KEY is set as an environment variable and use the provided scripts with curl and jq for easy command-line integration into automated workflows.
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
AI-optimized web search via Tavily API. Returns concise, relevant results for AI agents.
This skill should be used when users need to search the web for information, find current content, look up news articles, search for images, or find videos. It uses DuckDuckGo's search API to return results in clean, formatted output (text, markdown, or JSON). Use for research, fact-checking, finding recent information, or gathering web resources.
Web search and content extraction via Brave Search API. Use for searching documentation, facts, or any web content. Lightweight, no browser required.
Search indexed Discord community discussions via Answer Overflow. Find solutions to coding problems, library issues, and community Q&A that only exist in Discord conversations.
Multi search engine integration with 17 engines (8 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required.