crawl-for-ai
Full web page scraping with JavaScript rendering via a local Crawl4AI instance, delivering clean markdown or detailed JSON including links and media.
Install via ClawdBot CLI:
clawdbot install angusthefuzz/crawl-for-ai
Local Crawl4AI instance for full web page extraction with JavaScript rendering.
Proxy (port 11234): clean output, OpenWebUI-compatible
[{page_content, metadata}]
Direct (port 11235): full output with all data
{results: [{markdown, html, links, media, ...}]}
# Via script
node {baseDir}/scripts/crawl4ai.js "url"
node {baseDir}/scripts/crawl4ai.js "url" --json
Script options:
--json: full JSON response
Output: Clean markdown from the page.
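Because the proxy and direct endpoints return different shapes, a caller that may talk to either one might normalize the response before using it. The following is a minimal sketch, not part of the crawl-for-ai script: `normalizeCrawlResponse` and the `{ text, source }` record are hypothetical names, and it assumes `markdown` arrives as a plain string in the direct shape.

```javascript
// Hypothetical helper (not part of crawl-for-ai): maps both documented
// response shapes onto one array of { text, source } records.
function normalizeCrawlResponse(body) {
  // Proxy shape (port 11234): [{ page_content, metadata }]
  if (Array.isArray(body)) {
    return body.map((doc) => ({
      text: doc.page_content ?? "",
      source: "proxy",
    }));
  }

  // Direct shape (port 11235): { results: [{ markdown, html, links, media, ... }] }
  if (body && Array.isArray(body.results)) {
    return body.results.map((result) => ({
      // Assumption: `markdown` is a plain string; anything else becomes "".
      text: typeof result.markdown === "string" ? result.markdown : "",
      source: "direct",
    }));
  }

  // Neither shape matched, so fail loudly rather than return partial data.
  throw new TypeError("Unrecognized Crawl4AI response shape");
}
```

Keeping the shape check in one place means the rest of a pipeline can read `text` without caring which port produced it.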
Required environment variable:
CRAWL4AI_URL: Your Crawl4AI instance URL (e.g., http://localhost:11235)
Optional:
CRAWL4AI_KEY: API key if your instance requires authentication
Uses your local Crawl4AI instance REST API. The auth header is only sent if CRAWL4AI_KEY is set.
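The conditional auth behavior described above can be sketched as follows. This is an illustration, not the script's actual code: `buildHeaders` is a hypothetical name, and the `Authorization: Bearer` scheme is an assumption, since the source only says that an auth header is sent when CRAWL4AI_KEY is set.

```javascript
// Hypothetical helper (not taken from crawl4ai.js): builds request headers
// and attaches credentials only when CRAWL4AI_KEY is configured.
function buildHeaders(env) {
  const headers = { "Content-Type": "application/json" };
  const key = env.CRAWL4AI_KEY;

  // An unset or empty key means the instance is treated as unauthenticated.
  if (key) {
    // Assumption: bearer-token scheme; the source does not name the header.
    headers["Authorization"] = `Bearer ${key}`;
  }
  return headers;
}
```

In a Node script this would typically be called as `buildHeaders(process.env)` and the result passed as the `headers` option of the HTTP request.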
Generated Feb 27, 2026
Extract product details, pricing, and customer reviews from competitor websites with dynamic content. Useful for price tracking and trend analysis in retail industries.
Scrape news articles, blog posts, and social media updates from JavaScript-heavy sites for real-time content curation and media monitoring services.
Gather research papers, datasets, and scholarly articles from academic portals and databases that use JavaScript for navigation and content loading.
Extract property details, images, and pricing from real estate websites with interactive maps and dynamic listings for market analysis and investment insights.
Collect stock prices, financial reports, and economic indicators from financial news sites and dashboards that rely on JavaScript for data visualization.
Offer a web scraping platform as a service with tiered pricing based on usage volume and features like JavaScript rendering. Target businesses needing regular data extraction without API limits.
Provide tailored scraping services for specific industries, such as e-commerce or finance, with custom scripts and data delivery formats like JSON or CSV.
Resell access to the local Crawl4AI instance as an API to developers and small businesses, offering endpoints for clean or full output with optional authentication.
💬 Integration Tip
Ensure the CRAWL4AI_URL environment variable points to your local instance. For simple content extraction, use the proxy endpoint (port 11234) to avoid the extra data returned by the direct endpoint (port 11235).
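A quick way to check this before crawling is to parse CRAWL4AI_URL and see which endpoint it targets. The sketch below is hypothetical (`describeEndpoint` is not part of the skill) and assumes the instance uses the ports listed in this document; a remapped port is reported as "unknown".

```javascript
// Hypothetical helper: validates CRAWL4AI_URL and reports whether it points
// at the proxy (11234) or direct (11235) port named in this document.
function describeEndpoint(env) {
  const raw = env.CRAWL4AI_URL;
  if (!raw) {
    throw new Error("CRAWL4AI_URL is not set");
  }

  // The URL constructor throws a TypeError if the value is malformed.
  const url = new URL(raw);

  // Assumption: default port layout; custom deployments may differ.
  const modes = { "11234": "proxy", "11235": "direct" };
  return {
    baseUrl: url.origin,
    mode: modes[url.port] ?? "unknown",
  };
}
```

For example, `describeEndpoint({ CRAWL4AI_URL: "http://localhost:11234" })` yields a `mode` of `"proxy"`, which is the lighter option for plain content extraction.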