xpr-web-scraping
Tools for fetching and extracting cleaned text, metadata, and links from single or multiple web pages, with format options and link filtering.

Install via ClawdBot CLI:
clawdbot install paulgnz/xpr-web-scraping

You have web scraping tools for fetching and extracting data from web pages:

Single page:
scrape_url — fetch a URL and get cleaned text content plus metadata (title, description, link count)

Link discovery:
extract_links — fetch a page and extract all links with their text and type (internal/external); use the pattern parameter to filter by regex (e.g. "\.pdf$" for PDF links)

Multi-page research:
scrape_multiple — fetch up to 10 URLs in parallel for comparison/research

Best practices:
Use store_deliverable to save scraped content as job evidence.

Generated Mar 1, 2026
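To make "cleaned text content + metadata" concrete, here is a minimal stand-alone sketch of that kind of extraction using only Python's standard library. The output fields (title, text, link count) mirror the tool description above, but the parser itself is an illustration, not the tool's actual implementation.

```python
# Sketch: extract title, visible text, and link count from raw HTML,
# approximating the shape of scrape_url's output. Sample HTML is invented.
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title>, visible text, and link count from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self.text_parts = []
        self.link_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            self.text_parts.append(data.strip())


raw = "<html><head><title>Demo</title></head><body><p>Hello</p><a href='/x'>link</a></body></html>"
page = PageExtractor()
page.feed(raw)
result = {"title": page.title, "text": " ".join(page.text_parts), "links": page.link_count}
print(result)
```

In practice you would hand this work to scrape_url rather than parse HTML yourself; the sketch only shows what "cleaned" output of this shape looks like.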
Scrape e-commerce product pages daily to track competitor pricing and promotions. Use scrape_multiple for parallel monitoring of key competitors and store_deliverable to log changes over time for strategic adjustments.
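The parallel-monitoring pattern behind scrape_multiple can be sketched with a thread pool capped at 10 workers, matching the tool's documented URL limit. The fetch function and price table below are invented stand-ins for real scrape calls.

```python
# Sketch: fetch several competitor pages in parallel and find the
# cheapest price. FAKE_PRICES stands in for a live scrape of each URL.
from concurrent.futures import ThreadPoolExecutor

COMPETITOR_URLS = [
    "https://example.com/shop-a/widget",
    "https://example.com/shop-b/widget",
    "https://example.com/shop-c/widget",
]

# Invented price table; a real implementation would parse scraped text.
FAKE_PRICES = {url: 9.99 + i for i, url in enumerate(COMPETITOR_URLS)}


def fetch_price(url: str) -> tuple[str, float]:
    # In real use this would call scrape_url and extract the price.
    return url, FAKE_PRICES[url]


with ThreadPoolExecutor(max_workers=10) as pool:  # scrape_multiple's cap
    prices = dict(pool.map(fetch_price, COMPETITOR_URLS))

cheapest = min(prices, key=prices.get)
print(cheapest, prices[cheapest])
```

Logging each day's `prices` dict via store_deliverable gives the change history the use case describes.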
Extract latest articles from financial news websites using scrape_url with format='markdown' to preserve structure. Combine with extract_links to discover related reports, enabling real-time market trend analysis for investment decisions.
Scrape multiple scholarly articles or PDF links from university websites using extract_links with pattern='\.pdf$'. Use scrape_url to fetch text content for analysis, supporting literature reviews or data mining in academic projects.
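The pattern='\.pdf$' filter used here behaves like an ordinary regex search over each URL, which can be reproduced with the standard re module. The link list below is invented sample data.

```python
# Sketch: replicate extract_links' pattern filter for PDF links.
import re

links = [
    "https://example.edu/papers/smith2024.pdf",
    "https://example.edu/about",
    "https://example.edu/theses/lee2023.pdf",
]

pdf_links = [u for u in links if re.search(r"\.pdf$", u)]
print(pdf_links)
```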
Fetch property details from real estate portals using scrape_url with format='text' to extract cleaned data like prices and descriptions. Apply scrape_multiple to gather listings from various sources for market comparison and client reports.
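Once scrape_url has reduced a listing page to cleaned text with format='text', pulling out figures like prices is plain regex work. The listing text below is invented for illustration.

```python
# Sketch: extract dollar amounts from cleaned listing text.
import re

listing_text = "3 bed / 2 bath townhouse. Asking $425,000. HOA $310/month."

matches = re.findall(r"\$([\d,]+)", listing_text)
amounts = [int(m.replace(",", "")) for m in matches]
print(amounts)
```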
Scrape job postings from career sites to analyze hiring trends and skill demands. Use extract_links to filter internal job pages and scrape_url to extract key details, aiding recruitment agencies or HR departments in strategy planning.
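The internal/external distinction extract_links reports can be sketched by comparing hostnames with urllib.parse: a link is internal when it is relative or shares the base page's host. The sample URLs are invented.

```python
# Sketch: classify links as internal to a careers site or external.
from urllib.parse import urlparse

base = "https://careers.example.com/jobs"
links = [
    "https://careers.example.com/jobs/123",
    "https://linkedin.com/company/example",
    "/jobs/456",  # relative links are internal by definition
]


def is_internal(link: str, base_url: str) -> bool:
    host = urlparse(link).netloc
    return host == "" or host == urlparse(base_url).netloc


internal = [l for l in links if is_internal(l, base)]
print(internal)
```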
Offer subscription-based access to scraped data feeds, such as daily price updates or news summaries. Revenue comes from monthly or annual fees charged to clients like retailers or analysts who rely on timely, structured data.
Provide tailored web scraping services for specific client needs, such as one-time data extraction for market research or ongoing monitoring. Revenue is generated through project-based contracts or hourly consulting rates.
Integrate the scraping tools into third-party platforms via APIs, enabling partners to offer data extraction features. Revenue streams include licensing fees, usage-based pricing, or revenue sharing from enhanced platform capabilities.
💬 Integration Tip
Combine scrape_url with store_deliverable to automatically save extracted content as evidence in job workflows, ensuring data traceability. Follow best practices such as rate limiting when scraping at volume.
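The rate-limiting half of this tip amounts to spacing out successive scrape calls with a fixed delay. In the sketch below, do_scrape stands in for the real scrape_url call, and the one-second delay is an arbitrary illustrative value.

```python
# Sketch: enforce a fixed delay between successive scrape requests.
import time

DELAY_SECONDS = 1.0
urls = ["https://example.com/a", "https://example.com/b"]

timestamps = []


def do_scrape(url: str) -> None:
    timestamps.append(time.monotonic())  # record when each call fires


for i, url in enumerate(urls):
    if i > 0:
        time.sleep(DELAY_SECONDS)  # gap between consecutive requests
    do_scrape(url)

gap = timestamps[1] - timestamps[0]
print(f"gap between requests: {gap:.2f}s")
```

A real deployment would also honor robots.txt and back off on HTTP 429 responses, which this sketch does not attempt.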
Related skills:
- Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with w...
- Playwright-based web scraping OpenClaw Skill with anti-bot protection. Successfully tested on complex sites like Discuss.com.hk.
- Browser automation and web scraping with Playwright. Forms, screenshots, data extraction. Works standalone or via MCP. Testing included.
- Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output.
- Automate web tasks like form filling, data scraping, testing, monitoring, and scheduled jobs with multi-browser support and retry mechanisms.
- Web scraping and content comprehension agent — multi-strategy extraction with cascade fallback, news detection, boilerplate removal, structured metadata, and...