browser-use-local

Automate browser actions locally via the browser-use CLI/Python: open pages, click/type, take screenshots, extract HTML/links, debug sessions, and capture login QR codes.
Install via ClawdBot CLI:
clawdbot install fengjiajie/browser-use-local
This skill drives a local browser tool; the OpenClaw browser may fail if no supported system browser is present. Use the --session flag to keep state across commands.
1) Open
browser-use --session demo open https://example.com
2) Inspect (sometimes state returns 0 elements on heavy/JS sites)
browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
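When jq isn't available, the same summary can be done in a few lines of Python. This is a minimal sketch; it assumes the state JSON shape implied by the jq filter above (a top-level data object holding url, title, and an elements array):

```python
import json

def summarize_state(raw: str) -> dict:
    """Summarize browser-use --json state output: url, title, element count."""
    data = json.loads(raw).get("data") or {}
    return {
        "url": data.get("url"),
        "title": data.get("title"),
        "elements": len(data.get("elements") or []),
    }

# Usage with a synthetic state payload:
sample = '{"data": {"url": "https://example.com", "title": "Example", "elements": []}}'
print(summarize_state(sample))
# → {'url': 'https://example.com', 'title': 'Example', 'elements': 0}
```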
3) Screenshot (always works; best debugging primitive)
browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png
4) HTML for link discovery (works even when state is empty)
browser-use --session demo --json get html > /tmp/page_html.json
python3 - <<'PY'
import json,re
html=json.load(open('/tmp/page_html.json')).get('data',{}).get('html','')
urls=set(re.findall(r"https?://[^\s\"'<>]+", html))
for u in sorted([u for u in urls if any(k in u for k in ['demo','login','console','qr','qrcode'])])[:200]:
    print(u)
PY
5) Lightweight DOM queries via JS (useful when state is empty)
browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"
Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.
Create .env (or export env vars) with:
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://api.moonshot.cn/v1
Then run the bundled script:
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py
Kimi/Moonshot quirks observed in practice (fixes):
- temperature must be 1 for kimi-k2.5.
- frequency_penalty must be 0 for kimi-k2.5.
- remove_defaults_from_schema=True
- remove_min_items_from_schema=True
If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two schema flags are the first thing to set.
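As a quick reference, here is the kimi-k2.5 parameter set from the notes above, as a Python sketch. The flag names are taken verbatim from this document; exactly where they are passed (LLM constructor vs. agent settings) depends on your browser-use version, so treat the structure as an assumption:

```python
# Parameters that avoided 400 errors with kimi-k2.5 in practice (per notes above).
llm_kwargs = {
    "model": "kimi-k2.5",    # assumed model id
    "temperature": 1,        # must be 1 for kimi-k2.5
    "frequency_penalty": 0,  # must be 0 for kimi-k2.5
}
schema_flags = {
    "remove_defaults_from_schema": True,   # fixes: keyword 'default' is not allowed
    "remove_min_items_from_schema": True,  # fixes: min_items unsupported
}
```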
To capture a login QR code:
1) Screenshot the page and crop candidate regions (fast, robust).
2) If HTML contains data:image/png;base64,..., extract and decode it.
Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
--in /home/node/.openclaw/workspace/login.png \
--outdir /home/node/.openclaw/workspace/qr_crops
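For intuition, one plausible way to generate candidate crops is a sliding grid of fixed-size squares over the screenshot. This sketch is an assumption about the approach, not crop_candidates.py's actual logic:

```python
def candidate_boxes(width, height, size=400, step=200):
    """Yield (left, top, right, bottom) square crops tiling an image.

    Overlapping boxes (step < size) make it likely that at least one crop
    contains the whole QR code. Boxes near the edges may extend past the
    image; crop libraries typically clamp or pad them.
    """
    boxes = []
    for top in range(0, max(height - size, 0) + 1, step):
        for left in range(0, max(width - size, 0) + 1, step):
            boxes.append((left, top, left + size, top + size))
    return boxes

# Usage: a 800x600 screenshot yields a 3x2 grid of 400px crops.
print(candidate_boxes(800, 600))
```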
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
--in /tmp/page_html.json \
--outdir /home/node/.openclaw/workspace/data_imgs
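As an inline illustration of the data-image route, this hedged sketch pulls data:image/png;base64 payloads out of page HTML and decodes them to raw PNG bytes. It is not the bundled extract_data_images.py, whose behavior may differ:

```python
import base64
import re

# Match base64 payloads embedded as data URIs in HTML attributes.
DATA_IMG = re.compile(r"data:image/png;base64,([A-Za-z0-9+/=]+)")

def extract_png_bytes(html: str) -> list[bytes]:
    """Return decoded PNG payloads found in data: URIs, keeping only
    payloads that start with the PNG magic bytes."""
    out = []
    for b64 in DATA_IMG.findall(html):
        raw = base64.b64decode(b64)
        if raw.startswith(b"\x89PNG"):
            out.append(raw)
    return out

# Usage with a synthetic PNG payload:
png = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8).decode()
html = f'<img src="data:image/png;base64,{png}">'
print(len(extract_png_bytes(html)))  # → 1
```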
If state shows elements: 0, use get html plus regex discovery and screenshots, and use eval to query the DOM.
To force a specific browser, pass --browser (shown here in both flag positions):
browser-use --browser chromium --json open https://...
browser-use open https://... --browser chromium
Generated Mar 1, 2026
Developers can use this skill to automate browser interactions for testing web applications, especially in headless environments where traditional browsers fail. It enables persistent sessions for multi-step flows, captures screenshots for visual debugging, and extracts HTML or runs DOM queries to verify page states, making it ideal for continuous integration pipelines.
Businesses can automate browsing e-commerce sites to track product prices, availability, and promotions in real-time. By using persistent sessions and HTML extraction, they can scrape data from JavaScript-heavy pages without relying on APIs, helping with competitive analysis and inventory management for retail operations.
This skill facilitates automating login processes for demo or internal systems by extracting QR codes from pages via screenshots or embedded base64 images. It's useful for setting up automated authentication flows in development environments, reducing manual steps in testing login functionalities.
Media companies or researchers can use the skill to browse websites, extract links, and aggregate content from multiple sources. By leveraging HTML parsing and regex, it automates the discovery of relevant URLs, such as news articles or resources, streamlining data collection for analysis or reporting.
Integrate with OpenAI-compatible LLMs like Moonshot/Kimi to create AI agents that interact with web interfaces, such as filling forms or navigating support portals. This enables automated customer service tasks, handling queries and actions on web platforms without human intervention.
Offer a cloud-based service that uses this skill to provide automated browser testing for websites, charging subscription fees based on usage or number of tests. It targets developers and QA teams needing reliable, headless browser automation without infrastructure setup.
Provide a service that scrapes web data for clients, such as price tracking or content aggregation, using this skill's automation capabilities. Revenue comes from custom projects or ongoing monitoring contracts, appealing to businesses in retail, finance, or marketing.
Develop and sell a tool that integrates this skill with AI agents for tasks like automated form filling or customer interactions, targeting enterprises looking to automate web-based workflows. Revenue is generated through software licensing and support services.
💬 Integration Tip
Ensure environment variables like OPENAI_API_KEY are set for AI agent workflows, and use persistent sessions with --session flag to maintain state across browser interactions.