stealthy-auto-browseBrowser automation that passes CreepJS, BrowserScan, Pixelscan, and Cloudflare — zero CDP exposure, OS-level input, persistent fingerprints. Use when standard browser skills get 403s or CAPTCHAs.
Install via ClawdBot CLI:
clawdbot install psyb0t/stealthy-auto-browseA stealth browser running in Docker. It uses Camoufox (a custom Firefox fork) instead of Chromium, so there are zero Chrome DevTools Protocol (CDP) signals for bot detectors to find. Mouse and keyboard input happens at the OS level via PyAutoGUI — the browser itself doesn't know it's being automated, which means behavioral analysis can't detect it either.
Standard browser automation (Playwright + Chromium, Puppeteer, Selenium) exposes CDP signals that bot detection services (Cloudflare, DataDome, PerimeterX, Akamai) catch instantly. Even with stealth plugins, the CDP protocol is still there and detectable. This skill eliminates that entirely by using Firefox (no CDP at all) and generating input events at the OS level rather than through the browser's automation API.
curl or WebFetchcurl1. Start the container:
docker run -d -p 8080:8080 -p 5900:5900 psyb0t/stealthy-auto-browse
Port 8080 is the HTTP API. Port 5900 is a noVNC web viewer where you can watch the browser in real time.
2. Set the environment variable:
export STEALTHY_AUTO_BROWSE_URL=http://localhost:8080
Or via OpenClaw config (~/.openclaw/openclaw.json):
{
"skills": {
"entries": {
"stealthy-auto-browse": {
"env": {
"STEALTHY_AUTO_BROWSE_URL": "http://localhost:8080"
}
}
}
}
}
3. Verify: curl $STEALTHY_AUTO_BROWSE_URL/health returns ok when the browser is ready.
The container runs a virtual X display (Xvfb at 1920x1080), the Camoufox browser, and an HTTP API server. You send JSON commands to the API and get JSON responses back. All commands go to POST $STEALTHY_AUTO_BROWSE_URL/ with {"action": ".
Every response has this shape:
{
"success": true,
"timestamp": 1234567890.123,
"data": { ... },
"error": "only present when success is false"
}
The data field contents vary by action — documented below for each one.
This is the most important concept. There are two ways to interact with pages:
Actions: system_click, mouse_move, mouse_click, system_type, send_key, scroll
These use PyAutoGUI to generate real OS-level mouse movements and keystrokes. The browser receives these as genuine user input — there is no way for any website JavaScript to distinguish these from a real human. Use these for stealth.
System input works with viewport coordinates (x, y pixel positions within the browser content area). Get these coordinates from get_interactive_elements.
Actions: click, fill, type
These use Playwright's DOM automation to interact with elements by CSS selector or XPath. They're faster and more reliable (no coordinate math), but they inject events through the browser's automation layer. Sophisticated behavioral analysis can potentially detect the timing patterns. Use these when speed matters more than stealth, or when you have a selector but no coordinates.
system_click to focus the field, then system_type to enter text. This is undetectable. Using fill is faster but detectable.get_interactive_elements, use system_click. If you only have a CSS selector, use click.This is the typical sequence for interacting with a page:
goto to load the URLget_text returns all visible text — usually enough to understand the pageget_html gives you the full DOM structureGET /screenshot/browser?whLargest=512)get_interactive_elements returns all buttons, links, inputs with their x,y coordinatessystem_click to click, system_type to type, send_key for Enter/Tab/Escapewait_for_element or wait_for_text instead of sleepingget_text again to confirm the page changed as expectedNavigates to a URL. This is how you load pages.
{"action": "goto", "url": "https://example.com"}
{"action": "goto", "url": "https://example.com", "wait_until": "networkidle"}
Parameters:
url (required): The URL to navigate to.wait_until (optional, default "domcontentloaded"): When to consider the page loaded. Options: "domcontentloaded" (DOM parsed, fast), "load" (all resources loaded), "networkidle" (no network activity for 500ms, slowest but most complete).Response data: {"url": "https://example.com/", "title": "Example Domain"}
Note: If a page loader matches the URL (see Page Loaders section), the loader's steps execute instead of the default navigation. The response will include "loader": "loader name" when this happens.
Reloads the current page.
{"action": "refresh"}
{"action": "refresh", "wait_until": "networkidle"}
Parameters:
wait_until (optional, default "domcontentloaded"): Same options as goto.Response data: {"url": "https://example.com/current-page", "title": "Current Page"}
Moves the mouse to viewport coordinates with a human-like curve (random jitter, eased acceleration), then clicks. This is the primary way to click things stealthily.
{"action": "system_click", "x": 500, "y": 300}
{"action": "system_click", "x": 500, "y": 300, "duration": 0.5}
Parameters:
x, y (required): Viewport coordinates — get these from get_interactive_elements.duration (optional): How long the mouse movement takes in seconds. If omitted, a random duration between 0.2-0.6s is used for realism.Response data: {"system_clicked": {"x": 500, "y": 300}}
How it differs from mouse_click: system_click always moves the mouse first (smooth human-like path), then clicks. mouse_click can click at a position instantly without the smooth movement, or click wherever the mouse currently is.
Moves the mouse to viewport coordinates with human-like movement (jitter, eased curve) but does NOT click. Use this to hover over elements (to trigger hover menus, tooltips) or to simulate natural mouse behavior between actions.
{"action": "mouse_move", "x": 500, "y": 300}
{"action": "mouse_move", "x": 500, "y": 300, "duration": 0.4}
Parameters:
x, y (required): Viewport coordinates.duration (optional): Movement time in seconds. Random 0.2-0.6s if omitted.Response data: {"moved_to": {"x": 500, "y": 300}}
Clicks at a position or at the current mouse location. Unlike system_click, this does NOT do a smooth mouse movement first — it's a direct click via PyAutoGUI.
{"action": "mouse_click"}
{"action": "mouse_click", "x": 500, "y": 300}
Parameters:
x, y (optional): If provided, clicks at that viewport position directly. If omitted, clicks wherever the mouse currently is.Response data: {"clicked_at": {"x": 500, "y": 300}} or {"clicked_at": "current"}
When to use: After a mouse_move when you want to separate the movement and click into two steps. Or when the mouse is already positioned and you just need to click.
Types text character-by-character via real OS keystrokes. Each keystroke has a randomized delay (jittered around the interval) to mimic human typing speed. Completely undetectable.
{"action": "system_type", "text": "hello world"}
{"action": "system_type", "text": "hello world", "interval": 0.12}
Parameters:
text (required): The text to type. Must click/focus an input field first.interval (optional, default 0.08): Base delay between keystrokes in seconds. Actual delay is randomized +-30ms around this value.Response data: {"typed_len": 11}
Important: You must click on the input field first (using system_click or click) before calling system_type. This action types into whatever is currently focused.
Sends a single keyboard key or key combination via OS-level input. Use this for pressing Enter to submit forms, Tab to move between fields, Escape to close dialogs, or any key combos like Ctrl+A, Ctrl+C, etc.
{"action": "send_key", "key": "enter"}
{"action": "send_key", "key": "tab"}
{"action": "send_key", "key": "escape"}
{"action": "send_key", "key": "ctrl+a"}
{"action": "send_key", "key": "ctrl+shift+t"}
Parameters:
key (required): Key name or combo with + separator. Key names follow PyAutoGUI naming: enter, tab, escape, backspace, delete, up, down, left, right, home, end, pageup, pagedown, f1-f12, ctrl, alt, shift, space, etc.Response data: {"send_key": "enter"}
Scrolls the page using the mouse scroll wheel. Generates real OS-level scroll events.
{"action": "scroll", "amount": -3}
{"action": "scroll", "amount": 5, "x": 500, "y": 300}
Parameters:
amount (optional, default -3): Scroll amount. Negative = scroll down, positive = scroll up. Each unit is roughly one "click" of a mouse wheel.x, y (optional): If provided, moves the mouse to these viewport coordinates first, then scrolls. Useful for scrolling inside a specific scrollable element rather than the whole page.Response data: {"scrolled": -3}
These are faster and more convenient but use Playwright's DOM event injection, which is detectable by sophisticated behavioral analysis.
Clicks an element by CSS selector or XPath. Playwright finds the element in the DOM, scrolls it into view if needed, and dispatches click events.
{"action": "click", "selector": "#submit-btn"}
{"action": "click", "selector": "button.primary"}
{"action": "click", "selector": "xpath=//button[@id='submit-btn']"}
Parameters:
selector (required): CSS selector or XPath (prefix with xpath=).Response data: {"clicked": "#submit-btn"}
When to use over system_click: When you have a selector but don't want to bother getting coordinates. When the element might move around and coordinates aren't reliable. When stealth isn't critical.
Fills an input field by selector. Clears any existing content first, then sets the value. This is the fastest way to fill forms but is detectable because it doesn't generate individual keystroke events.
{"action": "fill", "selector": "input[name='email']", "value": "user@example.com"}
Parameters:
selector (required): CSS selector or XPath of the input element.value (required): Text to fill in.Response data: {"filled": "input[name='email']"}
Types text into an element character-by-character via Playwright (NOT the OS). Each keystroke has a configurable delay. This is a middle ground between fill (instant but obviously automated) and system_type (OS-level, undetectable). The typing pattern is more realistic than fill but still comes through Playwright's event system.
{"action": "type", "selector": "#search", "text": "query", "delay": 0.05}
Parameters:
selector (required): CSS selector or XPath of the element.text (required): Text to type.delay (optional, default 0.05): Delay between keystrokes in seconds.Response data: {"typed": "#search"}
Screenshots are GET requests (not POST actions).
Captures the browser viewport as a PNG image. This is what the page looks like to a user.
curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/browser?whLargest=512" -o screenshot.png
Always resize screenshots to avoid huge images. Resize query parameters (all optional):
| Parameter | What it does |
|-----------|-------------|
| whLargest=512 | Scales so the largest dimension is 512px, keeps aspect ratio. Use this by default. |
| width=800 | Scales to 800px wide, keeps aspect ratio |
| height=300 | Scales to 300px tall, keeps aspect ratio |
| width=400&height=400 | Forces exact 400x400 dimensions |
Captures the entire virtual desktop (including window chrome, taskbar, etc.) using scrot. Same resize parameters as above. Useful when you need to see things outside the browser viewport.
curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/desktop?whLargest=512" -o desktop.png
Scans the page and returns every interactive element (buttons, links, inputs, selects, textareas, etc.) with their viewport coordinates. This is how you find what to click and where.
{"action": "get_interactive_elements"}
{"action": "get_interactive_elements", "visible_only": true}
Parameters:
visible_only (optional, default true): Only return elements that are currently visible on screen.Response data:
{
"count": 5,
"elements": [
{
"i": 0,
"tag": "button",
"id": "submit-btn",
"text": "Submit",
"selector": "#submit-btn",
"x": 400,
"y": 250,
"w": 120,
"h": 40,
"visible": true
},
{
"i": 1,
"tag": "input",
"id": null,
"text": "",
"selector": "input[name='email']",
"x": 300,
"y": 180,
"w": 250,
"h": 35,
"visible": true
}
]
}
The x, y are the center of the element — pass these directly to system_click. The selector can be used with Playwright actions like click or fill. The w, h give you the element dimensions.
This is your primary tool for understanding what you can interact with on a page. Call this before clicking anything.
Returns all visible text content of the page body. Text is truncated to 10,000 characters.
{"action": "get_text"}
Response data: {"text": "Page title\nSome content here...", "length": 1234}
This is usually the first thing to call after navigating — it tells you what's on the page without needing a screenshot.
Returns the full HTML source of the current page.
{"action": "get_html"}
Response data: {"html": "...", "length": 45678}
Use when get_text doesn't give enough structure to understand the page layout, or when you need to find specific elements in the DOM.
Executes arbitrary JavaScript in the page context and returns the result. The expression is evaluated via page.evaluate().
{"action": "eval", "expression": "document.title"}
{"action": "eval", "expression": "document.querySelectorAll('a').length"}
{"action": "eval", "expression": "JSON.stringify(performance.timing)"}
Parameters:
expression (required): JavaScript expression to evaluate. Must return a JSON-serializable value.Response data: {"result": "Example Domain"} — the result is whatever the expression returns.
Use these instead of sleep to wait for page content. They're more reliable because they wait for the exact condition rather than an arbitrary time.
Waits for an element matching a CSS selector or XPath to reach a certain state (visible, hidden, attached to DOM, detached).
{"action": "wait_for_element", "selector": "#results", "timeout": 10}
{"action": "wait_for_element", "selector": "xpath=//div[@class='loaded']", "timeout": 15}
{"action": "wait_for_element", "selector": ".spinner", "state": "hidden", "timeout": 10}
Parameters:
selector (required): CSS selector or XPath (prefix with xpath=).state (optional, default "visible"): What state to wait for. Options: "visible" (rendered and not hidden), "hidden" (not visible), "attached" (in DOM regardless of visibility), "detached" (removed from DOM).timeout (optional, default 30): Max wait time in seconds. Throws error if exceeded.Response data: {"selector": "#results", "state": "visible"}
Waits for specific text to appear anywhere in the page body.
{"action": "wait_for_text", "text": "Search results", "timeout": 10}
Parameters:
text (required): Exact text to look for (substring match on document.body.innerText).timeout (optional, default 30): Max wait time in seconds.Response data: {"text": "Search results", "found": true}
Waits for the page URL to match a pattern. Useful after form submissions or redirects.
{"action": "wait_for_url", "url": "**/dashboard", "timeout": 10}
{"action": "wait_for_url", "url": "https://example.com/success*", "timeout": 15}
Parameters:
url (required): URL pattern to match. Supports (any chars except /) and * (any chars including /) glob patterns. Can also be a full URL for exact match.timeout (optional, default 30): Max wait time in seconds.Response data: {"url": "https://example.com/dashboard"}
Waits until there are no network requests in flight for 500ms. Useful for pages that load content dynamically after the initial page load.
{"action": "wait_for_network_idle", "timeout": 30}
Parameters:
timeout (optional, default 30): Max wait time in seconds.Response data: {"idle": true}
The browser can have multiple tabs open. One tab is "active" at a time — all actions operate on the active tab.
Returns all open tabs with their URLs and which one is active.
{"action": "list_tabs"}
Response data:
{
"count": 2,
"tabs": [
{"index": 0, "url": "https://example.com/", "active": false},
{"index": 1, "url": "https://other.com/", "active": true}
]
}
Opens a new browser tab. Optionally navigates it to a URL. The new tab becomes the active tab.
{"action": "new_tab"}
{"action": "new_tab", "url": "https://example.com"}
Parameters:
url (optional): URL to navigate to in the new tab.wait_until (optional, default "domcontentloaded"): Same as goto.Response data: {"index": 1, "url": "https://example.com/"}
Switches the active tab by index (0-based). All subsequent actions will operate on this tab.
{"action": "switch_tab", "index": 0}
Parameters:
index (required): Tab index from list_tabs.Response data: {"index": 0, "url": "https://example.com/"}
Closes a tab. After closing, the last remaining tab becomes active.
{"action": "close_tab"}
{"action": "close_tab", "index": 1}
Parameters:
index (optional): Tab index to close. If omitted, closes the currently active tab.Response data: {"closed": true, "remaining": 1}
Browsers have modal dialogs (alert, confirm, prompt). By default, dialogs are auto-accepted (clicks OK). Use handle_dialog if you need to dismiss a dialog or provide text for a prompt.
Call BEFORE the action that triggers the dialog if you want to dismiss it or provide prompt text. If you don't call this, the dialog is auto-accepted (clicks OK).
{"action": "handle_dialog", "accept": true}
{"action": "handle_dialog", "accept": false}
{"action": "handle_dialog", "accept": true, "text": "my response"}
Parameters:
accept (optional, default true): true clicks OK/Accept, false clicks Cancel/Dismiss.text (optional): Response text for prompt dialogs. Ignored for alert/confirm.Response data: {"configured": {"accept": true, "text": null}}
Example — handling a confirm dialog:
# Step 1: Tell the browser to accept the next dialog
curl -X POST $API -H 'Content-Type: application/json' -d '{"action": "handle_dialog", "accept": true}'
# Step 2: Now click the button that triggers the confirm
curl -X POST $API -H 'Content-Type: application/json' -d '{"action": "system_click", "x": 300, "y": 200}'
Returns information about the most recent dialog that appeared.
{"action": "get_last_dialog"}
Response data:
{
"dialog": {
"type": "confirm",
"message": "Are you sure you want to delete this?",
"default_value": "",
"buttons": ["ok", "cancel"]
}
}
Returns {"dialog": null} if no dialog has appeared yet. The type field is one of: "alert", "confirm", "prompt", "beforeunload".
Returns all cookies for the browser context, or cookies for specific URLs.
{"action": "get_cookies"}
{"action": "get_cookies", "urls": ["https://example.com"]}
Parameters:
urls (optional): Array of URLs to filter cookies by. If omitted, returns all cookies.Response data:
{
"count": 3,
"cookies": [
{"name": "session", "value": "abc123", "domain": ".example.com", "path": "/", "httpOnly": true, "secure": true, ...}
]
}
Sets a cookie in the browser context.
{"action": "set_cookie", "name": "session", "value": "abc123", "url": "https://example.com"}
{"action": "set_cookie", "name": "pref", "value": "dark", "domain": ".example.com", "path": "/", "httpOnly": false, "secure": true}
Parameters: Any standard cookie fields — name, value, url, domain, path, httpOnly, secure, sameSite, expires. At minimum you need name, value, and either url or domain.
Response data: {"set": "session"}
Clears all cookies from the browser context.
{"action": "delete_cookies"}
Response data: {"cleared": true}
Access the page's localStorage and sessionStorage. These are per-origin — you must be on the right page for the storage to be accessible.
Returns all items from localStorage or sessionStorage as a key-value object.
{"action": "get_storage", "type": "local"}
{"action": "get_storage", "type": "session"}
Parameters:
type (optional, default "local"): "local" for localStorage, "session" for sessionStorage.Response data: {"items": {"theme": "dark", "lang": "en"}, "type": "local"}
Sets a single key-value pair in localStorage or sessionStorage.
{"action": "set_storage", "type": "local", "key": "theme", "value": "dark"}
Parameters:
type (optional, default "local"): "local" or "session".key (required): Storage key.value (required): Storage value (string).Response data: {"set": "theme", "type": "local"}
Clears all items from localStorage or sessionStorage.
{"action": "clear_storage", "type": "local"}
{"action": "clear_storage", "type": "session"}
Response data: {"cleared": "local"}
The browser automatically tracks file downloads triggered by page interactions (clicking download links, form submissions that return files, etc.).
Returns information about the most recently downloaded file.
{"action": "get_last_download"}
Response data:
{
"download": {
"url": "https://example.com/file.pdf",
"filename": "file.pdf",
"path": "/tmp/playwright-downloads/abc123/file.pdf"
}
}
Returns {"download": null} if nothing has been downloaded yet. The path is the local path inside the container where the file was saved. The filename is what the server suggested as the download name.
Programmatically sets a file on an element without opening the OS file picker. The file must exist inside the container — use docker cp to copy files in if needed.
{"action": "upload_file", "selector": "#file-input", "file_path": "/tmp/document.pdf"}
Parameters:
selector (required): CSS selector of the file input element.file_path (required): Absolute path to the file inside the container.Response data: {"selector": "#file-input", "file": "document.pdf", "size": 12345}
Note: After setting the file, you still need to submit the form (click the submit button) for the upload to actually happen.
Capture all HTTP requests and responses the page makes. Useful for debugging, finding API endpoints the page calls, or verifying that certain resources loaded.
Starts recording all HTTP requests and responses from the active page.
{"action": "enable_network_log"}
Response data: {"enabled": true}
Stops recording network activity. Already-captured entries remain.
{"action": "disable_network_log"}
Response data: {"enabled": false}
Returns all captured network entries since logging was enabled (or last cleared).
{"action": "get_network_log"}
Response data:
{
"count": 4,
"log": [
{"type": "request", "url": "https://api.example.com/data", "method": "GET", "resource_type": "fetch", "timestamp": 1234567890.123},
{"type": "response", "url": "https://api.example.com/data", "status": 200, "timestamp": 1234567890.456},
{"type": "request", "url": "https://cdn.example.com/style.css", "method": "GET", "resource_type": "stylesheet", "timestamp": 1234567890.789},
{"type": "response", "url": "https://cdn.example.com/style.css", "status": 200, "timestamp": 1234567890.999}
]
}
Each entry is either a "request" or "response". Requests include method and resource_type (fetch, document, stylesheet, script, image, etc.). Responses include status code.
Deletes all captured network entries but keeps logging enabled if it was on.
{"action": "clear_network_log"}
Response data: {"cleared": true}
Scrolls the entire page from top to bottom using JavaScript window.scrollBy(). Scrolls one viewport height at a time with a fixed delay between scrolls. When it reaches the bottom (scroll position stops changing), it scrolls back to the top. Useful for triggering lazy-loaded content.
{"action": "scroll_to_bottom"}
{"action": "scroll_to_bottom", "delay": 0.6}
Parameters:
delay (optional, default 0.4): Seconds to wait between each scroll step.Response data: {"scrolled": "bottom"}
Same as scroll_to_bottom but uses real OS-level mouse wheel scrolling (via PyAutoGUI) with randomized scroll amounts and jittered delays to look like a human scrolling. Undetectable by behavioral analysis.
{"action": "scroll_to_bottom_humanized"}
{"action": "scroll_to_bottom_humanized", "min_clicks": 3, "max_clicks": 8, "delay": 0.7}
Parameters:
min_clicks (optional, default 2): Minimum mouse wheel clicks per scroll step.max_clicks (optional, default 6): Maximum mouse wheel clicks per scroll step. A random value between min and max is chosen each time.delay (optional, default 0.5): Base delay between scroll steps. Actual delay is jittered +-30%.Response data: {"scrolled": "bottom_humanized"}
Recalculates the mapping between viewport coordinates (what get_interactive_elements returns) and screen coordinates (what PyAutoGUI uses). The browser has window chrome (title bar, address bar) that offsets the viewport from the screen origin.
{"action": "calibrate"}
Response data: {"window_offset": {"x": 0, "y": 74}}
When to call this: After entering/exiting fullscreen, after the browser window is resized, or if system_click coordinates seem off. The offset is auto-calculated at startup, so you rarely need this.
Returns the virtual display resolution (from the XVFB_RESOLUTION environment variable).
{"action": "get_resolution"}
Response data: {"width": 1920, "height": 1080}
Toggles browser fullscreen mode (hides address bar and window chrome). In fullscreen, the viewport takes up the entire screen, so coordinates map differently.
{"action": "enter_fullscreen"}
{"action": "exit_fullscreen"}
Response data: {"fullscreen": true, "changed": true} — changed is false if already in the requested state.
Important: Call calibrate after entering/exiting fullscreen to update the coordinate mapping.
Health check that returns the current page URL. Use to verify the API is responding and the browser is alive.
{"action": "ping"}
Response data: {"message": "pong", "url": "https://example.com/"}
Pauses execution for a specified duration. Prefer wait_for_element or wait_for_text when waiting for page content — use sleep only for fixed timing needs.
{"action": "sleep", "duration": 2}
Parameters:
duration (optional, default 1): Seconds to sleep.Response data: {"slept": 2}
Shuts down the browser. The container will stop after this.
{"action": "close"}
Response data: {"message": "closing"}
Returns the current browser state.
curl -s "$STEALTHY_AUTO_BROWSE_URL/state"
Response:
{
"status": "ready",
"url": "https://example.com/",
"title": "Example Domain",
"window_offset": {"x": 0, "y": 74}
}
Simple health check. Returns ok as plain text when the API is ready.
curl -s "$STEALTHY_AUTO_BROWSE_URL/health"
# Custom display resolution
docker run -d -p 8080:8080 -e XVFB_RESOLUTION=1280x720 psyb0t/stealthy-auto-browse
# Match timezone to your IP's geographic location (important for stealth — mismatched
# timezone is a common bot detection signal)
docker run -d -p 8080:8080 -e TZ=Europe/Bucharest psyb0t/stealthy-auto-browse
# Route browser traffic through an HTTP proxy
docker run -d -p 8080:8080 -e PROXY_URL=http://user:pass@proxy:8888 psyb0t/stealthy-auto-browse
# Persistent browser profile — cookies, sessions, and fingerprint survive container restarts
docker run -d -p 8080:8080 -v ./profile:/userdata psyb0t/stealthy-auto-browse
# Open a URL automatically on startup
docker run -d -p 8080:8080 psyb0t/stealthy-auto-browse https://example.com
Page loaders are like Greasemonkey/Tampermonkey userscripts but for the HTTP API. You define a set of actions that automatically run whenever the browser navigates to a matching URL. Instead of manually sending a sequence of commands every time you visit a site, you write it once as a YAML file and the container handles it.
This is useful for things like: removing cookie popups, dismissing overlays, waiting for dynamic content, cleaning up pages before scraping, or any repetitive setup you'd otherwise do manually every time.
/loadersgoto navigates to a URL that matches a loader's pattern, the loader's steps run automatically instead of the default navigationThe steps are the exact same actions as the HTTP API. Every action you can send via POST / (goto, eval, click, system_click, sleep, scroll, wait_for_element, etc.) works as a loader step. Same names, same parameters.
docker run -d -p 8080:8080 -p 5900:5900 \
-v ./my-loaders:/loaders \
psyb0t/stealthy-auto-browse
name: Human-readable name for this loader
match:
domain: example.com # Exact hostname match (www. is stripped automatically)
path_prefix: /articles # URL path must start with this
regex: "article/\\d+" # Full URL must match this regex
steps:
- action: goto # Same actions as the HTTP API
url: "${url}" # ${url} is replaced with the original URL
wait_until: networkidle
- action: eval
expression: "document.querySelector('.cookie-banner')?.remove()"
- action: wait_for_element
selector: "#main-content"
timeout: 10
All match fields are optional, but at least one is required. If you specify multiple fields, all of them must match for the loader to trigger:
domain: Exact hostname. www. is stripped from both sides before comparing, so domain: example.com matches www.example.com too.path_prefix: The URL path must start with this string. path_prefix: /blog matches /blog, /blog/post-1, /blog/archive, etc.regex: The full URL is tested against this regular expression.${url} PlaceholderIn any string value within a step, ${url} is replaced with the original URL that was passed to goto. This lets you navigate to the URL with custom wait settings, or pass it to JavaScript:
steps:
- action: goto
url: "${url}"
wait_until: networkidle
- action: eval
expression: "console.log('Loaded:', '${url}')"
Say you're scraping a news site that has cookie popups, newsletter modals, and lazy-loaded content. Without a loader, you'd send 5+ commands after every goto. With a loader:
# loaders/news_site.yaml
name: News Site Cleanup
match:
domain: news-site.com
steps:
# Navigate with full network wait so everything loads
- action: goto
url: "${url}"
wait_until: networkidle
# Wait for the main content to be there
- action: wait_for_element
selector: "article"
timeout: 10
# Kill the cookie popup
- action: eval
expression: "document.querySelector('.cookie-consent')?.remove()"
# Kill the newsletter modal
- action: eval
expression: "document.querySelector('.newsletter-overlay')?.remove()"
# Scroll to trigger lazy-loaded images
- action: scroll_to_bottom
delay: 0.3
# Small pause for everything to settle
- action: sleep
duration: 1
Now when you goto any URL on news-site.com, all of this happens automatically. Your response includes "loader": "News Site Cleanup" so you know it triggered.
{
"success": true,
"data": {
"loader": "News Site Cleanup",
"steps_executed": 6,
"last_result": { "success": true, "timestamp": 1234567890.456, "data": { "slept": 1 } }
}
}
The browser comes with these extensions pre-installed:
API=$STEALTHY_AUTO_BROWSE_URL
# Navigate to login page
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "goto", "url": "https://example.com/login"}'
# See what's on the page
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "get_text"}'
# Find all interactive elements and their coordinates
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "get_interactive_elements"}'
# Click the email field (coordinates from get_interactive_elements)
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "system_click", "x": 400, "y": 200}'
# Type email with human-like keystrokes
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "system_type", "text": "user@example.com"}'
# Tab to password field
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "send_key", "key": "tab"}'
# Type password
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "system_type", "text": "secretpassword"}'
# Press Enter to submit
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "send_key", "key": "enter"}'
# Wait for redirect to dashboard
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "wait_for_url", "url": "**/dashboard", "timeout": 15}'
# Verify we're logged in
curl -s -X POST $API -H 'Content-Type: application/json' \
-d '{"action": "get_text"}'
get_interactive_elements before clicking — don't guess coordinatessystem_click, system_type, send_key are undetectableget_text first, screenshots second — text is faster and smaller?whLargest=512 — full resolution is unnecessarily large/userdata for persistent sessions — cookies, fingerprint, and profile survive restartssleep — wait_for_element, wait_for_text, wait_for_urlhandle_dialog BEFORE the action that triggers it — if you need to dismiss or provide prompt text (dialogs are auto-accepted otherwise)calibrate after fullscreen changes — coordinate mapping shiftssleep with 0.5-1.5s between clicks looks more humanGenerated Mar 1, 2026
Monitor competitor pricing on e-commerce sites with aggressive bot protection like Cloudflare. The stealth browser bypasses detection while maintaining logged-in sessions to access personalized pricing data that would otherwise trigger CAPTCHAs or blocks.
Scrape public social media profiles and content from platforms that actively block automated tools. The OS-level input mimics human behavior perfectly, allowing collection of engagement metrics and content without account suspension.
Aggregate real-time pricing from airline and hotel websites that employ sophisticated bot detection. The system can maintain multiple sessions simultaneously to compare prices across different user profiles and locations without triggering rate limits.
Extract real-time financial data from banking portals, investment platforms, and financial news sites that block traditional scraping tools. The undetectable automation maintains authenticated sessions for accessing personalized financial dashboards.
Monitor job postings across multiple career platforms that detect and block automated browsing. The skill can maintain persistent sessions to track salary trends, skill requirements, and hiring patterns across different regions and industries.
Provide cleaned, structured data feeds to businesses needing competitive intelligence. Charge subscription fees based on data volume, frequency, and number of sources monitored. The stealth capabilities ensure reliable data collection from protected sources.
Offer the browser automation as a managed API service to developers and businesses. Provide different tiers based on concurrent sessions, request volume, and support levels. The zero-CDP exposure makes it valuable for enterprises needing reliable automation.
Build custom automation workflows for specific industries like e-commerce, finance, or travel. Charge project-based fees for development and ongoing maintenance. The skill's ability to bypass advanced bot detection makes it ideal for high-value automation projects.
💬 Integration Tip
Ensure Docker is properly configured with sufficient resources, and use the health check endpoint to verify browser readiness before sending commands to avoid timing issues.
Automatically update Clawdbot and all installed skills once daily. Runs via cron, checks for updates, applies them, and messages the user with a summary of what changed.
Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag,...
Essential Docker commands and workflows for container management, image operations, and debugging.
Tool discovery and shell one-liner reference for sysadmin, DevOps, and security tasks. AUTO-CONSULT this skill when the user is: troubleshooting network issues, debugging processes, analyzing logs, working with SSL/TLS, managing DNS, testing HTTP endpoints, auditing security, working with containers, writing shell scripts, or asks 'what tool should I use for X'. Source: github.com/trimstray/the-book-of-secret-knowledge
Deploy applications and manage projects with complete CLI reference. Commands for deployments, projects, domains, environment variables, and live documentation access.
Monitor topics of interest and proactively alert when important developments occur. Use when user wants automated monitoring of specific subjects (e.g., product releases, price changes, news topics, technology updates). Supports scheduled web searches, AI-powered importance scoring, smart alerts vs weekly digests, and memory-aware contextual summaries.