# desktop-control-1-0-0

Advanced desktop automation with mouse, keyboard, and screen control
Install via ClawdBot CLI:
`clawdbot install wpegley/desktop-control-1-0-0`

The most advanced desktop automation skill for OpenClaw. Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.
First, install required dependencies:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```
```python
from skills.desktop_control import DesktopController
dc = DesktopController(failsafe=True)
dc.move_mouse(500, 300) # Move to coordinates
dc.click() # Left click at current position
dc.click(100, 200, button="right") # Right click at position
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c") # Copy
dc.press("enter")
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```
### move_mouse(x, y, duration=0, smooth=True)
Move the mouse to absolute screen coordinates.
Parameters:
- x (int): X coordinate (pixels from left)
- y (int): Y coordinate (pixels from top)
- duration (float): Movement time in seconds (0 = instant, 0.5 = smooth)
- smooth (bool): Use a bezier curve for natural movement

Example:
```python
dc.move_mouse(1000, 500)
dc.move_mouse(1000, 500, duration=1.0)
```
### move_relative(x_offset, y_offset, duration=0)
Move the mouse relative to its current position.
Parameters:
- x_offset (int): Pixels to move horizontally (positive = right)
- y_offset (int): Pixels to move vertically (positive = down)
- duration (float): Movement time in seconds

Example:
```python
dc.move_relative(100, 50, duration=0.3)
```
### click(x=None, y=None, button='left', clicks=1, interval=0.1)
Perform a mouse click.
Parameters:
- x, y (int, optional): Coordinates to click (None = current position)
- button (str): 'left', 'right', or 'middle'
- clicks (int): Number of clicks (1 = single, 2 = double)
- interval (float): Delay between multiple clicks

Example:
```python
dc.click()
dc.click(500, 300, clicks=2)
dc.click(button='right')
```
### drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')
Perform a drag-and-drop operation.
Parameters:
- start_x, start_y (int): Starting coordinates
- end_x, end_y (int): Ending coordinates
- duration (float): Drag duration in seconds
- button (str): Mouse button to use

Example:
```python
dc.drag(100, 100, 500, 500, duration=1.0)
```
### scroll(clicks, direction='vertical', x=None, y=None)
Scroll the mouse wheel.
Parameters:
- clicks (int): Scroll amount (positive = up/left, negative = down/right)
- direction (str): 'vertical' or 'horizontal'
- x, y (int, optional): Position to scroll at

Example:
```python
dc.scroll(-5)
dc.scroll(10)
dc.scroll(5, direction='horizontal')
```
### get_mouse_position()
Get the current mouse coordinates.
Returns: (x, y) tuple
Example:
```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```
### type_text(text, interval=0, wpm=None)
Type text with configurable speed.
Parameters:
- text (str): Text to type
- interval (float): Delay between keystrokes (0 = instant)
- wpm (int, optional): Words per minute (overrides interval)

Example:
```python
dc.type_text("Hello World")
dc.type_text("Hello World", wpm=60)
dc.type_text("Hello World", interval=0.1)
```
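The wpm option implies a conversion from words per minute to a per-keystroke delay. The skill's exact formula is not documented; the helper below is a sketch assuming the common 5-characters-per-word convention:

```python
def wpm_to_interval(wpm: int, chars_per_word: int = 5) -> float:
    """Convert words-per-minute to seconds between keystrokes.

    Assumes the common 5-characters-per-word convention; the skill's
    actual conversion may differ.
    """
    chars_per_second = (wpm * chars_per_word) / 60
    return 1 / chars_per_second

# 60 wpm ~ 300 chars/minute ~ 5 chars/second -> 0.2 s per keystroke
print(wpm_to_interval(60))  # 0.2
```

Passing the result as `interval` should approximate the same typing speed as the corresponding `wpm` value.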
### press(key, presses=1, interval=0.1)
Press and release a key.
Parameters:
- key (str): Key name (see Key Names section)
- presses (int): Number of times to press
- interval (float): Delay between presses

Example:
```python
dc.press('enter')
dc.press('space', presses=3)
dc.press('down')
```
### hotkey(*keys, interval=0.05)
Execute a keyboard shortcut.
Parameters:
- *keys (str): Keys to press together
- interval (float): Delay between key presses

Example:
```python
dc.hotkey('ctrl', 'c')
dc.hotkey('ctrl', 'v')
dc.hotkey('win', 'r')
dc.hotkey('ctrl', 's')
dc.hotkey('ctrl', 'a')
```
### key_down(key) / key_up(key)
Manually control key state.
Example:
```python
dc.key_down('shift')
dc.type_text("hello") # Types "HELLO"
dc.key_up('shift')
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```
### screenshot(region=None, filename=None)
Capture the screen or a region of it.
Parameters:
- region (tuple, optional): (left, top, width, height) for partial capture
- filename (str, optional): Path to save image

Returns: PIL Image object
Example:
```python
img = dc.screenshot()
dc.screenshot(filename="screenshot.png")
img = dc.screenshot(region=(100, 100, 500, 300))
```
### get_pixel_color(x, y)
Get the color of the pixel at the given coordinates.
Returns: RGB tuple (r, g, b)
Example:
```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```
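Exact pixel comparisons are brittle in practice (anti-aliasing, sub-pixel rendering, color profiles). A small per-channel tolerance check, sketched below as a hypothetical helper that is not part of the skill's API, makes color checks more robust:

```python
def color_matches(actual, expected, tolerance=10):
    """Return True if each RGB channel of `actual` is within
    `tolerance` of the corresponding channel of `expected`."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))

# Hypothetical usage with the skill:
# r, g, b = dc.get_pixel_color(500, 300)
# if color_matches((r, g, b), (255, 0, 0)):
#     print("Pixel is close enough to red")
print(color_matches((250, 5, 3), (255, 0, 0)))  # True
print(color_matches((200, 5, 3), (255, 0, 0)))  # False
```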
### find_on_screen(image_path, confidence=0.8)
Find an image on screen (requires OpenCV).
Parameters:
- image_path (str): Path to template image
- confidence (float): Match threshold (0-1)

Returns: (x, y, width, height) or None
Example:
```python
location = dc.find_on_screen("button.png")
if location:
    x, y, w, h = location
    # Click center of found image
    dc.click(x + w//2, y + h//2)
```
### get_screen_size()
Get the screen resolution.
Returns: (width, height) tuple
Example:
```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```
### get_all_windows()
List all open windows.
Returns: List of window titles
Example:
```python
windows = dc.get_all_windows()
for title in windows:
    print(f"Window: {title}")
```
### activate_window(title_substring)
Bring a window to the front by title.
Parameters:
- title_substring (str): Part of the window title to match

Example:
```python
dc.activate_window("Chrome")
dc.activate_window("Visual Studio Code")
```
### get_active_window()
Get the currently focused window.
Returns: Window title (str)
Example:
```python
active = dc.get_active_window()
print(f"Active window: {active}")
```
### copy_to_clipboard(text)
Copy text to the clipboard.
Example:
```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```
### get_from_clipboard()
Get text from the clipboard.
Returns: str
Example:
```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```
### Key Names
- Letters: 'a' through 'z'
- Digits: '0' through '9'
- Function keys: 'f1' through 'f24'
- 'enter' / 'return'
- 'esc' / 'escape'
- 'space' / 'spacebar'
- 'tab'
- 'backspace'
- 'delete' / 'del'
- 'insert'
- 'home'
- 'end'
- 'pageup' / 'pgup'
- 'pagedown' / 'pgdn'
- 'up' / 'down' / 'left' / 'right'
- 'ctrl' / 'control'
- 'shift'
- 'alt'
- 'win' / 'winleft' / 'winright'
- 'cmd' / 'command' (Mac)
- 'capslock'
- 'numlock'
- 'scrolllock'
- Punctuation: '.' / ',' / '?' / '!' / ';' / ':'
- Brackets: '[' / ']' / '{' / '}' / '(' / ')'
- Operators: '+' / '-' / '*' / '/' / '='

### Failsafe
Move the mouse to any corner of the screen to abort all automation.
```python
dc = DesktopController(failsafe=True)
```
```python
dc.pause(2.0)
if dc.is_safe():
    dc.click(500, 500)
```
Require user confirmation before actions:
```python
dc = DesktopController(require_approval=True)
dc.click(500, 500) # Prompt: "Allow click at (500, 500)? [y/n]"
```
```python
dc = DesktopController()
dc.click(300, 200)
dc.type_text("John Doe", wpm=80)
dc.press('tab')
dc.type_text("john@example.com", wpm=80)
dc.press('tab')
dc.type_text("SecurePassword123", wpm=60)
dc.press('enter')
```
```python
region = (100, 100, 800, 600) # left, top, width, height
img = dc.screenshot(region=region)
import datetime
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
img.save(f"capture_{timestamp}.png")
```
```python
dc.key_down('ctrl')
dc.click(100, 200) # First file
dc.click(100, 250) # Second file
dc.click(100, 300) # Third file
dc.key_up('ctrl')
dc.hotkey('ctrl', 'c')
```
```python
import time

dc.activate_window("Calculator")
time.sleep(0.5)
dc.type_text("5+3=", interval=0.2)
time.sleep(0.5)
dc.screenshot(filename="calculation_result.png")
```
```python
dc.drag(
start_x=200, start_y=300, # File location
end_x=800, end_y=500, # Folder location
duration=1.0 # Smooth 1-second drag
)
```
Tips:
- Use duration=0 for instant mouse movement
- Call get_screen_size() to confirm screen dimensions before clicking
- Add an interval between keystrokes for reliability
- The failsafe can be disabled with DesktopController(failsafe=False)

Install all dependencies:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```
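The get_screen_size() tip can be made concrete: clamp target coordinates into the screen rectangle before clicking. clamp_to_screen below is a hypothetical helper, not part of the skill's API; the one-pixel margin also keeps the cursor out of the failsafe corners:

```python
def clamp_to_screen(x, y, width, height, margin=1):
    """Clamp (x, y) into the screen rectangle, keeping a small margin
    so the failsafe corners are never hit accidentally."""
    cx = max(margin, min(x, width - 1 - margin))
    cy = max(margin, min(y, height - 1 - margin))
    return cx, cy

# Hypothetical usage with the skill:
# width, height = dc.get_screen_size()
# dc.click(*clamp_to_screen(2500, -50, width, height))
print(clamp_to_screen(2500, -50, 1920, 1080))  # (1918, 1)
```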
Built for OpenClaw - The ultimate desktop automation companion
Generated Mar 1, 2026
This skill can automate UI testing for desktop applications by simulating mouse clicks, keyboard inputs, and verifying screen elements. It enables regression testing, reducing manual effort and improving software quality assurance.
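Desktop UI tests usually need to wait for the application to render before asserting anything. A generic polling helper, sketched here on the assumption that checks such as dc.find_on_screen() return a truthy value when they succeed, keeps such tests from flaking:

```python
import time

def wait_until(predicate, timeout=5.0, poll=0.25):
    """Poll `predicate` until it returns a truthy value or `timeout`
    expires. Returns the value, or None on timeout. Intended to sit
    between an action (dc.click) and a check (dc.find_on_screen)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = predicate()
        if result:
            return result
        time.sleep(poll)
    return None

# Demo with a plain counter standing in for a screen lookup:
calls = {"n": 0}
def ready():
    calls["n"] += 1
    return calls["n"] >= 3  # succeeds on the third poll

print(wait_until(ready, timeout=5, poll=0.01))  # True
```

In a real test this would wrap something like `lambda: dc.find_on_screen("ok_button.png")` after triggering the action under test.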
Automate repetitive data entry tasks across multiple applications, such as transferring information from spreadsheets to forms. It speeds up processes, minimizes human error, and frees up staff for higher-value work.
Facilitate remote support by allowing technicians to control a user's desktop for troubleshooting. It can simulate actions like clicking icons or typing commands, enhancing help desk efficiency and reducing on-site visits.
Automate repetitive tasks in graphic design or video editing software, such as applying filters or batch processing files. It streamlines workflows, saving time for creative professionals and increasing productivity.
Perform automated actions in games, such as farming resources or executing complex sequences. It can assist with repetitive gameplay tasks, though ethical use must be considered to avoid violating terms of service.
Offer the skill as part of a cloud-based automation platform with tiered pricing based on usage or features. This provides recurring revenue and allows for continuous updates and support.
Sell custom licenses to businesses for internal automation needs, including integration support and training. This targets large organizations seeking to optimize workflows and reduce operational costs.
Provide a free basic version to attract users, then charge for advanced features like image recognition or priority support. This model encourages adoption and monetizes power users.
Integration Tip
Ensure dependencies like pyautogui are installed and test failsafe features in a controlled environment to prevent unintended actions.
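One way to test a script in a controlled environment is a dry-run wrapper that records actions instead of performing them, so the sequence can be reviewed before it touches the real desktop. DryRunController below is a hypothetical sketch, not part of the skill:

```python
class DryRunController:
    """Records (method, args, kwargs) tuples instead of executing
    them, mimicking the DesktopController call surface."""

    def __init__(self):
        self.actions = []

    def __getattr__(self, name):
        # Any method call (click, type_text, hotkey, ...) is recorded.
        def record(*args, **kwargs):
            self.actions.append((name, args, kwargs))
        return record

dc = DryRunController()
dc.click(500, 300)
dc.type_text("Hello", wpm=60)
dc.hotkey("ctrl", "s")

for action in dc.actions:
    print(action)
# ('click', (500, 300), {})
# ('type_text', ('Hello',), {'wpm': 60})
# ('hotkey', ('ctrl', 's'), {})
```

Once the recorded sequence looks right, swap the wrapper for a real `DesktopController` with `failsafe=True`.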