mac-control
Control Mac via mouse/keyboard automation using cliclick and AppleScript. Use for clicking UI elements, taking screenshots, getting window bounds, handling c...
Install via ClawdBot CLI:
clawdbot install EasonC13/mac-control
Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools.
/opt/homebrew/bin/cliclick - mouse/keyboard control
Current setup: 1920x1080 display, 1:1 scaling (no conversion needed!)
If screenshot is 2x the logical resolution:
# Convert: cliclick_coords = screenshot_coords / 2
cliclick c:$((screenshot_x / 2)),$((screenshot_y / 2))
Run to verify your scale factor:
/Users/eason/clawd/scripts/calibrate-cursor.sh
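The conversion above can be wrapped in a small helper so the scale factor is passed explicitly instead of hard-coded. This is a minimal sketch: `to_cliclick_coords` is a hypothetical name (not part of cliclick or the bundled scripts), and it assumes the scale factor is a whole number such as 1 or 2.

```shell
# Convert screenshot pixel coordinates to cliclick (logical) coordinates.
# Usage: to_cliclick_coords SCREENSHOT_X SCREENSHOT_Y [SCALE]
# SCALE is the screenshot-to-logical ratio: 1 for a 1:1 display,
# 2 when the screenshot is 2x the logical resolution. Defaults to 1.
to_cliclick_coords() {
  local x=$1 y=$2 scale=${3:-1}
  # Integer division, same as the $((screenshot_x / 2)) form above.
  printf '%d,%d\n' "$((x / scale))" "$((y / scale))"
}

# Example (macOS only): click a target seen at (1700, 900) in a 2x screenshot.
# /opt/homebrew/bin/cliclick "c:$(to_cliclick_coords 1700 900 2)"
```

With a scale of 1 the helper returns the input unchanged, so the same call works on the 1:1 setup described above.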
# Click at coordinates
/opt/homebrew/bin/cliclick c:500,300
# Move mouse (no click) - Note: may not visually update cursor
/opt/homebrew/bin/cliclick m:500,300
# Double-click
/opt/homebrew/bin/cliclick dc:500,300
# Right-click
/opt/homebrew/bin/cliclick rc:500,300
# Click and drag
/opt/homebrew/bin/cliclick dd:100,100 du:200,200
# Type text
/opt/homebrew/bin/cliclick t:"hello world"
# Press key (Return, Escape, Tab, etc.)
/opt/homebrew/bin/cliclick kp:return
/opt/homebrew/bin/cliclick kp:escape
# Key with modifier (cmd+w to close window)
/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd
# Get current mouse position
/opt/homebrew/bin/cliclick p
# Wait 100 ms after each event (-w takes milliseconds)
/opt/homebrew/bin/cliclick -w 100 c:500,300
# Full screen (silent)
/usr/sbin/screencapture -x /tmp/screenshot.png
# With cursor (may not work for custom cursor colors)
/usr/sbin/screencapture -C -x /tmp/screenshot.png
# Interactive region selection
screencapture -i region.png
# Delayed capture
screencapture -T 3 -x delayed.png # 3 second delay
Best practice for reliable clicking:
/usr/sbin/screencapture -x /tmp/screen.png
/opt/homebrew/bin/cliclick c:X,Y
# 1. Screenshot
/usr/sbin/screencapture -x /tmp/before.png
# 2. View image, find button at (850, 450)
# (Use Read tool on /tmp/before.png)
# 3. Click
/opt/homebrew/bin/cliclick c:850,450
# 4. Verify
/usr/sbin/screencapture -x /tmp/after.png
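The screenshot, click, screenshot sequence above could be bundled into one function. The sketch below is an assumption about how one might do that, not one of the bundled scripts: `run` and `verified_click` are hypothetical names, and `DRY_RUN=1` is an invented switch that prints the commands instead of executing them. Inspecting the before and after images is still a manual step.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: verified_click X Y
# Captures /tmp/before.png, clicks at X,Y, then captures /tmp/after.png.
verified_click() {
  local x=$1 y=$2
  run /usr/sbin/screencapture -x /tmp/before.png
  run /opt/homebrew/bin/cliclick "c:${x},${y}"
  run /usr/sbin/screencapture -x /tmp/after.png
}

# Example (macOS only): verified_click 850 450
```

Running with `DRY_RUN=1` first is a cheap way to confirm the coordinates being passed before any real click happens.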
# Get Chrome window bounds
osascript -e 'tell application "Google Chrome" to get bounds of front window'
# Returns: 0, 38, 1920, 1080 (left, top, right, bottom)
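Because the bounds come back as `left, top, right, bottom`, they can be used to turn a window-relative position into screen coordinates for cliclick. A minimal sketch, assuming the comma-separated output format shown above; `window_size` and `window_to_screen` are hypothetical helper names.

```shell
# Print "width height" from a "left, top, right, bottom" bounds string.
window_size() {
  printf '%s\n' "$1" | awk -F', *' '{ printf "%d %d\n", $3 - $1, $4 - $2 }'
}

# Convert a point relative to the window's top-left corner to screen coordinates.
# Usage: window_to_screen "LEFT, TOP, RIGHT, BOTTOM" REL_X REL_Y
window_to_screen() {
  printf '%s\n' "$1" |
    awk -F', *' -v rx="$2" -v ry="$3" '{ printf "%d,%d\n", $1 + rx, $2 + ry }'
}

# Example (macOS only):
# bounds=$(osascript -e 'tell application "Google Chrome" to get bounds of front window')
# /opt/homebrew/bin/cliclick "c:$(window_to_screen "$bounds" 100 50)"
```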
Use AppleScript to find exact button position:
# Find Clawdbot extension button position
osascript -e '
tell application "System Events"
tell process "Google Chrome"
set toolbarGroup to group 2 of group 3 of toolbar 1 of group 1 of group 1 of group 1 of group 1 of group 1 of window 1
set allButtons to every pop up button of toolbarGroup
repeat with btn in allButtons
if description of btn contains "Clawdbot" then
return position of btn & size of btn
end if
end repeat
end tell
end tell
'
# Output: 1755, 71, 34, 34 (x, y, width, height)
# Click center of button
# center_x = x + width/2 = 1755 + 17 = 1772
# center_y = y + height/2 = 71 + 17 = 88
/opt/homebrew/bin/cliclick c:1772,88
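The center arithmetic above can be done directly on the AppleScript output. This sketch assumes the `x, y, width, height` format shown in the example output; `element_center` is a hypothetical helper name.

```shell
# Print the center "x,y" of an element from an "x, y, width, height" string.
# Fractional results are truncated to whole pixels.
element_center() {
  printf '%s\n' "$1" |
    awk -F', *' '{ printf "%d,%d\n", $1 + $3 / 2, $2 + $4 / 2 }'
}

# Example (macOS only), with $geometry holding the osascript output:
# /opt/homebrew/bin/cliclick "c:$(element_center "$geometry")"
```

For the output above, `element_center "1755, 71, 34, 34"` prints `1772,88`, matching the manual calculation.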
If you need to find a specific colored element:
# Find red (#FF0000) pixels in screenshot
magick /tmp/screen.png txt:- | grep "#FF0000" | head -5
# Calculate center of colored region (prints nothing if no pixel matches)
magick /tmp/screen.png txt:- | grep "#FF0000" | awk -F'[,:]' '
BEGIN{sx=0;sy=0;c=0}
{sx+=$1;sy+=$2;c++}
END{if(c>0) printf "Center: (%d, %d)\n", sx/c, sy/c}'
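The pipeline above can be turned into a function that takes the color as a parameter and prints the result in the `x,y` form cliclick expects. This is a sketch under two assumptions: the image is 8-bit, so ImageMagick's `txt:` output contains six-digit hex values like `#FF0000`, and `color_centroid` is a hypothetical name. It returns a non-zero status when no pixel matches.

```shell
# Read ImageMagick "txt:" pixel enumeration on stdin and print the
# average position "x,y" of all pixels whose line contains the given hex color.
# Usage: magick IMAGE txt:- | color_centroid '#FF0000'
color_centroid() {
  grep -F -- "$1" | awk -F'[,:]' '
    { sx += $1; sy += $2; n++ }        # $1 = x, $2 = y from "x,y: (...)"
    END {
      if (n == 0) exit 1               # no matching pixels
      printf "%d,%d\n", sx / n, sy / n
    }'
}

# Example (macOS only, requires ImageMagick):
# target=$(magick /tmp/screen.png txt:- | color_centroid '#FF0000') &&
#   /opt/homebrew/bin/cliclick "c:$target"
```

If the screenshot is 2x the logical resolution, the result still needs the same division by the scale factor before it is passed to cliclick.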
# Example: Click "OK" button at (960, 540)
/opt/homebrew/bin/cliclick c:960,540
# Click to focus, then type
/opt/homebrew/bin/cliclick c:500,300
sleep 0.2
/opt/homebrew/bin/cliclick t:"Hello world"
/opt/homebrew/bin/cliclick kp:return
Located in /Users/eason/clawd/scripts/:
calibrate-cursor.sh - Calibrate coordinate scaling
click-at-visual.sh - Click at screenshot coordinates
get-cursor-pos.sh - Get current cursor position
attach-browser-relay.sh - Auto-click Browser Relay extension
Google OAuth and protected pages block synthetic mouse clicks! Use keyboard navigation:
# Tab to navigate between elements
osascript -e 'tell application "System Events" to keystroke tab'
# Shift+Tab to go backwards
osascript -e 'tell application "System Events" to key code 48 using shift down'
# Enter to activate focused element
osascript -e 'tell application "System Events" to keystroke return'
# Full workflow: Tab 3 times then Enter
osascript -e '
tell application "System Events"
keystroke tab
delay 0.15
keystroke tab
delay 0.15
keystroke tab
delay 0.15
keystroke return
end tell
'
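When the number of Tab presses varies, the AppleScript can be generated rather than written out by hand. A minimal sketch: `tab_then_enter_script` is a hypothetical helper that only prints the script, and the default 0.15 second delay is taken from the example above.

```shell
# Print an AppleScript that presses Tab COUNT times, then Return.
# Usage: tab_then_enter_script COUNT [DELAY_SECONDS]
tab_then_enter_script() {
  local count=$1 delay=${2:-0.15} i=0
  echo 'tell application "System Events"'
  while [ "$i" -lt "$count" ]; do
    echo '    keystroke tab'
    echo "    delay $delay"
    i=$((i + 1))
  done
  echo '    keystroke return'
  echo 'end tell'
}

# Example (macOS only): osascript reads the script from standard input.
# tab_then_enter_script 3 | osascript
```

Printing the script first, without piping it to `osascript`, makes it easy to check the sequence before any keystrokes are sent.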
When to use keyboard instead of mouse:
Problem: Browser Relay may list tabs from multiple Chrome windows, causing snapshot to fail on the desired tab.
Solution:
Check tabs visible to relay:
# In agent code
browser action=tabs profile=chrome
If target tab missing from list → wrong window attached.
Verify single window:
osascript -e 'tell application "Google Chrome" to return count of windows'
Critical: Always verify coordinates BEFORE clicking important buttons.
# 1. Take screenshot
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/before.png"'
# 2. View screenshot (Read tool), note target position
# 3. Move mouse to verify position (optional)
python3 -c "import pyautogui; pyautogui.moveTo(X, Y)"
osascript -e 'do shell script "/usr/sbin/screencapture -C -x /tmp/verify.png"'
# 4. Check cursor is on target, THEN click
/opt/homebrew/bin/cliclick c:X,Y
# 5. Take screenshot to confirm action worked
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/after.png"'
Click lands wrong: Verify scale factor with calibration script
cliclick m: doesn't move cursor visually: Use c: (click) instead, or check with cliclick p to confirm position changed
Permission denied: System Settings → Privacy & Security → Accessibility → Add /opt/homebrew/bin/node
Window not found: Check exact app name:
osascript -e 'tell application "System Events" to get name of every process whose background only is false'
Clicks ignored on OAuth/protected pages: These pages block synthetic events. Use keyboard navigation (Tab + Enter) instead.
pyautogui vs cliclick coordinates differ: Stick with cliclick for consistency. pyautogui may have different coordinate mapping.
Quartz CGEvent clicks don't work: Some pages (Google OAuth) block low-level mouse events too. Keyboard is the only reliable method.
Generated Feb 25, 2026
This scenario involves using the Mac Control skill to automate repetitive UI testing tasks, such as clicking buttons, entering text, and verifying screen states. It's ideal for QA engineers who need to test Mac apps without manual intervention, ensuring consistent test execution and reducing human error. The skill's screenshot and coordinate-clicking capabilities allow for precise validation of UI elements.
In this scenario, the skill automates interactions with browser extensions, like clicking Chrome extension icons or managing pop-ups. It's useful for developers or power users who need to trigger extension actions programmatically, such as activating a bot or tool. The AppleScript integration helps locate exact button positions, enabling reliable automation even in complex browser interfaces.
This scenario focuses on automating the dismissal of system dialogs or pop-ups on Mac, such as software update prompts or alert windows. System administrators can use it to maintain uninterrupted workflows by scripting clicks on 'OK' or 'Cancel' buttons. The skill's ability to handle coordinate scaling ensures compatibility across different display setups.
Here, the skill automates data entry tasks by clicking into text fields, typing information, and submitting forms. It's applicable in industries like finance or healthcare where repetitive form filling is common, saving time and reducing manual input errors. The keyboard navigation features provide fallback options when mouse clicks are blocked by security measures.
This scenario uses the skill's color detection and screenshot analysis to automate interactions based on visual cues, such as clicking on specific colored elements. It can support accessibility tools by enabling users with disabilities to control their Mac via automated scripts, enhancing usability through visual feedback and precise clicking.
Offer a cloud-based service where users can schedule and run Mac automation scripts via the skill, targeting businesses needing repetitive task automation. Revenue comes from subscription tiers based on usage volume and advanced features like analytics or team collaboration. This model leverages the skill's tools to provide scalable, no-code automation solutions.
Provide consulting services to help companies integrate the Mac Control skill into their workflows, such as for testing or data entry automation. Revenue is generated through project-based fees or hourly rates for customization and support. This model caters to organizations with specific automation needs that require tailored scripting and setup.
Develop and sell training courses or certifications on using the skill for Mac automation, targeting IT professionals and developers. Revenue streams include course fees, certification exams, and supplementary materials like scripts or templates. This model capitalizes on the skill's complexity to educate users on best practices and advanced techniques.
💬 Integration Tip
Ensure proper calibration of coordinate scaling for Retina displays using the provided scripts, and use keyboard navigation as a fallback for secure pages like Google OAuth to avoid automation blocks.