android-agent

Control a real Android phone over USB or the network, using GPT-4o vision to run tasks such as opening apps, typing, tapping, and running automation scripts.
Install via ClawdBot CLI:
clawdbot install harshilmathur/android-agent

Plug your old Android phone into your Mac/PC. Now your AI assistant can use it.
Got an old Android in a drawer? Connect it to any machine running OpenClaw: your gateway, a Mac Mini, a Raspberry Pi. Your AI can now open apps, tap buttons, type text, and complete tasks on a real phone. Book a cab, order food, check your bank app, anything you'd do with your thumbs.
Your AI agent sees the phone screen (via screenshots), decides what to tap/type/swipe, and executes actions via ADB. Under the hood it uses DroidRun with GPT-4o vision.
┌─────────────┐   screenshots    ┌──────────────┐   ADB commands   ┌─────────────┐
│   GPT-4o    │◄─────────────────│   DroidRun   │─────────────────►│   Android   │
│   Vision    │─────────────────►│    Agent     │◄─────────────────│    Phone    │
│             │  tap/type/swipe  │              │   screen state   │             │
└─────────────┘                  └──────────────┘                  └─────────────┘
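The "execute actions via ADB" half of this loop can be sketched in Python. This is a minimal illustration and not DroidRun's actual implementation: the `Action` record and the `adb_input_args` helper are hypothetical names, and the only parts taken as given are the standard `adb shell input tap`, `input text`, and `input swipe` commands.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    """Hypothetical action record; DroidRun's real types may differ."""
    kind: str          # "tap", "type", or "swipe"
    x: int = 0
    y: int = 0
    x2: int = 0
    y2: int = 0
    text: str = ""


def adb_input_args(action: Action, serial: Optional[str] = None) -> List[str]:
    """Build the argv for the `adb shell input ...` call matching one action."""
    base = ["adb"] + (["-s", serial] if serial else []) + ["shell", "input"]
    if action.kind == "tap":
        return base + ["tap", str(action.x), str(action.y)]
    if action.kind == "type":
        # Android's `input text` reads %s as a space, so encode spaces that way.
        return base + ["text", action.text.replace(" ", "%s")]
    if action.kind == "swipe":
        return base + ["swipe", str(action.x), str(action.y),
                       str(action.x2), str(action.y2)]
    raise ValueError(f"unknown action kind: {action.kind}")
```

The returned list can be passed to `subprocess.run(args, check=True)`. Text containing shell metacharacters would need extra quoting, which this sketch does not attempt.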
Phone plugged into your OpenClaw gateway machine via USB. Zero networking required.
[Gateway Machine] ──USB──► [Android Phone]
Phone plugged into a Mac Mini, Raspberry Pi, or any OpenClaw node. The gateway controls it over the network.
[Gateway] ──network──► [Mac Mini / Pi node] ──USB──► [Android Phone]
For Node mode, connect ADB over TCP/WiFi so the node can forward commands.
On your Android phone:
# Plug phone in via USB, then:
pip install -r requirements.txt
adb devices # verify phone shows up; authorize on phone if prompted
export OPENAI_API_KEY="sk-..."
python scripts/run-task.py "Open Settings and turn on Dark Mode"
That's it. The script handles everything: waking the screen, unlocking, keeping the display on, and running your task.
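For readers curious what waking, unlocking, and keeping the display on might involve at the ADB level, here is a rough sketch. The command sequence is an assumption and is not taken from `run-task.py`; `prepare_commands` is a hypothetical helper, and the swipe coordinates are placeholders that depend on the screen size. The `input keyevent`, `input swipe`, `input text`, and `svc power stayon` commands themselves are standard ADB shell commands.

```python
from typing import List, Optional


def prepare_commands(serial: Optional[str], pin: Optional[str]) -> List[List[str]]:
    """Return adb argv lists that wake the phone, optionally unlock it,
    and keep the screen on while it is powered."""
    adb = ["adb"] + (["-s", serial] if serial else [])
    cmds = [adb + ["shell", "input", "keyevent", "KEYCODE_WAKEUP"]]
    if pin:  # no PIN available: skip the unlock steps entirely
        cmds += [
            # Swipe up to reveal the PIN pad (placeholder coordinates).
            adb + ["shell", "input", "swipe", "500", "1500", "500", "500"],
            adb + ["shell", "input", "text", pin],
            adb + ["shell", "input", "keyevent", "KEYCODE_ENTER"],
        ]
    cmds.append(adb + ["shell", "svc", "power", "stayon", "true"])
    return cmds
```

Each returned list can be run with `subprocess.run(cmd, check=True)`, reading the PIN from the environment rather than from source code.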
python scripts/run-task.py "Order an Uber to the airport"
python scripts/run-task.py "Set an alarm for 6 AM tomorrow"
python scripts/run-task.py "Check my bank balance on PhonePe"
python scripts/run-task.py "Open Google Maps and navigate to the nearest coffee shop"
python scripts/run-task.py "Send a WhatsApp message to Mom saying I'll be late"
python scripts/run-task.py "Read my latest SMS messages"
python scripts/run-task.py "Open Telegram and check unread messages"
python scripts/run-task.py "Open Amazon and search for wireless earbuds under 2000 rupees"
python scripts/run-task.py "Add milk and bread to my Instamart cart"
python scripts/run-task.py "Open Google Calendar and check my schedule for tomorrow"
python scripts/run-task.py "Create a new note in Google Keep: Buy groceries"
python scripts/run-task.py "Play my Discover Weekly playlist on Spotify"
python scripts/run-task.py "Open YouTube and search for lo-fi study music"
python scripts/run-task.py "Turn on Dark Mode"
python scripts/run-task.py "Connect to my home WiFi network"
python scripts/run-task.py "Enable Do Not Disturb mode"
python scripts/run-task.py "Turn off Bluetooth"
python scripts/run-task.py "Take a screenshot"
python scripts/run-task.py "Open the camera and take a photo"
python scripts/run-task.py "Clear all notifications"
| Variable | Required | Description |
|----------|----------|-------------|
| OPENAI_API_KEY | Yes | API key for GPT-4o vision |
| ANDROID_SERIAL | No | Device serial number. Auto-detected if only one device is connected |
| ANDROID_PIN | No | Phone PIN/password for auto-unlock. If not set, unlock is skipped |
| DROIDRUN_TIMEOUT | No | Task timeout in seconds (default: 120) |
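The table above maps naturally onto a small configuration loader. The sketch below is illustrative only: `Config` and `load_config` are hypothetical names, and the real script may read these variables differently. It does follow the documented rules: the API key is required, the serial and PIN are optional, and the timeout defaults to 120 seconds.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Config:
    api_key: str
    serial: Optional[str]   # None means "auto-detect"
    pin: Optional[str]      # None means "skip unlock"
    timeout: int            # seconds


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read the documented environment variables into a Config."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Config(
        api_key=api_key,
        serial=env.get("ANDROID_SERIAL") or None,
        pin=env.get("ANDROID_PIN") or None,
        timeout=int(env.get("DROIDRUN_TIMEOUT", "120")),
    )
```

Passing the environment as a parameter keeps the loader easy to test without touching the real process environment.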
# macOS
brew install android-platform-tools
# Ubuntu/Debian
sudo apt install android-tools-adb
# Windows
# Download from https://developer.android.com/tools/releases/platform-tools
./scripts/connect.sh usb
adb install droidrun-portal.apk
pip install -r requirements.txt
adb tcpip 5555
adb connect <phone-ip>:5555
# If using SSH tunnel:
ssh -L 15555:<phone-ip>:5555 user@node-ip
export ANDROID_SERIAL="127.0.0.1:15555"
# Or direct WiFi (same network):
./scripts/connect.sh wifi <phone-ip>
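Before running `adb connect` or pointing `ANDROID_SERIAL` at a tunnel, it can help to confirm that the TCP endpoint is reachable at all. The following is a small sketch with hypothetical helper names (`adb_port_reachable`, `serial_for`); it only checks that something accepts connections on the port, not that the phone has authorized this machine.

```python
import socket


def adb_port_reachable(host: str, port: int = 5555, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def serial_for(host: str, port: int = 5555) -> str:
    """Format the host:port string that adb uses as a network device serial."""
    return f"{host}:{port}"
```

For the SSH tunnel case shown above, `serial_for("127.0.0.1", 15555)` yields the same value that is exported as `ANDROID_SERIAL`.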
ANDROID_SERIAL points to.

The DroidRun Portal APK must be installed and running on the phone. It provides the accessibility service that allows DroidRun to read screen content and interact with UI elements.
scripts/run-task.py: The Main Script

# Basic usage
python scripts/run-task.py "Your task description here"
# With options
python scripts/run-task.py --timeout 180 "Install Spotify from Play Store"
python scripts/run-task.py --model gpt-4o "Open Chrome and search for weather"
python scripts/run-task.py --no-unlock "Take a screenshot"
python scripts/run-task.py --serial 127.0.0.1:15555 "Check notifications"
python scripts/run-task.py --verbose "Open Settings"
Options:
| Flag | Description |
|------|-------------|
| goal | Task description (positional, required) |
| --timeout | Timeout in seconds (default: 120, or DROIDRUN_TIMEOUT env) |
| --model | LLM model to use (default: gpt-4o) |
| --no-unlock | Skip the auto-unlock step |
| --serial | Device serial (default: ANDROID_SERIAL env or auto-detect) |
| --verbose | Show detailed debug output |
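As an illustration of how the options in this table could be declared, here is a sketch using Python's standard `argparse` module. It is an assumption about structure, not the script's actual source: `build_parser` is a hypothetical name, and the sketch does not cover the `--screenshot` mode, which is not listed in the table.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(prog="run-task.py")
    parser.add_argument("goal", help="Task description")
    parser.add_argument("--timeout", type=int,
                        default=int(os.environ.get("DROIDRUN_TIMEOUT", "120")),
                        help="Timeout in seconds")
    parser.add_argument("--model", default="gpt-4o", help="LLM model to use")
    parser.add_argument("--no-unlock", action="store_true",
                        help="Skip the auto-unlock step")
    parser.add_argument("--serial", default=os.environ.get("ANDROID_SERIAL"),
                        help="Device serial; None means auto-detect")
    parser.add_argument("--verbose", action="store_true",
                        help="Show detailed debug output")
    return parser
```

With this layout an explicit flag takes precedence over the environment variable, which in turn takes precedence over the built-in default.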
scripts/connect.sh: Setup & Verify Connection

./scripts/connect.sh # Auto-detect USB device
./scripts/connect.sh usb # USB mode (explicit)
./scripts/connect.sh wifi 192.168.1.100 # WiFi/TCP mode
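Auto-detection generally comes down to reading the output of `adb devices`. The sketch below shows one way to do that in Python; it is not the logic of `connect.sh`, and `parse_adb_devices` and `pick_serial` are hypothetical helpers. It treats only devices in the `device` state as ready, so a phone that still shows `unauthorized` is ignored until the prompt on the phone is accepted.

```python
from typing import List, Tuple


def parse_adb_devices(output: str) -> List[Tuple[str, str]]:
    """Parse `adb devices` output into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line, and adb daemon status lines.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def pick_serial(output: str) -> str:
    """Return the serial if exactly one device is ready, else raise."""
    ready = [serial for serial, state in parse_adb_devices(output) if state == "device"]
    if len(ready) != 1:
        raise RuntimeError(f"expected exactly one ready device, found {len(ready)}")
    return ready[0]
```

The input string would typically come from `subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout`.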
scripts/screenshot.sh: Screenshot (ADB screencap, reliable)

DroidRun's internal screenshot sometimes fails on certain devices. Use this script to bypass DroidRun and capture a PNG directly via ADB.
# Save to /tmp/android-screenshot.png
./scripts/screenshot.sh
# Explicit serial + output path
./scripts/screenshot.sh 127.0.0.1:15555 /tmp/a03.png
You can also do it from Python:
python scripts/run-task.py --screenshot --serial 127.0.0.1:15555 --screenshot-path /tmp/a03.png
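A direct ADB capture can also be written in a few lines of Python. The sketch below is one common approach using `adb exec-out screencap -p`; it is not necessarily what `screenshot.sh` or the `--screenshot` flag does internally, and the helper names are hypothetical. It checks the PNG signature so that an ADB error message is not silently saved as an image.

```python
import subprocess
from typing import List, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def screencap_args(serial: Optional[str] = None) -> List[str]:
    """Build the argv for a raw PNG capture over ADB."""
    return ["adb"] + (["-s", serial] if serial else []) + ["exec-out", "screencap", "-p"]


def looks_like_png(data: bytes) -> bool:
    """Return True if the bytes start with the 8-byte PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def capture(path: str, serial: Optional[str] = None) -> None:
    """Capture the screen and write it to path, refusing non-PNG output."""
    data = subprocess.run(screencap_args(serial), check=True,
                          capture_output=True).stdout
    if not looks_like_png(data):
        raise RuntimeError("screencap did not return PNG data")
    with open(path, "wb") as fh:
        fh.write(data)
```

For example, `capture("/tmp/a03.png", "127.0.0.1:15555")` mirrors the shell invocation shown above.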
scripts/status.sh: Device Status

./scripts/status.sh
# Output:
# 📱 Device: Samsung Galaxy A03 (SM-A035F)
# 🤖 Android: 11 (API 30)
# 🔋 Battery: 87%
# 📺 Screen: ON (unlocked)
# 🔌 Connection: USB
# 📦 DroidRun Portal: installed (v0.5.5)
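Fields like the battery percentage are typically derived from ADB queries such as `adb shell dumpsys battery`. As a hedged illustration (this is not the source of `status.sh`, and `battery_level` is a hypothetical helper), the level can be extracted from that command's output like this:

```python
import re
from typing import Optional


def battery_level(dumpsys_output: str) -> Optional[int]:
    """Extract the `level:` value from `adb shell dumpsys battery` output."""
    match = re.search(r"^\s*level:\s*(\d+)\s*$", dumpsys_output, re.MULTILINE)
    return int(match.group(1)) if match else None
```

The function returns `None` when no `level:` line is present, so a caller can tell an unreadable status apart from an empty battery.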
adb kill-server && adb start-server
--verbose to see what the agent is seeing
./scripts/connect.sh usb first, then ./scripts/connect.sh wifi
adb shell getevent -l and tap each digit
adb shell input text as a fallback on some devices
ANDROID_PIN environment variable (never hardcode it)
Use a dedicated test device, not your primary phone.
Bottom line: Treat the connected phone as a "work device for AI." Don't leave personal accounts logged in. Don't store secrets on it. If you wouldn't hand your unlocked phone to a stranger, don't point this skill at it.
MIT (see LICENSE)
Generated Mar 1, 2026
Businesses can use the android-agent to automate testing of customer support apps on real Android devices, ensuring functionality across updates. It simulates user interactions like submitting tickets or checking statuses, reducing manual QA effort and improving app reliability.
Caregivers or service providers can deploy this skill to help elderly or disabled individuals manage daily phone tasks remotely, such as setting reminders, sending messages, or accessing health apps. It enhances independence by automating complex interactions they might struggle with.
Retail employees can automate inventory checks and updates using the android-agent on company phones, scanning barcodes or updating stock levels in apps. This streamlines operations, reduces human error, and frees up staff for customer service.
Content creators or marketers can schedule posts and interact with followers on platforms like Instagram or WhatsApp via the android-agent, automating repetitive tasks. It allows for consistent engagement without constant manual input on mobile devices.
Individuals can automate personal finance tasks, such as checking bank balances, paying bills, or tracking expenses through banking apps on their Android phones. This saves time and ensures timely financial management without daily manual oversight.
Offer a cloud-based service where companies subscribe to use the android-agent for automated testing or customer support on multiple devices. Revenue comes from monthly fees based on usage tiers, device counts, or support levels, targeting SMEs and enterprises.
Develop a user-friendly app that individuals can install for free with basic automation features, like setting alarms or sending messages. Monetize through premium upgrades for advanced tasks, such as financial automation or ad-free experiences, driving in-app purchases.
Provide consulting services to businesses for custom integrations of the android-agent into their workflows, such as retail inventory systems or healthcare apps. Revenue is generated from project-based fees, ongoing maintenance contracts, and training sessions.
💬 Integration Tip
Ensure ADB is properly configured and the DroidRun Portal APK is installed with accessibility permissions to avoid common setup issues.