monitoring

Set up observability for applications and infrastructure with metrics, logs, traces, and alerts.
Install via ClawdBot CLI:
clawdbot install ivangdavila/monitoring

| Level | Tools | Setup Time | Best For |
|-------|-------|------------|----------|
| Minimal | UptimeRobot, Healthchecks.io | 15 min | Side projects, MVPs |
| Standard | Uptime Kuma, Sentry, basic Grafana | 1-2 hours | Small teams, startups |
| Professional | Prometheus, Grafana, Loki, Alertmanager | 1-2 days | Production systems |
| Enterprise | Datadog, New Relic, or full OSS stack | Ongoing | Large-scale operations |

| Pillar | What It Answers | Tools |
|--------|-----------------|-------|
| Metrics | "How is the system performing?" | Prometheus, Grafana, Datadog |
| Logs | "What happened?" | Loki, ELK, CloudWatch |
| Traces | "Why is this request slow?" | Jaeger, Tempo, Sentry |

"I just want to know if it's down"
→ UptimeRobot (free) or Uptime Kuma (self-hosted). See simple.md.

"I need to debug production errors"
→ Sentry with your framework SDK. 5-minute setup. See apm.md.

"I want real observability"
→ Prometheus + Grafana + Loki. See prometheus.md.

"I need to centralize logs"
→ Loki for simple needs, ELK for complex queries. See logs.md.
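
For the "is it down" case, the kind of check a hosted uptime monitor performs can be approximated with a small script run on a schedule. The following is a minimal sketch using only the Python standard library. The names `is_healthy_status` and `check_url` and the example URL are illustrative assumptions, not part of UptimeRobot, Uptime Kuma, or this skill.

```python
import urllib.error
import urllib.request


def is_healthy_status(code: int) -> bool:
    """Treat any 2xx or 3xx final status as 'up' (an assumption; tune as needed)."""
    return 200 <= code < 400


def check_url(url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Issue one HTTP GET and return (is_up, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # urlopen follows redirects, so this is the final response status.
            return (is_healthy_status(resp.status), f"HTTP {resp.status}")
    except urllib.error.HTTPError as exc:
        # 4xx/5xx: the server answered, but with an error status.
        return (False, f"HTTP {exc.code}")
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, TLS error, or malformed URL.
        return (False, f"unreachable: {exc}")


# Hypothetical usage; replace the URL with your own health endpoint:
# up, detail = check_url("https://example.com/health")
# print("UP" if up else "DOWN", detail)
```

A script like this only covers a single probe from a single location; hosted tools add scheduling, multiple regions, and notifications on top.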
| Do | Don't |
|----|-------|
| Alert on symptoms (user impact) | Alert on causes (CPU high) |
| Include runbook link | Require investigation to understand |
| Set appropriate severity | Make everything P1 |
| Require action | Alert on "interesting" metrics |

Alert fatigue kills monitoring. If alerts are ignored, you have no monitoring.

For alert configuration, severities, and on-call setup, see alerting.md.
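
The Do/Don't guidance above can be turned into a small review check for alert definitions. This is a sketch under stated assumptions: the `Alert` fields, the `lint_alert` and `p1_ratio` helpers, and the P1 to P3 severity scale are hypothetical and do not correspond to any specific alerting tool's schema.

```python
from dataclasses import dataclass

# Assumed severity scale; the source only mentions P1.
VALID_SEVERITIES = {"P1", "P2", "P3"}


@dataclass
class Alert:
    name: str
    severity: str
    runbook_url: str  # link the responder opens first
    symptom: str      # user-visible impact, e.g. "checkout error rate > 2%"
    action: str       # what the on-call person is expected to do


def lint_alert(alert: Alert) -> list[str]:
    """Return a list of problems; an empty list means the alert passes."""
    problems = []
    if not alert.runbook_url.startswith(("http://", "https://")):
        problems.append("missing runbook link")
    if alert.severity not in VALID_SEVERITIES:
        problems.append(f"unknown severity {alert.severity!r}")
    if not alert.symptom.strip():
        problems.append("no user-facing symptom described")
    if not alert.action.strip():
        problems.append("no required action; consider a dashboard instead")
    return problems


def p1_ratio(alerts: list[Alert]) -> float:
    """Share of alerts marked P1; a value near 1.0 suggests severity inflation."""
    if not alerts:
        return 0.0
    return sum(a.severity == "P1" for a in alerts) / len(alerts)
```

Running `lint_alert` over every alert in review, and watching `p1_ratio` over time, is one way to catch the "make everything P1" pattern before it causes alert fatigue.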
| Solution | Monthly Cost (small) | Monthly Cost (medium) |
|----------|---------------------|----------------------|
| UptimeRobot | Free | $7 |
| Uptime Kuma | $5 (VPS) | $5 (VPS) |
| Sentry | Free / $26 | $80 |
| Grafana Cloud | Free tier | $50+ |
| Datadog | $15/host | $23/host + features |
| Self-hosted stack | $10-20 (VPS) | $50-100 (VPS) |

Generated Mar 1, 2026
An online retail business needs to ensure high availability and performance during peak shopping seasons. They implement UptimeRobot for uptime checks and Sentry for real-time error tracking to quickly identify and resolve issues affecting user transactions.

A software-as-a-service startup uses Grafana Cloud to monitor application metrics like request rates and latency. This helps them optimize performance, reduce downtime, and provide a reliable service to their growing customer base.

A fintech company deploys a self-hosted Prometheus and Loki stack to monitor infrastructure and log all transactions. This supports compliance with regulatory requirements and enables rapid detection of security breaches or system anomalies.

A hospital network implements Datadog to monitor critical patient management systems and medical devices. They set up alerts based on symptoms to maintain uptime and keep patient data accessible and secure.

A manufacturing firm uses a combination of Uptime Kuma and basic Grafana to monitor thousands of connected devices. This allows them to track device health, predict failures, and minimize operational disruptions in real time.

Offer tiered monitoring plans (e.g., free, standard, premium) using tools like UptimeRobot and Sentry. Generate recurring revenue by charging for advanced features, higher alert limits, and priority support.

Provide expertise in setting up and customizing monitoring stacks like Prometheus and Grafana for clients. Charge for initial setup, ongoing maintenance, and training to help businesses optimize their observability.

Host and manage a full OSS stack (e.g., Prometheus, Loki, Alertmanager) on cloud infrastructure. Offer this as a managed service with SLAs, reducing clients' operational overhead and scaling costs based on usage.
💬 Integration Tip

Start with simple tools like UptimeRobot for basic uptime checks before scaling to more complex solutions. Always include runbook links in alerts to reduce response time.