sre-engineer
Use when defining SLIs/SLOs, managing error budgets, or building reliable systems at scale. Invoke for incident management, chaos engineering, toil reduction, capacity planning.
Install via ClawdBot CLI:
clawdbot install veeramanikandanr48/sre-engineer
Grade Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Calls external URL not in known-safe list
http://localhost:8080/health
Audited Apr 16, 2026 · audit v1.0
Generated Mar 1, 2026
Define SLIs and SLOs for an online retail site to ensure 99.9% availability during peak shopping seasons, implement error budget policies to manage deployment risks, and automate incident response for payment gateway failures to reduce MTTR.
Monitor golden signals like latency and saturation for a video streaming platform, design chaos engineering experiments to test resilience against server failures, and automate toil in log analysis for capacity scaling during high-demand events.
Establish on-call practices and blameless postmortems for a banking app, implement Prometheus-based alerting for transaction errors, and reduce toil through automation of compliance reporting to maintain SLOs for uptime and security.
Set SLOs for a telemedicine platform to ensure 99.95% availability for patient consultations, build dashboards for error rates and traffic, and automate deployment processes with capacity planning to handle emergency surges.
Identify repetitive tasks in a multi-tenant SaaS environment, automate infrastructure provisioning with Terraform, and implement error budgets to balance feature releases with reliability targets for user satisfaction.
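The availability targets named in the scenarios above (99.9% and 99.95%) can be turned into a concrete error budget with simple arithmetic. The following is a minimal sketch, not part of the skill itself: the function names and the 30-day window are illustrative assumptions.

```python
# Hypothetical helpers; names and the 30-day default window are assumptions.

def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the allowed downtime, in minutes, for an availability SLO."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100 (exclusive)")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the fraction of error budget left; negative means overspent."""
    budget = error_budget_minutes(slo_percent, window_days)
    return 1 - downtime_minutes / budget


if __name__ == "__main__":
    for slo in (99.9, 99.95):
        minutes = error_budget_minutes(slo)
        print(f"SLO {slo}%: {minutes:.1f} minutes of downtime per 30 days")
```

Under these assumptions, a 99.9% target allows about 43.2 minutes of downtime in 30 days and a 99.95% target about 21.6 minutes. An error budget policy might, for example, pause risky deployments once `budget_remaining` drops below an agreed fraction.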
This model relies on recurring revenue from users, where high reliability and uptime are critical to retain customers and meet SLA commitments. SRE practices help manage error budgets to enable safe feature deployments while minimizing churn.
Revenue is generated per sale, making system availability and low latency essential during peak traffic. SRE focuses on SLOs for checkout processes and incident management to prevent revenue loss from downtime.
Income depends on user engagement and ad impressions, requiring scalable systems with reliable performance. SRE implements capacity planning and chaos engineering to ensure uptime for content delivery and ad serving.
💬 Integration Tip
Integrate this skill with existing monitoring tools like Prometheus and incident management platforms such as PagerDuty to streamline SLO tracking and automate alert responses for faster remediation.
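One common way to connect SLO tracking to alerting is a burn-rate check: compare the observed error ratio with the ratio the SLO allows, and page only when both a short and a long window burn budget quickly. The sketch below is illustrative only. The function names are hypothetical, the 14.4 threshold is an assumed example value, and the error ratios would in practice come from a monitoring system such as Prometheus, with paging handled by a platform such as PagerDuty.

```python
# Hypothetical burn-rate check; the threshold and names are assumptions.

def burn_rate(error_ratio: float, slo_percent: float) -> float:
    """Return how fast the error budget is consumed; 1.0 means exactly on budget."""
    allowed_ratio = 1 - slo_percent / 100
    if allowed_ratio <= 0:
        raise ValueError("slo_percent must be below 100")
    return error_ratio / allowed_ratio


def should_page(short_window_ratio: float, long_window_ratio: float,
                slo_percent: float, threshold: float = 14.4) -> bool:
    """Page only if both windows exceed the threshold, to limit noisy alerts."""
    return (burn_rate(short_window_ratio, slo_percent) >= threshold
            and burn_rate(long_window_ratio, slo_percent) >= threshold)


if __name__ == "__main__":
    # 2% of requests failing in both windows against a 99.9% SLO
    print(should_page(0.02, 0.02, 99.9))
    # Short spike only: the long window is still within tolerance
    print(should_page(0.02, 0.005, 99.9))
```

Requiring both windows to exceed the threshold is a design choice that trades a slightly slower page for fewer alerts on brief spikes.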
Scored Apr 16, 2026
Automate web tasks like form filling, data scraping, testing, monitoring, and scheduled jobs with multi-browser support and retry mechanisms.
Automatically update Clawdbot and all installed skills once daily. Runs via cron, checks for updates, applies them, and messages the user with a summary of what changed.
Real-time security monitoring for Clawdbot. Detects intrusions, unusual API calls, credential usage patterns, and alerts on breaches.
Firecrawl CLI for web scraping, crawling, and search. Scrape single pages or entire websites, map site URLs, and search the web with full content extraction. Returns clean markdown optimized for LLM context. Use for research, documentation extraction, competitive intelligence, and content monitoring.
Monitor topics of interest and proactively alert when important developments occur. Use when user wants automated monitoring of specific subjects (e.g., prod...
A clean, reliable system resource monitor for CPU load, RAM, Swap, and Disk usage. Optimized for OpenClaw.