# observability-lgtm
Install via the ClawdBot CLI:

```shell
clawdbot install nissan/observability-lgtm
```

Sets up a full local observability stack (Loki + Grafana + Tempo + Prometheus + Alloy)
for FastAPI apps on macOS (Apple Silicon) or Linux. One command to start, one import
to instrument any app. Logs → Loki, metrics → Prometheus, traces → Tempo, all
unified in Grafana.
| Service | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards — no login in dev mode |
| Prometheus | 9091 | Metrics scraping (avoids 9090 if MinIO running) |
| Loki | 3300 | Log storage (avoids 3100 if Langfuse running) |
| Tempo gRPC | 4317 | OTLP trace receiver |
| Tempo HTTP | 4318 | OTLP HTTP alternative |
| Alloy UI | 12345 | Agent status |
Check for port conflicts before starting:

```shell
lsof -iTCP -sTCP:LISTEN -n -P 2>/dev/null | grep -E ":(3000|3300|9091|4317|4318|12345)" | awk '{print $9, $1}'
```

If any of the ports above are in use, update the relevant port in docker-compose.yml
and the matching `url:` in config/grafana/provisioning/datasources/datasources.yml.
Common conflicts: Langfuse on 3100, MinIO on 9090.
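For example, moving Grafana off port 3000 is a one-line change to the host-side port mapping. A hedged sketch (the exact keys and service names in the shipped docker-compose.yml may differ):

```yaml
# docker-compose.yml — remap Grafana's host port (hypothetical excerpt)
services:
  grafana:
    ports:
      - "3001:3000"   # host 3001 -> container 3000
```

Services inside the Compose network reach each other by container name, so only ports accessed from the host (and any `url:` entries in datasources.yml that use localhost) need to change.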
Copy these files from the skill directory into a projects/observability/ folder
in the workspace:

- assets/docker-compose.yml
- assets/config/ (entire directory tree)
- assets/lib/observability.py
- assets/scripts/register_app.sh

```shell
mkdir -p projects/observability
cp -r SKILL_DIR/assets/* projects/observability/
mkdir -p projects/observability/logs
touch projects/observability/logs/.gitkeep
chmod +x projects/observability/scripts/register_app.sh
cd projects/observability
docker compose up -d
```
Wait ~15 seconds for all services to start, then verify:

```shell
curl -s -o /dev/null -w "Grafana: %{http_code}\n" http://localhost:3000/api/health
curl -s -o /dev/null -w "Prometheus: %{http_code}\n" http://localhost:9091/-/healthy
curl -s -o /dev/null -w "Loki: %{http_code}\n" http://localhost:3300/ready
curl -s -o /dev/null -w "Tempo: %{http_code}\n" http://localhost:4318/ready
```

All should return 200. If Loki or Tempo returns 503, wait 10 more seconds and retry
(they have a slower startup than Grafana/Prometheus).
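If you'd rather script the wait than eyeball it, a minimal stdlib-only poller might look like the sketch below. This is a hypothetical helper, not part of the skill; the endpoint URLs are taken from the table above.

```python
import time
import urllib.request

# Health endpoints from the port table above.
ENDPOINTS = {
    "Grafana": "http://localhost:3000/api/health",
    "Prometheus": "http://localhost:9091/-/healthy",
    "Loki": "http://localhost:3300/ready",
    "Tempo": "http://localhost:4318/ready",
}

def probe(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except Exception:
        return 0

def all_healthy(statuses: dict) -> bool:
    """True only when every probed service answered 200."""
    return bool(statuses) and all(code == 200 for code in statuses.values())

def wait_until_ready(retries: int = 6, delay: float = 10.0) -> bool:
    """Poll all endpoints, sleeping between rounds, until healthy or exhausted."""
    for _ in range(retries):
        statuses = {name: probe(url) for name, url in ENDPOINTS.items()}
        if all_healthy(statuses):
            return True
        time.sleep(delay)
    return False
```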
Install the Python dependencies:

```shell
pip install \
  "prometheus-fastapi-instrumentator>=7.0.0" \
  "opentelemetry-sdk>=1.25.0" \
  "opentelemetry-exporter-otlp-proto-grpc>=1.25.0" \
  "opentelemetry-instrumentation-fastapi>=0.46b0" \
  "python-json-logger>=2.0.7"
```
Add to the app's app.py (or main.py), just after `app = FastAPI(...)`:

```python
import sys
sys.path.insert(0, "path/to/projects/observability/lib")

from observability import setup_observability

logger = setup_observability(app, service_name="my-service-name")
```
That's it. The app now:

- serves /metrics for Prometheus
- writes its logs to projects/observability/logs/my-service-name/app.log

Then register the app with Prometheus:

```shell
cd projects/observability
./scripts/register_app.sh my-service-name <port>
# e.g.: ./scripts/register_app.sh image-gen-studio 7860
```
Prometheus hot-reloads the target within 30 seconds. Verify:

```shell
curl -s "http://localhost:9091/api/v1/targets" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for t in data['data']['activeTargets']:
    svc = t['labels'].get('service', '')
    print(svc, '->', t['health'])
"
```
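The same check can be wrapped in a reusable function. A sketch that parses the payload shape returned by Prometheus's /api/v1/targets endpoint (`data.activeTargets[]` with `labels` and `health` fields):

```python
def target_health(payload: dict) -> dict:
    """Map each scrape target's `service` label to its health state ("up"/"down")."""
    return {
        t["labels"].get("service", t.get("scrapeUrl", "?")): t["health"]
        for t in payload["data"]["activeTargets"]
    }
```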
Open http://localhost:3000. The FastAPI — App Overview dashboard is pre-loaded;
select your service from the dropdown at the top.

To jump from a log line to its trace, click the trace_id link in the log detail panel.
It opens the full trace in Tempo automatically (the datasource is pre-wired).
In Grafana → Dashboards → Import:
Maintenance commands:

```shell
# Reload Prometheus config after registering a new app:
curl -s -X POST http://localhost:9091/-/reload

# Restart a single service without losing data:
docker compose -f projects/observability/docker-compose.yml restart grafana

# Stop everything (data volumes preserved):
docker compose -f projects/observability/docker-compose.yml down

# Nuclear reset (wipes all stored data):
docker compose -f projects/observability/docker-compose.yml down -v

# Check Alloy log shipping status:
open http://localhost:12345
```
To add custom spans inside a request handler:

```python
from observability import get_tracer

tracer = get_tracer(__name__)

@app.get("/expensive-endpoint")
async def handler():
    with tracer.start_as_current_span("db-query") as span:
        span.set_attribute("db.table", "users")
        result = await db.query(...)
    return result
```
The OTel instrumentation injects trace_id into every log record. Grafana Loki
is pre-configured with a derived field that turns "trace_id":"abc123" into a
clickable link to the Tempo trace.
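That wiring is a Grafana "derived field" on the Loki datasource. It is provisioned in datasources.yml in roughly the following shape — a sketch using Grafana's provisioning field names; the shipped file and the Tempo datasource UID may differ:

```yaml
# datasources.yml — Loki derived field linking trace_id to Tempo (sketch)
jsonData:
  derivedFields:
    - name: trace_id
      matcherRegex: '"trace_id":"(\w+)"'
      url: '$${__value.raw}'
      datasourceUid: tempo
```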
To manually include trace context in your own log calls:

```python
from opentelemetry import trace

def trace_ctx() -> dict:
    ctx = trace.get_current_span().get_span_context()
    return {"trace_id": format(ctx.trace_id, "032x")} if ctx.is_valid else {}

logger.info("Processing request", extra=trace_ctx())
```
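Fields passed via `extra=` become attributes on the log record, and the JSON formatter serializes them into the log line. The skill uses python-json-logger for this; a stdlib-only sketch of the same behavior:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render a record as one JSON object per line, including trace_id if present."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        # extra={"trace_id": ...} lands on the record as an attribute.
        trace_id = getattr(record, "trace_id", None)
        if trace_id:
            payload["trace_id"] = trace_id
        return json.dumps(payload)
```

This is why `logger.info("Processing request", extra=trace_ctx())` produces a log line Loki's derived field can match.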
Each instrumented app writes JSON logs to projects/observability/logs/&lt;service-name&gt;/app.log. Alloy tails these files and ships them to Loki — no code changes needed beyond setup_observability().
data_classification: LOCAL_ONLY is the default for all traces/logs. Edit config/alloy/config.alloy to remove the stage.drop block if you need debug logs.
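For reference, that stage.drop lives inside a loki.process component in Alloy's configuration language. A rough sketch — the component labels and the exact drop expression in the shipped config.alloy may differ:

```alloy
loki.process "apps" {
  // Drop debug-level lines before shipping to Loki.
  // Delete this block to keep debug logs.
  stage.drop {
    expression = "\"level\":\"debug\""
  }
}
```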
Generated Mar 1, 2026
A developer building a suite of microservices with FastAPI needs to monitor each service's performance, logs, and traces locally before deployment. This skill provides a unified dashboard to debug latency issues, track errors across services, and correlate logs with traces for root cause analysis.
A data scientist deploying FastAPI-based model inference endpoints wants to monitor request rates, error percentages, and latency metrics in real-time. The skill enables visualization of model performance, logs for debugging predictions, and traces to identify bottlenecks in preprocessing or inference steps.
A company uses internal FastAPI tools for data processing or reporting and needs observability without external dependencies. This skill sets up local dashboards to track usage patterns, ensure uptime, and debug issues without exposing data to cloud services.
Students or instructors learning about observability in web development can use this skill to instrument FastAPI projects. It offers hands-on experience with logs, metrics, and traces in a controlled local environment, avoiding complex cloud setups.
A startup developing a FastAPI-based MVP requires quick observability to iterate on features and fix bugs. This skill provides a lightweight stack to monitor user interactions, API health, and performance trends during early development phases.
Offer this skill as part of an open-source toolkit for developers, with potential revenue from consulting, custom integrations, or premium support services. It attracts users by simplifying local observability setup.
Integrate this skill into a larger developer platform or IDE extension, providing observability as a value-added feature. Revenue can come from subscription fees or upsells for enhanced analytics and cloud migration tools.
Use the skill as a hands-on component in training programs or workshops focused on DevOps and observability. Revenue is generated through course fees, certifications, or corporate training packages.
💬 Integration Tip
Ensure Docker is running and ports are free before starting; use the provided scripts to automate app registration and avoid manual configuration errors.