Overview
- Skill Key: 1kalin/afrexai-observability-engine
- Author: 1kalin
- Source Repo: openclaw/skills
- Version: 1.0.0
- Source Path: skills/1kalin/afrexai-observability-engine
- Latest Commit SHA: cfe76ce391e1005e71da6327df141556dd559ddf
Complete observability & reliability engineering system. Use when designing monitoring, implementing structured logging, setting up distributed tracing, building alerting systems, creating SLO/SLI frameworks, running incident response, conducting post-mortems, or auditing system reliability. Covers all three pillars (logs/metrics/traces), alert design, dashboard architecture, on-call operations, chaos engineering, and cost optimization.
- Stars: 0
- Installs: 0
- Status: ACTIVE
- Visibility: PUBLIC
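Copy the prompt below and send it to your AI assistant to complete the installation.
First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the afrexai-observability-engine skill. If it is already installed, install the afrexai-observability-engine skill directly.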
# Observability & Reliability Engineering
Complete system for building observable, reliable services — from structured logging to incident response to SLO-driven development.
---
## Quick Health Check (/16)
Score your current observability posture:
| Signal | Healthy (2) | Weak (1) | Missing (0) |
|--------|-------------|----------|-------------|
| Structured logging | JSON logs with trace_id correlation | Logs exist but unstructured | Console.log / print statements |
| Metrics collection | RED/USE metrics with dashboards | Some metrics, no dashboards | No metrics |
| Distributed tracing | Full request path with sampling | Partial traces, key services only | No tracing |
| Alerting | SLO-based alerts with runbooks | Threshold alerts, some runbooks | No alerts or all-noise |
| Incident response | Defined process with roles + post-mortems | Ad-hoc response, some docs | "Whoever notices fixes it" |
| SLOs defined | SLOs with error budgets tracked weekly | Informal availability targets | No reliability targets |
| On-call rotation | Structured rotation with escalation | Informal "call someone" | No on-call |
| Cost management | Observability budget tracked monthly | Some awareness of costs | No idea what you spend |
**12-16:** Production-grade. Focus on optimization.
**8-11:** Foundation exists. Fill the gaps systematically.
**4-7:** Significant risk. Prioritize alerting + incident response.
**0-3:** Flying blind. Start with Phase 1 immediately.
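The scoring above is simple arithmetic: eight signals, each worth 0 to 2 points, summed and mapped to one of four bands. A minimal Python sketch of that tally follows. The signal keys (`structured_logging`, `metrics_collection`, and so on) and the `health_check` function name are hypothetical identifiers chosen for illustration; only the point values and band thresholds come from the table.

```python
from typing import Dict, Tuple

# Hypothetical keys, one per row of the health-check table.
SIGNALS = (
    "structured_logging",
    "metrics_collection",
    "distributed_tracing",
    "alerting",
    "incident_response",
    "slos_defined",
    "on_call_rotation",
    "cost_management",
)

# (minimum total, verdict) pairs, checked from the highest band down.
BANDS = (
    (12, "Production-grade. Focus on optimization."),
    (8, "Foundation exists. Fill the gaps systematically."),
    (4, "Significant risk. Prioritize alerting + incident response."),
    (0, "Flying blind. Start with Phase 1 immediately."),
)


def health_check(scores: Dict[str, int]) -> Tuple[int, str]:
    """Sum the per-signal scores (0, 1, or 2) and return (total, verdict)."""
    if set(scores) != set(SIGNALS):
        raise ValueError("scores must contain exactly the eight signals")
    for name, value in scores.items():
        if value not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1, or 2, got {value!r}")

    total = sum(scores.values())
    for floor, verdict in BANDS:
        if total >= floor:
            return total, verdict
    raise AssertionError("unreachable: the lowest band starts at 0")
```

Rejecting missing or out-of-range signals keeps a partial audit from silently scoring lower than it should.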
---
## Phase 1: Structured Logging
### Log Architecture
```
Application → Structured JSON → Log Router → Storage → Query Engine
↓
Alert Pipeline
```
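The first hop of this pipeline (Application → Structured JSON) can be sketched with Python's standard `logging` module. This is a minimal illustration under stated assumptions, not the skill's own setup. It emits the `timestamp`, `level`, and `service` fields from the required-fields table, plus an optional `trace_id` as mentioned in the health check. The `message` key, the `JsonFormatter` class, and the `build_logger` helper are hypothetical names, and the mapping of Python's `WARNING`/`CRITICAL` to `warn`/`fatal` is an assumption made to match the level examples.

```python
import json
import logging
import sys
from datetime import datetime, timezone

# Assumed mapping from Python level names to the enum values shown in the table.
LEVEL_NAMES = {"WARNING": "warn", "CRITICAL": "fatal"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            # ISO-8601 UTC with millisecond precision and a trailing "Z".
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": LEVEL_NAMES.get(record.levelname, record.levelname.lower()),
            "service": self.service,
            "message": record.getMessage(),  # hypothetical field name
        }
        # trace_id is supplied per call via logging's `extra` mechanism.
        trace_id = getattr(record, "trace_id", None)
        if trace_id is not None:
            entry["trace_id"] = trace_id
        return json.dumps(entry)


def build_logger(service: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes one JSON line per event to `stream`."""
    logger = logging.getLogger(service)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the root logger from re-emitting as plain text
    return logger


if __name__ == "__main__":
    log = build_logger("payment-api")
    log.warning("card declined", extra={"trace_id": "abc123"})
```

Writing one JSON object per line to stdout leaves shipping to the log router, so the application does not need to know where logs are stored.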
### Required Fields (Every Log Line)
| Field | Type | Purpose | Example |
|-------|------|---------|---------|
| `timestamp` | ISO-8601 UTC | When | `2026-02-22T18:30:00.123Z` |
| `level` | enum | Severity | `info`, `warn`, `error`, `fatal` |
| `service` | string | Which service | `payment-api` |
| `versio...
# AfrexAI Observability Engine 🔭

Complete observability & reliability engineering system for AI agents. Covers structured logging, metrics, distributed tracing, SLOs, alerting, incident response, post-mortems, on-call, chaos engineering, and cost optimization.

## Install

```bash
clawhub install afrexai-observability-engine
```

## What's Inside

A 12-phase system covering the full observability lifecycle:

1. **Structured Logging**: JSON schema, PII scrubbing, log level decision tree, setup by language (Node/Python/Go)
2. **Metrics Collection**: RED + USE methods, naming conventions, label design, instrumentation checklist
3. **Distributed Tracing**: OpenTelemetry setup, sampling strategies, context propagation
4. **SLOs & Error Budgets**: SLI selection, target setting, burn-rate alerts, weekly tracking
5. **Alert Design**: Severity levels, anti-patterns, Prometheus templates, runbook templates
6. **Dashboard Architecture**: 4-level hierarchy (executive → infrastructure), panel specs
7. **Incident Response**: Severity classification, roles, step-by-step workflow, channel templates
8. **Post-Mortem Framework**: Blameless template, 5 Whys, meeting agenda
9. **On-Call Operations**: Rotation design, health metrics, weekly review
10. **Chaos Engineering**: Experiment templates, maturity levels, abort conditions
11. **Cost Optimization**: Cost driver ranking, reduction checklist, monthly review
12. **Advanced Patterns**: Correlation, synthetic monitoring, feature flag observability, maturity model

Plus: /16 health check, 100-point quality rubric, 10 commandments, 12 natural language commands.

## Quick Start

Tell your agent:

> "Audit our observability posture"

Or:

> "Design SLOs for our payment API"

## ⚡ Level Up

For industry-specific observability patterns, check out [AfrexAI Context Packs ($47)](https://afrexai-cto.github.io/context-packs/): SaaS, Fintech, Healthcare, and 7 more verticals.

### More Free Skills by AfrexAI

- `afrexai-devops-engine...