TopRank Skills

afrexai-observability-engine

Complete observability & reliability engineering system. Use when designing monitoring, implementing structured logging, setting up distributed tracing, building alerting systems, creating SLO/SLI frameworks, running incident response, conducting post-mortems, or auditing system reliability. Covers all three pillars (logs/metrics/traces), alert design, dashboard architecture, on-call operations, chaos engineering, and cost optimization.

Stars: 0
Installs: 0
Status: ACTIVE
Visibility: PUBLIC

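Installation

Copy the prompt below and send it to your AI assistant to complete the installation.

First check whether the SkillHub store is installed. If it is not, install it by following https://skillhub-1388575217.cos.ap-guangzhou.myqcloud.com/install/skillhub.md, then install the afrexai-observability-engine skill. If it is already installed, install the afrexai-observability-engine skill directly.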

Overview

Skill Key: 1kalin/afrexai-observability-engine
Author: 1kalin
Source Repo: openclaw/skills
Version: 1.0.0
Source Path: skills/1kalin/afrexai-observability-engine
Latest Commit SHA: cfe76ce391e1005e71da6327df141556dd559ddf

Extracted Content

SKILL.md excerpt

# Observability & Reliability Engineering

Complete system for building observable, reliable services — from structured logging to incident response to SLO-driven development.

---

## Quick Health Check (/16)

Score your current observability posture:

| Signal | Healthy (2) | Weak (1) | Missing (0) |
|--------|-------------|----------|-------------|
| Structured logging | JSON logs with trace_id correlation | Logs exist but unstructured | `console.log` / print statements |
| Metrics collection | RED/USE metrics with dashboards | Some metrics, no dashboards | No metrics |
| Distributed tracing | Full request path with sampling | Partial traces, key services only | No tracing |
| Alerting | SLO-based alerts with runbooks | Threshold alerts, some runbooks | No alerts or all-noise |
| Incident response | Defined process with roles + post-mortems | Ad-hoc response, some docs | "Whoever notices fixes it" |
| SLOs defined | SLOs with error budgets tracked weekly | Informal availability targets | No reliability targets |
| On-call rotation | Structured rotation with escalation | Informal "call someone" | No on-call |
| Cost management | Observability budget tracked monthly | Some awareness of costs | No idea what you spend |

**12-16:** Production-grade. Focus on optimization.
**8-11:** Foundation exists. Fill the gaps systematically.
**4-7:** Significant risk. Prioritize alerting + incident response.
**0-3:** Flying blind. Start with Phase 1 immediately.
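
The scoring above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the signal keys and the `health_score` function are hypothetical names, not part of the skill, and the bands are copied from the list above.

```python
# Hypothetical keys, one per row of the health check table.
SIGNALS = [
    "structured_logging", "metrics_collection", "distributed_tracing",
    "alerting", "incident_response", "slos_defined",
    "on_call_rotation", "cost_management",
]

# (minimum total, verdict), highest band first, taken from the bands above.
BANDS = [
    (12, "Production-grade. Focus on optimization."),
    (8, "Foundation exists. Fill the gaps systematically."),
    (4, "Significant risk. Prioritize alerting + incident response."),
    (0, "Flying blind. Start with Phase 1 immediately."),
]


def health_score(ratings: dict[str, int]) -> tuple[int, str]:
    """Sum the per-signal ratings (0, 1, or 2) and look up the band."""
    for name in SIGNALS:
        if name not in ratings:
            raise ValueError(f"missing rating for {name}")
        if ratings[name] not in (0, 1, 2):
            raise ValueError(f"rating for {name} must be 0, 1, or 2")
    total = sum(ratings[name] for name in SIGNALS)
    # The first band whose floor is at or below the total is the verdict.
    verdict = next(text for floor, text in BANDS if total >= floor)
    return total, verdict


if __name__ == "__main__":
    example = {name: 1 for name in SIGNALS}
    example["structured_logging"] = 2
    print(health_score(example))
```

With eight signals rated 0 to 2, the maximum is 16, which is where the "/16" in the heading comes from.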

---

## Phase 1: Structured Logging

### Log Architecture

```
Application → Structured JSON → Log Router → Storage → Query Engine
                                    ↓
                              Alert Pipeline
```

### Required Fields (Every Log Line)

| Field | Type | Purpose | Example |
|-------|------|---------|---------|
| `timestamp` | ISO-8601 UTC | When | `2026-02-22T18:30:00.123Z` |
| `level` | enum | Severity | `info`, `warn`, `error`, `fatal` |
| `service` | string | Which service | `payment-api` |
| `versio...
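
As a rough illustration of the fields that are visible in the table above, here is a minimal sketch of a JSON formatter built on Python's standard `logging` module. It emits only `timestamp`, `level`, and `service`; the rest of the table is truncated in this excerpt and is deliberately not reconstructed. The `message` key, the `JsonFormatter` class, and the `SERVICE_NAME` constant are assumptions made for the example, not the skill's schema.

```python
import json
import logging
from datetime import datetime, timezone

SERVICE_NAME = "payment-api"  # example value from the table; set per service

# Map stdlib level numbers to the level names shown in the table.
LEVEL_NAMES = {
    logging.INFO: "info",
    logging.WARNING: "warn",
    logging.ERROR: "error",
    logging.CRITICAL: "fatal",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            # ISO-8601 UTC with millisecond precision and a trailing "Z".
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": LEVEL_NAMES.get(record.levelno, record.levelname.lower()),
            "service": SERVICE_NAME,
            # "message" is an assumed field name; it is not in the visible rows.
            "message": record.getMessage(),
        }
        return json.dumps(entry)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("demo")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.warning("card declined for order %s", "A-1001")
```

Keeping the formatter separate from the call sites means application code keeps using ordinary `log.info(...)` calls while the output shape is controlled in one place.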

README excerpt

# AfrexAI Observability Engine 🔭

Complete observability & reliability engineering system for AI agents. Covers structured logging, metrics, distributed tracing, SLOs, alerting, incident response, post-mortems, on-call, chaos engineering, and cost optimization.

## Install

```bash
clawhub install afrexai-observability-engine
```

## What's Inside

A 12-phase system covering the full observability lifecycle:

1. **Structured Logging** — JSON schema, PII scrubbing, log level decision tree, setup by language (Node/Python/Go)
2. **Metrics Collection** — RED + USE methods, naming conventions, label design, instrumentation checklist
3. **Distributed Tracing** — OpenTelemetry setup, sampling strategies, context propagation
4. **SLOs & Error Budgets** — SLI selection, target setting, burn-rate alerts, weekly tracking
5. **Alert Design** — Severity levels, anti-patterns, Prometheus templates, runbook templates
6. **Dashboard Architecture** — 4-level hierarchy (executive → infrastructure), panel specs
7. **Incident Response** — Severity classification, roles, step-by-step workflow, channel templates
8. **Post-Mortem Framework** — Blameless template, 5 Whys, meeting agenda
9. **On-Call Operations** — Rotation design, health metrics, weekly review
10. **Chaos Engineering** — Experiment templates, maturity levels, abort conditions
11. **Cost Optimization** — Cost driver ranking, reduction checklist, monthly review
12. **Advanced Patterns** — Correlation, synthetic monitoring, feature flag observability, maturity model

Plus: /16 health check, 100-point quality rubric, 10 commandments, 12 natural language commands.
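
To make the error-budget and burn-rate terms in item 4 concrete, here is a small sketch using the commonly used definitions: the error budget is `1 - SLO target`, and the burn rate is the observed error rate divided by that budget. The function names and the 30-day window are assumptions for illustration; the skill's own thresholds and templates are not shown in this excerpt.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    return 1.0 - slo_target


def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget.

    A value of 1 means the budget lasts exactly the SLO window;
    a value of 10 means it is being spent ten times too fast.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (failed / total) / error_budget(slo_target)


def days_to_exhaustion(rate: float, window_days: float = 30.0) -> float:
    """Days until a full budget is spent at a constant burn rate."""
    return float("inf") if rate <= 0 else window_days / rate


if __name__ == "__main__":
    # 10 failures out of 1000 requests against a 99.9% SLO.
    rate = burn_rate(failed=10, total=1000, slo_target=0.999)
    print(f"burn rate: {rate:.1f}, budget gone in {days_to_exhaustion(rate):.1f} days")
```

A burn-rate alert then fires when this ratio stays above a chosen multiple over a chosen lookback window; picking those two numbers is the design decision the SLO phase is about.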

## Quick Start

Tell your agent:
> "Audit our observability posture"

Or:
> "Design SLOs for our payment API"

## ⚡ Level Up

For industry-specific observability patterns, check out [AfrexAI Context Packs ($47)](https://afrexai-cto.github.io/context-packs/) — SaaS, Fintech, Healthcare, and 7 more verticals.

### More Free Skills by AfrexAI

- `afrexai-devops-engine...
