name: observability-debugging
description: Debug and troubleshoot SEA-Forge™ services using OpenTelemetry traces, metrics, and logs. Use for investigating performance issues, distributed tracing, log analysis, and production incident response. Integrates with OTel Collector, OpenObserve, and Logfire.
license: Complete terms in LICENSE.txt

Observability & Debugging

Debug SEA-Forge™ services using unified telemetry (traces, metrics, logs) with OpenTelemetry. This skill covers distributed tracing, performance investigation, and log analysis.

For reference: Observability Handbook | SDS-030: Semantic Observability

When to Use This Skill

Performance Issues: Slow requests, high latency
Errors/Failures: 500 errors, exceptions, crashes
Distributed Tracing: Track requests across services
Production Incidents: Real-time troubleshooting
Optimization: Identify bottlenecks

Quick Reference

View Traces

# OpenObserve UI
http://localhost:5080

# Query traces for a specific service (JSON body, so declare the content type)
curl -X POST http://localhost:5080/api/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"serviceName": "case-management"}'

Query Metrics

# Counter metrics
cmmn_cases_active_total{type="research"}

# Histogram metrics
http_request_duration_seconds_bucket{le="0.5"}

# Gauge metrics
knowledge_graph_triples_total

Search Logs

# Logfire structured logs
SELECT * FROM logs
WHERE level='ERROR'
  AND service='artifact-engine'
  AND timestamp > NOW() - INTERVAL '1 hour'

Debugging Workflows

1. Slow Request Investigation

Symptoms: User reports slow page load

Steps:

Find the trace ID in the logs or HTTP headers
Query OpenObserve for that trace
Identify the slowest spans
Check for N+1 queries and slow external API calls
Correlate with resource metrics (CPU, memory) for the same time window
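
The span-level checks above (slowest spans, N+1 queries) can be sketched in Python once a trace's spans have been exported. This is a minimal sketch, not the OpenObserve export format: it assumes each span is a plain dict with the hypothetical keys `name`, `span_id`, `parent_id`, `duration_ms`, and an optional `db_statement`. The sample spans are invented for illustration.

```python
from collections import Counter

def slowest_spans(spans, top_n=3):
    """Return the top_n spans ordered by duration, longest first."""
    return sorted(spans, key=lambda s: s["duration_ms"], reverse=True)[:top_n]

def find_n_plus_one(spans, threshold=5):
    """Flag (parent_id, statement) pairs that repeat at least `threshold` times.

    Many identical statements under one parent span is the usual
    signature of an N+1 query pattern.
    """
    counts = Counter(
        (s["parent_id"], s["db_statement"])
        for s in spans
        if s.get("db_statement")
    )
    return {key: n for key, n in counts.items() if n >= threshold}

# Hypothetical trace: one request, one loader, twenty identical queries.
example_spans = [
    {"name": "GET /cases", "span_id": "a", "parent_id": None, "duration_ms": 950},
    {"name": "load_cases", "span_id": "b", "parent_id": "a", "duration_ms": 900},
] + [
    {
        "name": "db.query",
        "span_id": f"q{i}",
        "parent_id": "b",
        "duration_ms": 40,
        "db_statement": "SELECT * FROM tasks WHERE case_id = ?",
    }
    for i in range(20)
]

print([s["name"] for s in slowest_spans(example_spans, top_n=2)])
print(find_n_plus_one(example_spans))
```

Durations here include child spans, so the root span always ranks first. Subtracting the children's durations from each parent gives self time, which is often the more useful ranking.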

2. Error Root Cause Analysis

Symptoms: 500 error reported

Steps:

Search the logs for the error message
Get the trace ID from the error context
Visualize the full trace to find the failing span
Check that span's attributes for exception details
Correlate the first occurrence with recent deployments
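
The final step, correlating with deployments, is a timestamp comparison. The sketch below assumes a hypothetical list of deployment records (`service`, `version`, `deployed_at`) and a two-hour lookback window. Both are illustrative choices, not part of SEA-Forge™.

```python
from datetime import datetime, timedelta

def suspect_deployment(first_error_at, deployments, window=timedelta(hours=2)):
    """Return the latest deployment within `window` before the first error, or None."""
    candidates = [
        d for d in deployments
        if timedelta(0) <= first_error_at - d["deployed_at"] <= window
    ]
    return max(candidates, key=lambda d: d["deployed_at"], default=None)

# Hypothetical deployment history and first error timestamp.
deployments = [
    {"service": "artifact-engine", "version": "1.4.0",
     "deployed_at": datetime(2026, 1, 10, 9, 0)},
    {"service": "artifact-engine", "version": "1.4.1",
     "deployed_at": datetime(2026, 1, 10, 13, 30)},
]
first_error = datetime(2026, 1, 10, 13, 42)

hit = suspect_deployment(first_error, deployments)
print(hit["version"] if hit else "no recent deployment")
```

A match is only a lead: confirm it by checking whether the failing span's code path changed in that release.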

3. Performance Regression

Symptoms: Latency increased after deployment

Steps:

Compare p95 latency metrics from before and after the deployment
Identify the services whose latency increased
Diff representative traces from both periods
Check for new database queries or API calls
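
To make the before/after p95 comparison concrete, the sketch below estimates a quantile from cumulative histogram buckets such as `http_request_duration_seconds_bucket`. It interpolates linearly within a bucket, similar in spirit to PromQL's `histogram_quantile`. The bucket counts are invented, and the function is an illustration rather than a replacement for the metrics backend's own query.

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with float("inf").
    The lowest bucket is assumed to start at 0.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # Cannot interpolate into the +Inf bucket; report its lower edge.
                return lower_bound
            in_bucket = count - lower_count
            fraction = (rank - lower_count) / in_bucket if in_bucket else 0.0
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound

# Hypothetical cumulative bucket counts (le, count) before and after a deployment.
before = [(0.1, 50), (0.5, 90), (1.0, 100), (float("inf"), 100)]
after = [(0.1, 20), (0.5, 60), (1.0, 80), (2.5, 100), (float("inf"), 100)]

p95_before = histogram_quantile(0.95, before)
p95_after = histogram_quantile(0.95, after)
print(f"p95 before: {p95_before:.3f}s, after: {p95_after:.3f}s")
```

With these sample buckets the estimate moves from 0.750s to 2.125s. The precision of the estimate is limited by the bucket boundaries, so compare periods that use the same bucket layout.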

Integration with SEA-Forge™

Semantic Envelope (SDS-030)

Every telemetry signal includes semantic context:

# Traces include semantic refs
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("generate_artifact") as span:
    span.set_attribute("sea.conceptId", "sea:CognitiveArtifact")
    span.set_attribute("sea.caseId", "case-2026-001")
    span.set_attribute("sea.boundedContext", "cognitive-extension")

Case Management Correlation

Link observability to CMMN cases:

# Log with case context
import logging

logger = logging.getLogger(__name__)

logger.info(
    "Task activated",
    extra={
        "caseId": "case-2026-001",
        "taskId": "task-003",
        "stageId": "whitelabeling"
    }
)
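
Python's standard `logging` module attaches the keys passed via `extra` to the log record as attributes, but the default formatter does not print them. The sketch below shows one way to surface the case fields as JSON. `CaseContextFormatter`, `CASE_FIELDS`, and the `sea.demo` logger name are hypothetical, and how Logfire ingests these fields is not covered here.

```python
import json
import logging

# Hypothetical list of case-correlation fields to copy from `extra`.
CASE_FIELDS = ("caseId", "taskId", "stageId")

class CaseContextFormatter(logging.Formatter):
    """Render each record as one JSON object, including any case fields present."""

    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        for field in CASE_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(CaseContextFormatter())

demo_logger = logging.getLogger("sea.demo")
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # avoid duplicate output via the root logger

demo_logger.info(
    "Task activated",
    extra={"caseId": "case-2026-001", "taskId": "task-003", "stageId": "whitelabeling"},
)
```

Emitting the case fields as top-level JSON keys keeps them queryable alongside `level` and `service` in a structured log search.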
