server-management | Skill Performance & Reviews | TopRankSkills

TopRank Skills

Home / Skills / tools / server-management

server-management

maintained by xenitV1

star 128 account_tree 21 verified_user MIT License
bolt View GitHub

name: server-management description: Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands. allowed-tools: Read, Write, Edit, Glob, Grep, Bash

Server Management

Server management principles for production operations. Learn to THINK, not memorize commands.


1. Process Management Principles

Tool Selection

Scenario Tool
Node.js app PM2 (clustering, reload)
Any app systemd (Linux native)
Containers Docker/Podman
Orchestration Kubernetes, Docker Swarm

Process Management Goals

Goal What It Means
Restart on crash Auto-recovery
Zero-downtime reload No service interruption
Clustering Use all CPU cores
Persistence Survive server reboot

2. Monitoring Principles

What to Monitor

Category Key Metrics
Availability Uptime, health checks
Performance Response time, throughput
Errors Error rate, types
Resources CPU, memory, disk

Alert Severity Strategy

Level Response
Critical Immediate action
Warning Investigate soon
Info Review daily

Monitoring Tool Selection

Need Options
Simple/Free PM2 metrics, htop
Full observability Grafana, Datadog
Error tracking Sentry
Uptime UptimeRobot, Pingdom

3. Log Management Principles

Log Strategy

Log Type Purpose
Application logs Debug, audit
Access logs Traffic analysis
Error logs Issue detection

Log Principles

  1. Rotate logs to prevent disk fill
  2. Structured logging (JSON) for parsing
  3. Appropriate levels (error/warn/info/debug)
  4. No sensitive data in logs

4. Scaling Decisions

When to Scale

Symptom Solution
High CPU Add instances (horizontal)
High memory Increase RAM or fix leak
Slow response Profile first, then scale
Traffic spikes Auto-scaling

Scaling Strategy

Type When to Use
Vertical Quick fix, single instance
Horizontal Sustainable, distributed
Auto Variable traffic

5. Health Check Principles

What Constitutes Healthy

Check Meaning
HTTP 200 Service responding
Database connected Data accessible
Dependencies OK External services reachable
Resources OK CPU/memory not exhausted

Health Check Implementation

  • Simple: Just return 200
  • Deep: Check all dependencies
  • Choose based on load balancer needs

6. Security Principles

Area Principle
Access SSH keys only, no passwords
Firewall Only needed ports open
Updates Regular security patches
Secrets Environment vars, not files
Audit Log access and changes

7. Troubleshooting Priority

When something's wrong:

  1. Check if running (process status)
  2. Check logs (error messages)
  3. Check resources (disk, memory, CPU)
  4. Check network (ports, DNS)
  5. Check dependencies (database, APIs)

8. Anti-Patterns

❌ Don't ✅ Do
Run as root Use non-root user
Ignore logs Set up log rotation
Skip monitoring Monitor from day one
Manual restarts Auto-restart config
No backups Regular backup schedule

Remember: A well-managed server is boring. That's the goal.

chat Comments (0)

chat_bubble_outline

No comments yet. Be the first to share your thoughts!

Skill Details

GitHub Stars 128
GitHub Forks 21
Created Mar 2026
Last Updated il y a 3 mois
tools tools system admin

Related Skills

docker-expert
chevron_right
telnyx-network
chevron_right
plex

plex

openclaw
star 2.4k
chevron_right
discord-governance
chevron_right
hetzner-provisioner
chevron_right

Build your own?

Join 12,000+ developers contributing to the Claude ecosystem.