Monitoring Containers: Knowing What's Happening

You can't manage what you can't measure. You can't fix what you don't know is broken. That's why monitoring matters.

🎯 The Big Picture

Think of monitoring like a dashboard in a car. You see speed (CPU). You see fuel (memory). You see temperature (health). You know what's happening. That's monitoring.

Monitoring tells you what's happening. Right now. In real time. It's your window into containers.

Why Monitor Containers?

The problem without monitoring:

Don't know if containers are healthy
Don't know resource usage
Don't know when things break
Can't troubleshoot effectively
Flying blind

The solution with monitoring:

Know container health
Know resource usage
Know when things break
Can troubleshoot effectively
Always informed

Real example: I once had a production issue. No monitoring. Took hours to find the problem. With monitoring, I would have known immediately. Never again.

Monitoring isn't optional. It's essential.

What to Monitor

Key metrics to monitor:

1. Container Health

Is the container running?

docker ps
# Shows running containers

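Is the container healthy?

docker inspect --format '{{json .State.Health}}' container
# Shows health status and recent check results (only meaningful if the container defines a health check)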

Think of it as: Is the car running? Is the engine healthy?
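The two checks above can be rolled into one pass over every container. Here's a minimal sketch. The function name flag_unhealthy is my own, not a Docker command. It assumes input lines shaped like name<TAB>status, which is what docker ps -a --format '{{.Names}}\t{{.Status}}' prints.

```shell
# flag_unhealthy: hypothetical helper, not part of Docker.
# Reads "name<TAB>status" lines on stdin and prints the containers
# that are not up, or that are up but report "(unhealthy)".
flag_unhealthy() {
  awk -F '\t' '
    $2 !~ /^Up/          { print $1 ": not running (" $2 ")"; next }
    $2 ~ /\(unhealthy\)/ { print $1 ": unhealthy" }
  '
}

# Usage (requires a running Docker daemon):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | flag_unhealthy
```

No output means every container is up and none reports unhealthy. Containers without a health check never show "(unhealthy)", so they only appear when they stop.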

2. Resource Usage

CPU and memory usage:

docker stats --no-stream container
# Shows CPU %, memory usage and limit, network I/O, and block I/O

Disk usage:

docker system df
# Shows disk space used by images, containers, volumes, and build cache

Think of it as: How much fuel? How much power? How much storage?

3. Application Metrics

Request metrics:

Requests per second
Response times
Error rates

Business metrics:

User activity
Transaction volume
Revenue impact

Think of it as: How many customers? How fast is the service? How many errors?
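Request count and error rate can be pulled straight from an access log. A rough sketch, with assumptions: the helper name request_summary is mine, the log uses the common "combined" layout where the status code is the 9th whitespace-separated field, and only 5xx responses count as errors. Adjust all three to your application.

```shell
# request_summary: hypothetical helper for illustration.
# Assumes the "combined" access-log layout, where the HTTP status
# code is the 9th whitespace-separated field. Lines without a
# three-digit status in that position are ignored.
request_summary() {
  awk '
    $9 ~ /^[0-9][0-9][0-9]$/ { total++; if ($9 >= 500) errors++ }
    END {
      if (total == 0) { print "requests=0 errors=0 error_rate=0.0%"; exit }
      printf "requests=%d errors=%d error_rate=%.1f%%\n", total, errors, 100 * errors / total
    }
  '
}

# Usage (for a container that writes its access log to stdout):
#   docker logs container 2>/dev/null | request_summary
```

This is a one-off snapshot. For rates over time, export the same numbers as metrics and let your monitoring system do the math.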

4. Logs

Application logs:

docker logs container
# Shows application logs

Docker daemon logs (on hosts that use systemd):

journalctl -u docker.service
# Shows Docker daemon logs

Think of it as: What happened? When? Why?
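Raw logs get long fast. A quick count by level tells you where to look first. This is a sketch under one assumption: your application writes the level (ERROR, WARN, INFO) as a plain word on each line. The helper name count_levels is made up for this example.

```shell
# count_levels: hypothetical helper for illustration.
# Counts lines mentioning ERROR, WARN, or INFO (case-insensitive).
# A line is counted once, under the most severe level it mentions.
count_levels() {
  awk '
    { line = toupper($0) }
    line ~ /ERROR/ { e++; next }
    line ~ /WARN/  { w++; next }
    line ~ /INFO/  { i++ }
    END { printf "error=%d warn=%d info=%d\n", e, w, i }
  '
}

# Usage (last 10 minutes, stdout and stderr together):
#   docker logs --since 10m container 2>&1 | count_levels
```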

Basic Monitoring: docker stats

See resource usage:

# Live stats for all containers
docker stats

# Stats for specific container
docker stats container-name

# One-time snapshot
docker stats --no-stream container-name

# Custom format
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"

What you see (abridged; the default output also includes container ID, block I/O, and PIDs columns):

CONTAINER   CPU %     MEM USAGE / LIMIT     MEM %     NET I/O
app         45.2%     256MiB / 512MiB       50.0%     1.2MB / 800KB
db          12.5%     1.2GiB / 2GiB         60.0%     500KB / 1.5MB

Think of it as: Dashboard. See everything. At a glance.
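Watching numbers scroll by is fine for a minute. After that you want the outliers only. A small sketch: the helper over_threshold is hypothetical, and it assumes input lines shaped like "name cpu% mem%", which is what docker stats prints with the format string shown in the usage comment.

```shell
# over_threshold: hypothetical helper for illustration.
# Reads "name cpu% mem%" lines and prints every container whose CPU
# or memory percentage is above the limit passed as first argument.
over_threshold() {
  awk -v limit="$1" '
    {
      cpu = $2; mem = $3
      sub(/%/, "", cpu); sub(/%/, "", mem)   # strip the percent sign
      if (cpu + 0 > limit + 0) print $1 " cpu " cpu "%"
      if (mem + 0 > limit + 0) print $1 " mem " mem "%"
    }
  '
}

# Usage (requires a running Docker daemon):
#   docker stats --no-stream --format '{{.Name}} {{.CPUPerc}} {{.MemPerc}}' | over_threshold 80
```

Keep in mind that CPU % is measured per core, so a busy container on a multi-core host can legitimately show more than 100%.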

The Car Dashboard Analogy

Think of monitoring like a car dashboard:

Container health: Engine status
CPU usage: RPM gauge
Memory usage: Fuel gauge
Network I/O: Speedometer
Logs: Event recorder

Once you see it this way, monitoring makes perfect sense.

Advanced Monitoring: Prometheus & Grafana

Production monitoring setup:

docker-compose.yml:

services:
  # Prometheus - Metrics collection
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - monitoring

  # Grafana - Visualization
  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards:ro
    ports:
      - "3000:3000"
    environment:
      # Default credentials for local testing only; change before exposing Grafana
      - GF_SECURITY_ADMIN_PASSWORD=admin
    depends_on:
      - prometheus
    networks:
      - monitoring

  # Node Exporter - Host metrics
  node-exporter:
    image: prom/node-exporter:latest
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      # Point filesystem metrics at the host root mounted above
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - "9100:9100"
    networks:
      - monitoring

volumes:
  prometheus-data:
  grafana-data:

networks:
  monitoring:
    driver: bridge

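What this provides:

Metrics collection and storage (Prometheus)
Visualization (Grafana)
Host metrics (Node Exporter)

What it does not provide yet: Prometheus still needs a prometheus.yml that lists the targets to scrape, and per-container metrics need an additional exporter such as cAdvisor. Add those and you have a complete monitoring stack.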
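After docker compose up -d, it's worth confirming that all three services actually answer. A minimal sketch, assuming the ports published in the compose file above and that curl is installed. The helper names probe and check_stack are mine. The FETCH variable is only there so the function can be exercised without a network.

```shell
# probe: hypothetical helper for illustration.
# Prints "<name> up" when the URL answers with a success status,
# "<name> DOWN" otherwise. FETCH defaults to curl.
probe() {
  if "${FETCH:-curl}" -fsS -o /dev/null --max-time 5 "$2" 2>/dev/null; then
    echo "$1 up"
  else
    echo "$1 DOWN"
  fi
}

# check_stack: probes the three services on the ports published above.
check_stack() {
  probe prometheus    http://localhost:9090/-/healthy
  probe grafana       http://localhost:3000/api/health
  probe node-exporter http://localhost:9100/metrics
}

# Usage:
#   check_stack
```

A service that answers is not the same as a service that collects data. Open the Prometheus targets page as well and confirm each target is being scraped.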

Monitoring Best Practices

1. Monitor Everything

Monitor:

Container health
Resource usage
Application metrics
Logs
Network traffic

Why: Complete picture. Nothing missed.

2. Set Alerts

Alert on:

Container down
High CPU usage
High memory usage
High error rate
Slow response times

Why: Know immediately. Fix quickly.
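One spike should not page anyone. A common way to keep alerts meaningful is to fire only when a value stays high for several samples in a row. Here's a sketch of that idea. The helper sustained_alert and its two arguments (limit, count) are hypothetical; a real setup would usually express the same rule in its alerting tool.

```shell
# sustained_alert: hypothetical helper for illustration.
# Reads one numeric sample per line and prints an alert only when
# `count` consecutive samples are above `limit`. It fires once per
# streak and re-arms after the value drops back to the limit or below.
sustained_alert() {
  awk -v limit="$1" -v count="$2" '
    $1 + 0 >  limit + 0 { run++ }
    $1 + 0 <= limit + 0 { run = 0 }
    run == count { print "ALERT at sample " NR ": " count " samples above " limit }
  '
}

# Usage (samples CPU of a container named "app" every 10 seconds):
#   while sleep 10; do
#     docker stats --no-stream --format '{{.CPUPerc}}' app | tr -d '%'
#   done | sustained_alert 90 3
```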

3. Use Dashboards

Create dashboards:

Overview dashboard
Per-service dashboards
Infrastructure dashboard
Business metrics dashboard

Why: Visual. Easy to understand. Quick insights.

4. Log Aggregation

Centralize logs:

ELK Stack (Elasticsearch, Logstash, Kibana)
Loki + Grafana
Cloud logging (CloudWatch, Stackdriver)

Why: All logs in one place. Easy to search. Easy to analyze.

5. Monitor Trends

Track over time:

Resource usage trends
Error rate trends
Performance trends
User activity trends

Why: Predict problems. Plan capacity. Optimize.
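Trends are where capacity planning starts. As a rough illustration, the sketch below draws a straight line through the first and last disk-usage samples and estimates how long until the disk is full. The helper name hours_until_full and the input shape "epoch_seconds used_percent" are my assumptions. Real usage rarely grows in a straight line, so treat the result as a hint.

```shell
# hours_until_full: hypothetical helper for illustration.
# Reads "epoch_seconds used_percent" samples and extrapolates
# linearly from the first to the last sample to estimate the
# hours left until usage reaches 100%.
hours_until_full() {
  awk '
    NR == 1 { t0 = $1; u0 = $2 }
    { t1 = $1; u1 = $2 }
    END {
      if (NR < 2 || t1 == t0 || u1 <= u0) { print "no growth detected"; exit }
      rate = (u1 - u0) / ((t1 - t0) / 3600)      # percent per hour
      printf "%.1f hours until full\n", (100 - u1) / rate
    }
  '
}

# Usage (record a sample periodically, then project):
#   echo "$(date +%s) $(df -P /var/lib/docker | awk 'NR==2 { sub(/%/, "", $5); print $5 }')" >> disk-usage.log
#   hours_until_full < disk-usage.log
```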

Real-World Example: Complete Monitoring

Let me show you a complete setup:

1. Container metrics:

services:
  app:
    labels:
      - "prometheus.scrape=true"
      - "prometheus.port=3000"
      - "prometheus.path=/metrics"

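2. Prometheus config. Prometheus needs access to the Docker socket (for example, mount /var/run/docker.sock into its container), and the labels from step 1 only take effect through relabel rules. The rule below handles prometheus.scrape; the port and path labels would need similar rules:

# prometheus.yml
scrape_configs:
  - job_name: 'containers'
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Keep only containers labeled prometheus.scrape=true
      - source_labels: [__meta_docker_container_label_prometheus_scrape]
        regex: 'true'
        action: keep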

3. Grafana dashboard:

CPU usage graph
Memory usage graph
Request rate graph
Error rate graph
Response time graph

Metrics, scraping, dashboards. A solid starting point for production.

My Take: Monitoring Strategy

Here's what I monitor:

Infrastructure:

Container health
Resource usage
Network traffic
Disk usage

Application:

Request rate
Response times
Error rates
Business metrics

Security:

Failed logins
Unusual activity
Access patterns
Compliance

The key: Monitor everything. Alert on what matters. Visualize for understanding. Act on alerts.

Memory Tip: The Car Dashboard Analogy

Monitoring = Car dashboard

Health: Engine status
CPU: RPM
Memory: Fuel
Network: Speed
Logs: Recorder

Once you see it this way, monitoring makes perfect sense.

Common Mistakes

Not monitoring: Flying blind
Too many alerts: Alert fatigue
No dashboards: Hard to understand
Not acting on alerts: Wasted effort
Not monitoring trends: Miss patterns

Hands-On Exercise: Monitor Containers

1. Run a container:

docker run -d --name test nginx

2. Monitor with stats:

docker stats test
# Watch CPU and memory

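3. Check health:

docker inspect --format '{{json .State.Health}}' test
# The stock nginx image defines no health check, so there is no health status to report here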

4. View logs:

docker logs -f test
# Follow logs

5. Check resource usage:

docker stats --no-stream test
# One-time snapshot

Key Takeaways

Monitor everything - Complete picture
Set alerts - Know immediately
Use dashboards - Visual understanding
Aggregate logs - Centralized search
Track trends - Predict problems
Act on alerts - Fix issues quickly

What's Next?

Now that you understand monitoring, let's learn about logging strategies. Next: Logging Strategies.

Remember: Monitoring is like a car dashboard. You see everything. You know what's happening. Essential for production. Never skip it.
