Monitoring Containers: Knowing What's Happening
You can't manage what you can't measure. You can't fix what you don't know is broken. That's why monitoring matters.
🎯 The Big Picture​
Think of monitoring like a dashboard in a car. You see speed (CPU). You see fuel (memory). You see temperature (health). You know what's happening. That's monitoring.
Monitoring tells you what's happening. Right now. In real-time. It's your window into containers.
Why Monitor Containers?​
The problem without monitoring:
- Don't know if containers are healthy
- Don't know resource usage
- Don't know when things break
- Can't troubleshoot effectively
- Flying blind
The solution with monitoring:
- Know container health
- Know resource usage
- Know when things break
- Can troubleshoot effectively
- Always informed
Real example: I once had a production issue. No monitoring. Took hours to find the problem. With monitoring, I would have known immediately. Never again.
Monitoring isn't optional. It's essential.
What to Monitor​
Key metrics to monitor:
1. Container Health​
Is the container running?
docker ps
# Shows running containers
Is the container healthy?
docker inspect container | grep Health
# Shows health status
Think of it as: Is the car running? Is the engine healthy?
2. Resource Usage​
CPU usage:
docker stats container
# Shows CPU usage
Memory usage:
docker stats container
# Shows memory usage
Disk usage:
docker system df
# Shows disk usage
Think of it as: How much fuel? How much power? How much storage?
3. Application Metrics​
Request rate:
- Requests per second
- Response times
- Error rates
Business metrics:
- User activity
- Transaction volume
- Revenue impact
Think of it as: How many customers? How fast service? How many errors?
4. Logs​
Application logs:
docker logs container
# Shows application logs
System logs:
journalctl -u docker
# Shows Docker logs
Think of it as: What happened? When? Why?
Basic Monitoring: docker stats​
See resource usage:
# Live stats for all containers
docker stats
# Stats for specific container
docker stats container-name
# One-time snapshot
docker stats --no-stream container-name
# Custom format
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
What you see:
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O
app 45.2% 256MiB / 512MiB 50.0% 1.2MB / 800KB
db 12.5% 1.2GiB / 2GiB 60.0% 500KB / 1.5MB
Think of it as: Dashboard. See everything. At a glance.
The Car Dashboard Analogy​
Think of monitoring like a car dashboard:
Container health: Engine status CPU usage: RPM gauge Memory usage: Fuel gauge Network I/O: Speedometer Logs: Event recorder
Once you see it this way, monitoring makes perfect sense.
Advanced Monitoring: Prometheus & Grafana​
Production monitoring setup:
docker-compose.yml:
services:
# Prometheus - Metrics collection
prometheus:
image: prom/prometheus:latest
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
- prometheus-data:/prometheus
ports:
- "9090:9090"
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
networks:
- monitoring
# Grafana - Visualization
grafana:
image: grafana/grafana:latest
volumes:
- grafana-data:/var/lib/grafana
- ./grafana/dashboards:/etc/grafana/provisioning/dashboards:ro
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
depends_on:
- prometheus
networks:
- monitoring
# Node Exporter - Host metrics
node-exporter:
image: prom/node-exporter:latest
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
ports:
- "9100:9100"
networks:
- monitoring
volumes:
prometheus-data:
grafana-data:
networks:
monitoring:
driver: bridge
What this provides:
- Metrics collection (Prometheus)
- Visualization (Grafana)
- Host metrics (Node Exporter)
- Complete monitoring stack
Monitoring Best Practices​
1. Monitor Everything​
Monitor:
- Container health
- Resource usage
- Application metrics
- Logs
- Network traffic
Why: Complete picture. Nothing missed.
2. Set Alerts​
Alert on:
- Container down
- High CPU usage
- High memory usage
- High error rate
- Slow response times
Why: Know immediately. Fix quickly.
3. Use Dashboards​
Create dashboards:
- Overview dashboard
- Per-service dashboards
- Infrastructure dashboard
- Business metrics dashboard
Why: Visual. Easy to understand. Quick insights.
4. Log Aggregation​
Centralize logs:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Loki + Grafana
- Cloud logging (CloudWatch, Stackdriver)
Why: All logs in one place. Easy to search. Easy to analyze.
5. Monitor Trends​
Track over time:
- Resource usage trends
- Error rate trends
- Performance trends
- User activity trends
Why: Predict problems. Plan capacity. Optimize.
Real-World Example: Complete Monitoring​
Let me show you a complete setup:
1. Container metrics:
services:
app:
labels:
- "prometheus.scrape=true"
- "prometheus.port=3000"
- "prometheus.path=/metrics"
2. Prometheus config:
# prometheus.yml
scrape_configs:
- job_name: 'containers'
docker_sd_configs:
- host: unix:///var/run/docker.sock
refresh_interval: 5s
3. Grafana dashboard:
- CPU usage graph
- Memory usage graph
- Request rate graph
- Error rate graph
- Response time graph
Complete monitoring. Production-ready.
My Take: Monitoring Strategy​
Here's what I monitor:
Infrastructure:
- Container health
- Resource usage
- Network traffic
- Disk usage
Application:
- Request rate
- Response times
- Error rates
- Business metrics
Security:
- Failed logins
- Unusual activity
- Access patterns
- Compliance
The key: Monitor everything. Alert on important. Visualize for understanding. Act on alerts.
Memory Tip: The Car Dashboard Analogy​
Monitoring = Car dashboard
Health: Engine status CPU: RPM Memory: Fuel Network: Speed Logs: Recorder
Once you see it this way, monitoring makes perfect sense.
Common Mistakes​
- Not monitoring: Flying blind
- Too many alerts: Alert fatigue
- No dashboards: Hard to understand
- Not acting on alerts: Wasted effort
- Not monitoring trends: Miss patterns
Hands-On Exercise: Monitor Containers​
1. Run a container:
docker run -d --name test nginx
2. Monitor with stats:
docker stats test
# Watch CPU and memory
3. Check health:
docker inspect test | grep -A 5 Health
4. View logs:
docker logs -f test
# Follow logs
5. Check resource usage:
docker stats --no-stream test
# One-time snapshot
Key Takeaways​
- Monitor everything - Complete picture
- Set alerts - Know immediately
- Use dashboards - Visual understanding
- Aggregate logs - Centralized search
- Track trends - Predict problems
- Act on alerts - Fix issues quickly
What's Next?​
Now that you understand monitoring, let's learn about logging strategies. Next: Logging Strategies.
Remember: Monitoring is like a car dashboard. You see everything. You know what's happening. Essential for production. Never skip it.