Service Troubleshooting: When Services Don't Work

Service issues are frustrating. Your pods are running. Your service exists. But nothing works. Here's how to fix it.

🎯 The Big Picture

Think of service issues like a hotel phone system. The phones exist. The rooms exist. But calls aren't connecting. The question isn't whether things exist. It's why they're not connecting.

Service troubleshooting means checking selectors, endpoints, ports, pod readiness, and network policies, one at a time.

Common Service Issues

Symptoms:

Service exists but not accessible
Connection refused
Timeout errors
Service not routing traffic

Step-by-Step Debugging Process

Step 1: Check Service Status

kubectl get svc
kubectl describe svc <service-name>

Look for:

Service type (ClusterIP, NodePort, or LoadBalancer)
Selectors
Ports

Step 2: Check Endpoints

kubectl get endpoints <service-name>
kubectl describe endpoints <service-name>

Key check:

Are endpoints empty? → The selector matches no pods, or the matching pods aren't ready
Are endpoints correct? → Compare the IPs with kubectl get pods -o wide

Example:

NAME         ENDPOINTS                    AGE
my-service   10.244.1.5:8080,10.244.2.3:8080   5m
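If you want to script this check, here's a minimal sketch. The function name endpoint_ips is made up for this example, and it assumes kubectl already points at the right cluster and namespace. It prints the ready IPs behind a service and fails when there are none:

```shell
# Print the ready endpoint IPs behind a Service, one per line.
# Returns non-zero when the Service has no ready endpoints.
endpoint_ips() {
  ips=$(kubectl get endpoints "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -z "$ips" ]; then
    echo "no ready endpoints for $1: check selectors and pod readiness" >&2
    return 1
  fi
  # jsonpath prints the IPs space-separated; put each on its own line.
  printf '%s\n' $ips
}
```

Usage: endpoint_ips my-service. An empty result tells you where to look next, not which of the two causes you have.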

Step 3: Check Pod Selectors

kubectl get pods --show-labels
kubectl get svc <service-name> -o jsonpath='{.spec.selector}'

Key check:

Does every key/value pair in the service selector appear in the pod labels?
Are the pods in the same namespace as the service?
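Comparing labels by eye gets old fast. A rough sketch, assuming you're working in the service's namespace: ask the API server which pods the selector actually matches. The helper names service_selector and pods_behind_service are hypothetical; the go-template output format and the -l flag are standard kubectl.

```shell
# Print the Service's selector as a label query, e.g. "app=my-app,tier=web".
service_selector() {
  kubectl get svc "$1" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}' \
    | sed 's/,$//'   # drop the trailing comma left by the template
}

# List the pods that the Service's selector actually matches.
pods_behind_service() {
  selector=$(service_selector "$1")
  if [ -z "$selector" ]; then
    echo "service $1 has no selector" >&2
    return 1
  fi
  kubectl get pods -l "$selector" --show-labels
}
```

Usage: pods_behind_service my-service. If the list comes back empty while your pods are running, the selector and the labels disagree.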

Common Causes and Solutions

Cause 1: Selector Mismatch

Symptoms:

Service has no endpoints (ENDPOINTS shows <none>)
Pods are running but the service never routes to them

Solutions:

Check service selectors:

kubectl get svc <service-name> -o yaml
# Look for selector section

Check pod labels:
```
kubectl get pods --show-labels
```

Fix selector or labels:

# Service
apiVersion: v1
kind: Service
spec:
  selector:
    app: my-app  # Must match pod labels

# Pod
metadata:
  labels:
    app: my-app  # Must match service selector
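You can apply either fix from the command line. This is a sketch under a few assumptions: the label key is app, the function names are invented for illustration, and the pod is a bare pod. For pods managed by a Deployment, change the labels in the pod template instead, because a relabeled pod is replaced by the next rollout.

```shell
# Option A: set (or overwrite) the "app" label on one pod so it matches the selector.
relabel_pod() {
  # $1 = pod name, $2 = value the Service selector expects for "app"
  kubectl label pod "$1" "app=$2" --overwrite
}

# Option B: point the Service at the label the pods already carry.
# A merge patch only sets the "app" key; other selector keys stay as they are.
retarget_service() {
  # $1 = service name, $2 = value of the pods' "app" label
  kubectl patch svc "$1" --type merge \
    -p "{\"spec\":{\"selector\":{\"app\":\"$2\"}}}"
}
```

Pick one side to change, not both. Then check the endpoints again.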

Cause 2: Wrong Port

Symptoms:

Service exists
Endpoints exist
Connection refused

Solutions:

Check service port:

kubectl get svc <service-name>
# Check PORT(S) column

Check pod port:

kubectl describe pod <pod-name>
# Look for Port under Containers (the app must actually listen on it)

Fix port mapping:

apiVersion: v1
kind: Service
spec:
  ports:
  - port: 80          # Port clients use to reach the service
    targetPort: 8080  # Port the container listens on (must match)
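To see both sides of the mapping at once, something like the following may help. It's a sketch with made-up function names, and it has two limits worth stating: it only handles numeric targetPort values (not named ports), and containerPort is a declaration, so an app can still listen somewhere else.

```shell
# Print each Service port as "port -> targetPort".
service_port_map() {
  kubectl get svc "$1" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'
}

# Print the containerPorts declared by a pod (space-separated).
pod_container_ports() {
  kubectl get pod "$1" \
    -o jsonpath='{.spec.containers[*].ports[*].containerPort}'
}

# Succeed if the given targetPort is among the pod's declared containerPorts.
target_port_declared() {
  # $1 = pod name, $2 = numeric targetPort
  for p in $(pod_container_ports "$1"); do
    [ "$p" = "$2" ] && return 0
  done
  return 1
}
```

Usage: service_port_map my-service, then target_port_declared <pod-name> 8080 || echo "targetPort not declared by the pod".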

Cause 3: Pods Not Ready

Symptoms:

Pods are Running but READY shows 0/1
Endpoints are empty or missing those pods
Not-ready pods receive no traffic

Solutions:

Check pod readiness:

kubectl get pods
# Check READY column (ready/total containers, e.g. 1/1)

Check readiness probe:

kubectl describe pod <pod-name>
# Look for Readiness section

Fix readiness probe:

readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
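With more than a handful of pods, scanning the READY column is error-prone. A small sketch, with hypothetical function names, that reads the Ready condition straight from pod status and prints only the pods that fail it:

```shell
# Print "<pod-name> <Ready-status>" for every pod matching a label query.
pod_readiness() {
  # $1 = label query such as "app=my-app"
  kubectl get pods -l "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
}

# Print only the names of pods whose Ready condition is not "True".
not_ready_pods() {
  pod_readiness "$1" | awk '$2 != "True" { print $1 }'
}
```

Usage: not_ready_pods app=my-app. Run kubectl describe pod on whatever it prints and read the probe failures in the Events section.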

Cause 4: Network Policies

Symptoms:

Selectors, endpoints, and ports all look correct
Connections still fail, usually with timeouts
Traffic works from some pods or namespaces but not others

Solutions:

Check network policies:

kubectl get networkpolicies
kubectl describe networkpolicy <policy-name>

Check if policy blocks traffic:
- Review ingress rules
- Review egress rules
- Check pod selectors

Fix or adjust policy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector:
    matchLabels:
      app: my-app      # Pods this policy applies to
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: client  # Allow ingress from pods with this label (same namespace)
  egress:
  - {}  # Allow all egress
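One way to confirm a policy is the culprit is to call the service from a throwaway pod that carries specific labels, then repeat with different labels. This is only a sketch: the function name probe_as and the pod name prefix np-probe are invented, it assumes the busybox image can be pulled in your cluster, and it assumes the client pod can resolve DNS.

```shell
# Fetch a URL from a temporary pod that carries the given labels.
# The pod is deleted when the command finishes (--rm).
probe_as() {
  # $1 = labels for the client pod, e.g. "app=client"
  # $2 = URL to fetch, e.g. "http://my-service:80"
  kubectl run "np-probe-$$" --rm -i --restart=Never \
    --image=busybox --labels="$1" \
    -- wget -q -O - -T 5 "$2"
}
```

Usage: run probe_as app=client http://my-service:80 and then probe_as app=other http://my-service:80. If the first prints a response and the second times out, a network policy is doing the filtering.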

Cause 5: Service Type Issues

Symptoms:

Service works from inside the cluster
Can't access from outside
TYPE is ClusterIP

Solutions:

Check service type:
```
kubectl get svc
# Check TYPE column
```

Use correct type:

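# ClusterIP (default, internal only)
apiVersion: v1
kind: Service
spec:
  type: ClusterIP

# NodePort (reachable on every node's IP)
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30080  # Default allowed range: 30000-32767

# LoadBalancer (needs a cloud provider or other load balancer integration)
spec:
  type: LoadBalancer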
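A quick way to find out how a service can be reached is to branch on its type. The sketch below uses a made-up function name and only covers the three types discussed here:

```shell
# Print how a Service can be reached, based on its type.
external_access() {
  svc_type=$(kubectl get svc "$1" -o jsonpath='{.spec.type}')
  case "$svc_type" in
    NodePort)
      port=$(kubectl get svc "$1" -o jsonpath='{.spec.ports[*].nodePort}')
      echo "NodePort: reach it at <node-ip>:$port" ;;
    LoadBalancer)
      # Cloud providers report either an IP or a hostname.
      addr=$(kubectl get svc "$1" \
        -o jsonpath='{.status.loadBalancer.ingress[*].ip}{.status.loadBalancer.ingress[*].hostname}')
      echo "LoadBalancer: ${addr:-address still pending}" ;;
    ClusterIP)
      echo "ClusterIP: internal only; for a quick test use kubectl port-forward svc/$1 <local-port>:<service-port>" ;;
    *)
      echo "type: $svc_type" ;;
  esac
}
```

Usage: external_access my-service. The port-forward hint is handy for testing a ClusterIP service from your workstation without changing its type.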

Real-World Example: Selector Mismatch

Problem: Service exists but no endpoints. Can't access application.

Debugging:

Checked service:

kubectl get svc my-service
# Service exists

Checked endpoints:

kubectl get endpoints my-service
# Endpoints: <none>

Checked selectors:

kubectl get svc my-service -o jsonpath='{.spec.selector}'
# {"app":"my-app"}

kubectl get pods --show-labels
# LABELS: app=myapplication  # Mismatch!

Fixed labels:

# Updated pod labels to match service selector
metadata:
  labels:
    app: my-app  # Changed from myapplication

Verified:

kubectl get endpoints my-service
# Endpoints now populated

Root cause: selector mismatch. Fix: corrected the pod labels. Result: service working.

Hands-On Exercise: Debug Service

Create service with wrong selector:

apiVersion: v1
kind: Service
metadata:
  name: test-service
spec:
  selector:
    app: wrong-label  # Won't match pods
  ports:
  - port: 80
    targetPort: 80  # nginx listens on port 80

Create pod with different label:

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  labels:
    app: correct-label  # Doesn't match service
spec:
  containers:
  - name: app
    image: nginx:alpine
    ports:
    - containerPort: 80

Debug it:

Check service: kubectl get svc test-service
Check endpoints: kubectl get endpoints test-service
Check selectors: Compare service selector with pod labels
Fix the issue (match labels or selector)

This is how you learn. Break things. Fix them.

My Take: Service Troubleshooting

Service issues used to confuse me. I'd see services but nothing worked.

Then I learned the systematic approach:

Check endpoints - Are they populated?
Check selectors - Do they match pod labels?
Check ports - Are they correct?
Check readiness - Are pods ready?
Check network policies - Are they blocking?

Now I fix service issues in minutes, not hours.
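The first steps of that checklist can be rolled into one function. Treat this as a sketch, not a finished tool: diagnose_service is a name invented here, it only covers endpoints, selectors, and readiness, and it assumes the current namespace is the service's namespace.

```shell
# Walk the first checks for one Service and report the first thing that looks wrong.
diagnose_service() {
  svc=$1
  # 1. Endpoints: are any ready pod IPs registered?
  ips=$(kubectl get endpoints "$svc" -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -n "$ips" ]; then
    echo "endpoints OK: $ips (next: check ports and network policies)"
    return 0
  fi
  # 2. Selector: does it match any pod at all?
  sel=$(kubectl get svc "$svc" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}' \
    | sed 's/,$//')
  if [ -z "$sel" ]; then
    echo "service $svc has no selector"
    return 1
  fi
  pods=$(kubectl get pods -l "$sel" -o name)
  if [ -z "$pods" ]; then
    echo "no pods match selector '$sel': fix the selector or the pod labels"
    return 1
  fi
  # 3. Readiness: pods match, but none of them is Ready.
  echo "pods match '$sel' but none is Ready: check readiness probes"
  return 1
}
```

Usage: diagnose_service my-service. When it reports that endpoints are OK and traffic still fails, move on to ports and network policies by hand.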

Memory Tip: The Hotel Phone System Analogy

Service issues are like a hotel phone system:

Phones exist (Service exists)
Rooms exist (Pods exist)
Wrong room number (Selector mismatch)
Wrong extension (Port mismatch)
Phone off hook (Pod not ready)
Call blocked (Network policy)

Check each component. Find the mismatch.

Common Mistakes

Not checking endpoints: Empty endpoints = selector mismatch or no ready pods
Port mismatch: targetPort vs the port the container listens on
Labels don't match: Selector vs pod labels
Not checking readiness: Pods not ready
Ignoring network policies: Policies might block

Key Takeaways

Check endpoints first - Empty = selector or readiness issue
Verify selectors match labels - Must be exact
Check ports - targetPort vs container port
Check readiness - Pods must be ready
Check network policies - Might be blocking

What's Next?

Now that you understand service troubleshooting, let's tackle storage issues. Next: Storage Troubleshooting.

Remember: Service issues are usually selector mismatches or port issues. Check endpoints first. Verify selectors. Check ports. Fix the mismatch.
