Pending Pods: When Pods Can't Be Scheduled

Pending pods are frustrating. Your pod is created. It's waiting. It's not starting. Kubernetes can't schedule it. Here's how to fix it.

🎯 The Big Picture

Think of Pending pods like a guest waiting for a room. The guest is here. The room isn't ready. The problem isn't that the guest is waiting. The problem is why the room isn't ready.

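Pending almost always means the scheduler can't place the pod on any node: not enough free resources, affinity rules that no node satisfies, or taints the pod doesn't tolerate. The job is to find out which one.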

What is Pending?

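Pending is a pod phase. In the cases this guide covers, it means:

Pod is created (the API server accepted it)
The scheduler hasn't assigned it to a node
Pod is waiting
No container is running

A pod that is already scheduled but still pulling its image is technically in the Pending phase too, although kubectl get pods usually shows it as ContainerCreating. This guide is about the unscheduled case.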

Common reasons:

No nodes available
Insufficient resources
Node affinity/taints
Storage issues
Network issues

Understanding the Pending State

Pod states you'll see:

Pending  ← Pod waiting to be scheduled
    ↓
(No change until scheduled)

The pod stays Pending until the issue is resolved.
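
To confirm that a pod is stuck at scheduling rather than at image pull, you can read its PodScheduled condition directly. A minimal sketch; the function name pod_scheduled_reason is made up for this example, and it assumes kubectl is already pointed at the right cluster:

```shell
# Print the reason and message of a pod's PodScheduled condition.
# Usage: pod_scheduled_reason <pod-name> [namespace]
# (pod_scheduled_reason is a hypothetical helper name, not a kubectl command.)
pod_scheduled_reason() {
  kubectl get pod "$1" --namespace "${2:-default}" \
    -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].reason}{"\n"}{.status.conditions[?(@.type=="PodScheduled")].message}{"\n"}'
}
```

For an unschedulable pod the first line should read Unschedulable, and the second line carries the same message you see in the events.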

Step-by-Step Debugging Process

Step 1: Identify the Problem Pod

kubectl get pods

Look for:

Status: Pending
Age: How long has it been pending?

Example output:

NAME                    READY   STATUS    RESTARTS   AGE
my-app-abc123           0/1     Pending   0          10m

Step 2: Describe the Pod

kubectl describe pod <pod-name>

Look for:

Events: Why can't it be scheduled?
Conditions: Is PodScheduled False?
Node: Shows <none> while the pod is unscheduled

Key sections to check:

Events:
  Warning  FailedScheduling  5m ago   default-scheduler  
    0/3 nodes are available: 3 Insufficient cpu.
  
Conditions:
  Type           Status
  PodScheduled   False
  Reason:        Unschedulable
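
If the describe output is long, you can pull only the scheduling events for one pod. A small sketch; scheduling_events is a hypothetical helper name, and the default namespace is an assumption:

```shell
# List FailedScheduling events for a single pod, oldest first.
# Usage: scheduling_events <pod-name> [namespace]
# (scheduling_events is a hypothetical helper name.)
scheduling_events() {
  kubectl get events --namespace "${2:-default}" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$1,reason=FailedScheduling" \
    --sort-by=.metadata.creationTimestamp
}
```

Events expire after a while, so an empty result on an old pod doesn't mean scheduling succeeded.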

Step 3: Check Node Resources

kubectl top nodes
kubectl describe nodes

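Look for:

Node capacity and allocatable resources
CPU and memory already requested by other pods (the "Allocated resources" section of kubectl describe nodes)
Actual CPU and memory usage (kubectl top nodes, which needs metrics-server installed)

The scheduler decides based on requests, not live usage. A node can sit nearly idle and still be full if the pods on it have already requested all of its allocatable CPU or memory.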
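
Two quick views of what the scheduler sees, as a sketch. The function names node_allocatable and node_allocated are invented for this example, and the -A 8 context length is a guess at how many lines the section spans:

```shell
# One row per node with its allocatable CPU and memory.
# (node_allocatable and node_allocated are hypothetical helper names.)
node_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# The "Allocated resources" block of each node: summed requests and limits.
node_allocated() {
  kubectl describe nodes | grep -A 8 'Allocated resources'
}
```

Compare the requests in the second view against the allocatable values in the first. The gap is what is left for new pods.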

Common Causes and Solutions

Cause 1: Insufficient Resources

Symptoms:

Error: "Insufficient cpu" or "Insufficient memory"
Nodes don't have enough resources
Resource requests too high

Solutions:

Check node resources:
```
kubectl top nodes
kubectl describe nodes
```

Check pod resource requests:

kubectl describe pod <pod-name>
# Look for Requests section

Reduce resource requests:

resources:
  requests:
    cpu: "100m"      # Reduce if too high
    memory: "128Mi"  # Reduce if too high
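
CPU quantities come in two spellings, whole cores ("2", "0.5") and millicores ("500m"), which makes them easy to misjudge. A small sketch that normalizes both to millicores so a request can be compared with a node's free CPU; both function names are made up for this example:

```shell
# Convert a Kubernetes CPU quantity ("2", "0.5", "500m") to millicores.
# (cpu_to_millicores and request_fits are hypothetical helper names.)
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                   # already millicores
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
}

# Succeed when <request> is no larger than <free>.
# Usage: request_fits <request> <free>
request_fits() {
  [ "$(cpu_to_millicores "$1")" -le "$(cpu_to_millicores "$2")" ]
}
```

For example, request_fits 500m 1 succeeds, while request_fits 2 500m fails.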

Add more nodes:

Scale the cluster manually, or use the cluster autoscaler so nodes are added when pods can't be scheduled.

Cause 2: Node Affinity Rules

Symptoms:

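Error: "node(s) didn't match Pod's node affinity/selector" (older versions: "node(s) didn't match node selector")
Pod has affinity rules or a nodeSelector
No nodes carry the required labels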

Solutions:

Check node affinity:

kubectl get pod <pod-name> -o yaml | grep -A 10 affinity

Check node labels:
```
kubectl get nodes --show-labels
```

Fix affinity rules:

# Remove or adjust affinity
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: zone
          operator: In
          values:
          - us-east-1a

Cause 3: Node Taints

Symptoms:

Error: "node(s) had taint ..., that the pod didn't tolerate" (newer versions: "node(s) had untolerated taint ...")
Nodes have taints
Pod doesn't have a matching toleration

Solutions:

Check node taints:

kubectl describe node <node-name>
# Look for Taints section

Add toleration to pod:

tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"

Or remove taint from node:

kubectl taint nodes <node-name> key1=value1:NoSchedule-
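
Describing nodes one by one gets slow on larger clusters. A sketch that lists taint keys and effects for every node at once; node_taints is a hypothetical helper name:

```shell
# One row per node with the keys and effects of its taints.
# Nodes without taints show <none>.
# (node_taints is a hypothetical helper name.)
node_taints() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key,EFFECTS:.spec.taints[*].effect'
}
```

Match the key and effect shown here against the tolerations in your pod spec.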

Cause 4: Storage Issues

Symptoms:

Error: "persistentvolumeclaim not found"
Error: "volume binding failed"
PVC doesn't exist or can't bind

Solutions:

Check PVC:

kubectl get pvc
kubectl describe pvc <pvc-name>

Check storage class:
```
kubectl get storageclass
```
Fix PVC:
- Create missing PVC
- Fix storage class
- Check available storage
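
To spot claims that never bound, you can filter the PVC list. A sketch that assumes the default kubectl get pvc column order (NAME first, STATUS second) and a single namespace; unbound_pvcs is a hypothetical helper name:

```shell
# Keep the header row plus every PVC whose STATUS column is not "Bound".
# Reads `kubectl get pvc` output on stdin.
# (unbound_pvcs is a hypothetical helper name.)
unbound_pvcs() {
  awk 'NR == 1 || $2 != "Bound"'
}
```

Use it as kubectl get pvc | unbound_pvcs. With --all-namespaces the STATUS column moves to third place, so the filter would need $3 instead.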

Cause 5: No Nodes Available

Symptoms:

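Error: "0/X nodes are available: X node(s) were unschedulable" (every FailedScheduling message starts with "0/X nodes are available"; the text after the colon is the actual reason)
All nodes are cordoned or NotReady
Cluster has no nodes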

Solutions:

Check nodes:
```
kubectl get nodes
```
Check node status:
```
kubectl describe nodes
```
Fix nodes:
- Uncordon nodes: kubectl uncordon <node-name>
- Add nodes to cluster
- Fix node issues
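
A cordoned node shows up as Ready,SchedulingDisabled in the STATUS column, which is easy to miss in a long list. A sketch that filters for anything other than plain Ready; unschedulable_nodes is a hypothetical helper name:

```shell
# Print nodes whose STATUS is anything other than plain "Ready",
# such as "NotReady" or "Ready,SchedulingDisabled" (cordoned).
# Reads `kubectl get nodes` output on stdin.
# (unschedulable_nodes is a hypothetical helper name.)
unschedulable_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1, $2 }'
}
```

Use it as kubectl get nodes | unschedulable_nodes. No output means every node is Ready and schedulable.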

Real-World Example: Resource Constraints

Problem: Pod in Pending. Error:

0/3 nodes are available: 3 Insufficient cpu.

Debugging:

Checked node resources:

kubectl top nodes
# All nodes at 100% CPU

Checked pod requests:

kubectl describe pod <pod-name>
# Requesting 2 CPU

Reduced resource requests:

resources:
  requests:
    cpu: "500m"  # Reduced from 2
    memory: "256Mi"

Restarted deployment: Pod scheduled successfully

Solution: Resource requests too high. Reduced requests. Pod scheduled.
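
If you'd rather not edit the manifest by hand, the same change can be made from the command line. This is a sketch, not the exact commands used in the example above; lower_requests is a hypothetical helper name and the 120s timeout is an arbitrary choice:

```shell
# Lower the requests on every container of a Deployment, then wait for
# the resulting rollout to finish.
# Usage: lower_requests <deployment> <cpu> <memory>
# (lower_requests is a hypothetical helper name.)
lower_requests() {
  kubectl set resources deployment "$1" --requests="cpu=$2,memory=$3" &&
    kubectl rollout status "deployment/$1" --timeout=120s
}
```

For example, lower_requests my-app 500m 256Mi, where my-app stands in for your deployment name. Changing the pod template starts a new rollout on its own, so no separate restart is needed.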

Hands-On Exercise: Debug Pending Pod

Create a pod with high resource requests:

apiVersion: v1
kind: Pod
metadata:
  name: pending-test
spec:
  containers:
  - name: app
    image: nginx:alpine
    resources:
      requests:
        cpu: "100"  # Too high! Will be pending
        memory: "1000Gi"  # Too high!

Apply it:

kubectl apply -f pending-test.yaml

Debug it:

Check pod status: kubectl get pods
Describe pod: kubectl describe pod pending-test
Check events: Look for scheduling errors
Fix the issue (reduce resource requests)
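
For the last step, a bare pod usually has to be deleted and recreated, since clusters generally don't let you edit its requests in place. A sketch, assuming you have already lowered the values in pending-test.yaml; recreate_pod is a hypothetical helper name:

```shell
# Delete the unschedulable pod, re-apply the edited manifest, and wait
# until the scheduler has placed the new pod.
# Usage: recreate_pod <pod-name> <manifest-file>
# (recreate_pod is a hypothetical helper name.)
recreate_pod() {
  kubectl delete pod "$1" --ignore-not-found &&
    kubectl apply -f "$2" &&
    kubectl wait --for=condition=PodScheduled "pod/$1" --timeout=60s
}
```

Run it as recreate_pod pending-test pending-test.yaml. If the wait times out, the requests are still too high for your nodes.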

This is how you learn. Break things. Fix them.

My Take: Pending Pod Debugging

Pending pods used to confuse me. I'd see them and not know why.

Then I learned the systematic approach:

Describe the pod - See why it can't be scheduled
Check events - Error message tells you why
Check resources - CPU, memory, storage
Check affinity/taints - Node selection rules
Fix the root cause - Not just wait

Now I fix Pending pods in minutes, not hours.

Memory Tip: The Guest Waiting Analogy

Pending pods are like a guest waiting for a room:

Guest is here (Pod is created)
Room not ready (Can't be scheduled)
No rooms available (No resources)
Wrong room type (Affinity/taints)
Room key missing (Storage issues)

The error message tells you why. Read it carefully.
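
The mapping from error message to cause can be sketched as a small lookup. The patterns below are only the message fragments quoted in this guide, not a complete list of scheduler messages, and classify_scheduling_message is a hypothetical helper name:

```shell
# Map a FailedScheduling message to one of the causes in this guide.
# A message can list several reasons; only the first match is reported.
# (classify_scheduling_message is a hypothetical helper name.)
classify_scheduling_message() {
  case "$1" in
    *"Insufficient "*)                           echo "resources" ;;
    *taint*)                                     echo "taints" ;;
    *"node selector"*|*affinity*)                echo "affinity" ;;
    *persistentvolumeclaim*|*"volume binding"*)  echo "storage" ;;
    *)                                           echo "unknown" ;;
  esac
}
```

For example, classify_scheduling_message "0/3 nodes are available: 3 Insufficient cpu." prints resources.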

Common Mistakes

Not checking events: Events tell you why
Too high resource requests: Request what you need
Wrong affinity rules: Check node labels
Ignoring taints: Add tolerations or remove taints
Not checking storage: PVC might not exist

Key Takeaways

Pending means can't schedule - Find why
Check events - Error message tells you why
Check resources - CPU, memory, storage
Check affinity/taints - Node selection rules
Fix the root cause - Not just wait

What's Next?

Now that you understand Pending pods, let's tackle service discovery issues. Next: Service Troubleshooting.

Remember: Pending isn't the problem. It's the symptom. The events tell you why. Read them carefully. Fix the root cause.
