Production Operations

PodDisruptionBudgets & Zero-Downtime

Kubernetes moves pods constantly: during node drains, rolling updates, cluster autoscaler scale-downs, and cluster upgrades. Without controls, a drain that evicts three pods of a five-replica Deployment at once cuts capacity by 60% instantly. PodDisruptionBudgets (PDBs) tell Kubernetes the minimum availability your application requires, so evictions requested through the Eviction API (node drains, autoscaler scale-downs, upgrade tooling) respect it. Rolling updates are bounded separately by the Deployment's update strategy.

Voluntary vs Involuntary

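Type	Cause	PDB applies?
Voluntary (Eviction API)	Node drain, cluster autoscaler scale-down, admin-initiated eviction	Yes. The eviction is refused if it would violate the budget
Voluntary (bypasses the Eviction API)	Deployment rolling update, direct pod deletion	No. Rolling updates are limited by the Deployment's strategy instead
Involuntary	Node hardware failure, kernel panic, OOM kill, kubelet crash	No. These are not controlled evictions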

PodDisruptionBudget

A PDB selects pods by label and declares either the minimum number that must be available (minAvailable) or the maximum number that can be unavailable (maxUnavailable) at any time during voluntary disruptions.

PDB in action: drain blocked until budget is safe

Without PDB (5 replicas, 3 of them on node-1; drain node-1)

pod-1 ✗  pod-2 ✗  pod-3 ✗  pod-4 ✓  pod-5 ✓

60% of capacity is lost at once. Requests fail while the pods are rescheduled.

With PDB (minAvailable: 4)

pod-1 ✗  pod-2 ✓  pod-3 ✓  pod-4 ✓  pod-5 ✓

The drain evicts one pod at a time and waits for its replacement to become ready. At least 4 pods are always available.

A PDB with minAvailable: 4 forces the drain to evict one pod at a time and wait for the replacement to become ready before proceeding. The drain is slower, but capacity never falls below 80%.
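The eviction sequence in the diagram can be sketched as a toy simulation. This is a simplified model, not how the drain is implemented: the function name `simulate_drain` and its parameters are hypothetical, and it assumes replacements always become ready on another node before the next retry.

```python
from typing import Optional


def simulate_drain(total: int, on_node: int, min_available: Optional[int]) -> int:
    """Return the lowest number of healthy pods seen while draining one node."""
    healthy = total       # pods currently ready
    pending = 0           # replacements created elsewhere but not yet ready
    remaining = on_node   # pods still to evict from the drained node
    lowest = healthy
    while remaining > 0:
        if min_available is None or healthy - 1 >= min_available:
            # Eviction is allowed: one pod goes away, a replacement is created.
            healthy -= 1
            pending += 1
            remaining -= 1
            lowest = min(lowest, healthy)
        elif pending > 0:
            # Eviction refused; the drain retries after replacements become ready.
            healthy += pending
            pending = 0
        else:
            raise RuntimeError("drain blocked: the budget can never be satisfied")
    return lowest


print(simulate_drain(total=5, on_node=3, min_available=None))  # no PDB
print(simulate_drain(total=5, on_node=3, min_available=4))     # minAvailable: 4
```

Under these assumptions the first call bottoms out at 2 healthy pods and the second never goes below 4, matching the two halves of the diagram.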

poddisruptionbudget.yaml

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  selector:
    matchLabels:
      app: myapp          # must match the Deployment's pod labels
  minAvailable: 2         # OR use maxUnavailable: 1

minAvailable vs maxUnavailable

Field	Value type	Effect with 5 replicas
`minAvailable: 2`	Absolute	At least 2 pods must stay healthy. Up to 3 can be evicted.
`minAvailable: "80%"`	Percentage	At least 80% (4 pods) must stay healthy. 1 can be evicted.
`maxUnavailable: 1`	Absolute	At most 1 pod can be down. 4 must stay healthy.
`maxUnavailable: "20%"`	Percentage	At most 20% (1 pod) can be down. 4 must stay healthy.
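The arithmetic behind the table can be checked with a short sketch. It assumes the documented behaviour that PDB percentages are rounded up to a whole pod, and it models only the calculation, not the controller. The helper names `scaled` and `disruptions_allowed` are hypothetical.

```python
def scaled(value, total):
    """Resolve an int or a percentage string such as "80%" against total, rounding up."""
    if isinstance(value, str):
        percent = int(value.rstrip("%"))
        return (total * percent + 99) // 100  # integer ceiling
    return value


def disruptions_allowed(expected, healthy, min_available=None, max_unavailable=None):
    """Return how many more pods could be evicted right now (simplified model)."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = scaled(min_available, expected)
    else:
        desired_healthy = max(0, expected - scaled(max_unavailable, expected))
    return max(0, healthy - desired_healthy)


for budget in ({"min_available": 2}, {"min_available": "80%"},
               {"max_unavailable": 1}, {"max_unavailable": "20%"}):
    print(budget, "->", disruptions_allowed(expected=5, healthy=5, **budget))
```

For the five-replica rows above this prints 3, 1, 1 and 1. Rounding matters with awkward counts: 7 replicas with minAvailable: "50%" requires 4 healthy pods, not 3.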

⚠️

minAvailable: 100% blocks all drains

Setting minAvailable equal to the total replica count (or maxUnavailable: 0) means no pod can ever be voluntarily evicted — node drains will block indefinitely. Use minAvailable: N-1 where N is your replica count, or a percentage below 100%.

Rolling Update Strategy

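Deployment rolling updates do not go through the Eviction API, so PDBs do not limit them (although pods that are unavailable during a rollout still count against the budget). Availability during a rollout is configured separately via strategy: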

deployment rolling update strategy

spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # temporarily run up to 6 pods during the update
      maxUnavailable: 0     # never drop below 5 available pods
                            # (the API rejects maxUnavailable: 0 together with maxSurge: 0)

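maxUnavailable: 0 with maxSurge: 1 is the zero-downtime rolling update: spin up one new pod, wait for it to pass its readiness checks, then terminate one old pod. The rollout is slower, but the number of available pods never drops below the desired replica count. Dropped requests are only avoided if terminating pods also shut down gracefully.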
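The pod-count bounds implied by a strategy can be worked out with a small sketch. It assumes the documented rounding for Deployments (a maxSurge percentage rounds up, a maxUnavailable percentage rounds down) and the 25% defaults. The function names are hypothetical, and the zero/zero case is simplified to an error.

```python
def resolve(value, replicas, round_up):
    """Resolve an int or percentage string against the replica count."""
    if isinstance(value, str):
        scaled = replicas * int(value.rstrip("%"))
        return (scaled + 99) // 100 if round_up else scaled // 100
    return value


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return replicas - unavailable, replicas + surge


print(rollout_bounds(5, max_surge=1, max_unavailable=0))  # the strategy shown above
print(rollout_bounds(5))                                  # the 25% / 25% defaults
```

With the strategy above the bounds are 5 available and 6 total. With the defaults on 5 replicas they are 4 available and 7 total, so the defaults do allow a temporary capacity dip.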

preStop Hook & terminationGracePeriod

When a pod is evicted or a rolling update terminates a pod, the termination grace period (default 30s) starts. Kubernetes first runs the preStop hook, if one is defined, and then sends SIGTERM. If the container is still running when the grace period expires, it is killed with SIGKILL, so in-flight requests must finish before then. A preStop hook that sleeps for a few seconds gives the load balancer time to remove the pod from its backends before SIGTERM arrives, avoiding a window where traffic still routes to a terminating pod.

preStop hook — drain connections before shutdown

spec:
  terminationGracePeriodSeconds: 60   # total budget for preStop + SIGTERM handler
  containers:
  - name: api
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 5"]  # give LB time to deregister
    # The app's SIGTERM handler should stop accepting new requests,
    # drain in-flight ones, then exit cleanly within the remaining budget
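The SIGTERM handling described in the comments above might look like the following sketch. It is an illustration under assumptions, not a prescribed implementation: the `GracefulShutdown` class and `install` function are hypothetical names, and a real server would call `try_begin_request` and `end_request` from its request path.

```python
import signal
import threading
import time


class GracefulShutdown:
    """Tracks in-flight requests and a stop flag (illustrative helper)."""

    def __init__(self):
        self.stopping = False
        self._lock = threading.Lock()
        self._in_flight = 0

    def handle_sigterm(self, signum=None, frame=None):
        # Only flip a flag here; the handler takes no locks.
        self.stopping = True

    def try_begin_request(self):
        """Return False once shutdown has started, so new work is refused."""
        with self._lock:
            if self.stopping:
                return False
            self._in_flight += 1
            return True

    def end_request(self):
        with self._lock:
            self._in_flight -= 1

    def wait_for_drain(self, timeout):
        """Wait up to timeout seconds for in-flight requests to finish."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._in_flight == 0:
                    return True
            time.sleep(0.05)
        with self._lock:
            return self._in_flight == 0


def install(shutdown):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, shutdown.handle_sigterm)
```

The timeout passed to `wait_for_drain` should be shorter than terminationGracePeriodSeconds minus the preStop sleep, so the process exits on its own before SIGKILL.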

Readiness Gates

By default, a pod is considered ready when all containers pass their readiness probe. Pod Readiness Gates add external conditions — the pod isn't marked ready until an external controller (e.g. a load balancer controller) confirms it's registered in the backend pool. This eliminates the race condition where traffic hits a new pod before the LB backend is updated.

spec:
  readinessGates:
  - conditionType: "target-health.elbv2.k8s.aws/my-tg"  # AWS ALB controller sets this
  # Pod stays unready (gets no traffic) until ALB marks it healthy in target group
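The readiness rule can be stated as a one-line predicate. This is a sketch of the rule only, assuming the documented behaviour that a gate whose condition is missing from the pod status counts as not passed. The function name `pod_is_ready` and the hypothetical gate name in the example are illustrative.

```python
def pod_is_ready(containers_ready, readiness_gates, conditions):
    """Return True only if containers are ready and every gate condition is "True".

    conditions maps a condition type to its status string
    ("True", "False" or "Unknown"); a missing entry fails the gate.
    """
    gates_passed = all(conditions.get(gate) == "True" for gate in readiness_gates)
    return containers_ready and gates_passed


gate = "example.com/lb-registered"  # hypothetical condition type
print(pod_is_ready(True, [gate], {}))              # gate not reported yet
print(pod_is_ready(True, [gate], {gate: "True"}))  # controller confirmed registration
```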

Topology Spread Constraints

Distribute pods evenly across failure domains (zones, nodes) so a single zone outage doesn't take down all replicas.

spec:
  topologySpreadConstraints:
  - maxSkew: 1                              # max difference in pod count between zones
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule        # hard constraint
    labelSelector:
      matchLabels:
        app: myapp
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname     # also spread across nodes
    whenUnsatisfiable: ScheduleAnyway       # soft constraint: try, but don't block scheduling
    labelSelector:                          # each constraint needs its own selector
      matchLabels:
        app: myapp
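How maxSkew restricts placement can be seen in a small sketch. It is a simplified model of the hard (DoNotSchedule) check, assuming a new pod may land in a domain only if that domain's matching-pod count plus one, minus the lowest count across domains, stays within maxSkew. The function name `allowed_zones` is hypothetical.

```python
def allowed_zones(pods_per_zone, max_skew=1):
    """Return the zones where one more matching pod would keep skew within bounds."""
    lowest = min(pods_per_zone.values())
    return sorted(
        zone for zone, count in pods_per_zone.items()
        if count + 1 - lowest <= max_skew
    )


print(allowed_zones({"zone-a": 2, "zone-b": 1, "zone-c": 1}))
```

With counts of 2, 1 and 1, only zone-b and zone-c are eligible: a third pod in zone-a would put it two ahead of the emptiest zone.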

kubectl Commands

# List all PDBs in a namespace
kubectl get pdb -n production

# Describe a PDB — shows current allowed disruptions
kubectl describe pdb myapp-pdb -n production

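# Preview a drain without evicting anything. A client-side dry run only lists
# the pods that would be evicted; it does not evaluate PDBs
kubectl drain node-1 --ignore-daemonsets --dry-run=client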

# Check disruption status
kubectl get pdb -n production -o json | jq '.items[] | {name:.metadata.name, allowed:.status.disruptionsAllowed, desired:.status.desiredHealthy, current:.status.currentHealthy}'

# List the pods the PDB selects and the nodes they run on
kubectl get pods -n production -l app=myapp -o wide
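Before a maintenance window it can help to flag budgets that would block a drain. The sketch below parses the same JSON that the jq command above reads. The function names are hypothetical, `fetch_pdbs` assumes kubectl is installed and has cluster access, and a missing status is treated as allowing zero disruptions.

```python
import json
import subprocess


def fetch_pdbs(namespace):
    """Run kubectl and return the parsed PDB list (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pdb", "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def blocked_pdbs(pdb_list):
    """Return names of PDBs that currently allow no voluntary evictions."""
    blocked = []
    for item in pdb_list.get("items", []):
        status = item.get("status", {})
        if status.get("disruptionsAllowed", 0) == 0:
            blocked.append(item["metadata"]["name"])
    return blocked
```

A typical use would be `blocked_pdbs(fetch_pdbs("production"))`. Any name it returns points at a workload whose pods a drain cannot currently evict.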