PodDisruptionBudgets & Zero-Downtime
Kubernetes moves pods constantly — during node drains, rolling updates, cluster autoscaler scale-downs, and cluster upgrades. Without controls, a drain that evicts three pods from a five-replica Deployment at once drops capacity by 60% instantly. PodDisruptionBudgets (PDBs) tell Kubernetes the minimum availability your application requires, so voluntary disruptions respect it.
Voluntary vs Involuntary
| Type | Cause | PDB applies? |
|---|---|---|
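| Voluntary (eviction) | Node drain, cluster autoscaler scale-down, cluster upgrade, administrator-initiated eviction | Yes. The eviction is refused if it would violate the budget |
| Voluntary (not an eviction) | Deployment rolling update, deleting a pod directly | No. These bypass the Eviction API, although the pods they take down still count against the budget |
| Involuntary | Node hardware failure, kernel panic, OOM kill, kubelet crash | No. These are not controlled evictions |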
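Whether a PDB is consulted depends on which API call removes the pod. As a rough sketch, the functions below build the Eviction object that a drain submits for each pod and post it to the pod's eviction subresource. The function names and the example pod name are hypothetical, and `evict_pod` assumes a kubectl context that is allowed to create evictions.

```shell
# Build the Eviction object for one pod.
# usage: eviction_body <namespace> <pod>
eviction_body() {
  printf '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"%s","namespace":"%s"}}\n' "$2" "$1"
}

# Post it to the pod's eviction subresource.
# If the eviction would violate a PDB, the API server answers 429 Too Many Requests.
# usage: evict_pod <namespace> <pod>
evict_pod() {
  eviction_body "$1" "$2" |
    kubectl create --raw "/api/v1/namespaces/$1/pods/$2/eviction" -f -
}
```

For example, `evict_pod production myapp-abc123` (a made-up pod name) goes through the PDB check, whereas `kubectl delete pod myapp-abc123 -n production` removes the pod without it.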
PodDisruptionBudget
A PDB selects pods by label and declares either the minimum number that must be available (minAvailable) or the maximum number that can be unavailable (maxUnavailable) at any time during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  selector:
    matchLabels:
      app: myapp        # must match the Deployment's pod labels
  minAvailable: 2       # OR use maxUnavailable: 1
minAvailable vs maxUnavailable
| Field | Value type | Meaning (with 5 replicas) |
|---|---|---|
| minAvailable: 2 | Absolute | At least 2 pods must stay available, so up to 3 can be evicted. |
| minAvailable: "80%" | Percentage | At least 80% = 4 pods must stay available, so 1 can be evicted. |
| maxUnavailable: 1 | Absolute | At most 1 pod can be unavailable, so 4 must stay available. |
| maxUnavailable: "20%" | Percentage | At most 20% = 1 pod can be unavailable, so 4 must stay available. |
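Setting minAvailable equal to the total replica count (or maxUnavailable: 0) means no pod can ever be voluntarily evicted, so node drains will block indefinitely. Use minAvailable: N-1 where N is your replica count, or a percentage below 100%. Percentages round up, so check the result for small replica counts: minAvailable: "90%" with 5 replicas still requires all 5 pods.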
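The arithmetic behind the table can be sketched as a small shell function. It is a simplified model rather than the controller's actual code: it assumes percentages are rounded up for both fields and that allowed disruptions are the currently healthy pods minus the required healthy count, floored at zero. The function names are made up for this example.

```shell
# ceil(pct * n / 100) using integer arithmetic
# usage: ceil_pct <pct> <n>
ceil_pct() { echo $(( ($1 * $2 + 99) / 100 )); }

# usage: allowed_disruptions <minAvailable|maxUnavailable> <value> <replicas> <healthy>
# <value> is an integer or a percentage such as 80%
allowed_disruptions() {
  field=$1 value=$2 replicas=$3 healthy=$4
  case $value in
    *%) n=$(ceil_pct "${value%\%}" "$replicas") ;;   # percentage: resolve against replicas
    *)  n=$value ;;                                    # absolute number
  esac
  case $field in
    minAvailable)   desired=$n ;;
    maxUnavailable) desired=$(( replicas - n )) ;;
    *) echo "unknown field: $field" >&2; return 1 ;;
  esac
  if [ "$desired" -lt 0 ]; then desired=0; fi
  allowed=$(( healthy - desired ))
  if [ "$allowed" -lt 0 ]; then allowed=0; fi
  echo "$allowed"
}
```

With 5 healthy replicas, `allowed_disruptions minAvailable 80% 5 5` prints 1 and `allowed_disruptions minAvailable 90% 5 5` prints 0, which is the drain-blocking case described above.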
Rolling Update Strategy
Deployment rolling updates are not limited by PDBs. Availability during a rollout is configured separately via strategy:
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # temporarily run 6 pods during update
      maxUnavailable: 0   # never drop below 5 available pods
                          # (requires maxSurge >= 1; the API rejects both set to 0)
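maxUnavailable: 0 with maxSurge: 1 is the zero-downtime rolling update: spin up one new pod, wait for it to pass readiness checks, then terminate one old pod. It is slower, but available capacity never drops below the desired replica count, provided the readiness probe reflects real readiness and old pods shut down gracefully.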
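To see how the two settings bound a rollout, the sketch below computes the minimum number of available pods and the maximum total pod count. It assumes the Deployment rounding rules, where a percentage maxSurge rounds up and a percentage maxUnavailable rounds down. The function name is hypothetical.

```shell
# usage: rollout_bounds <replicas> <maxSurge> <maxUnavailable>
# maxSurge and maxUnavailable are integers or percentages such as 25%
rollout_bounds() {
  replicas=$1
  case $2 in
    *%) pct=${2%\%}; surge=$(( (pct * replicas + 99) / 100 )) ;;   # round up
    *)  surge=$2 ;;
  esac
  case $3 in
    *%) pct=${3%\%}; unavail=$(( pct * replicas / 100 )) ;;        # round down
    *)  unavail=$3 ;;
  esac
  echo "min_available=$(( replicas - unavail )) max_total=$(( replicas + surge ))"
}
```

`rollout_bounds 5 1 0` prints `min_available=5 max_total=6`, matching the configuration above, while the default of 25% for both fields gives `min_available=4 max_total=7` for 5 replicas.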
preStop Hook & terminationGracePeriod
When a pod is evicted or a rolling update terminates a pod, Kubernetes starts the termination grace period (default 30s), runs any preStop hook, and then sends SIGTERM. The container must finish in-flight requests before the grace period expires, at which point it receives SIGKILL. A preStop hook with a deliberate sleep delays SIGTERM so the load balancer can remove the pod from its backends first, closing the window where traffic still routes to a terminating pod.
spec:
  terminationGracePeriodSeconds: 60   # total budget for preStop + SIGTERM handler
  containers:
    - name: api
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]   # give LB time to deregister
      # The app's SIGTERM handler should stop accepting new requests,
      # drain in-flight ones, then exit cleanly within the remaining budget
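The handler pattern described in the comments can be illustrated with a shell worker loop. This is only a stand-in: a real service would do the same in its own language and server framework, and `run_worker` and `handle_one_request` are hypothetical names. The point is that SIGTERM flips a flag instead of aborting work in progress, so the current request finishes before the process exits.

```shell
# Hypothetical unit of work; stands in for serving one in-flight request.
handle_one_request() { sleep 1; }

# Loop until SIGTERM arrives, finish the request in progress, then return 0.
run_worker() {
  draining=0
  trap 'draining=1' TERM            # SIGTERM only sets a flag
  while [ "$draining" -eq 0 ]; do   # stop picking up new work once draining
    handle_one_request
  done
  echo "drained, exiting"
  return 0
}
```

With the configuration above, the preStop sleep uses 5 of the 60 seconds, so a handler like this has roughly 55 seconds to drain before SIGKILL.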
Readiness Gates
By default, a pod is considered ready when all containers pass their readiness probe. Pod Readiness Gates add external conditions — the pod isn't marked ready until an external controller (e.g. a load balancer controller) confirms it's registered in the backend pool. This eliminates the race condition where traffic hits a new pod before the LB backend is updated.
spec:
  readinessGates:
    - conditionType: "target-health.elbv2.k8s.aws/my-tg"   # AWS ALB controller sets this
  # Pod stays unready (gets no traffic) until ALB marks it healthy in target group
Topology Spread Constraints
Distribute pods evenly across failure domains (zones, nodes) so a single zone outage doesn't take down all replicas.
spec:
  topologySpreadConstraints:
    - maxSkew: 1                               # max difference in pod count between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule         # hard constraint
      labelSelector:
        matchLabels:
          app: myapp
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname      # also spread across individual nodes
      whenUnsatisfiable: ScheduleAnyway        # soft constraint: try but don't block
      labelSelector:                           # each constraint needs its own selector
        matchLabels:
          app: myapp
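The maxSkew check can be sketched as follows. It is a simplified model of the scheduler's rule: a pod may be placed in a domain if that domain's matching pod count after placement, minus the smallest count across all domains, stays within maxSkew. The function name is made up, and the model ignores node eligibility and other scheduling filters.

```shell
# usage: zone_allowed <maxSkew> <target_count> <count>...
# <target_count> is the number of matching pods already in the candidate domain;
# <count>... lists the matching pod count of every domain, including the candidate.
# Returns success if placing one more pod there keeps the skew within <maxSkew>.
zone_allowed() {
  max_skew=$1 target=$2
  shift 2
  min=$1
  for c in "$@"; do
    if [ "$c" -lt "$min" ]; then min=$c; fi   # global minimum across domains
  done
  [ $(( target + 1 - min )) -le "$max_skew" ]
}
```

With zones holding 2, 2, and 1 pods and maxSkew: 1, `zone_allowed 1 1 2 2 1` succeeds (the pod can go to the zone with 1 pod), while `zone_allowed 1 2 2 2 1` fails because the skew would become 2.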
kubectl Commands
# List all PDBs in a namespace
kubectl get pdb -n production
# Describe a PDB — shows current allowed disruptions
kubectl describe pdb myapp-pdb -n production
# Check if drain would block (server-side dry run, nothing is evicted)
kubectl drain node-1 --ignore-daemonsets --dry-run=server
# Check disruption status
kubectl get pdb -n production -o json | jq '.items[] | {name:.metadata.name, allowed:.status.disruptionsAllowed, desired:.status.desiredHealthy, current:.status.currentHealthy}'
# List the pods the PDB selects (same label selector) and the nodes they run on
kubectl get pods -n production -l app=myapp -o wide