PodDisruptionBudgets Deep Dive#
A PodDisruptionBudget (PDB) limits how many pods from a set can be simultaneously down during voluntary disruptions – node drains, cluster upgrades, autoscaler scale-down. PDBs do not protect against involuntary disruptions like node crashes or OOM kills. They are the mechanism by which you tell Kubernetes “this service needs at least N healthy pods at all times during maintenance.”
minAvailable vs maxUnavailable#
PDBs support two mutually exclusive fields: minAvailable and maxUnavailable. Use one or the other, not both.
# Option 1: minAvailable
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2 # at least 2 pods must remain available
  selector:
    matchLabels:
      app: my-app

# Option 2: maxUnavailable
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1 # at most 1 pod can be unavailable
  selector:
    matchLabels:
      app: my-app

Use maxUnavailable in almost all cases. Here is why: minAvailable is an absolute floor, so if you set minAvailable: 2 on a 3-replica deployment and scale it down to 2, the PDB now allows zero disruptions. maxUnavailable: 1 always allows exactly 1 pod to be evicted regardless of the current replica count, which is what you actually want during maintenance.
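To see the difference concretely, watch the PDB's status.disruptionsAllowed field change as the deployment scales. A rough sketch, reusing the my-app deployment and the minAvailable: 2 budget from the example above:

# With minAvailable: 2 and 3 healthy replicas, one disruption is allowed
kubectl scale deployment my-app --replicas=3
kubectl get pdb my-app-pdb -o jsonpath='{.status.disruptionsAllowed}{"\n"}'   # prints 1

# Scale down to 2 and the same budget allows zero disruptions; drains now block
kubectl scale deployment my-app --replicas=2
kubectl get pdb my-app-pdb -o jsonpath='{.status.disruptionsAllowed}{"\n"}'   # prints 0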
You can also use percentages:
spec:
  maxUnavailable: "25%" # rounds up -- for 4 pods, allows 1 unavailable

Percentage-based PDBs scale with your deployment, which makes them better for workloads that autoscale.
The Single-Replica Gotcha#
This is the most common PDB mistake in production:
# DO NOT DO THIS
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1 # only one pod
  # ...

With 1 replica and minAvailable: 1, the PDB allows zero disruptions. Node drains will block forever. The Cluster Autoscaler cannot scale down the node. Kubernetes upgrades stall.
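When this happens, the drain output points straight at the culprit. It looks roughly like this (pod and node names are placeholders):

kubectl drain node-1 --ignore-daemonsets
# evicting pod default/my-app-xxxxx
# error when evicting pods/"my-app-xxxxx" -n "default" (will retry after 5s):
#   Cannot evict pod as it would violate the pod's disruption budget.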
Fixes:
- Best: Run at least 2 replicas for anything that needs a PDB.
- Acceptable: Use maxUnavailable: 1 instead, which always allows at least 1 eviction.
- Last resort: Do not create a PDB for single-replica workloads that can tolerate brief downtime.
# Correct for any replica count
spec:
  maxUnavailable: 1

PDB and Cluster Autoscaler#
The Cluster Autoscaler evaluates PDBs before deciding to scale down a node. If evicting pods from a node would violate any PDB, the autoscaler skips that node.
Symptoms of PDB-blocked scale-down:
# Check autoscaler status
kubectl -n kube-system describe configmap cluster-autoscaler-status
# Look for entries like:
# ScaleDown: NoCandidates
# Reason: pod my-namespace/my-app-xyz with PDB my-namespace/my-app-pdb

This is often caused by:
- The single-replica PDB problem described above.
- Multiple PDBs selecting the same pods with conflicting constraints.
- Pods that are already unhealthy, reducing currentHealthy below desiredHealthy.
If the autoscaler cannot scale down nodes, you waste money on idle compute. Audit PDBs regularly.
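A quick audit is to list every PDB together with its currently allowed disruptions and flag the ones stuck at zero. A minimal sketch using only kubectl and awk:

# Keep the header plus any PDB whose ALLOWED column is 0
kubectl get pdb --all-namespaces \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed' \
  | awk 'NR == 1 || $3 == 0'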
PDB with StatefulSets#
StatefulSets have ordered, sticky identity. Scale-downs and rolling updates remove pods one at a time in reverse ordinal order (pod-2, then pod-1, then pod-0), but a node drain evicts whichever pods happen to sit on the drained node, so ordering alone will not protect you. PDBs still apply:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: postgres

For databases and quorum-based systems, calculate the PDB value from the quorum requirement: a cluster of N members needs floor(N/2) + 1 members for quorum, so the PDB can tolerate at most floor((N-1)/2) unavailable. A 3-node etcd cluster needs 2 nodes for quorum, so you can tolerate 1 unavailable:
spec:
  maxUnavailable: 1 # preserves quorum for 3-node clusters

A 5-node etcd cluster needs 3 for quorum, so you can tolerate 2:
spec:
  maxUnavailable: 2

unhealthyPodEvictionPolicy#
By default, PDBs protect unhealthy pods the same as healthy ones. This creates a deadlock: if a pod is stuck in CrashLoopBackOff, it counts against the PDB budget but will never become healthy. The PDB blocks eviction of the broken pod, which blocks the drain, which blocks the node upgrade.
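To see whether you are in that state, list the pods the PDB selects alongside their Ready condition; anything stuck at False is a potential drain blocker under the default policy. A sketch, assuming the app=my-app label from the earlier examples:

# Pod name and Ready status for every pod the PDB selects
kubectl get pods -l app=my-app \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'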
Kubernetes 1.31+ (stable) supports unhealthyPodEvictionPolicy:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: my-app

AlwaysAllow means pods that are running but not Ready can always be evicted, even if that would exceed the disruption budget. This is almost always what you want. The alternative value, IfHealthyBudget (the default), only allows eviction of unhealthy pods while the application still has its desired number of healthy pods.
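To confirm that your cluster's API server knows the field and that a given PDB has it set, two quick checks (PDB name from the example above):

# Is the field present in this cluster's API schema?
kubectl explain poddisruptionbudget.spec.unhealthyPodEvictionPolicy

# Is it set on this PDB?
kubectl get pdb my-app-pdb -o jsonpath='{.spec.unhealthyPodEvictionPolicy}{"\n"}'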
Monitoring PDB Status#
# List all PDBs with their current status
kubectl get pdb --all-namespaces
# Output:
# NAMESPACE   NAME         MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
# default     my-app-pdb   N/A             1                 1                     30d
# default     redis-pdb    2               N/A               0                     15d
# Detailed view
kubectl describe pdb my-app-pdb
# Status:
#   Current Healthy:      3
#   Desired Healthy:      2
#   Disruptions Allowed:  1
#   Expected Pods:        3

Key fields:
- Disruptions Allowed: 0 means all eviction requests will be denied. Investigate immediately.
- Current Healthy: Should equal Expected Pods under normal conditions. If lower, pods are unhealthy.
- Desired Healthy: Calculated from minAvailable or replicas - maxUnavailable.
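For scripting or a quick one-off check, the same status fields can be pulled directly (names match the earlier examples):

# currentHealthy, desiredHealthy, and disruptionsAllowed on one line
kubectl get pdb my-app-pdb \
  -o jsonpath='{.status.currentHealthy} {.status.desiredHealthy} {.status.disruptionsAllowed}{"\n"}'

If you run kube-state-metrics, these status fields are also exported as metrics, which makes a standing alert on zero allowed disruptions straightforward.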
Voluntary vs Involuntary Disruptions#
PDBs only govern voluntary disruptions:
- kubectl drain
- Cluster Autoscaler scale-down
- Kubernetes upgrades
- Evictions issued through the Eviction API (for example, from automation)
PDBs do not protect against:
- Node hardware failure
- Kernel panic
- OOM kills
- kubectl delete pod (direct delete, not eviction)
Direct pod deletion (kubectl delete pod my-pod) bypasses the Eviction API and therefore ignores PDBs entirely. If you are writing automation that removes pods, go through the Eviction API so PDBs are respected.
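From a script, that means creating an Eviction object against the pod's eviction subresource. A sketch; the pod name, namespace, and the use of kubectl create --raw as the client are illustrative assumptions:

# Evict one pod through the Eviction API. If a PDB would be violated,
# the API server refuses with HTTP 429 and the pod stays put.
kubectl create --raw /api/v1/namespaces/default/pods/my-pod/eviction -f - <<'EOF'
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {"name": "my-pod", "namespace": "default"}
}
EOF

For node-level maintenance the same rule applies: kubectl drain evicts through the Eviction API, while a plain delete does not.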
# This respects PDBs (drain evicts through the Eviction API)
kubectl drain my-node --ignore-daemonsets

# This does NOT respect PDBs
kubectl delete pod my-pod

Common PDB Patterns#
# Web frontend: tolerate 25% down
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  maxUnavailable: "25%"
  selector:
    matchLabels:
      app: frontend
---
# Database: never lose quorum
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: db-pdb
spec:
  maxUnavailable: 1
  unhealthyPodEvictionPolicy: AlwaysAllow
  selector:
    matchLabels:
      app: postgres
---
# Batch processor: can tolerate full restart
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-pdb
spec:
  maxUnavailable: "100%"
  selector:
    matchLabels:
      app: batch-processor
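If you keep these patterns in a single manifest (the file name here is only an example), applying and sanity-checking them is one step:

# Apply the budgets, then confirm each reports a sensible ALLOWED DISRUPTIONS value
kubectl apply -f pdbs.yaml
kubectl get pdb frontend-pdb db-pdb batch-pdb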