# Kubernetes Deployment Strategies
Every deployment strategy answers the same question: how do you replace running pods with new ones without breaking things for users? The answer depends on your tolerance for downtime, risk appetite, and infrastructure complexity.
## Rolling Update (Default)
Rolling updates replace pods incrementally. Kubernetes creates new pods before killing old ones, keeping the service available throughout. This is the default strategy for Deployments.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 10
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      containers:
        - name: web-api
          image: web-api:2.1.0
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```

Key parameters:
- `maxSurge` – How many pods above the desired count can exist during the update. `maxSurge: 1` with 4 replicas means at most 5 pods exist at once. Can be a number or a percentage (`25%`).
- `maxUnavailable` – How many pods can be unavailable during the update. `maxUnavailable: 0` means every old pod stays running until its replacement is ready. This is the safest option but the slowest.
- `minReadySeconds` – How long a new pod must report Ready before Kubernetes considers it available. Without this, a pod that passes its readiness probe once and then crashes will still be counted as successfully rolled out. Set this to at least 10 seconds for production workloads.
A readiness probe is critical here. Without one, Kubernetes considers a pod ready as soon as the container starts, meaning traffic hits pods before your application is actually listening.
Common combinations:
```yaml
# Conservative: no downtime, slower rollout
maxSurge: 1
maxUnavailable: 0

# Balanced: some overlap, some unavailability
maxSurge: 25%
maxUnavailable: 25%

# Fast: aggressive replacement
maxSurge: 50%
maxUnavailable: 50%
```

## Recreate
The Recreate strategy kills all existing pods before creating new ones. There is downtime. Use this when your application cannot tolerate two versions running simultaneously – for example, a database migration runner or a singleton worker that holds a lock.
```yaml
spec:
  strategy:
    type: Recreate
```

That is the entire configuration. No other parameters apply.
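For completeness, here is a minimal sketch of a full manifest using it. The `queue-worker` name and image tag are illustrative, standing in for the kind of singleton workload mentioned above:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-worker           # illustrative singleton worker that holds a lock
spec:
  replicas: 1
  strategy:
    type: Recreate             # the old pod is fully terminated before the new one starts
  selector:
    matchLabels:
      app: queue-worker
  template:
    metadata:
      labels:
        app: queue-worker
    spec:
      containers:
        - name: queue-worker
          image: queue-worker:1.4.2   # placeholder tag
```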
## Blue-Green with Labels and Services
Kubernetes does not have a native blue-green strategy, but you can build one with two Deployments and a Service selector switch.
Deploy both versions simultaneously. The Service points to one of them.
```yaml
# Blue deployment (currently live)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-blue
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web-api
      slot: blue
  template:
    metadata:
      labels:
        app: web-api
        slot: blue
        version: "2.0.0"
    spec:
      containers:
        - name: web-api
          image: web-api:2.0.0
---
# Green deployment (new version, not receiving traffic yet)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-green
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web-api
      slot: green
  template:
    metadata:
      labels:
        app: web-api
        slot: green
        version: "2.1.0"
    spec:
      containers:
        - name: web-api
          image: web-api:2.1.0
---
# Service pointing to blue
apiVersion: v1
kind: Service
metadata:
  name: web-api
spec:
  selector:
    app: web-api
    slot: blue
  ports:
    - port: 80
      targetPort: 8080
```

Once the green deployment is healthy, switch traffic by patching the Service:
```bash
kubectl patch svc web-api -p '{"spec":{"selector":{"slot":"green"}}}'
```

To roll back, patch it back to blue. The old deployment is still running. After you confirm the new version is stable, scale down the old deployment:
```bash
kubectl scale deployment web-api-blue --replicas=0
```

The advantage is instant rollback. The cost is running double the pods during the transition.
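Before flipping the selector in the first place, it is worth exercising the green pods directly rather than trusting readiness probes alone. A quick sketch, assuming only the manifests above, using a temporary port-forward (`/healthz` is just the probe endpoint from earlier; any request that exercises the new version will do):

```bash
# Wait for every green pod to be Ready, then hit one directly
kubectl rollout status deployment/web-api-green
kubectl port-forward deployment/web-api-green 8080:8080 &
sleep 2   # give the port-forward a moment to establish
curl -fsS http://localhost:8080/healthz
```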
## Canary Deployments
A canary sends a small percentage of traffic to the new version before committing fully. The simplest approach uses two Deployments behind one Service with no slot label – the Service selects on `app: web-api` only, so traffic is spread roughly evenly across all matching pods, old and new.
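A minimal sketch of the canary half, under the assumption that the main Deployment is identical apart from `replicas: 9`, the `2.0.0` image, and a `track: stable` label (names and extra labels here are illustrative, not required by Kubernetes):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-canary       # illustrative name for the canary Deployment
spec:
  replicas: 1                # 1 of 10 matching pods ≈ roughly 10% of traffic
  selector:
    matchLabels:
      app: web-api
      track: canary          # keeps this selector from overlapping the main Deployment's
  template:
    metadata:
      labels:
        app: web-api         # matched by the shared Service selector
        track: canary
    spec:
      containers:
        - name: web-api
          image: web-api:2.1.0
```

The resulting traffic split: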
```text
# Main deployment: 9 replicas of v2.0.0
# Canary deployment: 1 replica of v2.1.0
# Result: ~10% of traffic hits the canary
```

This is coarse. For fine-grained traffic splitting, use Argo Rollouts:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web-api
spec:
  replicas: 4
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - setWeight: 30
        - pause: {duration: 5m}
        - setWeight: 60
        - pause: {duration: 5m}
      canaryService: web-api-canary
      stableService: web-api-stable
      trafficRouting:
        nginx:
          stableIngress: web-api-ingress
```

This ramps traffic from 10% to 30% to 60% with pauses between each step. If metrics look bad at any step, run `kubectl argo rollouts abort web-api`.
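The Rollout above refers to a `web-api-canary` and a `web-api-stable` Service, plus an existing `web-api-ingress` that routes to the stable Service; none of these are created for you. A minimal sketch of the two Services, reusing the port numbers from the earlier examples (Argo Rollouts rewrites their selectors during a rollout by injecting a `rollouts-pod-template-hash`):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-api-stable
spec:
  selector:
    app: web-api     # pinned to the stable ReplicaSet by the Rollout controller
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web-api-canary
spec:
  selector:
    app: web-api     # likewise pinned to the canary ReplicaSet during a rollout
  ports:
    - port: 80
      targetPort: 8080
```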
## Rollbacks
For standard Deployments, Kubernetes stores rollout history:

```bash
# See rollout history
kubectl rollout history deployment/web-api
# Roll back to the previous revision
kubectl rollout undo deployment/web-api
# Roll back to a specific revision
kubectl rollout undo deployment/web-api --to-revision=3
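# Optional: record why a revision exists so history shows a change cause
# (the kubernetes.io/change-cause annotation is copied to the revision's ReplicaSet;
#  the message below is just an example)
kubectl annotate deployment/web-api kubernetes.io/change-cause="upgrade to web-api:2.1.0" --overwrite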
# Check rollout status
kubectl rollout status deployment/web-api
```

By default Kubernetes keeps 10 revisions (`revisionHistoryLimit`). Set this explicitly in production so you always have rollback targets:
```yaml
spec:
  revisionHistoryLimit: 5
```

## Which Strategy to Use
| Scenario | Strategy | Why |
|---|---|---|
| Standard web service | Rolling update | Zero downtime, simple |
| Database migration job | Recreate | Cannot run two versions |
| Risk-sensitive production | Blue-green | Instant rollback |
| Gradual validation needed | Canary | Controlled exposure |
Start with rolling updates. Move to blue-green or canary when you have the monitoring in place to actually detect problems during a rollout – without observability, a canary is just a rolling update with extra steps.