Building a Kubernetes Deployment Pipeline: From Code Push to Production#
A deployment pipeline connects a code commit to a running container in your cluster. This operational sequence walks through building one end-to-end, with decision points at each phase and verification steps to confirm the pipeline works before moving on.
Phase 1 – Source Control and CI#
Step 1: Repository Structure#
Every deployable service needs three things alongside its application code: a Dockerfile, deployment manifests, and a CI pipeline definition.
my-service/
  src/
  Dockerfile
  helm/
    Chart.yaml
    values.yaml
    templates/
  .github/workflows/ci.yaml

If you use Kustomize instead of Helm, replace the helm/ directory with k8s/base/ and k8s/overlays/.
Step 2: Branching Strategy#
Trunk-based development is the recommended approach. Developers commit to main frequently, feature flags gate incomplete work, and every commit to main is a release candidate. This avoids long-lived branches and merge conflicts.
GitFlow uses develop, release/*, and hotfix/* branches. It adds process overhead and slows down feedback. Use it only if your organization requires formal release trains with multiple concurrent versions.
Step 3: CI Pipeline Setup#
A GitHub Actions workflow that builds, tests, scans, and pushes a container image:
# .github/workflows/ci.yaml
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Run linting
        run: make lint
      - name: Run unit tests
        run: make test-unit
      - name: Run integration tests
        run: make test-integration
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to container registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Generate image tag
        id: tag
        run: |
          SHA=$(git rev-parse --short HEAD)
          echo "tag=${SHA}" >> "$GITHUB_OUTPUT"
      - name: Build and push image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name == 'push' }}
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
          platforms: linux/amd64,linux/arm64
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Scan image for CVEs
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
          format: table
          exit-code: 1
          severity: CRITICAL,HIGH

Use the short Git SHA as the image tag, not :latest. Every build produces a unique, traceable tag. The :latest tag is included as a convenience but should never be referenced in deployment manifests.
Step 4: Verification#
Push a commit to main. Confirm the CI pipeline runs, the image appears in your container registry with the correct SHA tag, and the Trivy scan passes. If the scan reports critical CVEs, fix them before proceeding.
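To check from the command line rather than the registry UI, something like the following works (assuming the gh CLI and Docker are authenticated against GitHub and ghcr.io; the image path follows the layout used above):

gh run list --limit 1    # status of the most recent workflow run
docker manifest inspect ghcr.io/myorg/my-service:$(git rev-parse --short HEAD)    # image exists with the SHA tag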
Phase 2 – Configuration Repository#
Step 5: Separate Config Repository#
Create a dedicated repository (or a deploy/ directory in a monorepo) for deployment manifests. This separates application concerns from deployment concerns and lets ArgoCD or Flux watch a single source of truth.
deploy-configs/
  apps/
    my-service/
      base/
        deployment.yaml
        service.yaml
        kustomization.yaml
      overlays/
        dev/
          kustomization.yaml
        staging/
          kustomization.yaml
        prod/
          kustomization.yaml

Step 6: Environment Overlays#
The base contains the common deployment spec. Overlays patch per-environment values.
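For context, a minimal base/deployment.yaml might look like the sketch below (the container port is illustrative); the kustomization files that follow reference it, and the images: block in each overlay rewrites the bare my-service image name:

# base/deployment.yaml (illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
  labels:
    app: my-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: my-service   # replaced per environment via the overlay's images: block
          ports:
            - containerPort: 8080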
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

# overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: my-service-dev
resources:
  - ../../base
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-service
      spec:
        replicas: 1
        template:
          spec:
            containers:
              - name: my-service
                resources:
                  requests:
                    cpu: 100m
                    memory: 128Mi
images:
  - name: my-service
    newName: ghcr.io/myorg/my-service
    newTag: abc1234

Step 7: Image Tag Update Strategy#
Option A – CI-driven update: After pushing the image, the CI pipeline updates the image tag in the config repo via a commit. This is simple and explicit.
# Add to the CI workflow after the image push step
- name: Update image tag in config repo
  run: |
    git clone https://x-access-token:${{ secrets.CONFIG_REPO_TOKEN }}@github.com/myorg/deploy-configs.git
    cd deploy-configs/apps/my-service/overlays/dev
    kustomize edit set image my-service=ghcr.io/myorg/my-service:${{ steps.tag.outputs.tag }}
    # The runner has no git identity by default; the commit fails without one
    git config user.name "ci-bot"
    git config user.email "ci-bot@users.noreply.github.com"
    git add .
    git commit -m "Update my-service to ${{ steps.tag.outputs.tag }}"
    git push

Option B – ArgoCD Image Updater or Flux Image Automation: The GitOps tool watches the container registry for new tags matching a pattern and automatically updates the config repo. This removes CI’s knowledge of the config repo but adds complexity.
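If you choose Option B with ArgoCD Image Updater, the configuration lives as annotations on the ArgoCD Application rather than in CI. A minimal sketch, assuming the Image Updater controller is installed; the myimage alias is just a label of your choosing:

metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: myimage=ghcr.io/myorg/my-service
    argocd-image-updater.argoproj.io/write-back-method: git

With write-back-method: git, the updater commits the new tag back to the config repo, so Git history remains the audit trail just as in Option A.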
Step 8: Verification#
After the CI pipeline completes, check that the config repo contains the new image tag. Run kustomize build overlays/dev and confirm the rendered manifests reference the correct image.
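Concretely, from the root of deploy-configs (paths match the layout above):

kustomize build apps/my-service/overlays/dev | grep "image:"
# Expected output includes ghcr.io/myorg/my-service:<short-sha>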
Phase 3 – GitOps Deployment#
Step 9: Install ArgoCD#
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Retrieve the initial admin password:

kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

Step 10: Create an ArgoCD Application#
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/deploy-configs.git
    targetRevision: main
    path: apps/my-service/overlays/dev
  destination:
    server: https://kubernetes.default.svc
    namespace: my-service-dev
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Step 11: Sync Policy Per Environment#
- Dev: automated with selfHeal: true – any change in Git deploys immediately. Drift is corrected automatically.
- Staging: automated but without selfHeal, or use a PR-based promotion workflow.
- Production: no automated block – sync is manual, triggered by an operator or a release automation tool after approval (a sketch of the prod syncPolicy follows below).
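For production, that usually means the same Application spec with the automated block removed; a sketch of just the syncPolicy:

  syncPolicy:
    syncOptions:
      - CreateNamespace=true

An operator (or release tooling) then triggers the deploy after approval, for example with argocd app sync my-service-prod.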
Step 12: Notifications#
# ArgoCD notification configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token
  trigger.on-sync-succeeded: |
    - when: app.status.operationState.phase in ['Succeeded']
      send: [app-sync-succeeded]
  template.app-sync-succeeded: |
    message: "{{.app.metadata.name}} synced to {{.app.status.sync.revision}}"

Step 13: Verification#
Update the image tag in the config repo. Confirm ArgoCD detects the change (visible in the ArgoCD UI or argocd app get my-service-dev). Verify the new pod is running the expected image: kubectl get pods -n my-service-dev -o jsonpath='{.items[0].spec.containers[0].image}'.
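The same checks are available from the CLI, assuming the argocd CLI is installed and logged in:

argocd app get my-service-dev --refresh    # force a refresh and show sync/health status
argocd app wait my-service-dev --health --timeout 300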
Phase 4 – Progressive Delivery#
Step 14: Decision Point#
| Strategy | Complexity | Risk | Use When |
|---|---|---|---|
| Rolling update | Low | Medium | Most workloads, acceptable brief mixed-version window |
| Canary | Medium | Low | User-facing services where bad deploys cost money |
| Blue-green | Medium | Low | Need instant rollback, can afford 2x resources briefly |
Step 15: Canary with Argo Rollouts#
Install Argo Rollouts:
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

Replace your Deployment with a Rollout:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: { duration: 5m }
        - setWeight: 30
        - pause: { duration: 5m }
        - setWeight: 60
        - pause: { duration: 5m }
      canaryService: my-service-canary
      stableService: my-service-stable
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 2
        args:
          - name: service-name
            value: my-service
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: ghcr.io/myorg/my-service:abc1234

Step 16: Rolling Update Configuration#
If you stay with rolling updates, tune the parameters:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
  minReadySeconds: 15

Setting maxUnavailable: 0 ensures no capacity loss during rollout. minReadySeconds: 15 prevents a pod that only briefly passes its readiness probe from being counted as available.
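minReadySeconds only helps if the pod has a readiness probe to pass in the first place. A minimal probe, with an illustrative /healthz path and port:

      containers:
        - name: my-service
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10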
Step 17: Rollback#
For rolling updates, rollback is manual: kubectl rollout undo deployment/my-service (a few related commands are sketched below). For canary with Argo Rollouts, configure an AnalysisTemplate that checks error rates, like the one that follows the sketch. If the analysis fails, the rollout aborts automatically and traffic returns to the stable version.
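Rollback commands for the rolling-update case (the revision number is illustrative):

kubectl rollout history deployment/my-service
kubectl rollout undo deployment/my-service                    # back to the previous revision
kubectl rollout undo deployment/my-service --to-revision=3    # or to a specific revision
kubectl rollout status deployment/my-service                  # watch the rollback complete

Keep in mind that where ArgoCD selfHeal is enabled, a manual undo is reverted on the next sync; the durable rollback is to revert the image tag in the config repo.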
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 60s
      successCondition: result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status=~"2.."}[2m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))

Step 18: Verification#
Deploy a version that returns 500 errors on a percentage of requests. With canary analysis, verify the rollout pauses and then aborts. With rolling updates, verify kubectl rollout undo restores the previous version.
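Watching the canary is easiest with the Argo Rollouts kubectl plugin, if you have it installed:

kubectl argo rollouts get rollout my-service --watch    # shows weight steps and analysis status live
kubectl argo rollouts abort my-service                  # manual abort if you don't want to wait for analysis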
Phase 5 – Observability#
Step 19: Deployment Annotations#
Add annotations so Grafana can mark deployments on dashboards:
metadata:
  annotations:
    deployment.kubernetes.io/revision: "{{ .Release.Revision }}"
    app.kubernetes.io/version: "{{ .Values.image.tag }}"

Step 20: Deployment Dashboard#
Track these metrics in a Grafana dashboard: deployment frequency (how often you deploy), deployment duration (time from commit to running pod), change failure rate (percentage of deployments that cause incidents), and mean time to recovery (how quickly you roll back or fix).
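If kube-state-metrics is scraped by Prometheus, a rough way to chart deployment frequency is to count spec-generation changes; this is an approximation (scaling and config edits also bump the generation) and the label value is illustrative:

# Deployments of my-service over the last 7 days (approximate)
sum(changes(kube_deployment_status_observed_generation{deployment="my-service"}[7d]))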
Step 21: Alerting#
# Prometheus alerting rule
groups:
  - name: deployment-alerts
    rules:
      - alert: DeploymentReplicasMismatch
        expr: kube_deployment_spec_replicas != kube_deployment_status_ready_replicas
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Deployment {{ $labels.deployment }} has replica mismatch for 10 minutes"

Step 22: Verification#
Deploy a new version. Confirm a deployment marker appears on the Grafana dashboard timeline. Verify the alert fires when ready replicas stay below the spec for 10 minutes; one way to force that condition in dev is sketched below.
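A low-effort way to force the mismatch in a small dev cluster, assuming it cannot schedule this many replicas, is to scale well past capacity and wait out the 10-minute window (if selfHeal is enabled for dev, temporarily disable it or ArgoCD will revert the change first):

kubectl -n my-service-dev scale deployment/my-service --replicas=50
# Pending pods keep ready replicas below the spec, so the alert should fire after 10 minutes
kubectl -n my-service-dev scale deployment/my-service --replicas=1    # restore afterwards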
Summary#
The complete pipeline flow: developer pushes code, CI builds and scans the image, CI (or image automation) updates the config repo, ArgoCD syncs the change to the cluster, the rollout strategy controls how traffic shifts to the new version, and observability tools confirm the deployment succeeded. Each phase is independently verifiable, so failures are isolated to a specific stage.