Velero Backup and Restore#

Velero backs up Kubernetes resources and persistent volume data to object storage. It handles scheduled backups, on-demand backups, and restores to the same or a different cluster, and it is widely used for Kubernetes disaster recovery.

Velero captures two things: Kubernetes API objects (stored as JSON) and persistent volume data (via cloud volume snapshots or file-level backup with Kopia).

Installation#

You need an object storage bucket (S3, GCS, Azure Blob, or MinIO) and credentials with write access to it.

AWS S3#

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket velero-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./credentials-velero \
  --use-node-agent \
  --default-volumes-to-fs-backup

Credentials file format:

[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
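
One way to produce this file without pasting secrets into an editor is to generate it from environment variables. The following is a minimal sketch, assuming the keys are already exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; write_velero_credentials is a hypothetical helper name, not part of Velero.

```shell
# Hypothetical helper: write the credentials file in the format shown above.
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in the environment.
write_velero_credentials() {
  out="${1:-./credentials-velero}"
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  (
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$out"
  ) || return 1
  # umask only affects newly created files, so tighten an existing one too.
  chmod 600 "$out"
}
```

Calling write_velero_credentials with no argument writes ./credentials-velero, the path passed to --secret-file in the install command.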

MinIO (On-Prem or Dev)#

MinIO is S3-compatible, so use the AWS plugin with a custom endpoint. Setting s3ForcePathStyle=true is critical: without it, Velero uses virtual-hosted-style URLs, which put the bucket name in the hostname, and requests to the MinIO endpoint fail.

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket velero-backups \
  --backup-location-config region=minio,s3ForcePathStyle=true,s3Url=http://minio.minio.svc:9000 \
  --secret-file ./credentials-velero \
  --use-node-agent \
  --default-volumes-to-fs-backup

Backup Schedules#

Full Cluster Backup (Daily)#

velero schedule create daily-full \
  --schedule="0 2 * * *" \
  --ttl 720h

Runs at 2 AM daily and retains each backup for 30 days. If you omit --ttl, Velero applies its default of 720h (30 days); setting it explicitly keeps the retention period visible and easy to change.
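
The --ttl values in this guide are written in hours, so a retention period in days has to be multiplied by 24. A small sketch of that arithmetic; ttl_hours is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: convert a retention period in days
# to the hour-based duration string passed to --ttl.
ttl_hours() {
  echo "$(( $1 * 24 ))h"
}

ttl_hours 30   # prints 720h
ttl_hours 7    # prints 168h
```

For example, `--ttl "$(ttl_hours 30)"` is equivalent to the `--ttl 720h` used in the daily schedule.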

Namespace-Scoped Backup#

velero schedule create payments-hourly \
  --schedule="0 * * * *" \
  --include-namespaces payments \
  --ttl 168h

Excluding Resources#

velero backup create pre-migration \
  --exclude-namespaces kube-system,velero \
  --exclude-resources events,events.events.k8s.io

Persistent Volume Backup#

Volume Snapshots (Cloud Provider)#

If your provider supports volume snapshots (EBS, GCE PD, Azure Disk), Velero creates them natively. Snapshots are fast, but they are provider-specific and cannot be restored on a different provider.

File-Level Backup with Kopia#

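For portable backups, the node agent DaemonSet reads pod volumes on each node and copies their files to object storage. This works with any storage backend and requires the --use-node-agent install flag. To opt in per pod, annotate the pod with the names of the volumes to back up: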

apiVersion: v1
kind: Pod
metadata:
  annotations:
    backup.velero.io/backup-volumes: "data"
spec:
  containers:
  - name: app
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: app-data

Alternatively, pass --default-volumes-to-fs-backup at install time to back up all pod volumes this way by default, then opt specific volumes out with a pod annotation:

backup.velero.io/backup-volumes-excludes: "cache,tmp"
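
Because the annotation lives on the pod, it is usually set on the workload's pod template so that it survives pod restarts. The sketch below does that with a kubectl merge patch. It assumes the pods are managed by a Deployment; exclude_volumes_from_backup and the KUBECTL override variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper, not part of Velero or kubectl.
# KUBECTL can be overridden (for example KUBECTL=echo) to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

exclude_volumes_from_backup() {
  namespace="$1"    # namespace of the Deployment
  deployment="$2"   # Deployment whose pods carry the volumes
  volumes="$3"      # comma-separated volume names, e.g. cache,tmp
  # Merge patch that adds the opt-out annotation to the pod template.
  patch=$(printf '{"spec":{"template":{"metadata":{"annotations":{"backup.velero.io/backup-volumes-excludes":"%s"}}}}}' "$volumes")
  "$KUBECTL" patch deployment "$deployment" -n "$namespace" --type merge -p "$patch"
}
```

A call such as `exclude_volumes_from_backup payments api cache,tmp` (where `api` is a placeholder Deployment name) changes the pod template, so expect it to trigger a rollout.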

Restoring#

Restore to Same Cluster#

velero backup get

velero restore create --from-backup daily-full-20260221020000

# Restore only specific namespaces
velero restore create --from-backup daily-full-20260221020000 \
  --include-namespaces payments

Restore to a Different Cluster#

The target cluster needs Velero installed pointing to the same bucket. Velero automatically discovers existing backups.

# On target cluster
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket velero-backups \
  --backup-location-config region=us-east-1 \
  --secret-file ./credentials-velero \
  --use-node-agent

velero backup get
velero restore create --from-backup daily-full-20260221020000
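
Right after installation the backup list can be empty for a short time, because Velero syncs backup metadata from the bucket periodically rather than instantly. A rough sketch of a polling loop that waits for a backup to appear before restoring; wait_for_backup, VELERO, and SLEEP_SECONDS are hypothetical names, and the loop assumes `velero backup get NAME` exits non-zero while the backup is still unknown to the cluster.

```shell
# Hypothetical helper, not a Velero command.
# VELERO and SLEEP_SECONDS are override points assumed for this sketch.
VELERO="${VELERO:-velero}"

# Poll until the named backup is visible in this cluster, or give up.
wait_for_backup() {
  name="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # Assumption: the command fails while the backup has not been synced yet.
    if "$VELERO" backup get "$name" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${SLEEP_SECONDS:-10}"
  done
  echo "backup $name not visible yet" >&2
  return 1
}
```

Used as `wait_for_backup daily-full-20260221020000 && velero restore create --from-backup daily-full-20260221020000`, the restore only starts once the backup is listed.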

Namespace Remapping#

velero restore create --from-backup daily-full-20260221020000 \
  --namespace-mappings payments:payments-staging

Disaster Recovery Workflow#

  1. Daily: Scheduled backups run automatically. Verify with velero backup get.
  2. Weekly: Check storage location status: velero backup-location get. Status Available means the bucket is reachable.
  3. Monthly: Test a restore to a staging cluster. An untested backup is not a backup.
  4. During incident: Find the last good backup, restore to a new cluster, verify, update DNS.
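
The weekly check and the "find the last good backup" step can be scripted by filtering the CLI's table output. The sketch below is illustrative only: both helper names are hypothetical, and the column positions (STATUS as the second column of `velero backup get`, PHASE as the fourth column of `velero backup-location get`) are assumptions to verify against your Velero version.

```shell
# Hypothetical helpers that read Velero CLI table output on stdin.
# Assumed layouts:
#   velero backup get           -> NAME STATUS ERRORS WARNINGS CREATED ...
#   velero backup-location get  -> NAME PROVIDER BUCKET/PREFIX PHASE ...

# Succeeds only if at least one location is listed and all report "Available".
all_locations_available() {
  awk 'NR > 1 { seen = 1; if ($4 != "Available") bad = 1 }
       END { if (seen && !bad) exit 0; exit 1 }'
}

# Prints the newest Completed backup whose name starts with the given prefix.
# Scheduled backup names end in a timestamp, so a plain sort orders them by time.
latest_completed_backup() {
  awk -v prefix="$1" \
    'NR > 1 && $2 == "Completed" && index($1, prefix) == 1 { print $1 }' |
    sort | tail -n 1
}
```

Typical use would be `velero backup-location get | all_locations_available` in a weekly job and `velero backup get | latest_completed_backup daily-full` during an incident. Backups with any other status, such as PartiallyFailed, are deliberately skipped.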

Common Issues#

Partial restore failures. Custom resources can fail to restore if their CRDs are not yet registered in the target cluster. If that happens, restore the CRDs in a separate pass first, then restore the remaining resources.
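
A two-pass restore along those lines might look like the following sketch. restore_crds_first and the VELERO override variable are hypothetical names for this example, and the `-crds` and `-rest` restore-name suffixes are arbitrary choices.

```shell
# Hypothetical wrapper, not a Velero command.
# VELERO can be overridden (for example VELERO=echo) to print the commands instead of running them.
VELERO="${VELERO:-velero}"
CRD_RESOURCE="customresourcedefinitions.apiextensions.k8s.io"

restore_crds_first() {
  backup="$1"
  # Pass 1: restore only the CustomResourceDefinitions and wait for completion.
  "$VELERO" restore create "${backup}-crds" \
    --from-backup "$backup" \
    --include-resources "$CRD_RESOURCE" \
    --wait || return 1
  # Pass 2: restore everything else now that the CRDs are registered.
  "$VELERO" restore create "${backup}-rest" \
    --from-backup "$backup" \
    --exclude-resources "$CRD_RESOURCE" \
    --wait
}
```

Running `restore_crds_first daily-full-20260221020000` creates two restores from the same backup; inspect each with velero restore describe before declaring the recovery complete.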

PV snapshots not supported. If your storage class lacks CSI snapshot support, a backup can finish without capturing any volume data. Use file-level backup instead, and confirm that volumes were captured with velero backup describe <name> --details.

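Restore conflicts. By default, Velero skips resources that already exist in the target cluster rather than overwriting them. Delete the conflicting resources before restoring, or pass --existing-resource-policy=update to velero restore create.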

Node agent CrashLoopBackOff. Usually a permissions issue – the agent needs privileged access to mount host paths.

# Diagnose issues
velero backup describe <backup-name> --details
velero backup logs <backup-name>
velero restore describe <restore-name> --details