Validation Path Selection#

Not every infrastructure change needs a full Kubernetes cluster to validate. Some changes can be verified with a linter in under a second. Others genuinely need a multi-node cluster with ingress, persistent volumes, and network policies. The cost of choosing wrong is real in both directions: too little validation lets broken configs reach production, while too much wastes minutes or hours on environments you did not need.

This article defines five validation paths, explains what each can and cannot verify, and provides a decision tree for selecting the right one.

The Five Validation Paths#

Path 0: Static Validation (No Compute)#

Static validation runs entirely on the local filesystem using linters, schema validators, and dry-run commands. No containers, no clusters, no cloud accounts.

What it validates:

  • YAML/JSON syntax correctness
  • Kubernetes manifest schema compliance (kubectl --dry-run=client, kubeconform)
  • Helm template rendering (helm template, helm lint)
  • Terraform plan syntax (terraform validate, terraform fmt)
  • Dockerfile linting (hadolint)
  • Policy compliance (conftest, OPA Rego policies)
  • Kustomize overlay rendering (kustomize build)

What it misses:

  • Runtime behavior (does the container actually start?)
  • Resource interactions (does the service connect to the database?)
  • Network policies in action
  • Volume mounts and persistent storage
  • Ingress routing
  • Anything that requires a running API server

Resources required: Local filesystem, CLI tools installable via package manager. No Docker, no internet (beyond initial tool install).

Setup time: Seconds. Install tools once, run instantly afterward.

Teardown: Nothing to tear down.

Fidelity level: Low. Catches syntax and schema errors. Cannot catch logic or runtime errors.

When to use: Every time. Path 0 should be the first step in any validation workflow. It is cheap enough to run on every change.

# Example: validate a Helm chart statically
helm lint ./my-chart
helm template my-release ./my-chart --values values-prod.yaml | kubeconform -strict
conftest test ./my-chart/templates/ -p ./policy/
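
The same approach extends to the other artifact types listed above. A sketch for Terraform, Dockerfile, and kustomize changes, assuming (hypothetically) a Terraform module in ./infra, a Dockerfile at the repository root, and an overlay in ./overlays/prod:

# Example: static validation for Terraform, Dockerfile, and kustomize changes
terraform -chdir=./infra init -backend=false
terraform -chdir=./infra validate
terraform fmt -check -recursive ./infra
hadolint ./Dockerfile
kustomize build ./overlays/prod | kubeconform -strict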

Path 1: Local Lightweight (Docker / kind)#

Spin up a lightweight Kubernetes cluster using kind (Kubernetes IN Docker) or validate individual services with plain Docker containers. This gives you a real API server and kubelet, but in a minimal configuration.

What it validates:

  • Container image builds and startup
  • Kubernetes resource creation (pods actually schedule and run)
  • Service discovery within the cluster
  • Helm chart install/upgrade cycles
  • Basic health check and readiness probe behavior
  • ConfigMap and Secret mounting
  • Simple multi-service communication

What it misses:

  • Ingress controllers (unless explicitly configured)
  • Persistent volume behavior with real storage classes
  • Network policies (kind’s default CNI does not enforce them)
  • Multi-node scheduling behavior (single-node by default)
  • LoadBalancer service type (no cloud LB integration)
  • Performance characteristics under real workloads

Resources required: Docker daemon running. Approximately 2 GB RAM for a single-node kind cluster. No cloud account needed.

Setup time: 30-90 seconds for cluster creation. Add 1-3 minutes for pulling images on first run.

Teardown: kind delete cluster – under 10 seconds.

Fidelity level: Medium-low. Confirms resources deploy and containers run. Does not replicate production networking or storage.

# Example: validate on kind
kind create cluster --name validation
helm install my-release ./my-chart --wait --timeout 120s
kubectl get pods -o wide
kubectl run test --image=busybox --rm -it --restart=Never -- wget -qO- http://my-service:8080/health
kind delete cluster --name validation
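
When the change affects a single service rather than Kubernetes resources, plain Docker is often enough. A minimal sketch, assuming (hypothetically) that the image builds from the repository root and exposes a /health endpoint on port 8080:

# Example: validate a single container with plain Docker
docker build -t my-service:validate .
docker run -d --rm --name validate -p 8080:8080 my-service:validate
sleep 5                                  # give the container a moment to start
curl -fsS http://127.0.0.1:8080/health   # fails with a non-zero exit if unhealthy
docker stop validate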

Path 2: Local Full-Fidelity (minikube)#

Minikube provides a more feature-rich local cluster with addons for ingress, storage, metrics, and more. It supports multiple drivers (Docker, hyperkit, QEMU) and can simulate multi-node setups.

What it validates:

  • Everything Path 1 validates, plus:
  • Ingress controller behavior (via minikube addons enable ingress)
  • Persistent volume claims with real storage provisioner
  • Metrics server and HPA behavior
  • DNS resolution between services
  • Multi-node scheduling (with --nodes flag)
  • Dashboard and monitoring stack integration
  • LoadBalancer via minikube tunnel

What it misses:

  • Cloud-specific storage classes (EBS, PD, Azure Disk)
  • Cloud load balancer integration
  • IAM and cloud identity federation
  • Cross-region or cross-zone behavior
  • Real network latency characteristics
  • Cloud-specific network policies (Calico on GKE, Azure CNI)

Resources required: Docker or a hypervisor. Minimum 4 GB RAM recommended (8 GB for multi-node). No cloud account.

Setup time: 1-3 minutes. Addons add 30-60 seconds each.

Teardown: minikube delete – under 30 seconds.

Fidelity level: Medium-high. The closest you can get to a real cluster locally. Misses only cloud-specific behavior.

# Example: validate with minikube including ingress
minikube start --cpus=4 --memory=8192 --addons=ingress,metrics-server
helm install my-release ./my-chart --wait --timeout 180s
kubectl get pods,svc,ingress
minikube tunnel &
curl -H "Host: myapp.local" http://127.0.0.1/health
minikube delete
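
To exercise the multi-node scheduling and storage behavior listed above, the same flow extends with a second node and a few follow-up checks. A sketch, assuming (hypothetically) that the chart defines a PersistentVolumeClaim and a HorizontalPodAutoscaler:

# Example: multi-node minikube with storage and autoscaling checks
minikube start --nodes=2 --cpus=4 --memory=4096 --addons=metrics-server,storage-provisioner
helm install my-release ./my-chart --wait --timeout 180s
kubectl get nodes            # both nodes should be Ready
kubectl get pvc              # claims should be Bound by the local provisioner
kubectl get hpa              # the metrics server should be feeding the autoscaler
kubectl get pods -o wide     # pods should spread across the two nodes
minikube delete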

Path 3: Cloud Ephemeral (Terraform / Pulumi)#

Spin up a real cloud cluster using infrastructure-as-code, validate against it, then destroy it. This provides full cloud fidelity at the cost of time and money.

What it validates:

  • Everything Path 2 validates, plus:
  • Cloud-specific storage classes and volume behavior
  • Cloud load balancers and DNS integration
  • IAM roles, service accounts, and IRSA/Workload Identity
  • Network policies with cloud CNI plugins
  • Cloud-specific node pools and autoscaling
  • Real network latency and cross-AZ behavior
  • Managed add-ons (CoreDNS, kube-proxy, VPC CNI)

What it misses: Very little from a Kubernetes perspective. May miss organizational policies (SCPs, landing zone constraints) if using a sandbox account.

Resources required: Cloud account with permissions to create and destroy clusters. Active billing. Terraform or Pulumi installed. Cloud CLI tools (aws, gcloud, az).

Setup time: 10-25 minutes for a managed Kubernetes cluster (EKS, GKE, AKS). Add 5-10 minutes for node pools to become ready.

Teardown: terraform destroy – 10-20 minutes. Critical: forgetting teardown costs real money.

Fidelity level: High. This is the real thing. The only gaps are organizational-level policies and multi-cluster scenarios.

# Example: ephemeral EKS cluster for validation
cd terraform/validation-cluster
terraform init
terraform apply -auto-approve
aws eks update-kubeconfig --name validation-cluster
helm install my-release ./my-chart --wait --timeout 300s
# ... run validation tests ...
terraform destroy -auto-approve
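
Because a forgotten teardown keeps billing, it is worth making destruction unconditional. A sketch using a shell trap so terraform destroy runs even when a validation step fails, assuming the same directory layout as the example above:

# Example: guarantee teardown even if a validation step fails
cd terraform/validation-cluster
terraform init
trap 'terraform destroy -auto-approve' EXIT
terraform apply -auto-approve
aws eks update-kubeconfig --name validation-cluster
helm install my-release ./my-chart --wait --timeout 300s
# ... run validation tests ...
# the trap runs terraform destroy automatically when the shell exits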

Path 4: Free-Tier Cloud (Codespaces / Gitpod / Killercoda)#

Use a free cloud development environment that provides a pre-configured container with Docker-in-Docker or a lightweight Kubernetes setup. No local resources needed beyond a browser or SSH client.

What it validates: Equivalent to Path 1 or Path 2 depending on the platform. Codespaces and Gitpod give you Docker and can run kind. Killercoda provides pre-built Kubernetes environments.

What it misses: Same limitations as Path 1/2, plus: resource constraints from the free tier (limited CPU, RAM, storage), session time limits, and no persistent state between sessions.

Resources required: GitHub account (Codespaces), Gitpod account, or Killercoda account. No local Docker, no local compute. Internet connection required.

Setup time: 1-3 minutes for environment provisioning. Pre-configured images reduce tool installation time.

Teardown: Close the session. Environment auto-destroys after timeout.

Fidelity level: Medium-low to medium, depending on platform and resources allocated.

# Example: Codespaces with kind
# In devcontainer.json, enable docker-in-docker feature
# Then in the terminal:
kind create cluster
helm install my-release ./my-chart --wait
kubectl get pods
kind delete cluster
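
The Docker-in-Docker capability comes from a dev container feature. A minimal devcontainer.json sketch, assuming (hypothetically) that kind and helm are installed separately inside the session:

# Example: minimal .devcontainer/devcontainer.json enabling Docker-in-Docker
mkdir -p .devcontainer
cat > .devcontainer/devcontainer.json <<'EOF'
{
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  "features": {
    "ghcr.io/devcontainers/features/docker-in-docker:2": {}
  }
}
EOF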

Comparison Table#

| Dimension | Path 0: Static | Path 1: kind/Docker | Path 2: minikube | Path 3: Cloud | Path 4: Free-Tier |
|---|---|---|---|---|---|
| Compute needed | None | Docker daemon | Docker/hypervisor | Cloud account | Browser only |
| Setup time | Seconds | 30-90 sec | 1-3 min | 10-25 min | 1-3 min |
| Teardown time | None | ~10 sec | ~30 sec | 10-20 min | Automatic |
| Cost | Free | Free | Free | $0.50-5/run | Free (limits) |
| RAM required | Minimal | ~2 GB | ~4-8 GB | N/A (cloud) | N/A (cloud) |
| Fidelity | Low | Medium-low | Medium-high | High | Medium-low |
| Ingress testing | No | Manual setup | Addon | Full | Platform-dependent |
| Storage testing | No | EmptyDir only | Local provisioner | Cloud volumes | Limited |
| Network policy | Syntax only | No enforcement | With Calico addon | Full | No |
| Offline capable | Yes (after install) | Yes | Yes | No | No |
| Session limit | None | None | None | Cost-based | Time-based |

Decision Tree#

Work through these three questions in order. Each narrows the field.

Question 1: What am I validating?#

Syntax, schema, or policy compliance only? Use Path 0. You do not need a running cluster to check whether your YAML is valid or your Helm templates render correctly. Start here regardless, then escalate if needed.

Container startup and basic resource creation? Path 1 is sufficient. If you need to verify that pods schedule, containers start, and services resolve, kind or Docker Compose handles this.

Ingress routing, storage behavior, or addon interaction? You need Path 2 at minimum. Minikube’s addon system provides ingress controllers, storage provisioners, and metrics servers that kind does not include by default.

Cloud-specific behavior (IAM, cloud LBs, managed storage)? Path 3 is the only option that provides real cloud APIs. No local tool can simulate IAM role assumption or cloud load balancer provisioning.

Question 2: What resources do I have?#

No Docker daemon available? You are limited to Path 0 or Path 4. If you have internet access and a GitHub account, Path 4 (Codespaces) gives you Docker without local installation.

Docker available but limited RAM (under 4 GB free)? Path 1 with a single-node kind cluster. Do not attempt minikube with addons.

Docker available with 4+ GB free RAM? Path 1 or Path 2. Choose based on fidelity needs from Question 1.

Cloud account with billing? All paths available. Choose based on fidelity needs.
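
These resource questions can be answered mechanically before choosing a path. A sketch of the checks, assuming a Linux shell and the AWS CLI as the example cloud tool:

# Example: detect which paths the current environment supports
docker info >/dev/null 2>&1 && echo "Docker daemon available: Paths 1-2 possible"
free -m | awk '/^Mem:/ {print "Available RAM (MB): " $7}'
aws sts get-caller-identity >/dev/null 2>&1 && echo "Cloud credentials available: Path 3 possible"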

Question 3: How much fidelity do I need?#

Catching obvious mistakes before a human reviews? Path 0 is sufficient. The goal is to avoid embarrassing the agent with broken YAML or missing required fields.

Confidence that the deployment will work in a real cluster? Path 1 or Path 2. If the change involves only deployments, services, and config, Path 1 is enough. If it touches ingress, storage, or autoscaling, use Path 2.

Certainty that the deployment will work in the specific target cloud? Path 3. There is no substitute for deploying to the actual cloud provider.

Combining the Answers#

The recommended path is the minimum path that satisfies all three answers. If your validation target requires Path 2 but you only have resources for Path 1, you have two options: run Path 1 and document what you could not validate, or use Path 4 to get a more capable environment in the cloud.

Always run Path 0 first regardless of your chosen path. It costs nothing and catches the cheapest errors.

Path Selection in Practice#

Here is how an agent should execute path selection:

1. Run Path 0 validation (lint, schema check, dry-run).
   If errors are found: fix them before escalating to a higher path.
   If clean: proceed to step 2.

2. Determine the minimum fidelity needed:
   - Manifest changes only? -> Path 1 sufficient.
   - Ingress/storage/addon changes? -> Path 2 minimum.
   - Cloud-specific resources? -> Path 3 minimum.

3. Check available resources:
   - Docker available? -> Path 1 or 2.
   - No Docker? -> Path 0 + Path 4.
   - Cloud creds? -> Path 3 available as option.

4. Execute the selected path.
   - Set up environment.
   - Deploy the change.
   - Run verification checks.
   - Capture results.
   - Tear down environment.

5. Report results with fidelity disclaimer:
   "Validated on [path]. This confirms [what was tested].
    Not validated: [what the path cannot test]."

The fidelity disclaimer is important. An agent that says “I validated this on kind” without mentioning that kind does not enforce network policies is giving a false sense of security if the change includes a NetworkPolicy resource. Always state what was and was not covered by the chosen validation path.
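
One way to make the disclaimer mechanical is to emit it from the validation script itself, with the covered and not-covered lists written per path. A sketch for a Path 1 run, drawing on the coverage lists earlier in this article:

# Example: report with an explicit fidelity disclaimer after a Path 1 run
cat <<'EOF'
Validated on Path 1 (kind, single node).
This confirms: manifests apply, pods schedule and become Ready, services resolve in-cluster.
Not validated: ingress routing, NetworkPolicy enforcement, storage class behavior,
LoadBalancer provisioning, multi-node scheduling, cloud IAM.
EOF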

Escalation and Graceful Degradation#

Sometimes the ideal path is not available. When this happens, degrade gracefully rather than skipping validation entirely.

Cannot reach Path 3 (cloud)? Fall back to Path 2 and document that cloud-specific behavior (IAM, LB provisioning, managed storage classes) was not validated. For IAM specifically, you can validate the policy documents syntactically with Path 0 tools.

Cannot reach Path 2 (minikube)? Fall back to Path 1 (kind) and document that ingress routing, storage provisioner behavior, and HPA scaling were not tested. For ingress, you can at least verify the Ingress resource renders correctly with Path 0.
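
A sketch of that Path 0 check, reusing the chart and values file from the earlier examples and assuming yq v4 is available to filter the rendered documents:

# Example: confirm the Ingress resource renders and passes schema checks without a cluster
helm template my-release ./my-chart --values values-prod.yaml | yq 'select(.kind == "Ingress")'
helm template my-release ./my-chart --values values-prod.yaml | kubeconform -strict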

Cannot reach Path 1 (Docker)? Fall back to Path 0 and use Path 4 (Codespaces/Gitpod) if internet is available. If neither is possible, run Path 0 and explicitly flag that no runtime validation was performed.

Cannot reach Path 0? Something is seriously wrong with the environment. Do not proceed with the change. Inform the operator that basic tooling (a YAML linter, helm, kubeconform) must be installed before validation can occur.

The key principle: partial validation with an honest fidelity disclaimer is always better than no validation. An agent should never skip all validation because the ideal path is unavailable.