kubectl Debugging#

When something breaks in Kubernetes, you need to move through a specific sequence of commands. Here is every debugging command you will reach for, plus a step-by-step workflow for a pod that will not start.

Logs#

kubectl logs <pod-name> -n <namespace>                           # basic
kubectl logs <pod-name> -c <container-name> -n <namespace>       # specific container
kubectl logs <pod-name> --previous -n <namespace>                # previous crash (essential for CrashLoopBackOff)
kubectl logs -f <pod-name> -n <namespace>                        # stream in real-time
kubectl logs --since=5m <pod-name> -n <namespace>                # last 5 minutes
kubectl logs -l app=<label-value> -n <namespace> --all-containers   # all pods matching label

The --previous flag is critical for crash-looping pods where the current container has no logs yet. The --all-containers flag captures init containers and sidecars.

describe: The First Stop for Pod Issues#

describe shows the full picture: pod spec, conditions, events, and current state.

kubectl describe pod <pod-name> -n <namespace>

Critical sections to check in the output:

  • Status – Is it Pending, Running, CrashLoopBackOff, ImagePullBackOff?
  • Conditions – Look for PodScheduled, Initialized, ContainersReady, Ready. A False condition tells you exactly where things stalled.
  • Events – Sorted chronologically at the bottom. This is where you find scheduling failures, image pull errors, probe failures, and OOMKills.

describe works on non-pod resources too: kubectl describe node, kubectl describe service, kubectl describe ingress.
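The Conditions block can also be pulled out directly with jsonpath, which is handy in scripts (a sketch; substitute your pod name and namespace):

```shell
# Print one line per condition: type=status (e.g. Ready=False)
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
```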

Events#

kubectl get events -n <namespace> --sort-by='.lastTimestamp'       # sorted by time
kubectl get events -n <namespace> --field-selector type=Warning    # warnings only
kubectl get events -A --sort-by='.lastTimestamp'                   # cluster-wide

Events expire after about an hour (the API server's default event TTL). If your pod failed and you waited too long, they may be gone.
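If you expect a failure to recur, streaming events sidesteps the expiry problem entirely:

```shell
# Stream warnings live as they are recorded
kubectl get events -n <namespace> --field-selector type=Warning --watch
```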

exec: Get a Shell Inside a Running Container#

kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

Prefer /bin/sh over /bin/bash – many minimal images do not ship bash. Distroless images have no shell at all, so exec cannot give you one; use debug containers instead. You can also run single commands without an interactive shell:

kubectl exec <pod-name> -n <namespace> -- cat /etc/app/config.yaml
kubectl exec <pod-name> -n <namespace> -- wget -qO- http://localhost:8080/health
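Two more one-shot commands that come up constantly (tool availability depends on the image – nslookup, for instance, is not in every base image):

```shell
kubectl exec <pod-name> -n <namespace> -- env                      # environment as the app sees it
kubectl exec <pod-name> -n <namespace> -- nslookup <service-name>  # in-cluster DNS check
```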

port-forward: Access Services Without Ingress#

Forward a local port to a pod or service:

# Forward local port 8080 to pod port 8080
kubectl port-forward pod/<pod-name> 8080:8080 -n <namespace>

# Forward to a service (picks a backing pod automatically)
kubectl port-forward svc/<service-name> 8080:80 -n <namespace>

Indispensable for testing unexposed services and debugging database connections.
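For example, to reach an in-cluster Postgres from a local client (a sketch, assuming a Postgres service on port 5432 and psql installed locally; names are illustrative):

```shell
# Terminal 1: forward local 5432 to the database service
kubectl port-forward svc/<db-service> 5432:5432 -n <namespace>

# Terminal 2: connect as if the database were local
psql -h 127.0.0.1 -p 5432 -U <user> <database>
```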

top: Resource Usage#

kubectl top pods -n <namespace>                    # per-pod CPU and memory
kubectl top pods -n <namespace> --sort-by=cpu      # sorted by CPU
kubectl top nodes                                  # node-level usage

Requires metrics-server. On minikube: minikube addons enable metrics-server.
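If top returns an error, verify the metrics pipeline first (the deployment name may vary by install method):

```shell
kubectl get deployment metrics-server -n kube-system   # is it running?
kubectl top pods -A --sort-by=memory                   # cluster-wide, heaviest first
```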

Debug Containers (Ephemeral Containers)#

When a container is distroless or has no shell, attach an ephemeral debug container:

# Attach a debug container with common tools
kubectl debug -it <pod-name> -n <namespace> \
  --image=busybox:latest --target=<container-name>

The --target flag shares the process namespace with the specified container. For node-level debugging:

kubectl debug node/<node-name> -it --image=ubuntu:22.04
# Host filesystem available at /host

Copy a failing pod with a different entrypoint (useful when the process crashes immediately):

kubectl debug <pod-name> -n <namespace> -it --copy-to=debug-pod \
  --container=<container-name> --image=busybox -- /bin/sh
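Because --target shares the target container's process namespace, the debug container can see its processes and reach its filesystem through /proc. A sketch, assuming the application runs as PID 1 (the config path is the illustrative one from earlier):

```shell
# Run these inside the debug container
ps aux                                   # target's processes are visible
cat /proc/1/root/etc/app/config.yaml     # target's root filesystem via /proc
```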

jsonpath and custom-columns: Extracting Specific Data#

# jsonpath: extract specific fields
kubectl get pods -n <namespace> -o jsonpath='{.items[*].status.podIP}'
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].restartCount}'

# custom-columns: readable tabular output
kubectl get pods -n <namespace> \
  -o custom-columns='NAME:.metadata.name,STATUS:.status.phase,RESTARTS:.status.containerStatuses[0].restartCount,NODE:.spec.nodeName'
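A field selector is often the quickest filter of all, for example to surface only unhealthy pods:

```shell
# Every pod whose phase is not Running
kubectl get pods -n <namespace> --field-selector status.phase!=Running
```

Note that a Running phase does not guarantee readiness; combine this with custom-columns when you also need the Ready condition.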

Step-by-Step: Pod Will Not Start#

Work through these steps in order. Each narrows the problem.

Step 1: Get the pod status.

kubectl get pod <pod-name> -n <namespace>

The STATUS column categorizes the problem:

  • Pending – Scheduling or resource issue
  • ImagePullBackOff – Wrong image name/tag or missing pull secret
  • CrashLoopBackOff – Container starts and immediately exits
  • Init:Error – Init container failing
  • ContainerCreating (stuck) – Volume mount or secret reference problem
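While you iterate on a fix, watch the status change in place instead of re-running get (the -o wide columns add the node and pod IP):

```shell
kubectl get pod <pod-name> -n <namespace> -o wide --watch
```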

Step 2: Describe the pod.

kubectl describe pod <pod-name> -n <namespace>

Read the Events section bottom-to-top. Common findings:

  • FailedScheduling – Node resources exhausted or requests too high
  • Failed to pull image – Wrong image name, tag, or missing pull secret
  • FailedMount – Referenced Secret or ConfigMap does not exist
  • Back-off restarting failed container – Container process is exiting; check logs
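For image pull failures specifically, print exactly what the pod is trying to pull and which pull secrets it references:

```shell
# What image is it pulling, and with which pull secrets?
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.spec.containers[*].image}{"\n"}{.spec.imagePullSecrets[*].name}{"\n"}'
```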

Step 3: Check logs.

# If the pod has restarted, get previous container logs
kubectl logs <pod-name> --previous -n <namespace>

# If it is an init container failing
kubectl logs <pod-name> -c <init-container-name> -n <namespace>
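If the logs are empty or unhelpful, the last exit code often narrows things down (137 means the container was SIGKILLed, commonly an OOMKill):

```shell
# Exit code of the most recent failed run
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
```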

Step 4: Verify dependencies exist.

kubectl get configmap <name> -n <namespace>    # referenced ConfigMap?
kubectl get secret <name> -n <namespace>       # referenced Secret?
kubectl get pvc -n <namespace>                 # PVC exists and bound?
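To see which ConfigMaps and Secrets the pod actually mounts, query the spec directly (a sketch; this covers volume references only, not envFrom):

```shell
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.spec.volumes[*].configMap.name} {.spec.volumes[*].secret.secretName}'
```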

Step 5: For scheduling failures, check node capacity.

kubectl describe nodes | grep -A 5 "Allocated resources"
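Taints are the other common scheduling blocker; a quick sketch to list them per node:

```shell
# One line per node: name, then any taint keys
kubectl get nodes \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.taints[*].key}{"\n"}{end}'
```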

This sequence covers the vast majority of pod startup failures. Work through it methodically and you will find the root cause within minutes.