kubectl Debugging#
When something breaks in Kubernetes, a short sequence of commands will usually surface the cause. Here is every debugging command you will reach for, plus a step-by-step workflow for a pod that will not start.
Logs#
kubectl logs <pod-name> -n <namespace> # basic
kubectl logs <pod-name> -c <container-name> -n <namespace> # specific container
kubectl logs <pod-name> --previous -n <namespace> # previous crash (essential for CrashLoopBackOff)
kubectl logs -f <pod-name> -n <namespace> # stream in real-time
kubectl logs --since=5m <pod-name> -n <namespace> # last 5 minutes
kubectl logs -l app=payments-api -n payments-prod --all-containers # all pods matching label
The --previous flag is critical for crash-looping pods where the current container has no logs yet. The --all-containers flag captures init containers and sidecars.
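These flags combine well in a loop: the sketch below gathers --previous logs from every pod matching a label, which is handy when a whole deployment is crash-looping. The kubectl stub at the top exists only so the sketch runs without a cluster (delete it for real use), and the app=payments-api selector, pod names, and log line are invented sample data.

```shell
#!/bin/sh
# Stub standing in for the real kubectl so this sketch runs without a
# cluster; delete this function to run it for real. The pod names and
# log line are invented sample data.
kubectl() {
  case "$1" in
    get)  printf 'pod/payments-api-7d4b9\npod/payments-api-f82cd\n' ;;
    logs) printf 'panic: connection refused\n' ;;
  esac
}

# Gather the previous (pre-crash) logs from every pod behind one label.
for pod in $(kubectl get pods -l app=payments-api -o name); do
  echo "=== $pod (previous) ==="
  kubectl logs "$pod" --previous --tail=20
done
```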
describe: The First Stop for Pod Issues#
describe shows the full picture: pod spec, conditions, events, and current state.
kubectl describe pod <pod-name> -n <namespace>
Critical sections to check in the output:
- Status – Is it Pending, Running, CrashLoopBackOff, ImagePullBackOff?
- Conditions – Look for PodScheduled, Initialized, ContainersReady, Ready. A False condition tells you exactly where things stalled.
- Events – Sorted chronologically at the bottom. This is where you find scheduling failures, image pull errors, probe failures, and OOMKills.
Works on non-pod resources too: kubectl describe node, kubectl describe service, kubectl describe ingress.
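Two of the findings describe surfaces most often, probe failures and OOMKills, trace back to a handful of spec fields. For orientation, a minimal, hypothetical container spec fragment showing exactly those fields (the image name and numbers are placeholders):

```yaml
# Hypothetical spec fragment: the fields behind two common describe findings.
containers:
  - name: app
    image: payments-api:1.4.2      # placeholder image
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"            # exceeding this is what produces an OOMKill
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 10      # too small a delay causes spurious probe failures
      periodSeconds: 5
```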
Events#
kubectl get events -n <namespace> --sort-by='.lastTimestamp' # sorted by time
kubectl get events -n <namespace> --field-selector type=Warning # warnings only
kubectl get events -A --sort-by='.lastTimestamp' # cluster-wide
Events expire after about an hour. If your pod failed and you waited too long, they may be gone.
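Because events age out, snapshot them as soon as you notice a failure. A minimal sketch of such a helper; the kubectl stub exists only so the script runs outside a cluster, so delete it for real use.

```shell
#!/bin/sh
# Stub standing in for the real kubectl so this sketch runs without a
# cluster; delete this function to use it for real.
kubectl() {
  printf 'apiVersion: v1\nkind: EventList\nitems: []\n'
}

ns=${1:-default}
out="events-${ns}-$(date +%Y%m%d-%H%M%S).yaml"

# Dump the namespace's events, sorted by time, before they expire.
kubectl get events -n "$ns" --sort-by='.lastTimestamp' -o yaml > "$out"
echo "saved to $out"
```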
exec: Get a Shell Inside a Running Container#
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh
Use /bin/sh over /bin/bash – many minimal images lack bash. For distroless images, exec will not work; use debug containers instead. Run single commands without an interactive shell:
kubectl exec <pod-name> -n <namespace> -- cat /etc/app/config.yaml
kubectl exec <pod-name> -n <namespace> -- wget -qO- http://localhost:8080/health
port-forward: Access Services Without Ingress#
Forward a local port to a pod or service:
# Forward local port 8080 to pod port 8080
kubectl port-forward pod/<pod-name> 8080:8080 -n <namespace>
# Forward to a service (picks a backing pod automatically)
kubectl port-forward svc/<service-name> 8080:80 -n <namespace>
Indispensable for testing unexposed services and debugging database connections.
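A common pattern is to run the forward in the background, hit the service, then tear the tunnel down. A sketch with kubectl and curl stubbed so it runs anywhere; delete both stubs for real use, and treat the service name and /health path as invented examples.

```shell
#!/bin/sh
# Stubs standing in for kubectl and curl so the sketch runs without a
# cluster; delete both functions to use the pattern for real.
kubectl() { sleep 3; }
curl() { printf 'ok\n'; }

# Forward in the background, remembering the pid for cleanup.
kubectl port-forward svc/payments-api 8080:80 -n payments-prod &
pf_pid=$!
sleep 1                                  # give the tunnel a moment to open

curl -s http://localhost:8080/health     # probe through the tunnel

kill "$pf_pid" 2>/dev/null               # tear the forward down
```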
top: Resource Usage#
kubectl top pods -n <namespace> # per-pod CPU and memory
kubectl top pods -n <namespace> --sort-by=cpu # sorted by CPU
kubectl top nodes # node-level usage
Requires metrics-server. On minikube: minikube addons enable metrics-server.
Debug Containers (Ephemeral Containers)#
When a container is distroless or has no shell, attach an ephemeral debug container:
# Attach a debug container with common tools
kubectl debug -it <pod-name> -n <namespace> \
  --image=busybox:latest --target=<container-name>
With --target, the debug container shares the process namespace of the specified container, so you can see its processes. For node-level debugging:
kubectl debug node/<node-name> -it --image=ubuntu:22.04
# Host filesystem available at /host
Copy a failing pod with a different entrypoint (useful when the process crashes immediately):
kubectl debug <pod-name> -it --copy-to=debug-pod \
  --container=app --image=busybox -- /bin/sh
jsonpath and custom-columns: Extracting Specific Data#
# jsonpath: extract specific fields
kubectl get pods -n <namespace> -o jsonpath='{.items[*].status.podIP}'
kubectl get pod <pod-name> -n <namespace> \
-o jsonpath='{.status.containerStatuses[0].restartCount}'
# custom-columns: readable tabular output
kubectl get pods -n <namespace> \
  -o custom-columns='NAME:.metadata.name,STATUS:.status.phase,RESTARTS:.status.containerStatuses[0].restartCount,NODE:.spec.nodeName'
Step-by-Step: Pod Will Not Start#
Work through these steps in order. Each narrows the problem.
Step 1: Get the pod status.
kubectl get pod <pod-name> -n <namespace>
The STATUS column categorizes the problem:
- Pending – Scheduling or resource issue
- ImagePullBackOff – Wrong image name/tag or missing pull secret
- CrashLoopBackOff – Container starts and immediately exits
- Init:Error – Init container failing
- ContainerCreating (stuck) – Volume mount or secret reference problem
Step 2: Describe the pod.
kubectl describe pod <pod-name> -n <namespace>
Read the Events section bottom-to-top. Common findings:
- FailedScheduling – Node resources exhausted or requests too high
- Failed to pull image – Wrong image name, tag, or missing pull secret
- FailedMount – Referenced Secret or ConfigMap does not exist
- Back-off restarting failed container – Container process is exiting; check logs
Step 3: Check logs.
# If the pod has restarted, get previous container logs
kubectl logs <pod-name> --previous -n <namespace>
# If it is an init container failing
kubectl logs <pod-name> -c <init-container-name> -n <namespace>
Step 4: Verify dependencies exist.
kubectl get configmap <name> -n <namespace> # referenced ConfigMap?
kubectl get secret <name> -n <namespace> # referenced Secret?
kubectl get pvc -n <namespace> # PVC exists and bound?
Step 5: For scheduling failures, check node capacity.
kubectl describe nodes | grep -A 5 "Allocated resources"
This sequence covers the vast majority of pod startup failures. Work through it methodically and you will find the root cause within minutes.
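For repeated use, the first three steps can be bundled into one helper. This is a sketch under two assumptions: triage is a made-up name, and the kubectl stub (with invented output) is there only so the script runs outside a cluster; delete the stub for real use.

```shell
#!/bin/sh
# Stub standing in for the real kubectl, printing invented output;
# delete this function to run against an actual cluster.
kubectl() {
  case "$1" in
    get)      printf 'NAME   READY  STATUS            RESTARTS\npod-x  0/1    CrashLoopBackOff  7\n' ;;
    describe) printf 'Events:\n  Warning  BackOff  restarting failed container\n' ;;
    logs)     printf 'fatal: missing env var DATABASE_URL\n' ;;
  esac
}

# Steps 1-3 of the workflow: status, then events, then previous logs.
triage() {
  pod=$1; ns=${2:-default}
  kubectl get pod "$pod" -n "$ns"
  kubectl describe pod "$pod" -n "$ns" | sed -n '/^Events:/,$p'
  kubectl logs "$pod" --previous -n "$ns" --tail=50 2>/dev/null || true
}

triage pod-x payments-prod
```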