Jenkins Debugging#

Jenkins failures fall into a few categories: builds stuck waiting, cryptic pipeline errors, performance degradation, and Kubernetes agent pods that refuse to launch.

Builds Stuck in Queue#

When a build sits in the queue and never starts, check the queue tooltip in the UI – it tells you why. Common causes:

No agents with matching labels. The pipeline requests agent { label 'docker-arm64' } but no agent has that label. Check Manage Jenkins > Nodes to see available labels.

All executors busy. Every agent is at its concurrent build limit. Increase numExecutors, add agents, or use the Kubernetes plugin for elastic capacity.

Node is offline. Check the node’s log in Manage Jenkins > Nodes > [node name] > Log. Common causes: network issues, SSH key expiry, or disk full on the agent.

To check the queue programmatically:

curl -s "http://jenkins:8080/queue/api/json?pretty=true" --user admin:$TOKEN | jq '.items[] | {task: .task.name, why}'
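The why strings returned by the queue API map fairly directly onto the causes listed above. A minimal sketch of that mapping follows. classify_queue_reason is a hypothetical helper, and the matched phrases are assumptions based on typical Jenkins queue messages, which may differ between versions and locales.

```shell
# Hypothetical helper: map a queue "why" message to one of the causes above.
# The matched phrases are assumptions about Jenkins' wording; adjust as needed.
classify_queue_reason() {
    case "$1" in
        *"no nodes with the label"*) echo "label-mismatch" ;;
        *"next available executor"*) echo "executors-busy" ;;
        *"offline"*)                 echo "node-offline" ;;
        *)                           echo "other" ;;
    esac
}

# Usage sketch (not run here): summarize the reasons currently in the queue.
# curl -s "http://jenkins:8080/queue/api/json" --user "admin:$TOKEN" \
#   | jq -r '.items[].why' \
#   | while IFS= read -r why; do classify_queue_reason "$why"; done \
#   | sort | uniq -c
```

A summary like this is mainly useful when the queue is long: one dominant category usually points at a single fix (a missing label, too few executors, or one dead agent).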

Pipeline Failures#

Script Approval Errors#

The Groovy sandbox restricts what pipeline code can do. When a pipeline calls a method that is not on the allowlist, the build fails with:

Scripts not permitted to use method org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper getRawBuild

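Fix: Go to Manage Jenkins > In-process Script Approval and approve the pending signature, after confirming that the method is safe to expose. For trusted shared libraries, define them as global libraries in the system configuration, which run outside the sandbox.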

Credential Not Found#

ERROR: Could not find credentials entry with ID 'my-cred-id'

Check: (1) the credential ID is spelled correctly (IDs are case-sensitive), (2) the credential exists in a scope the job can see (folder-scoped credentials are not visible outside that folder), (3) the credential type matches the binding. The credentials('id') helper in an environment block handles secret text and username-password credentials directly; for SSH keys, the explicit withCredentials form with sshUserPrivateKey makes the binding unambiguous.

// Correct binding for SSH key
withCredentials([sshUserPrivateKey(credentialsId: 'my-ssh-key', keyFileVariable: 'SSH_KEY')]) {
    sh 'ssh -i $SSH_KEY user@host "echo connected"'
}

Workspace Issues#

Symptoms include java.io.IOException: Cannot delete workspace, or failures caused by stale files left over from previous runs. Use cleanWs() (from the Workspace Cleanup plugin) in a post { cleanup { } } block – cleanup runs after all other post-conditions:

post {
    cleanup {
        cleanWs()
    }
}

Stale Checkout#

When a multibranch pipeline does not pick up new commits, the SCM polling or branch scan interval may be too long. For GitHub, configure a webhook that posts push events to http://jenkins:8080/github-webhook/ so that changes are detected immediately; other Git providers use their own plugin-specific endpoints.

Jenkins Running Slow#

Too Many Stored Builds#

Without rotation, $JENKINS_HOME/jobs/ grows unbounded. Add to every pipeline:

options {
    buildDiscarder(logRotator(numToKeepStr: '30', daysToKeepStr: '90'))
}

To clean up retroactively, use the Script Console (Manage Jenkins > Script Console):

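// Delete builds numbered more than 50 below each job's latest build.
// Irreversible: try it on a single job before running it across the instance.
Jenkins.instance.allItems(Job).each { job ->
    job.builds.findAll { it.number < job.lastBuild.number - 50 }.each { it.delete() }
}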

Plugin Bloat#

Every plugin increases memory usage and boot time. Audit periodically:

curl -s "http://jenkins:8080/pluginManager/api/json?depth=1" --user admin:$TOKEN \
  | jq '[.plugins[] | {name: .shortName, active: .active, enabled: .enabled}] | sort_by(.name)'

Remove plugins that are not actively used. The Plugin Usage plugin helps identify them.

Heap Tuning#

Default heap settings are often too low for busy instances. Set heap explicitly:

JAVA_OPTS="-Xms2g -Xmx4g -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError"

In the Helm chart:

controller:
  javaOpts: "-Xms2g -Xmx4g -XX:+UseG1GC"
  resources:
    requests:
      memory: "3Gi"
    limits:
      memory: "5Gi"

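Set the memory request to about 75% of the maximum heap (3Gi for -Xmx4g here) and the limit to the maximum heap plus overhead for metaspace, thread stacks, and other non-heap memory (5Gi here). If Jenkins freezes periodically, take a thread dump via the /threadDump URL and look for blocked threads.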
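To get a quick read on a thread dump, counting blocked threads is often enough to tell whether lock contention is the problem. The sketch below assumes the dump is piped in as text and that each blocked thread has exactly one line containing the word BLOCKED, as in standard JVM thread dumps; count_blocked_threads is a hypothetical helper name.

```shell
# Count threads reported as BLOCKED in a thread dump read from stdin.
# Assumption: one line containing "BLOCKED" per blocked thread.
count_blocked_threads() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the exit status clean.
    grep -c 'BLOCKED' || true
}

# Usage sketch (not run here):
# curl -s "http://jenkins:8080/threadDump" --user "admin:$TOKEN" | count_blocked_threads
```

A count that stays high across several dumps taken a few seconds apart is a stronger signal than a single snapshot.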

Kubernetes Agent Pods Not Launching#

Pod Stays Pending#

kubectl get pods -n jenkins -l jenkins=agent
kubectl describe pod <pod-name> -n jenkins

Look at the Events section. Common causes: insufficient CPU/memory (lower resource requests or scale the cluster), missing node selectors or tolerations for tainted nodes, or a PVC that cannot bind (wrong StorageClass or quota exceeded).
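Since kubectl describe prints a lot before the part that matters, a small filter can cut the output down to the Events section. This is a sketch that assumes the section starts at a line beginning with "Events:" and runs to the end of the output; pod_events is a hypothetical helper name.

```shell
# Print only the Events section of `kubectl describe pod` output read from stdin.
# Assumption: the section begins at a line starting with "Events:" and is the last section.
pod_events() {
    awk '/^Events:/ { show = 1 } show'
}

# Usage sketch (not run here):
# kubectl describe pod <pod-name> -n jenkins | pod_events
```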

Image Pull Errors#

ErrImagePull or ImagePullBackOff means Kubernetes cannot pull the container image. Verify the image name and tag are correct, private registries have imagePullSecrets configured on the pod or ServiceAccount, and nodes have network access to the registry.
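When many agent pods are in flight, it can help to list just the ones failing on image pulls. The sketch below filters the default kubectl get pods table and assumes STATUS is the third column, which does not hold for custom output formats; image_pull_failures is a hypothetical helper name.

```shell
# List pods whose STATUS column reports an image pull failure.
# Assumption: input is the default `kubectl get pods` table (NAME READY STATUS ...).
image_pull_failures() {
    # The trailing-anchor match also catches prefixed states such as Init:ImagePullBackOff.
    awk '$3 ~ /(ErrImagePull|ImagePullBackOff)$/ { print $1 }'
}

# Usage sketch (not run here):
# kubectl get pods -n jenkins -l jenkins=agent | image_pull_failures
```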

JNLP Connection Fails#

The agent pod starts but the build never begins, which usually means the JNLP container cannot connect back to Jenkins. Verify that jenkinsTunnel matches the agent listener service name and port, that port 50000 is exposed via a Service, and that network policies allow traffic from agent pods to the controller.

Check the JNLP container logs:

kubectl logs <pod-name> -c jnlp -n jenkins

Pod Deleted Before Build Completes#

If activeDeadlineSeconds is set too low on the pod template, Kubernetes kills the pod before the build finishes. Remove the deadline or increase it. Also check for OOMKilled in kubectl describe pod output, indicating the container exceeded its memory limit.
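The two causes can be told apart from the describe output, provided the pod still exists when you look. The sketch below is a rough classifier: OOMKilled comes from the text above, while matching on DeadlineExceeded is an assumption about the reason Kubernetes reports for an exceeded activeDeadlineSeconds. termination_cause is a hypothetical helper name.

```shell
# Classify why a pod ended, based on `kubectl describe pod` output read from stdin.
# Assumption: an exceeded deadline appears as the reason "DeadlineExceeded".
termination_cause() {
    input="$(cat)"
    case "$input" in
        *OOMKilled*)        echo "oom-killed" ;;
        *DeadlineExceeded*) echo "deadline-exceeded" ;;
        *)                  echo "unknown" ;;
    esac
}

# Usage sketch (not run here):
# kubectl describe pod <pod-name> -n jenkins | termination_cause
```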

Reading Console Output Effectively#

Jenkins console output can run to thousands of lines. Use the “Pipeline Steps” view (classic UI) or Blue Ocean to jump directly to the failing step. Search raw console output for ERROR, FATAL, Exception, or exit code. The Timestamper and AnsiColor plugins make logs significantly more readable. For programmatic access:

curl "http://jenkins:8080/job/myapp/lastBuild/consoleText" --user admin:$TOKEN
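The search terms mentioned above can be applied to the downloaded console text in one pass. This is a minimal sketch; find_failures is a hypothetical helper name, and the marker list is only the four terms named in this section, so extend it for tool-specific messages.

```shell
# Print numbered console lines (from stdin) that contain a common failure marker.
find_failures() {
    # "|| true" keeps the exit status clean when nothing matches.
    grep -nE 'ERROR|FATAL|Exception|exit code' || true
}

# Usage sketch (not run here):
# curl -s "http://jenkins:8080/job/myapp/lastBuild/consoleText" --user "admin:$TOKEN" | find_failures
```

The line numbers make it easy to open the full log at the first match, which is usually closer to the root cause than the last one.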