Kubernetes CrashLoopBackOff — Complete Fix Guide

What Is CrashLoopBackOff?

A state where a Pod's container starts → crashes → restarts in a loop, with an increasing wait between restarts.
Kubernetes doubles the backoff delay after each crash, starting at 10 seconds and capping at 5 minutes.

NAME          READY   STATUS             RESTARTS   AGE
my-app-xyz    0/1     CrashLoopBackOff   5          8m
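
The doubling is easier to picture as a schedule. The sketch below is only an illustration: it assumes the kubelet's default 10-second starting delay and the 5-minute cap, and `backoff_schedule` is a hypothetical helper name, not a kubectl feature.

```shell
# Print the approximate wait before each of the first N restarts.
# Assumption: the delay starts at 10s, doubles each time, and is capped at 300s.
backoff_schedule() {
  restarts=$1
  delay=10
  cap=300
  i=1
  while [ "$i" -le "$restarts" ]; do
    echo "restart $i: wait ${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
}

backoff_schedule 6
```

By the sixth restart the wait has reached the cap, which is why a Pod with only a handful of restarts can already be several minutes old.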

Five Root Causes

1. Application Crash

The most common cause. The app throws an exception or panic immediately on startup.

# Read logs from the already-terminated container
kubectl logs my-app-xyz --previous

# Last 100 lines only
kubectl logs my-app-xyz --previous --tail=100

2. Liveness Probe Failure

The app is running fine, but the probe configuration is too aggressive.

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Too short if the app needs longer than 30s to start; the probe fires before it is ready
  periodSeconds: 10
  failureThreshold: 3

Fix: Set initialDelaySeconds above the measured startup time, or add a startupProbe so liveness checks begin only after the app has started.
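
To judge whether a probe is too aggressive, it can help to estimate how long the container has before failed probes trigger a restart. The function below is a rough sketch, not kubelet's exact timing: it ignores timeoutSeconds and scheduling jitter, and `liveness_budget` is a hypothetical name.

```shell
# Rough time (in seconds) a container has before failed liveness probes restart it.
# Assumption: the first probe runs at initialDelaySeconds, then one per periodSeconds,
# and the restart happens on the Nth consecutive failure (failureThreshold).
liveness_budget() {
  initial_delay=$1
  period=$2
  failure_threshold=$3
  echo $((initial_delay + period * (failure_threshold - 1)))
}

# Values from the probe above: initialDelaySeconds=30, periodSeconds=10, failureThreshold=3
liveness_budget 30 10 3
```

With the values above this prints 50, so an app that needs about a minute to start would be restarted before it ever becomes healthy.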

3. OOMKilled (Memory Limit Exceeded)

kubectl describe pod my-app-xyz | grep -A5 "Last State"
# Last State:  Terminated
#   Reason:    OOMKilled

resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"   # Set above actual peak usage
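
The exit code of the last terminated container is another quick signal. The sketch below maps a few common codes to a likely meaning; `explain_exit_code` is a hypothetical helper, and the commented kubectl call assumes the Pod has a single container at index 0.

```shell
# Map a container exit code to a likely meaning.
# Codes above 128 mean the process was killed by a signal (code minus 128).
explain_exit_code() {
  code=$1
  case "$code" in
    0)   echo "exited normally" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "killed by SIGKILL (typical for OOMKilled)" ;;
    143) echo "terminated by SIGTERM" ;;
    *)   echo "application error or other signal" ;;
  esac
}

# Against a live cluster (pod name from the examples above; first container assumed):
# code=$(kubectl get pod my-app-xyz \
#   -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
# explain_exit_code "$code"
explain_exit_code 137
```

Exit code 137 alone does not prove an out-of-memory kill, so confirm it against the Reason field shown above.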

4. ConfigMap / Secret Mount Failure

The referenced ConfigMap or Secret doesn’t exist.

kubectl describe pod my-app-xyz | grep -A10 "Events"
# Warning  Failed    MountVolume.SetUp failed: secret "db-secret" not found

# Verify Secret exists
kubectl get secret db-secret -n your-namespace
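
When a Pod mounts several Secrets, checking them one by one gets tedious. The following is a minimal sketch that lists Secret volumes referenced by a Pod and reports the ones missing from the namespace. `missing_secrets` is a hypothetical helper, and it only covers Secrets mounted as volumes, not those referenced through environment variables.

```shell
# Report Secrets that a Pod mounts as volumes but that do not exist in the namespace.
missing_secrets() {
  pod=$1
  ns=$2
  # jsonpath prints the secretName of every Secret-backed volume, space-separated
  for name in $(kubectl get pod "$pod" -n "$ns" \
      -o jsonpath='{.spec.volumes[*].secret.secretName}'); do
    if ! kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; then
      echo "missing: $name"
    fi
  done
}
```

Calling `missing_secrets my-app-xyz your-namespace` prints one line per missing Secret and nothing when all of them exist.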

5. Missing Entrypoint Permission

kubectl describe pod my-app-xyz | grep "Error"
# exec /app/start.sh: permission denied

Grant execute permission in your Dockerfile:

COPY start.sh /app/start.sh
RUN chmod +x /app/start.sh
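
The same problem can be caught before the image is built. As a sketch, a small pre-build check (the name `check_executable` is hypothetical) can fail fast when the script in the build context lacks the execute bit:

```shell
# Fail early if an entrypoint script is not executable.
check_executable() {
  script=$1
  if [ ! -x "$script" ]; then
    echo "not executable: $script" >&2
    return 1
  fi
  echo "ok: $script"
}
```

For example, `check_executable start.sh && docker build -t your-image:tag .` only builds when the check passes.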

Step-by-Step Debugging Checklist

# 1. Check Pod status
kubectl get pod my-app-xyz -o wide

# 2. Check Events (most causes show up here)
kubectl describe pod my-app-xyz

# 3. Read previous container logs
kubectl logs my-app-xyz --previous

# 4. Shell into the container (only works while it is running)
kubectl exec -it my-app-xyz -- /bin/sh

# 5. Run a debug Pod with the same image (--command replaces the image's entrypoint, so the crashing startup code is skipped)
kubectl run debug --image=your-image:tag --restart=Never --command -- sleep 3600
kubectl exec -it debug -- /bin/sh
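
The first three steps of the checklist are read-only, so they can be bundled into one helper. This is a convenience sketch rather than a standard tool: `triage_pod` is a hypothetical name, and the namespace defaults to `default` when omitted.

```shell
# Run checklist steps 1-3 in one go, with a header before each section.
triage_pod() {
  pod=$1
  ns=${2:-default}
  echo "== status =="
  kubectl get pod "$pod" -n "$ns" -o wide
  echo "== events =="
  kubectl describe pod "$pod" -n "$ns"
  echo "== previous logs =="
  kubectl logs "$pod" -n "$ns" --previous --tail=100
}
```

Running `triage_pod my-app-xyz your-namespace > triage.txt` keeps a snapshot you can attach to an incident ticket before the Pod is restarted again.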

Quick Cause Identification

Symptom                    Likely Cause
Java exception in logs     Application crash
`OOMKilled`                Memory limit exceeded
`permission denied`        File execute permission
`secret not found`         Missing Secret
No logs + probe failure    Liveness probe misconfiguration
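
The lookup table above can be expressed as a small matcher for log or event lines. This is a sketch with simplified string patterns, and `classify_symptom` is a hypothetical helper; real messages vary, so treat the result as a hint rather than a diagnosis.

```shell
# Map a log or event line to the likely cause from the table above.
classify_symptom() {
  case "$1" in
    *OOMKilled*)            echo "Memory limit exceeded" ;;
    *"permission denied"*)  echo "File execute permission" ;;
    *secret*"not found"*)   echo "Missing Secret" ;;
    *Exception*|*panic*)    echo "Application crash" ;;
    *)                      echo "Unknown: check probe configuration and Events" ;;
  esac
}

classify_symptom 'Reason: OOMKilled'
```

An empty or unmatched line falls through to the last branch, which corresponds to the "no logs + probe failure" row.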

Automatic Detection with TestForge

TestForge’s Scan feature automatically monitors Pod status after each deploy.
On CrashLoopBackOff, it sends a Slack alert and analyzes the probable root cause, so you don't have to dig through logs manually.
