Jobs & CronJobs
Deployments and StatefulSets manage long-running processes that should never stop. Jobs manage the opposite: tasks that run to completion and then stop. A Job tracks success and failure, handles retries, and can run multiple pods in parallel for throughput. A CronJob schedules a Job on a repeating calendar schedule — the Kubernetes equivalent of a Unix cron.
What Is a Job?
A Job creates one or more pods, runs them until the specified number complete successfully, and then stops. Unlike a Deployment, where restartPolicy: Always is required, a Job must use restartPolicy: OnFailure or restartPolicy: Never.
Common use cases:
- Database migrations before a new app version deploys
- One-time data imports or exports
- Batch ML model training runs
- Sending a mass email or notification
- Report generation on demand
Job YAML
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  completions: 1                # how many pods must succeed
  parallelism: 1                # how many pods run at once
  backoffLimit: 4               # retry up to 4 times on failure
  activeDeadlineSeconds: 600    # kill the job after 10 min
  template:
    spec:
      restartPolicy: OnFailure  # Never or OnFailure (not Always)
      containers:
        - name: migrate
          image: myapp:1.0
          command: ["./migrate", "--up"]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
| Field | Description |
|---|---|
| completions | Total successful pod completions required. Default 1. |
| parallelism | Max pods running simultaneously. Default 1. |
| backoffLimit | Retries before the Job is marked Failed. Default 6. |
| activeDeadlineSeconds | Hard timeout. Job is killed if it runs past this. Overrides backoffLimit. |
| ttlSecondsAfterFinished | Auto-delete the Job N seconds after it completes. Keeps the cluster tidy. |
Jobs require restartPolicy: OnFailure or restartPolicy: Never. Always is for long-running services and will cause a validation error if used in a Job spec.
Parallelism & Completions
Jobs support three execution patterns, selected by how you set completions and parallelism:
| Pattern | completions | parallelism | Use case |
|---|---|---|---|
| Single run | 1 (default) | 1 (default) | One task, one pod at a time; retried on failure up to backoffLimit |
| Fixed completions | N | M ≤ N | Process N work items with M workers; each pod does one item |
| Work queue | unset | M | Pods pull from an external queue; once any pod succeeds no new pods are started, and the Job completes when the remaining pods have exited |
spec:
  completions: 10
  parallelism: 3
  # Kubernetes runs 3 pods at a time; as each succeeds,
  # another starts until 10 total have completed.
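For comparison, the work queue row of the table could be sketched as follows. This is a minimal illustration, not a definitive setup: the Job name, image, and command are hypothetical, and the sketch assumes the worker binary exits with status 0 once the external queue is empty.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker                 # hypothetical name
spec:
  parallelism: 3                     # three workers pull from the queue at once
  # completions is deliberately left unset: this selects the work queue pattern
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: worker
          image: myapp/worker:1.0    # hypothetical image
          command: ["./worker"]      # hypothetical; assumed to exit 0 when the queue is drained
```

Because the pods coordinate through the queue rather than through the Job controller, each worker has to decide for itself when there is nothing left to do.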
Failure Handling
When a pod in a Job fails (non-zero exit code or OOMKilled), the Job controller retries according to backoffLimit using exponential back-off (10s, 20s, 40s…, capped at six minutes). Once the number of retries reaches backoffLimit, the Job is marked Failed and no more pods are created.
| restartPolicy | On failure |
|---|---|
| OnFailure | The kubelet restarts the failed container in place; the pod itself is kept. |
| Never | Pod is marked Failed and a new pod is created. Old pod remains (for log inspection). |
Use Never when you need to inspect failed pod logs. With OnFailure the container is restarted in place, and the pod is terminated once backoffLimit is reached, so the evidence of the failure may be gone by the time you look.
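A debugging-oriented variant of the earlier migration Job might look like the sketch below. The Job name is hypothetical and the backoffLimit value is only an example; the image and command are reused from the manifest above.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate-debug             # hypothetical name
spec:
  backoffLimit: 3                    # example value; each retry is a new pod
  template:
    spec:
      restartPolicy: Never           # failed pods are kept instead of restarted in place
      containers:
        - name: migrate
          image: myapp:1.0
          command: ["./migrate", "--up"]
```

Each failed attempt then shows up as its own pod under kubectl get pods -l job-name=db-migrate-debug, and its logs can be read individually.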
Completed Jobs and their pods linger in the cluster consuming etcd space. Use ttlSecondsAfterFinished: 3600 to auto-delete after an hour, or add a regular cleanup job. Without this, old jobs accumulate indefinitely.
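As a minimal sketch, the TTL is set at the top level of the Job spec, next to backoffLimit. The one-hour value matches the figure mentioned above; the rest of the manifest is reused from the earlier example.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  ttlSecondsAfterFinished: 3600      # delete the Job and its pods one hour after it finishes
  backoffLimit: 4
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: migrate
          image: myapp:1.0
          command: ["./migrate", "--up"]
```

The timer starts when the Job finishes, whether it ended as Complete or Failed, so pick a value long enough to leave time for log inspection.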
What Is a CronJob?
A CronJob creates a new Job on a schedule you define as a cron expression. Every trigger creates a fresh Job object (and thus fresh pods). The CronJob itself is just a scheduler — the actual work is done by the Job it spawns.
Common use cases:
- Daily database backups at 02:00
- Hourly report generation
- Weekly cache warmup
- Periodic cleanup of old files or records
CronJob YAML
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup
spec:
  schedule: "0 2 * * *"              # daily at 02:00 UTC
  timeZone: "UTC"                    # explicit tz (Kubernetes 1.27+)
  concurrencyPolicy: Forbid          # skip if previous run still active
  startingDeadlineSeconds: 300       # give up if >5 min late to start
  successfulJobsHistoryLimit: 3      # keep 3 successful job records
  failedJobsHistoryLimit: 1          # keep 1 failed job record
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 3600
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: myapp/backup:1.0
              command: ["./backup.sh"]
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "200m"
                limits:
                  memory: "512Mi"
                  cpu: "500m"
Schedule Syntax
CronJob schedules use standard cron syntax: minute hour day-of-month month day-of-week.
| Schedule | Meaning |
|---|---|
| 0 * * * * | Every hour on the hour |
| 0 2 * * * | Every day at 02:00 |
| 0 2 * * 0 | Every Sunday at 02:00 |
| */15 * * * * | Every 15 minutes |
| 0 0 1 * * | First of each month at midnight |
| @daily | Shorthand for 0 0 * * * |
| @hourly | Shorthand for 0 * * * * |
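When spec.timeZone is not set, the schedule is interpreted in the local time zone of the kube-controller-manager, which is UTC on many clusters but not guaranteed to be. The spec.timeZone field, stable since Kubernetes 1.27, accepts a tz database name (e.g. "America/New_York"). Always be explicit to avoid confusion during daylight-saving transitions.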
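A sketch of a CronJob pinned to a non-UTC zone is shown below. The name, image, command, and the 06:00 schedule are hypothetical and chosen only for illustration.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: morning-report               # hypothetical name
spec:
  schedule: "0 6 * * *"              # 06:00 in the zone named below
  timeZone: "America/New_York"       # tz database name; follows that zone's DST rules
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: myapp/report:1.0    # hypothetical image
              command: ["./report.sh"]   # hypothetical command
```

Scheduling outside the hour that daylight-saving changes skip or repeat is a simple way to sidestep ambiguous run times.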
Concurrency Policy
spec.concurrencyPolicy controls what happens when a new Job would be triggered while the previous one is still running:
| Policy | Behaviour | Use case |
|---|---|---|
| Allow (default) | New Job runs even if previous is still running. Can cause overlap. | Idempotent tasks with no shared state. |
| Forbid | Skip the new run if the previous Job is still active. | Backups, migrations — must not run concurrently. |
| Replace | Delete the running Job and start a fresh one. | Cache refresh — only the latest run matters. |
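The Replace row could be sketched as a cache-refresh CronJob like the one below. The name, image, and command are hypothetical; only concurrencyPolicy is the point of the example.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cache-refresh                # hypothetical name
spec:
  schedule: "*/15 * * * *"           # every 15 minutes
  concurrencyPolicy: Replace         # a still-running refresh is replaced by the new run
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: refresh
              image: myapp/cache-refresh:1.0   # hypothetical image
              command: ["./refresh.sh"]        # hypothetical command
```

Replace only makes sense when an interrupted run leaves nothing half-written behind; for backups or migrations, Forbid is the safer choice.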
kubectl Commands
# Apply a Job or CronJob
kubectl apply -f job.yaml
kubectl apply -f cronjob.yaml
# Check Job status
kubectl get job db-migrate
kubectl describe job db-migrate
# Watch Job pods
kubectl get pods -l job-name=db-migrate -w
# Check Job logs
kubectl logs job/db-migrate
# List CronJobs
kubectl get cronjobs
# Manually trigger a CronJob (create a Job from it)
kubectl create job --from=cronjob/db-backup manual-backup-$(date +%s)
# Delete a completed Job (and its pods)
kubectl delete job db-migrate
# Suspend a CronJob (stop future runs without deleting)
kubectl patch cronjob db-backup -p '{"spec":{"suspend":true}}'