Pods — The Atomic Unit
A Pod is the smallest deployable unit in Kubernetes. It represents a single instance of a running process in your cluster. A pod encapsulates one or more containers, shared storage (volumes), a unique network IP, and options that govern how the containers run. Understanding pods is the foundation for everything else in Kubernetes.
What Is a Pod?
While Docker containers are the unit you build and ship, pods are the unit Kubernetes schedules and manages. A pod always runs on a single node. All containers in a pod:
- Share the same network namespace: they can communicate via localhost and share port space.
- Share the same IPC namespace: they can use inter-process communication (semaphores, shared memory).
- Can share volumes: a volume mounted to the pod is accessible to all containers in it.
In practice, most pods contain a single container. Multi-container pods are used for tightly coupled helper processes (sidecar, ambassador, adapter patterns).
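To make the shared network namespace concrete, here is a minimal sketch of a two-container pod. The pod name and the polling loop are illustrative assumptions, not part of any standard setup. The second container reaches nginx through localhost because both containers share one IP address and port space.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-netns-demo        # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
    - name: checker
      image: busybox:1.36
      # Same network namespace as "web", so localhost:80 reaches nginx.
      command: ['sh', '-c',
        'while true; do wget -qO- http://localhost:80 >/dev/null && echo reachable; sleep 5; done']
```

Because the port space is shared, two containers in the same pod cannot both listen on port 80.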
[Diagram: a pod running two containers, nginx:1.27 and fluent-bit]
Pod Spec
Here is a minimal pod manifest:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.27
      ports:
        - containerPort: 80
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "128Mi"
          cpu: "500m"
Key fields:
| Field | Description |
|---|---|
| apiVersion: v1 | Pods are in the core API group |
| metadata.name | Unique name within a namespace |
| metadata.labels | Key-value pairs used by selectors |
| spec.containers[].image | Container image (always pin a specific tag) |
| spec.containers[].resources | CPU/memory requests and limits |
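Pods created directly are not self-healing at the pod level. The kubelet restarts failed containers according to restartPolicy, but if the node fails or the pod is deleted or evicted, nothing recreates it. Always use a higher-level workload object: Deployment for stateless apps, StatefulSet for stateful apps, DaemonSet for per-node agents, or Job for batch tasks.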
Multi-Container Patterns
When multiple containers in a pod are tightly coupled, they should live together. Three classical patterns:
Sidecar
A helper container that extends or enhances the main container without modifying it. Example: a Fluent Bit container that reads logs from a shared volume and ships them to Elasticsearch, alongside an nginx container.
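A minimal sketch of that sidecar layout follows. The pod name, volume name, and Fluent Bit image tag are assumptions for illustration, and Fluent Bit's own configuration (an input that tails /var/log/nginx and an Elasticsearch output) is omitted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-with-log-shipper       # hypothetical name
spec:
  volumes:
    - name: nginx-logs
      emptyDir: {}                   # exists for the lifetime of the pod
  containers:
    - name: nginx
      image: nginx:1.27
      volumeMounts:
        - name: nginx-logs
          mountPath: /var/log/nginx  # nginx writes its log files here
    - name: log-shipper
      image: fluent/fluent-bit:3.1.9 # assumed tag; substitute a current release
      volumeMounts:
        - name: nginx-logs
          mountPath: /var/log/nginx
          readOnly: true             # the sidecar only reads the logs
```

The nginx container needs no changes to its image or configuration; the only coupling between the two containers is the shared emptyDir volume.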
Ambassador
A proxy container that handles outbound connections on behalf of the main container. Example: a proxy container that automatically retries failed API calls or handles service discovery, so the main app just talks to localhost:8080.
Adapter
Transforms output from the main container into a format expected by external systems. Example: a container that converts the main app's metrics from a custom format into Prometheus format.
Init Containers
Init containers run to completion before any main containers start. They run sequentially, and each must succeed (exit 0) before the next one starts. If an init container fails, the kubelet restarts it until it succeeds; if the pod's restartPolicy is Never, the pod is marked Failed instead.
Common use cases:
- Wait for a dependency (database, service) to be ready before the main app starts
- Clone a Git repo or download config files into a shared volume
- Run database migrations on deploy
- Register the pod with an external service or perform security setup before the app runs
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.36
      command: ['sh', '-c',
        'until nc -z postgres-svc 5432; do echo waiting for db; sleep 2; done']
    - name: run-migrations
      image: myapp:1.0
      command: ['./migrate', '--up']
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: url
  containers:
    - name: app
      image: myapp:1.0
      ports:
        - containerPort: 8080
Key differences from regular containers:
- Init containers run sequentially; regular containers start simultaneously.
- Init containers must exit successfully before the pod proceeds.
- Init containers do not support livenessProbe or readinessProbe.
- Init containers can be given secrets that the main containers never receive, which is useful for bootstrap credentials that should not be exposed to the running app.
Pod Lifecycle
A pod's status.phase field represents where it is in its lifecycle:
| Phase | Meaning |
|---|---|
| Pending | Pod accepted by the cluster but not yet running. It is waiting to be scheduled, or its images are being pulled. |
| Running | Pod bound to a node and at least one container is running (or starting/restarting). |
| Succeeded | All containers in the pod have terminated successfully and will not be restarted. |
| Failed | All containers have terminated, and at least one exited with a non-zero code or was killed by the system. |
| Unknown | Pod state cannot be determined, usually due to a communication error with the node. |
Check pod status with:
kubectl get pod nginx-pod
kubectl describe pod nginx-pod # full status + events
Restart Policies
The spec.restartPolicy field controls what happens when a container in the pod exits. Options:
| Policy | Behaviour | Use case |
|---|---|---|
| Always | Restart the container whenever it exits (default) | Long-running services (web servers, APIs) |
| OnFailure | Restart only if container exits with non-zero code | Batch jobs that should retry on failure |
| Never | Never restart regardless of exit code | One-shot tasks, debugging |
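Restarts use exponential back-off: the kubelet waits 10s, 20s, 40s, and so on between attempts, capped at 5 minutes. The back-off resets once a container has run for 10 minutes without problems. A pod whose container keeps crashing shows CrashLoopBackOff in kubectl get pod.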
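As a rough illustration of restartPolicy, the sketch below defines a pod whose only container always exits with a non-zero code. The pod name and the failing command are made up for the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: flaky-task             # hypothetical name
spec:
  restartPolicy: OnFailure     # set once per pod; applies to all its containers
  containers:
    - name: task
      image: busybox:1.36
      # Exits with code 1, so the kubelet restarts it with growing delays.
      command: ['sh', '-c', 'echo "attempt failed"; exit 1']
```

Watching this pod with kubectl get pod flaky-task -w should show the restart count climbing while the gaps between attempts grow. Changing the command to exit 0 lets the pod finish in the Succeeded phase, whereas restartPolicy: Never would leave it in Failed after the first attempt.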
Resource Requests & Limits
Every container should declare its resource requirements. Kubernetes uses these values for two distinct purposes:
- Requests: the amount of CPU/memory the scheduler uses when deciding which node to place the pod on. The node must have at least this much unreserved capacity.
- Limits: the maximum CPU/memory the container is allowed to use. Exceeding the memory limit causes the container to be OOM-killed. Exceeding the CPU limit causes CPU throttling (no kill).
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"       # 250 millicores = 0.25 of one CPU core
  limits:
    memory: "256Mi"
    cpu: "500m"
CPU is compressible: exceeding the limit throttles the container but doesn't kill it. Memory is incompressible: exceeding the limit kills the container with OOMKilled. Set memory limits carefully — too low and your app gets killed unexpectedly; too high and you waste cluster resources.
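Requests and limits are declared per container, so it can help to see how they combine in a multi-container pod. The sketch below uses made-up names and values; the comments at the end work through the arithmetic the scheduler performs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-helper        # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.27
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
    - name: helper
      image: busybox:1.36
      command: ['sh', '-c', 'while true; do sleep 3600; done']
      resources:
        requests:
          memory: "32Mi"
          cpu: "50m"
        limits:
          memory: "64Mi"
          cpu: "100m"
# Scheduling: the pod needs a node with 250m + 50m = 300m CPU
# and 128Mi + 32Mi = 160Mi memory still unreserved.
# Enforcement: each limit applies to its own container, so "helper"
# is OOM-killed above 64Mi even if "web" is using far less than 256Mi.
```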
Probes
Kubernetes uses probes to determine the health of containers. There are three types:
| Probe | Purpose | On failure |
|---|---|---|
| Liveness | Is the container still alive, or deadlocked and broken? | Container is killed and restarted |
| Readiness | Is the container ready to serve traffic? | Pod is removed from Service endpoints — no traffic routed to it |
| Startup | Has a slow-starting container finished initialising? | Container is killed and restarted; liveness and readiness are disabled until it passes |
Probes support four mechanisms: httpGet, tcpSocket, exec, and grpc.
containers:
  - name: app
    image: myapp:1.0
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30   # gives app up to 5 min to start (30 × 10s)
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
      failureThreshold: 3
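The example above uses httpGet for every probe. For comparison, here is a sketch of the other mechanisms on the same hypothetical myapp:1.0 container. The /tmp/healthy file and port 9090 are assumptions, and the grpc probe only works if the application implements the gRPC Health Checking Protocol on that port.

```yaml
containers:
  - name: app
    image: myapp:1.0
    startupProbe:
      grpc:
        port: 9090             # assumed gRPC health-check port
      periodSeconds: 10
      failureThreshold: 30
    livenessProbe:
      exec:
        # Runs inside the container; exit code 0 counts as healthy.
        command: ['cat', '/tmp/healthy']
      periodSeconds: 10
    readinessProbe:
      tcpSocket:
        port: 8080             # passes if the port accepts a TCP connection
      periodSeconds: 5
```

A tcpSocket check only proves that something is listening, so it suits services without an HTTP endpoint, while exec fits cases where health can only be judged from inside the container.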
A failing liveness probe restarts the container. A failing readiness probe only stops traffic. Never use a liveness probe for transient failures (a slow downstream dependency) — that triggers a restart cascade. Use readiness for transient failures; use liveness only for unrecoverable states like deadlocks.
QoS Classes
Kubernetes assigns a Quality of Service (QoS) class to each pod based on its resource configuration. The class determines eviction priority when a node runs out of memory.
| Class | Condition | Eviction priority |
|---|---|---|
| Guaranteed | Every container sets equal requests and limits for both CPU and memory | Last to be evicted |
| Burstable | At least one container has a request or limit set, but they are not all equal | Evicted after BestEffort |
| BestEffort | No container sets any requests or limits | First to be evicted |
Check a pod's assigned class:
kubectl get pod nginx-pod -o jsonpath='{.status.qosClass}'
# Guaranteed | Burstable | BestEffort
For production workloads, set equal requests and limits on every container to achieve Guaranteed QoS. Guaranteed pods are the last candidates for eviction under node memory pressure, and the scheduler gets predictable placement information. A container that exceeds its own memory limit is still OOM-killed.
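A minimal sketch of a pod that should be classed as Guaranteed, with an assumed name and assumed resource values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:
          memory: "256Mi"
          cpu: "500m"
        limits:
          memory: "256Mi"      # equal to the memory request
          cpu: "500m"          # equal to the CPU request
```

By contrast, the nginx-pod manifest from the Pod Spec section sets limits above its requests, so it falls into the Burstable class.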