Resource Quotas & LimitRanges
Without resource controls, one misbehaving team can exhaust cluster CPU or memory and starve every other workload. ResourceQuota caps total resource consumption per namespace. LimitRange sets defaults and bounds per container so that every pod gets a request and limit even if the developer forgot to specify one.
Why Quotas
Two failure modes that quotas prevent:
- Resource hoarding — a team deploys 200 replicas for a load test and starves other namespaces. ResourceQuota caps the namespace's total CPU and memory.
- Unbounded pods — a container with no limits gets scheduled anywhere and can consume entire node resources, causing OOM kills of co-located pods. LimitRange injects defaults.
ResourceQuota
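ResourceQuota enforces aggregate limits on the resources consumed in a namespace. Once a quota tracks a compute resource such as requests.cpu or limits.memory, every new pod must specify a value for that resource. Otherwise the API server rejects the pod with a Forbidden error, unless a LimitRange in the namespace fills in a default.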
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute
    requests.cpu: "20"        # total CPU requests across all pods
    requests.memory: 40Gi     # total memory requests
    limits.cpu: "40"          # total CPU limits
    limits.memory: 80Gi       # total memory limits
    # Object counts
    pods: "100"               # maximum number of pods
    services: "20"
    persistentvolumeclaims: "30"
    secrets: "50"
    configmaps: "50"
    # Service types (prevent accidental LoadBalancer creation)
    services.loadbalancers: "2"
    # Note: LoadBalancer Services also allocate node ports by default, so with
    # a nodeports quota of 0 they need allocateLoadBalancerNodePorts: false.
    services.nodeports: "0"
A second quota in the same namespace can use a scope to cap only BestEffort pods:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota
  namespace: production
spec:
  hard:
    pods: "10"   # max 10 BestEffort pods (no requests/limits set)
  scopeSelector:
    matchExpressions:
      - scopeName: BestEffort   # QoS-based scope; it takes no values
        operator: Exists
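To make the hoarding case concrete, here is a sketch of how the compute quota above might play out for a hypothetical 200-replica Deployment. The Deployment name, image, and per-pod sizes are assumptions chosen to keep the arithmetic simple, not values from a real workload.

```yaml
# Hypothetical load-test Deployment in the production namespace.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: load-test            # assumed name
  namespace: production
spec:
  replicas: 200
  selector:
    matchLabels:
      app: load-test
  template:
    metadata:
      labels:
        app: load-test
    spec:
      containers:
        - name: worker
          image: registry.example.com/load-test:latest   # placeholder image
          resources:
            requests:
              cpu: "250m"      # 200 x 250m  = 50 CPU requested in total
              memory: "256Mi"  # 200 x 256Mi = 50Gi requested in total
            limits:
              cpu: "500m"      # 200 x 500m  = 100 CPU of limits in total
              memory: "512Mi"  # 200 x 512Mi = 100Gi of limits in total
# Against production-quota, with nothing else running in the namespace:
#   requests.cpu     20   / 250m  =  80 pods
#   requests.memory  40Gi / 256Mi = 160 pods
#   limits.cpu       40   / 500m  =  80 pods
#   limits.memory    80Gi / 512Mi = 160 pods
#   pods             100
# CPU is the tightest bound, so at most 80 replicas are created.
```

The remaining pod creations are rejected by the API server. This typically surfaces as FailedCreate events on the ReplicaSet with an "exceeded quota" message, while capacity outside the namespace stays available to other teams.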
LimitRange
LimitRange is a namespace policy that sets default requests and limits, and enforces min/max bounds, for containers and pods. It prevents the "forgot to set limits" class of incident.
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
    # Container-level defaults and bounds
    - type: Container
      default:            # injected when container has no limits set
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:     # injected when container has no requests set
        cpu: "100m"
        memory: "128Mi"
      min:                # request/limit must be at least this
        cpu: "50m"
        memory: "64Mi"
      max:                # request/limit cannot exceed this
        cpu: "4"
        memory: "8Gi"
    # Pod-level bounds (sum of all containers)
    - type: Pod
      max:
        cpu: "8"
        memory: "16Gi"
    # PVC storage bounds
    - type: PersistentVolumeClaim
      min:
        storage: "1Gi"
      max:
        storage: "100Gi"
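Taken together, the quota and the LimitRange mean that total CPU requests in the production namespace cannot exceed 20 cores, and a container that sets no CPU request gets 100m by default.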
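As a sketch of how the defaulting behaves, the hypothetical pod below mixes three cases. The pod name, container names, and image are placeholders; the comments describe the values the production-limits LimitRange would be expected to produce at admission.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: limitrange-demo      # assumed name
  namespace: production
spec:
  containers:
    # Case 1: nothing set. Both defaults are injected:
    #   requests: cpu 100m, memory 128Mi   (from defaultRequest)
    #   limits:   cpu 500m, memory 256Mi   (from default)
    - name: no-resources
      image: registry.example.com/app:latest   # placeholder image
    # Case 2: only requests set. Limits come from "default":
    #   limits: cpu 500m, memory 256Mi
    - name: requests-only
      image: registry.example.com/app:latest
      resources:
        requests:
          cpu: "200m"
          memory: "128Mi"
    # Case 3: only limits set. Requests end up equal to the limits
    # (cpu 1, memory 512Mi), not to defaultRequest.
    - name: limits-only
      image: registry.example.com/app:latest
      resources:
        limits:
          cpu: "1"
          memory: "512Mi"
```

A container that asks for more than max (for example a 16Gi memory limit) or less than min is rejected at admission rather than adjusted. The LimitRange is applied only when a pod is created, so pods that were already running before it existed keep their original values.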
QoS Classes
Kubernetes assigns a QoS class to every pod based on its resource configuration. When a node runs low on memory, the kubelet's eviction order roughly follows these classes: BestEffort pods go first, then Burstable pods that use more than they requested, and Guaranteed pods last.
| QoS Class | Condition | Eviction priority |
|---|---|---|
| Guaranteed | Every container has requests == limits for both CPU and memory. | Last evicted. CPU is throttled only at the container's own limit. |
| Burstable | At least one container has a CPU or memory request or limit, but the pod does not meet the Guaranteed criteria. | Evicted after BestEffort. |
| BestEffort | No container has any requests or limits set. | First evicted under memory pressure. |
containers:
  - name: api
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"       # must equal request for Guaranteed
        memory: "512Mi"   # must equal request for Guaranteed
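For contrast with the Guaranteed example, here is a sketch of container specs that would land in the other two classes. The container names and sizes are illustrative assumptions.

```yaml
# Burstable: requests are set, but limits are higher than requests.
containers:
  - name: worker            # assumed name
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "1"            # limit > request, so the pod is not Guaranteed
        memory: "512Mi"
---
# BestEffort: no requests or limits on any container in the pod.
containers:
  - name: scratch           # assumed name
    # no resources block at all
```

In a namespace whose LimitRange defines container defaults, the second spec would not stay BestEffort: the defaults are injected at admission and the pod would be classified from the resulting values. The assigned class is reported in the pod's status.qosClass field.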
PriorityClasses
PriorityClass assigns an integer priority to pods. When a higher-priority pod cannot be scheduled because no node has room, the scheduler can preempt lower-priority pods, evicting running pods to make space.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-apps
value: 1000000
globalDefault: false
description: "Production-critical services. Preempts batch and dev workloads."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-jobs
value: 100
preemptionPolicy: Never   # pods in this class never preempt others; they wait for free capacity
description: "Background batch jobs. Never preempts other pods."
---
# Use in pod spec
spec:
  priorityClassName: critical-apps
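Priority and quota can be combined. The sketch below assumes you want to cap how much of the namespace may run at the critical-apps priority, so that teams cannot mark every workload as critical. The quota name and the numbers are assumptions.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-apps-quota   # assumed name
  namespace: production
spec:
  hard:
    pods: "20"                # assumed cap on critical pods
    requests.cpu: "10"        # assumed share of the namespace's CPU requests
    requests.memory: 20Gi
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values: ["critical-apps"]
```

Only pods whose priorityClassName is critical-apps count against this quota. Pods in other priority classes are still governed by the namespace-wide quota.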
Multi-Tenant Strategy
For a multi-team cluster, apply this pattern per namespace:
- ResourceQuota — set per team based on capacity allocation agreements. Start with 2× expected peak.
- LimitRange — enforce sane defaults so teams that forget limits don't cause incidents. Set max per container to prevent single-container runaway.
- PriorityClass — tiered priority: production → staging → development → batch. Production workloads survive node pressure even when dev namespaces are overloaded.
- Namespace-level RBAC — teams can only see and edit resources in their own namespaces.
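The RBAC item in this list could be sketched as a RoleBinding that grants one team edit rights inside its own namespace only. The namespace, binding name, and group name are hypothetical and depend on how your cluster authenticates users.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-edit            # assumed name
  namespace: team-a            # assumed per-team namespace
subjects:
  - kind: Group
    name: team-a-developers    # assumed group name from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                   # built-in user-facing role
  apiGroup: rbac.authorization.k8s.io
```

Because this is a RoleBinding rather than a ClusterRoleBinding, the permissions apply only within the team-a namespace. The built-in edit role is not expected to grant write access to ResourceQuota objects, which keeps teams from raising their own caps, but it is worth verifying against the role definitions in your cluster.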
kubectl Commands
# View quota usage in a namespace
kubectl describe resourcequota -n production
# View all quotas cluster-wide
kubectl get resourcequota -A
# List containers that are missing resource requests or limits, as pod/container
kubectl get pods -n production -o json | jq -r '
  .items[] | .metadata.name as $pod | .spec.containers[] |
  select(.resources.requests == null or .resources.limits == null) |
  "\($pod)/\(.name)"'
# Check QoS class of all pods in a namespace
kubectl get pods -n production -o json | jq -r \
'.items[] | "\(.metadata.name): \(.status.qosClass)"'
# Check LimitRange
kubectl describe limitrange -n production