Resource Quotas & LimitRanges

Without resource controls, one misbehaving team can exhaust cluster CPU or memory and starve every other workload. ResourceQuota caps total resource consumption per namespace. LimitRange sets defaults and bounds per container so that every pod gets a request and limit even if the developer forgot to specify one.

Why Quotas

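Quotas guard against two failure modes: one namespace consuming so much CPU or memory that other workloads starve, and runaway creation of objects such as pods, Services, or LoadBalancers.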

ResourceQuota

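ResourceQuota enforces aggregate limits on what a namespace can consume: total compute requests and limits, plus counts of objects such as pods and Services. Once a quota covers a compute resource (for example requests.cpu or limits.memory), every container in a new pod must specify that request or limit. Otherwise the API server rejects the pod with a Forbidden error, unless a LimitRange in the namespace injects a default first.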

resourcequota.yaml — namespace caps
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute
    requests.cpu: "20"          # total CPU requests across all pods
    requests.memory: 40Gi       # total memory requests
    limits.cpu: "40"            # total CPU limits
    limits.memory: 80Gi         # total memory limits

    # Object counts
    pods: "100"                 # maximum number of pods
    services: "20"
    persistentvolumeclaims: "30"
    secrets: "50"
    configmaps: "50"

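    # Service types (prevent accidental LoadBalancer creation)
    services.loadbalancers: "2"
    # Node ports allocated to LoadBalancer Services also count here, so with "0"
    # those Services need allocateLoadBalancerNodePorts: false to be admitted
    services.nodeports: "0"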
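As a rough illustration of what this quota expects from workloads, the sketch below shows a pod that declares all four compute values the quota tracks. The pod name, container name, and image are placeholders, not taken from the original.

```yaml
# Hypothetical pod for illustration; name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo
  namespace: production
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0   # placeholder image
    resources:
      requests:
        cpu: "250m"        # counted against requests.cpu (20 total)
        memory: "256Mi"    # counted against requests.memory (40Gi total)
      limits:
        cpu: "500m"        # counted against limits.cpu (40 total)
        memory: "512Mi"    # counted against limits.memory (80Gi total)
```

If admitting the pod would push any of the four totals past its hard value, the request is rejected and the usage counters stay unchanged.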
scoped quotas — different limits by QoS class
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota
  namespace: production
spec:
  hard:
    pods: "10"                  # max 10 BestEffort pods (no requests/limits set)
  scopeSelector:
    matchExpressions:
    - scopeName: BestEffort     # valid scope names include BestEffort, NotBestEffort, PriorityClass
      operator: Exists          # the BestEffort scope only supports Exists

LimitRange

LimitRange is a namespace policy that sets default requests and limits, and enforces min/max bounds, for containers and pods. It prevents the "forgot to set limits" class of incident.

limitrange.yaml — defaults and bounds
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
  # Container-level defaults and bounds
  - type: Container
    default:                    # injected when container has no limits set
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:             # injected when container has no requests set
      cpu: "100m"
      memory: "128Mi"
    min:                        # request/limit must be at least this
      cpu: "50m"
      memory: "64Mi"
    max:                        # request/limit cannot exceed this
      cpu: "4"
      memory: "8Gi"

  # Pod-level bounds (sum of all containers)
  - type: Pod
    max:
      cpu: "8"
      memory: "16Gi"

  # PVC storage bounds
  - type: PersistentVolumeClaim
    min:
      storage: "1Gi"
    max:
      storage: "100Gi"
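To make the defaulting behavior concrete, here is a sketch of a container submitted without a resources stanza and the values it would likely carry after admission. It assumes production-limits is the only LimitRange in the namespace; the container name and image are placeholders.

```yaml
# Submitted: no resources stanza (hypothetical name and image)
containers:
- name: worker
  image: registry.example.com/worker:1.0
---
# Stored after admission, assuming production-limits is the only LimitRange
containers:
- name: worker
  image: registry.example.com/worker:1.0
  resources:
    limits:
      cpu: "500m"       # from default
      memory: "256Mi"   # from default
    requests:
      cpu: "100m"       # from defaultRequest
      memory: "128Mi"   # from defaultRequest
```

Defaults are applied only when a pod is created, so pods that were already running before the LimitRange existed keep whatever they had.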
ResourceQuota vs LimitRange: different scopes

| | ResourceQuota | LimitRange |
|---|---|---|
| Scope | Namespace total | Per container/pod |
| Behavior | Tracks cumulative usage; rejects pod if quota exceeded | Injects defaults at admission; enforces min/max bounds |
| Example | Namespace cannot exceed 20 CPU cores total | Every container gets 100m CPU request by default |

ResourceQuota caps aggregate namespace usage. LimitRange sets per-container defaults and bounds. Use both together for full control.

QoS Classes

Kubernetes assigns a QoS class to every pod based on its resource configuration. When a node runs low on memory, the kubelet first evicts pods whose usage exceeds their requests, so in practice BestEffort pods go first, then Burstable, and Guaranteed pods last.

| QoS Class | Condition | Eviction priority |
|---|---|---|
| Guaranteed | Every container has requests == limits for both CPU and memory. | Last evicted. CPU is still throttled at the limit. |
| Burstable | At least one container has a request or limit set, but the pod does not meet the Guaranteed criteria. | Evicted after BestEffort. |
| BestEffort | No container has any requests or limits set. | First evicted under memory pressure. |

Guaranteed QoS — requests == limits
containers:
- name: api
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "500m"       # must equal request for Guaranteed
      memory: "512Mi"   # must equal request for Guaranteed
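For comparison, the other two classes might look like the fragments below. The values are illustrative and reuse the container name from the example above.

```yaml
# Burstable: requests are set, and limits are higher than requests
containers:
- name: api
  resources:
    requests:
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"
---
# BestEffort: no requests or limits on any container in the pod
containers:
- name: api
  resources: {}
```

In a namespace whose LimitRange injects container defaults, the second fragment would receive those defaults at admission and most likely end up Burstable rather than BestEffort.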

PriorityClasses

PriorityClass assigns an integer priority to pods. When a higher-priority pod cannot be scheduled because no node has room, the scheduler can preempt (evict) lower-priority running pods to make space for it.

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-apps
value: 1000000
globalDefault: false
description: "Production-critical services. Preempts batch and dev workloads."

---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-jobs
value: 100
preemptionPolicy: Never    # pods in this class never preempt others; they wait for free capacity
description: "Background batch jobs. Non-preempting."

---
# Use in pod spec
spec:
  priorityClassName: critical-apps
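Priority classes can also be combined with quotas. The sketch below caps how much of the namespace may run at the critical-apps priority, using the PriorityClass quota scope. The quota name and the hard values are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-apps-quota      # hypothetical name
  namespace: production
spec:
  hard:
    pods: "20"                   # illustrative values, not from the original
    requests.cpu: "10"
    requests.memory: 20Gi
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["critical-apps"]  # only pods using this class are counted
```

A cap like this keeps a team from marking every workload as critical and defeating the priority tiers.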

Multi-Tenant Strategy

For a multi-team cluster, apply this pattern per namespace:

  1. ResourceQuota — set per team based on capacity allocation agreements. Start with 2× expected peak.
  2. LimitRange — enforce sane defaults so teams that forget limits don't cause incidents. Set max per container to prevent single-container runaway.
  3. PriorityClass — tiered priority: production → staging → development → batch. Production workloads survive node pressure even when dev namespaces are overloaded.
  4. Namespace-level RBAC — teams can only see and edit resources in their own namespaces.
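For step 4, one possible shape is a RoleBinding that grants a team's group the built-in edit ClusterRole inside its own namespace only. The namespace, binding name, and group name below are hypothetical and would come from your own identity provider and naming scheme.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-edit              # hypothetical binding name
  namespace: team-a              # hypothetical team namespace
subjects:
- kind: Group
  name: team-a-developers        # hypothetical group name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                     # built-in user-facing role
  apiGroup: rbac.authorization.k8s.io
```

Because the binding is namespaced, the permissions stop at the namespace boundary. The built-in edit role should not include write access to ResourceQuota objects, so teams cannot raise their own caps, though this is worth verifying on your cluster.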

kubectl Commands

# View quota usage in a namespace
kubectl describe resourcequota -n production

# View all quotas cluster-wide
kubectl get resourcequota -A

# List containers missing resource requests or limits (typically pods created
# before the quota or LimitRange existed), printed as pod/container
kubectl get pods -n production -o json | jq -r '
  .items[] | .metadata.name as $pod | .spec.containers[] |
  select(.resources.requests == null or .resources.limits == null) |
  "\($pod)/\(.name)"'

# Check QoS class of all pods in a namespace
kubectl get pods -n production -o json | jq -r \
  '.items[] | "\(.metadata.name): \(.status.qosClass)"'

# Check LimitRange
kubectl describe limitrange -n production