Optimize Pod Performance with Manager Policies

This guide gives Kubernetes cluster administrators a practical, ready-to-apply procedure for enabling and validating CPU ManagerPolicy, Memory ManagerPolicy, and Topology ManagerPolicy. By combining CPU pinning, NUMA affinity, and topology alignment, you can deliver more consistent latency and better performance for critical workloads.

Scope and Prerequisites

Roles and Permissions

You need a maintenance window, administrator-level kubectl access, and SSH access to the nodes.

Workload Requirements

To receive dedicated CPUs and NUMA affinity, Pods must run in the Guaranteed QoS class: every container sets requests equal to limits for both CPU and memory, and CPU is specified in whole cores (e.g., 2, 4).
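
As an illustration of these requirements, a Pod along the following lines would be classified as Guaranteed and would be eligible for exclusive CPUs. This is a minimal sketch: the Pod name, container name, image, and resource sizes are placeholders, not values from this guide.

```yaml
# Illustrative manifest; name, image, and sizes are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-demo
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      resources:
        requests:
          cpu: "2"          # whole cores, equal to the limit
          memory: "4Gi"     # equal to the limit
        limits:
          cpu: "2"
          memory: "4Gi"
```

After the Pod is running, kubectl get pod pinned-demo -o jsonpath='{.status.qosClass}' should report Guaranteed.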

Not Covered

HugePages are out of scope. If you need HugePages support, contact your support team.

Quick Start: Sample Kubelet Config

The following kubelet configuration enables NUMA-aware resource allocation on a node. Adjust the values for your environment. How you apply it depends on the node operating system; see Applying the Configuration.

# —— CPU ManagerPolicy ——
cpuManagerPolicy: "static"              # Options: none | static
cpuManagerPolicyOptions:
  full-pcpus-only: "true"               # Recommended: allocate only full cores
cpuManagerReconcilePeriod: "5s"
reservedSystemCPUs: ""                  # e.g. "0-1"; the static policy needs a non-zero CPU reservation, set here or via kubeReserved/systemReserved

# —— Memory ManagerPolicy ——
memoryManagerPolicy: "Static"           # Options: None | Static
reservedMemory:
  - numaNode: 0
    limits:
      memory: "2048Mi"
  - numaNode: 1
    limits:
      memory: "2048Mi"

# —— Topology ManagerPolicy ——
topologyManagerPolicy: "single-numa-node"     # Options: none | best-effort | restricted | single-numa-node
topologyManagerScope: "pod"                   # Options: container | pod

Notes:

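full-pcpus-only: "true" allocates whole physical cores, so a pinned container does not share a core's sibling hardware threads with another container; this improves latency consistency.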
topologyManagerScope: pod ensures containers within the same Pod align to a common NUMA topology.
reservedMemory must be calculated based on kubelet config and eviction thresholds (see next section).

How to Calculate `reservedMemory`

Formula:

R_total = kubeReserved(memory) + systemReserved(memory) + evictionHard(memory.available)

The sum of reservedMemory across all NUMA nodes must equal R_total.
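
The three inputs to the formula come from the kubelet configuration. The fragment below is only a sketch of where they live; the values are illustrative and mirror the worked example in this section, so substitute the settings actually in effect on your nodes.

```yaml
# Illustrative values only; read the real ones from your kubelet configuration.
kubeReserved:
  memory: "512Mi"
systemReserved:
  memory: "512Mi"
evictionHard:
  memory.available: "100Mi"
# R_total = 512 + 512 + 100 = 1124Mi
```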

Steps (for N NUMA nodes):

Calculate R_total (Mi).
Compute division and remainder:
- base = floor(R_total / N)
- rem = R_total − base × N
Assign values:
- NUMA node 0 = base + rem
- Remaining NUMA nodes = base

Example (2 NUMA nodes):

kubeReserved=512Mi, systemReserved=512Mi, evictionHard=100Mi → R_total = 512 + 512 + 100 = 1124Mi

base = floor(1124 / 2) = 562, rem = 1124 − 562 × 2 = 0

reservedMemory:
- numaNode: 0
  limits:
    memory: "562Mi"
- numaNode: 1
  limits:
    memory: "562Mi"
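
The split described above can be sketched as a small shell function. The function name split_reserved_memory is hypothetical, and it assumes R_total has already been converted to whole Mi; it only prints a reservedMemory fragment and does not touch the kubelet.

```shell
# Hypothetical helper: split R_total (in Mi) across N NUMA nodes.
# Usage: split_reserved_memory <R_total_Mi> <N>
# Node 0 receives base + rem, every other node receives base.
split_reserved_memory() {
  total=$1
  nodes=$2
  base=$(( total / nodes ))          # integer division = floor for non-negative values
  rem=$(( total - base * nodes ))    # remainder assigned to NUMA node 0
  echo "reservedMemory:"
  i=0
  while [ "$i" -lt "$nodes" ]; do
    if [ "$i" -eq 0 ]; then
      mem=$(( base + rem ))
    else
      mem=$base
    fi
    printf '  - numaNode: %d\n    limits:\n      memory: "%dMi"\n' "$i" "$mem"
    i=$(( i + 1 ))
  done
}

# Values from the example above: R_total = 1124Mi, 2 NUMA nodes.
split_reserved_memory 1124 2
```

With an odd total such as 1125Mi on two nodes, node 0 would receive 563Mi and node 1 would receive 562Mi, so the sum still equals R_total.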

Applying the Configuration

How you apply these kubelet settings depends on the node operating system. Immutable Infrastructure is the recommended approach.

Immutable Infrastructure (recommended)

On immutable nodes, kubelet settings are managed declaratively with Machine Configuration—you do not edit kubelet files on the node directly. Configure topologyManagerPolicy, topologyManagerScope, reservedSystemCPUs, and reserved resources by writing a kubelet configuration drop-in, as described in Configuring Kubelet.

Changing cpuManagerPolicy or memoryManagerPolicy to a static policy additionally requires draining the node and resetting the kubelet CPU and memory manager state before the kubelet restarts. On immutable infrastructure this orchestration is delivered through Machine Configuration; if it is not yet available in your environment, configure static CPU and Memory Manager policies with assistance from your support team.

Traditional Operating Systems

On traditional operating systems, apply the configuration on each node:

Cordon and Drain

kubectl cordon <node>
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data

Stop Kubelet and Clear State

sudo systemctl stop kubelet
sudo rm -f /var/lib/kubelet/cpu_manager_state
sudo rm -f /var/lib/kubelet/memory_manager_state

Restart Kubelet

sudo systemctl daemon-reload
sudo systemctl start kubelet

Reschedule Pods

kubectl uncordon <node>

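DaemonSet Pods are skipped by the drain above (--ignore-daemonsets). Restart or delete them, and any other system Pods that stayed on the node, explicitly so that they are re-admitted under the new policies.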

Verify Recovery

kubectl get nodes
kubectl get pods -A -o wide | grep <node>

Verification

CPU ManagerPolicy State

sudo cat /var/lib/kubelet/cpu_manager_state | jq .

Check:

.policyName = "static"
.defaultCpuSet lists non-dedicated CPUs
.entries show container-to-CPU assignments
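
One way to cross-check the state file from the workload side is to read the CPU affinity that the kernel reports inside a running container. The sketch below wraps kubectl exec in a hypothetical helper, container_cpus; it assumes the container image ships grep and that the Pod has a single container (otherwise add -c <container>).

```shell
# Hypothetical helper: print the CPUs a container's processes are allowed to run on.
# Usage: container_cpus <pod> [namespace]
container_cpus() {
  # grep runs inside the container, so /proc/self/status reflects the
  # container's cpuset as applied by the kubelet.
  kubectl exec "$1" -n "${2:-default}" -- grep Cpus_allowed_list /proc/self/status
}

# Example call (commented out so that sourcing this file has no side effects):
# container_cpus <pod> <namespace>
```

For a pinned Guaranteed container, the reported list should match that container's value under .entries; a container in the shared pool should report the CPUs listed in .defaultCpuSet.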

Memory ManagerPolicy State

sudo cat /var/lib/kubelet/memory_manager_state | jq .

Check:

.policyName = "Static"
Sum of reserved memory matches R_total
Guaranteed Pods are assigned to NUMA nodes per single-numa-node policy

Key Policies and Behaviors

CPU ManagerPolicy

Purpose: Allocate exclusive physical CPUs to Guaranteed Pods
Config: cpuManagerPolicy: static, full-pcpus-only: "true"
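Behavior: Only containers in Guaranteed Pods with whole-core CPU requests receive exclusive CPUs; all other containers (Burstable, BestEffort, and Guaranteed with fractional CPU) run on the shared pool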

Memory ManagerPolicy

Purpose: Reserve and align memory at NUMA node level
Config: memoryManagerPolicy: "Static", reservedMemory
Behavior: Works best with Topology ManagerPolicy for alignment

Topology ManagerPolicy

Purpose: Align CPU, memory, and device allocation on a single NUMA node
Config: topologyManagerPolicy: single-numa-node, topologyManagerScope: pod
Modes: none (default), best-effort, restricted, single-numa-node (strict)

Terminology

NUMA node: Non-Uniform Memory Access domain
CPU pinning: Binding containers to dedicated CPUs
NUMA affinity: Preferring memory from the same NUMA node as CPU
Topology alignment: Co-locating CPU, memory, and devices on one NUMA node
Guaranteed Pod: every container has requests = limits for CPU and memory; whole-core CPU values are additionally required for exclusive CPUs
