Optimize Pod Performance with Manager Policies

This guide gives Kubernetes cluster administrators a practical, ready-to-apply procedure for enabling and validating the kubelet's CPU ManagerPolicy, Memory ManagerPolicy, and Topology ManagerPolicy settings. By combining CPU pinning, NUMA affinity, and topology alignment, you can deliver consistent latency and improved performance for critical workloads.

Scope and Prerequisites

Roles and Permissions

  • Requires maintenance window access, kubectl admin privileges, and SSH access to nodes.

Workload Requirements

  • To receive dedicated CPUs and NUMA affinity, Pods must run in the Guaranteed QoS class: every container sets requests = limits for both CPU and memory, and CPU is specified in full cores (e.g., 2, 4).
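As a minimal sketch of a Pod that meets these requirements, the snippet below writes a manifest in which CPU and memory requests equal limits and the CPU value is a whole number. The Pod name numa-demo, the file name guaranteed-pod.yaml, the image, and the resource sizes are illustrative assumptions, not values from this guide.

```shell
# Write an example manifest for a Guaranteed QoS Pod.
# Assumptions: the Pod name, file name, image, and sizes are placeholders.
cat > guaranteed-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: numa-demo
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      resources:
        requests:
          cpu: "2"          # whole cores, so the static CPU Manager can pin them
          memory: "1Gi"
        limits:
          cpu: "2"          # must equal the request
          memory: "1Gi"     # must equal the request
EOF
```

After applying the file with kubectl apply -f guaranteed-pod.yaml, the command kubectl get pod numa-demo -o jsonpath='{.status.qosClass}' should report Guaranteed.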

Not Covered

  • HugePages are out of scope. If you need HugePages support, contact your support team.

Quick Start: Sample Kubelet Config

The following kubelet configuration enables NUMA-aware resource allocation on the node. Adjust the values for your environment. How you apply it depends on the node operating system; see Applying the Configuration.

# —— CPU ManagerPolicy ——
cpuManagerPolicy: "static"              # Options: none | static
cpuManagerPolicyOptions:
  full-pcpus-only: "true"               # Recommended: allocate only full cores
cpuManagerReconcilePeriod: "5s"
reservedSystemCPUs: ""                  # e.g. "0-1"; the static policy requires a non-zero CPU reservation, set here or via kubeReserved/systemReserved

# —— Memory ManagerPolicy ——
memoryManagerPolicy: "Static"           # Options: None | Static
reservedMemory:
  - numaNode: 0
    limits:
      memory: "2048Mi"
  - numaNode: 1
    limits:
      memory: "2048Mi"

# —— Topology ManagerPolicy ——
topologyManagerPolicy: "single-numa-node"     # Options: none | best-effort | restricted | single-numa-node
topologyManagerScope: "pod"                   # Options: container | pod

Notes:

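  • full-pcpus-only: "true" allocates whole physical cores, so a container never shares hyperthread siblings with another container. This improves latency consistency.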
  • topologyManagerScope: pod ensures containers within the same Pod align to a common NUMA topology.
  • reservedMemory must be calculated based on kubelet config and eviction thresholds (see next section).

How to Calculate reservedMemory

Formula:

R_total = kubeReserved(memory) + systemReserved(memory) + evictionHard(memory.available)

The sum of reservedMemory across all NUMA nodes must equal R_total.

Steps (for N NUMA nodes):

  1. Calculate R_total (Mi).

  2. Compute division and remainder:

    • base = floor(R_total / N)
    • rem = R_total − base × N
  3. Assign values:

    • NUMA node 0 = base + rem
    • Remaining NUMA nodes = base

Example (2 NUMA nodes):

  • kubeReserved=512Mi, systemReserved=512Mi, evictionHard=100Mi → R_total = 1124Mi
  • base = 562, rem = 0
    reservedMemory:
    - numaNode: 0
      limits:
        memory: "562Mi"
    - numaNode: 1
      limits:
        memory: "562Mi"
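The split described in the steps above can be sketched as a small shell function. This is only an illustration under the stated formula; split_reserved_memory is a hypothetical helper name, and it expects R_total as a whole number of Mi.

```shell
# Split R_total (in Mi) across N NUMA nodes: node 0 gets base + rem,
# every other node gets base. Prints a reservedMemory YAML fragment.
# split_reserved_memory is a hypothetical helper name.
split_reserved_memory() {
  r_total=$1
  numa_nodes=$2
  base=$((r_total / numa_nodes))          # integer division, i.e. floor for positive values
  rem=$((r_total - base * numa_nodes))
  echo "reservedMemory:"
  node=0
  while [ "$node" -lt "$numa_nodes" ]; do
    if [ "$node" -eq 0 ]; then
      mem=$((base + rem))
    else
      mem=$base
    fi
    printf '  - numaNode: %d\n    limits:\n      memory: "%dMi"\n' "$node" "$mem"
    node=$((node + 1))
  done
}

# Values from the example: 512 + 512 + 100 = 1124 Mi on 2 NUMA nodes.
split_reserved_memory $((512 + 512 + 100)) 2
```

With an odd total such as 1125 on 2 NUMA nodes, node 0 would receive 563Mi and node 1 would receive 562Mi, so the sum still equals R_total.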

Applying the Configuration

How you apply these kubelet settings depends on the node operating system. Immutable Infrastructure is the recommended approach.

Immutable Infrastructure

On immutable nodes, kubelet settings are managed declaratively with Machine Configuration; you do not edit kubelet files on the node directly. Configure topologyManagerPolicy, topologyManagerScope, reservedSystemCPUs, and reserved resources by writing a kubelet configuration drop-in, as described in Configuring Kubelet.

Changing cpuManagerPolicy or memoryManagerPolicy to a static policy additionally requires draining the node and resetting the kubelet CPU and memory manager state before the kubelet restarts. On immutable infrastructure this orchestration is delivered through Machine Configuration; if it is not yet available in your environment, configure static CPU and Memory Manager policies with assistance from your support team.

Traditional Operating Systems

On traditional operating systems, add the settings to the kubelet configuration file on each node, then apply them as follows:

  1. Cordon and Drain

    kubectl cordon <node>
    kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
  2. Stop Kubelet and Clear State

    sudo systemctl stop kubelet
    sudo rm -f /var/lib/kubelet/cpu_manager_state
    sudo rm -f /var/lib/kubelet/memory_manager_state
  3. Restart Kubelet

    sudo systemctl daemon-reload
    sudo systemctl start kubelet
  4. Reschedule Pods

    kubectl uncordon <node>
    • For DaemonSets and system Pods, restart or delete the Pods explicitly.
  5. Verify Recovery

    kubectl get nodes
    kubectl get pods -A -o wide | grep <node>

Verification

CPU ManagerPolicy State

sudo cat /var/lib/kubelet/cpu_manager_state | jq .

Check:

  • .policyName = "static"
  • .defaultCpuSet lists non-dedicated CPUs
  • .entries show container-to-CPU assignments
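These three checks can be scripted with jq. The function below is a sketch, not a supported tool: check_cpu_state is a hypothetical helper name, and it relies only on the policyName, defaultCpuSet, and entries fields named in the list above.

```shell
# Check a CPU Manager state file against the three points above.
# check_cpu_state is a hypothetical helper name; it needs jq and read access to the file.
check_cpu_state() {
  state_file=$1
  # jq -e returns a non-zero exit status when the expression is false or null.
  if ! jq -e '.policyName == "static"' "$state_file" >/dev/null; then
    echo "policyName is not static" >&2
    return 1
  fi
  echo "shared pool (defaultCpuSet): $(jq -r '.defaultCpuSet' "$state_file")"
  echo "top-level entries with pinned CPUs: $(jq '.entries | length' "$state_file")"
}
```

Call it as check_cpu_state /var/lib/kubelet/cpu_manager_state with root privileges, as with the sudo cat command above, or run it against a copy of the file.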

Memory ManagerPolicy State

sudo cat /var/lib/kubelet/memory_manager_state | jq .

Check:

  • .policyName = "Static"
  • Sum of reserved memory matches R_total
  • Guaranteed Pods are assigned to NUMA nodes per single-numa-node policy
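To compare the reserved memory with R_total, the per-node values in the state file can be summed with jq. This sketch assumes that each NUMA node entry under machineState exposes memoryMap.memory.systemReserved in bytes, as in upstream kubelet state files; verify the field names on your kubelet version. reserved_memory_mi is a hypothetical helper name.

```shell
# Sum the per-NUMA-node reserved memory recorded in a Memory Manager state file, in Mi.
# Assumption: each entry under .machineState has .memoryMap.memory.systemReserved in bytes.
# reserved_memory_mi is a hypothetical helper name; it needs jq.
reserved_memory_mi() {
  jq '[.machineState[].memoryMap.memory.systemReserved] | add / 1048576' "$1"
}
```

For the two-node example with 562Mi reserved per node, the function should print 1124, which matches R_total.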

Key Policies and Behaviors

CPU ManagerPolicy

  • Purpose: Allocate exclusive physical CPUs to Guaranteed Pods
  • Config: cpuManagerPolicy: static, full-pcpus-only: "true"
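  • Behavior: Exclusive CPUs go only to Guaranteed Pods with whole-core CPU requests; Burstable and BestEffort Pods, and Guaranteed Pods with fractional CPU requests, run on the shared CPU pool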

Memory ManagerPolicy

  • Purpose: Reserve and align memory at NUMA node level
  • Config: memoryManagerPolicy: "Static", reservedMemory
  • Behavior: Works best with Topology ManagerPolicy for alignment

Topology ManagerPolicy

  • Purpose: Align CPU, memory, and device allocation on a single NUMA node
  • Config: topologyManagerPolicy: single-numa-node, topologyManagerScope: pod
  • Modes: none (default), best-effort, restricted, single-numa-node (strict)

Terminology

  • NUMA node: Non-Uniform Memory Access domain
  • CPU pinning: Binding containers to dedicated CPUs
  • NUMA affinity: Preferring memory from the same NUMA node as CPU
  • Topology alignment: Co-locating CPU, memory, and devices on one NUMA node
  • Guaranteed Pod: every container has requests = limits for CPU and memory; CPU specified as full cores is additionally required for exclusive CPUs