Run Plain Pods
This page shows how to use Kueue’s scheduling and resource management capabilities when running plain Pods. Kueue supports management of both individual Pods and Pod groups.
This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue’s overview.
Before you begin
By default, the integration for v1/pod is not enabled. Learn how to install Kueue with a custom manager configuration and enable the pod integration.
To allow Kubernetes system pods to be successfully scheduled, you must limit the scope of the pod integration. The recommended mechanism for doing this is the managedJobsNamespaceSelector.
One approach is to enable management only for specific namespaces:
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
managedJobsNamespaceSelector:
  matchLabels:
    kueue-managed: "true"
integrations:
  frameworks:
  - "pod"
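For the label-based selector above to take effect, a namespace has to carry the matching label. The following is a minimal sketch; the namespace name team-a is a hypothetical placeholder, not something defined by Kueue:

```yaml
# Hypothetical namespace opted in to Kueue management of plain Pods.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a  # assumed name, for illustration only
  labels:
    kueue-managed: "true"  # matches managedJobsNamespaceSelector.matchLabels
```

Under that configuration, Pods created in namespaces without this label would not be expected to be managed by the pod integration.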
An alternate approach is to exempt system namespaces from management:
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
managedJobsNamespaceSelector:
  matchExpressions:
  - key: kubernetes.io/metadata.name
    operator: NotIn
    values: [ kube-system, kueue-system ]
integrations:
  frameworks:
  - "pod"
Note
managedJobsNamespaceSelector is a Beta feature that is enabled by default. You can disable it by setting the ManagedJobsNamespaceSelector feature gate to false. Check the Installation guide for details on feature gate configuration.
Prior to Kueue v0.10, the Configuration fields integrations.podOptions.namespaceSelector and integrations.podOptions.podSelector were used instead. Although podOptions is still supported in Kueue v0.10, it is expected to be deprecated in a future release.
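For readers maintaining configurations written for releases before v0.10, the legacy fields named above might look roughly as follows. This is a sketch inferred from the field paths in this note, assuming both fields take standard Kubernetes label selectors; the kueue-job label key is hypothetical. Check it against the Configuration API reference for your Kueue version before relying on it:

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "pod"
  podOptions:
    # Legacy way of exempting system namespaces from management.
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: NotIn
        values: [ kube-system, kueue-system ]
    # Assumed example: only manage Pods carrying this hypothetical label.
    podSelector:
      matchLabels:
        kueue-job: "true"
```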
Kueue runs webhooks for all created pods if the pod integration is enabled. The webhook namespaceSelector can be used to filter the pods to reconcile. The default webhook namespaceSelector is:
matchExpressions:
- key: kubernetes.io/metadata.name
  operator: NotIn
  values: [ kube-system, kueue-system ]
When you install Kueue via Helm, the webhook namespace selector will match the integrations.podOptions.namespaceSelector in the values.yaml.
Make sure that namespaceSelector never matches the kueue namespace; otherwise the Kueue deployment won’t be able to create Pods.
Pods that belong to other API resources managed by Kueue are excluded from being queued by the pod integration. For example, pods managed by batch/v1.Job won’t be managed by the pod integration.
Check Administer cluster quotas for details on the initial Kueue setup.
Running a single Pod admitted by Kueue
When running Pods on Kueue, take into consideration the following aspects:
a. Queue selection
The target local queue should be specified in the metadata.labels section of the Pod configuration.
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
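The label refers to a LocalQueue, which is expected to exist in the namespace where the Pod is created. A minimal sketch of such a queue follows, assuming the default namespace and an already-created ClusterQueue called cluster-queue (both names are assumptions, not taken from this page):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default   # assumed; should be the namespace the Pod is created in
  name: user-queue     # the value used in the kueue.x-k8s.io/queue-name label
spec:
  clusterQueue: cluster-queue  # assumed name of an existing ClusterQueue
```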
b. Configure the resource needs
The resource needs of the workload can be configured in spec.containers.
- resources:
    requests:
      cpu: 3
c. The “managed” label
Kueue will inject the kueue.x-k8s.io/managed=true label to indicate which pods are managed by it.
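As an illustration of the point above, the labels of a Pod that Kueue manages would be expected to look roughly like this once the label has been injected. This is a sketch of the resulting metadata only, not output captured from a cluster:

```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue  # set by the user
    kueue.x-k8s.io/managed: "true"         # injected by Kueue
```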
d. Limitations
- A Kueue managed Pod cannot be created in the kube-system or kueue-system namespaces.
- In case of preemption, the Pod will be terminated and deleted.
Example Pod
Here is a sample Pod that just sleeps for a few seconds:
apiVersion: v1
kind: Pod
metadata:
  generateName: kueue-sleep-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
  - name: sleep
    image: busybox
    command:
    - sleep
    args:
    - 3s
    resources:
      requests:
        cpu: 3
  restartPolicy: OnFailure
You can create the Pod using the following command:
# Create the pod
kubectl create -f kueue-pod.yaml
Running a group of Pods to be admitted together
To run a set of Pods as a single unit, called a Pod group, add the “pod-group-name” label and the “pod-group-total-count” annotation to all members of the group, using the same values on every Pod:
metadata:
  labels:
    kueue.x-k8s.io/pod-group-name: "group-name"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
Feature limitations
Kueue provides only the minimal functionality required to run Pod groups, intended for environments where the Pods are managed directly by external controllers, without a Job-level CRD.
As a consequence of this design decision, Kueue does not re-implement core functionalities that are available in the Kubernetes Job API, such as advanced retry policies. In particular, Kueue does not re-create failed Pods.
This design choice affects preemption. When Kueue needs to preempt a workload that represents a Pod group, it sends delete requests for all of the Pods in the group. It is the responsibility of the user or controller that created the original Pods to create replacement Pods.
Note
We recommend using the Kubernetes Job API or similar CRDs such as JobSet, MPIJob, RayJob (see more here).
Termination
Kueue considers a Pod group as successful, and marks the associated Workload as finished, when the number of succeeded Pods equals the Pod group size.
If a Pod group is not successful, there are two ways to terminate its execution and free the reserved resources:
- Issue a Delete request for the Workload object. Kueue will terminate all remaining Pods.
- Set the kueue.x-k8s.io/retriable-in-group: false annotation on at least one Pod in the group (can be a replacement Pod). Kueue will mark the workload as finished once all Pods are terminated.
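The second option can be sketched as the following metadata fragment for one Pod of the group. It reuses the sample-group name from the Pod group example on this page, purely for illustration. Kubernetes annotation values must be strings, so the value is quoted here:

```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
    kueue.x-k8s.io/retriable-in-group: "false"  # quoted: annotation values are strings
```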
Example Pod group
Here is a sample Pod group that just sleeps for a few seconds:
---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-leader-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sh", "-c", 'echo "hello world from the leader pod" && sleep 3']
    resources:
      requests:
        cpu: 3
---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-worker-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sh", "-c", 'echo "hello world from the worker pod" && sleep 2']
    resources:
      requests:
        cpu: 3
You can create the Pod group using the following command:
kubectl create -f kueue-pod-group.yaml
The name of the associated Workload created by Kueue equals the name of the Pod group. In this example it is sample-group. You can inspect the workload using:
kubectl describe workload/sample-group