Set Up Dynamic Resource Allocation
This page shows you how to configure Kueue to account for DRA devices in quota management.
This page is intended for batch administrators.
For conceptual details, see Dynamic Resource Allocation concepts. For instructions on submitting workloads with DRA devices, see Run Workloads With DRA Devices.
Before you begin
Make sure the following conditions are met:
- Your Kubernetes cluster runs version 1.34 or later.
- A DRA driver is installed in the cluster (e.g., dra-example-driver for testing, or a vendor driver like NVIDIA k8s-dra-driver-gpu for production).
- Kueue is installed.
Warning
In Kueue 0.18, the DRA feature gates were renamed to avoid conflicts with upstream Kubernetes feature gates: DynamicResourceAllocation is now KueueDRAIntegration, and DRAExtendedResources is now KueueDRAIntegrationExtendedResource.
Choose a quota accounting path
Kueue supports two paths for accounting DRA devices in quota. Choose the one that matches how your users submit workloads:
| Path | User’s Pod spec | Kueue feature gate | Admin configuration |
|---|---|---|---|
| ResourceClaimTemplate | References a ResourceClaimTemplate | KueueDRAIntegration | deviceClassMappings required |
| Extended resource | Uses resources.requests (e.g., nvidia.com/gpu: 1) | KueueDRAIntegration + KueueDRAIntegrationExtendedResource | No mapping needed |
Set up the ResourceClaimTemplate path
Use this path when your users submit workloads that explicitly reference
ResourceClaimTemplate objects.
1. Configure deviceClassMappings
Add a deviceClassMappings entry to the Kueue Configuration that maps each
DeviceClass to a logical resource name for quota:
apiVersion: config.kueue.x-k8s.io/v1beta2
kind: Configuration
resources:
  deviceClassMappings:
  - name: example.com/gpu # Logical resource name for quota
    deviceClassNames:
    - gpu.example.com # DeviceClass name(s)
- name: The resource name used in ClusterQueue quotas and Workload status.
- deviceClassNames: One or more DeviceClass names that map to this resource.
Multiple device classes can map to the same logical resource name. For example, if you have separate device classes for different GPU models but want a single quota pool:
resources:
  deviceClassMappings:
  - name: example.com/gpu
    deviceClassNames:
    - gpu-a100.example.com
    - gpu-h100.example.com
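One way to picture this many-to-one mapping is as a reverse lookup from DeviceClass name to logical resource name. The Python sketch below only illustrates that idea and is not Kueue's implementation; `build_lookup` is a hypothetical helper, and the dictionary simply mirrors the configuration fragment above.

```python
# Illustrative model only; Kueue's internal data structures may differ.
device_class_mappings = [
    {
        "name": "example.com/gpu",
        "deviceClassNames": ["gpu-a100.example.com", "gpu-h100.example.com"],
    },
]


def build_lookup(mappings):
    """Invert deviceClassMappings into {DeviceClass name: logical resource name}."""
    lookup = {}
    for mapping in mappings:
        for device_class in mapping["deviceClassNames"]:
            lookup[device_class] = mapping["name"]
    return lookup


lookup = build_lookup(device_class_mappings)
# Both GPU models resolve to the same quota pool.
print(lookup["gpu-a100.example.com"])
print(lookup["gpu-h100.example.com"])
```

Both lookups return `example.com/gpu`, which is the single name a ClusterQueue would need to cover.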
2. Add the DRA resource to your ClusterQueue
Include the logical resource name from deviceClassMappings in the
coveredResources of your ClusterQueue:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory", "example.com/gpu"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 40
      - name: "memory"
        nominalQuota: 200Gi
      - name: "example.com/gpu"
        nominalQuota: 8
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "user-queue"
spec:
  clusterQueue: "cluster-queue"
kubectl apply -f https://kueue.sigs.k8s.io/examples/dra/sample-dra-queues.yaml
The example.com/gpu resource in the ClusterQueue corresponds to the name
field in deviceClassMappings. Each device request referencing a mapped
DeviceClass consumes count units of this quota (default 1 when omitted).
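The accounting rule above can be sketched in a few lines of Python. This is a simplified model for a single pod, not Kueue's code: `quota_charge`, `LOOKUP`, and the request names `train` and `eval` are hypothetical, and the sketch does not describe how Kueue treats a DeviceClass that has no mapping (here it simply raises `KeyError`).

```python
from collections import Counter

# Hypothetical input: DeviceClass name -> logical resource name, as it would
# be configured in deviceClassMappings.
LOOKUP = {"gpu.example.com": "example.com/gpu"}


def quota_charge(device_requests, lookup):
    """Sum `count` per logical resource; a missing count is treated as 1."""
    usage = Counter()
    for request in device_requests:
        resource = lookup[request["deviceClassName"]]  # KeyError if unmapped
        usage[resource] += request.get("count", 1)
    return dict(usage)


requests = [
    {"name": "train", "deviceClassName": "gpu.example.com", "count": 2},
    {"name": "eval", "deviceClassName": "gpu.example.com"},  # count omitted
]
charge = quota_charge(requests, LOOKUP)
print(charge)  # {'example.com/gpu': 3}
print(charge["example.com/gpu"] <= 8)  # True: fits an otherwise unused quota of 8
```

Two requests against the same mapped DeviceClass add up to 3 units of `example.com/gpu`, which is then compared with the `nominalQuota` of 8 from the sample ClusterQueue.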
Set up the extended resource path
Use this path when your users submit workloads using the standard
resources.requests syntax (e.g., nvidia.com/gpu: 1) and a DeviceClass
with spec.extendedResourceName exists in the cluster.
1. Enable the feature gates
Install or reconfigure Kueue with both feature gates enabled:
apiVersion: config.kueue.x-k8s.io/v1beta2
kind: Configuration
featureGates:
  KueueDRAIntegration: true
  KueueDRAIntegrationExtendedResource: true
The Kubernetes cluster also needs the DRAExtendedResource feature gate
enabled on kube-apiserver and kube-scheduler (beta in Kubernetes 1.36).
2. Verify the DeviceClass
Ensure the DeviceClass has spec.extendedResourceName set. This is
typically configured by the DRA driver or cluster administrator:
kubectl get deviceclass gpu.example.com -o jsonpath='{.spec.extendedResourceName}'
If you need to create or update the DeviceClass:
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: gpu.example.com
spec:
  extendedResourceName: example.com/gpu
  selectors:
  - cel:
      expression: device.driver == "gpu.example.com"
No deviceClassMappings configuration is needed for this path. Kueue
auto-discovers the mapping by indexing DeviceClass objects.
3. Add the extended resource to your ClusterQueue
The coveredResources must include the extended resource name that matches
spec.extendedResourceName on the DeviceClass:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory", "example.com/gpu"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 40
      - name: "memory"
        nominalQuota: 200Gi
      - name: "example.com/gpu"
        nominalQuota: 8
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "user-queue"
spec:
  clusterQueue: "cluster-queue"
kubectl apply -f https://kueue.sigs.k8s.io/examples/dra/sample-dra-queues.yaml
Why this path exists
Without the KueueDRAIntegrationExtendedResource feature gate, Kueue charges quota for both
the resources.requests entry and the auto-created ResourceClaim, double
counting the same device. With the feature gate enabled, Kueue detects the
matching DeviceClass and charges quota only for the extended resource.
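The difference can be shown with a small arithmetic sketch. It is a deliberately simplified model of the behavior described above, not Kueue's logic: `charged_units` is a hypothetical function, and it assumes both charges are the same size and can be added into one number.

```python
def charged_units(requested, gate_enabled):
    """Units charged for `requested` devices asked for via resources.requests.

    Simplified model: without the gate the same devices are counted twice,
    once for the resources.requests entry and once for the auto-created
    ResourceClaim. With the gate, only the extended resource is charged.
    """
    from_requests = requested
    from_auto_claim = requested
    if gate_enabled:
        return from_requests
    return from_requests + from_auto_claim


print(charged_units(1, gate_enabled=False))  # 2
print(charged_units(1, gate_enabled=True))   # 1
```

In this model a pod requesting one device is charged two units without the gate and one unit with it.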
Set up counter-based quota (partitionable devices)
Use this when your cluster has partitionable devices and you want quota to
reflect actual device capacity rather than device count. This requires
Kubernetes 1.35+ with the DRAPartitionableDevices feature gate enabled
and a DRA driver that publishes consumesCounters in ResourceSlice objects.
1. Enable the feature gate and configure counter sources
Install or reconfigure Kueue with the KueueDRAIntegrationPartitionableDevices
feature gate enabled and a sources entry in deviceClassMappings. Follow the
custom configuration installation instructions.
apiVersion: config.kueue.x-k8s.io/v1beta2
kind: Configuration
featureGates:
  KueueDRAIntegrationPartitionableDevices: true
resources:
  deviceClassMappings:
  - name: gpu.memory
    deviceClassNames:
    - gpu.example.com
    sources:
    - counter:
        name: memory
        driver: gpu.example.com
        deviceSelector:
          cel:
            expression: "device.driver == 'gpu.example.com'"
The sources[].counter.name must match a counter key published by your DRA
driver in ResourceSlice devices. You can inspect these with:
kubectl get resourceslices -o jsonpath='{range .items[*]}{.spec.driver}{"\t"}{range .spec.devices[*]}{.name}: {.consumesCounters}{"\n"}{end}{end}'
The output is similar to the following:
gpu.example.com gpu-0: [{"counterSet":"shared","counters":{"memory":{"value":"10Gi"}}}]
2. Add the counter resource to your ClusterQueue
Set the quota in counter units instead of device count:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory", "gpu.memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 40
      - name: "memory"
        nominalQuota: 200Gi
      - name: "gpu.memory"
        nominalQuota: 800Gi
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "user-queue"
spec:
  clusterQueue: "cluster-queue"
kubectl apply -f https://kueue.sigs.k8s.io/examples/dra/sample-dra-counter-queues.yaml
3. Verify counter-based quota is working
Submit a test workload:
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  namespace: default
  name: gpu-partition
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.example.com
          count: 1
---
apiVersion: batch/v1
kind: Job
metadata:
  generateName: sample-dra-counter-job-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  template:
    spec:
      containers:
      - name: dummy-job
        image: registry.k8s.io/e2e-test-images/agnhost:2.53
        args: ["pause"]
        resources:
          claims:
          - name: gpu
          requests:
            cpu: "1"
            memory: "200Mi"
      resourceClaims:
      - name: gpu
        resourceClaimTemplateName: gpu-partition
      restartPolicy: Never
kubectl create -f https://kueue.sigs.k8s.io/examples/dra/sample-dra-counter-job.yaml
Check the workload’s resourceUsage to confirm quota was charged by
counter value:
kubectl -n default get workloads.kueue.x-k8s.io -o jsonpath='{range .items[*]}{.metadata.name}: {.status.admission.podSetAssignments[0].resourceUsage}{"\n"}{end}'
The output is similar to the following:
job-sample-dra-counter-job-xxxxx: {"gpu.memory":"85899345920"}
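The reported value is easier to compare with the ClusterQueue quota once it is converted to the same unit. The Python sketch below assumes the value is a plain integer byte count, as in the sample output, and that no other workload is using the 800Gi `gpu.memory` quota from the sample ClusterQueue; `to_gib` is a hypothetical helper.

```python
GIB = 2**30  # bytes per Gi


def to_gib(raw_bytes):
    """Convert a resourceUsage byte string to Gi."""
    return int(raw_bytes) / GIB


used = to_gib("85899345920")
print(used)        # 80.0
print(800 - used)  # 720.0
```

The sample workload is therefore charged 80Gi of `gpu.memory`, leaving 720Gi under the stated assumptions.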
Troubleshooting counter-based quota
Workload rejected with “insufficient matching devices”: Kueue could not
find enough devices matching the deviceSelector CEL expression. This can
happen if ResourceSlice objects are not yet populated (e.g., during driver
startup or node registration). Verify that ResourceSlices exist and contain
devices matching your selector.
Workload rejected with “no consumesCounters entry for counter”: The
devices in ResourceSlice objects do not have a consumesCounters entry
matching the name configured in sources[].counter.name. Verify the
counter name matches what your DRA driver publishes (see step 1).
Kueue fails to start with “CEL compilation failed”: The deviceSelector
CEL expression has a syntax or type error. Check the expression against the
DRA CEL environment.
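For the missing-counter case, it can help to list which devices do not publish the configured counter. The following Python sketch works on data in the shape of `kubectl get resourceslices -o json`, reduced to the fields used by the inspection command in step 1. The sample data, including the device `gpu-1` and the counter name `vram`, is made up for illustration, and `devices_missing_counter` is a hypothetical helper.

```python
import json

# Hypothetical input, trimmed to the fields used here.
SAMPLE = json.loads("""
{"items": [{"spec": {"driver": "gpu.example.com", "devices": [
  {"name": "gpu-0", "consumesCounters": [
    {"counterSet": "shared", "counters": {"memory": {"value": "10Gi"}}}]},
  {"name": "gpu-1"}
]}}]}
""")


def devices_missing_counter(slice_list, driver, counter_name):
    """Return names of the driver's devices that do not publish counter_name."""
    missing = []
    for resource_slice in slice_list.get("items", []):
        spec = resource_slice.get("spec", {})
        if spec.get("driver") != driver:
            continue
        for device in spec.get("devices", []):
            published = set()
            for entry in device.get("consumesCounters", []):
                published.update(entry.get("counters", {}))
            if counter_name not in published:
                missing.append(device["name"])
    return missing


print(devices_missing_counter(SAMPLE, "gpu.example.com", "memory"))  # ['gpu-1']
print(devices_missing_counter(SAMPLE, "gpu.example.com", "vram"))    # ['gpu-0', 'gpu-1']
```

An empty result for the name configured in sources[].counter.name suggests the counter name itself is not the problem.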
Path separation
The two paths are independent. Do not configure the same DeviceClass in
both paths for the same workload. If overlap occurs, Kueue merges the
resources using the deviceClassMappings logical name as the quota key,
which may result in incorrect quota accounting.
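A conservative pre-flight check is to look for DeviceClass names that are reachable through both paths. The Python sketch below is an assumption-laden illustration rather than a Kueue feature: `overlapping_device_classes` is a hypothetical helper, `nic.example.com` is a made-up DeviceClass, and a flagged class is not necessarily misconfigured, since the warning above concerns the same workload using both paths.

```python
def overlapping_device_classes(mappings, device_classes):
    """Return DeviceClass names that appear in deviceClassMappings and also
    set spec.extendedResourceName."""
    mapped = {
        name
        for mapping in mappings
        for name in mapping.get("deviceClassNames", [])
    }
    extended = {
        device_class["metadata"]["name"]
        for device_class in device_classes
        if device_class.get("spec", {}).get("extendedResourceName")
    }
    return sorted(mapped & extended)


# Hypothetical inputs: the Kueue mapping and two DeviceClass objects.
mappings = [{"name": "example.com/gpu", "deviceClassNames": ["gpu.example.com"]}]
device_classes = [
    {"metadata": {"name": "gpu.example.com"},
     "spec": {"extendedResourceName": "example.com/gpu"}},
    {"metadata": {"name": "nic.example.com"}, "spec": {}},
]
print(overlapping_device_classes(mappings, device_classes))  # ['gpu.example.com']
```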
Recommended: enable WaitForPodsReady
There is a timing gap between Kueue admitting a workload and the kube-scheduler allocating the actual device. If the cluster state changes between these two steps, the scheduler may fail to allocate. Enabling WaitForPodsReady provides a safety net by evicting workloads that fail to become ready within a configured timeout, allowing them to be re-queued and retried.
MultiKueue considerations
DRA workloads are supported with MultiKueue.
MultiKueue syncs the workload and its owning job to worker clusters, but
ResourceClaimTemplate and DeviceClass objects are not automatically
synced. These must be created on each worker cluster separately.
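Creating the objects on every worker cluster can be scripted. The Python sketch below only builds and prints the `kubectl apply` commands so they can be reviewed before running. The kubeconfig context names and manifest file names are hypothetical placeholders, and `apply_commands` is a hypothetical helper.

```python
import shlex


def apply_commands(contexts, manifests):
    """Build one `kubectl apply` command per worker context and manifest."""
    return [
        ["kubectl", "--context", context, "apply", "-f", manifest]
        for context in contexts
        for manifest in manifests
    ]


# Hypothetical kubeconfig context names and manifest file names.
worker_contexts = ["worker-1", "worker-2"]
manifests = ["deviceclass.yaml", "resourceclaimtemplate.yaml"]

commands = apply_commands(worker_contexts, manifests)
for command in commands:
    print(shlex.join(command))
```

Printing rather than executing keeps the sketch side-effect free; each printed line can be run as-is or passed to `subprocess.run`.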