Dynamic Resource Allocation

Quota management for workloads using Kubernetes Dynamic Resource Allocation (DRA).

Dynamic Resource Allocation (DRA) is a Kubernetes API for requesting and managing hardware devices such as GPUs, FPGAs, and network adapters. Kueue can account for DRA devices in quota management through two paths:

  1. ResourceClaimTemplate path: Pods explicitly reference a ResourceClaimTemplate that specifies a device request. Kueue maps each DeviceClass referenced by the claim to a logical resource name using deviceClassMappings in the Kueue Configuration.

  2. Extended resource path: Pods request DRA devices using the traditional resources.requests syntax (e.g., nvidia.com/gpu: 1). When the Kubernetes DeviceClass has an extendedResourceName field set (KEP-5004), the kube-scheduler automatically creates ResourceClaim objects from these requests. With the KueueDRAIntegrationExtendedResource feature gate enabled, Kueue detects this and avoids double counting.

Which path should I use? If your workloads already use resources.requests for devices (e.g., nvidia.com/gpu: 1), use the extended resource path. If your workloads explicitly create ResourceClaimTemplate objects, use the ResourceClaimTemplate path.

How the ResourceClaimTemplate path works

Feature state: beta since Kueue v0.18

When a Pod references a ResourceClaimTemplate, Kueue reads the deviceClassName from the template’s exactly field and looks it up in deviceClassMappings. This mapping tells Kueue which logical resource name to charge quota against. The number of units charged is determined by the count field in the device request (default 1).
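The lookup and counting described above can be sketched in a few lines of Python. This is an illustrative model, not Kueue's implementation: the function names are hypothetical, and the mapping is assumed to be a simplified list of entries with a logical resource name and the DeviceClass names it covers.

```python
from collections import Counter

# Assumed, simplified shape of deviceClassMappings for this sketch.
DEVICE_CLASS_MAPPINGS = [
    {"name": "example.com/gpu", "deviceClassNames": ["gpu.example.com"]},
]


def logical_resource_for(device_class_name, mappings):
    """Return the logical resource name covering a DeviceClass, or None."""
    for mapping in mappings:
        if device_class_name in mapping["deviceClassNames"]:
            return mapping["name"]
    return None


def charge_for_template(device_requests, mappings):
    """Sum the quota charges for the device requests of one template."""
    charges = Counter()
    for request in device_requests:
        exactly = request.get("exactly")
        if exactly is None:
            raise ValueError("only requests using 'exactly' are handled here")
        if exactly.get("allocationMode", "ExactCount") != "ExactCount":
            raise ValueError("only the ExactCount allocation mode is supported")
        device_class = exactly["deviceClassName"]
        resource = logical_resource_for(device_class, mappings)
        if resource is None:
            raise ValueError(f"no mapping for DeviceClass {device_class!r}")
        charges[resource] += exactly.get("count", 1)  # count defaults to 1
    return dict(charges)


requests = [
    {"name": "train", "exactly": {"deviceClassName": "gpu.example.com", "count": 2}},
    {"name": "eval", "exactly": {"deviceClassName": "gpu.example.com"}},
]
print(charge_for_template(requests, DEVICE_CLASS_MAPPINGS))
```

Two requests against the same DeviceClass are summed under one logical resource name, so the quota key is the mapping name rather than the DeviceClass name.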

Only the ExactCount allocation mode is supported. The All allocation mode is not supported.

For setup instructions, see Set Up Dynamic Resource Allocation.

How the extended resource path works

Feature state: alpha since Kueue v0.18

When a Pod requests an extended resource backed by DRA (e.g., nvidia.com/gpu: 1), the kube-scheduler auto-creates a ResourceClaim. Without the KueueDRAIntegrationExtendedResource feature gate enabled, Kueue would charge quota for both the resources.requests entry and the auto-created claim, double counting the same device.

With KueueDRAIntegrationExtendedResource enabled, Kueue detects the matching DeviceClass, uses extendedResourceName as the quota key, and drops the auto-created claim from accounting. No deviceClassMappings configuration is needed, because the mapping is discovered from the DeviceClass automatically.
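The effect of the feature gate on accounting can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the function name, the entry tuples, and the flat dictionary from DeviceClass name to extendedResourceName are all hypothetical and do not mirror Kueue's internal types.

```python
def accounted_entries(pod_requests, auto_claims, extended_names, gate_enabled):
    """List the entries that would be charged for one Pod.

    pod_requests:   resources.requests, e.g. {"nvidia.com/gpu": 1}
    auto_claims:    claims the kube-scheduler created from those requests,
                    each {"deviceClassName": str, "count": int}
    extended_names: DeviceClass name -> extendedResourceName (assumed lookup)
    """
    entries = [("resources.requests", name, qty) for name, qty in pod_requests.items()]
    for claim in auto_claims:
        ext_name = extended_names.get(claim["deviceClassName"])
        backs_request = ext_name is not None and ext_name in pod_requests
        if gate_enabled and backs_request:
            # The device is already charged under extendedResourceName,
            # so the auto-created claim is dropped from accounting.
            continue
        entries.append(("auto-created claim", claim["deviceClassName"], claim["count"]))
    return entries


pod_requests = {"nvidia.com/gpu": 1}
auto_claims = [{"deviceClassName": "gpu.nvidia.com", "count": 1}]
extended_names = {"gpu.nvidia.com": "nvidia.com/gpu"}

print(accounted_entries(pod_requests, auto_claims, extended_names, gate_enabled=False))
print(accounted_entries(pod_requests, auto_claims, extended_names, gate_enabled=True))
```

With the gate off, the same device appears twice in the result; with the gate on, only the resources.requests entry remains.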

Path separation

The two paths are independent:

  • ResourceClaimTemplate path: uses deviceClassMappings configuration.
  • Extended resource path: uses auto-discovery from DeviceClass objects.

Do not configure the same DeviceClass in both paths for the same workload. If overlap occurs, Kueue merges the resources using the deviceClassMappings logical name as the quota key, which may result in incorrect quota accounting.

Quota accounting

DRA resources are tracked in ClusterQueue quotas just like CPU or memory. The administrator includes the DRA resource name in coveredResources and sets a nominalQuota. When a workload is admitted, Kueue charges quota based on the count value in each device request (default 1 when omitted).

For example, a ClusterQueue with a nominalQuota of 8 for example.com/gpu allows up to 8 concurrent device allocations across all workloads using that queue.
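As a rough illustration of this accounting, the following Python sketch charges device counts against a nominal quota. It assumes a single queue and deliberately ignores cohorts, borrowing, and preemption; the function name and the dictionary shapes are hypothetical.

```python
def try_admit(nominal_quota, usage, request):
    """Charge `request` against `usage` if it fits; return True when admitted."""
    for resource, quantity in request.items():
        if resource not in nominal_quota:
            return False  # resource is not covered by this queue
        if usage.get(resource, 0) + quantity > nominal_quota[resource]:
            return False  # admitting would exceed the nominal quota
    for resource, quantity in request.items():
        usage[resource] = usage.get(resource, 0) + quantity
    return True


nominal_quota = {"example.com/gpu": 8}
usage = {}

print(try_admit(nominal_quota, usage, {"example.com/gpu": 5}))
print(try_admit(nominal_quota, usage, {"example.com/gpu": 3}))
print(try_admit(nominal_quota, usage, {"example.com/gpu": 1}))
```

The first two workloads fill the quota of 8, so the third request is refused until usage is released.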

Admission and scheduling gap

There is a timing gap between Kueue admitting a workload (quota check) and the kube-scheduler allocating the actual device. Kueue does not know which specific device will be allocated; it only verifies that quota is available.

If the cluster state changes between these two steps (e.g., another system consumes the device), the scheduler may fail to allocate. The WaitForPodsReady feature provides a safety net by evicting workloads that fail to become ready within a configured timeout.
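The safety net amounts to a deadline check after admission. The Python sketch below shows the idea only; the function name and parameters are hypothetical, and the real feature has additional behavior that is not modeled here.

```python
from datetime import datetime, timedelta


def should_evict(admitted_at, now, all_pods_ready, timeout):
    """Return True when an admitted workload has exceeded its readiness timeout."""
    if all_pods_ready:
        return False
    return now - admitted_at > timeout


admitted_at = datetime(2025, 1, 1, 12, 0, 0)
timeout = timedelta(minutes=5)

# Device allocation never succeeded, so the Pods are still not ready.
print(should_evict(admitted_at, admitted_at + timedelta(minutes=6), False, timeout))
```

A workload whose device allocation fails keeps holding quota until such a check fires, which is why the timeout bounds how long quota can be held by a workload that cannot run.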

MultiKueue

DRA workloads are supported with MultiKueue. MultiKueue syncs the workload and its owning job to worker clusters, but ResourceClaimTemplate and DeviceClass objects are not automatically synced. These must be created on each worker cluster separately by the cluster administrator.

Counter-based quota for partitionable devices

Feature state: alpha since Kueue v0.18

By default, Kueue tracks DRA quota by device count: each device request charges count units regardless of the device’s capacity. This means a small GPU partition and a full GPU both count as “1 device”, which does not reflect the actual resource consumption.

With the KueueDRAIntegrationPartitionableDevices feature gate enabled, Kueue can track quota using counter values published by DRA drivers in ResourceSlice objects. This allows quota to reflect actual device capacity (e.g., GPU memory) rather than device count.

A DeviceClass uses either device-count quota (no sources configured) or counter-based quota (with sources), not both. Kueue rejects configurations that map the same DeviceClass to multiple resource names.
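The uniqueness rule can be checked with a short validation pass. The Python sketch below is illustrative: the function name is hypothetical, and the mapping entries are assumed to carry a logical resource name and a list of DeviceClass names.

```python
def duplicate_device_classes(mappings):
    """Return DeviceClass names mapped to more than one resource name."""
    first_owner = {}
    duplicates = {}
    for mapping in mappings:
        for device_class in mapping["deviceClassNames"]:
            owner = first_owner.setdefault(device_class, mapping["name"])
            if owner != mapping["name"]:
                # Record every resource name that claims this DeviceClass.
                duplicates.setdefault(device_class, {owner}).add(mapping["name"])
    return {dc: sorted(names) for dc, names in duplicates.items()}


mappings = [
    {"name": "example.com/gpu", "deviceClassNames": ["gpu.example.com"]},
    {"name": "example.com/gpu-memory", "deviceClassNames": ["gpu.example.com"]},
]
print(duplicate_device_classes(mappings))
```

An empty result means each DeviceClass has exactly one quota key; any entry in the result is a configuration that should be rejected.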

How it works

  1. The administrator configures a sources entry in deviceClassMappings that specifies which counter to track, which DRA driver to query, and a CEL expression to scope eligible devices.

  2. When a workload is submitted, Kueue reads the consumesCounters field from the matching devices in ResourceSlice objects to determine the actual counter charge.

  3. Kueue uses conservative charging: it takes the maximum consumesCounters value across all matched devices and multiplies it by the request count. This ensures quota is not undercharged when different devices consume different amounts.

  4. The ClusterQueue quota is set in counter units (e.g., 800Gi for GPU memory) instead of device count.
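The conservative charging rule in step 3 reduces to a maximum followed by a multiplication. The Python sketch below assumes a flattened view of each matched device, with consumesCounters as a plain dictionary of integer values in Gi; the function name and the counter name are hypothetical, and the real ResourceSlice structure is more nested than this.

```python
def counter_charge(matched_devices, counter_name, request_count=1):
    """Charge max(consumesCounters[counter_name]) * request_count."""
    values = [
        device["consumesCounters"][counter_name]
        for device in matched_devices
        if counter_name in device.get("consumesCounters", {})
    ]
    if not values:
        raise ValueError(f"no matched device publishes counter {counter_name!r}")
    return max(values) * request_count


# Three partitions of different sizes match the request (values in Gi).
matched = [
    {"name": "gpu-0-small", "consumesCounters": {"memory": 10}},
    {"name": "gpu-0-medium", "consumesCounters": {"memory": 20}},
    {"name": "gpu-1-large", "consumesCounters": {"memory": 40}},
]

# A request for two devices is charged as if both were the largest match.
print(counter_charge(matched, "memory", request_count=2))
```

Charging the maximum can overcharge a workload that ends up on a smaller partition, but it never undercharges, which keeps admitted usage within the quota regardless of which device the scheduler picks.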

Prerequisites

  • Kubernetes 1.35 or later with the DRAPartitionableDevices feature gate enabled (beta in Kubernetes 1.36).
  • A DRA driver that publishes consumesCounters on devices in ResourceSlice objects.
  • The KueueDRAIntegrationPartitionableDevices Kueue feature gate enabled.

For setup instructions, see Set Up Dynamic Resource Allocation.

Limitations

The following limitations apply:

  • ResourceClaimTemplates only: Only ResourceClaimTemplate references are supported. Direct ResourceClaim references in the Pod spec are not supported and will result in inadmissible workloads.
  • ExactCount allocation mode only: Only device requests using exactly are supported. FirstAvailable device selection and the All allocation mode are not supported.
  • No device constraints or config: Device constraints (MatchAttribute) and per-request config are not supported.
  • No AdminAccess: Device requests with adminAccess: true are not supported.
  • No DRA + Topology Aware Scheduling (TAS): DRA resources are not accounted for in TAS capacity calculations. Using both features together may result in incorrect topology assignments for DRA devices.
  • No support for DRADeviceTaints or DRAPrioritizedLists: These Kubernetes DRA features are not factored into Kueue’s quota decisions.
  • No GPU time-slicing or MPS: Software-based GPU sharing mechanisms are not yet supported in Kueue. Supporting them requires integration with KEP-5075 (Consumable Capacity, beta in Kubernetes 1.36) and KEP-5691 (Restricted Sharing), which is planned for a future Kueue release.
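Several of these limitations can be checked before submitting a workload. The Python sketch below is a hypothetical pre-flight check, not part of Kueue: it takes the Pod's resourceClaims list and the devices section of a claim spec as plain dictionaries, and it covers only the limitations that are visible in those two objects.

```python
def unsupported_reasons(pod_resource_claims, claim_devices):
    """Collect reasons why a DRA request falls outside the supported subset."""
    reasons = []
    for ref in pod_resource_claims:
        if "resourceClaimName" in ref:
            reasons.append(f"{ref.get('name', '<unnamed>')}: direct ResourceClaim reference")
    if claim_devices.get("constraints"):
        reasons.append("device constraints are set")
    if claim_devices.get("config"):
        reasons.append("device config is set")
    for request in claim_devices.get("requests", []):
        name = request.get("name", "<unnamed>")
        if "firstAvailable" in request:
            reasons.append(f"{name}: firstAvailable device selection")
            continue
        exactly = request.get("exactly", {})
        if exactly.get("allocationMode", "ExactCount") != "ExactCount":
            reasons.append(f"{name}: allocation mode is not ExactCount")
        if exactly.get("adminAccess"):
            reasons.append(f"{name}: adminAccess is enabled")
    return reasons


pod_claims = [{"name": "gpu", "resourceClaimTemplateName": "single-gpu"}]
devices = {
    "requests": [
        {"name": "gpu", "exactly": {"deviceClassName": "gpu.example.com", "count": 1}},
    ],
}
print(unsupported_reasons(pod_claims, devices))
```

An empty list means nothing in the inspected objects conflicts with the limitations above; it does not cover interactions such as Topology Aware Scheduling or device taints.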