Share Quotas Across Resource Flavors

Use resource transformations to enforce a shared quota across multiple resource flavors

This page shows how to use resource transformations to enforce a single quota that spans multiple resource flavors.

A common use case is when you have multiple hardware types (for example, different CPU architectures or GPU models) but want to limit the total usage per team regardless of which flavor they consume. By defining a virtual “credits” resource, you can assign a cost to each flavor and enforce a combined quota.

The intended audience for this page is batch administrators.

Before you begin

Make sure the following conditions are met:

  • A Kubernetes cluster is running.
  • The kubectl command-line tool is configured to communicate with your cluster.
  • Kueue is installed.

Use case: limit total CPU usage across flavors

Assume your cluster has two CPU flavors (for example, on-demand and spot nodes) and you want to limit the total CPU usage per team to 14 CPUs, regardless of which flavor they use.

You can accomplish this by:

  1. Defining a virtual resource called cpu_credits
  2. Configuring a resource transformation that generates one unit of cpu_credits per CPU requested
  3. Creating a separate ResourceGroup in the ClusterQueue that limits cpu_credits

1. Configure the resource transformation

Follow the installation instructions for using a custom configuration and extend the Kueue configuration:

apiVersion: config.kueue.x-k8s.io/v1beta2
kind: Configuration
resources:
  transformations:
  - input: cpu
    strategy: Retain
    outputs:
      cpu_credits: 1

This configuration tells Kueue to add one unit of cpu_credits to a Workload's resource requests for every CPU requested. The Retain strategy keeps the original cpu request, so it still counts against the per-flavor CPU quota.
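
To make the effect of the strategy concrete, the following Python sketch models how a transformation rewrites a set of resource requests. It is an illustrative model only, not Kueue's implementation: the function name apply_transformations and the dictionary layout are assumptions made for this example.

```python
from typing import Dict, List


def apply_transformations(requests: Dict[str, int],
                          transformations: List[dict]) -> Dict[str, int]:
    """Illustrative model of a resource transformation (not Kueue's code)."""
    result = dict(requests)
    for t in transformations:
        amount = requests.get(t["input"])
        if amount is None:
            continue  # the workload does not request this input resource
        if t["strategy"] == "Replace":
            # Replace drops the input resource; Retain keeps it.
            result.pop(t["input"], None)
        for output, factor in t["outputs"].items():
            result[output] = result.get(output, 0) + amount * factor
    return result


# Mirrors the configuration above: 1 cpu_credits per CPU, cpu retained.
cpu_to_credits = [
    {"input": "cpu", "strategy": "Retain", "outputs": {"cpu_credits": 1}},
]
transformed = apply_transformations({"cpu": 3}, cpu_to_credits)
print(transformed)  # {'cpu': 3, 'cpu_credits': 3}
```

In this model, switching the strategy to Replace leaves only cpu_credits in the result, so nothing would count against the per-flavor CPU quota.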

2. Create the queuing infrastructure

Create the ResourceFlavors, ClusterQueue, and LocalQueue. You can apply the complete setup at once:

kubectl apply -f https://kueue.sigs.k8s.io/examples/admin/shared-quota-setup.yaml

The manifest contains the following resources, which you can also create individually:

# Example setup for sharing quotas across multiple resource flavors
# using resource transformations with virtual "credits"
#
# This example demonstrates:
# - Two CPU flavors (on-demand and spot)
# - A credits-based quota that limits total CPU usage across both flavors
# - Each CPU requested generates 1 unit of cpu_credits
# - The combined quota is limited to 14 CPUs via the cpu_credits resource
#
# Prerequisites:
# Configure Kueue with the following resource transformation in the Configuration:
#
#   resources:
#     transformations:
#     - input: cpu
#       strategy: Retain
#       outputs:
#         cpu_credits: 1

---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "on-demand"
spec:
  nodeLabels:
    node-type: on-demand
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "spot"
spec:
  nodeLabels:
    node-type: spot
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "credits"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "team-cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  # First ResourceGroup: actual CPU resources per flavor
  - coveredResources: ["cpu"]
    flavors:
    - name: "on-demand"
      resources:
      - name: "cpu"
        nominalQuota: 9
    - name: "spot"
      resources:
      - name: "cpu"
        nominalQuota: 9
  # Second ResourceGroup: virtual credits for combined quota
  - coveredResources: ["cpu_credits"]
    flavors:
    - name: "credits"
      resources:
      - name: cpu_credits
        nominalQuota: 14
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "team-queue"
spec:
  clusterQueue: "team-cluster-queue"

With this configuration:

  • Jobs can use up to 9 CPUs from on-demand nodes
  • Jobs can use up to 9 CPUs from spot nodes
  • The total combined CPU usage is limited to 14 CPUs (enforced via cpu_credits)
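
The three limits above combine as a logical AND. As a rough sketch, the check below treats a usage pattern as acceptable only if it fits within each flavor's CPU quota and within the cpu_credits quota. The helper name fits and the dictionary layout are hypothetical, and the sketch ignores borrowing, cohorts, and preemption.

```python
# Quotas copied from the ClusterQueue above.
FLAVOR_QUOTA = {"on-demand": 9, "spot": 9}
CREDITS_QUOTA = 14


def fits(usage):
    """Return True if per-flavor CPU usage respects every quota."""
    within_flavors = all(
        usage.get(flavor, 0) <= quota for flavor, quota in FLAVOR_QUOTA.items()
    )
    # Each CPU generates 1 cpu_credits, so credits used equals total CPUs.
    within_credits = sum(usage.values()) <= CREDITS_QUOTA
    return within_flavors and within_credits


print(fits({"on-demand": 9, "spot": 5}))   # True: 14 CPUs in total
print(fits({"on-demand": 9, "spot": 6}))   # False: 15 exceeds 14 cpu_credits
print(fits({"on-demand": 10, "spot": 0}))  # False: exceeds on-demand quota
```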

Example workload

Here is an example Job that requests 3 CPUs:

# Sample Job for the shared quota setup
# This job requests 3 CPUs, which will generate 3 cpu_credits
apiVersion: batch/v1
kind: Job
metadata:
  generateName: sample-job-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: team-queue
spec:
  parallelism: 1
  completions: 1
  suspend: true
  template:
    spec:
      containers:
      - name: worker
        image: registry.k8s.io/e2e-test-images/agnhost:2.53
        args: ["sleep", "300"]
        resources:
          requests:
            cpu: "3"
      restartPolicy: Never

You can create the Job using the following command:

kubectl create -f https://kueue.sigs.k8s.io/examples/admin/shared-quota-sample-job.yaml

When this Job is submitted, the Workload will have the following resource requests:

resourceRequests:
- name: main
  resources:
    cpu: 3
    cpu_credits: 3

Observing the quota enforcement

After creating several jobs, you can observe the quota enforcement in action:

kubectl get workloads

Example output when the cpu_credits quota is exhausted:

NAME                        QUEUE        RESERVED IN           ADMITTED   AGE
job-sample-job-abc12-xyz   team-queue   team-cluster-queue    True       4s
job-sample-job-def34-uvw   team-queue   team-cluster-queue    True       3s
job-sample-job-ghi56-rst   team-queue   team-cluster-queue    True       3s
job-sample-job-jkl78-opq   team-queue   team-cluster-queue    True       2s
job-sample-job-mno90-lmn   team-queue                                    2s

The last job is pending because admitting it would exceed the cpu_credits quota: the four admitted jobs already consume 12 of the 14 credits, so only 2 remain, and the job needs 3. You can verify this by describing the workload:

kubectl describe workload job-sample-job-mno90-lmn

The events will show:

Events:
  Type     Reason   Age   From             Message
  ----     ------   ----  ----             -------
  Warning  Pending  10s   kueue-admission  couldn't assign flavors to pod set main: insufficient unused quota for cpu_credits in flavor credits, 1 more needed
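
The arithmetic behind this result can be reproduced with a small simulation. The sketch below submits five 3-CPU jobs one after another. It assumes that flavors are tried in the order they are listed in the ClusterQueue and that every job runs long enough for all five to overlap. The function try_admit and the message strings are hypothetical and only approximate Kueue's behavior.

```python
# Quotas copied from the ClusterQueue in this page.
flavor_quota = {"on-demand": 9, "spot": 9}
credits_quota = 14

# Running usage, updated as jobs are admitted.
used = {"on-demand": 0, "spot": 0, "cpu_credits": 0}


def try_admit(cpu):
    """Admit a job requesting `cpu` CPUs if every quota allows it."""
    shortfall = used["cpu_credits"] + cpu - credits_quota
    if shortfall > 0:
        return f"pending: {shortfall} more cpu_credits needed"
    for flavor, quota in flavor_quota.items():
        if used[flavor] + cpu <= quota:
            used[flavor] += cpu
            used["cpu_credits"] += cpu  # 1 cpu_credits per CPU
            return f"admitted on {flavor}"
    return "pending: no CPU flavor has enough unused quota"


results = [try_admit(3) for _ in range(5)]
for number, result in enumerate(results, start=1):
    print(f"job {number}: {result}")
# job 1: admitted on on-demand
# job 2: admitted on on-demand
# job 3: admitted on on-demand
# job 4: admitted on spot
# job 5: pending: 1 more cpu_credits needed
```

The fifth job is rejected even though the spot flavor still has 6 unused CPUs, which shows the cpu_credits quota acting as the shared cap.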

Extending this pattern

You can extend this pattern to:

Assign different costs to different flavors

Configure transformations that generate different amounts of credits based on the input resource. For example, you can assign different costs to different GPU types to reflect their relative value or pricing:

apiVersion: config.kueue.x-k8s.io/v1beta2
kind: Configuration
resources:
  transformations:
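  # Retain keeps each GPU request, so the per-type quotas in the
  # ClusterQueue still apply alongside the gpu_credits quota.
  - input: nvidia.com/gpu-a100
    strategy: Retain
    outputs:
      gpu_credits: 100
  - input: nvidia.com/gpu-v100
    strategy: Retain
    outputs:
      gpu_credits: 40
  - input: nvidia.com/gpu-t4
    strategy: Retain
    outputs:
      gpu_credits: 10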

With a ClusterQueue configured to limit gpu_credits:

apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "team-cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["nvidia.com/gpu-a100"]
    flavors:
    - name: "a100"
      resources:
      - name: "nvidia.com/gpu-a100"
        nominalQuota: 8
  - coveredResources: ["nvidia.com/gpu-v100"]
    flavors:
    - name: "v100"
      resources:
      - name: "nvidia.com/gpu-v100"
        nominalQuota: 16
  - coveredResources: ["nvidia.com/gpu-t4"]
    flavors:
    - name: "t4"
      resources:
      - name: "nvidia.com/gpu-t4"
        nominalQuota: 32
  - coveredResources: ["gpu_credits"]
    flavors:
    - name: "credits"
      resources:
      - name: gpu_credits
        nominalQuota: 500

This setup allows teams to use any combination of GPU types while staying within their total “budget” of 500 credits. A team could use 5 A100s (500 credits), or 12 V100s (480 credits), or a mix of different types.
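
To check whether a particular mix stays within the budget, you can add up the credits by hand or with a short script. The sketch below uses the per-GPU costs and the 500-credit quota from the example. The helper name credits and the sample mixes are made up for illustration, and the script checks only the gpu_credits budget, not the per-GPU-type quotas.

```python
# Credit cost per GPU, copied from the transformations above.
GPU_COST = {"a100": 100, "v100": 40, "t4": 10}
BUDGET = 500  # nominalQuota for gpu_credits


def credits(mix):
    """Total gpu_credits consumed by a mix of GPUs, keyed by GPU type."""
    return sum(GPU_COST[gpu] * count for gpu, count in mix.items())


mixes = [
    {"a100": 5},
    {"v100": 12},
    {"a100": 2, "v100": 5, "t4": 10},
    {"a100": 4, "v100": 3},
]
for mix in mixes:
    cost = credits(mix)
    verdict = "fits" if cost <= BUDGET else "over budget"
    print(mix, cost, verdict)
# {'a100': 5} 500 fits
# {'v100': 12} 480 fits
# {'a100': 2, 'v100': 5, 't4': 10} 500 fits
# {'a100': 4, 'v100': 3} 520 over budget
```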

Track costs across resource types

Generate credits from multiple input resources (CPU, memory, GPU) to enforce a combined budget.
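
As a rough illustration of a combined budget, the sketch below converts several resources into one credits total. The weights are hypothetical and chosen only for the example, and memory is expressed in GiB for readability. In a real setup, each weight would correspond to the output value of a transformation for that input resource.

```python
# Hypothetical credit weights per unit of each input resource.
WEIGHTS = {"cpu": 2, "memory_gib": 1, "gpu": 50}


def total_credits(requests):
    """Sum the credits generated by every requested resource."""
    return sum(WEIGHTS[name] * amount for name, amount in requests.items())


pod_requests = {"cpu": 4, "memory_gib": 16, "gpu": 1}
print(total_credits(pod_requests))  # 4*2 + 16*1 + 1*50 = 74
```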

Implement monetary budgets

Use credits to approximate the cost of cloud resources and enforce spending limits per team.

For more examples of resource transformations, see: