MultiKueue

Kueue multi-cluster job dispatching.

A MultiKueue setup is composed of a manager cluster and at least one worker cluster.

Cluster Roles

Manager Cluster

The manager’s main responsibilities are:

  • Establish and maintain the connection with the worker clusters.
  • Create and monitor remote objects (workloads or jobs) while keeping the local ones in sync.

The MultiKueue Admission Check Controller runs in the manager cluster and also maintains the Active status of the Admission Checks controlled by MultiKueue.

The quota set for the flavors of a ClusterQueue using MultiKueue controls how many jobs are subject to dispatching at a given point in time. Ideally, the quota in the manager cluster should equal the total quota of the worker clusters. If it is significantly lower, the worker clusters will be underutilized. If it is significantly higher, the manager will dispatch and monitor workloads in the worker clusters that don’t have a chance to be admitted.
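The quota guidance above can be illustrated with a small check. This is a minimal sketch, not part of Kueue: the function name `compare_quota`, the cluster names, and the 10% `tolerance` standing in for "significantly" are all assumptions made for illustration.

```python
from typing import Mapping


def compare_quota(manager_quota: float,
                  worker_quotas: Mapping[str, float],
                  tolerance: float = 0.1) -> str:
    """Classify the manager quota against the summed worker quotas.

    `tolerance` is a hypothetical threshold for "significantly"
    lower or higher; Kueue itself defines no such value.
    """
    total_workers = sum(worker_quotas.values())
    if total_workers == 0:
        # No worker capacity: any manager quota dispatches work
        # that can never be admitted.
        return "balanced" if manager_quota == 0 else "over-dispatching"

    ratio = manager_quota / total_workers
    if ratio < 1 - tolerance:
        # The manager admits fewer jobs than the workers could run.
        return "workers-underutilized"
    if ratio > 1 + tolerance:
        # The manager dispatches workloads the workers cannot admit.
        return "over-dispatching"
    return "balanced"


if __name__ == "__main__":
    workers = {"worker-a": 8, "worker-b": 8}  # hypothetical CPU quotas
    for manager in (4, 16, 40):
        print(manager, compare_quota(manager, workers))
```

The quotas are plain numbers here; in a real setup they would be read per resource and per flavor from the ClusterQueue definitions in each cluster.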

Worker Cluster

The worker cluster acts like a standalone Kueue cluster. The workloads and jobs are created and deleted by the MultiKueue Admission Check Controller running in the manager cluster.

Job Flow

For a job to be subject to multi-cluster dispatching, you need to assign it to a ClusterQueue that uses a MultiKueue AdmissionCheck. The MultiKueue system works as follows:

  • When the job’s Workload gets a QuotaReservation in the manager cluster, a copy of that Workload will be created in all the configured worker clusters.
  • When one of the worker clusters admits the remote workload sent to it:
    • The manager removes all the other remote Workloads.
    • The manager creates a copy of the job in the selected worker cluster, configured to use the quota reserved by the admitted Workload by setting the job’s kueue.x-k8s.io/prebuilt-workload-name label.
  • The manager monitors the remote objects, workload and job, and syncs any changes in their status into the local objects.
  • When the remote workload is marked as Finished:
    • The manager does a last sync of the objects’ status.
    • The manager removes the objects from the worker cluster.
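The steps above can be sketched as a toy in-memory model. It is only an illustration of the order of operations, not Kueue's implementation: the classes `WorkerCluster` and `Manager`, their method names, and the status strings are hypothetical. Only the label key `kueue.x-k8s.io/prebuilt-workload-name` comes from the text above.

```python
from dataclasses import dataclass, field

PREBUILT_LABEL = "kueue.x-k8s.io/prebuilt-workload-name"


@dataclass
class WorkerCluster:
    """Hypothetical stand-in for a worker cluster's API server."""
    name: str
    workloads: dict = field(default_factory=dict)  # workload name -> status
    jobs: dict = field(default_factory=dict)       # job name -> labels


@dataclass
class Manager:
    """Hypothetical stand-in for the manager-side controller."""
    workers: list
    local_status: dict = field(default_factory=dict)  # workload name -> status

    def on_quota_reserved(self, workload: str) -> None:
        # Step 1: copy the Workload to every configured worker cluster.
        self.local_status[workload] = "QuotaReserved"
        for worker in self.workers:
            worker.workloads[workload] = "Pending"

    def on_worker_admitted(self, workload: str, job: str,
                           winner: WorkerCluster) -> None:
        # Step 2a: remove the remote Workloads from all other workers.
        for worker in self.workers:
            if worker is not winner:
                worker.workloads.pop(workload, None)
        # Step 2b: create the job copy, bound to the admitted Workload.
        winner.workloads[workload] = "Admitted"
        winner.jobs[job] = {PREBUILT_LABEL: workload}
        # Step 3: sync the remote status into the local object.
        self.local_status[workload] = winner.workloads[workload]

    def on_remote_finished(self, workload: str, job: str,
                           winner: WorkerCluster) -> None:
        # Step 4a: last status sync.
        self.local_status[workload] = "Finished"
        # Step 4b: remove the remote objects from the worker cluster.
        winner.workloads.pop(workload, None)
        winner.jobs.pop(job, None)


if __name__ == "__main__":
    a, b = WorkerCluster("worker-a"), WorkerCluster("worker-b")
    manager = Manager([a, b])
    manager.on_quota_reserved("wl-1")
    manager.on_worker_admitted("wl-1", "job-1", winner=b)
    print(a.workloads, b.workloads, b.jobs)
    manager.on_remote_finished("wl-1", "job-1", winner=b)
    print(manager.local_status, b.workloads, b.jobs)
```

The model treats each step as a single event handler. A real controller would reconcile repeatedly and handle connection loss to the worker clusters, which this sketch leaves out.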

Supported jobs

batch/Job

Known Limitations:

  • Since unsuspending a Job in the manager cluster will lead to its local execution, the AdmissionCheckStates are kept Pending during the remote job execution.
  • Since updating the status of a local Job could conflict with the Kubernetes Job controller, the manager does not sync the Job status during the job execution. The manager copies the final status of the remote Job when the remote workload is marked as Finished.

There is an ongoing effort to overcome these limitations by making it possible to disable the reconciliation of some jobs by the Kubernetes batch/Job controller. See kubernetes/enhancements KEP-4368 for details.

JobSet

Known Limitations:

  • Since unsuspending a JobSet in the manager cluster will lead to its local execution and updating the status of a local JobSet could conflict with its main controller, you should only install the JobSet CRDs, but not the controller.

An approach similar to the one described for batch/Job is being considered to overcome this limitation.

Submitting Jobs

In a configured MultiKueue environment, you can submit any MultiKueue-supported job to the manager cluster, targeting a ClusterQueue configured for MultiKueue. Kueue delegates the job to the configured worker clusters without any additional configuration changes.
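As a rough illustration, the snippet below builds a batch/Job manifest for submission to the manager cluster. It assumes the usual Kueue convention that a job selects its queue through the `kueue.x-k8s.io/queue-name` label, naming a LocalQueue that points to the MultiKueue-enabled ClusterQueue. The queue name `user-queue`, the job name prefix, the image, and the resource request are placeholders, not values from this page.

```python
import json


def build_job(queue_name: str) -> dict:
    """Return a batch/v1 Job manifest labeled for a Kueue queue.

    `queue_name` is assumed to be a LocalQueue in the job's namespace
    that points to a ClusterQueue configured for MultiKueue.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "generateName": "sample-job-",  # placeholder name prefix
            "labels": {"kueue.x-k8s.io/queue-name": queue_name},
        },
        "spec": {
            # Created suspended so that Kueue decides when it may start.
            "suspend": True,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "main",
                        "image": "busybox",  # placeholder image
                        "command": ["sh", "-c", "echo hello && sleep 10"],
                        "resources": {"requests": {"cpu": "1"}},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be piped to
    # `kubectl create -f -` against the manager cluster.
    print(json.dumps(build_job("user-queue"), indent=2))
```

Nothing in the manifest refers to a worker cluster; the choice of cluster is left to MultiKueue, as described above.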

Last modified March 25, 2024: typo fixes on site (#1901) (f982cda)