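Run a RayService

A guide to running a RayService on Kueue.

This page shows how to use Kueue's scheduling and resource management capabilities when running a RayService.

Kueue manages a RayService through the RayCluster created for it. The RayService therefore needs the label kueue.x-k8s.io/queue-name: user-queue in metadata.labels. The label is propagated to the corresponding RayCluster, which is what brings it under Kueue's management.

This guide is for serving users who have a basic understanding of Kueue. For more information, see the Kueue overview.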

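Before you begin

  1. Make sure you are using Kueue v0.6.0 or newer and KubeRay v1.3.0 or newer.

  2. See Administer cluster quotas for details on the initial Kueue setup.

  3. See the KubeRay installation instructions for details on installing and configuring KubeRay.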

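RayService definition

When running a RayService on Kueue, take the following aspects into consideration:

a. Queue selection

Specify the target local queue in the metadata.labels section of the RayService configuration. The label is propagated to the RayService's RayCluster.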

metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
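
The label only has an effect if a LocalQueue with that name exists in the same namespace as the RayService. The following is a minimal sketch of such a queue, not a definitive setup. It assumes the default namespace, the kueue.x-k8s.io/v1beta1 API version, and an existing ClusterQueue called cluster-queue; that ClusterQueue name is hypothetical and does not come from this page.

```yaml
# Sketch only. Assumption: a ClusterQueue named "cluster-queue" already exists.
# Check the API version served by your Kueue installation before applying.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default   # must be the namespace of the RayService
  name: user-queue     # the value used in the kueue.x-k8s.io/queue-name label
spec:
  clusterQueue: cluster-queue   # hypothetical ClusterQueue name
```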

b. Configure the resource needs

The resource needs of the workload can be configured in spec.rayClusterConfig.

spec:
  rayClusterConfig:
    headGroupSpec:
      template:
        spec:
          containers:
            - resources:
                requests:
                  cpu: "1"
    workerGroupSpecs:
    - template:
        spec:
          containers:
            - resources:
                requests:
                  cpu: "1"

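c. Limitations

  • Limited worker groups: a Kueue workload can have at most 8 PodSets, and the head group takes one of them, so spec.rayClusterConfig.workerGroupSpecs can contain at most 7 entries.
  • In-tree autoscaling disabled: Kueue manages resource allocation for the RayService, so the cluster's internal autoscaling mechanism needs to be disabled.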
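
Both limitations can be illustrated with a short fragment of spec.rayClusterConfig. This is a sketch under the assumption that autoscaling is controlled by KubeRay's enableInTreeAutoscaling field; the group name and replica count are illustrative.

```yaml
# Fragment only, not a complete RayService.
spec:
  rayClusterConfig:
    # Assumption: KubeRay's enableInTreeAutoscaling field. Leaving it unset
    # or setting it to false keeps the Ray autoscaler off.
    enableInTreeAutoscaling: false
    workerGroupSpecs:          # at most 7 entries; the head group uses one PodSet
    - groupName: small-group   # illustrative name
      replicas: 1              # stays fixed, because nothing scales the group
```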

Example RayService

The RayService looks like the following:

apiVersion: ray.io/v1
kind: RayService
metadata:
  name: test-rayservice
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  # serveConfigV2 takes a yaml multi-line scalar, which should be a Ray Serve multi-application config. See https://docs.ray.io/en/latest/serve/multi-app.html.
  serveConfigV2: |
    applications:
      - name: fruit_app
        import_path: fruit.deployment_graph
        route_prefix: /fruit
        runtime_env:
          working_dir: "https://github.com/ray-project/test_dag/archive/78b4a5da38796123d9f9ffff59bab2792a043e95.zip"
        deployments:
          - name: MangoStand
            num_replicas: 2
            max_replicas_per_node: 1
            user_config:
              price: 3
            ray_actor_options:
              num_cpus: 0.1
          - name: OrangeStand
            num_replicas: 1
            user_config:
              price: 2
            ray_actor_options:
              num_cpus: 0.1
          - name: PearStand
            num_replicas: 1
            user_config:
              price: 1
            ray_actor_options:
              num_cpus: 0.1
          - name: FruitMarket
            num_replicas: 1
            ray_actor_options:
              num_cpus: 0.1
      - name: math_app
        import_path: conditional_dag.serve_dag
        route_prefix: /calc
        runtime_env:
          working_dir: "https://github.com/ray-project/test_dag/archive/78b4a5da38796123d9f9ffff59bab2792a043e95.zip"
        deployments:
          - name: Adder
            num_replicas: 1
            user_config:
              increment: 3
            ray_actor_options:
              num_cpus: 0.1
          - name: Multiplier
            num_replicas: 1
            user_config:
              factor: 5
            ray_actor_options:
              num_cpus: 0.1
          - name: Router
            num_replicas: 1
  rayClusterConfig:
    rayVersion: '2.46.0' # should match the Ray version in the image of the containers
    ######################headGroupSpecs#################################
    # Ray head pod template.
    headGroupSpec:
      # The `rayStartParams` are used to configure the `ray start` command.
      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
      rayStartParams: {}
      #pod template
      template:
        spec:
          containers:
          - name: ray-head
            image: rayproject/ray:2.46.0
            resources:
              limits:
                cpu: 4
                memory: 6Gi
              requests:
                cpu: 2
                memory: 4Gi
    workerGroupSpecs:
    # the number of worker Pod replicas in this group
    - replicas: 1
      minReplicas: 1
      maxReplicas: 5
      # logical group name, here small-group; the name can also describe the group's function
      groupName: small-group
      # The `rayStartParams` are used to configure the `ray start` command.
      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
      rayStartParams: {}
      #pod template
      template:
        spec:
          containers:
          - name: ray-worker # must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name' or '123-abc')
            image: rayproject/ray:2.46.0
            resources:
              limits:
                cpu: "2"
                memory: "4Gi"
              requests:
                cpu: "1"
                memory: "2Gi"
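
For Kueue to admit this example, the queue behind user-queue needs enough quota for the requests of all Pods together: 2 CPU and 4Gi for the head, plus 1 CPU and 2Gi for the single worker replica, which gives 3 CPU and 6Gi of memory. The following is a sketch of a ClusterQueue sized to exactly that total. The names cluster-queue and default-flavor are hypothetical, a ResourceFlavor called default-flavor is assumed to exist, and the kueue.x-k8s.io/v1beta1 API version is an assumption as well.

```yaml
# Sketch only. Assumptions: the names "cluster-queue" and "default-flavor",
# and a ResourceFlavor "default-flavor" that has already been created.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}        # accept workloads from all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 3        # head requests 2 CPU + one worker requests 1 CPU
      - name: memory
        nominalQuota: 6Gi      # head requests 4Gi + one worker requests 2Gi
```

A real cluster would normally carry more quota than this, since the ClusterQueue is usually shared with other workloads.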