Setup Dev Monitoring
This page shows how to set up Prometheus for development, debugging, and testing Kueue metrics.
The page is intended for a platform developer.
Before you begin
Make sure the following conditions are met:
- A Kubernetes cluster is running.
- Kueue is installed.
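Both conditions can be checked from the command line before you continue. The following is a minimal sketch, assuming Kueue runs as deployment/kueue-controller-manager in the kueue-system namespace (the names used later on this page); check_prereqs is a hypothetical helper name.

```shell
# Sketch: confirm the cluster is reachable and the Kueue controller exists.
# Assumes the deployment name and namespace used later on this page.
check_prereqs() {
  kubectl cluster-info >/dev/null \
    || { echo "cluster unreachable" >&2; return 1; }
  kubectl get deployment kueue-controller-manager -n kueue-system >/dev/null \
    || { echo "Kueue not found in kueue-system" >&2; return 1; }
  echo "prerequisites OK"
}
```

Run check_prereqs in the shell you will use for the remaining steps; a non-zero exit status means one of the conditions is not met.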
1. Install kube-prometheus
From a scratch directory outside the Kueue repository, install kube-prometheus:
git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus
kubectl apply --server-side -f manifests/setup
kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring
kubectl apply -f manifests/
kubectl wait --for=condition=Ready pods --all -n monitoring --timeout=300s
2. Enable Kueue metrics scraping
Apply the Kueue ServiceMonitor:
VERSION=v0.16.1
kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/${VERSION}/prometheus.yaml
Alternatively, if you’re working from a Kueue source checkout, use:
make prometheus
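To confirm that the ServiceMonitor was created, you can list ServiceMonitor objects and filter for Kueue. This sketch deliberately searches all namespaces and matches on the name, because the exact object name and namespace are not stated on this page; find_kueue_servicemonitors is a hypothetical helper name.

```shell
# Sketch: list ServiceMonitors whose line mentions "kueue", in any namespace.
# The fully qualified resource name comes from the Prometheus Operator CRDs
# installed in step 1.
find_kueue_servicemonitors() {
  kubectl get servicemonitors.monitoring.coreos.com --all-namespaces --no-headers \
    | grep -i kueue
}
```

If the function prints nothing and returns a non-zero status, no matching ServiceMonitor was found.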
3. Generate test data
Create a ClusterQueue and LocalQueue:
kubectl apply -f https://kueue.sigs.k8s.io/examples/admin/single-clusterqueue-setup.yaml
Submit test jobs:
for i in {1..5}; do
  kubectl create -f https://kueue.sigs.k8s.io/examples/jobs/sample-job.yaml
done
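Before checking Prometheus, it can help to confirm that the queues exist and that Kueue created Workload objects for the jobs. A minimal sketch, assuming the Kueue CRDs are installed; show_kueue_state is a hypothetical helper name, and the search covers all namespaces because this page does not state where the sample objects are created.

```shell
# Sketch: show the queues from the sample setup and the Workloads
# that Kueue created for the submitted jobs.
show_kueue_state() {
  kubectl get clusterqueues
  kubectl get localqueues --all-namespaces
  kubectl get workloads --all-namespaces
}
```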
4. Verify metrics
Port-forward to the Prometheus service:
kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090
Check that Prometheus is scraping Kueue:
curl -s 'http://localhost:9090/api/v1/targets' | jq '.data.activeTargets[] | select(.labels.job | contains("kueue"))'
You should see output like:
{
  "labels": {
    "job": "kueue-controller-manager-metrics-service",
    ...
  },
  "health": "up",
  ...
}
Open http://localhost:9090 in your browser and try a query:
kueue_admitted_workloads_total
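The same query can be run from the terminal through the Prometheus HTTP API, which is convenient in scripts. This is a sketch that assumes the port-forward from this step is still running; PROM_URL and prom_query are names introduced here for illustration.

```shell
# Sketch: run an instant PromQL query against the port-forwarded Prometheus.
# PROM_URL is an illustrative variable; it defaults to the address used above.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # -G sends the URL-encoded expression as a GET parameter to /api/v1/query.
  curl -s -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Example: prom_query 'kueue_admitted_workloads_total' | jq '.data.result'
```

An empty result list usually means that no samples exist yet for that metric, not that scraping is broken; check the targets output first.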
For Grafana access, see the kube-prometheus documentation.
5. Enable optional metrics
To enable resource-level metrics like kueue_cluster_queue_resource_usage, edit the Kueue configuration:
kubectl edit configmap kueue-manager-config -n kueue-system
Add enableClusterQueueResources: true under the metrics section:
metrics:
  bindAddress: :8443
  enableClusterQueueResources: true
Restart Kueue:
kubectl rollout restart deployment/kueue-controller-manager -n kueue-system
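The restart takes a moment, and queries issued before the new pod is ready may return nothing. A minimal sketch for waiting on the rollout; wait_for_kueue is a hypothetical helper name, and the 120s default timeout is an arbitrary choice.

```shell
# Sketch: block until the restarted Kueue deployment is rolled out.
# The optional first argument overrides the (arbitrary) 120s timeout.
wait_for_kueue() {
  kubectl rollout status deployment/kueue-controller-manager \
    -n kueue-system --timeout="${1:-120s}"
}
```

Prometheus also needs at least one scrape interval after the rollout before the new series appear.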
Verify that the optional metrics are available by running a query in Prometheus, for example:
kueue_cluster_queue_nominal_quota
See Prometheus Metrics for the full list of optional metrics.
What’s next
- See Common Grafana Queries for PromQL queries to monitor Kueue in Grafana.
- See Setup Prometheus for production setup instructions.