Testing/CI/KubernetesRunners
Revision as of 09:35, 27 March 2023
To run GitLab CI jobs on a Kubernetes cluster, a GitLab Runner must be installed [1].
Deployment
This section documents the steps taken to deploy a GitLab Runner instance on an Azure Kubernetes cluster using Helm [2].
Kubernetes Cluster
Create a Kubernetes cluster on Azure (AKS) with two node pools: "agentpool" for the Kubernetes system pods and "jobs" for the CI jobs.
CLI
Follow the docs to install the Azure CLI.
Alternatively, run the Azure CLI in a container [3]:
podman run -it mcr.microsoft.com/azure-cli
Install the Kubernetes CLI (kubectl) [4]:
az aks install-cli
Install the Helm CLI [5]:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Sign in
Sign in to Azure [6]:
az login
Connect to your Kubernetes cluster. Open the Azure web dashboard for your cluster and click the "Connect" button. It displays the commands needed to connect to your cluster, similar to the following:
az account set --subscription ...
az aks get-credentials ...
GitLab
Register the new runner [7].
GitLab Runner
Now install the GitLab Runner with Helm [8].
Create a namespace:
kubectl create namespace "gitlab-runner"
Create a values.yaml file for your runner configuration [9], like the snippet below.
Enabling RBAC support [10] seems to be needed [11] with the default AKS configuration.
gitlabUrl: "https://gitlab.com/"
runnerRegistrationToken: ""
rbac:
  create: true
# Configure the maximum number of concurrent jobs
concurrent: 200
# Schedule runners on "user" nodes (not "system")
runners:
  config: |
    [[runners]]
      [runners.kubernetes.node_selector]
        "kubernetes.azure.com/mode" = "user"
Deploy the runner:
helm install --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner
If you change the configuration in values.yaml, apply it with the command below. Pause your runner before upgrading it to avoid service disruptions [12].
helm upgrade --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner
Docker
QEMU jobs require Docker-in-Docker, which needs additional configuration [13].
Docker-in-Docker makes the CI environment less secure [14], needs more resources, and has known issues. Please migrate your Docker jobs to better alternatives if you can [15].
Add the following to your values.yaml:
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        image = "ubuntu:20.04"
        privileged = true
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"
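The values.yaml shown earlier already contains a runners.config entry for the node selector, and a YAML file should define that key only once. The following is a minimal sketch of how the two fragments could be merged into a single block, assuming both the node selector and the Docker-in-Docker settings are wanted on the same runner:

```yaml
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Docker-in-Docker settings from this section
        image = "ubuntu:20.04"
        privileged = true
        # Node selector from the initial values.yaml
        [runners.kubernetes.node_selector]
          "kubernetes.azure.com/mode" = "user"
        # In-memory volume shared between the job and the dind service
        [[runners.kubernetes.volumes.empty_dir]]
          name = "docker-certs"
          mount_path = "/certs/client"
          medium = "Memory"
```

The plain keys of [runners.kubernetes] are listed before its sub-tables because TOML assigns any key that follows a table header to that table.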
Update your job definitions to use the following:
image: docker:20.10.16
services:
  - docker:20.10.16-dind
variables:
  DOCKER_HOST: tcp://docker:2376
  DOCKER_TLS_CERTDIR: "/certs"
  DOCKER_TLS_VERIFY: 1
  DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
before_script:
  - until docker info; do sleep 1; done
Resource Management
The QEMU pipeline has around 100 jobs. Most of them can run in parallel. Each job needs enough resources to complete before it times out.
To understand the units Kubernetes uses for CPU and memory, see Resource Management for Pods and Containers.
Set requests:
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Resource quantities are strings in the runner configuration
        cpu_request = "1"
        service_cpu_request = "1"
Jobs that have higher requirements should set their own variables. See overwrite container resources.
To allow individual jobs to request more resources, set the _overwrite_max_allowed variables. See [16].
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Resource quantities are strings; an unquoted 16Gi is not valid TOML
        cpu_request_overwrite_max_allowed = "4"
        service_cpu_request_overwrite_max_allowed = "4"
        memory_request_overwrite_max_allowed = "16Gi"
        service_memory_request_overwrite_max_allowed = "16Gi"
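With these maximums in place, a job can raise its own requests through the KUBERNETES_* variables described in the GitLab documentation on overwriting container resources. The following .gitlab-ci.yml fragment is a sketch only: the job name, the script line, and the chosen values are hypothetical, and each value must stay at or below the corresponding _overwrite_max_allowed setting.

```yaml
# Hypothetical job; only the KUBERNETES_* variables matter here.
heavy-build:
  variables:
    # Build container requests (maximum allowed above: 4 CPUs, 16Gi)
    KUBERNETES_CPU_REQUEST: "4"
    KUBERNETES_MEMORY_REQUEST: "8Gi"
    # Service container requests (for example, the dind service)
    KUBERNETES_SERVICE_CPU_REQUEST: "2"
    KUBERNETES_SERVICE_MEMORY_REQUEST: "4Gi"
  script:
    - make -j4
```

Jobs that do not set these variables keep the runner-wide defaults from cpu_request and service_cpu_request.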