Testing/CI/KubernetesRunners
Latest revision as of 15:12, 10 May 2024
To run GitLab CI jobs on a Kubernetes cluster, a GitLab Runner must be installed [1].
Deployment
This section documents the steps taken to deploy a GitLab Runner instance on an Azure Kubernetes cluster by using Helm [2].
Kubernetes Cluster
Create a Kubernetes cluster on Azure (AKS) with two node pools: "agentpool" for the Kubernetes system pods and "jobs" for the CI jobs.
CLI
Follow the docs to install the Azure CLI.
Alternatively, run the Azure CLI in a container [3]:
podman run -it mcr.microsoft.com/azure-cli
Install the Kubernetes CLI (kubectl) [4]:
az aks install-cli
Install the Helm CLI [5]:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Sign in
Sign in to Azure [6]:
az login
Connect to your Kubernetes cluster. Open the Azure web dashboard for your cluster and click the "Connect" button. A list of commands for connecting to your cluster will be displayed, similar to the following:
az account set --subscription ...
az aks get-credentials ...
GitLab
Register the new runner [7].
GitLab Runner
Next, install the GitLab Runner with Helm [8].
Add the GitLab Helm repository:
helm repo add gitlab https://charts.gitlab.io
Create a namespace:
kubectl create namespace "gitlab-runner"
Create a values.yaml file for your runner configuration [9].
The current values.yaml file can be found in the QEMU main repository: scripts/ci/gitlab-kubernetes-runners/values.yaml
The default poll_timeout value needs to be raised so that auto-scaled nodes have time to start [10].
Enabling RBAC support [11] seems to be needed [12] with the default AKS configuration.
gitlabUrl: "https://gitlab.com/"
rbac:
  create: true
# Configure the maximum number of concurrent jobs
concurrent: 200
# Schedule runners on "user" nodes (not "system")
runners:
  secret: gitlab-runner-secret
  config: |
    [[runners]]
      [runners.kubernetes]
        poll_timeout = 1200
      [runners.kubernetes.node_selector]
        "kubernetes.azure.com/mode" = "user"
Create a file gitlab-runner-secret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret
type: Opaque
data:
  runner-registration-token: "" # need to leave as an empty string for compatibility reasons
  runner-token: "REDACTED" # update this with a base64-encoded value
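If base64-encoding the token by hand is inconvenient, a possible variant is to use the Secret's stringData field, which Kubernetes accepts as plain text and encodes on write. The sketch below assumes the same secret name as above; the token value is a placeholder, not a real token.

```yaml
# Sketch only: same Secret as above, but using stringData so the
# values can be written in plain text (Kubernetes stores them base64-encoded).
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret
type: Opaque
stringData:
  runner-registration-token: "" # left empty, as in the data-based version
  runner-token: "<your-runner-token>" # placeholder: paste the unencoded token here
```

Either form can be applied with the same kubectl command; use one or the other, not both, for a given key.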
Apply the secret:
kubectl apply --namespace gitlab-runner -f gitlab-runner-secret.yaml
Deploy the runner:
helm install --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner
If you change the configuration in values.yaml, apply it with the command below. Pause your runner before upgrading it to avoid service disruptions [13].
helm upgrade --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner
Docker
QEMU jobs require Docker-in-Docker. Additional configuration is necessary. [14]
Docker-in-Docker makes the CI environment less secure [15], needs more resources, and has known issues. Please migrate your Docker jobs to better alternatives if you can [16].
Add the following to your values.yaml:
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        image = "ubuntu:20.04"
        privileged = true
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"
Update your job definitions to use the following. Alternatively, variables can be set using the runner environment configuration [17].
image: docker:20.10.16
services:
  - docker:20.10.16-dind
variables:
  DOCKER_HOST: tcp://docker:2376
  DOCKER_TLS_CERTDIR: "/certs"
  DOCKER_TLS_VERIFY: 1
  DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
before_script:
  - until docker info; do sleep 1; done
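The runner-side alternative mentioned above could look roughly like the following values.yaml fragment. This is a sketch, assuming the same Docker variables as in the job definition; DOCKER_CERT_PATH is written out as a literal path here instead of relying on variable expansion.

```yaml
runners:
  config: |
    [[runners]]
      # Sketch: applied to every job picked up by this runner
      environment = [
        "DOCKER_HOST=tcp://docker:2376",
        "DOCKER_TLS_CERTDIR=/certs",
        "DOCKER_TLS_VERIFY=1",
        "DOCKER_CERT_PATH=/certs/client"
      ]
```

With this in place, jobs would no longer need their own variables section for Docker, but they still need the image, services, and before_script entries.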
Resource Management
The QEMU pipeline has around 100 jobs. Most of them can run in parallel. Each job needs enough resources to complete before it times out.
For the units Kubernetes uses to measure resources, see Resource Management for Pods and Containers.
Set requests:
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        cpu_request = "0.5"
        service_cpu_request = "0.5"
        helper_cpu_request = "0.25"
Jobs that have higher requirements should set their own variables. See overwrite container resources.
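As an illustration, a job with higher requirements might set the overwrite variables like this in its .gitlab-ci.yml entry. The job name, script, and values are hypothetical; the runner only honors such requests up to the maximum it has been configured to allow.

```yaml
# Hypothetical job: name, script and values are for illustration only
build-heavy:
  variables:
    KUBERNETES_CPU_REQUEST: "4"
    KUBERNETES_MEMORY_REQUEST: "8Gi"
  script:
    - make -j4
```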
To allow single jobs to request more resources, you have to set the _overwrite_max_allowed variables. See [18].
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        cpu_request_overwrite_max_allowed = "7"
        memory_request_overwrite_max_allowed = "30Gi"
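The fragments in the sections above all extend the same runners.config block, so they need to be merged into a single values.yaml; repeating the runners key would not work. A merged file might look roughly like the sketch below, which only combines the settings shown on this page. The file in the QEMU repository remains the authoritative version.

```yaml
gitlabUrl: "https://gitlab.com/"
rbac:
  create: true
concurrent: 200
runners:
  secret: gitlab-runner-secret
  config: |
    [[runners]]
      [runners.kubernetes]
        # seconds; written as an integer here (assumption: the field is numeric)
        poll_timeout = 1200
        image = "ubuntu:20.04"
        privileged = true
        cpu_request = "0.5"
        service_cpu_request = "0.5"
        helper_cpu_request = "0.25"
        cpu_request_overwrite_max_allowed = "7"
        memory_request_overwrite_max_allowed = "30Gi"
      [runners.kubernetes.node_selector]
        "kubernetes.azure.com/mode" = "user"
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"
```

In the embedded TOML, the plain keys of runners.kubernetes are placed before its sub-tables so that each key lands in the intended table.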