Testing/CI/KubernetesRunners

From QEMU
 

Latest revision as of 15:12, 10 May 2024

To run GitLab CI jobs on a Kubernetes cluster, a GitLab Runner must be installed [1].

Deployment

This section documents the steps taken to deploy a GitLab Runner instance on an Azure Kubernetes cluster using Helm [2].

Kubernetes Cluster

Create a Kubernetes cluster on Azure (AKS) with two node pools: "agentpool" for the Kubernetes system pods and "jobs" for the CI jobs.

CLI

Follow the docs to install the Azure CLI.

Alternatively, run the Azure CLI in a container [3]:

podman run -it mcr.microsoft.com/azure-cli

Install the Kubernetes CLI (kubectl) [4]:

az aks install-cli

Install the Helm CLI [5]:

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Sign in

Sign in to Azure [6]:

az login

Connect to your Kubernetes cluster. Open the Azure web dashboard for your cluster and click the "Connect" button. A list of commands for connecting to your cluster will be displayed, similar to the following:

az account set --subscription ...
az aks get-credentials ...

GitLab

Register the new runner [7].

GitLab Runner

Now install the GitLab Runner with Helm [8].

Add the GitLab Helm repository:

helm repo add gitlab https://charts.gitlab.io

Create a namespace:

kubectl create namespace "gitlab-runner"

Create a values.yaml file for your runner configuration [9].

The current values.yaml file can be found in the QEMU main repository: scripts/ci/gitlab-kubernetes-runners/values.yaml

The default poll_timeout value must be raised so that auto-scaled nodes have time to start before the runner gives up waiting for the job pod. [10]

Enabling RBAC support [11] seems to be needed [12] with the default AKS configuration.

gitlabUrl: "https://gitlab.com/"
rbac:
  create: true
# Configure the maximum number of concurrent jobs
concurrent: 200
# Schedule runners on "user" nodes (not "system")
runners:
  secret: gitlab-runner-secret
  config: |
    [[runners]]
      [runners.kubernetes]
        # Seconds to wait for the job pod; an integer, not a quoted string
        poll_timeout = 1200
      [runners.kubernetes.node_selector]
        "kubernetes.azure.com/mode" = "user"

Create a file gitlab-runner-secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-secret
type: Opaque
data:
  runner-registration-token: "" # need to leave as an empty string for compatibility reasons
  runner-token: "REDACTED" # update this with a base64-encoded value
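
Values under data: in a Kubernetes Secret must be base64-encoded. A minimal sketch of producing that value, assuming a POSIX shell with the base64 utility available; encode_token is a hypothetical helper name and the token shown in the comment is a placeholder, not a real one:

```shell
# encode_token TOKEN: print TOKEN base64-encoded on a single line.
# printf is used instead of echo so that no trailing newline ends up
# inside the encoded secret; tr removes any line wrapping base64 adds.
encode_token() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Usage (placeholder token): paste the output as the runner-token value.
#   encode_token "glrt-example"
```

To check a value, decode it again and compare it with the original token.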

Apply the secret:

kubectl apply --namespace gitlab-runner -f gitlab-runner-secret.yaml

Deploy the runner:

helm install --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner

If you change the configuration in values.yaml, apply it with the command below. Pause your runner before upgrading it to avoid service disruptions. [13]

helm upgrade --namespace gitlab-runner runner-manager -f values.yaml gitlab/gitlab-runner

Docker

QEMU jobs require Docker-in-Docker. Additional configuration is necessary. [14]

Docker-in-Docker makes the CI environment less secure [15], needs more resources, and has known issues. Please migrate your Docker jobs to better alternatives if you can [16].

Add the following to your values.yaml:

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        image = "ubuntu:20.04"
        privileged = true
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"

Update your job definitions to use the following. Alternatively, variables can be set using the runner environment configuration [17].

image: docker:20.10.16
services:
  - docker:20.10.16-dind
variables:
  DOCKER_HOST: tcp://docker:2376
  DOCKER_TLS_CERTDIR: "/certs"
  DOCKER_TLS_VERIFY: 1
  DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
before_script:
  - until docker info; do sleep 1; done
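
The until loop in before_script waits indefinitely if the Docker daemon never becomes reachable, so such a job only ends when it reaches its job timeout. A minimal sketch of a bounded wait, assuming a POSIX sh such as the one in the docker image; wait_for is a hypothetical helper, not something GitLab or Docker provides, and the limit of 60 attempts is an arbitrary choice:

```shell
# wait_for MAX_TRIES CMD...: run CMD once per second until it succeeds,
# giving up with exit status 1 after MAX_TRIES failed attempts.
wait_for() {
  max_tries=$1
  shift
  tries=0
  until "$@" >/dev/null 2>&1; do
    tries=$((tries + 1))
    if [ "$tries" -ge "$max_tries" ]; then
      echo "gave up after $tries attempts: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# In a job, this would stand in for the unbounded loop, for example:
#   wait_for 60 docker info
```

Failing early this way frees the job slot sooner when the dind service is broken, at the cost of having to define the helper in each job or in a shared before_script.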

Resource Management

The QEMU pipeline has around 100 jobs. Most of them can run in parallel. Each job needs enough resources to complete before it times out.

To understand the units Kubernetes uses for resources, see Resource Management for Pods and Containers.

Set requests:

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        cpu_request = "0.5"      
        service_cpu_request = "0.5"
        helper_cpu_request = "0.25"     
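
To see what these requests mean for scheduling, the arithmetic can be sketched in shell. This assumes each job pod runs one build container, one service container (the dind service), and the helper container, and it ignores memory and the CPU used by system pods. The node size of 7820m is a hypothetical figure, not a measured value from the QEMU cluster:

```shell
# Per-job CPU requests from the configuration above, in millicores
# (1 CPU = 1000m): build, service, and helper containers.
BUILD_MILLI=500
SERVICE_MILLI=500
HELPER_MILLI=250

# jobs_per_node ALLOCATABLE_MILLI: number of job pods that fit on one
# node when only CPU requests are considered.
jobs_per_node() {
  pod_milli=$((BUILD_MILLI + SERVICE_MILLI + HELPER_MILLI))
  echo $(( $1 / pod_milli ))
}

# Hypothetical node with 7820m of allocatable CPU: 7820 / 1250 = 6 pods.
jobs_per_node 7820
```

Lower requests pack more jobs onto each node, but jobs then compete for CPU and may run closer to their timeout.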

Jobs with higher requirements should set their own variables; see overwrite container resources. To allow individual jobs to request more resources, set the corresponding _overwrite_max_allowed options in the runner configuration [18].

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        cpu_request_overwrite_max_allowed = "7"        
        memory_request_overwrite_max_allowed = "30Gi"