Continuous Integration (CI)

What is it?

CI/CD - Continuous Integration / Continuous Delivery is a DevOps practice that speeds up software delivery

  • Automatically analyses and tests each code change
    • Reports errors
    • Streamlines software releases
  • Improves development / collaboration

What do we need?

We need to combine:

  • Reproducible environment(s)
  • Code analysis tools
    • Formatters
    • Linters
    • Type Checkers
  • Tests

The tooling Zoo 🪛🔧⚒️

  • Package managers (micromamba, uv)
  • Version control systems (git)
  • IDEs - Integrated Development Environments (VS Code)
  • Code Editors (vim, emacs)
  • Linters, formatters, type-checkers (ruff)
  • Compilers and/or Interpreters (e.g. gcc, python)
  • Testing frameworks (pytest)

Static analysis

  • Syntax errors - beyond parsing:
    • All languages follow a syntax spec.
    • Easy for us to make mistakes/typos
  • Some bugs - simple logical problems
  • Conform to good practices
def helloworld():
    print('Hello',end='')
     print('world')

helloworld()
$ python3 helloworld.py 
  File "/home/reis/dir/helloworld.py", line 3
    print('world')
                  ^
IndentationError: unindent does not match any outer indentation level
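
The same kind of mistake can be found without running the script at all. A minimal sketch, assuming only the standard library: ast.parse parses the source and raises a SyntaxError (IndentationError is a subclass) before any line is executed. The helper name check_syntax is made up for illustration.

```python
import ast

# The script from above, including the misaligned third line.
BROKEN = (
    "def helloworld():\n"
    "    print('Hello',end='')\n"
    "     print('world')\n"
    "\n"
    "helloworld()\n"
)


def check_syntax(source, filename="helloworld.py"):
    """Return a one-line report if the source does not parse, else None."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return f"{filename}:{err.lineno}: {type(err).__name__}: {err.msg}"
    return None


print(check_syntax(BROKEN))
```

Nothing is printed by helloworld() here, because the code is only parsed, never executed.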

Static Analysis - Linters & formatters

The latest tool in the zoo is ruff.
It supports more than 900 lint rules.

$ ruff check --output-format=concise helloworld.py
helloworld.py:3:1: SyntaxError: Unexpected indentation
helloworld.py:5:1: SyntaxError: Expected a statement
Found 2 errors.
$ ruff format --diff helloworld.py
--- helloworld.py
+++ helloworld.py
@@ -1,5 +1,6 @@
 def helloworld():
-    print('Hello',end='')
-    print('world')
+    print("Hello", end="")
+    print("world")
+
 
 helloworld()
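
To give an idea of what a single lint rule does, here is a toy sketch that flags bare "except:" clauses by walking the syntax tree. It is not how ruff is implemented (ruff is written in Rust); the function name find_bare_excepts and the sample source are assumptions for illustration.

```python
import ast

# Sample code to be linted; it is parsed, never executed.
SOURCE = '''
try:
    value = int("6")
except:
    value = 0
'''


def find_bare_excepts(source):
    """Return the line numbers of `except:` clauses that name no exception."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


for lineno in find_bare_excepts(SOURCE):
    print(f"line {lineno}: bare except")
```

A real linter bundles hundreds of such checks and reports them with rule codes.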

Static Analysis - Type checking

mypy is the go-to tool - leveraging type hints

mult.py
def mult_int(a: int, b: int) -> float: # note the optional syntax
    c: int = a * b                     # used for type hinting
    return c

print("Mult 3*6 is:")
print(mult_int(3, "6"))
$ mypy mult.py
mult.py:6: error: Argument 2 to "mult_int" has incompatible type "str"; expected "int"  [arg-type]
Found 1 error in 1 file (checked 1 source file)
$ python3 mult.py
Mult 3*6 is:
666
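
Why does the script run at all? Type hints are metadata: the interpreter stores them but does not check them at call time. A small sketch, assuming a variant of mult_int with an int return annotation (a simplification of the example above):

```python
from typing import get_type_hints


def mult_int(a: int, b: int) -> int:
    return a * b


# The hints are stored as metadata on the function...
print(get_type_hints(mult_int))

# ...but nothing checks them when the function is called.
print(mult_int(3, "6"))       # the str is repeated three times, as in the run above
print(mult_int(3, int("6")))  # converting first gives the intended product
```

This is why the check has to come from a separate tool such as mypy.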

Use ruff and mypy

  • Create a python script (e.g. helloworld.py) that prints something
  • Install ruff and mypy in your environment
  • Use them to lint, format and type-check your script

We have the tools, now what?

Setting up a CI pipeline

  • Git repos / hosting services are the perfect place!
    • Simply add the configuration files to the repo
    • Checks are done every time you push code
    • CI verifies whether commits meet expectations
  • Provided on popular git hosting systems

What to expect?

  • Brand-new environments should be created
    • Even with versioned/pinned dependencies
  • Static analysis errors should be immediately spotted
    • Code style violations should be raised
    • Compiling code should be done (if compiled language)
    • e.g. cpython latest linting check:
  • Tests must run and assess functionality
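
What a test could look like: a minimal sketch in the style pytest collects (functions whose names start with test_), assuming a hypothetical file test_mult.py and a mult_int function like the one from the type-checking example.

```python
def mult_int(a: int, b: int) -> int:
    return a * b


def test_mult_int_small_numbers():
    assert mult_int(3, 6) == 18


def test_mult_int_zero():
    assert mult_int(0, 6) == 0


if __name__ == "__main__":
    # Allow running the file directly as well as through pytest.
    test_mult_int_small_numbers()
    test_mult_int_zero()
    print("all tests passed")
```

In a CI job, a failing assert makes the test command exit with a non-zero status, which marks the job as failed.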

How!?

  • Define stages/steps in order of dependency
  • For each, prepare the environment and tools to be used
  • Introduce variants
    • e.g. using different python versions
    • e.g. run with a pinned version of your environment
    • e.g. run the latest (version) of your environment
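
The idea of ordered stages can be sketched locally. This is only a toy model, not how a CI service is implemented: each stage is a command, stages run in order, and the first failure stops the rest. The stage commands are placeholders chosen so that the sketch runs with the standard library alone.

```python
import subprocess
import sys

# Each stage is a name plus a command; the commands are placeholders.
STAGES = [
    ("static-analysis", [sys.executable, "-c", "import ast; ast.parse('x = 1')"]),
    ("test", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]


def run_pipeline(stages):
    """Run stages in order and stop at the first one that fails."""
    results = []
    for name, command in stages:
        completed = subprocess.run(command, capture_output=True, text=True)
        passed = completed.returncode == 0
        results.append((name, passed))
        if not passed:
            break  # later stages depend on this one, so skip them
    return results


for name, passed in run_pipeline(STAGES):
    print(f"{name}: {'passed' if passed else 'failed'}")
```

A variant (e.g. another python version) would simply be the same list of stages run with a different interpreter or environment.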

Real case examples

Gitlab CI

Configured with a YAML file named .gitlab-ci.yml

  • Gitlab CI syntax: docs.gitlab.com/ci/yaml
  • DKRZ Gitlab documentation: docs.dkrz.de ⚠️
    • A couple of extra settings are needed to use the CI on DKRZ’s gitlab.dkrz.de
    • E.g. jobs running in slurm/levante (templates documented) or not

Example of .gitlab-ci.yml

.gitlab-ci.yml
stages:
 - static-analysis

default:
  image: python # container image with basic python tools
  tags:
    - docker-any-image # note this is specific to DKRZ's gitlab

linter:
  stage: static-analysis
  script:
    - pip install ruff
    - ruff check .

type-check:
  stage: static-analysis
  script:
    - pip install mypy
    - mypy .

Runners

In order to run the CI, code needs to run somewhere.

  • Centrally provided (shared) runners
  • Self-hosted
  • Runners can be tagged, indicating functionality.
    • e.g. different hardware/os/capabilities
  • Jobs can select runners using their tags.
.gitlab-ci.yml
default:
  tags:
    - gpu
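
How tags select a runner can be modelled with sets: a runner is eligible when it carries every tag the job asks for. A minimal sketch; the runner names are hypothetical, the tag names are taken from the examples in this document.

```python
# Hypothetical runners and the tags they carry.
RUNNERS = {
    "shared-runner": {"docker-any-image"},
    "gpu-runner": {"docker-any-image", "gpu"},
}


def eligible_runners(job_tags, runners):
    """Return the names of runners that carry every tag the job asks for."""
    wanted = set(job_tags)
    return sorted(name for name, tags in runners.items() if wanted <= tags)


print(eligible_runners(["gpu"], RUNNERS))               # ['gpu-runner']
print(eligible_runners(["docker-any-image"], RUNNERS))  # ['gpu-runner', 'shared-runner']
```

A job whose tags no runner satisfies stays pending, so tag names are worth double-checking.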

(Docker) Images & Containers

  • Images are prebuilt “environments” containing the files of an entire operating system.
    • e.g. python: an operating system with Python pre-installed

The image keyword specifies an image for the executor to use

.gitlab-ci.yml
default:
  image: python # by default resolves to https://hub.docker.com/_/python

(Docker) Images & Containers

  • Containers are active environments based on an image.
    • i.e. a running, isolated instance (similar to a lightweight virtual machine), created for a single job
  • To use a container in Gitlab CI, you need a runner with a “Docker Executor”.
    • e.g. using docker-any-image tag on DKRZ gitlab.

Hands-on Session

Create a simple Gitlab CI pipeline job on your repository:

  • It should use python and print "Hello world!"

Important

⚠️ You need to enable CI/CD in Settings->General->“Visibility, project features, permissions” and enable instance runners under “Settings->CI/CD->Runners”.

Hands-on Session

.gitlab-ci.yml
---
stages:
 - greet

say-hi:
 stage: greet
 image: python
 script:
  - python -c 'print("Hello world!")'
 tags:
  - docker-any-image

Hands-on Session

  • Create a pipeline that checks a python script with ruff
    • It should lint your code and check its formatting
  • Bonus: use the type-checker mypy as well

Hands-on Session

.gitlab-ci.yml
---
stages:
  - static

ruff:
  stage: static
  image: python
  script:
    - pip install ruff
    - ruff check .
    - ruff format --check .
  tags:
    - docker-any-image

mypy:
  stage: static
  image: python
  script:
    - pip install mypy
    - mypy .
  tags:
    - docker-any-image

More on Gitlab CI

Artifacts

  • Jobs have an ephemeral nature
    • Files they create will be gone once completed
  • artifacts allows us to specify which files to keep
    • Used to pass data between jobs
    • Save results (e.g. code report, documentation)
    • Expected to disappear (eventually/explicitly)

Example with artifacts

.gitlab-ci.yml
stages:
  - create
  - use

default:
  tags:
    - condaforge
    - dkrz

job1:
  stage: create
  script:
    - mkdir output
    - echo "Creating a result!" | tee -a output/result.txt

job2:
  stage: use
  script:
    - cat output/result.txt
$ cat output/result.txt
cat: output/result.txt: No such file or directory

The pipeline fails: job2 starts from the same initial state as job1, so nothing created by job1 persists!

Example with artifacts

.gitlab-ci.yml
stages:
  - create
  - use

default:
  tags:
    - condaforge
    - dkrz

job1:
  stage: create
  script:
    - mkdir output
    - echo "Creating a result!" | tee -a output/result.txt
  artifacts:
    paths:
      - output/result.txt
    expire_in: 1 hour

job2:
  stage: use
  script:
    - cat output/result.txt
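
The effect of artifacts can be imitated locally with temporary directories. This is a toy model under stated assumptions, not Gitlab's mechanism: every job gets a fresh working directory, and a separate "store" directory stands in for the artifact storage. All function names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def job1(workdir):
    """Create output/result.txt inside this job's working directory."""
    (workdir / "output").mkdir()
    (workdir / "output" / "result.txt").write_text("Creating a result!\n")


def job2(workdir):
    """Read the result; raises FileNotFoundError if it was not passed on."""
    return (workdir / "output" / "result.txt").read_text()


def run_two_jobs(keep_artifacts):
    with tempfile.TemporaryDirectory() as store:
        with tempfile.TemporaryDirectory() as first:
            job1(Path(first))
            if keep_artifacts:
                # Stand-in for `artifacts: paths:`: save the output directory.
                shutil.copytree(Path(first) / "output", Path(store) / "output")
        # The first working directory is gone at this point.
        with tempfile.TemporaryDirectory() as second:
            if keep_artifacts:
                # Stand-in for restoring artifacts from an earlier stage.
                shutil.copytree(Path(store) / "output", Path(second) / "output")
            try:
                return job2(Path(second)).strip()
            except FileNotFoundError:
                return "No such file or directory"


print(run_two_jobs(keep_artifacts=False))
print(run_two_jobs(keep_artifacts=True))
```

The store itself is also temporary here, mirroring the fact that artifacts are expected to expire.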

Matrix jobs

  • Using parallel:matrix, one can run a job several times in parallel.
    • For each combination of values a job will be spawned.
.gitlab-ci.yml
stages:
  - run

default:
  tags:
    - docker-any-image

job_python:
  stage: run
  image: python
  script:
    - python --version

Matrix jobs

  • Using parallel:matrix, one can run a job several times in parallel.
    • For each combination of values a job will be spawned.
.gitlab-ci.yml
stages:
  - run

default:
  tags:
    - docker-any-image

job_python:
  stage: run
  parallel:
    matrix:
      - PYTHON_VERSION: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14", "3.15"]
  image: python:${PYTHON_VERSION}
  script:
    - python --version
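
"For each combination of values" is a Cartesian product. A small sketch of the expansion, assuming a shortened PYTHON_VERSION list and a second, hypothetical variable OS_FLAVOUR that is not part of the example above:

```python
from itertools import product

# PYTHON_VERSION comes from the example above (shortened);
# OS_FLAVOUR is a hypothetical second variable added for illustration.
MATRIX = {
    "PYTHON_VERSION": ["3.12", "3.13"],
    "OS_FLAVOUR": ["slim", "alpine"],
}


def expand_matrix(matrix):
    """Return one dict of variable values per combination."""
    names = list(matrix)
    return [dict(zip(names, values)) for values in product(*matrix.values())]


for variables in expand_matrix(MATRIX):
    print(f"python:{variables['PYTHON_VERSION']}-{variables['OS_FLAVOUR']}")
```

Two variables with two values each give four jobs; the number grows multiplicatively, so long lists add up quickly.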

Take home messages

  • Tools exist to make everyone’s life easier
  • CI takes tools to the next level
  • Makes problem detection and collaboration easier!

Shotgun buffet

Executors

In Gitlab, different executors offer different levels of isolation.

  • With SSH and Shell executors, a job that runs rm -rf / could destroy the runner itself, because its state persists between jobs.

  • The same command under the Docker executor would only destroy the newly created, isolated container (not even the image).

Pre-commit hook 🪝

Automate checking the code locally with git hooks.

  1. Inspect .git/hooks/pre-commit.sample. What does it do?
  2. Install pre-commit, configure it to use ruff-pre-commit
  3. Modify a python script (e.g. helloworld.py)
  4. Try to commit that modification
  5. Did the commit go through? What if you try to commit a syntax error?

Solution pre-commit

Using the pre-commit tool:

.pre-commit-config.yaml
---
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.11.5
    hooks:
      - id: ruff
      - id: ruff-format
pre-commit install
pre-commit run --all-files
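
For comparison, a git hook is just an executable that returns a non-zero exit status to abort the commit. The following hand-written sketch is an alternative to the pre-commit tool, shown only to illustrate the mechanism: it checks that staged Python files parse. All function names are assumptions, and it reads the working-tree copy of each file rather than the staged content.

```python
#!/usr/bin/env python3
import ast
import subprocess
import sys
from pathlib import Path


def staged_python_files():
    """List staged (added/copied/modified) files ending in .py."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [name for name in result.stdout.splitlines() if name.endswith(".py")]


def syntax_errors(paths):
    """Return 'file:line: message' for every file that does not parse."""
    problems = []
    for path in paths:
        try:
            ast.parse(Path(path).read_text(), filename=str(path))
        except SyntaxError as err:
            problems.append(f"{path}:{err.lineno}: {err.msg}")
    return problems


def main():
    """Report problems on stderr; a non-zero result should abort the commit."""
    problems = syntax_errors(staged_python_files())
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0
```

To try it as a real hook, one would append a final line sys.exit(main()), save the file as .git/hooks/pre-commit and make it executable. The pre-commit tool does the same job with far less manual work.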