Automated, adaptive CI for C++ projects

This is an advanced workflow for C++ projects with long build and test times. If you are a beginner, this is not the right guide for you.

Scope of this article

Ideally, a software project contains the minimum number of tests necessary to cover as much of the source code as possible, testing all the desired behavior of the system. In an ideal situation, building and running those tests takes little time and has little to no impact on the developers' workflow. However, in incrementally growing projects the number of tests is likely to increase over time, and at some point, testing the overall system may become too costly to be done by the developers in their daily development workflow.

But in order to continuously guarantee the functionality of the system, the changes introduced by a developer must be tested before they are integrated into the code base. An obvious solution to the problem of long test times is to test only those parts of the system that are affected by the new changes. Depending on the development environment, this may happen automatically and you don't have to worry about it. In other situations, such as the one discussed here, some manual tweaking is required to achieve this.

This article focuses on automated testing within the GitLab CI using the Docker executor, where each test pipeline starts off in a clean project without build artifacts from a prior run. Moreover, the presented solutions are specific to header-only projects written in C++, using cmake for configuration. These solutions were used in the Dumux project, an open-source C++ framework for numerical simulations with a focus on flow and transport processes in porous media. At the time of writing, Dumux defines almost 500 tests, which take several hours to build and run on a single core of a standard laptop.

Definition of the desired CI behavior

The general concept for the GitLab CI of the Dumux repository was defined as follows:

  • Automated test pipelines before integration of new code via merge requests
  • Nightly scheduled test pipelines on the master branch
  • An up-to-date build/test status on the master and release branches at all times

Moreover, several potential user-side setups should be tested, that is, setups using different compilers, different supported library versions, or with and without optional libraries. This means that all tests have to be built and run multiple times.

The trigger job

A trigger job sets off other jobs and is used here to conveniently define test pipelines that are run with different setups (for more information on this, see How to define a test pipeline to be reused with multiple setups).
Given the large build and test times, pipelines should only be created in the above-mentioned situations. This can be achieved by using the following rules on the trigger job:

.base-trigger:
  stage: trigger-test-pipelines
  trigger:
    include: .pipeline.yml
    strategy: depend
  rules:
    # scheduled (e.g. nightly) pipelines
    - if: $CI_PIPELINE_SOURCE == "schedule"
    # commits pushed to master or a release branch
    - if: '$CI_COMMIT_BRANCH =~ /^(master|releases\/)/'
    # merge requests: pipelines have to be started manually
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual

In this example, the actual test pipeline is defined in .pipeline.yml. The when: manual rule ensures that pipelines are not triggered automatically when a merge request is opened or when new commits are pushed to an existing one. Whenever desired, one can trigger the pipeline manually from the browser.

Reducing the number of tests to be run in merge requests

However, the rules above only control when pipelines are created; a triggered pipeline still builds and runs all tests. The next step is therefore to detect those tests that are affected by the changes introduced in a merge request. This boils down to three steps:

  1. find the files that have been modified or created in the merge request
  2. find the tests that make use of the modified files
  3. introduce a test selection job combining this knowledge

1. Finding the modified/created files

With git as the version control system, the files modified between two versions can be obtained with git diff-tree (see also How to find modified files in git).
Together with the predefined variables provided by the GitLab CI, we can identify the files that differ between the current HEAD (checked out by the GitLab Runner when starting the job) and the target branch of the merge request with the following command:

git diff-tree -r --name-only HEAD origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME

Now we know which files have been modified.
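For illustration, a minimal Python sketch of such a wrapper around git diff-tree, mimicking the command-line interface of the getchangedfiles.py script used further below (--outfile, --source-tree, --target-tree), could look roughly as follows. This is only a sketch under those assumptions, not the actual Dumux script:

# getchangedfiles_sketch.py - illustrative only, not the actual Dumux script
import argparse
import subprocess


def changed_files(source_tree, target_tree):
    """Return the names of the files that differ between two trees via git diff-tree."""
    output = subprocess.run(
        ["git", "diff-tree", "-r", "--name-only", source_tree, target_tree],
        check=True, capture_output=True, text=True
    ).stdout
    return [line for line in output.splitlines() if line]


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="List the files modified between two trees")
    parser.add_argument("--outfile", required=True)
    parser.add_argument("--source-tree", default="HEAD")
    parser.add_argument("--target-tree", required=True)
    args = parser.parse_args()

    with open(args.outfile, "w") as out:
        out.write("\n".join(changed_files(args.source_tree, args.target_tree)))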

2. Finding the affected tests

The next step is to find out which tests use the modified files.

For C++, our article How to obtain files included in C++ compilation describes how to determine the files that are included when compiling a test; if any of the modified files is among a test's includes, the test is affected by the changes. However, we need to be able to iterate over all tests, and we also need to know which test uses which executable (multiple tests may share the same executable, for instance). Since Dumux uses cmake and ctest, the solution adopted here was to write out metadata about the tests during the configuration of the project. For details on how to do this, see How to write metadata about tests with cmake or ctest.
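To sketch the first part of this idea (the actual tooling is described in the linked articles), the include list of a test's main source file can, for instance, be obtained via the compiler's preprocessor with g++ -MM and intersected with the list of changed files. The helper below is purely illustrative and assumes that paths have been normalized to a common root:

# sketch: check whether a test source file includes any of the changed files
import subprocess


def included_files(source_file, include_dirs=()):
    """Return the files pulled in when compiling source_file, obtained via 'g++ -MM'."""
    cmd = ["g++", "-MM", source_file] + ["-I" + d for d in include_dirs]
    deps = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    # the output is a make rule: 'target.o: source.cc header1.hh header2.hh \ ...'
    deps = deps.replace("\\\n", " ").split(":", maxsplit=1)[1]
    return set(deps.split())


def is_affected(source_file, changed_files, include_dirs=()):
    """A test is affected if any changed file is among its includes (or is the source itself)."""
    return bool(included_files(source_file, include_dirs) & set(changed_files))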

3. Introducing a test selection job

Putting all the pieces together, a test selection job was introduced at the beginning of the pipeline, which looks something like this:

select tests:
  stage: configure
  script:
    - |
      if [[ -n "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" ]]; then
          echo "Detecting changes w.r.t to target branch '$CI_MERGE_REQUEST_TARGET_BRANCH_NAME'"
          python3 bin/testing/getchangedfiles.py --outfile changedfiles.txt \
                                                 --target-tree origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME
          python3 bin/testing/findtests.py --outfile affectedtests.json \
                                           --file-list changedfiles.txt \
                                           --build-dir build-cmake

      else
          echo "Skipping test selection, build/test stages will consider all tests!"
          touch affectedtests.json
      fi      
  artifacts:
    paths:
      - affectedtests.json

The result of this job is the file affectedtests.json, which contains the names and targets of all tests that have to be built and run by the subsequent stages.
These stages are set up such that, if affectedtests.json is empty, they simply build and run all tests. The above job script produces an empty file whenever CI_MERGE_REQUEST_TARGET_BRANCH_NAME is not defined, that is, for any pipeline that was not triggered from a merge request. For merge requests, however, this variable is defined, and the scripts getchangedfiles.py and findtests.py essentially carry out the two steps discussed above in order to detect the affected tests. That is, the former is basically a wrapper around git diff-tree, while the latter goes over the metadata of all tests, determines the files included when compiling each test's target, and checks whether any of the files in changedfiles.txt (the output of getchangedfiles.py) is among them. If so, the test is added to the list of affected tests.
The resulting affectedtests.json file is passed as an artifact to the subsequent jobs such that they only build and execute those tests that need to be checked in order to verify that the introduced changes don't break anything.
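To illustrate how a subsequent stage might consume this artifact, here is a rough sketch. It assumes, purely for illustration, that affectedtests.json maps test names to an entry containing the corresponding build target; the real schema and build/test scripts in Dumux may differ:

# sketch: build and run only the affected tests (illustrative, not the actual Dumux scripts)
# assumed (hypothetical) schema: { "test_name": {"target": "build_target"}, ... }
import json
import subprocess

with open("affectedtests.json") as f:
    content = f.read().strip()
affected = json.loads(content) if content else {}

if not affected:
    # empty file (e.g. scheduled pipeline or push to master): build and run everything
    subprocess.run(["cmake", "--build", "build-cmake"], check=True)
    subprocess.run(["ctest"], cwd="build-cmake", check=True)
else:
    # build only the targets of the affected tests
    for target in sorted({info["target"] for info in affected.values()}):
        subprocess.run(["cmake", "--build", "build-cmake", "--target", target], check=True)
    # run only the affected tests, selected by an exact-match regex on their names
    regex = "|".join("^{}$".format(name) for name in affected)
    subprocess.run(["ctest", "-R", regex], cwd="build-cmake", check=True)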

Reducing the number of tests to be run on master

So far, we only perform an actual test selection in merge requests. According to the rules shown earlier, however, pipelines are also triggered by schedules and by commits to the master or any release branch. Schedules are typically set up to run at night, so there we simply run all tests. But during the day, one or more branches may be merged into master, and with the configuration illustrated so far, each of these merges triggers a complete test pipeline on master.

1. Finding successful pipelines

However, an accepted merge request should in principle already have a successful pipeline associated with it. So in most cases, the pipelines triggered on master after merges are actually obsolete. Thus, we would like to check whether a successful pipeline already exists for the current state. This can be done with queries via the GitLab API (see this article for more details). In Dumux, this check has been added as a job that runs before the actual test pipeline.
The main parts of its definition in the .gitlab-ci.yml file look something like this:

check-pipeline-status:
  stage: check-status
  
  # only run this job after merging new changes into master or release branches
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      when: never
    - if: $CI_PIPELINE_SOURCE == "pipeline"
      when: never
    - if: '$CI_COMMIT_BRANCH =~ /^(master|releases\/)/'
      when: always
  script:
    - |
      if ! python3 .gitlab-ci/getpipelineinfo.py --access-token $CI_JOB_API_READ_TOKEN \
                                                 --look-for HEAD \
                                                 --print-format pipeline-id; then
          echo "No successful pipeline found. Triggering a new one"

          REFERENCE_SHA=$(python3 .gitlab-ci/getpipelineinfo.py \
                                   --access-token $CI_JOB_API_READ_TOKEN \
                                   --look-for latest \
                                   --print-format commit-sha)

          curl --request POST --form "token=$CI_JOB_TOKEN" \
                              --form ref=$CI_COMMIT_BRANCH \
                              --form "variables[CI_REFERENCE_SHA]=$REFERENCE_SHA" \
                              "https://git.iws.uni-stuttgart.de/api/v4/projects/31/trigger/pipeline"
      else
          echo "Found successful pipeline for the current state of the branch. Not testing again."
      fi      

First of all, we see that the rules: section enforces that this job is not executed when the pipeline was triggered from another pipeline or from a schedule. As mentioned earlier, we want this check to be done only after merging new changes into master or release branches, in order to skip the test pipeline in case the tip of the merged branch already has a successful pipeline associated with it.

The script getpipelineinfo.py is basically a wrapper around API calls as described in this article. When given the option --look-for HEAD, it specifically checks if a successful pipeline can be found for the current tip. If that search is successful, the above job script simply prints a message and exits successfully. This ports the successful pipeline status to the master/release branch without retriggering the test pipelines.

On the other hand, if the current tip has not yet been tested successfully, getpipelineinfo.py is called again, this time with the option --look-for latest. In this mode, the script goes back in the git history until it finds the last commit for which a successful pipeline has been run. A new pipeline is then triggered via an API call, in which the sha of that commit is forwarded in the variable REFERENCE_SHA.
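As a rough sketch of what such a lookup could look like (glossing over details such as merge commits, and not the actual getpipelineinfo.py), one could query the GitLab pipelines API filtered by commit sha and status, and walk back the git history for the --look-for latest case:

# sketch of a pipeline lookup via the GitLab API (illustrative, not the actual getpipelineinfo.py)
import json
import subprocess
import urllib.request

# project API URL taken from the trigger call shown above
API_URL = "https://git.iws.uni-stuttgart.de/api/v4/projects/31"


def has_successful_pipeline(sha, token):
    """Check whether a successful pipeline exists for the given commit sha."""
    url = "{}/pipelines?sha={}&status=success".format(API_URL, sha)
    request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(request) as response:
        return len(json.load(response)) > 0


def latest_tested_commit(token, max_commits=50):
    """Walk back the history and return the first commit with a successful pipeline."""
    history = subprocess.run(["git", "rev-list", "--max-count", str(max_commits), "HEAD"],
                             check=True, capture_output=True, text=True).stdout.split()
    for sha in history:
        if has_successful_pipeline(sha, token):
            return sha
    raise RuntimeError("No successfully tested commit found")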

2. Modifying the trigger job

This new logic requires us to modify the rules previously defined in the “.base-trigger” job, because the case in which '$CI_COMMIT_BRANCH =~ /^(master|releases\/)/' is true is now handled by the “check-pipeline-status” job. Therefore, we substitute this rule in “.base-trigger” with if: $CI_PIPELINE_SOURCE == "pipeline", which matches pipelines created via the API call from within “check-pipeline-status”. Thus, the new rules amount to:

.base-trigger:
  stage: trigger-test-pipelines
  trigger:
    include: .pipeline.yml
    strategy: depend
  rules:
    # scheduled (e.g. nightly) pipelines
    - if: $CI_PIPELINE_SOURCE == "schedule"
    # pipelines triggered via the API call in "check-pipeline-status"
    - if: $CI_PIPELINE_SOURCE == "pipeline"
    # merge requests: pipelines have to be started manually
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual

3. Extending the “select tests” job

Recall the definition of the “select tests” job shown before, in which the tests affected by the changes in a merge request are detected. This job is now extended to also handle pipelines triggered via the API call stated in “check-pipeline-status”:

select tests:
  stage: configure
  script:
    - |
      if [[ -n "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" ]]; then
          echo "Detecting changes w.r.t to target branch '$CI_MERGE_REQUEST_TARGET_BRANCH_NAME'"
          python3 bin/testing/getchangedfiles.py --outfile changedfiles.txt \
                                                 --target-tree origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME
          python3 bin/testing/findtests.py --outfile affectedtests.json \
                                           --file-list changedfiles.txt \
                                           --build-dir build-cmake

      elif [[ -n "$REFERENCE_SHA" ]]; then
           echo "Detecting changes w.r.t to reference commit $REFERENCE_SHA"
           python3 bin/testing/getchangedfiles.py --outfile changedfiles.txt \
                                                  --source-tree HEAD \
                                                  --target-tree $REFERENCE_SHA
           python3 bin/testing/findtests.py --outfile affectedtests.json \
                                            --file-list changedfiles.txt \
                                            --build-dir build-cmake

      else
          echo "Skipping test selection, build/test stages will consider all tests!"
          touch affectedtests.json
      fi      
  artifacts:
    paths:
      - affectedtests.json
    expire_in: 3 hours

The elif [[ -n "$REFERENCE_SHA" ]] clause handles exactly the case described above: the tests affected by the modifications since the commit associated with the last successful pipeline are detected and built/run in the subsequent stages.

Summary

This article outlines the steps that were taken in the Dumux project to reduce the computational cost of the test pipelines in the daily development workflow using the GitLab CI.

The resulting behavior of the CI can be summarized as follows:

  • Merge requests:
    • run only tests affected by introduced changes
  • Commits to master/release branches (e.g. via accepted merge request):
    • if a successful pipeline is found for the merged state: port the success status to the master/release branch
    • otherwise: run only the tests affected by the changes since the last successfully tested commit
  • Schedules (e.g. every night):
    • test the overall system
