Version control with git


Version control goals

Reasonable software quality can be maintained by relying on two things:
- version control and
- test-driven development.

Well-tested code is likely to be properly organized (modular), whatever the programming model:
- structured,
- object-oriented or
- functional.
Modular code is a standard best practice and helps people to work on different parts of the code simultaneously. It pairs easily with version control, which helps to keep an overview of the different working states of the different modules.

Version control helps to prevent the duplication of files by tracking the changes (differences) between them across different commits. It also helps to keep an overview of different versions. A version control system (VCS) makes it possible to keep a single set of files (raw text, LaTeX, Fortran/C++/Python/Bash source code) and to switch between different versions of it (so-called branches). A popular example is git.

Thus a VCS enables you to have one folder: when you switch branches, it removes, modifies or adds files according to the state of the branch you are checking out. This can also save a lot of storage on your computer, since you do not need dozens or even hundreds of folders and files that you sort manually.

[Figure: git graph. The ‘feature’ branch starts after the first commit on ‘master’. Two commits follow on ‘feature’, one on ‘master’ and a third on ‘feature’; finally ‘feature’ is merged into ‘master’.]

Example of a workflow with one ‘master’ and one ‘feature’ branch. Time runs from left to right. Commits are shown in yellow, branches in blue and red.

A VCS makes every step (commit) in the history of your files traceable. You can go back to previous versions of your code at any time, and you can work on different things (like modules/features in your modular code) at the same time, in different branches. Once you are satisfied with your work in one branch, you merge it into the ‘master’ branch. This means your changes are integrated into the ‘master’ branch. In the above example the branch ‘feature’ is merged into ‘master’ in the last commit (the one on the very right).
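The branch-and-merge sequence from the example can be tried in a throwaway folder. The following is only a minimal sketch: the temporary directory, the file names and the user identity are placeholders invented for this illustration, and it assumes git is installed.

```shell
set -e
# Throwaway repository; path, file names and identity are placeholders.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Researcher"
git config user.email "researcher@example.org"

# First commit on the main branch.
echo "solver" > solver.txt
git add solver.txt
git commit -q -m "Add solver"
git branch -M master              # make sure the main branch is called 'master'

# Start a feature branch and commit a file that exists only there.
git checkout -q -b feature
echo "new idea" > idea.txt
git add idea.txt
git commit -q -m "Add first draft of the idea"

# Switching back to 'master' removes idea.txt from the folder again.
git checkout -q master
ls                                # shows only solver.txt
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Merging brings the work from 'feature' into 'master'.
git merge -q --no-edit feature
ls                                # shows idea.txt, notes.txt and solver.txt
```

Only one folder exists the whole time; git swaps the files in it whenever a branch is checked out.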

The version control system that is widely used, and adopted here, is git. It is available on all major operating systems, it is open source, and it is distributed. The distributed nature of git makes it possible to work without a connection to a remote repository (abbr. repo) and to push the updates easily once the connection is established again.

Version control

The principle is to have one main git repository into which contributions from different researchers are continuously integrated. The only branches should be ‘master’, ‘dev’ and the feature branches; see the next section. Tags are used for the papers.

The main repository should be on a server that is accessible to all researchers involved at all times. It should not be someone’s personal laptop but a GitLab/GitHub server. These provide an interface not only for working on your repository but also for tracking issues and reviewing code development and code changes. We also use the continuous integration feature provided by GitLab.

Every researcher can have a clone of the main repository on their local computer (a local repo), where they can work on the code. Every noteworthy increment should get a commit, and every commit can be pushed to the server.
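As a minimal sketch of this clone, commit and push cycle, the script below uses a bare repository in a temporary folder as a stand-in for the main repository on the server. That stand-in, the file name and the user identity are assumptions made for illustration; with a real server you would clone its URL instead.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the main repository on the GitLab/GitHub server (assumption).
git init -q --bare "$work/main.git"

# Clone the main repository to get a local repo.
git clone -q "$work/main.git" "$work/local"
cd "$work/local"
git config user.name "Example Researcher"
git config user.email "researcher@example.org"

# A noteworthy increment gets a commit ...
echo "first result" > results.txt
git add results.txt
git commit -q -m "Add first result"

# ... and the commit is pushed to the main repository.
git push -q origin HEAD
```

The commit itself is created locally and needs no connection; only the final push talks to the remote.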

A simple workflow

Most of the work with git can be done using this simple git workflow:

  1. Create a repository, add the existing code and push it to the remote repository.
  2. Create a feature branch (a version). Work on it, test it, refactor the code.
  3. Integrate the feature branch into your local development branch. Make sure you have first pulled the latest changes from the development branch of the central group repository into your local development branch. Resolve possible conflicts as well as you can. Run your tests and all other tests to make sure nothing is broken. Submit a merge request to the remote central development branch.
    Once your merge request has been processed by your colleague / git manager, you can clean up the feature branches from your repos, local and upstream.
  4. When the code is ready for a release, integrate the development branch into the master branch. Run all tests.
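The four steps can be sketched on the command line as follows. This is a hedged illustration, not a prescribed script: a bare repository in a temporary folder stands in for the central group repository, the file name and user identity are invented, and the merge request (which is opened in the GitLab/GitHub web interface) is replaced here by a direct push.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/central.git"    # stand-in for the central group repository
git clone -q "$work/central.git" "$work/local"
cd "$work/local"
git config user.name "Example Researcher"
git config user.email "researcher@example.org"

# Step 1: add the existing code and push it; create the development branch.
echo "first-order scheme" > solver.txt
git add solver.txt
git commit -q -m "Add existing solver code"
git branch -M master
git push -q origin master
git checkout -q -b dev
git push -q origin dev

# Step 2: create a feature branch and work on it.
git checkout -q -b feature/second-order
echo "second-order scheme" >> solver.txt
git commit -q -a -m "Add second-order scheme"

# Step 3: update the local development branch, then integrate the feature.
git checkout -q dev
git pull -q origin dev
git merge -q --no-edit feature/second-order
# Run the tests here. The push below stands in for the merge request.
git push -q origin dev
git branch -d feature/second-order        # clean up the local feature branch
# If the feature branch was pushed: git push origin --delete feature/second-order

# Step 4: release by integrating the development branch into master.
git checkout -q master
git merge -q --no-edit dev
git push -q origin master
```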

Branch naming and organization should be done using the feature branching workflow:

  • master: everything here compiles, every test here runs.
  • dev: tested and stable new features are integrated here.
  • feature/parallelization: you are working on parallelization here, and nothing else.
  • feature/second-order: you are working on second-order convergence here, nothing else.
  • feature/… another feature-branch of yours

Important git commands

There are many more commands, and one can make version control arbitrarily complex with them. These commands are enough for a clean and simple workflow.

Command                             Description
git clone <URL>                     Create a clone, a copy of the remote repository.
git remote add <repo-name> <URL>    Add a remote (e.g. an official TU Darmstadt repo, a research group repo, your private repository on a USB stick).
git add <file>                      Add the changes to the file to the next commit (a set of changes you are happy with).
git commit -m "MESSAGE"             Commit a set of changes with a precise and short description (here ‘MESSAGE’).
git push <repo-name> <branch-name>  Push the changes to a specific branch of the remote repository.
git pull <repo-name> <branch-name>  Fetch the changes from the remote branch and merge them.
git checkout -b <branch-name>       Create a new branch (a new idea) and switch to it.
git checkout <branch>               Load the files from the other branch, ‘switch’ to it.
git merge <branch>                  Merge a branch (could be your feature branch) into the branch you’re currently on.
git tag <tag-name>                  Create a git tag, a read-only point in the development tree, when you publish a paper.
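As a small, hedged example of `git remote add` and `git tag`, the sketch below registers a second repository under the name `group` and marks the current commit with a tag. The remote name, the tag name `paper-example`, the file name and the bare repository that plays the group repo are all made up for illustration.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/group.git"      # stand-in for a research group repo
git init -q "$work/code"
cd "$work/code"
git config user.name "Example Researcher"
git config user.email "researcher@example.org"

echo "result" > results.txt
git add results.txt
git commit -q -m "Add results for the paper"

# Register the group repository as a remote called 'group'.
git remote add group "$work/group.git"

# Mark the state of the code with an annotated tag and push the tag.
git tag -a paper-example -m "State of the code for the example paper"
git push -q group paper-example

git tag                                   # lists the tags of this repository
```

The `-a` and `-m` options create an annotated tag, which stores a message along with the tagged commit; a plain `git tag <tag-name>` creates a lightweight tag instead.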

Best practices

Keep things simple:

  1. Branch for every new idea; it only costs 46 kB of space, even if you branch off within OpenFOAM, which has 500k lines of code.
  2. When you are working on an idea, you are working on its branch.
    You should not work on two ideas on the same branch, that’s how one exits Zen. If a change should be shared between branches, check out the parent branch (dev, or another feature), apply the change there, then merge the parent branch into the current branch.
  3. Commit often; do not put whole libraries in a single commit. Don’t create 1-commit PhD code.
  4. Integrate your branches with the central project repository, if there is one.
    When the feature branch is done, do a git pull on your local development branch to bring it up to date. Merge your feature branch into your local development branch, and submit a merge request to the development branch on the remote central repository (group repository). Once your merge request is done, delete the feature branch in your repo (local and remote); don’t keep 30 feature branches in your repositories. Additionally, if the paper is finished, a tag that identifies the paper can be created from the development branch, using the DOI as the version number.
  5. If you are talking about rebasing history, reverting commits, undoing commits, cherry picking, and other stuff like that, you are not using enough branches and are committing too rarely.
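Sharing a change between branches through the parent branch (item 2) could look like the sketch below. The file names, the commit messages and the user identity are placeholders, and the repository is a throwaway one in a temporary folder.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Researcher"
git config user.email "researcher@example.org"

echo "common code" > common.txt
git add common.txt
git commit -q -m "Add common code"
git branch -M dev                         # the parent branch in this sketch

# Work on one idea on its own branch.
git checkout -q -b feature/parallelization
echo "parallel loop" > parallel.txt
git add parallel.txt
git commit -q -m "Start parallelization"

# A fix that other branches need as well is applied on the parent branch ...
git checkout -q dev
echo "fix" >> common.txt
git commit -q -a -m "Fix common code"

# ... and is then merged from there into the feature branch.
git checkout -q feature/parallelization
git merge -q --no-edit dev
```

After the merge the feature branch contains both its own work and the shared fix, and any other feature branch can pick up the fix from `dev` in the same way.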

The lowest acceptable frequency of continuous integration is determined by the frequency of published papers: at the point of publication of a paper, all test cases related to the paper, including the validation and verification cases reported in it, must execute on the master branch of the project’s git repository without breaking any other tests.

Where to start

To start, install git and get an account. An example of a git workflow (with Bitbucket instead of GitLab) is given in this article.

Please also refer to our literature, or get a list of our articles assigned to this chapter via the tags at the top of every page.

See also