How to run a job on the cluster using SLURM and Singularity

Singularity is the container tool used on the HPC cluster. If you are new to Singularity or to containers, please refer to the introductory literature on Singularity first.

SLURM is a workload manager used to run jobs on HPC systems. It is used, for example, on the Lichtenberg cluster.

Working with Singularity containers on the cluster

  1. Place the container image (.sif) and the overlay image in the project folder.

  2. Create a job script (here my_job.sh) with the following structure:

    #!/bin/bash
    #SBATCH -J my_project_job
    #SBATCH --mail-type=END
    # SLURM does not expand shell variables such as $USER in #SBATCH lines; %u is the user name
    #SBATCH --error=/home/%u/project<number>/"%x.err.%j"
    #SBATCH --output=/home/%u/project<number>/"%x.out.%j"
    #SBATCH -n 1                # Number of tasks (one core per task by default)
    #SBATCH --mem-per-cpu=1600  # Main memory in MByte per allocated core
    #SBATCH -t 01:30:00         # Hours, minutes and seconds, or '#SBATCH -t 10' = only minutes
    
    # -------------------------------
    srun singularity run -o overlay.img lent_latest.sif foamVersion
    

    The folder project<number> must exist in your $HOME folder. Otherwise the job will fail and you will receive an error e-mail from the HPC system.

    In practice, the command will of course not be as simple as foamVersion. Remember that the %runscript section of the container definition lets you wrap commands and read the arguments passed to singularity run. For example, suppose a command named lentFoam is available only after sourcing some bashrc files and operates on the current working directory. We can then put the following commands in the %runscript section:

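    %runscript
        # Load the OpenFOAM and lent environments
        source /opt/OpenFOAM/OpenFOAM-v1806/etc/bashrc
        source /opt/OpenFOAM/lent/etc/bashrc
        # $1 is the case directory passed to 'singularity run'
        cd "$1" || exit 1
        lentFoam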
    

    and initiate our run with the following command:

    srun singularity run -o overlay.img lent_latest.sif /opt/OpenFOAM/lent/cases/some/random/folder/for/simulation
    

    Here we provide only a path, which the %runscript picks up as argument $1 and uses to change into that directory before calling lentFoam. Be careful: such a %runscript may be too restrictive for some applications. It is up to the user to decide on the level of automation and abstraction inside the %runscript.

  3. Submit the script with the following command:

    sbatch my_job.sh
    

    Where the results are written depends on how the container's permissions were set up and on the output folders used by the executed commands. In the setup described here, the data should end up inside the overlay image.
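
Because a missing project folder or image only shows up after submission as a failed job, it can help to check the inputs before calling sbatch. The following is a minimal sketch, not part of SLURM or Singularity: check_job_inputs is a hypothetical helper, and it assumes that the container and the overlay image sit in the same project folder that receives the log files.

```shell
#!/bin/bash
# Illustrative pre-flight check; check_job_inputs is a hypothetical helper.
# Arguments: project folder, container file name, overlay file name.
# Returns 0 if everything is present, 1 otherwise.
check_job_inputs() {
    local project_dir="$1" container="$2" overlay="$3"
    local f missing=0

    # The folder named in --error/--output must exist before submission.
    if [ ! -d "$project_dir" ]; then
        echo "missing project folder: $project_dir" >&2
        missing=1
    fi

    # Assumption: both images are stored directly in the project folder.
    for f in "$container" "$overlay"; do
        if [ ! -f "$project_dir/$f" ]; then
            echo "missing file: $project_dir/$f" >&2
            missing=1
        fi
    done

    return "$missing"
}
```

With the file names used above, the check could be chained with the submission as check_job_inputs followed by && sbatch my_job.sh, passing your own project folder, lent_latest.sif and overlay.img as arguments, so that sbatch only runs when all inputs are in place.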

See also