====== Detailed Singularity walk-through ======

This walk-through is based on our Singularity tutorial and consists of several exercises that will guide you through the entire process, from installing Singularity and creating images to actually running them on a different system (Peregrine). The solutions and explanations can be found at the bottom of the page.

The walk-through assumes that you have an Ubuntu-like system with root privileges (for the tutorial we use Xubuntu VMs) and that the container will be run with a regular user account on Peregrine.

===== End goal of this walk-through =====

At the end of this walk-through, you will be able to submit a job on Peregrine that uses Singularity to run the following Python-pandas script, which takes an input file and processes it:

<code python script.py>
# Based on https://www.kaggle.com/antgoldbloom/exploring-climate-change-data
import sys

import matplotlib
# Use a non-interactive backend, so that plotting also works without a
# display (e.g. in a batch job)
matplotlib.use('Agg')
import pandas as pd

# Check usage
if len(sys.argv) < 2:
    print("Usage: %s <csv file>" % sys.argv[0])
    sys.exit(1)
csvfile = sys.argv[1]

# Sort all cities on average temperature in 2012
dfTempByCity = pd.read_csv(csvfile, index_col='dt', parse_dates=[0])
cities_2012_avg = dfTempByCity[dfTempByCity.index.year == 2012][
    ['City', 'Country', 'AverageTemperature']
].groupby(['City', 'Country']).mean().sort_values('AverageTemperature', ascending=False)

print("Cities with highest average temperature in 2012:")
print(cities_2012_avg.head())
print("")
print("Cities with lowest average temperature in 2012:")
print(cities_2012_avg.tail())

# Temperature in Groningen, using a 12-month rolling mean to smooth out seasonality
groningen = dfTempByCity[dfTempByCity['City'] == 'Groningen']['AverageTemperature']
plot = groningen.rolling(window=12).mean().plot()
plot.get_figure().savefig('plot.png')
</code>

The required dataset can be
downloaded from [[https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data/downloads/GlobalLandTemperaturesByCity.csv|Kaggle]] (note that you will need to log in to access the data); we will refer to it as inputfile.csv. For the exercises we assume that both files are stored in /data/<peregrine_username> on Peregrine.

===== Exercises =====

==== Exercise 1: Install Singularity ====

Install Singularity on your machine. You can use the instructions from the [[http://singularity.lbl.gov/install-linux|Singularity website]]. On Peregrine we use the latest stable version, so it is best to use the same one on your own machine.

[[#exercise_1install_singularity1|Solution]]

----

==== Exercise 2: Create a Singularity image and populate it with python:3.5.3 (Debian Jessie) ====

Create a blank Singularity image and then populate it by using the import function to import a Python 3.5.3 image from the [[https://hub.docker.com/|Docker Hub]]. Finally, shell into your container and test it:

  * verify which operating system is installed;
  * check your username, user id and groups;
  * check the disk space usage;
  * do environment variables like $PATH get transferred from the host to the container?

[[#exercise-2create_a_singularity_image_and_populate_it_with_python353_Debian_Jesse1|Solution]]

----

==== Exercise 3: Modify and run ====

We need to add a few more things to our container. Before doing this, we have to expand its size: use the expand function to add 1 GB of space to the image. Then shell into it with write permission, update the packages through the package manager, and finally install the Python packages pandas and matplotlib.

Test the container by running the following command in it:

<code bash>
python3.5 script.py inputfile.csv
</code>

[[#exercise_3modify_and_run1|Solution]]

----

==== Exercise 4: Copy to Peregrine and run ====

Now that the container is ready, we are going to run it on Peregrine.
First copy the image to Peregrine; because of its size, store it in your personal directory on /data. Then log in to the interactive node of Peregrine and try to shell into your container using the ''%%--contain%%'' option: what does it do? What happens if you store a file on /home and leave the container? Also try the ''%%--bind%%'' option to mount your /data/<peregrine_username> directory in the container: make sure that you can find all the necessary files (Python script and input data) in your container.

[[#exercise_4copy_to_peregrine_and_run1|Solution]]

----

==== Exercise 5: Create and submit job ====

Now we want to run the Python script in the container through a SLURM job. Write a job script that requests a few minutes of wall clock time and runs the job in the short partition, and use the singularity exec function to run the Python script. You can store the job script in the same directory as all the other files. Finally, submit the job and check the results when it is done.

[[#exercise_5create_and_submit_job1|Solution]]

----

==== Exercise 6: Automate workflow with bootstrap ====

In the previous exercises we set up the container image manually, which is not a very reproducible way of working. A definition file allows you to automate all these steps: the Singularity bootstrap command uses such a definition file to bootstrap an image. Write a definition file that contains all the steps needed to generate the same container as before, create a blank image, and bootstrap it using your definition file. Finally, copy it to Peregrine, run it again and verify that it works.

[[#exercise_6automate_workflow_with_bootstrap1|Solution]]

----

==== Exercise 7 (optional): GPUs ====

Singularity also allows you to make use of GPUs in your containers. In this exercise we are going to test this. First, you will need to compile a CUDA application.
You can for instance use the Hello World code from the following website:\\
http://computer-graphics.se/hello-world-for-cuda.html

Load the most recent CUDA module and compile the code with the nvcc compiler. In order to run it on a GPU node, write a job script that requests one GPU, runs the job in the "gpu" partition and requests an appropriate amount of wall time (for the Hello World example 1 minute is fine). The job script only has to run your compiled application through the container.\\
Note that your application needs the NVIDIA drivers in order to run on the GPU. On the GPU nodes these can be found in /usr/lib64/nvidia; you will need to make them available in your container as well. Hint:\\
use a bind mount and adjust your ''%%$LD_LIBRARY_PATH%%''.

[[#exercise_7_optionalgpus1|Solution]]

----

===== Solutions =====

==== Exercise 1: Install Singularity ====

You may have to install these build dependencies first:

<code bash>
sudo apt install build-essential autoconf libtool wget
</code>

In order to install Singularity 2.2.1 (the latest stable release at the time of writing), run the following commands:

<code bash>
wget https://github.com/singularityware/singularity/releases/download/2.2.1/singularity-2.2.1.tar.gz
tar xzvf singularity-2.2.1.tar.gz
cd singularity-2.2.1
./configure --prefix=/usr/local
make
sudo make install
</code>

----
[[#exercise_1install_singularity|Back to exercise 1]]\\
[[#exercise_2create_a_singularity_image_and_populate_it_with_python353_debian_jesse|To exercise 2]]

----

==== Exercise 2: Create a Singularity image and populate it with python:3.5.3 (Debian Jessie) ====

Create a blank image ''%%tutorial.img%%'':

<code bash>
sudo singularity create tutorial.img
</code>

Import the Docker image python:3.5.3 into your tutorial.img:

<code bash>
sudo singularity import tutorial.img docker://python:3.5.3
</code>

Shell into the container:

<code bash>
singularity shell --shell /bin/bash tutorial.img
</code>

Test it by running these commands:

<code bash>
cat /etc/os-release
id
df -h
env
exit
</code>
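To make the $PATH question above concrete, you can compare a few environment variables on the host and inside the container. The sketch below is not part of the tutorial: the file name env_report.py and the selection of variables are just for illustration. Run it with ''%%python3 env_report.py%%'' on the host and with ''%%singularity exec tutorial.img python3 env_report.py%%'' in the container, and compare the two outputs.

<code python env_report.py>
# Report a few environment variables; run this both on the host and inside
# the container, and compare the outputs to see which variables get
# transferred from the host.
import os

VARIABLES = ('PATH', 'HOME', 'SHELL', 'LD_LIBRARY_PATH')

env_report = {name: os.environ.get(name, '<not set>') for name in VARIABLES}
for name, value in env_report.items():
    print('%s=%s' % (name, value))
</code>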
----
[[#exercise_2create_a_singularity_image_and_populate_it_with_python353_debian_jesse|Back to exercise 2]]\\
[[#exercise_3modify_and_run|To exercise 3]]

----

==== Exercise 3: Modify and run ====

Expand the size of the image by 1024 MB:

<code bash>
sudo singularity expand --size 1024 tutorial.img
</code>

Shell into the container as root (sudo) and with write permission (''%%-w%%''); ''%%--shell /bin/bash%%'' will use a Bash shell instead of a Dash shell:

<code bash>
sudo singularity shell --shell /bin/bash -w tutorial.img
</code>

Run the following commands to update the package list, upgrade the packages, and install pandas and matplotlib through pip3:

<code bash>
apt update
apt upgrade
pip3 install pandas matplotlib
exit
</code>

In order to run the Python script, either shell into the container and run the command, or use singularity exec:

<code bash>
singularity exec tutorial.img python3.5 script.py inputfile.csv
</code>

----
[[#exercise_3modify_and_run|Back to exercise 3]]\\
[[#exercise_4copy_to_peregrine_and_run|To exercise 4]]

----

==== Exercise 4: Copy to Peregrine and run ====

First, we have to create a ''%%/data/<peregrine_username>%%'' directory in the image that will serve as a mount point for Peregrine's ''%%/data/<peregrine_username>%%'':

<code bash>
sudo singularity shell --shell /bin/bash --writable tutorial.img
</code>

Run these commands in the container:

<code bash>
mkdir -p /data/<peregrine_username>
exit
</code>

Now copy the image to your data directory on Peregrine, log in to Peregrine, go to the data directory, and start your container:

<code bash>
scp tutorial.img <peregrine_username>@peregrine.hpc.rug.nl:/data/<peregrine_username>/SingularityTutorial
ssh <peregrine_username>@peregrine.hpc.rug.nl
cd /data/$USER/SingularityTutorial
singularity shell --shell /bin/bash tutorial.img
</code>

Test the container by running some commands, e.g.:

<code bash>
pwd
ls -l $HOME
exit
</code>

Find out what the ''%%--contain%%'' option for shell does:

<code bash>
singularity shell --contain --shell /bin/bash tutorial.img
pwd
ls -l $HOME
touch $HOME/testfile.txt
exit
</code>

And do the same for the ''%%--bind%%'' option:

<code bash>
singularity shell --bind /data/$USER/:/data/$USER --shell /bin/bash tutorial.img
ls -lh /data/$USER
exit
</code>

----
[[#exercise_4copy_to_peregrine_and_run|Back to exercise 4]]\\
[[#exercise_5create_and_submit_job|To exercise 5]]

----

==== Exercise 5: Create and submit job ====

Open a text editor to create a jobscript.sh, e.g.:

<code bash>
nano jobscript.sh
</code>

The contents of the job script should look like this:

<code bash>
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --partition=short

singularity exec --bind /data/$USER/:/data/$USER/ tutorial.img python3.5 script.py inputfile.csv
</code>

We did not specify any further resource requirements; the defaults (1 core, 2 GB of memory) are fine for this job. Submit the job using:

<code bash>
sbatch jobscript.sh
</code>

Finally, study the output file:

<code bash>
less slurm-*.out
</code>

----
[[#exercise_5create_and_submit_job|Back to exercise 5]]\\
[[#exercise_6automate_workflow_with_bootstrap|To exercise 6]]

----

==== Exercise 6: Automate workflow with bootstrap ====

We need the debootstrap command on the host machine; on Ubuntu you can install it using:

<code bash>
sudo apt install debootstrap
</code>

Create a blank image (of about 1 GB) and bootstrap it using:

<code bash>
sudo singularity create --size 1024 tutorial.img
sudo singularity bootstrap tutorial.img pandas.def
</code>

Here pandas.def is the definition file; it should look like this:

<code bash>
BootStrap: debootstrap
OSVersion: zesty
MirrorURL: http://nl.archive.ubuntu.com/ubuntu

%setup
    echo "Looking in directory '$SINGULARITY_ROOTFS' for /bin/sh"
    if [ ! -x "$SINGULARITY_ROOTFS/bin/sh" ]; then
        echo "Hrmm, this container does not have /bin/sh installed..."
        exit 1
    fi
    exit 0

%runscript
    echo "Welcome to Ubuntu 17.04"
    exec python3.5 "$@"

%post
    echo "Adding universe and multiverse repos"
    sed -i 's/$/ universe/' /etc/apt/sources.list
    sed -i 's/$/ multiverse/' /etc/apt/sources.list
    apt update
    apt -y upgrade
    apt install -y python3 python3-pip
    pip3 install numpy pandas matplotlib
    mkdir -p /data/<peregrine_username>
</code>

Finally, copy the image to Peregrine and run it (either interactively or through a batch job as before):

<code bash>
scp tutorial.img <peregrine_username>@peregrine.hpc.rug.nl:/data/<peregrine_username>/SingularityTutorial
ssh <peregrine_username>@peregrine.hpc.rug.nl
cd /data/$USER/SingularityTutorial
singularity run --bind /data/$USER/:/data/$USER/ tutorial.img script.py inputfile.csv
</code>

----
[[#exercise_6automate_workflow_with_bootstrap|Back to exercise 6]]\\
[[#exercise_7_optionalgpus|To exercise 7]]

----

==== Exercise 7 (optional): GPUs ====

Create the mount point for the NVIDIA drivers in the container:

<code bash>
sudo singularity shell --shell /bin/bash --writable tutorial.img
mkdir -p /usr/lib64/nvidia
exit
</code>

Copy the image to Peregrine, log in, and change to the right directory:

<code bash>
scp tutorial.img <peregrine_username>@peregrine.hpc.rug.nl:/data/<peregrine_username>/SingularityTutorial
ssh <peregrine_username>@peregrine.hpc.rug.nl
cd /data/$USER/SingularityTutorial
</code>

Now put the CUDA source code in a file, e.g. hello_world.cu, and compile it:

<code bash>
module load CUDA/8.0.61
nvcc -o hello_world hello_world.cu
</code>

Create a job script using your favorite text editor:

<code bash>
nano jobscript_gpu.sh
</code>

with the following contents:

<code bash>
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

export LD_LIBRARY_PATH=/usr/lib64/nvidia:$LD_LIBRARY_PATH
singularity exec --bind /usr/lib64/nvidia:/usr/lib64/nvidia --bind /data/$USER/:/data/$USER tutorial.img ./hello_world
</code>

Setting the LD_LIBRARY_PATH to /usr/lib64/nvidia is required to let your application find the drivers.
The first ''%%--bind%%'' option makes sure that the NVIDIA drivers from the host are mounted in your container; the second one makes your data directory available. In this case you could also store the CUDA application in your home directory, which is mounted automatically; then you do not need the second bind option.

Finally, submit your job using:

<code bash>
sbatch jobscript_gpu.sh
</code>

And check the results in the slurm-*.out file.

----
[[#exercise_7_optionalgpus|Back to exercise 7]]

----
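As a closing note on exercise 7: the reason adjusting ''%%$LD_LIBRARY_PATH%%'' helps is that the dynamic linker tries its colon-separated directories from left to right and uses the first match. The sketch below models that search order in plain Python; the ''%%resolve%%'' function and the directory contents are made up for illustration and are only a simplified stand-in for the real linker.

<code python>
# Simplified model of how the dynamic linker searches LD_LIBRARY_PATH:
# directories are tried left to right, and the first one containing the
# requested library wins.

def resolve(library, search_path, directory_contents):
    """Return the path of the first match of `library` in `search_path`.

    `directory_contents` maps directory -> set of library names, standing
    in for the real file system.
    """
    for directory in search_path.split(':'):
        if library in directory_contents.get(directory, set()):
            return directory + '/' + library
    return None

# Hypothetical contents: the bind-mounted driver directory and a system dir
contents = {
    '/usr/lib64/nvidia': {'libcuda.so.1'},
    '/usr/lib/x86_64-linux-gnu': {'libc.so.6'},
}

print(resolve('libcuda.so.1', '/usr/lib64/nvidia:/usr/lib/x86_64-linux-gnu', contents))
</code>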