Running jobs on GPUs
If you want your job to use a special resource such as a GPU, you have to request it explicitly. This can be done using the new Slurm option:
#SBATCH --gpus-per-node=n
where n is the number of GPUs you want to use per node.
Alternatively you can request a specific GPU type using:
#SBATCH --gpus-per-node=type:n
where type is the type of GPU, given by its Slurm name in the table below. Note that it is also still possible to use the --gres option that was required on Peregrine.
Jobs requesting GPU resources will automatically end up in one of the GPU partitions.
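For job scripts carried over from Peregrine, the two request styles can be compared side by side. The sketch below assumes that the type name accepted by --gres is the same as the Slurm name listed in the table on this page, which is not confirmed here. Only one of the two directives should be active in a given job script, so the older form is shown disabled.

```shell
#!/bin/bash
# Newer style described above: two A100 GPUs per node.
#SBATCH --gpus-per-node=a100:2

# Older --gres style (also counted per node). The doubled '#' keeps Slurm
# from reading it; to use it, remove one '#' and delete the directive above.
# Assumption: the type name (a100) is the same for both options.
##SBATCH --gres=gpu:a100:2
```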
Available GPU types
Node | GPU type | GPUs per node | Memory per GPU | CPUs per node | Memory per node | Slurm name | Notes |
---|---|---|---|---|---|---|---|
A100_1 | Nvidia A100 | 4 | 40 GB | 64 | 512 GB | a100 | Full A100 cards |
A100_2 | Nvidia A100 | 8 | 20 GB | 64 | 512 GB | a100.20gb | Two virtual GPUs per A100 card |
V100 | Nvidia V100 | 1 | 32 GB | 12 | 128 GB | v100 | Not available yet, see Known issues |
Please be aware that the V100 nodes still need to be migrated to Hábrók.
Example
To request two full NVIDIA A100 GPUs, use the following:
#SBATCH --gpus-per-node=a100:2
If you want to request half of an NVIDIA A100 (one 20 GB virtual GPU), use the following:
#SBATCH --gpus-per-node=a100.20gb:1
If you just want one GPU, you can leave out the type. Your job will then go to a 20 GB A100 virtual GPU:
#SBATCH --gpus-per-node=1
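To put the directives above into context, a complete job script could look like the following minimal sketch. The job name, the time limit, and the diagnostic commands are illustrative assumptions, not site recommendations. CUDA_VISIBLE_DEVICES is the variable Slurm normally sets for GPU allocations; the script falls back to a placeholder when it is unset.

```shell
#!/bin/bash
# Hypothetical job name and an assumed wall-clock limit.
#SBATCH --job-name=gpu_example
#SBATCH --time=00:30:00
#SBATCH --nodes=1
# Two full A100 cards on the node, as in the example above.
#SBATCH --gpus-per-node=a100:2

# Slurm normally exports CUDA_VISIBLE_DEVICES for GPU jobs; fall back to
# "none" so the script also runs outside an allocation.
visible_gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "Visible GPUs: ${visible_gpus}"

# nvidia-smi exists only where the NVIDIA driver is installed, so guard the call.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv \
        || echo "nvidia-smi could not query the GPUs"
fi
```

The script would be submitted with sbatch, for example `sbatch gpu_example.sh`, where the file name is hypothetical.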
Interactive GPU node
Not yet available.
Running interactive jobs
You can request an interactive session by using a command like:
srun --gpus-per-node=1 --time=01:00:00 --pty /bin/bash
When the job starts running, you are automatically logged in to the allocated node, where you can run your commands interactively. When you are done, type exit to close the interactive job and release the allocated resources.
N.B.: interactive jobs currently do not always use the software stack built for the allocated nodes. You can work around this by running module update after the job has started.
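Once the interactive shell has started on the compute node, a quick check along the following lines can confirm what was allocated. This is only a sketch: it assumes the module command is provided by the cluster environment, and it guards every site-specific command so that it also runs harmlessly elsewhere.

```shell
# Reload the currently loaded modules for this node type, if the module
# command is available in this shell.
if command -v module >/dev/null 2>&1; then
    module update || echo "module update failed"
fi

# SLURM_JOB_ID is set by Slurm inside a job; "unknown" is only a fallback.
job_id="${SLURM_JOB_ID:-unknown}"
node_name="$(uname -n)"
echo "Job ${job_id} is running on ${node_name}"

# List the allocated GPUs when the NVIDIA tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not query the GPUs"
fi
```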