Differences
This shows you the differences between two versions of the page.
| | | ||
| habrok:advanced_job_management:running_jobs_on_gpus [2024/01/15 10:50] – [Running jobs on GPUs] camarocico | habrok:advanced_job_management:running_jobs_on_gpus [2025/09/29 09:25] (current) – [Running interactive jobs] Formatting pedro | ||
|---|---|---|---|
| Line 18: | Line 18: | ||
| ==== Available GPU types ==== | ==== Available GPU types ==== | ||
| - | ^ Node ^ GPU type ^ GPUs per node ^ Memory per GPU ^ CPUs per node ^ Memory per node ^ Slurm name ^ Notes ^ | + | ^ Node ^ GPU type ^ GPUs per node ^ Memory per GPU ^ CPUs per node ^ Memory per node ^ Slurm name ^ |
| - | | A100_1 | + | | A100 | Nvidia A100 | 4 | 40 GB | 64 | 512 GB | a100 | |
| - | | A100_2 | + | | V100 | Nvidia |
| - | | V100 | Nvidia | + | | L40S | Nvidia |
| ==== Example ==== | ==== Example ==== | ||
| Line 29: | Line 29: | ||
| < | < | ||
| #SBATCH --gpus-per-node=a100: | #SBATCH --gpus-per-node=a100: | ||
| - | </ | ||
| - | If you want to request a node with half of an NVIDIA A100, use the following: | ||
| - | |||
| - | < | ||
| - | #SBATCH --gpus-per-node=a100.20gb: | ||
| </ | </ | ||
| Line 47: | Line 42: | ||
| < | < | ||
| - | gpu1.hpc.rug.nl | + | gpu1.hb.hpc.rug.nl |
| - | gpu2.hpc.rug.nl | + | gpu2.hb.hpc.rug.nl |
| </ | </ | ||
| - | These machines have an NVIDIA | + | These machines have an NVIDIA |
| ** Please keep in mind that these are shared machines, so allow everyone to make use of these GPUs and do not perform long runs here. Long runs should be submitted as jobs to the scheduler. ** | ** Please keep in mind that these are shared machines, so allow everyone to make use of these GPUs and do not perform long runs here. Long runs should be submitted as jobs to the scheduler. ** | ||
| ==== Running interactive jobs ==== | ==== Running interactive jobs ==== | ||
| - | You can request an interactive session by using a command like: | + | You can usually |
| < | < | ||
| srun --gpus-per-node=1 --time=01: | srun --gpus-per-node=1 --time=01: | ||
| </ | </ | ||
| + | |||
| + | There is currently an issue with using '' | ||
| + | < | ||
| + | srun --gres=gpu: | ||
| + | </ | ||
| + | |||
| + | or: | ||
| + | < | ||
| + | srun --gres=gpu: | ||
| + | </ | ||
| + | |||
| When the job starts running, you will be automatically logged in to the allocated node, allowing you to run your commands interactively. When you are done, just type '' | When the job starts running, you will be automatically logged in to the allocated node, allowing you to run your commands interactively. When you are done, just type '' | ||
| **N.B.: interactive jobs currently don't (always) use the software stack built for the allocated nodes; you can work around this by first running '' | **N.B.: interactive jobs currently don't (always) use the software stack built for the allocated nodes; you can work around this by first running '' | ||
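
The revision above mentions two ways of asking Slurm for a GPU: ''--gpus-per-node'' and ''--gres=gpu:''. As a rough illustration of how the two spellings relate, the sketch below prints both forms for a given GPU type and count. The helper name `gpu_flags` is hypothetical and not part of Slurm or the cluster tooling. The type `a100` is taken from the "Slurm name" column of the table; whether a given type and count are accepted on a particular partition is an assumption that should be checked against the current page.

```shell
#!/bin/bash
# Sketch only: print both GPU request spellings for one type and count.
# gpu_flags is a hypothetical helper, not a Slurm command.
gpu_flags() {
    local gpu_type="$1"
    local count="${2:-1}"   # default to a single GPU

    # Reject an empty type or a count that is not a positive integer.
    if [ -z "$gpu_type" ] || ! [[ "$count" =~ ^[1-9][0-9]*$ ]]; then
        echo "usage: gpu_flags <slurm-gpu-name> [count]" >&2
        return 1
    fi

    # Slurm syntax: --gpus-per-node=[type:]number
    printf '%s\n' "--gpus-per-node=${gpu_type}:${count}"
    # Slurm syntax: --gres=gpu[:type]:count
    printf '%s\n' "--gres=gpu:${gpu_type}:${count}"
}

# Example: both spellings for one A100 (Slurm name assumed to be "a100").
gpu_flags a100 1
```

Either printed flag could then be pasted into an `srun` or `#SBATCH` line; the function only formats text and does not contact the scheduler.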