MPI
Software with MPI support can run on multiple nodes, and in this case the software is launched with one or more tasks (instances) per node. All these tasks use the network for inter-node communication. The following is an example of a jobscript that can be used for MPI applications:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2000

module purge
module load foss/2023a

# compile our source code; not required if this has been done before
mpicc -o ./my_mpi_app my_mpi_app.c

srun ./my_mpi_app
Here we request two nodes with four tasks on each node, i.e. in total we will be running 8 tasks (on 8 CPUs). The important aspect for an MPI application is that we launch it using srun, which is a scheduler command that ensures that the application is started on all allocated resources and with the right number of tasks. MPI's own mpirun can also be used, but we generally recommend using srun.
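For comparison, the two launch commands would look as follows inside the jobscript. This is only a sketch: with an OpenMPI installation that has Slurm support, mpirun will usually pick up the allocation by itself, but srun remains the recommended way to start the tasks.

# recommended: let the scheduler start one instance per allocated task
srun ./my_mpi_app

# alternative: OpenMPI's mpirun usually detects the Slurm allocation itself
#mpirun ./my_mpi_app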
Currently, two MPI implementations are supported/installed on Hábrók: OpenMPI and Intel MPI.
OpenMPI
OpenMPI is part of the foss toolchains, and it is used for most of the MPI applications that are available on Hábrók. If you're compiling custom software and want to use the GCC compilers, this is the recommended MPI implementation. The jobscript that we showed earlier should work fine for these applications.
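As a sketch of how compiling your own code with the foss toolchain could look, for example in an interactive session before submitting the job (my_mpi_app.c is just a placeholder name):

module purge
module load foss/2023a
# mpicc is OpenMPI's compiler wrapper around GCC; it adds the MPI headers and libraries
mpicc -o my_mpi_app my_mpi_app.c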
Intel MPI
Intel MPI is available as part of the intel toolchains. Intel MPI does not integrate as well with the scheduler as OpenMPI does, which means that we need to provide an additional setting in the jobscript in order to launch our applications with srun:
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
srun ./my_mpi_app
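Putting this together, a complete jobscript for an Intel MPI application could look like the sketch below. The toolchain version intel/2023a is only an example (check module avail intel for the versions that are actually installed), and mpiicc is Intel MPI's wrapper for the Intel C compiler.

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2000

module purge
module load intel/2023a

# compile our source code; not required if this has been done before
mpiicc -o ./my_mpi_app my_mpi_app.c

# tell srun which PMI library to use for launching the Intel MPI tasks
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
srun ./my_mpi_app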
Note: mpirun can also be used for launching Intel MPI applications, but after a recent upgrade of the Slurm scheduler this sometimes leads to connection issues, in particular when using larger numbers of nodes:
[bstrap:0:-1@node15] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on node16 (pid 859263, exit code 49152)
[bstrap:0:-1@node15] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[bstrap:0:-1@node15] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[bstrap:0:-1@node15] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1061): error waiting for event
[bstrap:0:-1@node15] upstream_cb (../../../../../src/pm/i_hydra/libhydra/bstrap/src/hydra_bstrap_proxy.c:356): error setting up the bstrap proxies
[bstrap:0:-1@node15] HYDI_dmx_poll_wait_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:80): callback returned error status
[bstrap:0:-1@node15] main (../../../../../src/pm/i_hydra/libhydra/bstrap/src/hydra_bstrap_proxy.c:628): error sending proxy ID upstream
...
[mpiexec@node1] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:362): write error (Bad file descriptor)
It's unclear what is causing these issues, but using srun instead of mpirun should solve them.
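If you want to verify which PMI interfaces your srun supports (which is what the I_MPI_PMI_LIBRARY setting above hooks into), Slurm can list its MPI plugin types; this is a standard Slurm command, not something specific to Hábrók:

srun --mpi=list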