Differences

This shows you the differences between two versions of the page.

habrok:advanced_job_management:special_partitions [2025/08/22 06:33] – created fokke
habrok:advanced_job_management:special_partitions [2026/01/14 10:07] (current) – [GELIFES nodes] pedro
Line 11: Line 11:
 This sounds a bit complicated, but the main thing to take home is: This sounds a bit complicated, but the main thing to take home is:
  
-Submit to the dedicated partition to get to the group specific nodes.\\ +  *Submit to the dedicated partition to get to the group specific nodes.\\ 
-Usage (and priorities) of these nodes are handled separately from the usage and priorities of the rest of the cluster.+  *Usage (and priorities) of these nodes are handled separately from the usage and priorities of the rest of the cluster.
  
 ===== Account coordinator instructions ===== ===== Account coordinator instructions =====
  
-Users with the coordinator role in one of the special accounts can add users to the account using the following command:+Users with the coordinator role in one of the special accounts can add users to the account using the following command, which should be run on one of the login/interactive nodes of the cluster:
  
 <code> <code>
-sacctmgr add user USERNAME account=ACCOUNT fairshare=1+sacctmgr add user <username> account=<account> fairshare=1
 </code> </code>
-Where USERNAME should be changed into the userid that is to be added to the account ACCOUNT. ACCOUNT must be changed into the name of the account, e.g. gelifes, digitallab, caos. The fairshare should be by default set to 1.+Where <username> should be changed into the userid that is to be added to the account <account>. <account> must be changed into the name of the account, e.g. ''digitallab'', ''caos''. The fairshare should be by default set to 1.
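
When several users need to be added at once, it can help to review the commands before running them. The following is a minimal dry-run sketch, not part of the official instructions: the helper name ''print_add_user_cmds'' is hypothetical, and it only prints the ''sacctmgr add user'' command shown above once per user, with fairshare fixed at the default of 1. Nothing is sent to Slurm.

```shell
# Hypothetical helper (not part of Slurm or the cluster tooling):
# prints one "sacctmgr add user" command per user name, without running it.
# Usage: print_add_user_cmds ACCOUNT USER [USER ...]
print_add_user_cmds() {
    account=$1
    shift
    for user in "$@"; do
        # Same command as documented above, with fairshare left at the default of 1.
        printf 'sacctmgr add user %s account=%s fairshare=1\n' "$user" "$account"
    done
}
```

For example, ''print_add_user_cmds caos alice bob'' prints two commands (the account and user names here are placeholders). After checking the output, a coordinator could run each printed line on a login/interactive node.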
  
 In order to verify/check if a user has already been added to the account, the column “Account” in the output of the following command should show a row with “users” and one with the special account: In order to verify/check if a user has already been added to the account, the column “Account” in the output of the following command should show a row with “users” and one with the special account:
  
 <code> <code>
-sacctmgr show -s user USERNAME+sacctmgr show -s user <username>
 </code> </code>
 +
 +A full list of users in the account can be obtained using:
 +<code>
 +sacctmgr show -s account <account>
 +</code>
 +Note that CIT site administrators can also be members of these special accounts, as they need access for testing node performance and running software installations.
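
The membership check described above can also be scripted. The sketch below is an illustration under stated assumptions rather than an official procedure: the helper name ''has_account'' is hypothetical, and it assumes its input is a plain list of account names, one per line, such as the output that ''sacctmgr -n -P show assoc user=<username> format=account'' is expected to produce.

```shell
# Hypothetical helper: succeeds if the account name given as $1 appears
# as a complete line in the list of account names read from stdin.
# Assumption: stdin holds one account name per line and nothing else.
has_account() {
    # -x matches whole lines only, -F treats the name as a literal string,
    # -q suppresses output so only the exit status is used.
    grep -qxF -- "$1"
}
```

A possible use, with placeholder names, is ''sacctmgr -n -P show assoc user=alice format=account | has_account caos && echo "already a member"''. Matching whole lines avoids treating an account such as ''caos2'' as a match for ''caos''.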
  
 Coordinators can be added to the special accounts using: Coordinators can be added to the special accounts using:
  
 <code> <code>
-sacctmgr add coordinator name=USERNAME account=ACCOUNT+sacctmgr add coordinator name=<username> account=<account>
 </code> </code>
 In order to modify an existing user the following can be used: In order to modify an existing user the following can be used:
  
 <code> <code>
-sacctmgr modify user name=USERNAME account=ACCOUNT set fairshare=1+sacctmgr modify user name=<username> account=<account> set fairshare=10
 </code> </code>
 +This will modify the fairshare for the user.
  
- +A user can be removed from the account using:
-==== GELIFES nodes ==== +
- +
-The node themselves are 64 core AMD EPYC 7601 nodes, running at 2.2 GHz, with 512GB of memory. These should be suitable for most of the GELIFES workloads. There is also quite a lot (16 TB) of **temporary** local scratch space per node, available for jobs through the use $TMPDIR in the jobs scripts. +
- +
-==== Limits ==== +
- +
-The gelifes partition has two types of limits: one on the number of jobs per user, and one on the number of cores allocated to different job lengths. +
- +
-|**Job type**|**Time limit**    |**Maximum number of submitted jobs per user**|**Maximum number of cores**| +
-|short       |≤ 1 day           |1000                                         |960                        | +
-|medium      |>1 day, ≤ 3 days  |1500                                         |640                        | +
-|long        |>3 days, ≤ 10 days|2000                                         |320                        | +
- +
-Note that jobs from all users contribute to the maximum number of cores that can be allocated to these jobs. This prevents the partition from being filled with long jobs, which would lead to higher waiting times for jobs. If this limit is reached, waiting jobs will get a ''%%Reason: QOSGrpCpuLimit%%''+
- +
-==== Software ==== +
- +
-=== Modules === +
- +
-Since the instruction set of the AMD CPUs in the gelifes partition is compatible with that of the standard Intel based nodes, the software from these nodes is used on the gelifes partition. There is an issue with the intel compiler based software, however. Whereas the software based on the GNU compilers (foss toolchains) works fine, the software based on the intel toolchain does not work. This because the intel compiler introduces a CPU check in the code, which fails for the AMD nodes.\\ +
-Our advice is therefore to make use of the foss toolchains, as all relevant software should be available in the foss toolchains. +
- +
-===== GELIFES account ===== +
- +
-You can request access to the gelifes account, giving access to the GELIFES nodes by contacting Joke Bakker from GELIFES. +
- +
-===== GELIFES coordinator instructions ===== +
- +
-Users with the coordinator role in the gelifes account can add users to the account using the following command: +
 <code> <code>
-sacctmgr add user USERNAME account=gelifes fairshare=1+sacctmgr -i delete user <username> account=<account_name>
 </code> </code>
-Where USERNAME should be changed into the userid that is to be added to the account gelifes. The fairshare should be by default set to 1. 
  
-In order to verify/check if a user has already been added to the gelifes account, the column “Account” in the output of the following command should show a row with “users” and one with “gelifes”:+===== GELIFES nodes =====
  
-<code> +Until the beginning of January 2026, Hábrók included nodes originally purchased by GELIFES for the Peregrine cluster. These were 64 core AMD EPYC 7601 nodes, running at 2.2 GHz, with 512GB of memory. Because these nodes came from an earlier purchase, they were older than the other Hábrók compute nodes, and their support has ended. Consequently, they have been decommissioned and the ''gelifes'' partition is no longer available. 
-sacctmgr show -s user USERNAME +
-</code> +
-Coordinators can be added to the gelifes account using:+
  
-<code> +Please use the ''regular'' partition instead.
-sacctmgr add coordinator name=USERNAME account=gelifes +
-</code> +
-In order to modify an existing user the following can be used: +
- +
-<code> +
-sacctmgr modify user name=USERNAME account=gelifes set fairshare=1 +
- +
-</code>+