| habrok:connecting_to_the_system:login_nodes [2025/10/23 14:46] – Expand long process section pedro | habrok:connecting_to_the_system:login_nodes [2025/10/23 15:50] (current) – [Long process termination] pedro |
|---|---|
| ===== Login nodes ===== | ===== Login nodes ===== |
| |
| ''login1.hb.hpc.rug.nl'' and ''login2.hb.hpc.rug.nl'' are the default login nodes that are used by most users. You can use these to connect to the system, copy your files, submit jobs, compile your code, et cetera. You should not use it to test your applications, since this might slow down the system, which will hinder other users who are trying to log in. For this reason, long running intensive processes will be automatically killed, see section below.. It is also a smaller system. | ''login1.hb.hpc.rug.nl'' and ''login2.hb.hpc.rug.nl'' are the default login nodes used by most users. You can use them to connect to the system, copy your files, submit jobs, compile your code, et cetera. You should not use them to test your applications, since this might slow down the system and hinder other users who are trying to log in. The login nodes are also smaller systems. For this reason, long-running, intensive processes are automatically killed; see the section below. |
| |
| We have set up two of these login nodes to increase the availability of the service. | We have set up two of these login nodes to increase the availability of the service. |
| ===== Long process termination ===== | ===== Long process termination ===== |
| |
| Since 2025-10-24, we automatically kill misbehaving processes that have been running for too long and using too many resources on the login, interactive, and interactive GPU nodes. This is to prevent one or a few users from occupying resources that are only meant for short tests, which then prevents other users from executing legitimate tasks on these nodes. This, in addition to the periodic rebooting of these nodes, ensures that the resources are available in good order for all users. | Since 2025-10-24, we automatically kill misbehaving processes that have been running for too long and using too many resources on the login, interactive, and interactive GPU nodes. Certain processes that are expected to run for a long time (for example, ssh sessions) are exempt. This policy prevents one or a few users from occupying resources that are only meant for short tests, which would keep other users from executing legitimate tasks on these nodes. Together with the periodic rebooting of these nodes, it ensures that the resources remain available and in good order for all users. |
| |
| ===== Periodic reboots ===== | ===== Periodic reboots ===== |
| |
| In order to prevent the login/interactive nodes from being filled up with temporary files and long-running processes, these nodes are rebooted every other week on Monday morning at 6:00 CE(S)T. The odd-numbered nodes (''login1'', ''interactive1'', ''gpu1'') are rebooted in odd weeks, and the even-numbered nodes (''login2'', ''interactive2'', ''gpu2'') are rebooted in even weeks. | In order to prevent the login/interactive nodes from being filled up with temporary files and long-running processes, these nodes are rebooted every other week on Monday morning at 6:00 CE(S)T. The odd-numbered nodes (''login1'', ''interactive1'', ''gpu1'') are rebooted in odd weeks, and the even-numbered nodes (''login2'', ''interactive2'', ''gpu2'') are rebooted in even weeks. |
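
The odd/even rule above can be sketched in a few lines of Python, for example to check whether the node you plan to use is due for a reboot in a given week. This is only an illustration: the page does not say how "odd" and "even" weeks are counted, so the sketch assumes ISO 8601 week numbers. The function names ''node_number'' and ''reboots_in_week_of'' are hypothetical helpers, not part of any Hábrók tooling.

```python
from datetime import date


def node_number(node: str) -> int:
    """Return the trailing number of a node name such as 'login1' or 'gpu2'."""
    digits = ""
    for ch in reversed(node):
        if ch.isdigit():
            digits = ch + digits
        else:
            break
    if not digits:
        raise ValueError(f"node name has no trailing number: {node!r}")
    return int(digits)


def reboots_in_week_of(node: str, day: date) -> bool:
    """Return True if `node` is scheduled for a reboot in the week containing `day`.

    ASSUMPTION: "odd weeks" and "even weeks" refer to ISO 8601 week numbers.
    The source text does not state which week-numbering scheme is used.
    """
    iso_week = day.isocalendar()[1]
    # Odd-numbered nodes reboot in odd weeks, even-numbered nodes in even weeks.
    return iso_week % 2 == node_number(node) % 2


if __name__ == "__main__":
    monday = date(2025, 10, 27)  # a Monday in ISO week 44
    for name in ("login1", "login2", "interactive1", "gpu2"):
        print(name, reboots_in_week_of(name, monday))
```

Under the ISO-week assumption, a year with 53 weeks is followed by week 1 of the next year, so two odd weeks occur back to back. How the real schedule handles that case is not stated on this page, and the sketch does not try to settle it.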