Sharing data

We do not allow users to open up their private folders using file system permissions or access control lists. Managing these correctly can be complicated, and mistakes easily lead to security problems, such as users accidentally sharing data with all other cluster users.

If you need to share data on Hábrók with other users, there are two options: a group directory or a public directory.

A group directory is useful if you need to share data with a group of users, and the other users on the cluster must not have access to that data. In this case we can set up a group on the cluster for this limited set of users and give the group access to one or more shared folders.

These group directories are created on /scratch for data that needs to be processed and on /projects for data that needs to be stored safely.

For working with this data there are two models:

  1. The files in the shared folder are readable and writable for all group members. The caveat is that users and certain tools can override the default permission settings, making data unreadable or unwritable for the other group members. Note that most archiving tools by default apply the permissions that files and folders had in the source location. These permissions can be fixed by the person who wrote the data. An explanation of managing Linux file permissions can be found at: https://kb.iu.edu/d/abdb
  2. There is a data manager that manages the data in the shared folder, and this data manager is the only person with full write access. All other group members can only read the data.
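
As an illustration of fixing permissions, for example after unpacking an archive, the following Python sketch adds group read (and optionally write) access to everything below a directory. The function name add_group_access is hypothetical and not part of any Hábrók tooling. It only works on files you own, in line with the remark above that the person who wrote the data has to fix the permissions.

```python
import os
import stat
from pathlib import Path


def add_group_access(root, writable=True):
    """Add group read (and optionally write) permission below `root`.

    Returns the list of paths whose mode was changed.
    Raises PermissionError for files that are not owned by the caller.
    """
    root = Path(root)

    # Collect all paths first, so that changing modes does not affect the walk.
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(Path(dirpath) / name for name in dirnames + filenames)

    changed = []
    for path in paths:
        if path.is_symlink():
            continue  # leave symbolic links and their targets alone
        mode = stat.S_IMODE(path.stat().st_mode)
        new_mode = mode | stat.S_IRGRP
        if writable:
            new_mode |= stat.S_IWGRP
        # Directories need the execute bit to be entered;
        # files only get it when the owner already has it.
        if path.is_dir() or mode & stat.S_IXUSR:
            new_mode |= stat.S_IXGRP
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed.append(path)
    return changed
```

For the first model you would call add_group_access(path) on the affected folder. For the second model a data manager could use writable=False, so that group members can read but not modify the data. Existing group write bits are not removed by this sketch.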

If you want to request a group directory, please contact hpc@rug.nl and let us know the following things:

  1. The proposed name of the group. The name must not already be in use and should be convenient to type on the command line. The group name will always be prefixed by hb-.
  2. The amount of space needed on the file systems involved, if more than the default quota is required. Note that for /projects there is a fair use principle, where you have to pay for storage above a certain threshold. For /scratch a fair share policy is in place.
  3. Who the primary owner of the group is. This person has to approve the requests for joining the group.
  4. Who can act as an alternative contact person for the group and approve these requests.
  5. Do all users need full write access or is there a data manager?
  6. If there is a data manager, who will fulfill that role?
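
To answer the question about the amount of space, it can help to measure how large the data currently is. The sketch below sums the apparent file sizes below a directory. The function names are hypothetical, and the result may differ from what the file system itself accounts against a quota (for example because of block sizes), so treat it as a rough estimate only.

```python
import os


def directory_size_bytes(root):
    """Return the summed apparent size in bytes of all files below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symbolic links, so link targets
                # outside the directory are not counted.
                total += os.lstat(path).st_size
            except OSError:
                pass  # file vanished or is not accessible; skip it
    return total


def format_size(num_bytes):
    """Format a byte count using binary units, e.g. 1536 -> '1.5 KiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Calling format_size(directory_size_bytes(path)) on your current data folder gives a figure you can mention in your request.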

Sometimes you need to share non-sensitive, public data with someone else. For this we have set up the directory /scratch/public/tmp. The data in this directory can be read by all users on the cluster. Because we have allocated limited space to this directory, a cleanup script removes data after 30 days.
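
If you want to check which of your files in such a directory have passed the 30 day limit, a sketch like the following could be used. It assumes that the age of a file is judged by its modification time, which is our assumption for this example; the actual criterion used by the cleanup script is not described here. The function name files_older_than is hypothetical.

```python
import os
import time


def files_older_than(root, days=30, now=None):
    """Return a sorted list of files below `root` last modified more than `days` ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is not accessible; skip it
            if mtime < cutoff:
                old.append(path)
    return sorted(old)
```

Running files_older_than on your own subfolder of /scratch/public/tmp lists the files that are older than the limit under this assumption.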

If you need to share data for a longer period, please let us know. We can then create a persistent public directory in /scratch/public, either as a group directory or as a directory that you manage yourself. To request this, contact hpc@rug.nl and either answer the same questions as for a regular group directory or tell us that you will manage the data yourself.

Since /scratch is optimized for large files, storing software on /scratch is not recommended. For large or shared software installations an NFS-based share has been set up, which is available as /userapps. Because we assume that most software installations rely on downloads from external sites (for example, Python virtual environments), we do not make backups of /userapps. Please contact hpc@rug.nl if you need additional space on /userapps for your installations, or if you need to share your software stack with multiple users. In the latter case, please answer the questions listed above for a group directory.