Sharing data

We do not allow users to open up their private folders to others, whether through file system permissions or access control lists. Managing these correctly can be complicated, and mistakes easily lead to security problems, such as users accidentally sharing data with all other cluster users.

If you need to share data on Hábrók with other users, there are two options.

A group directory is useful if you need to share data with a group of users, and the other users on the cluster must not have access to that data. In this case we can set up a group on the cluster for these users and give the group access to one or more shared folders.

These group directories are created on /scratch for data that needs to be processed and on /projects for data that needs to be stored safely.

For working with this data there are two models:

  1. The files in the shared folder are readable and writable for all group members. Note, however, that users and certain tools can override the default permission settings, making data unreadable or unwritable for others.
  2. There is a data manager that manages the data in the shared folder, and this data manager is the only person with full write access. All other group members can only read the data.
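
The caveat in the first model can be checked for after the fact. The following is a minimal sketch, not an official tool: it walks a shared folder and reports entries whose permission bits do not grant the group read (and, optionally, write) access. The function name `find_group_inaccessible` is hypothetical, and the sketch only inspects the classic mode bits; it does not look at group ownership or access control lists.

```python
import os
import stat


def find_group_inaccessible(root, need_write=True):
    """Return paths under root whose mode bits deny the group access.

    Hypothetical helper: checks only the permission bits, not which
    group owns the entry and not any ACLs.
    """
    required = stat.S_IRGRP | (stat.S_IWGRP if need_write else 0)
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlink mode bits are not meaningful
            # Directories also need the execute bit to be entered.
            needed = required | (stat.S_IXGRP if stat.S_ISDIR(mode) else 0)
            if mode & needed != needed:
                problems.append(path)
    return sorted(problems)
```

For the data manager model, calling the function with `need_write=False` only checks that group members can read the data.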

If you want to request a group directory, please contact hpc@rug.nl and let us know the following things:

  1. The proposed name of the group. The name must not already be in use and should be convenient to type on the command line. The group name will always be prefixed by hb-.
  2. The amount of space needed, if you require more than the default 250 GB. Note that /projects is subject to a fair use principle, under which you have to pay for storage above a certain threshold. For /scratch a fair share policy is in place.
  3. Who the primary owner of the group is. This person has to approve the requests for joining the group.
  4. Who can act as an alternate contact person for the group to approve these requests.
  5. Whether all users need full write access or whether there is a data manager.
  6. If there is a data manager, who will fulfill that role.
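
To answer the question about space, it can help to measure the data you intend to share. The sketch below sums the apparent file sizes under a directory and compares the total with the default of 250 GB. The function names are hypothetical, and treating 1 GB as 10^9 bytes is an assumption; the cluster's own accounting may use a different unit or count disk blocks instead of file sizes.

```python
import os

DEFAULT_QUOTA_GB = 250  # default group directory size named in the text


def directory_size_bytes(root):
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # do not follow symlinks
                total += os.path.getsize(path)
    return total


def exceeds_default(size_bytes, quota_gb=DEFAULT_QUOTA_GB):
    """True if size_bytes is larger than quota_gb (assuming 1 GB = 10**9 bytes)."""
    return size_bytes > quota_gb * 10**9
```

If `exceeds_default(directory_size_bytes(path))` is true for your data, mention the required amount in your request.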

Sometimes you need to share non-sensitive, public data with someone else. For this we have set up a directory /scratch/public/tmp. The data in this directory can be read by all users on the cluster.
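
As an illustration, the sketch below copies a file into a target directory and sets the copy's mode so that everyone can read it. This assumes that the permissions of the copied file matter, for instance when the source file was private; the directory may already take care of this, in which case the explicit `chmod` is harmless. The function name `publish_copy` is hypothetical.

```python
import os
import shutil
import stat


def publish_copy(src, dest_dir):
    """Copy src into dest_dir and make the copy world-readable.

    Hypothetical helper; returns the path of the new file.
    """
    dest = shutil.copy(src, dest_dir)  # copies data and permission bits
    # Owner may read and write, group and others may only read (0o644).
    os.chmod(dest, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest
```

A call such as `publish_copy("results.csv", "/scratch/public/tmp")` would place a readable copy in the public directory, where `results.csv` is only an example file name.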

Since we have allocated limited space to this directory, a cleanup script removes data after 30 days. Please let us know if you need to share data for a longer period. We can then create a persistent group directory in /scratch/public, or move the data to a more permanent public location. You can request this at hpc@rug.nl; you will need to answer the same questions as for a regular group directory.
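
To see which of your shared files are approaching the 30-day limit, something like the following sketch can be used. It lists files whose modification time is older than a given number of days. Note that this is an assumption: the text does not say whether the cleanup script looks at modification time, access time or something else, so treat the result as an estimate. The function name `files_older_than` is hypothetical.

```python
import os
import time

RETENTION_DAYS = 30  # retention period named in the text


def files_older_than(root, days=RETENTION_DAYS, now=None):
    """Return files under root last modified more than `days` days ago.

    Assumption: age is measured by modification time (mtime).
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_mtime < cutoff:
                old.append(path)
    return sorted(old)
```

Calling it with a smaller value, for example `days=25`, gives an early warning for files that are close to the limit.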

Since /scratch is optimized for large files, storing software on /scratch is not recommended. For large or shared software installations an NFS-based share has been set up, which is available as /userapps. Since we assume that most software installations use downloads from external sites (for example Python virtual environments or Anaconda installations), we do not make a backup of /userapps. Please contact hpc@rug.nl if you need additional space for your installations or if you need to share your software stack with multiple users. In the latter case, please answer the applicable questions for a group directory listed above.