Differences

This shows you the differences between two versions of the page.

habrok:introduction:workflow [2022/12/14 13:18] – [External storage areas] fokke
habrok:introduction:workflow [2023/03/22 13:03] (current) fokke
Line 4: Line 4:
  
 In this section we will describe the basic workflow for working on the cluster. This workflow consists of five steps: In this section we will describe the basic workflow for working on the cluster. This workflow consists of five steps:
-  - Copy input data to the system+  - Copy input data to the system-wide storage area
   - Prepare the job script:   - Prepare the job script:
       - Define requirements        - Define requirements 
Line 22: Line 22:
  
 In this section we will focus on the data storage, and the next sections will delve deeper into the other topics, including the command-line interface, which is implied in some of the steps.  In this section we will focus on the data storage, and the next sections will delve deeper into the other topics, including the command-line interface, which is implied in some of the steps. 
 +
  
 ===== Data ===== ===== Data =====
  
-For most applications users need to work with data. Data can be parameters for a program that needs to be run, for example to set up a simulation. It can be input data that needs to be analyzed. And finally running simulations or data analysis will result in data containing the results of the computations.+For most applications users need to work with data. Data can be parameters for a program that needs to be run, for example to set up a simulation. It can be input data that needs to be analyzed. And finally running simulations or data analyses will result in data containing the results of the computations.
  
 Hábrók has its own storage system, which is decoupled from the desktop storage systems the university has. Although it would be nice to be able to access data from your desktop system directly on Hábrók, currently this is not possible. Technically this would be challenging, and there would also be performance issues, when people start to do more heavy processing on the desktop storage systems. Hábrók has its own storage system, which is decoupled from the desktop storage systems the university has. Although it would be nice to be able to access data from your desktop system directly on Hábrók, currently this is not possible. Technically this would be challenging, and there would also be performance issues, when people start to do more heavy processing on the desktop storage systems.
Line 35: Line 36:
  
 Hábrók currently has three storage areas, with different capabilities. On each storage area limits are enforced with respect to the amount of data stored, to ensure that each user has some space, and that the file systems will not suddenly be completely full. Hábrók currently has three storage areas, with different capabilities. On each storage area limits are enforced with respect to the amount of data stored, to ensure that each user has some space, and that the file systems will not suddenly be completely full.
 +
 +This page gives a short description; more details can be found at [[habrok:data_management:storage_areas|Storage areas]].
  
 ==== home ==== ==== home ====
  
-The home area is where users can store settings, programs and small data sets. This area is limited in space to 50 GB per user and a daily tape backup of the data is being made+The home area is where users can store settings, programs and small data sets. 
  
 ==== scratch ==== ==== scratch ====
  
-For larger data sets each user has access to a space on the scratch file system. By default 250 GB per user is available, which can be increased to a larger amount if required. Because of the size of the data no backups are being made of this data.  +For larger data sets each user has access to a space on the scratch file system. \\ 
-**This area is only meant for data ready for or recent results from processing. It is not meant for long term storage. THERE IS NO BACKUP!! **+**This area is only meant for data ready for processing, or recent results from processing. It is not meant for long term storage. THERE IS NO BACKUP!! **
  
 ==== local disks ==== ==== local disks ====
  
-The Peregrine nodes also have local disks that can only be used by calculations running on that specific machine. This implies that this also is temporary space.+The Hábrók nodes also have local disks that can only be used by calculations running on that specific machine. This implies that this also is temporary space. These local disks are based on solid state (SSD) storage. This means that they are for most use cases much faster than the scratch area.  
 + 
 +We therefore advise users to copy their input data sets from the scratch area to the local disk at the beginning of the job. The job can then run using the local disk for reading and writing data. At the end of the job the output has to be written back to the scratch area.   
 +This step is especially important if your input data is read multiple times or consists of many (>1000) files. Similar guidelines are applicable to output data.
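
The staging pattern described above (copy input from scratch to the local disk, compute on the local disk, write results back to scratch) can be sketched as follows. This is a minimal illustration only: the function name ''run_with_local_staging'' and all of its parameters are hypothetical, and because this page does not state where the local disk or the scratch area are mounted, those locations are passed in as arguments rather than assumed.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_local_staging(
    scratch_input: Path,
    scratch_output: Path,
    local_root: Path,
    compute: Callable[[Path, Path], None],
) -> None:
    """Stage input on the local disk, compute there, copy results back.

    All names here are illustrative assumptions, not part of any Hábrók tool.
    """
    # Private working directory on the node-local disk; it is removed
    # automatically when the with-block exits, even if compute() fails.
    with tempfile.TemporaryDirectory(dir=local_root) as workdir:
        local_input = Path(workdir) / "input"
        local_output = Path(workdir) / "output"

        # 1. Copy the input data set from scratch to the local disk.
        shutil.copytree(scratch_input, local_input)
        local_output.mkdir()

        # 2. Run the computation, reading and writing only on the local disk.
        compute(local_input, local_output)

        # 3. Write the output back to scratch before the local copy is removed.
        shutil.copytree(local_output, scratch_output, dirs_exist_ok=True)
```

Here ''compute'' stands for whatever the job actually does; it receives the local input and output directories, so repeated reads and many small files hit the local disk instead of the scratch area. Note that the output is only copied back if the computation finishes without raising an error.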
  
  
 ==== External storage areas ==== ==== External storage areas ====
  
-Besided the storage directly available on all nodes of the cluster some external storage areas can be accessed from the login nodes. These areas are described below. +Besides the storage directly available on all nodes of the cluster some external storage areas can be accessed from the login nodes. These areas are described below.
-==== data handling ====+
  
-On the login node the "data handling" file systems are mounted. Users can get access to storage on data handling based on a fair use model. The storage is allocated on request. +==== Project storage ====
  
-**WiP**+On the login nodes the ''/projects'' file system is mounted. Users can get access to storage in this area based on a fair use model. Additional storage is allocated on request, and above a certain threshold costs are involved.
  
 ==== RDMS ==== ==== RDMS ====
  
-**WiP**+The Research Data Management system can be used from the Hábrók nodes. You can find more information about the RDMS on its dedicated wiki pages: 
 +https://wiki.hpc.rug.nl/rdms/start
  
  
 ---- ----
-**Next section: [[peregrine:connecting_to_the_system:connecting]]**+**Next section: [[habrok:connecting_to_the_system:connecting]]**