In this section we will describe the basic workflow for working on the cluster. This workflow consists of five steps:
  - Copy input data to the system
  - Prepare the job script:
    - Define requirements
In this section we will focus on the data storage, and the next sections will delve deeper into the other topics, including the command-line interface, which is implied in some of the steps.
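The steps above usually come together in a single batch script submitted to the scheduler (Hábrók uses SLURM; the module name, program name and data paths below are illustrative assumptions, not actual Hábrók settings). A minimal sketch, written to a file so you can inspect it:

```shell
# Write a minimal example job script; on the cluster you would submit it
# with: sbatch jobscript.sh
# NOTE: module name, program name and paths are hypothetical examples.
cat > jobscript.sh << 'EOF'
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --time=01:00:00        # requested wall time (step: define requirements)
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

module load Python/3.11        # hypothetical module name

cp -r /scratch/$USER/input "$TMPDIR"/    # stage input onto the fast local disk
cd "$TMPDIR"
python analyze.py input/ -o output/      # hypothetical program
cp -r output /scratch/$USER/results/     # copy results back to scratch
EOF
echo "wrote jobscript.sh"
```

The `#SBATCH` lines encode the resource requirements; the body performs the stage-in, run, and stage-out steps described on this page.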
===== Data =====
For most applications users need to work with data. Data can be parameters for a program that needs to be run, for example to set up a simulation. It can be input data that needs to be analyzed. And, finally, running simulations or data analyses will produce output data that needs to be stored somewhere.
Hábrók has its own storage system, which is decoupled from the desktop storage systems of the university. Although it would be nice to be able to access data from your desktop system directly on Hábrók, this is currently not possible, as it would be technically challenging.
Hábrók currently has three storage areas, with different capabilities. On each storage area limits are enforced with respect to the amount of data stored, to ensure that each user has some space, and that the file systems will not suddenly be completely full.

On this page we will give a short description of each of these storage areas.
==== home ====
The home area is where users can store settings, programs and small data sets. This area is limited in space to 50 GB per user, and a daily tape backup of the data is made.
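Since the home area is limited to 50 GB, it is useful to check now and then how much space your files occupy. The generic ''du'' tool works anywhere (this is a sketch on a throwaway directory, not a Hábrók-specific quota command; on the cluster you would point it at your home directory):

```shell
# Create an example directory containing a 1 MiB file
dir=$(mktemp -d)
dd if=/dev/urandom of="$dir/sample.bin" bs=1024 count=1024 2>/dev/null

# -s: summarise, -k: report usage in KiB.
# On the cluster you would run: du -sh "$HOME"
du -sk "$dir"
```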
==== scratch ====
For larger data sets each user has access to a space on the scratch file system.

**This area is only meant for data ready for processing, or recent results from processing. It is not meant for long term storage. THERE IS NO BACKUP!**
==== local disks ====
The Hábrók nodes also have local disks. These local disks are based on solid state (SSD) storage. This means that they are for most use cases much faster than the scratch area.
We therefore advise users to copy their input data sets from the scratch area to the local disk at the beginning of the job. The job can then run using the local disk for reading and writing data. At the end of the job the output has to be written back to the scratch area.
This step is especially important if your input data is read multiple times or consists of many (>1000) files. Similar guidelines are applicable to output data.
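For many-small-file inputs this staging pattern works best when the files are bundled: transferring one archive to the local disk and unpacking it there is much cheaper than reading thousands of individual files from scratch. A sketch with stand-in directories (inside a real job the local disk is typically reached via ''$TMPDIR''; treat that path as an assumption and check the cluster documentation):

```shell
# Stand-ins for the shared scratch area and the node-local SSD
scratch=$(mktemp -d)
localdisk=$(mktemp -d)

# An input set of many small files, packed into ONE archive on scratch
mkdir "$scratch/input"
for i in $(seq 1 1000); do echo "$i" > "$scratch/input/file_$i.txt"; done
tar -C "$scratch" -cf "$scratch/input.tar" input

# Stage in: copy a single file to the local disk and unpack it there
cp "$scratch/input.tar" "$localdisk/"
tar -C "$localdisk" -xf "$localdisk/input.tar"

# ... the job would now read and write under the local disk ...

# Stage out: pack the results and copy one archive back to scratch
tar -C "$localdisk" -cf "$scratch/results.tar" input
```

The shared file system handles one large sequential transfer far better than thousands of small metadata operations, which is why both stage-in and stage-out move a single archive.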
==== External storage areas ====
Besides the storage areas on Hábrók itself, a couple of external storage areas can be reached from the cluster. These are described below.
==== Project ====
On the login nodes the ''/…'' project storage area is available.
==== RDMS ====
The Research Data Management System can be used from the Hábrók nodes. You can find more information about the RDMS on its dedicated wiki pages.
----
**Next section: [[habrok:…]]**