habrok:data_management:storage_areas, revision 2025/10/24 10:32 (current), by pedro
There is also a limit on the number of files that can be stored. This is to reduce the load on the file system metadata server, which keeps track of the data about files (time of access, change, size, location, etc.). Handling a huge number of files is a challenge for most shared file systems, and accessing a huge number of files will lead to performance bottlenecks.
The best way of handling data sets with many (> 10,000) files is to not store them on ''/scratch'' as is, but as (compressed) archive files. These files can then be extracted to the fast local storage on the compute nodes at the beginning of a job. You can find more details and examples in our dedicated [[habrok:…]] page.
When the processing is performed on the fast local storage, the job performance will be much better.
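The workflow above can be sketched as follows. This is a minimal illustration, not the official Habrok job template: the ''dataset'' directory name is made up, and the assumption is that the node-local storage is reachable via ''$TMPDIR'' (check the cluster documentation for the actual path on your system).

```shell
# One-time preparation (e.g. on a login node): pack the many-file data set
# into a single compressed archive so /scratch only sees one file.
mkdir -p dataset && touch dataset/file_{1..100}.txt   # stand-in for a real data set
tar czf dataset.tar.gz dataset

# Inside a job script: extract to fast node-local storage before processing.
# $TMPDIR is an assumption for the node-local directory; /tmp is the fallback.
LOCAL="${TMPDIR:-/tmp}/job_data"
mkdir -p "$LOCAL"
tar xzf dataset.tar.gz -C "$LOCAL"
echo "$(find "$LOCAL/dataset" -type f | wc -l) files extracted"
```

After extraction, point your program at the files under ''$LOCAL'' instead of ''/scratch''; the metadata-heavy access pattern then hits the local disk rather than the shared file system.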
===== RDMS =====
The **Research Data Management System (RDMS)** …

Additionally, …