Differences

rdms:best [2024/03/25 13:31]
jelte [Bundling of Data Sets] Some more text adjusments
rdms:best [2024/03/28 15:22] (current)
burcu [Best Practices]
Line 1: Line 1:
 {{indexmenu_n>8}} {{indexmenu_n>8}}
 ====== Best Practices ====== ====== Best Practices ======
-This section contains a selection of best practices for the usage of the RDMS. Following these best practices will help to get the most optimal RDMS user experience. +This section presents a selection of best practices for using the RDMS. Adhering to these best practices will help ensure the best user experience with the RDMS.
  
-The section will be gradually extended with new usage examples and tips.+The section will be gradually updated with new usage examples and tips.
  
-If you feel that an important information should be added to this section, feel free to contact [[rdms-support@rug.nl|RDMS support]] with your request!+If you believe that important information should be added to this section, please contact [[rdms-support@rug.nl|RDMS support]] with your request!
  
-===== Data Naming =====+===== Naming Folders/Files =====
  
 For the optimal usage of the RDMS, it is highly recommended to follow these best practices for naming your files/folders: For the optimal usage of the RDMS, it is highly recommended to follow these best practices for naming your files/folders:
Line 16: Line 16:
   * Prefer usage of underscore (''_'') or hyphen (''-'') instead of white spaces in file/folder names.    * Prefer usage of underscore (''_'') or hyphen (''-'') instead of white spaces in file/folder names. 
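
The white-space rule above can be applied before upload with a small helper. This is only a minimal sketch: the function name ''safe_name'' is hypothetical and not part of the RDMS or iCommands, and it handles white space only, not the other naming rules.

```shell
# Hypothetical helper (not an RDMS/iCommands tool): replace every run of
# white space in a file or folder name with a single underscore.
safe_name() {
    printf '%s\n' "$1" | sed 's/[[:space:]][[:space:]]*/_/g'
}

# Usage: rename a local folder before uploading it
#   mv "Project with XXX and YYY" "$(safe_name "Project with XXX and YYY")"
```

Renaming locally before the upload avoids having to quote paths in later CLI commands.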
  
-Example of a folder structure with good naming:+Example of a folder structure with correct naming:
 <code> <code>
 $ itree project_name $ itree project_name
Line 37: Line 37:
 </code> </code>
  
-Example of a folder structure with wrong/difficult naming:+Example of a folder structure with incorrect naming:
 <code> <code>
 $ itree "Project with XXX and YYY" $ itree "Project with XXX and YYY"
Line 60: Line 60:
 ===== Bundling of Data Sets ===== ===== Bundling of Data Sets =====
  
-To improve the performance of the RDMS, it is recommended to store data sets which contain a lot of single, smaller files in a structured format like ''*.tar'', ''*.tar.gz'', ''*.tar.bz'', or ''*.zip''. +To improve the performance of the RDMS, it is recommended to store data sets containing numerous small files in a structured format like ''*.tar'', ''*.tar.gz'', ''*.tar.bz'', or ''*.zip''. This significantly improves transfer rates, as the system only engages in multi-threaded transfers once a minimum file size threshold (32 MB) is reached. Transferring multiple smaller files furthermore results in considerable overhead, diminishing performance. 
-These has the advantage that it improves the transfer rates for up-/ and download significantly which results from the fact, that the system only goes into multi-threaded transfers after a certain threshold of minimal file size (32 MB) is reached. The transfer of multiple, smaller files furthermore results in big overhead which reduces the performance. +
  
-The best practice to handle such cases is therefore:+Best practices to handle such cases are:
  
   - First, collect all data locally.   - First, collect all data locally.
-  - Before archival in the RDMS, bundle the data set or subsets of it into a structured data format (see above for formats).+  - Before archiving in the RDMS, bundle the data set or its subsets into a structured data format (as mentioned above).
   - Upload the bundled format to the RDMS.   - Upload the bundled format to the RDMS.
-  - (Add metadata if desired.) +  - (Optional) Add metadata if desired. 
-  - (Unbundle again in the RDMS.)+  - (Optional) Unbundle the data on the RDMS.
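
The bundling and upload steps above can be sketched as follows. This is a minimal example under stated assumptions: the function name ''bundle_dataset'', the folder ''my_dataset'' and the target path are placeholders, and plain ''tar'' is used as one of the formats listed above.

```shell
# Hypothetical helper: bundle a local data set folder into one tar file
# and print the name of the resulting bundle.
bundle_dataset() {
    dataset_dir="$1"               # local folder with many small files
    bundle="${dataset_dir%/}.tar"  # bundle is created next to the folder
    # -C stores only the folder name in the bundle, not the full local path
    tar -cf "$bundle" -C "$(dirname "$dataset_dir")" "$(basename "$dataset_dir")" || return 1
    # Sanity check: the bundle must be listable before it is uploaded
    tar -tf "$bundle" > /dev/null || return 1
    echo "$bundle"
}

# Usage (both paths are placeholders):
#   bundle=$(bundle_dataset ./my_dataset)
#   iput "$bundle" /rug/home/path/to/folder/
```

Uploading one bundle instead of many small files gives the transfer a chance to exceed the 32 MB threshold mentioned above.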
  
-The last two steps, adding additional metadata as well as unbundling again on the RDMS side, are not mandatory. +For extraction on the RDMS, CLI users can use the ''ibun -x'' command as also described in the [[https://wiki.hpc.rug.nl/rdms/access/linux/createprofile#icommands_for_metda_data_management|iCommands for (Meta)data Management]] section of this wiki.
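
As a minimal sketch of the CLI extraction, assuming the bundle has already been uploaded: the wrapper name ''extract_bundle'' and both paths are placeholders, and the argument order (bundle first, target folder second) follows the usual ''ibun -x'' usage.

```shell
# Hypothetical wrapper around ibun: extract an uploaded bundle into a
# target folder on the RDMS. Both arguments are RDMS paths, not local ones.
extract_bundle() {
    bundle_path="$1"   # e.g. /rug/home/path/to/folder/my_dataset.tar
    target_path="$2"   # e.g. /rug/home/path/to/folder/my_dataset
    ibun -x "$bundle_path" "$target_path"
}

# Usage (placeholder paths):
#   extract_bundle /rug/home/path/to/folder/my_dataset.tar /rug/home/path/to/folder/my_dataset
```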
  
-For extraction on the RDMS side, CLI users can use the ''ibun -x'' command as also described in the [[https://wiki.hpc.rug.nl/rdms/access/linux/createprofile#icommands_for_metda_data_management|iCommands for (Meta)data Management]] section of this wiki. +For RDMS web interface users, the "Uncompress tar" function, accessible via right-click on a ''*.tar'' file, enables extraction. Currently, this function supports only ''*.tar'' formats.
  
-Users of the RDMS web interface, can use the "Uncompress tar" function after right-click on a ''*.tar'' file to extract it. Currently, this just works for ''*.tar'' formats. +===== Locked Files (HIERARCHY_ERROR) =====
  
-===== Detecting Corrupt Files ===== +In rare cases, data may arrive in an incomplete form in the RDMS. This usually happens if a data transfer is abruptly interrupted, for example due to connection problems, without proper finalization. 
  
-In rare cases, it can happen that data arrives in a non-finalized form in the RDMS. These usually happens if a data transfer suddenly drops out, for example due to connection problems, while the system did not finalize it properly. +Restarting the data transfer may solve this issue. However, it is possible that the already transferred data remains in a locked state, causing problems when the transfer is restarted as those files cannot be overwritten directly.
  
-Restarting the data transfer can solve this issue, but it can also happen that the already transferred data is kept in a locked state which results in problems when the transfer is restarted as those files cannot be overwritten directly. +If you experience these issues, it is recommended to contact [[rdms-support@rug.nl|RDMS-Support]].
  
-If you experience these issues, it is normally recommended to contact RDMS-Support. +Users of the command-line tool [[rdms:access:linux:icommands|iCommands]] can furthermore detect such locked files directly using an appropriate CLI command.
  
-Users of the command-line tool `iCommands` have furthermore the possibility to detect such locked files directly via an appropriate CLI command. +In general, these issues manifest in ''HIERARCHY_ERRORs'' when a data transfer to the RDMS (e.g. via ''iput'' or ''irsync'') is attempted via CLI. 
  
-In general, these issues manifest in ''HIERARCHY_ERRORs'' when a data transfer to the RDMS (e.g. via ''iput'' or ''irsync'') is tried via CLI.  +To check all files at an RDMS location ''/rug/home/path/to/folder'' including all its subfolders, and to detect just those files that are marked as locked, the following command can be executed:
- +
-To check all files at a RDMS location ''/rug/home/path/to/collection'' including all its subfolders, and to detect just those files that are marked as locked, the following command can be executed:+
  
 <code> <code>
Line 101: Line 98:
 </code> </code>
  
-While the locked files cannot be directly removed, they can still be moved first to another location in the your home/team location, for example a separate folder for locked files. Afterwards, the data transfer can be restarted. +==== Removal of Locked Files ==== 
 + 
 +While the locked files cannot be directly removed, they can first be moved to another place in your home/team location, for example a separate folder for locked files. Afterwards, the data transfer can be restarted.  
 + 
 +Best practices to handle locked files and resolve the ''HIERARCHY_ERROR'' are: 
 + 
 +  - Create a new folder in your home or team drive to contain all locked files. 
 +  - Use the ''iquest'' command to identify locked files and move them to the newly created location. CLI users can utilize ''imv'' for this purpose. 
 +  - Restart the data transfer. The ''HIERARCHY_ERROR'' should be resolved. 
 +  - If you accumulated multiple locked files in your folder which you cannot delete, please contact [[rdms-support@rug.nl|RDMS-Support]] and we will help you remove these.  
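
The first two steps above can be sketched for CLI users as follows. This is only an illustration under stated assumptions: the function name ''move_locked_files'' and the folder ''/rug/home/path/to/locked_files'' are placeholders, and the list of locked files (one RDMS path per line) is expected to come from the ''iquest'' command shown earlier in this section.

```shell
# Hypothetical helper: read RDMS paths of locked files from standard input
# (one per line) and move each of them into a separate folder.
move_locked_files() {
    locked_dir="$1"                     # placeholder target folder
    imkdir -p "$locked_dir" || return 1 # create the folder if it is missing
    while IFS= read -r path; do
        [ -n "$path" ] || continue      # skip empty lines
        imv "$path" "$locked_dir/$(basename "$path")" \
            || echo "could not move: $path" >&2
    done
}

# Usage (placeholder file and folder names):
#   move_locked_files /rug/home/path/to/locked_files < locked_files.txt
```

Note that files with the same name from different subfolders would collide in the target folder; in that case, move them one by one with distinct target names.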
 + 
 +**Note:** It is recommended not to contact RDMS support for every locked file, but instead first try to resolve it as described above. However, if numerous locked files are detected, you can directly contact RDMS support.