Best Practices

This section presents a selection of best practices for using the RDMS. Following them will help you get the best user experience with the RDMS.

The section will be gradually updated with new usage examples and tips.

If you believe that important information should be added to this section, please feel free to contact RDMS support with your request!

For the optimal usage of the RDMS, it is highly recommended to follow these best practices for naming your files/folders:

  • Do not use special characters (e.g. $%^*&#=!) in your file/folder names.
  • Do not use periods (.) in your file/folder names.
  • Do not use quote symbols (' or ") in your file/folder names.
  • Prefer underscores (_) or hyphens (-) over white spaces in file/folder names.

Example of a folder structure with correct naming:

$ itree project_name
project_name
  analytical_data
    machine_01
      20231223_analysis.ext
      20240111_analysis.ext
      20240325_analysis.ext
    machine_02
      20230222_analysis.ext
      20230710_analysis.ext
      20240109_analysis.ext
    machine_03
      20231020_analysis.ext
      20231120_analysis.ext
      20231212_analysis.ext
  manuscripts
    publication_v01.odt

Example of a folder structure with incorrect naming:

$ itree "Project with XXX and YYY"
Project with XXX and YYY
  analytical_data
    analytical devices @ building 1
      experiment 100% scan rate.ext
      experiment 74% scan rate.ext
      experiment 80% scan rate.ext
    analytical devices @ building 2
      Experiment 01 by user.name@rug.nl.ext
      Experiment 02 by user.name@rug.nl.ext
      Experiment 03 by user.name@rug.nl.ext
    analytical devices @ building 3
      $1-100%.ext
      Versuch #1.ext
      Versuch_Öldiffusion_erste_Möglichkeit.ext
  manuscripts
    publication.final version.odt
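
The naming rules above can be checked locally before data is uploaded. The following is a minimal sketch, not an official RDMS tool: the function names `is_valid_name` and `find_invalid_names` are hypothetical, and the pattern rests on two assumptions taken from the examples rather than from the rules themselves, namely that a single extension period (plus an optional `.tar` in front of it, as in `.tar.gz`) is acceptable and that non-ASCII letters such as umlauts should be flagged.

```shell
#!/usr/bin/env bash
# Sketch: flag local file/folder names that break the naming rules above.
# Assumption: "stem.ext" and "stem.tar.ext" are fine, any other period is not.

is_valid_name() {
    local LC_ALL=C   # byte-wise matching, so non-ASCII letters are rejected
    local re='^[A-Za-z0-9_-]+(\.tar)?(\.[A-Za-z0-9]+)?$'
    [[ $1 =~ $re ]]
}

# Print every path below the given directory whose last component is invalid.
find_invalid_names() {
    local root=$1 path
    while IFS= read -r -d '' path; do
        is_valid_name "${path##*/}" || printf '%s\n' "$path"
    done < <(find "$root" -mindepth 1 -print0)
}
```

Calling `find_invalid_names ./project_name` prints one offending path per line and prints nothing if all names conform.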

To improve the performance of the RDMS, it is recommended to store data sets that contain many small, individual files in a structured format like *.tar, *.tar.gz, *.tar.bz, or *.zip. This significantly improves the transfer rates for upload and download, because the system only switches to multi-threaded transfers once a file reaches a minimum size threshold (32 MB). Transferring many small files also causes considerable overhead, which further reduces performance.
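
To judge whether a local data set falls into this category, one could count how many of its files are smaller than the threshold. This is a rough sketch under stated assumptions: the function name `count_small_files` is hypothetical, and 32 MB is interpreted as 32 MiB (33554432 bytes), which the text above does not specify.

```shell
#!/usr/bin/env bash
# Sketch: count regular files below the multi-threading threshold.
# Assumption: the 32 MB threshold equals 33554432 bytes (32 MiB).

count_small_files() {
    local dir=$1 threshold_bytes=${2:-33554432}
    # "-size -Nc" matches files strictly smaller than N bytes.
    # Note: file names containing newlines would be counted more than once.
    find "$dir" -type f -size "-${threshold_bytes}c" | wc -l | tr -d ' '
}
```

For example, `count_small_files ./analytical_data` prints the number of files that would not benefit from multi-threaded transfer. A high count suggests that bundling is worthwhile.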

The best practice to handle such cases is therefore:

  1. First, collect all data locally.
  2. Before archival in the RDMS, bundle the data set or subsets of it into a structured data format (see above for formats).
  3. Upload the bundled format to the RDMS.
  4. (Add metadata if desired.)
  5. (Unbundle again in the RDMS.)

The last two steps, adding additional metadata as well as unbundling again on the RDMS side, are not mandatory.

For extraction on the RDMS side, CLI users can use the ibun -x command as also described in the iCommands for (Meta)data Management section of this wiki.
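
On the command line, the five steps could be sketched as follows. This is an illustration under assumptions, not a prescribed procedure: the helper names `run` and `bundle_and_upload` and the metadata attribute `bundled_from` are hypothetical, and the `iput -Dtar` data type registration follows the example in the standard `ibun` help text, which may behave differently on the RDMS. With `DRY_RUN=1` (the default in this sketch) the iCommands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch of the bundle -> upload -> (metadata) -> (extract) workflow.
# DRY_RUN=1 prints the iCommands instead of running them; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [[ $DRY_RUN == 1 ]]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

bundle_and_upload() {
    local src_dir=${1%/} target_coll=$2
    local name=${src_dir##*/}
    local archive="${name}.tar"

    # Step 2: bundle locally into the current directory (always executed).
    tar -cf "$archive" -C "$src_dir" .

    # Step 3: upload the bundle; -Dtar marks it as a tar file for ibun.
    run iput -Dtar "$archive" "$target_coll/$archive"

    # Step 4 (optional): attach metadata; the attribute name is hypothetical.
    run imeta add -d "$target_coll/$archive" bundled_from "$name"

    # Step 5 (optional): unbundle again on the RDMS side.
    run ibun -x "$target_coll/$archive" "$target_coll/$name"
}
```

A call such as `bundle_and_upload ./machine_01 /rug/home/path/to/folder` creates `machine_01.tar` in the current directory and prints the three iCommands that would follow. The last two calls can be removed when metadata or extraction is not needed.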

Users of the RDMS web interface can extract a *.tar file by right-clicking on it and selecting the “Uncompress tar” function. Currently, this works only for *.tar formats.

In rare cases, data can arrive in the RDMS in a non-finalized form. This usually happens if a data transfer is suddenly interrupted, for example due to connection problems, before the system could finalize it properly.

Restarting the data transfer can solve this issue. However, the already transferred data may be kept in a locked state, which causes problems when the transfer is restarted because those files cannot be overwritten directly.

If you experience these issues, it is normally recommended to contact RDMS-Support.

Users of the iCommands command-line tools can additionally detect such locked files themselves with a suitable CLI command.

In general, these issues manifest as a HIERARCHY_ERROR when a data transfer to the RDMS (e.g. via iput or irsync) is attempted via the CLI.

To check all files at an RDMS location /rug/home/path/to/folder, including all its subfolders, and to list only those files that are marked as locked, run the following command:

 $ iquest "status: %s, name: %s/%s" "SELECT DATA_REPL_STATUS, COLL_NAME, DATA_NAME WHERE COLL_NAME LIKE '/rug/home/path/to/folder%' AND DATA_REPL_STATUS > '1'"

This command checks the specified location for files which have a replica status of 2 (“read-locked”) or 3 (“write-locked”), and prints each match in the format:

status: <2/3>, name: <path_to_folder>/<name_of_file>

Although locked files cannot be removed directly, they can still be moved to another place in your home or team location, for example a separate folder for locked files. Afterwards, the data transfer can be restarted.

So the best practice for handling locked files and resolving the `HIERARCHY_ERROR` is:

  1. Create a new folder in your home or team drive that will contain all locked files.
  2. Move the locked files that were found via `iquest` to that location. CLI users can use `imv` for that.
  3. Now, restart the data transfer. The `HIERARCHY_ERROR` should be gone.
  4. If you have accumulated multiple locked files in that folder which you cannot delete, please contact RDMS-Support and we will help you remove them.
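
For CLI users, steps 1 and 2 could be scripted roughly as follows. This is a sketch under assumptions rather than a supported tool: the helper names `run`, `list_locked` and `quarantine_locked` are hypothetical, and the query reuses the `DATA_REPL_STATUS > '1'` condition from the `iquest` command above. With `DRY_RUN=1` (the default in this sketch) the `imkdir` and `imv` calls are only printed.

```shell
#!/usr/bin/env bash
# Sketch: move locked files into a separate collection before retrying a transfer.
# DRY_RUN=1 prints imkdir/imv instead of running them; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [[ $DRY_RUN == 1 ]]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Print the full path of every locked file below a collection, one per line.
list_locked() {
    local coll=$1
    iquest --no-page "%s/%s" \
        "SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME LIKE '${coll}%' AND DATA_REPL_STATUS > '1'"
}

# Read paths on stdin and move each one into the given collection.
quarantine_locked() {
    local quarantine=$1 path
    run imkdir -p "$quarantine"                # step 1: create the folder
    while IFS= read -r path; do
        [[ $path == /* ]] || continue          # skip notices such as "nothing found"
        run imv "$path" "$quarantine/${path##*/}"   # step 2: move the file
    done
}
```

A possible invocation is `list_locked /rug/home/path/to/folder | quarantine_locked /rug/home/path/to/locked_files`, where the second path is only a placeholder. The target folder should lie outside the folder being searched, and files that share a name would collide in it, so review the printed commands before switching to `DRY_RUN=0`.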

Note: It is recommended not to contact RDMS support for every single locked file, but to first try to resolve the issue as described above. If a large number of locked files is detected, RDMS support can also be contacted directly.