Differences

This shows you the differences between two versions of the page.

rdms:bestpractices [2025/06/30 07:30] burcu
rdms:bestpractices [2025/12/09 10:41] (current) – [Transferring Huge Data Sets] slight rephrasing and Grammarly check giulio
Line 57: Line 57:
     publication.final version.odt
 </code>
 +
 +===== Transferring Large Data Sets =====
 +For transferring very large data sets, especially those containing files of several hundred GB or more, we recommend using the [[.:access:linux:icommands|iCommands]] CLI tool. Specifically, we recommend using the ''iput'' (upload) and ''iget'' (download) commands with the following parameters:
 +
 +<code>
 +# Upload: Single big file
 +$ iput -T --lfrestart /path/to/lfRestartFile --retries 3 /path/to/big/local/file /rug/home/destination_collection/
 +
 +# Upload: Big folder
 +$ iput -r -T -X /path/to/restartfile --lfrestart /path/to/lfRestartFile --retries 3 /path/to/big/local/folder /rug/home/destination_collection/
 +
 +# Download: Single big file (source is the RDMS path, destination is the local path)
 +$ iget -T --lfrestart /path/to/lfRestartFile --retries 3 /rug/home/source_collection/big_file /path/to/local/destination/
 +
 +# Download: Big folder (source is the RDMS path, destination is the local path)
 +$ iget -r -T -X /path/to/restartfile --lfrestart /path/to/lfRestartFile --retries 3 /rug/home/source_collection/big_folder /path/to/local/destination/
 +</code>
 +
 +The additional parameters used with the ''iput'' and ''iget'' commands have the following functions:
 +  * ''-T'': Renews the socket connection after 10 minutes. This is useful for long transfers because it helps prevent a firewall from dropping the connection.
 +  * ''-X /path/to/restartfile'': Writes restart information to the specified restart file. The file records how many files have already been transferred and which file was transferred last. It is especially useful when transferring folders that contain many files.
 +  * ''%%--lfrestart%% /path/to/lfRestartFile'': Writes separate restart information for large files to ''/path/to/lfRestartFile''. If the transfer of a large file fails, this allows you to resume it from the point of failure instead of starting over.
 +  * ''%%--retries%% <int>'': Specifies the number of automatic retries of the transfer. This parameter can be used in combination with the restart files.
 +
 +**Note:** When transferring large amounts of data, especially single big files, we recommend **not using** the ''-K'' flag. This flag causes the checksums of the file(s) to be calculated, stored, and compared during the transfer, which can take a very long time and may result in timeouts. Instead, we recommend that you either wait for the automated checksum calculation to finish or force the checksum calculation after the transfer with the ''ichksum'' command.
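
As a rough sketch of the post-transfer check described in the note: the helper below prints a local file's digest in the ''sha2:<base64>'' form so that it can be compared by eye with the value reported by ''ichksum''. This assumes that the server is configured for SHA-256 checksums, which is not stated on this page. ''local_sha2'' is a hypothetical helper name, and the ''-f'' (force) flag in the comment should be confirmed with ''ichksum -h'' on your installation.

```shell
# Hypothetical helper: print the SHA-256 digest of a local file as "sha2:<base64>".
# Assumption: the RDMS server stores SHA-256 checksums in this form.
local_sha2() {
    printf 'sha2:%s\n' "$(openssl dgst -sha256 -binary "$1" | base64)"
}

# After the transfer has finished (placeholder paths as in the examples above):
#   ichksum -f /rug/home/destination_collection/file   # force calculation on the server
#   local_sha2 /path/to/big/local/file                 # local value to compare against
```

If the two values differ, the file should be transferred again before the local copy is removed.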
  
 ===== Bundling of Data Sets =====
Line 73: Line 98:
  
 For RDMS web interface users, the "Uncompress tar" function, accessible via right-click on a ''*.tar'' file, enables extraction. Currently, this function supports only ''*.tar'' formats.
 +
 +**Note:** The ''ibun'' command does not support symlinks. We therefore recommend dereferencing symlinks when you create the archives locally. With the ''tar'' command, this is done by adding the ''-h'' flag.
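
A minimal sketch of the ''-h'' behaviour, using a throwaway directory layout invented for illustration (''dataset'', ''target.txt'', and ''link.txt'' are hypothetical names):

```shell
# Hypothetical demo layout: a folder containing a symlink to a file outside it.
workdir=$(mktemp -d)
mkdir "$workdir/dataset"
echo "measurement data" > "$workdir/target.txt"
ln -s "$workdir/target.txt" "$workdir/dataset/link.txt"

# -h makes tar follow symlinks and store the files they point to.
tar -chf "$workdir/dataset.tar" -C "$workdir" dataset

# In the verbose listing, a symlink entry would start with "l";
# with -h, link.txt is listed as a regular file (leading "-").
tar -tvf "$workdir/dataset.tar"
```

Note that dereferencing stores a full copy of every linked file, so the archive can become considerably larger than the folder appears on disk.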
  
 ==== Choosing a Data Compression Format ====