Differences

This shows you the differences between two versions of the page.

rdms:data:integrity [2025/03/19 15:30] – [Via the Web Interface] burcu
rdms:data:integrity [2025/03/24 08:12] (current) – [Data Safety and Integrity] burcu
Line 7: Line 7:
  
   * **Data Replication**: Data in the RDMS is stored at two different physical locations. The versions at both physical locations are called replicas, as the file is identical, meaning replicated, in both locations.
-  * **Checksum**: A checksum is a unique value that is generated by running a checksum algorithm/function on certain data. The uniqueness of these values allow to check the integrity of the data.
+  * **Checksum**: A checksum is a unique value that is generated by running a checksum function on certain data. The uniqueness of these values allow to check the integrity of the data.
 
 ===== Data Replication =====
Line 25: Line 25:
 ==== Data Replica Status ====
 
-Every file (not folder) in the RDMS also has a replica status associated with it. This replica status gets automatically assigned when the data enters the system. The replica status definitions result from iRODS the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica statuses of which four are used:
+Every file (not folder) in the RDMS also has a replica status associated with it. This replica status gets automatically assigned when the data enters the system. The replica status definitions result from iRODS the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica statuse of which four are used:
 
 ^ Numeric Value     ^ Symbolic Value      ^ Name  ^ Definition ^
Line 99: Line 99:
 </code>
 
-As can be seen, both checksums, the one registered in the RDMS as well as the one computed for the same file locally, are the same. Therefore, it can be guaranteed that the file in the RDMS is the same as the one that was uploaded to it.
+As can be seen, both checksums, the one registered in the RDMS and the one computed locally for the same file, are identical. This confirms that the file stored in the RDMS matches the originally uploaded version.
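
The comparison described above can also be scripted. The following is a minimal Python sketch, not part of the RDMS documentation itself: it assumes the registered checksum is a base64-encoded SHA-256 digest, optionally carrying a ''sha2:'' prefix as iRODS commonly prints it. The function names ''local_checksum'' and ''checksums_match'' are hypothetical.

```python
import base64
import hashlib


def local_checksum(path, chunk_size=1024 * 1024):
    """Return the base64-encoded SHA-256 digest of the file at `path`."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
    return base64.b64encode(sha256.digest()).decode("ascii")


def checksums_match(registered, path):
    """Compare a registered checksum string with the local file's checksum."""
    # Assumption: the registered value may start with a "sha2:" prefix.
    prefix = "sha2:"
    if registered.startswith(prefix):
        registered = registered[len(prefix):]
    return registered == local_checksum(path)
```

Calling ''checksums_match'' with the checksum copied from the RDMS and the path of the local file returns ''True'' only if both values are identical.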
  
-As a further tip, it is also possible to adjust the command a little so that it does not just calculate the checksum for a single file, but for all files in a folder. An example command to do so (assuming Bash shell):
+**Tip**: It is also possible to adjust the command a little so that it does not just calculate the checksum for a single file, but for all files in a folder. An example command to do so (assuming Bash shell):
 
 <code>
Line 135: Line 135:
  
 <code>
-[System.Convert]::ToBase64String((Get-FileHash -Algorithm SHA256 -Path "\path\to\example_file" | Select-Object -ExpandProperty Hash | ForEach-Object { [System.Convert]::FromHexString($_) }))
+[System.Convert]::ToBase64String((Get-FileHash -Algorithm SHA256 -Path "C:\path\to\file" | ForEach-Object { [byte[]]($_.Hash -split '(..)' -ne '' | ForEach-Object { [Convert]::ToByte($_, 16) }) }))
 </code>
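
Both PowerShell variants above do the same thing: they take the hexadecimal SHA-256 hash, turn it back into raw bytes, and base64-encode those bytes. As an illustration only, the same conversion step can be sketched in Python; the function name ''hex_to_base64'' is hypothetical and not part of any RDMS tooling.

```python
import base64


def hex_to_base64(hex_digest):
    """Convert a hexadecimal digest string into its base64 representation."""
    # bytes.fromhex accepts upper- and lower-case hex digits, so the
    # upper-case output of Get-FileHash can be passed in unchanged.
    raw_bytes = bytes.fromhex(hex_digest.strip())
    return base64.b64encode(raw_bytes).decode("ascii")
```

This is useful when only a hexadecimal checksum is at hand, for example from a tool that does not offer base64 output.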
  
Line 142: Line 142:
 Get-ChildItem -Path "C:\path\to\folder" -File | ForEach-Object {
     $file = $_.FullName
-    $checksum = [System.Convert]::ToBase64String((Get-FileHash -Algorithm SHA256 -Path $file | Select-Object -ExpandProperty Hash | ForEach-Object { [System.Convert]::FromHexString($_) }))
+    $checksum = [System.Convert]::ToBase64String((Get-FileHash -Algorithm SHA256 -Path $file | ForEach-Object { [byte[]]($_.Hash -split '(..)' -ne '' | ForEach-Object { [Convert]::ToByte($_, 16) }) }))
     [PSCustomObject]@{
         FileName = $_.Name