Differences
This shows you the differences between two versions of the page.
rdms:data:integrity [2025/03/19 13:56] – [Checksums in the RDMS] burcu | rdms:data:integrity [2025/03/24 08:12] (current) – [Data Safety and Integrity] burcu | ||
---|---|---|---|
Line 7: | Line 7: | ||
* **Data Replication**: | * **Data Replication**: | ||
- | * **Checksum**: | + | * **Checksum**: |
===== Data Replication ===== | ===== Data Replication ===== | ||
Line 13: | Line 13: | ||
While the replication does not guarantee the integrity of the data, since corrupted data will also be replicated, it is a safeguard mechanism in case of hardware failure or damage to a data center. Because the data exists in two independent locations, the likelihood of both locations being affected is minimal. | While the replication does not guarantee the integrity of the data, since corrupted data will also be replicated, it is a safeguard mechanism in case of hardware failure or damage to a data center. Because the data exists in two independent locations, the likelihood of both locations being affected is minimal. | ||
- | **Note:** The replication in the RDMS operates at the hardware level. As a user, this is not directly visible with the tools discussed in this wiki section. For example, the iCommands CLI can be used to check data integrity and, as will be described below, also shows the status of the replica, but will still show only one replica. | + | **Note:** The replication in the RDMS operates at the hardware level. As a user, this is not directly visible with the tools discussed in this wiki section. For example, the '' |
===== Checking Data Integrity ===== | ===== Checking Data Integrity ===== | ||
Line 22: | Line 22: | ||
In the RDMS, a checksum is stored for every file. By default, this happens automatically upon data ingestion via [[rdms: | In the RDMS, a checksum is stored for every file. By default, this happens automatically upon data ingestion via [[rdms: | ||
+ | | ||
+ | ==== Data Replica Status ==== | ||
- | **How to Check Your File's Checksum** | + | Every file (not folder) in the RDMS also has a replica status associated with it. This replica status is assigned automatically when the data enters the system. The replica status definitions come from iRODS, the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica |
- | + | ||
- | * '' | + | |
- | * Web interface: Checksum information is also visible via the RDMS web interface | + | |
- | + | ||
- | The checksum that are stored in the RDMS are base64-encoded [[https:// | + | |
- | + | ||
- | **Note**: If you use Windows, either via native [[rdms: | + | |
- | ==== Data Replica Status Explained ==== | + | |
- | + | ||
- | Every file (not folder) in the RDMS also has a replica status associated with it. This replica status gets automatically assigned when the data enters the system. The replica status definitions result from iRODS the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica | + | |
^ Numeric Value ^ Symbolic Value ^ Name ^ Definition ^ | ^ Numeric Value ^ Symbolic Value ^ Name ^ Definition ^ | ||
Line 53: | Line 45: | ||
**Note:** While not directly related to the replica status information, | **Note:** While not directly related to the replica status information, | ||
- | + | ||
+ | ====== How to Check Your File's Checksum ====== | ||
+ | |||
+ | * '' | ||
+ | * Web interface: Checksum information is also visible via the RDMS web interface | ||
+ | |||
+ | The checksums that are stored in the RDMS are base64-encoded [[https:// |
+ | |||
+ | **Note**: If you use Windows, either via native [[rdms: | ||
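As a rough illustration of what a base64-encoded checksum is and how a locally computed value could be compared with the one registered in the RDMS, the following Bash sketch may help. It rests on several assumptions that this page does not confirm: the hash is taken to be SHA-256, the ''openssl'' and ''base64'' tools are assumed to be installed, and the registered checksum is assumed to be copied by hand from the RDMS. The function names ''b64_sha256'' and ''checksums_match'' are hypothetical helpers, not part of any RDMS tool.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: the registered checksum is a base64-encoded
# SHA-256 digest; adjust the hash if your system uses a different one.

# Hypothetical helper: print the base64-encoded SHA-256 digest of a local file.
# $1: path to the local file
b64_sha256() {
    openssl dgst -sha256 -binary "$1" | base64
}

# Hypothetical helper: succeed (exit status 0) if the locally computed digest
# equals the expected value, fail otherwise.
# $1: path to the local file
# $2: checksum copied by hand from the RDMS
checksums_match() {
    # ASSUMPTION: the value may carry a "sha2:" prefix; strip it if present.
    local expected="${2#sha2:}"
    [ "$(b64_sha256 "$1")" = "$expected" ]
}
```

A possible usage, with a placeholder file name and a placeholder checksum value: ''checksums_match myfile.dat "<checksum copied from the RDMS>" && echo "match" || echo "MISMATCH"''. Comparing the two strings exactly, rather than by eye, avoids overlooking a difference in a single character.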
==== Via Command-Line Interface ==== | ==== Via Command-Line Interface ==== | ||
The most convenient way to check the status and integrity of your data in the RDMS is via the [[rdms: | The most convenient way to check the status and integrity of your data in the RDMS is via the [[rdms: | ||
=== Checking Integrity during Data Ingestion === | === Checking Integrity during Data Ingestion === | ||
- | The commands that are used for uploading data to the RDMS, namely '' | + | The commands that are used for uploading data to the RDMS, namely '' |
< | < | ||
Line 65: | Line 66: | ||
</ | </ | ||
- | Which will compute the checksums for you locally, but also on the RDMS side. In the process the checksums are verified by the iCommands for you and also directly stored in the iCAT catalog/ | + | Which will compute the checksums for you locally, but also on the RDMS side. In the process the checksums are verified by the '' |
**Note**: Even without using the '' | **Note**: Even without using the '' | ||
Line 98: | Line 99: | ||
</ | </ | ||
- | As can be seen, both checksums, the one registered in the RDMS as well as the one computed for the same file locally, are the same. Therefore, it can be guaranteed | + | As can be seen, both checksums, the one registered in the RDMS and the one computed |
- | As a further tip, it is also possible to adjust the command a little so that it does not just calculate the checksum for a single file, but for all files in a folder. An example command to do so (assuming Bash shell): | + | **Tip**: It is also possible to adjust the command a little so that it does not just calculate the checksum for a single file, but for all files in a folder. An example command to do so (assuming Bash shell): |
< | < | ||
Line 134: | Line 135: | ||
< | < | ||
- | [System.Convert]:: | + | [System.Convert]:: |
</ | </ | ||
Line 141: | Line 142: | ||
Get-ChildItem -Path " | Get-ChildItem -Path " | ||
$file = $_.FullName | $file = $_.FullName | ||
- | $checksum = [System.Convert]:: | + | $checksum = [System.Convert]:: |
[PSCustomObject]@{ | [PSCustomObject]@{ | ||
FileName = $_.Name | FileName = $_.Name | ||
Line 173: | Line 174: | ||
**Notes**: | **Notes**: | ||
- | * While RDMS web interface | + | * While RDMS web interface |
- | * As of now, the [[rdms: | + | * Currently, the [[rdms: |