Differences
This shows you the differences between two versions of the page.
rdms:data:integrity [2025/03/19 13:34] – burcu | rdms:data:integrity [2025/03/24 08:12] (current) – [Data Safety and Integrity] burcu | ||
---|---|---|---|
Line 6: | Line 6: | ||
In short, the key concepts are: | In short, the key concepts are: | ||
- | * **Data Replication**: | + | * **Data Replication**: |
- | * **Checksum**: | + | * **Checksum**: |
===== Data Replication ===== | ===== Data Replication ===== | ||
Line 13: | Line 13: | ||
While the replication does not guarantee the integrity of the data, since corrupted data will also be replicated, it is a safeguard mechanism in case of hardware failure or damage to a data center. Because the data exists in two independent locations, the likelihood of both locations being affected is minimal. | While the replication does not guarantee the integrity of the data, since corrupted data will also be replicated, it is a safeguard mechanism in case of hardware failure or damage to a data center. Because the data exists in two independent locations, the likelihood of both locations being affected is minimal. | ||
- | **Note:** The replication in the RDMS operates at the hardware level. As a user, this is not directly visible with the tools discussed in this wiki section. For example, the iCommands CLI can be used to check data integrity and, as will be described below, also shows the status of the replica, but will still show only one replica. | + | **Note:** The replication in the RDMS operates at the hardware level. As a user, this is not directly visible with the tools discussed in this wiki section. For example, the '' |
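The point that replication alone does not protect against corruption can be illustrated with a small local experiment. This is only a sketch: a plain `cp` stands in for the hardware-level replication, the file names are made up, and `sha256sum` is assumed to be available.

```shell
#!/usr/bin/env bash
# Illustration only: "cp" plays the role of replication; file names are hypothetical.
workdir="$(mktemp -d)"
printf 'measurement data\n' > "$workdir/original.dat"

# Record a checksum while the file is still intact.
reference="$(sha256sum "$workdir/original.dat" | cut -d' ' -f1)"

# Simulate silent corruption, then "replicate" the damaged file.
printf 'measurement dbta\n' > "$workdir/original.dat"
cp "$workdir/original.dat" "$workdir/replica.dat"

# The two copies agree with each other ...
copies_identical=no
if cmp -s "$workdir/original.dat" "$workdir/replica.dat"; then
    copies_identical=yes
fi
echo "copies identical: $copies_identical"

# ... but neither matches the checksum recorded before the corruption.
current="$(sha256sum "$workdir/replica.dat" | cut -d' ' -f1)"
if [ "$current" != "$reference" ]; then
    echo "checksum mismatch: corruption detected"
fi

rm -rf "$workdir"
```

Comparing the copies with each other reveals nothing, while the checksum recorded earlier exposes the change. This is why checksums complement replication rather than replace it.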
===== Checking Data Integrity ===== | ===== Checking Data Integrity ===== | ||
- | This section explains how you can verify the integrity of your data in the RDMS yourself: How the RDMS uses checksums to verify integrity, different | + | This section explains how you can verify the integrity of your data in the RDMS yourself: how the RDMS uses checksums to verify integrity, what the different replica statuses mean, and how you can use this information to check your data, either using the [[https:// |
==== Checksums in the RDMS ==== | ==== Checksums in the RDMS ==== | ||
- | One of the unique features of the RDMS is that it is not a simple storage solution, | + | One of the unique features of the RDMS is that it is not just a simple storage solution, |
- | In the case of the RDMS, we also store a checksum for every file that is stored in the RDMS. This is by default | + | In the RDMS, a checksum |
+ | |||
+ | ==== Data Replica Status ==== | ||
- | The checksum of your data files can be checked via the already mentioned iCommands, but the information about the file checksum is also visible via the web interface. | + | Every file (not folder) in the RDMS also has a replica status associated with it. This status is assigned automatically when the data enters the system. The replica status definitions come from iRODS, the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica |
- | + | ||
- | The checksum that are stored in the RDMS are base64-encoded [[https:// | + | |
- | + | ||
- | **Note**: If you use Windows, either via native [[rdms: | + | |
- | ==== Data Replica Status Explained ==== | + | |
- | + | ||
- | Every file (not folder) in the RDMS also has a replica status associated with it. This replica status gets automatically assigned when the data enters the system. The replica status definitions result from iRODS the data management system that is the backbone of the RDMS. As of now, iRODS knows five different replica | + | |
^ Numeric Value ^ Symbolic Value ^ Name ^ Definition ^ | ^ Numeric Value ^ Symbolic Value ^ Name ^ Definition ^ | ||
Line 50: | Line 45: | ||
**Note:** While not directly related to the replica status information, | **Note:** While not directly related to the replica status information, | ||
- | + | ||
+ | ====== How to Check Your File's Checksum ====== | ||
+ | |||
+ | * '' | ||
+ | * Web interface: Checksum information is also visible via the RDMS web interface | ||
+ | |||
+ | The checksums stored in the RDMS are base64-encoded [[https:// |
+ | |||
+ | **Note**: If you use Windows, either via native [[rdms: | ||
==== Via Command-Line Interface ==== | ==== Via Command-Line Interface ==== | ||
The most convenient way to check the status and integrity of your data in the RDMS is via the [[rdms: | The most convenient way to check the status and integrity of your data in the RDMS is via the [[rdms: | ||
=== Checking Integrity during Data Ingestion === | === Checking Integrity during Data Ingestion === | ||
- | The commands that are used for uploading data to the RDMS, namely '' | + | The commands that are used for uploading data to the RDMS, namely '' |
< | < | ||
Line 62: | Line 66: | ||
</ | </ | ||
- | Which will compute the checksums for you locally, but also on the RDMS side. In the process the checksums are verified by the iCommands for you and also directly stored in the iCAT catalog/ | + | Which will compute the checksums for you locally, but also on the RDMS side. In the process the checksums are verified by the '' |
**Note**: Even without using the '' | **Note**: Even without using the '' | ||
Line 95: | Line 99: | ||
</ | </ | ||
- | As can be seen, both checksums, the one registered in the RDMS as well as the one computed for the same file locally, are the same. Therefore, it can be guaranteed | + | As can be seen, both checksums, the one registered in the RDMS and the one computed |
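The manual comparison described above can also be scripted. The following is a minimal sketch, not the RDMS's own tooling: it assumes that registered checksums have the form `sha2:` followed by the base64-encoded binary SHA-256 digest, and that `openssl` and `base64` are installed. The function names `local_checksum` and `verify_checksum` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the checksum of a local file as
# "sha2:<base64 of the binary SHA-256 digest>" (assumed RDMS format).
local_checksum() {
    printf 'sha2:%s\n' "$(openssl dgst -sha256 -binary "$1" | base64)"
}

# Hypothetical helper: compare a local file against a checksum string
# copied from the RDMS. Returns 0 on a match and 1 on a mismatch.
verify_checksum() {
    local file="$1" expected="$2" actual
    actual="$(local_checksum "$file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (local: $actual, registered: $expected)" >&2
        return 1
    fi
}
```

Called as `verify_checksum <local file> <registered checksum>`, the second argument is the checksum string exactly as displayed by the RDMS. The non-zero return value on a mismatch makes the helper usable in scripts that should stop when a file differs.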
- | As a further tip, it is also possible to adjust the command a little so that it does not just calculate the checksum for a single file, but for all files in a folder. An example command to do so (assuming Bash shell): | + | **Tip**: The command can be adapted to calculate the checksums of all files in a folder instead of a single file. An example command to do so (assuming Bash shell): |
< | < | ||
Line 131: | Line 135: | ||
< | < | ||
- | [System.Convert]:: | + | [System.Convert]:: |
</ | </ | ||
Line 138: | Line 142: | ||
Get-ChildItem -Path " | Get-ChildItem -Path " | ||
$file = $_.FullName | $file = $_.FullName | ||
- | $checksum = [System.Convert]:: | + | $checksum = [System.Convert]:: |
[PSCustomObject]@{ | [PSCustomObject]@{ | ||
FileName = $_.Name | FileName = $_.Name | ||
Line 170: | Line 174: | ||
**Notes**: | **Notes**: | ||
- | * While RDMS web interface | + | * While the RDMS web interface |
- | * As of now, the [[rdms: | + | * Currently, the [[rdms: |