The archiving workflow is the process of creating a data package from a dataset. If a dataset is used for a published work, you can "freeze" it by creating a data package. The archived dataset (the data package) is then ready for the publication workflow, which publishes the metadata to the outside world in line with the Open Science framework. The publication workflow is currently under development.
To begin the archiving process, the data you want to archive must first be stored as a project in the Project environment.
During the archiving process, three different roles become active at different times. A single user can hold any number of these roles, and multiple users can hold different roles and work at different stages of the archiving process.
Owner/Admin: This role is responsible for assigning the data manager and metadata manager roles and for starting the archiving process. Best practice is to assign this role to a supervisor and/or to the person(s) who have access to the project folder from which the archive is generated.
Data Manager: This role is responsible for verifying that the data sent to the archive is complete and uncorrupted, and giving the final approval of the archive. Best practice is to assign this role to the person(s) who actively work with the data.
Metadata Manager: This role is responsible for verifying and completing the metadata related to the archive. Best practice is to assign this role to the person(s) who know the origin and scope of the data, and who know where else the data is stored (e.g. existing DOIs).
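The division of responsibilities above can be sketched as a simple permission model. This is a minimal illustration of the role/stage mapping described in this section, not the RDMS implementation; all names are hypothetical:

```python
from enum import Enum, auto

class Role(Enum):
    OWNER = auto()             # assigns roles, starts the archiving process
    DATA_MANAGER = auto()      # verifies data integrity, gives final approval
    METADATA_MANAGER = auto()  # verifies and completes metadata

# Which role is active at each stage of the workflow
STAGE_ROLES = {
    "start_archive": Role.OWNER,
    "approve_data": Role.DATA_MANAGER,
    "complete_metadata": Role.METADATA_MANAGER,
    "final_approval": Role.DATA_MANAGER,
}

def may_act(user_roles: set[Role], stage: str) -> bool:
    """A user may act in a stage if they hold the role active in that stage."""
    return STAGE_ROLES[stage] in user_roles

# One user can hold several roles at once, e.g. a supervisor who is
# both Owner and Data Manager
supervisor = {Role.OWNER, Role.DATA_MANAGER}
```

Note that holding one role never grants another role's stages: in this sketch, the supervisor above may start the archive and approve the data, but may not complete the metadata.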
Active role: Owner
When you start the archiving process, you will be prompted to select the folders or files you want to archive. In this step, you can decide whether to archive the entire project folder or just a part of it. NB: Most of the time, you will select the entire folder. There are cases, however, where part of the archive needs to be deleted before the customary 10-year retention period is up, due to privacy regulations. In such cases, we advise you to create two archives: one containing the normal data that should be stored for 10 years, the other containing the sensitive data that needs to be deleted earlier. A good practice is to label both archives in a way that makes clear that they are interlinked and which one contains the sensitive data. This is best done in the project folder before the archiving starts.
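The split into a normal and a sensitive archive can be prepared with a simple folder convention. The sketch below partitions a project folder by a hypothetical `sensitive` subfolder name; the convention and function name are purely illustrative, not an RDMS feature:

```python
from pathlib import Path

def split_for_archiving(project: Path, sensitive_marker: str = "sensitive"):
    """Partition files into two archive candidates: normal data
    (10-year retention) and sensitive data that must be deletable earlier.
    Any file under a folder named `sensitive_marker` counts as sensitive."""
    normal, sensitive = [], []
    for path in project.rglob("*"):
        if path.is_file():
            # check folder names relative to the project root only
            if sensitive_marker in path.relative_to(project).parts:
                sensitive.append(path)
            else:
                normal.append(path)
    return normal, sensitive
```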
Active role: Data Manager
In this step, the data manager checks that the data contained in the project folder and sent to the archiving stage is complete and uncorrupted. If the data manager approves, the process creates a data package from the data contained in the project folder. The data is frozen from this step onward and cannot be modified further. If multiple data managers are present at this stage, approval from just one of them is sufficient to move to the next step. Please consider that this is the first point of no return in the archiving process.
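A common way to verify that data is complete and uncorrupted is to compare file checksums against a manifest recorded earlier. The sketch below shows the general technique using Python's standard `hashlib`; it is an illustration of the kind of check a data manager might run, not part of the RDMS itself:

```python
import hashlib
from pathlib import Path

def checksum(path: Path, algo: str = "sha256") -> str:
    """Compute a file checksum in fixed-size chunks (safe for large files)."""
    h = hashlib.new(algo)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest: dict[Path, str]) -> list[Path]:
    """Return the files whose current checksum no longer matches the manifest.

    An empty list means the data is complete and uncorrupted."""
    return [p for p, expected in manifest.items() if checksum(p) != expected]
```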
Active role: Metadata Manager
In this third step, the Metadata Manager can verify existing metadata and add new metadata to complete the description of the archive itself or of individual parts of the archive. Once the metadata is complete and the Metadata Manager approves, the archive moves to the next step, where the metadata is verified once more (but new metadata can no longer be added) and an existing DOI can optionally be attached to the archive. Here the Metadata Manager can either send the archive back to the previous step, if the metadata is incomplete, or approve it. Approval in this step freezes the metadata and moves the archive to the final stage of the procedure. This is the second point of no return in the process. NB: The RDMS does not generate a DOI for you yet, but this is on our development road map.
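A completeness check like the one gating this step can be sketched as a required-fields test. The field names below are hypothetical examples, not the actual RDMS metadata schema:

```python
# Hypothetical required fields; the real schema is defined by the RDMS
REQUIRED_FIELDS = {"title", "creator", "description", "date", "license"}

def missing_metadata(metadata: dict[str, str]) -> set[str]:
    """Return the fields still to be filled in (absent or blank).

    The Metadata Manager can only approve once this set is empty."""
    return {field for field in REQUIRED_FIELDS
            if not metadata.get(field, "").strip()}
```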
Active role: Data Manager
The data manager can check the archive one last time before the procedure is final. Here, both the data and the metadata can be inspected and approved. This is the final step and the last point of no return in the process. After approval is given here, all secondary products generated during the archiving process are removed and the final archive is generated. From this point forward, neither the archive itself nor the data it contains can be deleted (except by a system administrator).
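The whole workflow described above can be summarized as a state machine: four stages, one allowed backward step (metadata verification back to metadata editing), and three points of no return. This is a minimal sketch of the process flow as documented here, with hypothetical stage names:

```python
from enum import Enum, auto

class Stage(Enum):
    SELECT_DATA = auto()      # Owner selects folders/files to archive
    DATA_CHECK = auto()       # Data Manager approves; 1st point of no return
    METADATA_EDIT = auto()    # Metadata Manager completes metadata
    METADATA_VERIFY = auto()  # verify only; approval is the 2nd point of no return
    FINAL_CHECK = auto()      # Data Manager; last point of no return
    ARCHIVED = auto()         # frozen; deletable only by a system administrator

# Allowed transitions; the only backward step is
# METADATA_VERIFY -> METADATA_EDIT (metadata incomplete)
TRANSITIONS = {
    Stage.SELECT_DATA: {Stage.DATA_CHECK},
    Stage.DATA_CHECK: {Stage.METADATA_EDIT},
    Stage.METADATA_EDIT: {Stage.METADATA_VERIFY},
    Stage.METADATA_VERIFY: {Stage.METADATA_EDIT, Stage.FINAL_CHECK},
    Stage.FINAL_CHECK: {Stage.ARCHIVED},
    Stage.ARCHIVED: set(),
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move the archive to `target`, rejecting transitions the workflow forbids."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

For example, sending the archive back for more metadata is legal only from the verification stage; trying to "unfreeze" the data after the first point of no return raises an error.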