| dcc:pdpsol:publishinghsd [2026/04/28 13:19] – add sentence about consent in the Level 1 marlon | dcc:pdpsol:publishinghsd [2026/05/13 13:29] (current) – alba |
|---|
| {{indexmenu_n>5}} | {{indexmenu_n>5}} |
| ===== Archiving and Publishing Human Subject Data ===== | ===== Archiving and publishing human subject data ===== |
| |
| ==== Introduction ==== | ==== Introduction ==== |
| |
| ==== De-identifying data before archiving or publishing ==== | ==== De-identifying data before archiving or publishing ==== |
| Often, it is not necessary to keep all collected data for the purpose of validating your findings or for researchers to reuse your data. | Often, it is not necessary to keep all collected data: |
| * Limit the (personal) data and materials you archive to the ones that you need for verification of your research. Follow the procedures in the [[datadesctruction|destruction protocol(s)]] that you designed. Add these protocol(s) to your data package, publication package or archive. (e.g. anonymised consent forms can be archived, while consent forms containing personal data should be de-identified or destroyed in accordance with the UG protocol) | * Limit the (personal) data and materials you archive to the ones that you need for verification of your research. Follow the procedures in the [[datadesctruction|destruction protocol(s)]] that you designed. Add these protocol(s) to your data package, publication package or archive. (e.g. anonymised consent forms can be archived, while consent forms containing personal data should be de-identified or destroyed) |
| * Determine whether it is possible to [[de-identification|de-identify]] before publishing, while also keeping in mind the usability of your dataset. | * [[de-identification|De-identify]] data before publishing, while also keeping in mind the usability of your dataset. |
| |
| ==== Publishing de-identified or anonymized data ==== | ==== Publishing de-identified or anonymized data ==== |
| FAIR data does not necessarily mean that all your data and materials need to be openly available. Even after de-identification, there can be [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/make-your-data-available-under-restricted-access|good reasons to restrict access to your data]]. The objective is to have data as open as possible, and as closed and protected as necessary. | FAIR data does not necessarily mean that all your data and materials need to be openly available. Even after de-identification, there can be [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/make-your-data-available-under-restricted-access|good reasons to restrict access to your data]]. The objective is to have data as open as possible, and as closed and protected as necessary. |
| |
| Consider applying a **‘layered’ approach** to your (de-identified) files by scoring your files in terms of sensitivity. | Apply a **‘layered’ approach** to your (de-identified) files by classifying them according to their level of sensitivity. |
| |
| === Level 1: contains no personal data === | === Level 1: contains no personal data === |
| |
| Publish your [[de-identification|(anonymized)]] dataset and supporting materials in a recognized data repository such as [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/dataversenl|DataverseNL]], on the condition that __**no**__ other [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/make-your-data-available-under-restricted-access|reasons for restricting access]] apply. Allow for reuse by adding a license (for instance, [[https://www.rug.nl/library/open-access/how-to-publish-open-access/creative-commons-licenses|a Creative Commons license]]) and use the persistent identifier (e.g., [[https://www.rug.nl/library/publish/isbn-doi|DOI]]) for data citation. If the data come from human participants, make sure that the terms of use align with the informed consent. | Publish your [[de-identification|(anonymized)]] dataset and supporting materials in a recognized data repository such as [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/dataversenl|DataverseNL]], on the condition that __**no**__ other [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/make-your-data-available-under-restricted-access|reasons for restricting access]] apply. Allow for reuse by adding a license (for instance, [[https://www.rug.nl/library/open-access/how-to-publish-open-access/creative-commons-licenses|a Creative Commons license]]) and use the persistent identifier (e.g., [[https://www.rug.nl/library/publish/isbn-doi|DOI]]) for data citation. If the data are anonymized human subject data, make sure that the terms of use align with the informed consent. |
| |
| === Level 2: contains personal data in de-identified form (not anonymized) === | === Level 2: contains personal data in de-identified form (not anonymized) === |
| |
| Publish your [[de-identification|de-identified dataset]] and supporting materials on [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/dataversenl|DataverseNL]], under restricted access. Determine the terms of use for external parties that would like to reuse your data. [[https://www.rug.nl/library/open-access/how-to-publish-open-access/creative-commons-licenses|Creative Commons licenses]] are not suitable for data containing personal data with access restrictions. Make sure that these terms of use align with the informed consent. | Publish your [[de-identification|de-identified dataset]] and supporting materials on [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/dataversenl|DataverseNL]], under restricted access. Determine the terms of use for external parties that would like to reuse your data. [[https://www.rug.nl/library/open-access/how-to-publish-open-access/creative-commons-licenses|Creative Commons licenses]] are not suitable for data containing personal data with access restrictions. [[https://www.rug.nl/digital-competence-centre/contact/|The UG DCC]] can assist in developing a procedure for making these data available for reuse under well-defined conditions. Make sure that these conditions align with the informed consent. |
| |
| === Level 3: contains sensitive personal data === | === Level 3: contains sensitive personal data === |
| |
| When your data still contains highly sensitive information, do not publish this data openly or with access controls in a data repository. Instead, [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/|archive your data]] in accordance with the [[https://www.rug.nl/digital-competence-centre/research-data/policies|research data policy of your faculty or institute]]. The [[https://www.rug.nl/digital-competence-centre/contact/|UG DCC]] can assist in developing a procedure for making these sensitive data available for reuse under well-defined conditions. Make sure that these conditions are in line with the informed consent. | When your data still contains highly sensitive information, do not publish this data openly or with access controls in a data repository. Instead, [[https://www.rug.nl/digital-competence-centre/research-data/archive-and-publish/|archive your data]] in accordance with the [[https://www.rug.nl/digital-competence-centre/research-data/policies|research data policy of your faculty or institute]]. [[https://www.rug.nl/digital-competence-centre/contact/|The UG DCC]] can assist in developing a procedure for making these sensitive data available for reuse under well-defined conditions. Make sure that these conditions are in line with the informed consent. |
| |
| ---- | ---- |
| <color #7092be>→</color>[[https://doi.org/10.34894/R1WHEA |Corpus PINO: A spoken language resource for multiple simultaneous comparisons. (Cristiano et al., 2024)]] | <color #7092be>→</color>[[https://doi.org/10.34894/R1WHEA |Corpus PINO: A spoken language resource for multiple simultaneous comparisons. (Cristiano et al., 2024)]] |
| |
| //“Corpus PINO is a resource designed for research on different styles of spoken Italian and Neapolitan dialect. The corpus consists of anonymized audio recordings and ELAN time-aligned orthographic transcriptions involving fifty participants (stratified by age, gender, and education level). …. PINO is a contribution to the preservation of the local cultural heritage and of a minority language, i.e., an Italo-Romance dialect. It attests the lives, memories, opinions, traditions, practices, and attitudes of fifty members of this community.”// | //“Corpus PINO is a resource designed for research on different styles of spoken Italian and Neapolitan dialect. The corpus consists of [de-identified] audio recordings and ELAN time-aligned orthographic transcriptions involving fifty participants (stratified by age, gender, and education level). …. PINO is a contribution to the preservation of the local cultural heritage and of a minority language, i.e., an Italo-Romance dialect. It attests the lives, memories, opinions, traditions, practices, and attitudes of fifty members of this community.”// |
| ---- | ---- |
| === Score the sensitivity of your data and supporting materials === | === Score the sensitivity of your data and supporting materials === |