Differences
This shows you the differences between two versions of the page.
| dcc:pdpsol:de-identification [2026/02/17 09:42] – add text marlon | dcc:pdpsol:de-identification [2026/02/17 10:09] (current) – marlon | ||
|---|---|---|---|
| Line 4: | Line 4: | ||
| De-identification is the masking, manipulation or removal of personal data with the aim of making individuals in a dataset harder to identify. It is especially important when you want to share, publish or archive your dataset. Before sharing, publishing or archiving your data, you should determine whether it is possible to de-identify your dataset, while also keeping in mind its usability. | De-identification is the masking, manipulation or removal of personal data with the aim of making individuals in a dataset harder to identify. It is especially important when you want to share, publish or archive your dataset. Before sharing, publishing or archiving your data, you should determine whether it is possible to de-identify your dataset, while also keeping in mind its usability. | ||
| - | ==== Anonymization | + | ==== Anonymization |
| === Pseudonymization === | === Pseudonymization === | ||
| - | Pseudonymization is a de-identification procedure | + | Pseudonymization is a de-identification procedure which is often implemented during data collection. During pseudonymization |
| - | Refer to our page on pseudonymization for practical advice on its implementation. | + | [[pseudonymization|→ |
| === Anonymization === | === Anonymization === | ||
| Anonymization is a de-identification procedure during which “personal data is altered in such a way that a data subject can no longer be identified directly or indirectly, either by the data controller alone or in collaboration with any other party.” | Anonymization is a de-identification procedure during which “personal data is altered in such a way that a data subject can no longer be identified directly or indirectly, either by the data controller alone or in collaboration with any other party.” | ||
| + | |||
| + | **Warning: | ||
| ==== De-identification techniques ==== | ==== De-identification techniques ==== | ||
| There are several techniques that can make your dataset less identifiable. Check out possible techniques to de-identify your data below, but be aware that these techniques often affect its analytical value. | There are several techniques that can make your dataset less identifiable. Check out possible techniques to de-identify your data below, but be aware that these techniques often affect its analytical value. | ||
| - | === Remove | + | === Removing |
| Consider whether you can remove or suppress sensitive elements. | Consider whether you can remove or suppress sensitive elements. | ||
| * Remove variables that reveal rare personal attributes. | * Remove variables that reveal rare personal attributes. | ||
| Line 22: | Line 25: | ||
| * Use restricted access to your data and provide researchers with only those variables that are necessary to answer their research question. | * Use restricted access to your data and provide researchers with only those variables that are necessary to answer their research question. | ||
| - | === Replace | + | === Replacing |
| A practice in which you replace sensitive personal data with values or codes that are not sensitive: | A practice in which you replace sensitive personal data with values or codes that are not sensitive: | ||
| * Replace direct identifiers (‘name’) with a pseudonym (‘X’). | * Replace direct identifiers (‘name’) with a pseudonym (‘X’). | ||
| * Make numerical values less precise. | * Make numerical values less precise. | ||
| * Replace identifiable text with ‘[redacted]’. | * Replace identifiable text with ‘[redacted]’. | ||
| - | Masking is typically partial, i.e. applied only to some characters in the attribute. For example, in the case of a postal code: change 9746DC into 97****. | + | Masking is typically partial, i.e. applied only to some characters in the attribute. For example, in the case of a postal code: change 9746DC into 97∗∗∗∗. |
| === Aggregation & generalization === | === Aggregation & generalization === | ||