Differences

This shows you the differences between two versions of the page.

dcc:pdpsol:dataminimization [2026/04/29 13:30] marlon
dcc:pdpsol:dataminimization [2026/04/29 13:52] (current) – add comment solveig about contact information in surverys marlon
Line 17: Line 17:
 This concept is also relevant if you use certain variables as **independent variables** in your research. If you want to collect location data, for instance, it is often unnecessary to know someone’s exact address or neighbourhood to answer a research question. If the goal is to compare happiness across different regions in a country, broader categories such as rural versus urban areas may be sufficient. In some situations, however, it might be necessary to collect more detailed, high-granularity data: research on neighbourhood connections, for example, would require detailed location data.
  
-==== Take into account the effort of research participation ====
+=== Take into account the effort of research participation ===
 Although it is important to consider what personal data you need for your research, it is also important to be mindful of the effort and strain participation may place on your participants. This means you should limit the collection of personal data to what you need for your research. However, you should also respect participants’ time and effort, and avoid designing studies that require participants to take part multiple times due to narrowly defined research questions. This is particularly important when working with vulnerable or hard-to-reach groups. In such cases, it is advisable to design studies that can address several relevant questions at once, thereby maximizing the value of participants’ contributions while minimizing their strain.
  
Line 37: Line 37:
 ===Type of data===
 Some data can reveal more information about an individual than others. Only use an extensive or detailed data collection method if you also use this type of data to answer your research question.
-  * **Video**: Observational research on human interactions, facial expressions, movement patterns, etc.
+  * **Video**: Observational research focusing on human interactions, facial expressions, movement patterns, etc.
-  * **Audio**: Unstructured qualitative research in which precise content and possibly tone and pitch are important (e.g., focus groups and open interviews), and research conducting speech analysis.
+  * **Audio**: Unstructured qualitative research where precise content and possibly tone and pitch are important (e.g., focus groups and open interviews), but also research conducting speech analysis.
-  * **Text**: Structured qualitative research focusing on content (e.g. interviews, observations)
+  * **Text**: Structured qualitative research focusing on content (e.g. interviews, observations)
  
 ===Contact information===
Line 79: Line 79:
  
 ===Contact information===
-Do not collect contact information if you do not plan to contact your participants after you have collected the data (e.g. in case of recruitment via social media, posters or third parties). The [[https://www.rug.nl/digital-competence-centre/it-solutions/collect-and-annotate/qualtrics-surveys?lang=en|UG approved survey tool Qualtrics]] provides the option to use an [[https://www.qualtrics.com/support/survey-platform/distributions-module/web-distribution/anonymous-link/|anonymous link]] to prevent the collection of name and e-mail address of your participants.
+Do not collect contact information if you do not plan to contact your participants after you have collected the data (e.g. in case of recruitment via social media, posters or third parties). The [[https://www.rug.nl/digital-competence-centre/it-solutions/collect-and-annotate/qualtrics-surveys?lang=en|UG approved survey tool Qualtrics]] provides the option to use an [[https://www.qualtrics.com/support/survey-platform/distributions-module/web-distribution/anonymous-link/|anonymous link]] to prevent the collection of name and e-mail address of your participants. If you would like to contact participants to share results or for another purpose that doesn’t require linking identities to their responses, set up a separate survey to collect contact information. You can provide a link to this second survey at the end of the original one. This approach ensures that all research data is anonymous from the start, while still allowing you to maintain a list of contact details. This only works if there is no need to connect contact information to individual responses.
  
 === Informed Consent ===
Line 103: Line 103:
   * **Social media data scraping** is the automated collection of user-generated content and metadata from platforms like X (formerly Twitter) and YouTube for systematic analysis. Make sure you limit the variables you collect during scraping and define clear filters to limit your range (e.g. keywords and date range). Consider taking a sample and not scraping all the data that falls within this range.
   * **[[https://datadonation.eu/data-donation/|Data donation]]** allows a researcher to collect digital trace data, by asking their participants to request and share their Data Download Packages (DDPs), which they can request by exercising their [[https://www.rug.nl/digital-competence-centre/privacy-and-data-protection/gdpr-research/rights-of-human-data-subjects-in-scientific-research|privacy right to access and data portability]]. Although these packages can contain a lot of sensitive data, researchers at scientific institutions in the Netherlands can use the software [[https://datadonation.eu/software/port/|Port]] which helps to set up a [[https://d3i-infra.github.io/data-donation-task/|data donation task]]. This limits the amount of data that will be donated to the data that is necessary for the research project, because participants do not donate the full DDP they received from the social media platform.
-  * **Manual data collection and observation** make it possible to carefully design your data collection and easily prevent the collection of identifiable data. You can determine what data you collect and are less dependent on APIs or Data Download Packages (DDPs). Examples of good practices: 1) Make sure not to collect any usernames, or store them separately from the rest of your data ([[pseudonymization|pseudonymization]]). 2)[[de-identification|De-identify]] other personally identifiable information that is not necessary for your research purpose while you are collecting the data.
+  * **Manual data collection and observation** make it possible to carefully design your data collection and easily prevent the collection of identifiable data. You can determine what data you collect and are less dependent on APIs or Data Download Packages (DDPs). Examples of good practices: 1) Make sure not to collect any usernames, or store them separately from the rest of your data ([[pseudonymization|pseudonymization]]). 2) [[de-identification|De-identify]] other personally identifiable information that is not necessary for your research purpose during data collection.
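The practices listed above (clear keyword and date filters, taking a sample, limiting variables, and storing usernames separately) could be sketched roughly as follows. This is only a minimal illustration and not part of the original guidance: the record layout (`username`, `text`, `posted_on`), the `KEEP_FIELDS` whitelist and the `minimize` function are hypothetical names, and the sketch assumes posts are already available as Python dictionaries.

```python
import random
from datetime import date

# Hypothetical whitelist: only the variables the research question needs.
KEEP_FIELDS = ("text", "posted_on")


def minimize(posts, keywords, start, end, sample_size, seed=0):
    """Filter, sample and pseudonymize a list of post dictionaries.

    Returns (records, key_table). The key table maps usernames to
    pseudonyms and is meant to be stored apart from the records.
    """
    # 1) Clear filters: keep posts in the date range that mention a keyword.
    in_scope = [
        p for p in posts
        if start <= p["posted_on"] <= end
        and any(k.lower() in p["text"].lower() for k in keywords)
    ]

    # 2) Take a random sample instead of keeping everything in scope.
    rng = random.Random(seed)
    sampled = rng.sample(in_scope, min(sample_size, len(in_scope)))

    # 3) Replace usernames with pseudonyms and drop all other variables.
    key_table = {}
    records = []
    for p in sampled:
        if p["username"] not in key_table:
            key_table[p["username"]] = f"P{len(key_table) + 1:04d}"
        record = {"pid": key_table[p["username"]]}
        record.update({field: p[field] for field in KEEP_FIELDS})
        records.append(record)
    return records, key_table


if __name__ == "__main__":
    example_posts = [
        {"username": "alice", "text": "Happy in the city",
         "posted_on": date(2025, 1, 5), "location": "exact address"},
        {"username": "bob", "text": "Unrelated post",
         "posted_on": date(2025, 1, 6), "location": "exact address"},
    ]
    records, key_table = minimize(
        example_posts, ["happy"], date(2025, 1, 1), date(2025, 12, 31),
        sample_size=100,
    )
    print(records)
```

In this sketch the returned `key_table` is the only place where usernames still appear, so it would be stored separately from `records` (or discarded if re-contacting or linking is not needed). Fields outside the whitelist, such as `location` in the example, never reach the research dataset.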