Managing Sensitive Data - Services for Research
Although the EU General Data Protection Regulation (GDPR) does not prescribe exact technical details for processing sensitive data, it does outline the principles that such processing must follow. The list is long and the definitions are complex, but a few basic rules are easy to highlight:
- Minimise the data. Process only the data that is absolutely needed. For example, if the dataset includes each person's age but that information is not needed for the research, it should not be included in the dataset. Remove unnecessary information from the dataset before processing it.
- Anonymise or pseudonymise the data whenever possible. Anonymisation means removing all identifying information from the data so that it is impossible to identify a single individual from it. Pseudonymisation, by contrast, means replacing identifying information in the data with artificial identifiers or pseudonyms so that a single individual cannot be identified from the dataset without additional information.
Note that pseudonymisation does not remove the sensitivity of the data, so all requirements for sensitive data processing still apply. The code registry used to map pseudonyms back to the original identities should not be stored alongside the data but preferably in a completely separate system, in order to minimise potential damage in case of a data leak.
- Encrypt the data. Sensitive data at rest (i.e. stored on any type of media) or in transit (i.e. being copied over a network) should always be encrypted using commonly accepted encryption algorithms and a sufficient key length.
- Destroy the data you do not need. Data should be completely destroyed once it is no longer needed.
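The minimisation and pseudonymisation rules above can be illustrated with a minimal Python sketch. It is not a compliance tool: the function names `minimise` and `pseudonymise`, the field names, and the sample records are all hypothetical, and the sketch assumes the records fit in memory as a list of dictionaries.

```python
import secrets


def minimise(records, needed_fields):
    """Return copies of the records that keep only the needed fields."""
    return [{field: r[field] for field in needed_fields} for r in records]


def pseudonymise(records, id_field):
    """Replace the value of id_field with a random pseudonym.

    Returns (pseudonymised_records, registry), where registry maps each
    pseudonym back to the original identifier. The registry is the
    "code registry" and should be stored in a separate system.
    """
    pseudonym_for = {}   # original identifier -> pseudonym
    registry = {}        # pseudonym -> original identifier
    result = []
    for r in records:
        original = r[id_field]
        if original not in pseudonym_for:
            pseudonym = secrets.token_hex(8)
            while pseudonym in registry:  # extremely unlikely collision
                pseudonym = secrets.token_hex(8)
            pseudonym_for[original] = pseudonym
            registry[pseudonym] = original
        result.append({**r, id_field: pseudonym_for[original]})
    return result, registry


# Hypothetical sample data: age is assumed not to be needed for the study.
records = [
    {"name": "Alice Example", "age": 34, "diagnosis": "A"},
    {"name": "Bob Example", "age": 51, "diagnosis": "B"},
    {"name": "Alice Example", "age": 34, "diagnosis": "C"},
]

minimal = minimise(records, ["name", "diagnosis"])
pseudonymised, registry = pseudonymise(minimal, "name")
```

Here `pseudonymised` no longer contains ages or names, and the same person always receives the same pseudonym within one run. The result is still sensitive data, and `registry` is exactly the kind of back-referencing information that should be kept apart from the dataset.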
Remember to identify and name a Data Controller for your sensitive data. The data controller determines the purposes and means of processing, that is, how the data is processed and why. For research data, the data controller is usually the Principal Investigator, either alone or together with another legal person or entity.
You also have to identify and name a Data Processor, who processes the data on behalf of the controller. The GDPR states that 'processing' means any operation which is performed on personal data, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction. In practice, the data processor is therefore often the computing facility where the data is processed, such as CSC.