Definition of sensitive data
There is no single comprehensive definition of sensitive data. It could be, for example, confidential business data whose leaking could harm the business.
Research data might also contain information about people, in which case data protection legislation applies.
The EU General Data Protection Regulation (GDPR) defines personal data in Article 4 and the principles of data handling in Articles 5, 24 and 32.
Article 9 defines the special categories of personal data, which include, for example, genetic and biometric data.
Read more: Finnish Social Science Data Archive, Anonymisation and Personal Data
Also consider the ethical principles that must be taken into account when handling sensitive data.
Types of sensitive data
Sensitive data are data that disclose personal information, such as racial or ethnic origin, sexual orientation, religion, or health and medical data. Other types of sensitive data, beyond human data, are confidential business information and information on state security. Sensitive data can also include data that reveal the location of rare, endangered or commercially valuable species, or other conservation efforts.
When conducting research that involves sensitive data, it is important to be familiar with the laws and restrictions that apply to it. Sensitive data can be opened or published if it is anonymised first. Make sure that the data is still usable and intact after anonymisation. If the quality of the data suffers too much, publish only the metadata, not the data itself. The metadata, in turn, can be published only if it does not contain information with which people can be identified.
EU-wide data protection rules have been agreed upon and took effect in May 2018. This set of rules, the General Data Protection Regulation (GDPR), helps protect personal data and harmonise privacy laws across Europe.
How to manage sensitive data
The legislation does not define the precise technological means.
First, make a risk evaluation. Both technical and organisational security measures have to be considered, such as pseudonymisation and encryption (Article 32).
Recitals 26, 28 and 29 of the GDPR preamble also indicate that pseudonymising personal data does not remove its classification as personal data.
Use common sense: for example, store any code registry or encryption keys outside the system that holds the pseudonymised or encrypted data.
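The idea of pseudonymising data and keeping the code registry elsewhere can be illustrated with a minimal Python sketch. It is not a method prescribed by the legislation or by CSC. It assumes a keyed hash (HMAC-SHA256) is an acceptable way to derive pseudonyms for your data, and the function names, field names and sample records are hypothetical. The secret key and the registry that maps pseudonyms back to identifiers are returned separately so that they can be stored outside the system holding the pseudonymised records.

```python
import hashlib
import hmac
import secrets


def make_key() -> bytes:
    """Generate a random secret key; store it outside the data system."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_records(records, id_field, key):
    """Return (pseudonymised copies of records, code registry).

    The registry maps each pseudonym back to the original identifier
    and must be kept separately from the pseudonymised records.
    """
    registry = {}
    pseudonymised = []
    for record in records:
        code = pseudonymise(record[id_field], key)
        registry[code] = record[id_field]
        cleaned = dict(record)  # copy, so the input is left unchanged
        cleaned[id_field] = code
        pseudonymised.append(cleaned)
    return pseudonymised, registry


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        {"name": "Alice Example", "measurement": 4.2},
        {"name": "Bob Example", "measurement": 3.9},
    ]
    key = make_key()
    data, registry = pseudonymise_records(sample, "name", key)
    print(data)  # safe to keep in the analysis environment
    # `key` and `registry` belong in a separate, access-controlled location.
```

Because the key and the registry allow re-identification, the output of this sketch is still personal data; it only reduces the risk if the key and registry are kept apart from it.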
If you need to process sensitive data, ePouta ( https://research.csc.fi/epouta ) might be suitable for you. Its virtual machines do not have a direct internet connection but are intended to be connected to the customer organisation's local network. Setting up such a connection requires some effort and is usually not suitable for short-term use.
CSC is developing solutions for a secure desktop and an interface for bringing data into secure storage. These are due to be in production in 2019.
In the meantime, please contact email@example.com and let's discuss possible solutions.