CSC research infrastructure data management policy
This CSC Data Management Policy (DMPol) gives guidance on how the CSC’s Data Policy is implemented. DMPol promotes good data management practices within our operations and for our designated user communities. DMPol sets out the guiding principles for managing and curating data in the CSC Research Infrastructure (RI). The CSC RI is a common national research platform for data management and scientific computing containing services for the entire research data life cycle.
This policy covers how customers’ and users’ data, and the metadata linked to it, are handled in research data management services. DMPol includes guidelines for all types of data, including administrative personal data and sensitive data. CSC services may have specific terms of use, service descriptions, and policies depending on the purpose of the services.
1. General description of the administrative data of the RI
Administrative data of the CSC RI relates to required service management information of assigned personnel, administrative rights, information of users, service agreements, usage of the services, and billing. Administrative personal data processing in CSC’s services is based on our legitimate interests, or on the performance of a contract. The register includes the following data:
- data subject’s basic details: name*, customer number, username and/or other unique identifier, password, gender and preferred language of communication;
- data subject’s contact details: email address*, telephone number* and physical address*;
- data subject’s professional and research-related information: home organisation*, department or institution, role, field of science*, nationality* and level of education*;
- information regarding use of Services for Research: data subject’s project memberships, resource applications and use of resources;
- home organisation contact information: business IDs, names and contact details of contact persons, information on previous and current contracts and orders, and other data from customer interactions;
- user data generated by technical systems, such as logs and online identifiers; and
- any other data collected with specific consent of the data subject: e.g. project category of user: academic, commercial, course or LUMI high-performance computing.
Data marked with an asterisk are required for establishing a contract or customer relationship with CSC. The privacy notice of the CSC customer and stakeholder register describes in further detail how CSC collects only the data necessary for service provisioning and for improving service quality and user experience.
More detailed information on privacy notices:
With regard to the use of Fairdata services, the Ministry of Education and Culture also maintains a user and metadata register for the Fairdata Services and the Digital preservation solution:
2. General description of research data managed within RI
As a common research infrastructure, CSC RI offers services to all research domains. CSC RI cooperates closely with several other RIs, promoting synergies and interoperability between RIs across domains. Services are provided for all types of scientific data that can be processed and stored as digital objects. Sources of research data vary depending on the active projects of users. Data stored in CSC storage services can be optimised for cloud, container cloud or other storage systems, including digital mid-term or long-term preservation services for research data.
Data stored in CSC RI systems are defined solely by the customers using the CSC services.
Datatype | Services (2024) |
Administrative data | eDuuni, e-mail services, service desk, Webropol, Eventilla, Service logs & Authentication and authorization infrastructure services (AAI), MyCSC, IdM and Reppu |
Active research data | Allas, cPouta, ePouta, Kvasi, LUMI, Mahti, Puhti, Rahti, Chipster, CSC Noppe, SD Desktop, FUNET FileSender |
Dynamic research data | Allas, Kaivos, Pukki, Sensitive Data Services: SD Connect, SD Desktop, FEGA |
Research data publication | Paituli, Fairdata services: IDA, Qvain, Etsin, Sensitive Data services: FEGA, SD Apply (reuse) |
Digital preservation | Digital Preservation Service for Research Data (part of Fairdata Services) |
- CSC research services
- General Terms of Use for CSC’s Services for Research and Education
- Terms of use for LUMI supercomputer
3. Ethical and legal compliance for personal or sensitive research data
CSC’s Services for Research and Education Data Processing Agreement (DPA) establishes the rights and duties between CSC – Finnish IT Center for Science Ltd (CSC) and a Controller of data when CSC processes personal data in CSC’s Services on behalf of the Controller. The service descriptions provided in the web portal describe what kind of security measures and restrictions the service offers for data management or sharing.
More detailed information: Data Processing Agreement – CSC’s Services for Research and Education
If the data contains personal data (including special categories of data), the user must ensure that the service intended to process the data is suitable for this type of data. In such a case, the user acts as a Data Controller as described in applicable data protection legislation, and CSC acts as a processor of personal data. When the data contains personal data, including special categories of data (i.e. sensitive personal data), the user and CSC will execute the Data Processing Agreement and the description of processing activities. The form for the description of processing activities is located in CSC’s customer portal MyCSC. These documents together govern such personal data processing activities.
See: General Terms of Use for CSC’s Services for Research and Education
CSC’s Sensitive Data (SD) services for research are designed to support secure sensitive data management through web-user interfaces accessible from the user’s own computer.
More detailed information: SD Services for Research
4. Agreements on research data rights
The use of many of CSC’s services is free of charge for users affiliated with a Finnish higher education institution (a university or a university of applied sciences), a state research institute or the National Archives of Finland, if not otherwise agreed. User may use the services only in accordance with the terms of use. Many services are free of charge for academic use up to certain capacity levels.
User is responsible for their data stored in or transmitted via CSC’s services and shall comply with all applicable laws and regulations, and related data policies. If data includes sensitive or confidential information, user is responsible for making sure that the service used to handle the data meets the required security level.
User gives CSC the right to access the data in order to secure its accessibility, quality and security, which includes, for example, keeping the data on CSC’s IT service platform. This can entail monitoring access, automated monitoring for intrusion detection, taking backups, copying or moving the data, or reproducing faults. CSC protects the confidentiality of the data within the limits of the legal framework.
No ownership rights or intellectual property rights (IPR) of the data are transferred when using the services, if not otherwise agreed. Users are responsible for sharing their data within a CSC project, according to the project’s and their own requirements. If the user leaves a CSC project, they must ensure that the rights related to the data have been agreed.
More detailed information:
- Security and privacy
- General Terms of Use for CSC’s Services for Research and Education
- Prerequisites and Responsibilities for a CSC Project Manager
Citation guideline: How to cite CSC in a paper
CSC RI. Common infrastructure for research data management, Finland
In addition, individual services may have their own citation guidance.
5. Documentation and metadata
CSC offers support, guidance and training for data documentation and metadata production as part of science support and preservation services. CSC’s best-practice guidance on metadata and documentation includes information on interoperability, machine-actionability, and recommended controlled vocabularies, thesauri and ontologies, and is available in CSC guidance on metadata and documentation.
Depending on the service and the data types, CSC services have different technical and organisational security measures, which are described in the terms of use and/or in the technical and organisational measures (TOMs). For example, the Fairdata storage service has versioning rules for published research datasets to ensure the integrity of the data.
More detailed information:
- General terms of use and privacy
- Technical and organisational security measures for protection of personal data in Fairdata Services
Beyond technical control and internal integrity, checks on the quality and accuracy of the data are the responsibility of the data owner. Data validation and quality control assessments are part of the ISO 27001 certification and FitSM guidance.
Persistent identifiers – CSC PID Policy
A persistent identifier (PID) identifies and locates an entity regardless of where it is hosted or published and enables its unambiguous and long-term identification. All datasets are to have adequate and unique persistent identifiers (PIDs) that comply with documented standards. Using and implementing PIDs facilitates linking researchers and their research outputs. Publishing data alongside PIDs, metadata and licensing information is crucial for reusability of data. The DOI is commonly used for referencing research datasets.
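As an illustration of how a DOI serves as a resolvable reference, the following minimal sketch checks that a string has the general shape of a DOI and builds a link through the public doi.org resolver. The pattern is a commonly used heuristic for modern DOIs and is an assumption here, not a rule from the CSC PID Policy; the function names and the example identifier are hypothetical.

```python
import re
from urllib.parse import quote

# Heuristic pattern for modern DOIs (assumption, not a CSC requirement):
# prefix "10.", a 4-9 digit registrant code, a slash, and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")


def is_plausible_doi(doi: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(doi))


def doi_to_url(doi: str) -> str:
    """Build a resolver link for a DOI via the public doi.org proxy."""
    if not is_plausible_doi(doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    # Percent-encode reserved characters but keep '/' readable.
    return "https://doi.org/" + quote(doi, safe="/")


if __name__ == "__main__":
    # Hypothetical identifier, used only for illustration.
    print(doi_to_url("10.1234/example-dataset.v1"))
```

A check like this only confirms the syntax. Whether the identifier actually resolves depends on it having been registered.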
Metadata
The Finnish Research Information Hub shares information on research conducted in Finland, including research datasets that have been published in Fairdata Services. Fairdata Services has its own graphical UI and end-user API for describing and publishing research data, but existing metadata can also be imported from other metadata catalogs and organizations’ own metadata repositories. A prerequisite for such organizational use is the use of PIDs in the original source, sufficient metadata, information related to access rights, scientific field of knowledge, and credible data management by the operator.
6. Access control, backup, storage, and disposal of the administrative and research data
Access control:
CSC manages the access control to its computer environment and related services and maintains with its user and identity management system a register of users and user groups as well as their purpose, access rights, authentication, limits, and responsibilities. CSC does not normally accept both anonymous upload and download for unpublished content in a single service. Only persons or user groups authorized by CSC can access, store, modify, or otherwise handle the data in CSC’s computer systems.
The user’s responsibilities are defined in more detail in the General Terms of Use for CSC’s Services for Research and Education.
Access to CSC data services is logged, and the logs can be accessed in accordance with Finnish data protection laws and with good practices in public systems and security management. In systems administration and incident handling, CSC complies with Finnish government regulations on Information Security Management and with the ISO/IEC 27001:2013 standard. The ISO 27001 certificate covers data center operations, ICT and computing platforms, IaaS cPouta and ePouta, Long-term Preservation Service, SAPA, Eduuni and Tiimeri collaboration platforms, LUMI hosting and Funet Miitti services. These information security management systems ensure that CSC possesses the capacity to manage, govern and continuously develop information security of its services and operations.
Technical and organizational security measures for protecting personal data are further defined:
Backups
Availability of backup service is defined in the service description. Only limited backups are taken, and they are not guaranteed. CSC gives no guarantee for restoring any Content and declines any liability for files lost for any reason if not otherwise agreed. CSC recommends that Users maintain an up-to-date copy of their content by other means if they cannot afford to lose it.
Storage services
Data resources granted to users can vary depending on the research needs (e.g. 1 TB of data storage space or 1 million core hours of computing). Capacity management plays an important role in ensuring that the available resources are used optimally. Maintaining the service level as requirements grow calls for continuous capacity monitoring of the data storage and computation platforms, planning of resource management, and procurement of the scientific computation environment.
CSC provides multiple options for storing data during data analysis and for storing data after a research project has ended. The storage services are defined in the CSC Service catalog. The services within CSC RI have their own SLAs stating their level of backup, integrity checking and access control. CSC also gives guidance on what to consider when choosing a suitable storage solution.
Data disposal
User is responsible for transferring or deleting data before their account is terminated or their project is closed. If this has not been done in a timely manner, data will be deleted 90 days after account termination or project closure without possibility to retrieve the data, unless otherwise agreed in writing. CSC will make a reasonable effort to notify the user before deleting data and, upon request, to let the user retrieve their data before it is deleted. Copies of data may remain temporarily on backup storage, but access will be strictly restricted.
Additional terms or requirements may apply and are made available with the relevant services. Additional terms become part of the Service Agreement (accepted Terms of Use, accepted Data Processing Agreement and Service Descriptions) with CSC if the user uses those services.
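The 90-day rule can be turned into a simple date calculation when planning a project closure. The sketch below is illustrative only. It assumes the period is counted in calendar days starting from the termination or closure date, which the policy text does not spell out, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Retention period stated in the policy. Counting it in calendar days
# from the closure date is an assumption made for this sketch.
RETENTION_DAYS = 90


def deletion_date(closure: date, retention_days: int = RETENTION_DAYS) -> date:
    """Earliest date on which data may be deleted after closure."""
    return closure + timedelta(days=retention_days)


def days_left(closure: date, today: date) -> int:
    """Days remaining to transfer or delete data; never negative."""
    return max((deletion_date(closure) - today).days, 0)


if __name__ == "__main__":
    closed = date(2024, 1, 1)  # hypothetical project closure date
    print(deletion_date(closed), days_left(closed, date(2024, 3, 1)))
```

Any written agreement with CSC takes precedence over a calculation like this one.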
See: General Terms of Use for CSC’s Services for Research and Education
7. Opening research data and/or metadata
CSC promotes the FAIR data principles of open science (as open as possible and as closed as needed) and requires compliance with national data protection laws throughout its services and operations. Data reuse is encouraged when possible. This is facilitated by storing data in repositories and managing it through appropriate licenses that specify the degree of publicity and user rights. Access restrictions can be implemented on the basis of terms of use or for contractual or legal reasons.
CSC does not publish or curate data on behalf of the user or customer organization without an agreement. The user is responsible for licensing all published data and for keeping data and the associated metadata up to date. All data should have an appointed contact point to facilitate access rights management. Read our guidance on licensing & rights.
CSC recommends using file formats approved by the Digital Preservation Service and, for life sciences data, by ELIXIR.
Active data can also be shared. The owner of the data is responsible for diligent rights management within services that enable collaboration within a project or with external partners. For example, data can be shared from Allas object storage either with a limited audience, e.g. other projects, or access can be allowed for everybody by making the data public.
Metadata can be published through the Fairdata Services which enable verifiable and reproducible science and secure preservation. The Fairdata Services metadata model is based on the DCAT ontology. The CSC RI is also aligned with the upcoming Finnish Research Information Hub to enable interoperable metadata.
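To give a sense of what a DCAT-based dataset description looks like, the sketch below assembles a minimal JSON-LD record using terms from the W3C DCAT, Dublin Core Terms and vCard vocabularies. It is not the Fairdata Services metadata model, whose exact fields are not described here; the function name, the chosen fields and all example values are hypothetical.

```python
import json


def build_dcat_record(title: str, identifier: str,
                      license_url: str, contact_email: str) -> dict:
    """Assemble a minimal DCAT-style dataset description as JSON-LD."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "vcard": "http://www.w3.org/2006/vcard/ns#",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Persistent identifier of the dataset, e.g. a DOI link.
        "dct:identifier": identifier,
        # License under which the dataset is published.
        "dct:license": {"@id": license_url},
        # Appointed contact point for access rights questions.
        "dcat:contactPoint": {
            "@type": "vcard:Kind",
            "vcard:hasEmail": {"@id": f"mailto:{contact_email}"},
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    record = build_dcat_record(
        title="Example survey dataset",
        identifier="https://doi.org/10.1234/example-dataset.v1",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        contact_email="data-contact@example.org",
    )
    print(json.dumps(record, indent=2))
```

The record carries the three items the policy asks of published data: a persistent identifier, a license, and a contact point.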
CSC SD services enable secure opening of the data, and use of sensitive data in collaborative projects: SD Connect for storing and sharing, SD Submit and Federated EGA for publishing under controlled access (in the pilot phase), and SD Apply for reusing data (in the pilot phase).