Use Case | 5.12.2024

Large-scale simulations

Studying the dynamics of biomolecular systems often requires large models on the order of a few million atoms. Understanding how the system components relate to one another requires simulating not just one such system but also similar variants, so that the differences caused by key components can be pinpointed. The simulations also need to be long, on the order of a few microseconds. Such simulations are not possible without high-performance computing resources and enough storage capacity for the large volume of output data they produce.

In practice, this use case requires a few concurrent simulations, each occupying tens of compute nodes. Each simulation is run for a few days at a time. After each run, the simulation is checked to confirm that it completed without errors, and it is then continued. This cycle repeats until the total run time reaches several weeks.
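To get a feel for the scale described above, the following minimal sketch estimates how many chained runs and how many node-hours such a campaign might need. The function name `plan_campaign` and all input values (throughput, run length, node count, number of simulations) are hypothetical placeholders chosen for illustration, not CSC recommendations. Queueing time is not included.

```python
import math


def plan_campaign(target_ns, ns_per_day, days_per_run, nodes_per_sim, n_sims):
    """Estimate chained runs and node-hours for a simulation campaign.

    All parameters are illustrative assumptions:
      target_ns      simulated time wanted per system, in nanoseconds
      ns_per_day     throughput of one simulation, in ns per wall-clock day
      days_per_run   length of one batch job, in days
      nodes_per_sim  compute nodes used by one simulation
      n_sims         number of concurrent simulations
    """
    if min(target_ns, ns_per_day, days_per_run) <= 0:
        raise ValueError("target_ns, ns_per_day and days_per_run must be positive")

    ns_per_run = ns_per_day * days_per_run
    # Each simulation is split into jobs of days_per_run; round up the last one.
    runs_per_sim = math.ceil(target_ns / ns_per_run)
    wall_days_per_sim = target_ns / ns_per_day
    total_node_hours = wall_days_per_sim * 24 * nodes_per_sim * n_sims

    return {
        "runs_per_sim": runs_per_sim,
        "wall_days_per_sim": wall_days_per_sim,
        "total_node_hours": total_node_hours,
    }


# Hypothetical example: 2 microseconds per system at 50 ns/day,
# 3-day jobs, 32 nodes per simulation, 4 concurrent simulations.
plan = plan_campaign(
    target_ns=2000, ns_per_day=50, days_per_run=3, nodes_per_sim=32, n_sims=4
)
print(plan)
```

With these assumed numbers, each simulation needs 14 chained jobs and about 40 days of run time, and the campaign as a whole consumes 122,880 node-hours. Figures like these are useful when estimating the resources a project will need.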

This use case can be accomplished by using several CSC services together:

Preparation

Create a CSC account, then create or join a CSC project. Browse the available molecular dynamics applications in Docs CSC and learn how to run them efficiently on CSC supercomputers. Don’t hesitate to contact our Service Desk if you need support in selecting suitable tools and methods.

How to start | Available applications in Docs CSC | Contact CSC Service Desk

Running the simulations

The simulations are run on one of CSC’s supercomputers. If the chosen molecular dynamics software runs on GPUs, LUMI is the best option. If not, both Mahti and the CPU partition of LUMI are good choices.

High-performance computing | LUMI supercomputer | Mahti supercomputer

Storing and sharing output data

During the research project, output data should be stored in the Allas object storage service. Data stored in Allas can also be shared with your collaborators.
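Before uploading, it can help to pack the output of each run into a single archive and record a checksum, so that collaborators can verify what they download. The sketch below shows one way to do this with the Python standard library. It rests on the assumption that a few large objects are easier to manage in object storage than many small files. The function name `bundle_run` and the directory layout are hypothetical. The upload itself is left out, since it depends on which Allas client you use.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def bundle_run(run_dir, archive_path):
    """Pack run_dir into one tar archive and return the archive's SHA-256.

    archive_path must be located outside run_dir, otherwise the archive
    would try to include itself.
    """
    run_dir = Path(run_dir)
    archive_path = Path(archive_path)

    # Plain tar without compression: trajectory formats are often
    # already compressed, so gzip would mostly cost time.
    with tarfile.open(archive_path, "w") as tar:
        tar.add(run_dir, arcname=run_dir.name)

    # Hash the archive in 1 MiB chunks so large files do not fill memory.
    digest = hashlib.sha256()
    with open(archive_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Self-contained demonstration with a throwaway directory and a dummy file.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "run_001"          # hypothetical run directory name
    run.mkdir()
    (run / "md.log").write_text("dummy log contents\n")

    archive = Path(tmp) / "run_001.tar"
    checksum = bundle_run(run, archive)

    with tarfile.open(archive) as tar:
        members = sorted(tar.getnames())

    print(archive.name, checksum)
    print(members)
```

Storing the checksum next to the archive, for example in a small text file uploaded alongside it, lets anyone who later downloads the data confirm that the transfer was complete.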

Allas object storage service | Storing and sharing data

Publishing the data

After you have finished all simulations and analyzed the data, we recommend publishing the results so that others can discover and reuse them. With Fairdata services you can easily create descriptive metadata for your dataset. Each published dataset gets a landing page and a persistent identifier (DOI/URN).

Fairdata services | Publish and discover data