3. Using the batch job environment
At CSC, batch job systems are used to execute computing tasks in clusters and supercomputers. The Sisu supercomputer uses the SLURM (Simple Linux Utility for Resource Management) batch job system, which is also used in the Taito cluster. However, the SLURM setup differs significantly between Sisu and Taito. In Sisu, the SLURM definitions are used only to reserve computing resources. Parallel commands are launched through ALPS (Application Level Placement Scheduler): the ALPS command aprun is used instead of the srun command that is normally used in SLURM batch jobs. The aprun command also defines many resource allocation details that in other servers are defined with SLURM commands.
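The division of labour between SLURM and ALPS can be illustrated with a short sketch. The Python snippet below only assembles the text of a hypothetical Sisu-style job script: the `#SBATCH` lines reserve nodes, and the `aprun` line launches the parallel program. The function name `render_job_script`, the job name `example_job`, the program name `./my_program` and the figure of 24 cores per node (inferred from Table 3.1, where 800 nodes correspond to 19200 cores) are assumptions made for illustration and are not taken from this guide.

```python
CORES_PER_NODE = 24  # assumption: inferred from Table 3.1 (19200 cores / 800 nodes)


def render_job_script(partition, nodes, walltime, program="./my_program"):
    """Return the text of a minimal Sisu-style batch job script.

    The SLURM directives only reserve nodes; aprun (ALPS) starts the processes.
    """
    total_tasks = nodes * CORES_PER_NODE  # one process per reserved core
    lines = [
        "#!/bin/bash -l",
        "#SBATCH -J example_job",  # hypothetical job name
        f"#SBATCH -p {partition}",  # partition (queue) to use
        f"#SBATCH -N {nodes}",  # number of nodes to reserve
        f"#SBATCH -t {walltime}",  # running time limit, hh:mm:ss
        "",
        # aprun, not srun, launches the parallel program in this setup
        f"aprun -n {total_tasks} {program}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job_script("test_large", nodes=2, walltime="00:10:00"))
```

The generated text could be saved to a file and submitted with `sbatch`. Note that the process count is passed to `aprun` rather than to SLURM, which reflects the arrangement described above.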
Table 3.1 Batch job partitions in Sisu
|Partition||Minimum number of nodes||Maximum number of nodes||Maximum number of cores||Maximum running time||Notes|
|test_large||1||800||19200||4 hours||Jobs in this partition have very low priority.|
|large||24||400||9600||72 hours||Scaling tests are required for jobs that use more than 42 nodes.|
|gc||Special queue for grand challenge jobs. Limits will be set based on the needs of the projects.|
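As a rough illustration of how the limits in Table 3.1 constrain a job request, the following Python sketch checks a node count and running time against the two partitions with fixed limits. The names `PARTITIONS`, `check_request` and `total_cores` are hypothetical helpers, and the value of 24 cores per node is an assumption derived from the table (19200 cores / 800 nodes). The `gc` partition is left out because its limits are set per project.

```python
# Limits copied from Table 3.1: (minimum nodes, maximum nodes, maximum hours).
PARTITIONS = {
    "test_large": (1, 800, 4),
    "large": (24, 400, 72),
}
CORES_PER_NODE = 24  # assumption: 19200 cores / 800 nodes in Table 3.1


def check_request(partition, nodes, hours):
    """Return a list of problems; an empty list means the request fits."""
    if partition not in PARTITIONS:
        return [f"unknown partition: {partition}"]
    min_nodes, max_nodes, max_hours = PARTITIONS[partition]
    problems = []
    if nodes < min_nodes:
        problems.append(f"{partition} needs at least {min_nodes} nodes")
    if nodes > max_nodes:
        problems.append(f"{partition} allows at most {max_nodes} nodes")
    if hours > max_hours:
        problems.append(f"{partition} allows at most {max_hours} hours")
    return problems


def total_cores(nodes):
    """Number of cores reserved when whole nodes are allocated."""
    return nodes * CORES_PER_NODE


if __name__ == "__main__":
    print(check_request("large", nodes=12, hours=1))
    print(total_cores(400))
```

For example, a 12-node request to `large` is rejected by this check because that partition requires at least 24 nodes, while 400 nodes correspond to the 9600 cores listed as the maximum for `large`.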