
How to choose CPU architecture on Taito

About half of the Taito nodes have Sandy Bridge CPUs and the rest have Haswell CPUs. All programs that run on Sandy Bridge nodes should also run on Haswell nodes, but programs that have been compiled specifically to use the newer Haswell instructions will not run on Sandy Bridge nodes. More information on how to compile for Haswell is available here. The Haswell nodes have 24 cores, the Sandy Bridge nodes have 16 cores, and the Haswell CPUs are slightly faster than the Sandy Bridge CPUs.
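
The details are in the linked compilation instructions; purely as an illustration (these are generic GCC flags, not the Taito-specific recommendations from that page), targeting each architecture could look like this:

# Illustration only: generic GCC architecture flags, not Taito-specific advice.
gcc -O2 -march=sandybridge -o myprog myprog.c   # binary runs on both Sandy Bridge and Haswell nodes
gcc -O2 -march=haswell -o myprog myprog.c       # uses Haswell instructions (AVX2/FMA); will not run on Sandy Bridge nodes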

The Batch Job Script Wizard in the Scientist's User Interface is an interactive tool for creating batch scripts. You can also use it to specify the correct architecture options for your batch script.

Most jobs can run on either architecture, but a single parallel job cannot use a mixture of both. The queuing system (SLURM) takes care of this. To minimize queuing time, the queuing system by default places jobs in the serial and longrun queues on whichever nodes are available, regardless of architecture.

If you want to use only one of the architectures, you can constrain the job by adding one of the following lines to your batch job script:

for Sandy Bridge architecture:

#SBATCH --constraint=snb

for Haswell architecture:

#SBATCH --constraint=hsw

for either architecture:

#SBATCH --constraint="[snb|hsw]"

Parallel and test jobs (in the parallel and test queues) may by default be placed on either Sandy Bridge or Haswell nodes. Thus, if you want your parallel job to run on the Haswell architecture, put this line in your batch script:

for Haswell architecture:

#SBATCH --constraint=hsw

CSC strongly recommends using full nodes to run parallel jobs. For Sandy Bridge nodes this means requesting a multiple of 16 cores (16, 32, 48, ...), and for Haswell nodes a multiple of 24 cores (24, 48, 72, ...). For an MPI-parallel job you can either request the total number of tasks directly (e.g. -n 72) or the number of nodes and tasks per node (e.g. -N 3 --ntasks-per-node=24) and SLURM will do the math. If you request tasks directly, it is a good idea to make sure that your parallel job does not spread over more nodes than necessary. Do this by adding a line specifying the allowed number of nodes to your batch script.

For example, for 48 Sandy Bridge cores (3 full nodes) it would be:

#SBATCH --constraint=snb
#SBATCH --ntasks-per-node=16
#SBATCH -N 3
...
srun ./your_mpi_program

and for 3 full Haswell nodes:

#SBATCH --constraint=hsw
#SBATCH --ntasks-per-node=24
#SBATCH -N 3
...
srun ./your_mpi_program

Note that in this case you don't need to give srun the number of tasks explicitly.
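
The same three full Haswell nodes can also be requested by giving the total task count directly; the -N line keeps the job from spreading over more than three nodes (72 = 3 × 24):

#SBATCH --constraint=hsw
#SBATCH -n 72
#SBATCH -N 3
...
srun ./your_mpi_program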