1.1 Taito supercluster
The Taito supercluster (taito.csc.fi) is intended for serial and medium-sized parallel tasks as well as jobs that require a lot of memory. Researchers who want to use Taito should 1) register as CSC users and then 2) apply for a computing project. This registration process is described in chapters 1.2.1 and 220.127.116.11 of the CSC computing environment user guide.
A computing project at CSC has a common computing quota that can be extended by application. Use of Taito or any other server will consume the computing quota granted to the project. One core hour in Taito consumes 2 billing units from the computing quota of the project.
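The billing rule above is plain arithmetic. A minimal sketch follows; the function name and the example job size are illustrative assumptions, not a tool provided by CSC:

```python
# Billing units consumed per core hour on Taito, as stated above.
BILLING_UNITS_PER_CORE_HOUR = 2


def billing_units(cores, hours):
    """Return the billing units consumed by a job using `cores` cores for `hours` hours."""
    if cores < 0 or hours < 0:
        raise ValueError("cores and hours must be non-negative")
    return cores * hours * BILLING_UNITS_PER_CORE_HOUR


# Hypothetical example: one full 24-core Haswell node reserved for 3 days (72 hours).
print(billing_units(24, 72))  # 3456
```

Estimating the cost this way before submitting a long job helps to check that the project quota is sufficient.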
Taito users may have up to 896 batch jobs submitted for execution at the same time. A single job can use at most 448 compute cores on Sandy Bridge CPUs or 672 compute cores on Haswell CPUs.
Table 1.1 Available batch job queues in supercluster taito.csc.fi.
|Queue||Number of cores||Maximum run time|
|serial (default)||16 / 24 (one node*)||3 days|
|parallel||448 / 672 (28 nodes*)||3 days|
|longrun||16 / 24 (one node*)||14 days|
|test||32 / 48 (two nodes*)||30 min|
|hugemem||80 (two Haswell hugemem nodes)||7 days|
* Sandy Bridge / Haswell (one Sandy Bridge node consists of 16 and Haswell one of 24 cores)
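The limits in Table 1.1 can be written down as a small lookup for checking a planned job request. This is only a sketch: the dictionary layout, the key names and the `fits_queue` helper are assumptions made for illustration, and the batch system itself remains the authority on what it accepts.

```python
# Limits transcribed from Table 1.1. Core limits are per CPU type; run times are in hours.
# Table 1.1 lists only a Haswell figure for the hugemem queue.
QUEUE_LIMITS = {
    "serial":   {"cores": {"sandybridge": 16,  "haswell": 24},  "max_hours": 3 * 24},
    "parallel": {"cores": {"sandybridge": 448, "haswell": 672}, "max_hours": 3 * 24},
    "longrun":  {"cores": {"sandybridge": 16,  "haswell": 24},  "max_hours": 14 * 24},
    "test":     {"cores": {"sandybridge": 32,  "haswell": 48},  "max_hours": 0.5},
    "hugemem":  {"cores": {"haswell": 80},                      "max_hours": 7 * 24},
}


def fits_queue(queue, cpu, cores, hours):
    """Return True if a request for `cores` cores and `hours` hours fits the queue limits."""
    limits = QUEUE_LIMITS[queue]
    max_cores = limits["cores"].get(cpu)
    if max_cores is None:
        # No limit is listed in the table for this CPU type in this queue.
        return False
    return 1 <= cores <= max_cores and 0 < hours <= limits["max_hours"]


# Hypothetical requests:
print(fits_queue("parallel", "haswell", 672, 72))      # True
print(fits_queue("parallel", "sandybridge", 672, 72))  # False
```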
In Taito you don't need to send the scaling test results of your parallel code to CSC. However, you should still make sure that you are using the resources efficiently, i.e. that your code, with the input you are using, does scale to the selected number of cores. The rule of thumb is that when you double the number of cores, the job should run at least 1.5 times faster. If it doesn't, you should use fewer cores. Note that scaling depends on the input (model system) as well as on the code, so you may need to test scaling separately for different model systems even when using the same code. If you are unsure, contact CSC Service Desk.
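The rule of thumb above can be checked from two measured run times. The sketch below assumes you have timed the same input on N and on 2N cores; the function name and the timings are hypothetical:

```python
def scales_well(time_n_cores, time_2n_cores, min_speedup=1.5):
    """Return True if doubling the core count made the job at least `min_speedup` times faster."""
    if time_n_cores <= 0 or time_2n_cores <= 0:
        raise ValueError("run times must be positive")
    return time_n_cores / time_2n_cores >= min_speedup


# Hypothetical wall-clock times in seconds for the same input on 16 and 32 cores.
print(scales_well(1000.0, 600.0))  # True  (speedup about 1.67)
print(scales_well(1000.0, 800.0))  # False (speedup 1.25)
```

In the second case the rule suggests staying with the smaller core count.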
Taito (taito.csc.fi) is a 16-cabinet HP cluster based on commodity off-the-shelf building blocks. The theoretical peak performance of the cluster, calculated from the aggregate performance of the computing nodes, is about 600 TFlop/s.
Taito has a total of 985 compute nodes that use either Intel Haswell or Sandy Bridge processors, which are well suited for high performance computing. There are 407 Apollo 6000 XL230a Gen.9 server blades, installed in November 2014, and 496 (as of Sep 10th 2015) older HP Proliant SL 230s Gen8 half-tray servers. Each of the 407 newer servers hosts two twelve-core Intel Haswell E5-2690v3 processors running at 2.6 GHz. Each of the older servers hosts two eight-core Intel Sandy Bridge processors running at 2.6 GHz (Intel E5-2670, 64-bit). In total, this gives 17704 computing cores in the cluster.
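The core total quoted above follows from the two server counts and their socket configuration. As a short worked check (the variable names are illustrative):

```python
# Node counts and per-node core counts taken from the paragraph above.
haswell_nodes = 407
haswell_cores_per_node = 2 * 12        # two twelve-core processors
sandy_bridge_nodes = 496
sandy_bridge_cores_per_node = 2 * 8    # two eight-core processors

haswell_cores = haswell_nodes * haswell_cores_per_node
sandy_bridge_cores = sandy_bridge_nodes * sandy_bridge_cores_per_node

print(haswell_cores)                       # 9768
print(sandy_bridge_cores)                  # 7936
print(haswell_cores + sandy_bridge_cores)  # 17704
```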
The Haswell processors comprise several components: twelve cores with individual L1 and L2 caches, an integrated memory controller, three QPI links, and an L3 cache shared within the socket. The processor supports several instruction sets, most notably the Advanced Vector Extensions 2 (AVX2) instruction set; older instruction sets are still supported as well. Each Haswell core has a dedicated 32 KB L1 cache and a 256 KB L2 cache. The 30 MB L3 cache is shared among the cores of a socket. Most of the Haswell nodes (397) have eight 16 GB DDR4 DIMMs operating at 2133 MHz, for a total of 128 GB per compute node. This means that there are 5.3 GB of memory available per core. There are also ten Haswell nodes with sixteen 16 GB 2133 MHz DIMMs, that is, 256 GB of DDR4 memory per node and 10.7 GB per core.
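The per-core memory figures follow from the DIMM configuration and the 24 cores of a Haswell node. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def memory_per_core_gb(dimm_count, dimm_size_gb, cores_per_node):
    """Return (memory per node, memory per core) in GB for a given DIMM configuration."""
    node_memory = dimm_count * dimm_size_gb
    return node_memory, node_memory / cores_per_node


# Standard Haswell node: eight 16 GB DIMMs, 24 cores.
node_gb, per_core_gb = memory_per_core_gb(8, 16, 24)
print(node_gb, round(per_core_gb, 1))  # 128 5.3

# Haswell big memory node: sixteen 16 GB DIMMs, 24 cores.
node_gb, per_core_gb = memory_per_core_gb(16, 16, 24)
print(node_gb, round(per_core_gb, 1))  # 256 10.7
```

These per-core values are a useful upper bound when deciding how much memory to request per core in a batch job.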
Sandy Bridge nodes
The Sandy Bridge processors comprise several components: eight cores with individual L1 and L2 caches, an integrated memory controller, three QPI links, and an L3 cache shared within the socket. The processor supports several instruction sets, for example the Advanced Vector Extensions (AVX) instruction set; older instruction sets are still supported as well. Note, however, that the AVX2 instruction set, available in Haswell processors, is not supported in Sandy Bridge processors.
Each Sandy Bridge core has a dedicated 32 KB L1 cache and a 256 KB L2 cache. The 20 MB L3 cache is shared among the cores of a socket. While most of the nodes have 64 GB of dedicated memory (4 GB per core), 16 of them have 256 GB per node (we refer to them as "bigmem nodes").
For jobs requiring very large memory, Taito includes six hugemem nodes, each having 1.5 TB of memory:
- two HP Proliant DL560 nodes, with 2.7 GHz Sandy Bridge processors with 32 cores (four eight-core sockets) and 2 TB of local temporary storage space.
- four Dell R930 nodes, with 2.8 GHz Haswell processors with 40 cores and 2.6 TB of local SSD-based fast temporary storage space.
In addition to the computing nodes, Taito has four login nodes, two of which (taito-login3 and taito-login4) are used for logging into the system, submitting jobs, I/O and service usage. The two other login nodes act as the front ends for the GPGPU and MIC hardware linked to the cluster. The login nodes are HP Proliant DL380 G8 servers.
Communication among nodes and to the storage is done over an Infiniband FDR fabric, which provides low latency and high throughput connectivity. The high speed interconnect is provided by 58 Mellanox Infiniband FDR switches with 36 ports each, and by the Infiniband HCAs installed on each computing node. The network topology of the cluster is a 4:1 pruned tree fabric.
Table 1.2 Configuration of the taito.csc.fi supercluster. The aggregate performance of the system is 600 TFlop/s.
|Node type||Number of nodes||Node model||Number of cores / node||Total number of cores||Memory / node|
|Sandy Bridge login node||4||HP Proliant DL380 G8||16||64||64 / 192 GB|
|Haswell compute node||397||HP Apollo 6000 XL230a G9||24||9528||128 GB|
|Haswell big memory node||10||HP Apollo 6000 XL230a G9||24||240||256 GB|
|Sandy Bridge compute node||496||HP SL230s G8||16||7936||64 GB|
|Sandy Bridge big memory node||16||HP SL230s G8||16||256||256 GB|
|Sandy Bridge huge memory node||2||HP Proliant DL560||32||64||1.5 TB|
|Haswell huge memory node||4||Dell R930||40||160||1.5 TB|
The following commands give useful information about the whole Taito system or about the node you are currently logged in to.
To get a quick overview of the status of all Taito compute nodes, use the following command:
The above command prints information in a compute node oriented format. Alternatively, you can get the information in a partition/queue oriented format with the command:
For information about the disk systems, you can use the following command:
Details about the available processors on the current node can be checked with:
And details about the current memory usage on the node are shown with: