2.4 Running parallel applications in FGCI

In FGCI you can utilize thread-based (e.g. OpenMP or POSIX threads) and MPI-based parallel computing. In thread-based parallel computing the number of parallel threads is limited by the structure of the hardware: all the threads must run within the same node. Thus, on the older FGI machines, thread-based programs cannot use more than 12 computing cores. On the new FGCI hardware one node contains 24 cores and, as hyperthreading is enabled, one node can run 48 simultaneous threads.
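
For a generic OpenMP program (not used in the examples below), the thread count is typically set with the OMP_NUM_THREADS environment variable in the command script. The sketch below is a minimal example only; the executable name my_openmp_program is hypothetical.

#!/bin/bash
# Run one thread per physical core of a 24-core node; the job description
# should request the same number of cores with (count=24) and ENV/ONENODE.
export OMP_NUM_THREADS=24
./my_openmp_program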

In MPI computing the parallel processes can be distributed over several computing nodes, so there is no technical limit to the number of cores that can be used. However, all parallel implementations benefit from parallelization only up to a certain extent. Beyond some application- and analysis-dependent limit, using a larger number of cores is no longer worthwhile. Because of that, scaling tests, where the application is run with different core counts, should be carried out before the actual production runs.
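
One simple way to run such a test is to submit the same job with a few different core counts and compare the reported run times. The sketch below assumes a hypothetical template file job_template.xrsl in which the core count has been written as the placeholder NCORES.

#!/bin/bash
# Create and submit one job description per core count to be tested
for ncores in 4 8 16 32; do
    sed "s/NCORES/${ncores}/" job_template.xrsl > job_${ncores}.xrsl
    arcsub job_${ncores}.xrsl
done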


2.4.1 Executing thread-based parallel software in FGCI

For many pre-installed thread-parallel programs, the program's Runtime Environment automatically sets up the parameters required for parallel job execution. However, if you use your own software, you need to add some extra definitions to the job description file.

In the following example we use a software package called SOAPdenovo to run a sequence assembly job in FGCI. SOAPdenovo is not available as a Runtime Environment in FGCI. However, you can download pre-compiled Linux executables from the SOAPdenovo home page. These executables can be copied to the remote cluster together with the other input files. In this example we use the executable SOAPdenovo-31mer, the job configuration file soap.config and the input dataset datape.fasta. SOAPdenovo produces a large set of result directories and files, so in this case it is handy to use the output definition ( "/" "" ), which specifies that all files in the execution directory will be retrieved.

&
(executable=runsoapdenovo.sh)
(jobname=soapdenovo)
(stdout=std.out)
(stderr=std.err)
(gmlog=gridlog)
(walltime=24h)
(memory=2000)
(count=12)
(runtimeenvironment="ENV/ONENODE")
(inputfiles=
( "SOAPdenovo-31mer" "SOAPdenovo-31mer" )
( "soap.config" "soap.config" )
( "datape.fasta" "datape.fasta")
)
(outputfiles=
( "/" "" )
)

The definition "(runtimeenvironment="ENV/ONENODE") " is essential for threads based parallel jobs. This definition ensures that all the cores, that the job uses, will be in the same computing node.

In this case we use 12 computing cores (count=12), which is the maximum for thread-based parallel jobs on the old FGI servers. In the ARC environment the memory reservation is given per core: here (memory=2000) reserves 2 GB for each core, so the job requires a total of 24 GB of memory. Whenever you change the number of cores, you should also check the memory reservation.
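
For example, if the core count were doubled to 24 (a full node on the new FGCI hardware), halving the per-core reservation would keep the total at 24 GB. A hypothetical adjustment of the two attributes:

(* 24 cores x 1000 MB per core = 24 GB in total *)
(count=24)
(memory=1000)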

In the command script runsoapdenovo.sh below, we first use the chmod command to give execution permission to the executable copied to the remote cluster. For the SOAPdenovo-31mer command, the number of computing cores to use is given with the option -p. At the end of the script the input files are deleted with rm commands; this avoids copying the input files back from the grid environment unnecessarily.

#!/bin/bash
echo "Hello SOAPdenovo!"
# Give execution permission to the executable copied to the remote cluster
chmod u+x SOAPdenovo-31mer

# Run the assembly on 12 cores (-p 12), matching count=12 in the job description
./SOAPdenovo-31mer all -s soap.config -K 23 -p 12 -o soap23

# Remove the input files so that they are not copied back with the results
rm -f datape.fasta
rm -f SOAPdenovo-31mer
rm -f soap.config

echo "Bye SOAPdenovo!"

The sample job above could be executed using the normal arcsub, arcstat and arcget commands.
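
For example, assuming the job description above has been saved as soapdenovo.xrsl (a hypothetical file name):

arcsub soapdenovo.xrsl     # submit the job; the command prints the job ID
arcstat <job-id>           # check the status of the job
arcget <job-id>            # retrieve the results once the job has finished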

2.4.2 Executing MPI-based parallel programs in the FGCI environment

The way MPI-based applications are launched in the FGCI environment may differ from application to application. For application-specific details, please check the application's runtime environment page on the FGCI User pages. As MPI jobs can utilize several computing nodes, the ENV/ONENODE definition used with thread-based parallel jobs is not needed in the job description file. However, just as with thread-based parallel jobs, you should always remember to check the memory reservation when the number of computing cores is changed.

A simple Gromacs run is used here as an example of an MPI-based parallel job. The job description file gromacs.xrsl below reserves 32 computing cores (count=32), 500 MB of memory per core (16 GB in total) and 24 hours of computing time. The pre-installed Gromacs is taken into use with the runtime environment definition (runtimeenvironment>="APPS/CHEM/GROMACS-4.5.5").

&(executable=rungromacs.sh)
(jobname=gromacs)
(stdout=std.out)
(stderr=std.err)
(runtimeenvironment>="APPS/CHEM/GROMACS-4.5.5")

(gmlog=gridlog_1)
(walltime="24 hour")
(memory=500)
(count=32)
(inputfiles=( "topol.tpr" "topol.tpr" ))
(outputfiles= ( "output.tar.gz" "output.tar.gz" ) )

In the command script, the MPI version of the Gromacs molecular dynamics engine, mdrun_mpi, is launched using the mpirun command. When the Gromacs run has finished, all the files in the remote execution directory are packed into a single gzip-compressed tar file.

#!/bin/sh
echo "Hello GROMACS!"
# Launch the MPI version of the Gromacs molecular dynamics engine
mpirun mdrun_mpi -s topol.tpr
exitcode=$?
# Pack all files in the execution directory into a single gzip-compressed tar file
tar cf output.tar ./*
gzip output.tar
echo "Bye GROMACS!"
exit $exitcode

The sample job above could be executed using the normal arcsub, arcstat and arcget commands.
