How can I run hugemem jobs that are longer than 7 days?


The hugemem partition of Taito is intended for computing tasks that require more than 256 GB of memory. However, Taito has only six nodes that can handle this kind of job, so the hugemem queue should be used only when large memory is really needed. By default, the hugemem queue accepts jobs with a maximum run time of 7 days. Occasionally, however, CSC customers have computing tasks that require even longer run times.

For these kinds of jobs, the maximum execution time can be extended to 14 days. The extended time reservation is enabled with the SBATCH option --qos=hugememlong. For example, the batch job script below submits a job that runs the SPAdes software using 1.4 TB of memory and 16 cores for up to 14 days:

#!/bin/bash -l
#SBATCH -J spades
#SBATCH -t 14-00:00:00
#SBATCH -p hugemem
#SBATCH -n 1
#SBATCH --nodes=1
#SBATCH --mem=1400000
#SBATCH --qos=hugememlong
#SBATCH --cpus-per-task=16
#SBATCH -e errors.txt
#SBATCH -o output.txt

# Load the SPAdes module first, then start the assembler on its own line
module load spades
spades.py --pe1-1 test1_R2.gz --pe1-2 test1_R1.fastq.gz \
 -t 16 -o Spades_long --trusted-contigs trusted.fa \
 --untrusted-contigs untrusted.fa -m 1400
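
The two memory values in the script above use different units: the --mem option of SBATCH is given in megabytes by default, while the SPAdes -m option is given in gigabytes. The short sketch below shows how the two numbers relate. The variable names mem_gb and mem_mb are only illustrative, and the conversion factor of 1000 follows the convention used in the example script (1.4 TB = 1400000 MB).

```shell
#!/bin/bash
# Memory the job needs, in gigabytes (1400 GB = 1.4 TB, as in the example)
mem_gb=1400

# SBATCH --mem takes megabytes by default; the example uses 1 GB = 1000 MB
mem_mb=$((mem_gb * 1000))

# Print the two values as they would appear in the batch script
echo "#SBATCH --mem=${mem_mb}"
echo "SPAdes memory limit: -m ${mem_gb}"
```

If the memory request is changed, both values should be updated together so that SPAdes does not try to use more memory than the batch job has reserved.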

As these long jobs reserve hugemem nodes for very long times, the queue is configured so that each user can have only one job running at a time.
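
Because of the one-job limit, it can be useful to check whether you already have a job running in the hugemem partition before submitting a new one. The following is a minimal sketch, not an official CSC tool: the function name can_submit_long_job and the script name long_job.sh are hypothetical. It also assumes that counting your running jobs in the hugemem partition is a good enough check for the limit.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the given number of running jobs is zero
can_submit_long_job() {
    local running_jobs="$1"
    [ "$running_jobs" -eq 0 ]
}

# Run the check only on a system where the Slurm squeue command is available
if command -v squeue >/dev/null 2>&1; then
    # Count this user's running jobs in the hugemem partition (-h drops the header line)
    running=$(( $(squeue -u "${USER:-$(id -un)}" -p hugemem -t RUNNING -h | wc -l) ))
    if can_submit_long_job "$running"; then
        echo "No running hugemem job found: sbatch long_job.sh can be submitted"
    else
        echo "You already have ${running} running hugemem job(s); a new job would wait in the queue"
    fi
fi
```

A job submitted while another one is still running is not lost, but it may stay pending until the earlier job has finished.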