Queues
Note - SLURM uses the word partition, but to avoid confusion for people used to using other queueing systems, we use the word queue.
Before creating and submitting jobs, it’s useful to understand the concept of queues, what queues are available for use, and the differences between them.
Queues form part of the cluster functionality provided by the scheduler (SLURM), the software used to balance node utilisation across the ADA cluster.
A queue is a pre-defined collection of nodes of a specific type, generally tailored for running a specific type of job. So your first task before running a job is to decide which queue it should be run on.
Note
You need to specify the time for your job - this is different from hpc.uea.ac.uk.
If you don't, you will be allocated the default job length of 24 hours.
The maximum job length is 7 days (168 hours). Jobs exceeding this will be killed automatically.
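As a rough illustration, a batch script selects a queue with --partition (or -p) and sets a wall-clock limit with --time. This is only a minimal sketch: the job name and program below are placeholders, not examples taken from this page.

```bash
#!/bin/bash
#SBATCH --partition=compute-16-64   # queue to run the job on (-p also works)
#SBATCH --time=2-00:00:00           # wall-clock limit: 2 days (maximum is 7-00:00:00)
#SBATCH --job-name=example_job      # hypothetical job name
#SBATCH --ntasks=1                  # a single task/core

# placeholder for your own commands
./my_program
```

If --time is omitted, the job gets the 24 hour default shown in the table below.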
The queue names on ADA are based on the hardware type, the number of processors, and the amount of RAM per node, as follows:
Queue Name | Slots Available | Default Time | Maximum Time | RAM Per Core | Priority | QOS | Description |
--- | --- | --- | --- | --- | --- | --- | --- |
compute-16-64 (default queue) | 2112 | 24 hours | 7 days | 4Gb | 30 | - | standard compute node |
compute-24-96 | 1584 | 24 hours | 7 days | 4Gb | 30 | - | enhanced compute node |
compute-24-128 | 840 | 24 hours | 7 days | 5Gb | 30 | - | enhanced compute node |
interactive | 192 | 24 hours | 7 days | 4Gb | 30 | - | interactive node |
ib-24-96 | 672 | 24 hours | 7 days | 4Gb | 30 | ib | parallel ib mellanox |
ib-28-128 | 812 | 24 hours | 7 days | 4Gb | 30 | ib | parallel ib mellanox |
hmem-512 | 32 | 24 hours | 7 days | 32Gb | 20 | hmem | High Memory |
hmem-754 | 192 | 24 hours | 7 days | 32Gb | 20 | hmem | High Memory |
gpu-P5000-2 | 8 GPU (2 per node) | 24 hours | 7 days | 16Gb per CPU, 192Gb per GPU | 30 | gpu | GPU |
gpu-K40-1 | 7 GPU (1 per node) | 24 hours | 7 days | 16Gb per CPU, 64Gb per GPU | 30 | gpu | GPU |
gpu-V100-2 | 4 GPU (2 per node) | 24 hours | 7 days | 16Gb per CPU, 192Gb per GPU | 30 | gpu | GPU |
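To check the queues (partitions) currently defined on the cluster, along with their time limits and node states, you can use the standard SLURM sinfo command:

```bash
# List the available queues, their time limits and node states
sinfo

# One-line summary per queue
sinfo -s
```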
QOS
Access to the more specialist hardware queues is restricted by the use of a QOS (Quality of Service). This enables us to reserve the more limited resources for jobs that need the specialised hardware. If you think you need such access, please mail us with the reason why you want to use that resource. We will either add you to the relevant QOS or explain which other resources you should use.
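As a sketch, once your account has been added to the relevant QOS, a job for one of the specialist queues names both the queue and the QOS. The job script below assumes access to the gpu QOS; the program name is a placeholder.

```bash
#!/bin/bash
#SBATCH --partition=gpu-P5000-2   # GPU queue from the table above
#SBATCH --qos=gpu                 # QOS needed for the GPU queues
#SBATCH --gres=gpu:1              # request one GPU on the node
#SBATCH --time=24:00:00

# placeholder for your own GPU program
./my_gpu_program
```

Without the matching QOS on your account, the scheduler will reject jobs submitted to these queues.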
Memory on the queues
The RAM Per Core figure is the amount assumed to be available to your job if you don't specify a RAM limit. It is the total RAM divided by the number of cores on the machine. The total amount of RAM on the machine is the final figure in the queue name.
If your job uses more than the assumed amount without asking for it, it may impact other users of the node, causing jobs to slow down as memory is swapped between the different processes.
You can specify more than that figure if you want to use more RAM - please see the memory section of this site.
If you think your job is going to need more RAM per core than is available on a given queue, please move to a queue with more RAM available. If your job is likely to need more than 120Gb of RAM then you will need to use a high memory queue.
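As a minimal sketch, a memory request can be made with the standard SLURM --mem (total RAM for the job) or --mem-per-cpu (RAM per allocated core) options; the figures and program name below are illustrative only.

```bash
#!/bin/bash
#SBATCH --partition=compute-24-128
#SBATCH --mem=24G             # total RAM for the job
##SBATCH --mem-per-cpu=8G     # alternative: RAM per core (use one or the other, not both)
#SBATCH --time=24:00:00

# placeholder for your own program
./my_program
```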