- We have upgraded the current login node to a newer operating system and kernel, providing security and performance benefits
- We have introduced a secondary login node to provide more HPC login resilience
- We have upgraded the current 150 servers/compute nodes to a newer operating system and kernel, providing security and performance benefits
- We have upgraded 4 of the 8 GPU servers/nodes with a newer operating system and kernel, providing security and performance benefits
- We have compiled a new cuda module to provide access to the latest cuda-8 toolkit/environment and updated the associated cuda driver (more details to follow) – Please don’t use this until advised!
- We have installed, imaged, and configured an additional 104 nodes/servers into the HPC environment, split as follows:
b. 36 x Infiniband servers/nodes running on Broadwell CPU architecture with 128GB of DDR4 memory on each node and 24 CPU cores – These IB nodes will utilise the latest IB interconnect (FDR and associated fabric) providing 56Gb/s IB performance
c. Installed and configured a new Infiniband 56Gb/s FDR network fabric to provide high speed interconnect/networking to the IB nodes
d. 2 x huge memory servers/nodes running on Broadwell CPU architecture with 512GB of DDR4 memory and 16 CPU cores on each node
e. 2 x huge memory servers/nodes running on Broadwell CPU architecture with 512GB of DDR4 memory and 16 CPU cores on each node, for a group of HPC/Bio users who invested in HPC equipment for their own dedicated usage
f. This upgrade provides you with an additional 1680 CPU cores
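As a rough guide to what the new FDR fabric delivers per link: FDR InfiniBand runs 4 lanes at 14.0625 Gb/s each with 64b/66b encoding. The figures below are the standard FDR line-rate numbers, not measurements taken on our fabric:

```python
# FDR InfiniBand 4x link: 4 lanes x 14.0625 Gb/s = 56.25 Gb/s signalling rate,
# with 64b/66b encoding leaving ~54.5 Gb/s of effective data rate.
lanes = 4
lane_rate_gbps = 14.0625
raw_gbps = lanes * lane_rate_gbps          # the "56Gb/s" headline figure
effective_gbps = raw_gbps * 64 / 66        # after encoding overhead
print(f"raw: {raw_gbps} Gb/s, effective: {effective_gbps:.2f} Gb/s")
```

Actual throughput between two IB nodes will be a little lower again once transport overheads are included.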
We have tested the above extensively over the last 24 hours or so, covering many of the standard applications. However, given the number of operating system and kernel changes across so many systems, there may still be the odd issue with the odd application.
As you might know, we currently have 8 GPU systems, all of which rely on cuda-7.5. As part of this upgrade we have taken 4 of those GPU nodes (g0005..g0008) out of that queue and updated them (as mentioned above). We will likely contact some of you over the next few days to ask you to do some testing for us, using the newer O/S, kernel and cuda-8 environment.
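If you are asked to help with testing, a useful first check is whether the cuda-8 toolkit is visible in your environment on one of the updated nodes. A minimal sketch (the module name in the hint is an assumption; check `module avail` for the actual name):

```python
import shutil
import subprocess

# Hedged sanity check: is the CUDA compiler on the PATH on this node?
nvcc = shutil.which("nvcc")
if nvcc:
    # Print the toolkit version; on the updated nodes this should
    # report a release 8.x toolkit once the new module is loaded.
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    print(result.stdout)
else:
    # Assumed module name -- confirm with "module avail cuda" first.
    print("nvcc not found; try 'module load cuda' before re-running")
```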