New HPC Storage
We have been working hard with our storage consultants to build a new HPC storage solution that will improve performance, reliability and capacity, while also offering better scalability and the potential for future integration with wider research storage facilities. During the migration we will carry out the final phase of syncing over 65 million files, amounting to 120TB of HPC data, to the new HPC storage, and complete the final configuration of the new GPFS HPC storage infrastructure.
The new HPC storage solution is based on the same IBM General Parallel File System (GPFS) software and IBM hardware that are currently used for HPC storage. Our existing solution was designed back in 2008 to meet the needs of ESCluster, the HPC cluster in place at the time (http://rscs.uea.ac.uk/high-performance-computing/faqs/history-of-hpc-at-uea/escluster). This basic solution has grown over time from serving approximately 30TB to approximately 900 computational cores, to the current 120TB of storage delivered to over 4000 cores. The new infrastructure has been designed from the ground up for our current HPC requirements, and allows for scalability as the HPC infrastructure grows. There are many benefits to the new solution, from mirrored home directories for improved resilience, to optimised metadata which will reduce those slow ‘ls’ commands!
The new GPFS storage will appear slightly different:
+ There will be new directory structures:
- HPC home directories are moving from /gpfs/dep/username to /gpfs/home/username. Similarly, scratch is moving from /gpfs/scratchdep/username to /gpfs/scratch/username. We will be putting symlinks in place so that your existing scripts will still work with the old locations.
- There will be some restructuring of /gpfs/data and of the user and research group filesystems, which will be confirmed as work progresses. Symlinks will be put in place where appropriate to ensure scripts continue to work.
- /gpfs/esarchive will have a slightly different path, still to be confirmed, and will be available only from the Grace login node (grace.uea.ac.uk).
+ Because home directories are mirrored, your disk usage on home will appear double what it previously was. This has been taken into account with quotas.
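Although symlinks will keep old paths working, you may prefer to update scripts to the new locations. The following is a minimal Python sketch of the home and scratch mapping described above. The function name `migrate_path` and the `PREFIX_MAP` table are hypothetical, and the `dep` and `scratchdep` components are copied literally from the paths quoted in this post, so adjust them to match your own directories.

```python
from pathlib import PurePosixPath

# Old-to-new prefixes as quoted in the announcement. "dep" and "scratchdep"
# are copied literally; substitute your own directory name if it differs.
PREFIX_MAP = {
    "/gpfs/scratchdep": "/gpfs/scratch",
    "/gpfs/dep": "/gpfs/home",
}


def migrate_path(path: str) -> str:
    """Return the new-style path for an old-style one, or the input unchanged."""
    p = PurePosixPath(path)
    for old, new in PREFIX_MAP.items():
        old_p = PurePosixPath(old)
        # Match whole path components only, so /gpfs/department is not
        # mistaken for something under /gpfs/dep.
        if p == old_p or old_p in p.parents:
            return str(PurePosixPath(new) / p.relative_to(old_p))
    return path


if __name__ == "__main__":
    for example in ("/gpfs/dep/username/run.sh", "/gpfs/scratchdep/username"):
        print(example, "->", migrate_path(example))
```

Paths outside the two listed prefixes, such as those under /gpfs/data, are returned unchanged, since their new layout is still to be confirmed.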
Progress updates on the migration will be posted at https://twitter.com/UEAHPCC
Chris - HPC Admin
Grace Phase 3
We are pleased to announce that an order for hardware has been placed with our partners Viglen for the third phase of Grace development. This will include the addition of 68 compute nodes providing a total of 1088 cores. Once again these nodes are based on Intel Xeon E5-2670 2.60GHz 8-core processors, providing each node with 16 computation slots and 32GB of RAM. As well as compute nodes, additional Infiniband infrastructure has been included, with the nodes split as follows:
- 48 Infiniband nodes providing an additional 768 parallel slots
- 20 standard Ethernet nodes providing an additional 320 sequential slots
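The slot counts above follow directly from the node counts. As a quick check, here is a small Python sketch of the arithmetic; it assumes each node holds two of the 8-core processors, which is inferred from the 16 slots per node quoted above rather than stated explicitly.

```python
# Assumption: two 8-core E5-2670 sockets per node, giving the 16
# computation slots per node quoted in the announcement.
SLOTS_PER_NODE = 2 * 8


def slots(nodes: int) -> int:
    """Number of computation slots provided by a given number of nodes."""
    return nodes * SLOTS_PER_NODE


infiniband = slots(48)   # parallel slots
ethernet = slots(20)     # sequential slots
total = slots(48 + 20)   # all Phase 3 nodes

print(infiniband, ethernet, total)  # prints "768 320 1088"
```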
The new hardware will be installed in UEA Data Centre 1, separate from the majority of the existing Grace hardware, which is located in UEA Data Centre 2. This separation will help improve service availability and resilience.
The upgrade will take Grace to more than 300 computational nodes and increase the core count to 4148, with a theoretical peak performance of nearly 65TFlops.
Delivery and hardware installation are scheduled for April, followed by configuration and integration into the HPC cluster environment.
We are also in the process of migrating the huge memory node (Intel Xeon E7440 2.40GHz, 16 cores, 128GB RAM) to Grace, as well as increasing the number of large memory (48GB RAM) nodes, taking the total to 8.
Chris - HPC Admin
Why create a new website?
Having been a researcher for several years, whilst also providing training on using the cluster to students and colleagues, I appreciated how important it was to have a comprehensive HPC information portal. As a team we believed that the current site was dated and in desperate need of an update.
Having moved to the HPC Admin team in September 2012, I was asked to put together a new informative website with associated supporting information.
I think we can all appreciate that compiling documentation can be a daunting task, with questions about content, user expectations, best practices, the variety of research, users' levels of cluster proficiency and much more. Having given these questions plenty of thought, I set out on the wonderful path of learning how to use UEA's Content Management System (Liferay).
Having acquired the necessary Liferay skills, I began the task of porting and updating the current wiki information.
Over several weeks, as a team, we worked hard to further develop material which we hope will help to fortify your learning experience whilst using Grace.
Please have a good look around our new site, and if you have any ideas for new learning material, then let us know.
Leo - HPC Admin