Isambard: The Future of High Performance Computing

Posted by Professor Simon McIntosh-Smith on 27 March 2017

A revolution in computing

Right now, we're in the middle of a revolution. It might not be apparent, but quietly, and in the background of our everyday lives, there are changes in technologies that will affect how science is done over the next ten years and beyond.

Here in Bristol, I'm proud to be part of this revolution and, along with colleagues who are part of the EPSRC 'Tier 2' high-performance computing initiative, we're preparing for the future.

The importance of HPC

Science today is impossible without computers and, increasingly, the problems that science needs to solve demand the power of supercomputing. The laptop, phone or tablet you're reading this on offers computing power that only the largest machines could provide a few years ago. Now it fits in the palm of your hand.

That trajectory shows no sign of slowing, so we need to explore and understand the boundaries of high performance computing for today's science in order to be prepared for the ever-increasing complexity of tomorrow's problems. This area of 'computational science' is driven by the phenomenal increases in the raw computing power available to us and in the performance of processor technologies.

Over the last 15 years or so, supercomputer architectures have remained relatively unchanged. Most machines are essentially large clusters of computer servers based on one of just a small number of CPU architectures - either x86 as used by Intel® and AMD®, or POWER as used by IBM®. Most other architectures have already died out from mainstream high performance computing (HPC).

We're pushing these technologies harder and harder, shrinking the basic components down to nanometre scale.

A need for change

As today's mainstream computer architectures have started to hit significant technical challenges, new architectures have begun to emerge. Some of these architectures, such as graphics processing units (GPUs) from NVIDIA or AMD, take a very different architectural approach, employing large numbers of very lightweight parallel execution units. Intel is taking a different approach with its Xeon Phi™, exploring the potential for massively parallel CPUs based on enhanced versions of the traditional x86 architecture.

However, it's often the outside contender that can surprise us. ARM is a new entrant into the HPC space. ARM doesn't make CPU devices itself. Instead, it licenses the design of CPU cores to companies that do make CPU devices, and make them by the billions.

ARM is best known for its success in the mobile space, where ARM CPU cores power most mobile phones, tablets, and many other consumer gadgets. More recently, ARM has released designs targeting the HPC market. This has enabled several companies that are new to HPC to design CPUs for the HPC market, significantly changing the competitive landscape.

Isambard

Because ARM's HPC designs are relatively recent developments, there have been no large-scale production supercomputers based on the ARM architecture yet. This makes it difficult for supercomputer users and vendors to know how well the various ARM-based server chips coming from manufacturers like Cavium, Qualcomm, Applied Micro, AMD and others will perform on real scientific workloads.

To solve this problem, as part of EPSRC's recent funding of a new cohort of Tier 2 HPC centres, the GW4 alliance of the universities of Bristol, Bath, Cardiff and Exeter has teamed up with the Met Office and Cray to design and build the 'Isambard' system, the world's first large-scale, production, ARM-based supercomputer.

Tier 2 means a machine that sits between the large national HPC services like ARCHER and the local cluster machines within individual universities and other higher education institutions (HEIs). The advantage of the latest layer of Tier 2 machines is that they will have a policy of national access, so users across the UK can get access to a wide range of technologies.

Isambard will include over 10,000 high-performance ARM cores, making it one of the largest such machines anywhere in the world. Isambard will also include a small number of processors based on other advanced architectures, such as Intel's latest Xeon Phi (based on their Knights Landing many-core CPU), and NVIDIA's latest Pascal™-based Tesla® P100 GPUs.

A bright future

Isambard will enable, for the first time, direct architectural comparisons between many of the likely candidates for next-generation supercomputers. It will also enable direct comparisons with ARCHER, the UK's current national supercomputer service, as both Isambard and ARCHER will sport the same high performance interconnect, and both will use Cray's high-quality optimising software environment.

Isambard is an exciting experiment, enabled by EPSRC's investment in Tier 2 HPC systems. If we discover that ARM processors are competitive in HPC, then Isambard could be the first of a new generation of ARM-based supercomputers, ushering in an era of wider architectural choice, with greater opportunities for differentiation between supercomputer vendors. These outcomes should mean that scientists can choose systems more highly optimised to solve their problem, thus delivering even more exciting scientific breakthroughs at greater cost effectiveness than ever before.

The revolution is here - watch this space!

Author

Name: Professor Simon McIntosh-Smith
Job title: Professor of High Performance Computing
Organisation: University of Bristol

Simon McIntosh-Smith is a full Professor of High Performance Computing at the University of Bristol in the UK. He began his career as a microprocessor architect at Inmos and STMicroelectronics in the early 1990s, before co-designing the world's first fully programmable graphics processor (GPU) at Pixelfusion in 1999. In 2002 he co-founded ClearSpeed Technology where, as Director of Architecture and Applications, he co-developed the first modern many-core HPC accelerators. He now leads the High Performance Computing Research Group at the University of Bristol, where his research focuses on performance portability and application-based fault tolerance. He plays a key role in designing and procuring HPC services at the local, regional and national level, including the UK's national HPC service, ARCHER. In 2016 he led the successful bid by the GW4 consortium, along with the UK's Met Office and Cray, to design and build 'Isambard', the world's first large-scale production ARMv8-based supercomputer.