In the summer of 2019 the USA announced plans for its third Exascale Supercomputer, which is the generation beyond Petascale hardware.
The Aurora system uses Intel Xeon processors and is due to be running at the Argonne National Laboratory, Illinois in 2021 with performance of 1 Exaflops. Also in 2021, Cray (owned by Hewlett Packard Enterprise) will deliver the Frontier Supercomputer to Oak Ridge National Laboratory, where it will use AMD EPYC processors to deliver 1.5 Exaflops. Frontier fills a room: it uses 100 Cray Shasta cabinets with Slingshot interconnects between the nodes and has a power consumption of around 30MW.
We now know that El Capitan will be delivered in early 2023 and will also use Cray Shasta cabinets, so broadly speaking El Capitan will look like Frontier. However, performance will increase to 2 Exaflops with power consumption remaining under 40MW.
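As a rough sanity check on those numbers, the sketch below divides the quoted performance by the quoted power draw. It takes the headline figures at face value (around 30MW for Frontier, and 40MW as an upper limit for El Capitan), so the El Capitan result is a floor rather than a prediction. The function name is ours, purely for illustration.

```python
# Back-of-envelope efficiency from the figures quoted in the article.
# These are claimed headline numbers, not measured benchmark results.

def gflops_per_watt(exaflops, megawatts):
    """Convert exaflops and megawatts into gigaflops per watt."""
    flops = exaflops * 1e18
    watts = megawatts * 1e6
    return flops / watts / 1e9

# Frontier: 1.5 exaflops at around 30MW
frontier = gflops_per_watt(1.5, 30)

# El Capitan: 2 exaflops at under 40MW, so 40MW gives a lower bound
el_capitan_floor = gflops_per_watt(2.0, 40)

print(f"Frontier:   about {frontier:.0f} gigaflops per watt")
print(f"El Capitan: at least {el_capitan_floor:.0f} gigaflops per watt")
```

On these figures the two systems land in the same ballpark, and El Capitan only pulls ahead on efficiency to the extent that it comes in below its 40MW ceiling.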
2 Exaflops is a whole heap of compute power and yes, we know FLOPS is an acronym for Floating Point Operations per Second so the correct form is ExaFLOPS.
In our recent news post about AMD Radeon Pro W5500 graphics we talked about performance up to 5.5 Teraflops, which is 5.5 x 10^12 floating point operations per second. By contrast El Capitan will have a claimed performance of 2 x 10^18, so we are talking about the equivalent of hundreds of thousands of graphics cards, which seems reasonable as the price of El Capitan is quoted as US$600 million.
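For anyone who wants to check the sums, here is a minimal sketch of that comparison. It treats the two headline figures as directly comparable, which is an assumption on our part since vendors do not always quote peak performance at the same numerical precision. The variable and function names are ours.

```python
# Figures quoted in the article (claimed peak performance, not benchmarks).
W5500_FLOPS = 5.5e12          # AMD Radeon Pro W5500: up to 5.5 teraflops
EL_CAPITAN_FLOPS = 2e18       # El Capitan: 2 exaflops
EL_CAPITAN_PRICE_USD = 600e6  # quoted price of the whole system

def equivalent_cards(system_flops, card_flops):
    """How many cards it would take to match the system on paper."""
    return system_flops / card_flops

cards = equivalent_cards(EL_CAPITAN_FLOPS, W5500_FLOPS)

# Spread the system price across that many "card equivalents".
price_per_card_equivalent = EL_CAPITAN_PRICE_USD / cards

print(f"Equivalent W5500 cards: {cards:,.0f}")
print(f"System price per card equivalent: ${price_per_card_equivalent:,.0f}")
```

That works out to roughly 364,000 cards and about $1,650 of system cost per card equivalent, which is purely a paper exercise as it ignores interconnects, CPUs, memory, cooling and everything else that makes a supercomputer more than a pile of graphics cards.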
El Capitan will be installed at the Lawrence Livermore National Laboratory (LLNL) and will be used primarily to model nuclear weapons for the US Department of Energy.
AMD tells us “The system features next generation AMD EPYC processors, codenamed Genoa featuring the Zen 4 processor core, next generation AMD Radeon Instinct GPUs based on a new compute optimised architecture, 3rd Gen AMD Infinity Architecture and open source AMD ROCm heterogeneous computing software”.
At present AMD is using its Zen 2 architecture with DDR4 memory, and we expect to see Zen 3 later this year or perhaps early in 2021. While Zen 3 is claimed to use a new architecture, the current rumours suggest it will continue to use the TSMC 7nm fabrication process with the same motherboard sockets and the same DDR4 memory. The change from Zen 2 to Zen 3 is likely to be a reorganisation of the cache system to share cache more widely and reduce latency.
We had expected Zen 4 to arrive during 2021 on a revised 7nm process, perhaps using Extreme Ultraviolet (EUV) lithography and therefore named 7nm+, however it now seems likely Zen 4 will move to TSMC 5nm. Zen 4 will support ‘Next Generation Memory’, which is presumably DDR5, and we also see the next generation Radeon Instinct GPUs will use ‘Next Generation High Bandwidth Memory (HBM)’. In addition, AMD has named the next version of Infinity Fabric as Infinity Architecture, which seems to be based on PCI Express Gen. 5 with the ability for one EPYC to communicate with four Radeon Instinct GPUs.
AMD tells us Infinity Architecture will support ‘Unified memory across CPU and GPU’, which seems like a contradiction in terms. Our best guess is that Zen 4 will use DDR5 system memory while the Radeon Instinct GPUs in El Capitan use HBM2E, with Infinity Architecture allowing the EPYC CPU in each node to communicate with the graphics memory on the four GPUs for specific AI workloads.
The 2 Exaflops headline is impressive, however our main interest with El Capitan is the insight we have gained into AMD’s plans over the next two or three years. We are impressed by the current Zen 2 architecture and Ryzen 3000 processors, but Zen 3 and Zen 4 sound like they will be a real blast and will be coming in the very near future.