{"DOI":"10.5281/zenodo.6553033","abstract":"The objective of Work Package 9 task 3 is to assess, and make recommendations to the PRACE RI on, joint developments with industrial partners of highly energy-efficient HPC components and systems, as well as power and cooling technologies. WP9 has carried out this task by evaluating a number of prototypes targeting novel approaches to HPC server and system design, with many prototypes having some degree of direct industry involvement or support.
Prototype efforts assessed the use of FPGAs for function acceleration; CPUs designed for the mobile market, with a TDP about two orders of magnitude lower than that of typical x86 CPUs for the HPC market; DSPs common in embedded systems, with a TDP about one order of magnitude lower than that of x86 CPUs; the emerging heterogeneous CPUs integrating x86 and GPU cores; and traditional GPUs with novel direct GPU-to-GPU communication between nodes via InfiniBand. Two prototypes focused on novel approaches to scalability of I/O systems in support of Exascale systems and their energy efficiency. Technologies assessed included integration of I/O nodes into the MPP or cluster interconnect fabric, the use of flash technology, scalable disk systems, and virtual tape libraries based on disk systems with spun-down idle disks. Data management in file systems, in particular the management of large numbers of small files, was also addressed with the I/O prototypes. One prototype evaluation assessed the issues and benefits of integrated cooling solutions for hot water cooling.
The findings of the evaluations of prototypes looking at HPC server architectures are that 1) an optimized FPGA implementation of matrix multiplication can offer 5 \u2013 10 times higher energy efficiency than an x86 software solution, 2) an optimized implementation of matrix multiplication on the DSP can yield about half the energy efficiency gain of an FPGA implementation, 3) the first and second generation x86+GPU CPUs are not competitive with regard to energy efficiency even with standard x86 CPUs fo [...]","author":[{"family":"Johnsson","given":"Lennart"},{"family":"Netzer","given":"Gilbert"}],"id":"unknown","issued":{"date-parts":[[2013,3,31]]},"language":"en","publisher":"Zenodo","title":"D9.3.3: Report on prototypes evaluation","type":"article-journal"}