In the next few years, tremendous increases in supercomputer capability will revolutionize the way science is done, and predictive computer simulations will play a critical role in national security, energy, scientific discovery, and national competitiveness. The dramatic increase in computing power at the microprocessor level will be driven by a rapid escalation in the number of cores incorporated into a single chip rather than by increases in clock rate. The transition from massively parallel architectures to multi-core architectures will be as profound and challenging as the change from vector architectures to massively parallel computers in the early 1990s, which enabled our nation and the U.S. Department of Energy to break the teraflop barrier. To use the next generation of computers effectively, the nation must solve a host of architectural challenges in hardware and software:
- Moore's Law still holds, but clock speed is constrained by power and cooling limits
- Processors are shifting to multi/many core with attendant hierarchical parallelism
- Compute nodes with hardware accelerators create the additional complexity of heterogeneous architectures
- Processor cost is increasingly driven by pins and packaging, which means the memory wall is growing in proportion to the number of cores on a processor socket
- Supercomputer architectures must be designed with an understanding of the applications they are intended to run
- A supercomputer architecture that performs well on full scale real applications cannot be built from only commodity components
- Scaling limitations of present algorithms
- Hierarchical algorithms to deal with bandwidth across the memory hierarchy
- Software strategies to mitigate high memory latencies
- More complex multi-physics requires large memory per node
- Need for automated fault tolerance, performance analysis, and verification
- Innovative algorithms for multi-core, heterogeneous nodes
To meet these challenges, Sandia and Oak Ridge have established the Institute for Advanced Architectures and Algorithms (IAA). Sandia and Oak Ridge will build upon their long history of collaboration, strong ties to universities, and successful industry partnerships (e.g., with Intel, Cray, AMD, Micron, Sun, and IBM). The IAA is a premier example of a national laboratory/industry/university collaboration aimed at maintaining our global leadership in science and technology and our future competitiveness.
Partnerships will be extremely important to meeting the aggressive goals of the IAA. Two partners that will be tightly integrated into the IAA are the New Mexico Alliance for Computing at Extreme Scale (ACES) and the Extreme Scale Software Center (ESSC) at Oak Ridge. ACES is an NNSA/ASC alliance whose mission is to design, procure, and operate future generations of ASC capability systems. ESSC is a DoD/SC center focused on system software, algorithms, and tools for future architectures. These partners will have representation on the IAA advisory board and steering committee, help define the key research areas, and participate in the IAA project plans. The IAA will have many other partners that will be producers and consumers of IAA technologies, including other agencies, other national labs, universities, and industry.
Key HPC architectural research areas include but are not limited to:
- high-speed interconnects
- memory subsystems
- processor microarchitectures
- hierarchical algorithms
- programming models
- system software
- scalable I/O
Cross-cutting technologies needed in all these areas include the co-design of algorithms, system simulators, and application performance modeling. The IAA will focus on only a couple of key areas at any one time.
More details can be found in this presentation