The Sunway TaihuLight supercomputer: system and applications

SCIENCE CHINA Information Sciences, Volume 59, Issue 7: 072001(2016) https://doi.org/10.1007/s11432-016-5588-7

  • Received: May 27, 2016
  • Accepted: Jun 11, 2016
  • Published: Jun 21, 2016

Abstract

The Sunway TaihuLight supercomputer is the world's first system with a peak performance greater than 100 PFlops. In this paper, we provide a detailed introduction to the TaihuLight system. In contrast with other existing heterogeneous supercomputers, which combine CPUs with PCIe-connected many-core accelerators (NVIDIA GPU or Intel Xeon Phi), the computing power of TaihuLight is provided by a homegrown many-core SW26010 CPU that includes both management processing elements (MPEs) and computing processing elements (CPEs) on one chip. With 260 processing elements in one CPU, a single SW26010 provides a peak performance of over three TFlops. To alleviate the memory bandwidth bottleneck in most applications, each CPE comes with a scratch pad memory, which serves as a user-controlled cache. To support the parallelization of programs on the new many-core architecture, in addition to the basic C/C++ and Fortran compilers, the system provides a customized Sunway OpenACC tool that supports the OpenACC 2.0 syntax. This paper also reports our preliminary efforts in developing and optimizing applications on the TaihuLight system, focusing on key application domains such as earth system modeling, ocean surface wave modeling, atomistic simulation, and phase-field simulation.
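
As a rough illustration of the two programming ideas mentioned in the abstract, the following C sketch shows (a) a loop annotated with a standard OpenACC 2.0 directive and (b) the same computation staged through small local buffers, which mimics how a user-controlled scratch pad might be used. This is not code from the paper: the names `axpy_acc`, `axpy_tiled`, and `TILE` are hypothetical, the tile size is an arbitrary assumption, the local arrays merely stand in for a scratch pad memory, and no Sunway-specific OpenACC extensions are shown. A compiler without OpenACC support simply ignores the pragma and runs the loop serially.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tile size: elements staged per pass through the local buffers. */
#define TILE 256

/* y = a*x + y, annotated with a standard OpenACC 2.0 directive.
 * copyin/copy describe which arrays move to and from the accelerator. */
void axpy_acc(size_t n, double a, const double *x, double *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Same computation with explicit staging: each tile is copied into small
 * local buffers (a stand-in for a scratch pad), processed there, and the
 * result is copied back to main memory. */
void axpy_tiled(size_t n, double a, const double *x, double *y)
{
    double xbuf[TILE];
    double ybuf[TILE];

    for (size_t start = 0; start < n; start += TILE) {
        /* The last tile may be shorter than TILE. */
        size_t len = (n - start < TILE) ? (n - start) : TILE;

        /* Stage inputs into the local buffers. */
        memcpy(xbuf, x + start, len * sizeof(double));
        memcpy(ybuf, y + start, len * sizeof(double));

        /* Compute entirely on the staged data. */
        for (size_t i = 0; i < len; ++i) {
            ybuf[i] = a * xbuf[i] + ybuf[i];
        }

        /* Write the finished tile back. */
        memcpy(y + start, ybuf, len * sizeof(double));
    }
}
```

Both functions produce the same result for the same inputs; the tiled variant only makes the data movement explicit, which is the kind of control a software-managed scratch pad leaves to the programmer.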


