[2]CHOQUETTE J, GANDHI W, GIROUX O, et al. NVIDIA A100 tensor core GPU: performance and innovation[J].IEEE Micro,2021,41(2): 29-35.
JIMENEZ J. Computing high-Reynolds-number turbulence: will simulations ever replace experiments?[J]. Journal of Turbulence,2003,4. DOI: 10.1088/1468-5248/4/1/022.
[3]RAVIKUMAR A, SRIRAMAN H. A novel mixed precision distributed TPU GAN for accelerated learning curve[J].Computer Systems Science and Engineering,2023,46(1): 563-578.
[4]NOVITSKIY I M, KUTATELADZE A G. DU8ML: machine learning-augmented density functional theory nuclear magnetic resonance computations for high-throughput in silico solution structure validation and revision of complex alkaloids[J].Journal of Organic Chemistry,2022,87(7): 4818-4828.
[5]HAIDAR A, TOMOV S, DONGARRA J, et al. Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers[C]//SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. Dallas, TX, USA: IEEE, 2018: 603-613.
[6]DU S, BHATTACHARYA C B, SEN S. Maximizing business returns to corporate social responsibility (CSR): the role of CSR communication[J]. International Journal of Management Reviews,2010,12(1): 8-19.
[7]DENG L, LI G, HAN S, et al. Model compression and hardware acceleration for neural networks: a comprehensive survey[J].Proceedings of the IEEE,2020,108(4): 485-532.
[8]BAI Y, WANG Y X, LIBERTY E. ProxQuant: quantized neural networks via proximal operators[J/OL]. 2018[2024-07-10]. https://arxiv.org/abs/1810.00861v3.
[9]BUTTARI A, DONGARRA J, KURZAK J, et al. Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy[J].ACM Transactions on Mathematical Software,2008,34(4): 1-22.
[10]陈逸, 刘博生, 徐永祺, 等. 混合精度频域卷积神经网络FPGA加速器设计[J]. 计算机工程, 2023,49(12): 1-9.(CHEN Yi, LIU Bosheng, XU Yongqi, et al. FPGA accelerator design for hybrid precision frequency domain convolutional neural network[J]. Computer Engineering,2023,49(12): 1-9.(in Chinese))
[11]AMESTOY P R, DUFF I S, L’EXCELLENT J Y. Multifrontal parallel distributed symmetric and unsymmetric solvers[J].Computer Methods in Applied Mechanics and Engineering,2000,184(2/3/4): 501-520.
[12]LI X S, DEMMEL J W. SuperLU_DIST: a scalable distributed-memory sparse direct solver for unsymmetric linear systems[J]. ACM Transactions on Mathematical Software,2003,29(2): 110-140.
[13]HOGG J D, SCOTT J A. A fast and robust mixed-precision solver for the solution of sparse symmetric linear systems[J].ACM Transactions on Mathematical Software,2010,37(2): 1-24.
[14]CARSON E, HIGHAM N J. A new analysis of iterative refinement and its application to accurate solution of ill-conditioned sparse linear systems[J].SIAM Journal on Scientific Computing,2017,39(6): A2834-A2856.
[15]HIGHAM N J, PRANESH S. Exploiting lower precision arithmetic in solving symmetric positive definite linear systems and least squares problems[J].SIAM Journal on Scientific Computing,2021,43(1): A258-A277.
[16]LOE J A, GLUSA C A, YAMAZAKI I, et al. A study of mixed precision strategies for GMRES on GPUs[J/OL]. 2021[2024-07-10]. https://arxiv.org/abs/2109.01232v1.
[17]AMESTOY P, BUTTARI A, HIGHAM N J, et al. Five-precision GMRES-based iterative refinement[J].SIAM Journal on Matrix Analysis and Applications,2024,45(1): 529-552.
[18]HAIDAR A, BAYRAKTAR H, TOMOV S, et al. Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems[J].Proceedings of the Royal Society A:Mathematical,Physical and Engineering Sciences,2020,476(2243): 20200110.
[19]ZOUNON M, HIGHAM N J, LUCAS C, et al. Performance impact of precision reduction in sparse linear systems solvers[J].PeerJ Computer Science,2022,8: e778.
[20]GRATTON S, SIMON E, TITLEY-PELOQUIN D, et al. Exploiting variable precision in GMRES[EB/OL]. 2019[2024-07-10]. https://arxiv.org/abs/1907.10550v2.
[21]GIRAUD L, HAIDAR A, WATSON L T. Mixed-precision preconditioners in parallel domain decomposition solvers[M]//Lecture Notes in Computational Science and Engineering. Berlin: Springer, 2008: 357-364.
[22]GÖBEL F, GRÜTZMACHER T, RIBIZEL T, et al. Mixed precision incomplete and factorized sparse approximate inverse preconditioning on GPUs[M]//Lecture Notes in Computer Science. Cham: Springer International Publishing, 2021: 550-564.
[23]陈华, 史悦戎. 基于GPU的重启PGMRES并行算法研究[J]. 计算机工程与应用, 2014,50(7): 35-40.(CHEN Hua, SHI Yuerong. Study on restarted PGMRES parallel algorithm with GPU[J]. Computer Engineering and Applications,2014,50(7): 35-40.(in Chinese))
[24]冯选燕, 燕振国, 朱华君, 等. 非精确Newton方法中线性迭代收敛判据研究[J]. 空气动力学学报, 2023,41(12): 28-36.(FENG Xuanyan, YAN Zhenguo, ZHU Huajun, et al. Study on the convergence criterion of linear iteration in inexact Newton methods[J]. Acta Aerodynamica Sinica,2023,41(12): 28-36.(in Chinese))
[25]贡伊明, 刘战合, 刘溢浪, 等. 时间谱方法中的高效GMRES算法[J]. 航空学报, 2017,38(7): 120894.(GONG Yiming, LIU Zhanhe, LIU Yilang, et al. Efficient GMRES algorithm in time spectral method[J]. Acta Aeronautica et Astronautica Sinica,2017,38(7): 120894.(in Chinese))
[26]伍康, 吕毅斌, 石允龙, 等. 有界多连通区域数值保角变换的GMRES(m)法[J]. 应用数学和力学, 2022,43(9): 1026-1033.(WU Kang, LÜ Yibin, SHI Yunlong, et al. The GMRES(m) method for numerical conformal mapping of bounded multi-connected domains[J]. Applied Mathematics and Mechanics,2022,43(9): 1026-1033.(in Chinese))
[27]肖文可, 陈星玎. 求解PageRank问题的重启GMRES修正的多分裂迭代法[J]. 应用数学和力学, 2022,43(3): 330-340.(XIAO Wenke, CHEN Xingding. A modified multi-splitting iterative method with the restarted GMRES to solve the PageRank problem[J]. Applied Mathematics and Mechanics,2022,43(3): 330-340.(in Chinese))