Our paper “A Submatrix-Based Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code CP2K” [1] has been accepted for publication at the International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing) 2020. In close collaboration, members of the Paderborn Center for Parallel Computing, the Department of Computer Science and the Department of Chemistry have extended the open source quantum chemistry code CP2K by an implementation of the submatrix method [2] for approximate function evaluation on distributed, sparse matrices.
The method is used to evaluate the matrix sign function in linear-scaling density functional theory (DFT) calculations. It offers a high degree of parallelism, scales linearly with system size, and achieves higher weak-scaling efficiency than conventional iterative methods on distributed matrices. Furthermore, by transforming the computations into local, dense matrix operations, the method paves the way for highly efficient acceleration with GPUs and FPGAs.
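The basic idea can be illustrated with a minimal NumPy sketch. It is not the CP2K implementation: a small dense array stands in for the distributed sparse matrix, the function names `dense_sign` and `submatrix_apply` are hypothetical, and the sign function of each submatrix is computed by eigendecomposition, which assumes a symmetric input. For each column, the rows with nonzero entries select a small dense principal submatrix, the function is applied to it, and only the matching column of the result is kept.

```python
import numpy as np


def dense_sign(a):
    """Matrix sign function of a dense symmetric matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    # Scale each eigenvector (column of v) by the sign of its eigenvalue.
    return (v * np.sign(w)) @ v.T


def submatrix_apply(a, f):
    """Approximate f(a) column by column with the submatrix method (sketch).

    `a` is a symmetric matrix stored as a dense array whose zero entries
    define the sparsity pattern. The result has the same pattern as `a`.
    """
    n = a.shape[0]
    result = np.zeros((n, n), dtype=float)
    for i in range(n):
        # Rows with nonzero entries in column i; always include i itself.
        idx = np.union1d(np.flatnonzero(a[:, i]), [i])
        # Small dense principal submatrix selected by those indices.
        sub = a[np.ix_(idx, idx)]
        f_sub = f(sub)
        # Position of column i inside the submatrix.
        local = np.searchsorted(idx, i)
        # Keep only the column that corresponds to column i of the input.
        result[idx, i] = f_sub[:, local]
    return result
```

Every column can be processed independently, which is where the parallelism comes from. For a block-diagonal matrix with dense blocks, each submatrix coincides with a full block and the sketch reproduces the exact result; for general sparse matrices the result is an approximation restricted to the sparsity pattern of the input.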
An implementation of FPGA acceleration for the matrix sign computation in CP2K is already available to all of our users, utilizing the Bittware 520N FPGA boards within our Noctua cluster. Instructions on how to use our FPGA-accelerated version of CP2K can be found in our Wiki:
https://wikis.uni-paderborn.de/pc2doc/Noctua-Software#FPGA-Accelerated_CP2K_for_DFT_with_the_Submatrix_Method
Abstract
Electronic structure calculations based on density-functional theory (DFT) represent a significant part of today's HPC workloads and pose high demands on high-performance computing resources. To perform these quantum-mechanical DFT calculations on complex large-scale systems, so-called linear scaling methods instead of conventional cubic scaling methods are required. In this work, we take up the idea of the submatrix method and apply it to the DFT computations in the software package CP2K. For that purpose, we transform the underlying numeric operations on distributed, large, sparse matrices into computations on local, much smaller and nearly dense matrices. This allows us to exploit the full floating-point performance of modern CPUs and to make use of dedicated accelerator hardware, where performance has been limited by memory bandwidth before. We demonstrate both functionality and performance of our implementation and show how it can be accelerated with GPUs and FPGAs.
References
- [1] M. Lass, R. Schade, T. D. Kühne, C. Plessl
A Submatrix-Based Method for Approximate Matrix Function Evaluation in the Quantum Chemistry Code CP2K
In Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC). 2020. Accepted for publication.
Preprint available on arXiv: http://arxiv.org/abs/2004.10811
- [2] M. Lass, S. Mohr, H. Wiebeler, T. D. Kühne, C. Plessl
A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices
In Proc. Platform for Advanced Scientific Computing (PASC). 2018.
DOI: 10.1145/3218176.3218231