Distributed-Memory H-Matrix Algebra I: Data Distribution and Matrix-Vector Multiplication
Year: 2021
Authors: Yingzhou Li, Jack Poulson, Lexing Ying
CSIAM Transactions on Applied Mathematics, Vol. 2 (2021), Iss. 3 : pp. 431–459
Abstract
We introduce a data distribution scheme for $\mathcal{H}$-matrices and a distributed-memory algorithm for $\mathcal{H}$-matrix-vector multiplication. Our data distribution scheme avoids an expensive $\Omega(P^2)$ scheduling procedure used in previous work, where $P$ is the number of processes, while preserving good data balance. Based on this data distribution, our distributed-memory algorithm evenly distributes all computations among the $P$ processes and adopts a novel tree-communication algorithm to reduce the latency cost. The overall complexity of our algorithm is $\mathscr{O}\!\left(\frac{N\log N}{P} + \alpha\log P + \beta\log^2 P\right)$ for $\mathcal{H}$-matrices under the weak admissibility condition, where $N$ is the matrix size, $\alpha$ denotes the latency, and $\beta$ denotes the inverse bandwidth. Numerically, our algorithm is applied to both two- and three-dimensional problems of various sizes on various numbers of processes. Good parallel efficiency is still observed on thousands of processes.
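The abstract attributes the $\alpha\log P$ latency term to a tree-communication algorithm. As a rough illustration only, and not the authors' implementation, the following minimal C++/MPI sketch performs a binomial-tree reduction of per-process partial results toward rank 0 in $\lceil\log_2 P\rceil$ rounds; the local vector length n and the reduction target are illustrative assumptions.

```cpp
// Hedged sketch: a generic binomial-tree reduction in MPI, showing why a
// tree pattern needs only O(log P) communication rounds (latency term
// alpha*log P) instead of O(P) point-to-point messages.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, P;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &P);

    const int n = 8;                               // toy local vector length (assumed)
    std::vector<double> local(n, rank + 1.0);      // each process holds a partial result
    std::vector<double> recv(n);

    // Binomial-tree reduction toward rank 0: ceil(log2 P) rounds in total.
    for (int step = 1; step < P; step <<= 1) {
        if (rank % (2 * step) == step) {
            // This rank passes its partial sum up the tree, then is done.
            MPI_Send(local.data(), n, MPI_DOUBLE, rank - step, 0, MPI_COMM_WORLD);
            break;
        } else if (rank % (2 * step) == 0) {
            int src = rank + step;
            if (src < P) {                         // guard for non-power-of-two P
                MPI_Recv(recv.data(), n, MPI_DOUBLE, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                for (int i = 0; i < n; ++i) local[i] += recv[i];
            }
        }
    }

    if (rank == 0)
        std::printf("reduced[0] = %g (expected %g)\n",
                    local[0], P * (P + 1) / 2.0);  // sum of (rank+1) over all ranks

    MPI_Finalize();
    return 0;
}
```

Compiled with `mpicxx` and run under `mpirun -np P`, the sketch completes in $\lceil\log_2 P\rceil$ message rounds, which is the communication pattern the latency bound in the abstract relies on.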
Journal Article Details
Publisher Name: Global Science Press
Language: English
DOI: https://doi.org/10.4208/csiam-am.2020-0206
Published online: 2021-01
Copyright: © Global Science Press
Pages: 29
Keywords: Parallel fast algorithm, $\mathcal{H}$-matrix, distributed-memory parallel computing.