Distributed-Memory $\mathcal{H}$-Matrix Algebra I: Data Distribution and Matrix-Vector Multiplication

Distributed-Memory $\mathcal{H}$-Matrix Algebra I: Data Distribution and Matrix-Vector Multiplication

Year:    2021

Author:    Yingzhou Li, Jack Poulson, Lexing Ying

CSIAM Transactions on Applied Mathematics, Vol. 2 (2021), Iss. 3 : pp. 431–459

Abstract

We introduce a data distribution scheme for $\mathcal{H}$-matrices and a distributed-memory algorithm for $\mathcal{H}$-matrix-vector multiplication. Our data distribution scheme avoids an expensive $Ω(P^2)$ scheduling procedure used in previous work, where $P$ is the number of processes, while data balancing is well-preserved. Based on the data distribution, our distributed-memory algorithm evenly distributes all computations among $P$ processes and adopts a novel tree-communication algorithm to reduce the latency cost. The overall complexity of our algorithm is $\mathscr{O}(\frac{Nlog N}{P} +αlog P+βlog^2P)$ for $\mathcal{H}$-matrices under weak admissibility condition, where $N$ is the matrix size, $α$ denotes the latency, and $β$ denotes the inverse bandwidth. Numerically, our algorithm is applied to address both two- and three-dimensional problems of various sizes among various numbers of processes. On thousands of processes, good parallel efficiency is still observed.

You do not have full access to this article.

Already a Subscriber? Sign in as an individual or via your institution

Journal Article Details

Publisher Name:    Global Science Press

Language:    English

DOI:    https://doi.org/10.4208/csiam-am.2020-0206

CSIAM Transactions on Applied Mathematics, Vol. 2 (2021), Iss. 3 : pp. 431–459

Published online:    2021-01

AMS Subject Headings:    Global Science Press

Copyright:    COPYRIGHT: © Global Science Press

Pages:    29

Keywords:    Parallel fast algorithm $\mathcal{H}$-matrix distributed-memory parallel computing.

Author Details

Yingzhou Li

Jack Poulson

Lexing Ying