Year: 2015
Communications in Computational Physics, Vol. 18 (2015), Iss. 3 : pp. 808–830
Abstract
We describe our implementation of a parallel fast multipole method for evaluating potentials for discrete and continuous source distributions. The first requires summation over the source points and the second requiring integration over a continuous source density. Both problems require $\mathcal{O}$($N^2$) complexity when computed directly; however, can be accelerated to $\mathcal{O}$($N$) time using FMM. In our PVFMM software library, we use kernel independent FMM and this allows us to compute potentials for a wide range of elliptic kernels. Our method is high order, adaptive and scalable. In this paper, we discuss several algorithmic improvements and performance optimizations including cache locality, vectorization, shared memory parallelism and use of coprocessors. Our distributed memory implementation uses space-filling curve for partitioning data and a hypercube communication scheme. We present convergence results for Laplace, Stokes and Helmholtz (low wavenumber) kernels for both particle and volume FMM. We measure efficiency of our method in terms of CPU cycles per unknown for different accuracies and different kernels. We also demonstrate scalability of our implementation up to several thousand processor cores on the Stampede platform at the Texas Advanced Computing Center.
Journal Article Details
Publisher Name: Global Science Press
Language: English
DOI: https://doi.org/10.4208/cicp.020215.150515sw
Communications in Computational Physics, Vol. 18 (2015), Iss. 3 : pp. 808–830
Published online: 2015-01
AMS Subject Headings: Global Science Press
Copyright: COPYRIGHT: © Global Science Press
Pages: 23
-
An FMM Accelerated Poisson Solver for Complicated Geometries in the Plane Using Function Extension
Fryklund, Fredrik | Greengard, LeslieSIAM Journal on Scientific Computing, Vol. 45 (2023), Iss. 6 P.A3001
https://doi.org/10.1137/22M153495X [Citations: 1] -
Effect of electrostatic interaction on impact breakage of agglomerates formed by charged dielectric particles
Ruan, Xuan | Li, ShuiqingPhysical Review E, Vol. 106 (2022), Iss. 3
https://doi.org/10.1103/PhysRevE.106.034905 [Citations: 4] -
High order discretization techniques for real-space ab initio simulations
Anderson, Christopher R.
The Journal of Chemical Physics, Vol. 148 (2018), Iss. 11
https://doi.org/10.1063/1.5017477 [Citations: 1] -
Tapas: An Implicitly Parallel Programming Framework for Hierarchical N-Body Algorithms
Fukuda, Keisuke | Matsuda, Motohiko | Maruyama, Naoya | Yokota, Rio | Taura, Kenjiro | Matsuoka, Satoshi2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS), (2016), P.1100
https://doi.org/10.1109/ICPADS.2016.0145 [Citations: 3] -
Fast convolution with free-space Green's functions
Vico, Felipe | Greengard, Leslie | Ferrando, MiguelJournal of Computational Physics, Vol. 323 (2016), Iss. P.191
https://doi.org/10.1016/j.jcp.2016.07.028 [Citations: 61] -
Computation of volume potentials on structured grids with the method of local corrections
Kavouklis, Chris | Colella, PhillipCommunications in Applied Mathematics and Computational Science, Vol. 13 (2018), Iss. 2 P.337
https://doi.org/10.2140/camcos.2018.13.337 [Citations: 0] -
Hardware Accelerator Integration Tradeoffs for High-Performance Computing: A Case Study of GEMM Acceleration in N-Body Methods
Asri, Mochamad | Malhotra, Dhairya | Wang, Jiajun | Biros, George | John, Lizy K. | Gerstlauer, AndreasIEEE Transactions on Parallel and Distributed Systems, Vol. 32 (2021), Iss. 8 P.2035
https://doi.org/10.1109/TPDS.2021.3056045 [Citations: 3] -
Fast and accurate solvers for simulating Janus particle suspensions in Stokes flow
Kohl, Ryan | Corona, Eduardo | Cheruvu, Vani | Veerapaneni, ShravanAdvances in Computational Mathematics, Vol. 49 (2023), Iss. 4
https://doi.org/10.1007/s10444-023-10046-y [Citations: 6] -
Enhanced Upward Translations for Systems with Clusters
Rejwer-Kosińska, Ewa | Linkov, Aleksandr | Rybarska-Rusinek, LilianaApplied Sciences, Vol. 14 (2024), Iss. 17 P.7543
https://doi.org/10.3390/app14177543 [Citations: 0] -
Localization of small obstacles from back-scattered data at limited incident angles with full-waveform inversion
Barucq, Hélène | Faucher, Florian | Pham, HaJournal of Computational Physics, Vol. 370 (2018), Iss. P.1
https://doi.org/10.1016/j.jcp.2018.05.011 [Citations: 8] -
A scalable computational platform for particulate Stokes suspensions
Yan, Wen | Corona, Eduardo | Malhotra, Dhairya | Veerapaneni, Shravan | Shelley, MichaelJournal of Computational Physics, Vol. 416 (2020), Iss. P.109524
https://doi.org/10.1016/j.jcp.2020.109524 [Citations: 27] -
A fast, high-order scheme for evaluating volume potentials on complex 2D geometries via area-to-line integral conversion and domain mappings
Anderson, Thomas G. | Zhu, Hai | Veerapaneni, ShravanJournal of Computational Physics, Vol. 472 (2023), Iss. P.111688
https://doi.org/10.1016/j.jcp.2022.111688 [Citations: 4] -
TBFMM: A C++ generic and parallel fast multipole method library
Bramas, Berenger
Journal of Open Source Software, Vol. 5 (2020), Iss. 56 P.2444
https://doi.org/10.21105/joss.02444 [Citations: 1] -
A fast method for imposing periodic boundary conditions on arbitrarily-shaped lattices in two dimensions
Pei, Ruqi | Askham, Travis | Greengard, Leslie | Jiang, ShidongJournal of Computational Physics, Vol. 474 (2023), Iss. P.111792
https://doi.org/10.1016/j.jcp.2022.111792 [Citations: 3] -
Kernel aggregated fast multipole method
Yan, Wen | Blackwell, RobertAdvances in Computational Mathematics, Vol. 47 (2021), Iss. 5
https://doi.org/10.1007/s10444-021-09896-1 [Citations: 8] -
Simulator calibration for accelerator-rich architecture studies
Asri, Mochamad | Pedram, Ardavan | John, Lizy K. | Gerstlauer, Andreas2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS), (2016), P.88
https://doi.org/10.1109/SAMOS.2016.7818335 [Citations: 4] -
Comparative Analysis of Direct Method and Fast Multipole Method for Multirotor Wake Dynamics
Sengupta, B. | Lee, Y. | Araghizadeh, M. S. | Myong, R. S. | Lee, H.International Journal of Aeronautical and Space Sciences, Vol. 25 (2024), Iss. 3 P.789
https://doi.org/10.1007/s42405-023-00699-w [Citations: 0] -
A New Mixed Potential Representation for Unsteady, Incompressible Flow
Greengard, Leslie | Jiang, ShidongSIAM Review, Vol. 61 (2019), Iss. 3 P.733
https://doi.org/10.1137/18M1216158 [Citations: 3] -
A Scalable Hierarchical Semi-Separable Library for Heterogeneous Clusters
Fernando, Isuru Dilanka | Jayasena, Sanath | Fernando, Milinda | Sundar, Hari2017 46th International Conference on Parallel Processing (ICPP), (2017), P.513
https://doi.org/10.1109/ICPP.2017.60 [Citations: 1] -
Scalable topology optimization with the kernel-independent fast multipole method
Ostanin, Igor | Tsybulin, Ivan | Litsarev, Mikhail | Oseledets, Ivan | Zorin, DenisEngineering Analysis with Boundary Elements, Vol. 83 (2017), Iss. P.123
https://doi.org/10.1016/j.enganabound.2017.07.020 [Citations: 3] -
Numerical approximation of the spatially homogeneous Fokker–Planck–Landau equation
Wollman, Stephen
Journal of Computational and Applied Mathematics, Vol. 324 (2017), Iss. P.173
https://doi.org/10.1016/j.cam.2017.04.016 [Citations: 2] -
Fast multipole preconditioners for sparse matrices arising from elliptic equations
Ibeid, Huda | Yokota, Rio | Pestana, Jennifer | Keyes, DavidComputing and Visualization in Science, Vol. 18 (2018), Iss. 6 P.213
https://doi.org/10.1007/s00791-017-0287-5 [Citations: 11] -
Solution of Stokes flow in complex nonsmooth 2D geometries via a linear-scaling high-order adaptive integral equation scheme
Wu, Bowei | Zhu, Hai | Barnett, Alex | Veerapaneni, ShravanJournal of Computational Physics, Vol. 410 (2020), Iss. P.109361
https://doi.org/10.1016/j.jcp.2020.109361 [Citations: 15] -
Fast and scalable evaluation of pairwise potentials
Hughey, S. | Alsnayyan, A. | Aktulga, H.M. | Gao, T. | Shanker, B.Computer Physics Communications, Vol. 255 (2020), Iss. P.107248
https://doi.org/10.1016/j.cpc.2020.107248 [Citations: 1] -
Finite Element Discretizations for Variable-Order Fractional Diffusion Problems
Lei, Wenyu | Turkiyyah, George | Knio, OmarJournal of Scientific Computing, Vol. 97 (2023), Iss. 1
https://doi.org/10.1007/s10915-023-02318-y [Citations: 2] -
Data-Driven Construction of Hierarchical Matrices With Nested Bases
Cai, Difeng | Huang, Hua | Chow, Edmond | Xi, YuanzheSIAM Journal on Scientific Computing, Vol. 46 (2024), Iss. 2 P.S24
https://doi.org/10.1137/22M1500848 [Citations: 2] -
Fast Large-Scale Algorithm for Electromagnetic Wave Propagation in 3D Media
Harris, Mitchell Tong | Langston, M. Harper | Letourneau, Pierre-David | Papanicolaou, George | Ezick, James | Lethin, Richard2019 IEEE High Performance Extreme Computing Conference (HPEC), (2019), P.1
https://doi.org/10.1109/HPEC.2019.8916219 [Citations: 1] -
Optimal control of the self-bound dipolar droplet formation process
Mennemann, J.-F. | Langen, T. | Exl, L. | Mauser, N.J.Computer Physics Communications, Vol. 244 (2019), Iss. P.205
https://doi.org/10.1016/j.cpc.2019.06.002 [Citations: 4] -
A fast multipole algorithm for radiative heat transfer in 3D semitransparent media
Han, Yaochuang | Nie, Yufeng | Dong, HaoJournal of Quantitative Spectroscopy and Radiative Transfer, Vol. 221 (2018), Iss. P.8
https://doi.org/10.1016/j.jqsrt.2018.09.002 [Citations: 3] -
Complex dynamics of long, flexible fibers in shear
LaGrone, John | Cortez, Ricardo | Yan, Wen | Fauci, LisaJournal of Non-Newtonian Fluid Mechanics, Vol. 269 (2019), Iss. P.73
https://doi.org/10.1016/j.jnnfm.2019.06.007 [Citations: 22] -
Extension of the fast multipole method for the rectangular cells with an anisotropic partition tree structure
Andoh, Yoshimichi | Yoshii, Noriyuki | Okazaki, SusumuJournal of Computational Chemistry, Vol. 41 (2020), Iss. 14 P.1353
https://doi.org/10.1002/jcc.26180 [Citations: 6] -
Scalable simulation of realistic volume fraction red blood cell flows through vascular networks
Lu, Libin | Morse, Matthew J. | Rahimian, Abtin | Stadler, Georg | Zorin, DenisProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (2019), P.1
https://doi.org/10.1145/3295500.3356203 [Citations: 11] -
Quadrature by fundamental solutions: kernel-independent layer potential evaluation for large collections of simple objects
Stein, David B. | Barnett, Alex H.Advances in Computational Mathematics, Vol. 48 (2022), Iss. 5
https://doi.org/10.1007/s10444-022-09971-1 [Citations: 6] -
Computation of volume potentials on structured grids with the method of local corrections
Kavouklis, Chris | Colella, PhillipCommunications in Applied Mathematics and Computational Science, Vol. 14 (2019), Iss. 1 P.1
https://doi.org/10.2140/camcos.2019.14.1 [Citations: 4] -
Computational tools for cellular scale biophysics
Stein, David B. | Shelley, Michael J.Current Opinion in Cell Biology, Vol. 89 (2024), Iss. P.102379
https://doi.org/10.1016/j.ceb.2024.102379 [Citations: 1] -
A fast platform for simulating semi-flexible fiber suspensions applied to cell mechanics
Nazockdast, Ehssan | Rahimian, Abtin | Zorin, Denis | Shelley, MichaelJournal of Computational Physics, Vol. 329 (2017), Iss. P.173
https://doi.org/10.1016/j.jcp.2016.10.026 [Citations: 68] -
The Anisotropic Truncated Kernel Method for Convolution with Free-Space Green's Functions
Greengard, Leslie | Jiang, Shidong | Zhang, YongSIAM Journal on Scientific Computing, Vol. 40 (2018), Iss. 6 P.A3733
https://doi.org/10.1137/18M1184497 [Citations: 11] -
Euchromatin Activity Enhances Segregation and Compaction of Heterochromatin in the Cell Nucleus
Mahajan, Achal | Yan, Wen | Zidovska, Alexandra | Saintillan, David | Shelley, Michael J.Physical Review X, Vol. 12 (2022), Iss. 4
https://doi.org/10.1103/PhysRevX.12.041033 [Citations: 13] -
High Performance Computing
Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions
Abduljabbar, Mustafa | Markomanolis, George S. | Ibeid, Huda | Yokota, Rio | Keyes, David2017
https://doi.org/10.1007/978-3-319-58667-0_5 [Citations: 6] -
Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing
Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation
Yokota, Rio | Ibeid, Huda | Keyes, David2017
https://doi.org/10.1007/978-3-319-62426-6_17 [Citations: 3] -
A robust solver for elliptic PDEs in 3D complex geometries
Morse, Matthew J. | Rahimian, Abtin | Zorin, DenisJournal of Computational Physics, Vol. 442 (2021), Iss. P.110511
https://doi.org/10.1016/j.jcp.2021.110511 [Citations: 8] -
A Coarea Formulation for Grid-Based Evaluation of Volume Integrals
Uchytil, Christopher | Storti, DuaneJournal of Computing and Information Science in Engineering, Vol. 20 (2020), Iss. 6
https://doi.org/10.1115/1.4047355 [Citations: 2] -
Cyclically parallelized treecode for fast computations of electrostatic interactions on molecular surfaces
Chen, Jiahui | Geng, Weihua | Reynolds, Daniel R.Computer Physics Communications, Vol. 260 (2021), Iss. P.107742
https://doi.org/10.1016/j.cpc.2020.107742 [Citations: 4] -
A GPU-Accelerated Barycentric Lagrange Treecode
Vaughn, Nathan | Wilson, Leighton | Krasny, Robert2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), (2020), P.701
https://doi.org/10.1109/IPDPSW50202.2020.00125 [Citations: 3] -
An integral equation method for the Cahn-Hilliard equation in the wetting problem
Wei, Xiaoyu | Jiang, Shidong | Klöckner, Andreas | Wang, Xiao-PingJournal of Computational Physics, Vol. 419 (2020), Iss. P.109521
https://doi.org/10.1016/j.jcp.2020.109521 [Citations: 4] -
Spectrally accurate solutions to inhomogeneous elliptic PDE in smooth geometries using function intension
Stein, David B.
Journal of Computational Physics, Vol. 470 (2022), Iss. P.111594
https://doi.org/10.1016/j.jcp.2022.111594 [Citations: 3] -
NcorpiON : A O(N) software for N-body integration in collisional and fragmenting systems
Couturier, Jérémy | Quillen, Alice C. | Nakajima, MikiNew Astronomy, Vol. 114 (2025), Iss. P.102313
https://doi.org/10.1016/j.newast.2024.102313 [Citations: 0] -
A GPU-accelerated fast multipole method based on barycentric Lagrange interpolation and dual tree traversal
Wilson, Leighton | Vaughn, Nathan | Krasny, RobertComputer Physics Communications, Vol. 265 (2021), Iss. P.108017
https://doi.org/10.1016/j.cpc.2021.108017 [Citations: 11] -
Fast topological-shape optimization with boundary elements in two dimensions
Ostanin, Igor A. | Zorin, Denis N. | Oseledets, Ivan V.Russian Journal of Numerical Analysis and Mathematical Modelling, Vol. 32 (2017), Iss. 2 P.127
https://doi.org/10.1515/rnam-2017-0011 [Citations: 1] -
High-precision real-space simulation of electrostatically confined few-electron states
Anderson, Christopher R. | Gyure, Mark F. | Quinn, Sam | Pan, Andrew | Ross, Richard S. | Kiselev, Andrey A.AIP Advances, Vol. 12 (2022), Iss. 6
https://doi.org/10.1063/5.0089350 [Citations: 6] -
PBBFMM3D: A parallel black-box algorithm for kernel matrix-vector multiplication
Wang, Ruoxi | Chen, Chao | Lee, Jonghyun | Darve, EricJournal of Parallel and Distributed Computing, Vol. 154 (2021), Iss. P.64
https://doi.org/10.1016/j.jpdc.2021.04.005 [Citations: 8] -
Boundary Integral Solver Approaches for Particle Accelerator Simulation Problems and Deployment on NERSC Hardware
Wei, Julia | Langston, M. Harper | Letourneau, Pierre-David | Morse, Matthew J. | Weintraub, Larry | Nogoy, Aimee | Amsel, Noah | Lethin, Richard2021 IEEE High Performance Extreme Computing Conference (HPEC), (2021), P.1
https://doi.org/10.1109/HPEC49654.2021.9622793 [Citations: 0] -
Extended Hermite Radial Basis Functions for Sparse Contours Interpolation
Zhong, Deyun | Wang, Liguan | Bi, LinIEEE Access, Vol. 8 (2020), Iss. P.58752
https://doi.org/10.1109/ACCESS.2020.2982802 [Citations: 6] -
Universal image systems for non-periodic and periodic Stokes flows above a no-slip wall
Yan, Wen | Shelley, MichaelJournal of Computational Physics, Vol. 375 (2018), Iss. P.263
https://doi.org/10.1016/j.jcp.2018.08.041 [Citations: 15] -
ExaFMM: a high-performance fast multipole method library with C++ and Python interfaces
Wang, Tingyu | Yokota, Rio | Barba, LorenaJournal of Open Source Software, Vol. 6 (2021), Iss. 61 P.3145
https://doi.org/10.21105/joss.03145 [Citations: 15] -
Fast electrostatic solvers for kinetic Monte Carlo simulations
Saunders, William Robert | Grant, James | Müller, Eike Hermann | Thompson, IanJournal of Computational Physics, Vol. 410 (2020), Iss. P.109379
https://doi.org/10.1016/j.jcp.2020.109379 [Citations: 7] -
Flexibly imposing periodicity in kernel independent FMM: A multipole-to-local operator approach
Yan, Wen | Shelley, MichaelJournal of Computational Physics, Vol. 355 (2018), Iss. P.214
https://doi.org/10.1016/j.jcp.2017.11.012 [Citations: 16] -
A regularization method for solving the Poisson equation for mixed unbounded-periodic domains
Juul Spietz, Henrik | Mølholm Hejlesen, Mads | Walther, Jens HonoréJournal of Computational Physics, Vol. 356 (2018), Iss. P.439
https://doi.org/10.1016/j.jcp.2017.12.018 [Citations: 6] -
Overlapping Domain Decomposition Preconditioner for Integral Equations
Chen, Chao | Biros, GeorgeSIAM Journal on Scientific Computing, Vol. 44 (2022), Iss. 6 P.A3617
https://doi.org/10.1137/21M1442917 [Citations: 0] -
Taylor states in stellarators: A fast high-order boundary integral solver
Malhotra, Dhairya | Cerfon, Antoine | Imbert-Gérard, Lise-Marie | O'Neil, MichaelJournal of Computational Physics, Vol. 397 (2019), Iss. P.108791
https://doi.org/10.1016/j.jcp.2019.06.067 [Citations: 16] -
A Parallel Arbitrary-Order Accurate AMR Algorithm for the Scalar Advection-Diffusion Equation
Bakhtiari, Arash | Malhotra, Dhairya | Raoofy, Amir | Mehl, Miriam | Bungartz, Hans-Joachim | Biros, GeorgeSC16: International Conference for High Performance Computing, Networking, Storage and Analysis, (2016), P.514
https://doi.org/10.1109/SC.2016.43 [Citations: 1] -
Algorithm 967
Malhotra, Dhairya | Biros, GeorgeACM Transactions on Mathematical Software, Vol. 43 (2017), Iss. 2 P.1
https://doi.org/10.1145/2898349 [Citations: 18] -
H2Pack
Huang, Hua | Xing, Xin | Chow, EdmondACM Transactions on Mathematical Software, Vol. 47 (2021), Iss. 1 P.1
https://doi.org/10.1145/3412850 [Citations: 8] -
Accurate and efficient calculation of photoionization in streamer discharges using fast multipole method
Lin, Bo | Zhuang, Chijie | Cai, Zhenning | Zeng, Rong | Bao, WeizhuPlasma Sources Science and Technology, Vol. 29 (2020), Iss. 12 P.125010
https://doi.org/10.1088/1361-6595/abc6f5 [Citations: 3] -
Efficient convergent boundary integral methods for slender bodies
Malhotra, Dhairya | Barnett, AlexJournal of Computational Physics, Vol. 503 (2024), Iss. P.112855
https://doi.org/10.1016/j.jcp.2024.112855 [Citations: 3]