A Survey on Parallel Computing and Its Applications in Data-Parallel Problems Using GPU Architectures

Cristóbal A. Navarro; Nancy Hitschfeld-Kahler; Luis Mateu

doi:10.4208/cicp.110113.010813a

Year: 2014

Author: Cristóbal A. Navarro, Nancy Hitschfeld-Kahler, Luis Mateu

Communications in Computational Physics, Vol. 15 (2014), Iss. 2 : pp. 285–329

Abstract

Parallel computing has become an important subject in the field of computer science and has proven to be critical when researching high performance solutions. The evolution of computer architectures (multi-core and many-core) towards a higher number of cores can only confirm that parallelism is the method of choice for speeding up an algorithm. In the last decade, the graphics processing unit, or GPU, has gained an important place in the field of high performance computing (HPC) because of its low cost and massive parallel processing power. Super-computing has become, for the first time, available to anyone at the price of a desktop computer. In this paper, we survey the concept of parallel computing and especially GPU computing. Achieving efficient parallel algorithms for the GPU is not a trivial task; there are several technical restrictions that must be satisfied in order to achieve the expected performance. Some of these limitations are consequences of the underlying architecture of the GPU and the theoretical models behind it. Our goal is to present a set of theoretical and technical concepts that are often required to understand the GPU and its massive parallelism model. In particular, we show how this new technology can help the field of computational physics, especially when the problem is data-parallel. We present four examples of computational physics problems: n-body, collision detection, Potts model and cellular automata simulations. These examples are representative of the kind of problems that are suitable for GPU computing. By understanding the GPU architecture and its massive parallelism programming model, one can overcome many of the technical limitations found along the way, design better GPU-based algorithms for computational physics problems and achieve speedups that can reach up to two orders of magnitude when compared to sequential implementations.

Journal Article Details

Publisher Name: Global Science Press

Language: English

DOI: https://doi.org/10.4208/cicp.110113.010813a

Published online: 2014-01

Pages: 45

A scalable and energy efficient GPU thread map for m-simplex domains
Navarro, Cristóbal A. | Quezada, Felipe A. | Bustos, Benjamin | Hitschfeld, Nancy | Kindelan, Rolando
Future Generation Computer Systems, Vol. 141 (2023), Iss. P.651
https://doi.org/10.1016/j.future.2022.12.020 [Citations: 0]
Potential benefits of a block-space GPU approach for discrete tetrahedral domains
Navarro, Cristobal A. | Bustos, Benjamin | Hitschfeld, Nancy
2016 XLII Latin American Computing Conference (CLEI), (2016), P.1
https://doi.org/10.1109/CLEI.2016.7833394 [Citations: 4]
Analysis of Global and Local Synchronization in Parallel Computing
Cicirelli, Franco | Giordano, Andrea | Mastroianni, Carlo
IEEE Transactions on Parallel and Distributed Systems, Vol. 32 (2021), Iss. 5 P.988
https://doi.org/10.1109/TPDS.2020.3037469 [Citations: 10]
Francis 99 CFD through RapidCFD accelerated GPU code
Molinero, D | Galván, S | Domínguez, F. | Ibarra, L | Solorio, G
IOP Conference Series: Earth and Environmental Science, Vol. 774 (2021), Iss. 1 P.012016
https://doi.org/10.1088/1755-1315/774/1/012016 [Citations: 2]
GPU Tensor Cores for Fast Arithmetic Reductions
Navarro, Cristobal A. | Carrasco, Roberto | Barrientos, Ricardo J. | Riquelme, Javier A. | Vega, Raimundo
IEEE Transactions on Parallel and Distributed Systems, Vol. 32 (2021), Iss. 1 P.72
https://doi.org/10.1109/TPDS.2020.3011893 [Citations: 32]
Synthesis and feedback on the distribution and parallelization of FMI-CS-based co-simulations with the DACCOSIM platform
Dad, Cherifa | Tavella, Jean-Philippe | Vialle, Stéphane
Parallel Computing, Vol. 106 (2021), Iss. P.102802
https://doi.org/10.1016/j.parco.2021.102802 [Citations: 1]
Modified Fully Homomorphic Encryption based on Parallel Processing in Cloud Computing
Tandel, Parth | Shubhrant, Abhinav | Sohani, Mayank
International Journal of Scientific Research in Computer Science, Engineering and Information Technology, Vol. (2021), Iss. P.250
https://doi.org/10.32628/CSEIT217252 [Citations: 1]
Follow the Leader: Alternating CPU/GPU Computations in PDES
Marotta, Romolo | Pellegrini, Alessandro | Andelfinger, Philipp
Proceedings of the 38th ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (2024), P.47
https://doi.org/10.1145/3615979.3656056 [Citations: 2]
Integration of CPU and GPU to accelerate RSA modular exponentiation operation
Razaque, Abdul | Jinrui, Wang | Zancheng, Wang | Hani, Qassim Bani | Khaskheli, Murad Ali | Bhutto, Waseem Ahmed
2018 IEEE Long Island Systems, Applications and Technology Conference (LISAT), (2018), P.1
https://doi.org/10.1109/LISAT.2018.8378036 [Citations: 0]
Solving Poisson’s equation using FFT in a GPU cluster
Jodra, Jose L. | Gurrutxaga, Ibai | Muguerza, Javier | Yera, Ainhoa
Journal of Parallel and Distributed Computing, Vol. 102 (2017), Iss. P.28
https://doi.org/10.1016/j.jpdc.2016.09.004 [Citations: 6]
Inverse characterization of composite materials via surrogate modeling
Steuben, John | Michopoulos, John | Iliopoulos, Athanasios | Turner, Cameron
Composite Structures, Vol. 132 (2015), Iss. P.694
https://doi.org/10.1016/j.compstruct.2015.05.029 [Citations: 19]
A GPU-based parallel Object kinetic Monte Carlo algorithm for the evolution of defects in irradiated materials
Jiménez, F. | Ortiz, C.J.
Computational Materials Science, Vol. 113 (2016), Iss. P.178
https://doi.org/10.1016/j.commatsci.2015.11.011 [Citations: 22]
Heterogeneous parallel computing accelerated iterative subpixel digital image correlation
Huang, JianWen | Zhang, LingQi | Jiang, ZhenYu | Dong, ShouBin | Chen, Wei | Liu, YiPing | Liu, ZeJia | Zhou, LiCheng | Tang, LiQun
Science China Technological Sciences, Vol. 61 (2018), Iss. 1 P.74
https://doi.org/10.1007/s11431-017-9168-0 [Citations: 25]
Emerging Research Surrounding Power Consumption and Performance Issues in Utility Computing

History and Evolution of GPU Architecture
Das, Prashanta Kumar | Deka, Ganesh Chandra
2016
https://doi.org/10.4018/978-1-4666-8853-7.ch006 [Citations: 1]
Intelligent Scene Modeling and Human-Computer Interaction

Model Reconstruction of Real-World 3D Objects: An Application with Microsoft HoloLens
Jung, Younhyun | Wu, Yuhao | Jung, Hoijoon | Kim, Jinman
2021
https://doi.org/10.1007/978-3-030-71002-6_6 [Citations: 0]
MC64-Cluster: Many-Core CPU Cluster Architecture and Performance Analysis in B-Tree Searches
Esteban, Francisco José | Díaz, David | Hernández, Pilar | Caballero, Juan Antonio | Dorado, Gabriel | Gálvez, Sergio
The Computer Journal, Vol. 61 (2018), Iss. 6 P.912
https://doi.org/10.1093/comjnl/bxx114 [Citations: 2]
Minimization of high computational cost in data preprocessing and modeling using MPI4Py
Oluwasakin, E. | Torku, T. | Tingting, S. | Yinusa, A. | Hamdan, S. | Poudel, S. | Hasan, N. | Vargas, J. | Poudel, K.
Machine Learning with Applications, Vol. 13 (2023), Iss. P.100483
https://doi.org/10.1016/j.mlwa.2023.100483 [Citations: 5]
Massively Parallel Discrete Element Method Simulations on Graphics Processing Units
Steuben, John | Mustoe, Graham | Turner, Cameron
Journal of Computing and Information Science in Engineering, Vol. 16 (2016), Iss. 3
https://doi.org/10.1115/1.4033724 [Citations: 4]
DARIO: Differentiable Vision Transformer Pruning With Low-Cost Proxies
Sun, Haozhe | Heuillet, Alexandre | Mohr, Felix | Tabia, Hedi
IEEE Journal of Selected Topics in Signal Processing, Vol. 18 (2024), Iss. 6 P.997
https://doi.org/10.1109/JSTSP.2024.3501685 [Citations: 0]
GPU based numerical simulation of core shooting process
Zhang, Yi-zhong | Lu, Gao-chun | Ni, Chang-jiang | Jing, Tao | Yang, Lin-long | Wu, Qin-fang
China Foundry, Vol. 14 (2017), Iss. 5 P.392
https://doi.org/10.1007/s41230-017-7172-1 [Citations: 2]
PCIe-based FPGA-GPU heterogeneous computation for real-time multi-emitter fitting in super-resolution localization microscopy
Gui, Dan | Chen, Yunjiu | Kuang, Weibing | Shang, Mingtao | Zhang, Yingjun | Huang, Zhen-Li
Biomedical Optics Express, Vol. 13 (2022), Iss. 6 P.3401
https://doi.org/10.1364/BOE.459198 [Citations: 4]
GPU Maps for the Space of Computation in Triangular Domain Problems
Navarro, Cristobal A. | Hitschfeld, Nancy
2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), (2014), P.375
https://doi.org/10.1109/HPCC.2014.64 [Citations: 9]
Efficient GPU thread mapping on embedded 2D fractals
Navarro, Cristobál A. | Quezada, Felipe A. | Hitschfeld, Nancy | Vega, Raimundo | Bustos, Benjamin
Future Generation Computer Systems, Vol. 113 (2020), Iss. P.158
https://doi.org/10.1016/j.future.2020.07.006 [Citations: 6]
An Evaluation of Directive-Based Parallelization on the GPU Using a Parboil Benchmark
Đukić, Jovan | Mišić, Marko
Electronics, Vol. 12 (2023), Iss. 22 P.4555
https://doi.org/10.3390/electronics12224555 [Citations: 2]
Co-Processing Parallel Computation for Distributed Optical Fiber Vibration Sensing
Wang, Yu | Lv, Yuejuan | Jin, Baoquan | Xu, Yuelin | Chen, Yu | Liu, Xin | Bai, Qing
Applied Sciences, Vol. 10 (2020), Iss. 5 P.1747
https://doi.org/10.3390/app10051747 [Citations: 4]
An Empirical Investigation of a Fault Tolerant Containerized Application Deployment
Bisht, Sankalp Singh | Kaur, Parmeet
2022 1st International Conference on Informatics (ICI), (2022), P.171
https://doi.org/10.1109/ICI53355.2022.9786896 [Citations: 0]
Faster search for long gravitational-wave transients: GPU implementation of the transient $ \newcommand{\F}{\mathcal{F}}\boldsymbol{ \F}$ -statistic
Keitel, David | Ashton, Gregory
Classical and Quantum Gravity, Vol. 35 (2018), Iss. 20 P.205003
https://doi.org/10.1088/1361-6382/aade34 [Citations: 14]
BiqBin: A Parallel Branch-and-bound Solver for Binary Quadratic Problems with Linear Constraints
Gusmeroli, Nicolò | Hrga, Timotej | Lužar, Borut | Povh, Janez | Siebenhofer, Melanie | Wiegele, Angelika
ACM Transactions on Mathematical Software, Vol. 48 (2022), Iss. 2 P.1
https://doi.org/10.1145/3514039 [Citations: 7]
Coding Dimensions and the Power of Finite Element, Volume, and Difference Methods

Parallel Computing Techniques
Tayyeh, Alnoman Mundher | Shather, Akram H. | Anaz, Saja Sumiea | Jasim, Firas T.
2024
https://doi.org/10.4018/979-8-3693-3964-0.ch006 [Citations: 0]
Towards a GPU accelerated spatial computing framework
Chavan, Harshada | Alghamdi, Rami | Mokbel, Mohamed F.
2016 IEEE 32nd International Conference on Data Engineering Workshops (ICDEW), (2016), P.135
https://doi.org/10.1109/ICDEW.2016.7495634 [Citations: 3]
Improving performance of GPU code using novel features of the NVIDIA kepler architecture
Li, Yuanzhe | Schwiebert, Loren | Hailat, Eyad | Mick, Jason | Potoff, Jeffrey
Concurrency and Computation: Practice and Experience, Vol. 28 (2016), Iss. 13 P.3586
https://doi.org/10.1002/cpe.3744 [Citations: 8]
Reducing the replication time for structural estimations: A successful replication of “An Anatomy of International Trade” using GPU computing

Zhong, Jiatong

Economic Inquiry, Vol. 63 (2025), Iss. 2 P.424
https://doi.org/10.1111/ecin.13257 [Citations: 1]
Proceedings of the Future Technologies Conference (FTC) 2023, Volume 3

Parallel Fingerprint Recognition Using Generalized Hough Transform in a Virtual Grid
Zerbo, Ali | Ouedraogo, Moïse | Sere, Abdoulaye
2023
https://doi.org/10.1007/978-3-031-47457-6_35 [Citations: 0]
Turbomachinery GPU Accelerated CFD: An Insight into Performance
Molinero-Hernández, Daniel | Galván-González, Sergio R. | Herrera-Sandoval, Nicolás D. | Guzman-Avalos, Pablo | Pacheco-Ibarra, J. Jesús | Domínguez-Mota, Francisco J.
Computation, Vol. 12 (2024), Iss. 3 P.57
https://doi.org/10.3390/computation12030057 [Citations: 0]
Adaptive multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model
Navarro, Cristóbal A. | Huang, Wei | Deng, Youjin
Computer Physics Communications, Vol. 205 (2016), Iss. P.48
https://doi.org/10.1016/j.cpc.2016.04.007 [Citations: 12]
Acoustic Vibration of a Fluid in a Three-Dimensional Cavity: Finite Element Method Simulation using CUDA and MATLAB
Chango, Juan F. | Navarro, Cristobal A. | Gonzalez-Montenegro, Mario A.
2018 37th International Conference of the Chilean Computer Science Society (SCCC), (2018), P.1
https://doi.org/10.1109/SCCC.2018.8705226 [Citations: 0]
Lattice Monte Carlo simulation of Galilei variant anomalous diffusion
Guo, Gang | Bittig, Arne | Uhrmacher, Adelinde
Journal of Computational Physics, Vol. 288 (2015), Iss. P.167
https://doi.org/10.1016/j.jcp.2015.02.017 [Citations: 2]
Big Data Analytics Using Cloud Computing Based Frameworks for Power Management Systems: Status, Constraints, and Future Recommendations
AL-Jumaili, Ahmed Hadi Ali | Muniyandi, Ravie Chandren | Hasan, Mohammad Kamrul | Paw, Johnny Koh Siaw | Singh, Mandeep Jit
Sensors, Vol. 23 (2023), Iss. 6 P.2952
https://doi.org/10.3390/s23062952 [Citations: 50]
A GPU-Accelerated Filtered Density Function Simulator of Turbulent Reacting Flows
Inkarbekov, M. | Aitzhan, A. | Kaltayev, A. | Sammak, S.
International Journal of Computational Fluid Dynamics, Vol. 34 (2020), Iss. 6 P.381
https://doi.org/10.1080/10618562.2020.1787996 [Citations: 4]
Fusion of Calling Sites
do Couto Teixeira, Douglas | Collange, Caroline | Pereira, Fernando Magno Quintao
2015 27th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), (2015), P.90
https://doi.org/10.1109/SBAC-PAD.2015.16 [Citations: 1]
An efficient infinity norm minimization algorithm for under-determined inverse problems

Rateb, Ahmad M.

Digital Signal Processing, Vol. 156 (2025), Iss. P.104818
https://doi.org/10.1016/j.dsp.2024.104818 [Citations: 0]
Efficiency analysis of discontinuous Galerkin approaches for the application onto quantum Liouville-type equations
Ganiu, Valmir | Schulz, Dirk
Journal of Computational Electronics, Vol. 23 (2024), Iss. 4 P.718
https://doi.org/10.1007/s10825-024-02178-1 [Citations: 0]
Implementation of Floating‐Point Arithmetic Processing on Content Addressable Memory‐Based Massive‐ParallelSIMDmatriX Core
Kageyama, Kyosuke | Arai, Sota | Hamano, Hajime | Kong, Xiangbo | Koide, Tetsushi | Kumaki, Takeshi
IEEJ Transactions on Electrical and Electronic Engineering, Vol. 18 (2023), Iss. 4 P.546
https://doi.org/10.1002/tee.23753 [Citations: 3]
High-Performance and Parallel Computing Techniques Review: Applications, Challenges and Potentials to Support Net-Zero Transition of Future Grids
Al-Shafei, Ahmed | Zareipour, Hamidreza | Cao, Yankai
Energies, Vol. 15 (2022), Iss. 22 P.8668
https://doi.org/10.3390/en15228668 [Citations: 3]
Gamification-Based E-Learning Strategies for Computer Programming Education

Applying Gamification in a Parallel Programming Course
Fresno, Javier | Ortega-Arranz, Hector | Ortega-Arranz, Alejandro | Gonzalez-Escribano, Arturo | Llanos, Diego R.
2017
https://doi.org/10.4018/978-1-5225-1034-5.ch006 [Citations: 1]
Comparison of Pulse Compression Algorithm Implementations on Various Hardware Platforms

Dróżka, Mateusz

2023 Signal Processing Symposium (SPSympo), (2023), P.44
https://doi.org/10.23919/SPSympo57300.2023.10302685 [Citations: 0]
ESIREOS: Efficient, Scalable, Internal, Relative Evaluation of Outliers Solutions
Alves, William A. | Marques, Henrique O. | Naldi, Murilo C. | Sander, Jörg
2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS), (2023), P.555
https://doi.org/10.1109/ICPADS60453.2023.00088 [Citations: 0]
Irregular alignment of arbitrarily long DNA sequences on GPU
Perez-Wohlfeil, Esteban | Trelles, Oswaldo | Guil, Nicolás
The Journal of Supercomputing, Vol. 79 (2023), Iss. 8 P.8699
https://doi.org/10.1007/s11227-022-05007-z [Citations: 3]
Supercomputing

Multi GPU Implementation to Accelerate the CFD Simulation of a 3D Turbo-Machinery Benchmark Using the RapidCFD Library
Molinero, Daniel | Galván, Sergio | Pacheco, Jesús | Herrera, Nicolás
2019
https://doi.org/10.1007/978-3-030-38043-4_15 [Citations: 0]
RTX-RSim
Thoman, Peter | Wippler, Markus | Hranitzky, Robert | Fahringer, Thomas
Proceedings of the International Workshop on OpenCL, (2020), P.1
https://doi.org/10.1145/3388333.3388662 [Citations: 4]
A Mesh Reduced Method for Speeding Up Structured Grid-Based Water Quantity and Quality Models in Large-Scale River Networks
Kang, Jin | Wang, Yonggui | Xu, Jing | Yang, Shuihua | Hou, Haobo
Water, Vol. 11 (2019), Iss. 3 P.437
https://doi.org/10.3390/w11030437 [Citations: 1]
A computational discussion on brain topodynamics

Henry, Christopher J.

Physics of Life Reviews, Vol. 21 (2017), Iss. P.32
https://doi.org/10.1016/j.plrev.2017.04.007 [Citations: 1]
A GPU-Oriented Application Programming Interface for Digital Audio Workstations
Bianchi, Daniele | Avanzini, Federico | Barate, Adriano | Ludovico, Luca A. | Presti, Giorgio
IEEE Transactions on Parallel and Distributed Systems, Vol. 33 (2022), Iss. 8 P.1924
https://doi.org/10.1109/TPDS.2021.3131659 [Citations: 1]
Multi‐GPU room response simulation with hardware raytracing
Thoman, Peter | Wippler, Markus | Hranitzky, Robert | Gschwandtner, Philipp | Fahringer, Thomas
Concurrency and Computation: Practice and Experience, Vol. 34 (2022), Iss. 4
https://doi.org/10.1002/cpe.6663 [Citations: 2]
Daisen: A Framework for Visualizing Detailed GPU Execution
Sun, Yifan | Zhang, Yixuan | Mosallaei, Ali | Shah, Michael D. | Dunne, Cody | Kaeli, David
Computer Graphics Forum, Vol. 40 (2021), Iss. 3 P.239
https://doi.org/10.1111/cgf.14303 [Citations: 9]
New Advances of the P-SBAS Approach for an Efficient Parallel Processing of Large Volumes of Full-Resolution Multitemporal DInSAR Interferograms
Bonano, Manuela | Striano, Pasquale | Yasir, Muhammad | Buonanno, Sabatino | Casu, Francesco | De Luca, Claudio | Fusco, Adele | Roa, Yenni Lorena Belen | Zinno, Ivana | Virelli, Maria | Manunta, Michele | Lanari, Riccardo
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 18 (2025), Iss. P.2317
https://doi.org/10.1109/JSTARS.2024.3507542 [Citations: 0]
Learning Binary Descriptors for Fingerprint Indexing
Bai, Chaochao | Li, Mingqiang | Zhao, Tong | Wang, Weiqiang
IEEE Access, Vol. 6 (2018), Iss. P.1583
https://doi.org/10.1109/ACCESS.2017.2779562 [Citations: 5]
Improving fuzzy C-mean-based community detection in social networks using dynamic parallelism
Al-Ayyoub, Mahmoud | Al-andoli, Mohammed | Jararweh, Yaser | Smadi, Mohammad | Gupta, Brij
Computers & Electrical Engineering, Vol. 74 (2019), Iss. P.533
https://doi.org/10.1016/j.compeleceng.2018.01.003 [Citations: 20]
A survey on graphic processing unit computing for large‐scale data mining

Cano, Alberto

WIREs Data Mining and Knowledge Discovery, Vol. 8 (2018), Iss. 1
https://doi.org/10.1002/widm.1232 [Citations: 50]
A novel shadow calculation approach based on multithreaded parallel computing
Zhou, Xin | Shen, Xiaohan | Liu, Zhaoru | Sun, Hongsan | An, Jingjing | Yan, Da
Energy and Buildings, Vol. 312 (2024), Iss. P.114237
https://doi.org/10.1016/j.enbuild.2024.114237 [Citations: 0]
Accelerating range minimum queries with ray tracing cores
Meneses, Enzo | Navarro, Cristóbal A. | Ferrada, Héctor | Quezada, Felipe A.
Future Generation Computer Systems, Vol. 157 (2024), Iss. P.98
https://doi.org/10.1016/j.future.2024.03.040 [Citations: 2]
GPU-Based Data Processing for 2-D Microwave Imaging on MAST
Chorley, J. C | Akers, R. J | Brunner, K. J | Dipper, N. A | Freethy, S. J | Sharples, R. M | Shevchenko, V. F | Thomas, D. A | Vann, R. G. L
Fusion Science and Technology, Vol. 69 (2016), Iss. 3 P.643
https://doi.org/10.13182/FST15-188 [Citations: 3]
Use of GPUs to boost the performance of a lattice-free tumour growth model
Stella, Sabrina | Chignola, Roberto | Milotti, Edoardo
Journal of Physics: Conference Series, Vol. 566 (2014), Iss. P.012019
https://doi.org/10.1088/1742-6596/566/1/012019 [Citations: 0]
GPU parallel simulation algorithm of Brownian particles with excluded volume using Delaunay triangulations
Carter, Francisco | Hitschfeld, Nancy | Navarro, Cristóbal A. | Soto, Rodrigo
Computer Physics Communications, Vol. 229 (2018), Iss. P.148
https://doi.org/10.1016/j.cpc.2018.04.006 [Citations: 7]
Algorithms of Machine Learning and Application for Signal Compensation

Peng, Yudong

Highlights in Science, Engineering and Technology, Vol. 70 (2023), Iss. P.571
https://doi.org/10.54097/hset.v70i.13985 [Citations: 1]
Active and passive cooling techniques of graphical processing units in automotive applications - a review
Praveen, S M | A, Rammohan
Engineering Research Express, Vol. 6 (2024), Iss. 2 P.022506
https://doi.org/10.1088/2631-8695/ad513b [Citations: 1]
Efficient microscopy image analysis on CPU-GPU systems with cost-aware irregular data partitioning
Barreiros, Willian | Melo, Alba C.M.A. | Kong, Jun | Ferreira, Renato | Kurc, Tahsin M. | Saltz, Joel H. | Teodoro, George
Journal of Parallel and Distributed Computing, Vol. 164 (2022), Iss. P.40
https://doi.org/10.1016/j.jpdc.2022.02.004 [Citations: 3]
ShaderNet
Zhao, Lin | Khan, Arijit | Luo, Robby
Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), (2022), P.1
https://doi.org/10.1145/3534540.3534688 [Citations: 2]
A Tool for Translating Sequential Source Code to Parallel Code Written in C++ and OpenACC
Alsubhi, K. | Alsolami, F. | Algarni, A. | Albassam, E. | Khemakhem, M. | Eassa, F. | Jambi, K. | Ashraf, M. Usman
2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), (2019), P.1
https://doi.org/10.1109/AICCSA47632.2019.9035292 [Citations: 4]
Advances in Medical Image Segmentation: A Comprehensive Review of Traditional, Deep Learning and Hybrid Approaches
Xu, Yan | Quan, Rixiang | Xu, Weiting | Huang, Yi | Chen, Xiaolong | Liu, Fengyuan
Bioengineering, Vol. 11 (2024), Iss. 10 P.1034
https://doi.org/10.3390/bioengineering11101034 [Citations: 18]
Robot Intelligence Technology and Applications 6

Elimination of Race Condition During GPU Acceleration of Probabilistic Height Map
Kwon, Soonpyo | Byun, Juwoong | Park, Hae-Won
2022
https://doi.org/10.1007/978-3-030-97672-9_28 [Citations: 0]
Simultaneous detection for multiple anomaly data in internet of energy based on random forest
Li, Qiang | Zhang, Limei | Zhang, Guanghui | Ouyang, Hanyi | Bai, Muke
Applied Soft Computing, Vol. 134 (2023), Iss. P.109993
https://doi.org/10.1016/j.asoc.2023.109993 [Citations: 4]
Maximal clique enumeration problem on graphs: status and challenges
许, 绍显 | 廖, 小飞 | 邵, 志远 | 华, 强胜 | 金, 海
SCIENTIA SINICA Informationis, Vol. 52 (2022), Iss. 5 P.784
https://doi.org/10.1360/SSI-2021-0155 [Citations: 3]
A high-speed tracking algorithm for dense granular media
Cerda, Mauricio | Navarro, Cristóbal A. | Silva, Juan | Waitukaitis, Scott R. | Mujica, Nicolás | Hitschfeld, Nancy
Computer Physics Communications, Vol. 227 (2018), Iss. P.8
https://doi.org/10.1016/j.cpc.2018.02.010 [Citations: 10]
Job Scheduling Strategies for Parallel Processing

Memory-Aware Latency Prediction Model for Concurrent Kernels in Partitionable GPUs: Simulations and Experiments
Masola, Alessio | Capodieci, Nicola | Cavicchioli, Roberto | Olmedo, Ignacio Sanudo | Rouxel, Benjamin
2023
https://doi.org/10.1007/978-3-031-43943-8_3 [Citations: 0]
Rough Sets and Knowledge Technology

A Parallel Matrix-Based Approach for Computing Approximations in Dominance-Based Rough Sets Approach
Li, Shaoyong | Li, Tianrui
2014
https://doi.org/10.1007/978-3-319-11740-9_17 [Citations: 4]
A Systematic Study of Parallelization Strategies for Optimizing Scientific Computing Performance Bounds
Saravanan, Vijayalakshmi | Navuluru, Sai Karthik | Ibrahim, Khaled Z
2024 IEEE 37th International System-on-Chip Conference (SOCC), (2024), P.1
https://doi.org/10.1109/SOCC62300.2024.10737865 [Citations: 0]
GGArray: A Dynamically Growable GPU Array
Meneses, Enzo | Navarro, Cristobal A. | Ferrada, Hector
2022 41st International Conference of the Chilean Computer Science Society (SCCC), (2022), P.1
https://doi.org/10.1109/SCCC57464.2022.10000385 [Citations: 0]
State of the Art in Parallel and Distributed Systems: Emerging Trends and Challenges
Dai, Fei | Hossain, Md Akbar | Wang, Yi
Electronics, Vol. 14 (2025), Iss. 4 P.677
https://doi.org/10.3390/electronics14040677 [Citations: 0]
Analyzing GPU Tensor Core Potential for Fast Reductions
Carrasco, Roberto | Vega, Raimundo | Navarro, Cristobal A.
2018 37th International Conference of the Chilean Computer Science Society (SCCC), (2018), P.1
https://doi.org/10.1109/SCCC.2018.8705253 [Citations: 6]
RRR-Net: Reusing, Reducing, and Recycling a Deep Backbone Network
Sun, Haozhe | Guyon, Isabelle | Mohr, Felix | Tabia, Hedi
2023 International Joint Conference on Neural Networks (IJCNN), (2023), P.1
https://doi.org/10.1109/IJCNN54540.2023.10191770 [Citations: 1]
AAP4All: An Adaptive Auto Parallelization of Serial Code for HPC Systems
Usman Ashraf, M. | Alburaei Eassa, Fathy | J. Osterweil, Leon | Ahmad Albeshri, Aiiad | Algarni, Abdullah | Ilyas, Iqra
Intelligent Automation & Soft Computing, Vol. 29 (2021), Iss. 3 P.615
https://doi.org/10.32604/iasc.2021.019044 [Citations: 2]
InSAR Greece with Parallelized Persistent Scatterer Interferometry: A National Ground Motion Service for Big Copernicus Sentinel-1 Data
Papoutsis, Ioannis | Kontoes, Charalampos | Alatza, Stavroula | Apostolakis, Alexis | Loupasakis, Constantinos
Remote Sensing, Vol. 12 (2020), Iss. 19 P.3207
https://doi.org/10.3390/rs12193207 [Citations: 35]
GPU Implementation of Adaptive Fourier Decomposition

Borowicz, Adam

2019 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), (2019), P.17
https://doi.org/10.23919/SPA.2019.8936752 [Citations: 1]
Optimized thread-block arrangement in a GPU implementation of a linear solver for atmospheric chemistry mechanisms
Guzman Ruiz, Christian | Acosta, Mario | Jorba, Oriol | Cesar Galobardes, Eduardo | Dawson, Matthew | Oyarzun, Guillermo | Pérez García-Pando, Carlos | Serradell, Kim
Computer Physics Communications, Vol. 302 (2024), Iss. P.109240
https://doi.org/10.1016/j.cpc.2024.109240 [Citations: 1]
Hardware Interrupt and CPU Contention aware CPU/GPU Co-Scheduling on Multi-Cluster System
Hwang, Sunjun | Choi, Jin | Yoo, Seohwan | Park, Hayeon | Lee, Chang-Gun
2022 5th International Conference on Information and Computer Technologies (ICICT), (2022), P.117
https://doi.org/10.1109/ICICT55905.2022.00028 [Citations: 2]
A Visual MapReduce Program Development Environment for Heterogeneous Computing on Clouds
Liang, Tyng-Yeu | Yeh, Li-Wei | Wu, Chi-Hong
Proceedings of the 2018 International Conference on Computing and Data Engineering, (2018), P.83
https://doi.org/10.1145/3219788.3219800 [Citations: 0]
A Fast Algorithm Based on Apriori Algorithms to Explore the Set of Repetitive Items of Large Transaction Data
Ghofrani, Javad | Bozorgmehr, Arezoo | Panah, Amir
Proceedings of the 2nd International Conference on Compute and Data Analysis, (2018), P.13
https://doi.org/10.1145/3193077.3193089 [Citations: 0]
A hadoop based platform for natural language processing of web pages and documents
Nesi, Paolo | Pantaleo, Gianni | Sanesi, Gianmarco
Journal of Visual Languages & Computing, Vol. 31 (2015), Iss. P.130
https://doi.org/10.1016/j.jvlc.2015.10.017 [Citations: 28]
Innovative Research and Applications in Next-Generation High Performance Computing

Hardware Transactional Memories
Shahid, Arsalan | Murad, Maryam | Qadri, Muhammad Yasir | Qadri, Nadia N. | Ahmed, Jameel
2016
https://doi.org/10.4018/978-1-5225-0287-6.ch003 [Citations: 1]
CLUS_GPU-BLASTP: accelerated protein sequence alignment using GPU-enabled cluster
Rani, Sita | Gupta, O. P.
The Journal of Supercomputing, Vol. 73 (2017), Iss. 10 P.4580
https://doi.org/10.1007/s11227-017-2036-4 [Citations: 8]
RenderBench: The CPU Rendering Benchmark Suite Based on Microarchitecture-Independent Characteristics
Wang, Peng | Yu, Zhibin
Electronics, Vol. 12 (2023), Iss. 19 P.4153
https://doi.org/10.3390/electronics12194153 [Citations: 0]
A review on light transport algorithms and simulation tools to model daylighting inside buildings

Ayoub, Mohammed

Solar Energy, Vol. 198 (2020), Iss. P.623
https://doi.org/10.1016/j.solener.2020.02.018 [Citations: 46]
A Study of Memory Consumption and Execution Performance of the cuFFT Library
Jodra, Jose Luis | Gurrutxaga, Ibai | Muguerza, Javier
2015 10th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), (2015), P.323
https://doi.org/10.1109/3PGCIC.2015.66 [Citations: 7]
A multi-GPU approach for the exchange Monte Carlo method
Navarro, Cristobal A. | Wei, Huang | Deng, Youjin
2015 34th International Conference of the Chilean Computer Science Society (SCCC), (2015), P.1
https://doi.org/10.1109/SCCC.2015.7416571 [Citations: 0]
Fast kNN query processing over a multi-node GPU environment
Barrientos, Ricardo J. | Riquelme, Javier A. | Hernández-García, Ruber | Navarro, Cristóbal A. | Soto-Silva, Wladimir
The Journal of Supercomputing, Vol. 78 (2022), Iss. 2 P.3045
https://doi.org/10.1007/s11227-021-03975-2 [Citations: 8]
GPU-accelerated rectangular decomposition for sound propagation modeling in 2D
Chango, Juan F. | Navarro, Cristobal A. | Gonzalez-Montenegro, Mario A.
2019 38th International Conference of the Chilean Computer Science Society (SCCC), (2019), P.1
https://doi.org/10.1109/SCCC49216.2019.8966434 [Citations: 1]
Speeding up the patch ordering method for image denoising
Munir, Badre | Hussain, Syed Fawad | Noor, Adnan
Multimedia Tools and Applications, Vol. 78 (2019), Iss. 16 P.23639
https://doi.org/10.1007/s11042-019-7708-z [Citations: 1]
Implementing a Reduction Clause to Overcome Critical Section Deficiencies in Parallel Computing
Syamsuddin, Sadly | Jufri, Jufri | Akhriana, Asmah | Ahyuna, Ahyuna | Rahman, Baharuddin | Djamro, Risnayanti Andi | Samsie, Indra
2024 IEEE International Conference on Control & Automation, Electronics, Robotics, Internet of Things, and Artificial Intelligence (CERIA), (2024), P.1
https://doi.org/10.1109/CERIA64726.2024.10914812 [Citations: 0]
Cooperative modular reinforcement learning for large discrete action space problem
Ming, Fangzhu | Gao, Feng | Liu, Kun | Zhao, Chengmei
Neural Networks, Vol. 161 (2023), Iss. P.281
https://doi.org/10.1016/j.neunet.2023.01.046 [Citations: 10]
Using GPUs to speed-up FCM-based community detection in Social Networks
Alandoli, Mohammed | Shehab, Mohammed | Al-Ayyoub, Mahmoud | Jararweh, Yaser | Al-Smadi, Mohammad
2016 7th International Conference on Computer Science and Information Technology (CSIT), (2016), P.1
https://doi.org/10.1109/CSIT.2016.7549467 [Citations: 14]
Analysis of a Self-Similar GPU Thread Map for Data-parallel m-Simplex Domains
Navarro, Cristobal A. | Bustos, Benjamin | Hitschfeld, Nancy
2019 International Conference on High Performance Computing & Simulation (HPCS), (2019), P.1002
https://doi.org/10.1109/HPCS48598.2019.9188081 [Citations: 0]
Performance Analysis of Multi-GPU Implementations of Krylov-Subspace Methods Applied to FEA of Electromagnetic Phenomena
Peixoto de Camargos, Ana Flavia | Silva, Viviane Cristine
IEEE Transactions on Magnetics, Vol. 51 (2015), Iss. 3 P.1
https://doi.org/10.1109/TMAG.2014.2363047 [Citations: 3]
Analyzing the limitations of parallelism in hardware and software through threaded programming
Wang, Chenyi
Highlights in Science, Engineering and Technology, Vol. 41 (2023), P.23
https://doi.org/10.54097/hset.v41i.6738 [Citations: 0]
pyC2Ray: A flexible and GPU-accelerated radiative transfer framework for simulating the cosmic epoch of reionization
Hirling, P. | Bianco, M. | Giri, S.K. | Iliev, I.T. | Mellema, G. | Kneib, J.-P.
Astronomy and Computing, Vol. 48 (2024), P.100861
https://doi.org/10.1016/j.ascom.2024.100861 [Citations: 1]
Block-Space GPU Mapping for Embedded Sierpiński Gasket Fractals
Navarro, Cristobal A. | Vega, Raimundo | Bustos, Benjamin | Hitschfeld, Nancy
2017 IEEE 19th International Conference on High Performance Computing and Communications; IEEE 15th International Conference on Smart City; IEEE 3rd International Conference on Data Science and Systems (HPCC/SmartCity/DSS), (2017), P.427
https://doi.org/10.1109/HPCC-SmartCity-DSS.2017.56 [Citations: 0]
Parallel family trees for transfer matrices in the Potts model
Navarro, Cristobal A. | Canfora, Fabrizio | Hitschfeld, Nancy | Navarro, Gonzalo
Computer Physics Communications, Vol. 187 (2015), P.55
https://doi.org/10.1016/j.cpc.2014.10.011 [Citations: 2]
Adaptive kinetic-fluid solvers for heterogeneous computing architectures
Zabelok, Sergey | Arslanbekov, Robert | Kolobov, Vladimir
Journal of Computational Physics, Vol. 303 (2015), P.455
https://doi.org/10.1016/j.jcp.2015.10.003 [Citations: 28]
A concept for data-driven computational mechanics in the presence of polymorphic uncertain properties
Zschocke, Selina | Leichsenring, Ferenc | Graf, Wolfgang | Kaliske, Michael
Engineering Structures, Vol. 267 (2022), P.114672
https://doi.org/10.1016/j.engstruct.2022.114672 [Citations: 11]
Computation Augmentation Techniques for Computing Continuum
Grosu, George-Mircea | Nistor, Silvia-Elena | Ciobanu, Radu-Ioan | Kolodziej, Joanna | Pop, Florin
2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW), (2024), P.136
https://doi.org/10.1109/CCGridW63211.2024.00023 [Citations: 0]
Asynchronous Processing for Latent Fingerprint Identification on Heterogeneous CPU-GPU Systems
Sanchez-Fernandez, Andres J. | Romero, Luis F. | Peralta, Daniel | Medina-Perez, Miguel Angel | Saeys, Yvan | Herrera, Francisco | Tabik, Siham
IEEE Access, Vol. 8 (2020), P.124236
https://doi.org/10.1109/ACCESS.2020.3005476 [Citations: 10]
Parallel Optimization for Large Scale Interferometric Synthetic Aperture Radar Data Processing
Zhang, Weikang | You, Haihang | Wang, Chao | Zhang, Hong | Tang, Yixian
Remote Sensing, Vol. 15 (2023), Iss. 7 P.1850
https://doi.org/10.3390/rs15071850 [Citations: 6]
GPU Computation and Platforms
Bhargavi, K. | Babu B., Sathish
Emerging Research Surrounding Power Consumption and Performance Issues in Utility Computing, (2016)
https://doi.org/10.4018/978-1-4666-8853-7.ch007 [Citations: 1]
Methodology of building intelligent systems on parallel processor
Seitkulov, Yerzhan | Tokhtabayev, Amur | Atanov, Sabyrzhan | Verenik, Nikolai L. | Girel, Alexey I. | Tatur, Mikhail M.
2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT), (2014), P.1
https://doi.org/10.1109/ICAICT.2014.7035973 [Citations: 0]
Intensification of research work using images processing by application of parallel filtering on multi-core architectures
Bosakova-Ardenska, Atanaska | Andreeva, Hristina
International Conference on Environmental, Mining, and Sustainable Development 2022, (2024), P.030007
https://doi.org/10.1063/5.0195739 [Citations: 0]
Applying Gamification in a Parallel Programming Course
Fresno, Javier | Ortega-Arranz, Hector | Ortega-Arranz, Alejandro | Gonzalez-Escribano, Arturo | Llanos, Diego R.
Gamification in Education, (2018)
https://doi.org/10.4018/978-1-5225-5198-0.ch015 [Citations: 0]
A GPU-accelerated Monte Carlo code, RT2 for coupled transport of photon, electron/positron, and neutron
Lee, Chang-Min | Ye, Sung-Joon
Physics in Medicine & Biology, Vol. 69 (2024), Iss. 17 P.175005
https://doi.org/10.1088/1361-6560/ad694f [Citations: 0]
Scalable CAIM discretization on multiple GPUs using concurrent kernels
Cano, Alberto | Ventura, Sebastián | Cios, Krzysztof J.
The Journal of Supercomputing, Vol. 69 (2014), Iss. 1 P.273
https://doi.org/10.1007/s11227-014-1151-8 [Citations: 7]
Utilization of OpenCL for Large Graph Problems on Graphics Processing Unit
Mishra, Vinod Kumar | Sammal, Pankaj Singh
Electronic Notes in Discrete Mathematics, Vol. 63 (2017), P.125
https://doi.org/10.1016/j.endm.2017.11.007 [Citations: 0]
Replacement policies for a parallel system with shortage and excess costs
Zhao, Xufeng | Chen, Mingchih | Nakagawa, Toshio
Reliability Engineering & System Safety, Vol. 150 (2016), P.89
https://doi.org/10.1016/j.ress.2016.01.008 [Citations: 16]
Proposta de Experimento Didático para Compreender as Limitações do Uso de Arquiteturas Distribuídas para CAD
Beserra, David | Moreno, Edward David | Karman, Rubens | Galdino, Sergio
International Journal of Computer Architecture Education, Vol. 4 (2015), Iss. 1 P.25
https://doi.org/10.5753/ijcae.2015.4928 [Citations: 0]
mcRPL: a general purpose parallel raster processing library on distributed heterogeneous architectures
Gao, Huan | Peng, Xuantong | Guan, Qingfeng | Wang, Jingyi | Liu, Ziqi | Yang, Xue | Zeng, Wen
International Journal of Geographical Information Science, Vol. 37 (2023), Iss. 9 P.2043
https://doi.org/10.1080/13658816.2023.2244550 [Citations: 3]
Utilizing Graphics Processing Units (GPUs) for Numerical Computations
Tayyeh, Alnoman Mundher | Shather, Akram H. | Hussein, Husam Abdulhameed | Abdalbaqi, Luma Saad
Coding Dimensions and the Power of Finite Element, Volume, and Difference Methods, (2024)
https://doi.org/10.4018/979-8-3693-3964-0.ch011 [Citations: 0]
Deep learning-based community detection in complex networks with network partitioning and reduction of trainable parameters
Al-Andoli, Mohammed | Cheah, Wooi Ping | Tan, Shing Chiang
Journal of Ambient Intelligence and Humanized Computing, Vol. 12 (2021), Iss. 2 P.2527
https://doi.org/10.1007/s12652-020-02389-x [Citations: 24]
Using Dynamic Parallelism to Speed Up Clustering-Based Community Detection in Social Networks
Alandoli, Mohammed | Al-Ayyoub, Mahmoud | Al-Smadi, Mohammad | Jararweh, Yaser | Benkhelifa, Elhadj
2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), (2016), P.240
https://doi.org/10.1109/W-FiCloud.2016.57 [Citations: 12]
Cognitive information processing based on a parallel processor
Verenik, Nikolai L. | Girel, Alexey I. | Seitkulov, Yerzhan N. | Tatur, Mikhail M. | Razhkova, Hanna P.
The 10th International Conference on Digital Technologies 2014, (2014), P.356
https://doi.org/10.1109/DT.2014.6868739 [Citations: 1]
A Novel Method for Aircraft Structural Dynamic Strain Trend Signal Processing via Optimized Parallel Computing
Tian, Yongwei | Zhang, Fang | Jiang, Jinhui | Fan, Zhe
Applied Sciences, Vol. 14 (2024), Iss. 19 P.8892
https://doi.org/10.3390/app14198892 [Citations: 0]
SODECL
Avramidis, Eleftherios | Lalik, Marta | Akman, Ozgur E.
ACM Transactions on Mathematical Software, Vol. 46 (2020), Iss. 3 P.1
https://doi.org/10.1145/3385076 [Citations: 1]
Utilization of parallel computing in chemical engineering
Danko, Matej | Labovský, Juraj | Janošovský, Ján | Labovská, Zuzana | Jelemenský, Ľudovít
Acta Chimica Slovaca, Vol. 8 (2015), Iss. 2 P.146
https://doi.org/10.1515/acs-2015-0025 [Citations: 4]
Parallel Algorithms of Well-Balanced and Weighted Average Flux for Shallow Water Model Using CUDA
Sataporn, Nugool | Suwannik, Worasait | Maleewong, Montri | Murillo, Javier
Modelling and Simulation in Engineering, Vol. 2021 (2021), P.1
https://doi.org/10.1155/2021/9534495 [Citations: 0]
Time-dependent QED approach to x-ray nonlinear Compton scattering
Krebs, Dietrich | Reis, David A. | Santra, Robin
Physical Review A, Vol. 99 (2019), Iss. 2
https://doi.org/10.1103/PhysRevA.99.022120 [Citations: 17]
Competitiveness of a Non-Linear Block-Space GPU Thread Map for Simplex Domains
Navarro, Cristobal A. | Vernier, Matthieu | Bustos, Benjamin | Hitschfeld, Nancy
IEEE Transactions on Parallel and Distributed Systems, Vol. 29 (2018), Iss. 12 P.2728
https://doi.org/10.1109/TPDS.2018.2849705 [Citations: 6]
DiffMat: Data-driven inverse design of energy-absorbing metamaterials using diffusion model
Wang, Haoyu | Du, Zongliang | Feng, Fuyong | Kang, Zhong | Tang, Shan | Guo, Xu
Computer Methods in Applied Mechanics and Engineering, Vol. 432 (2024), P.117440
https://doi.org/10.1016/j.cma.2024.117440 [Citations: 2]
Spectrum Occupancy Measurements: A Survey and Use of Interference Maps
Hoyhtya, Marko | Mammela, Aarne | Eskola, Marina | Matinmikko, Marja | Kalliovaara, Juha | Ojaniemi, Jaakko | Suutala, Jaakko | Ekman, Reijo | Bacchus, Roger | Roberson, Dennis
IEEE Communications Surveys & Tutorials, Vol. 18 (2016), Iss. 4 P.2386
https://doi.org/10.1109/COMST.2016.2559525 [Citations: 160]
Modeling GPU Dynamic Parallelism for self similar density workloads
Quezada, Felipe A. | Navarro, Cristóbal A. | Romero, Miguel | Aguilera, Cristhian
Future Generation Computer Systems, Vol. 145 (2023), P.239
https://doi.org/10.1016/j.future.2023.03.046 [Citations: 3]
Age replacement models: A summary with new perspectives and methods
Zhao, Xufeng | Al-Khalifa, Khalifa N. | Magid Hamouda, Abdel | Nakagawa, Toshio
Reliability Engineering & System Safety, Vol. 161 (2017), P.95
https://doi.org/10.1016/j.ress.2017.01.011 [Citations: 102]
Rational Jacobi Kernel Functions: A novel massively parallelizable orthogonal kernel for support vector machines
Moghaddam, Mahdi Movahedian | Aghaei, Alireza Afzal | Parand, Kourosh
2024 Third International Conference on Distributed Computing and High Performance Computing (DCHPC), (2024), P.1
https://doi.org/10.1109/DCHPC60845.2024.10454075 [Citations: 1]
Empirical investigation: performance and power‐consumption based dual‐level model for exascale computing systems
Ashraf, Muhammad Usman | Eassa, Fathy Alboraei | Ahmad, Aiiad | Algarni, Abdullah
IET Software, Vol. 14 (2020), Iss. 4 P.319
https://doi.org/10.1049/iet-sen.2018.5062 [Citations: 5]
On GPU Connected Components and Properties: A Systematic Evaluation of Connected Component Labeling Algorithms and Their Extension for Property Extraction
Asad, Pedro | Marroquim, Ricardo | Souza, Andrea L. e L.
IEEE Transactions on Image Processing, Vol. 28 (2019), Iss. 1 P.17
https://doi.org/10.1109/TIP.2018.2851445 [Citations: 7]
