Ahmad Abdelfattah
Research Scientist, Innovative Computing Laboratory, University of Tennessee
Verified email at icl.utk.edu
Title
Cited by
Year
Performance, design, and autotuning of batched GEMM for GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
International Conference on High Performance Computing, 21-38, 2016
Cited by: 95; Year: 2016
High-performance tensor contractions for GPUs
A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ...
Procedia Computer Science 80, 108-118, 2016
Cited by: 54; Year: 2016
Parallel programming models for dense linear algebra on heterogeneous systems
J Dongarra, M Abalenkovs, A Abdelfattah, M Gates, A Haidar, J Kurzak, ...
Supercomputing frontiers and innovations 2 (4), 67-86, 2015
Cited by: 54; Year: 2015
High-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
European Conference on Parallel Processing, 659-671, 2016
Cited by: 53; Year: 2016
The design of fast and energy-efficient linear solvers: On the potential of half-precision arithmetic and iterative refinement techniques
A Haidar, A Abdelfattah, M Zounon, P Wu, S Pranesh, S Tomov, ...
International Conference on Computational Science, 586-600, 2018
Cited by: 39; Year: 2018
KBLAS: An optimized library for dense matrix-vector multiplication on GPU accelerators
A Abdelfattah, D Keyes, H Ltaief
ACM Transactions on Mathematical Software (TOMS) 42 (3), 1-31, 2016
Cited by: 39; Year: 2016
With extreme computing, the rules have changed
J Dongarra, S Tomov, P Luszczek, J Kurzak, M Gates, I Yamazaki, H Anzt, ...
Computing in Science & Engineering 19 (3), 52-62, 2017
Cited by: 35; Year: 2017
A novel fast and accurate pseudo-analytical simulation approach for MOAO
E Gendron, A Charara, A Abdelfattah, D Gratadour, D Keyes, H Ltaief, ...
Adaptive Optics Systems IV 9148, 91486L, 2014
Cited by: 33; Year: 2014
A survey of numerical methods utilizing mixed precision arithmetic
A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, ...
arXiv preprint arXiv:2007.06674, 2020
Cited by: 24; Year: 2020
Fast batched matrix multiplication for small sizes using half-precision arithmetic on GPUs
A Abdelfattah, S Tomov, J Dongarra
2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019
Cited by: 22; Year: 2019
C++ API for BLAS and LAPACK
M Gates, P Luszczek, A Abdelfattah, J Kurzak, J Dongarra, K Arturov, ...
SLATE Working Notes, 2017
Cited by: 22*; Year: 2017
A guide for achieving high performance with very small matrices on GPU: A case study of batched LU and Cholesky factorizations
A Haidar, A Abdelfattah, M Zounon, S Tomov, J Dongarra
IEEE Transactions on Parallel and Distributed Systems 29 (5), 973-984, 2017
Cited by: 21; Year: 2017
Fast Cholesky factorization on GPUs for batch and native modes in MAGMA
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Journal of Computational Science 20, 85-93, 2017
Cited by: 19; Year: 2017
Pipelining computational stages of the tomographic reconstructor for multi-object adaptive optics on a multi-GPU system
A Charara, H Ltaief, D Gratadour, D Keyes, A Sevin, A Abdelfattah, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 19; Year: 2014
Optimizing memory-bound SYMV kernel on GPU hardware accelerators
A Abdelfattah, J Dongarra, D Keyes, H Ltaief
International Conference on High Performance Computing for Computational …, 2012
Cited by: 19; Year: 2012
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, A Fox, ...
The International Journal of High Performance Computing Applications 35 (4 …, 2021
Cited by: 16; Year: 2021
Systematic approach in optimizing numerical memory-bound kernels on GPU
A Abdelfattah, D Keyes, H Ltaief
European Conference on Parallel Processing, 207-216, 2012
Cited by: 16; Year: 2012
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Proceedings of the International Conference on Supercomputing, 1-10, 2017
Cited by: 15; Year: 2017
Performance tuning and optimization techniques of fixed and variable size batched Cholesky factorization on GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Procedia Computer Science 80, 119-130, 2016
Cited by: 14; Year: 2016
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
Parallel Computing 81, 1-21, 2019
Cited by: 12; Year: 2019