Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA G Bosilca, A Bouteiller, A Danalis, M Faverge, A Haidar, T Herault, ... 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 193* | 2011 |

Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers A Haidar, S Tomov, J Dongarra, NJ Higham SC18: International Conference for High Performance Computing, Networking …, 2018 | 115 | 2018 |

Accelerating numerical dense linear algebra calculations with GPUs J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ... Numerical computations with GPUs, 3-28, 2014 | 100 | 2014 |

Seismic wave modeling for seismic imaging J Virieux, S Operto, H Ben-Hadj-Ali, R Brossier, V Etienne, F Sourbier, ... The Leading Edge 28 (5), 538-544, 2009 | 96 | 2009 |

Performance, design, and autotuning of batched GEMM for GPUs A Abdelfattah, A Haidar, S Tomov, J Dongarra International Conference on High Performance Computing, 21-38, 2016 | 93 | 2016 |

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels A Haidar, H Ltaief, J Dongarra Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 72 | 2011 |

Batched matrix computations on hardware accelerators based on GPUs A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra The International Journal of High Performance Computing Applications 29 (2 …, 2015 | 61 | 2015 |

Image-based date fruit classification A Haidar, H Dong, N Mavridis 2012 IV International Congress on Ultra Modern Telecommunications and …, 2012 | 59 | 2012 |

High-performance matrix-matrix multiplications of very small matrices I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ... European Conference on Parallel Processing, 659-671, 2016 | 51 | 2016 |

High-performance tensor contractions for GPUs A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ... Procedia Computer Science 80, 108-118, 2016 | 51 | 2016 |

Car parking vacancy detection and its application in 24-hour statistical analysis J Jermsurawong, MU Ahsan, A Haidar, H Dong, N Mavridis 2012 10th International Conference on Frontiers of Information Technology, 84-90, 2012 | 51 | 2012 |

Parallel programming models for dense linear algebra on heterogeneous systems J Dongarra, M Abalenkovs, A Abdelfattah, M Gates, A Haidar, J Kurzak, ... Supercomputing frontiers and innovations 2 (4), 67-86, 2016 | 49 | 2016 |

LU factorization of small matrices: Accelerating batched DGETRF on the GPU T Dong, A Haidar, P Luszczek, JA Harris, S Tomov, J Dongarra 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 …, 2014 | 48 | 2014 |

An improved parallel singular value algorithm and its implementation for multicore hardware A Haidar, J Kurzak, P Luszczek Proceedings of the International Conference on High Performance Computing …, 2013 | 45 | 2013 |

Parallel scalability study of hybrid preconditioners in three dimensions L Giraud, A Haidar, LT Watson Parallel Computing 34 (6-8), 363-379, 2008 | 45 | 2008 |

A framework for batched and GPU-resident factorization algorithms applied to block Householder transformations A Haidar, TT Dong, S Tomov, P Luszczek, J Dongarra International Conference on High Performance Computing, 31-47, 2015 | 43 | 2015 |

Sparse approximations of the Schur complement for parallel algebraic hybrid linear solvers in 3D L Giraud, A Haidar, Y Saad INRIA, 2010 | 43 | 2010 |

Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures A Haidar, H Ltaief, A YarKhan, J Dongarra Concurrency and Computation: Practice and Experience 24 (3), 305-321, 2012 | 42 | 2012 |

Investigating half precision arithmetic to accelerate dense linear system solvers A Haidar, P Wu, S Tomov, J Dongarra Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms …, 2017 | 39 | 2017 |

HPC programming on Intel many-integrated-core hardware with MAGMA port to Xeon Phi J Dongarra, M Gates, A Haidar, Y Jia, K Kabir, P Luszczek, S Tomov Scientific Programming 2015, 2015 | 39 | 2015 |