Scalable Bayesian Optimization Using Deep Neural Networks J Snoek, O Rippel, K Swersky, R Kiros, N Satish, N Sundaram, M Patwary, ... arXiv preprint arXiv:1502.05700, 2015 | 584 | 2015 |

Twitter trending topic classification K Lee, D Palsetia, R Narayanan, MMA Patwary, A Agrawal, A Choudhary Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on …, 2011 | 327 | 2011 |

GraphMat: High performance graph analytics made productive N Sundaram, N Satish, MMA Patwary, SR Dulloor, MJ Anderson, ... Proceedings of the VLDB Endowment 8 (11), 1214-1225, 2015 | 248 | 2015 |

Navigating the maze of graph analytics frameworks using massive graph datasets N Satish, N Sundaram, MMA Patwary, J Seo, J Park, MA Hassaan, ... Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014 | 205* | 2014 |

A new scalable parallel DBSCAN algorithm using the disjoint-set data structure MMA Patwary, D Palsetia, A Agrawal, W Liao, F Manne, A Choudhary SC'12: Proceedings of the International Conference on High Performance …, 2012 | 180 | 2012 |

Megatron-LM: Training Multi-Billion Parameter Language Models Using GPU Model Parallelism M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro arXiv preprint arXiv:1909.08053, 2019 | 154 | 2019 |

Deep Learning Scaling is Predictable, Empirically J Hestness, S Narang, N Ardalani, G Diamos, H Jun, H Kianinejad, ... arXiv preprint arXiv:1712.00409, 2017 | 142 | 2017 |

Fast Algorithms for the Maximum Clique Problem on Massive Sparse Graphs B Pattabiraman, M Patwary, M Ali, AH Gebremedhin, W Liao, ... arXiv preprint arXiv:1209.5818, 2012 | 86 | 2012 |

Fast maximum clique algorithms for large graphs RA Rossi, DF Gleich, AH Gebremedhin, MMA Patwary Proceedings of the companion publication of the 23rd international …, 2014 | 81 | 2014 |

ColPack: Software for graph coloring and related problems in scientific computing AH Gebremedhin, D Nguyen, MMA Patwary, A Pothen ACM Transactions on Mathematical Software (TOMS) 40 (1), 1-31, 2013 | 80 | 2013 |

Deep learning at 15PF: supervised and semi-supervised classification for scientific data T Kurth, J Zhang, N Satish, E Racah, I Mitliagkas, MMA Patwary, T Malas, ... Proceedings of the International Conference for High Performance Computing …, 2017 | 64 | 2017 |

Parallel efficient sparse matrix-matrix multiplication on multicore platforms MMA Patwary, NR Satish, N Sundaram, J Park, MJ Anderson, ... International Conference on High Performance Computing, 48-57, 2015 | 56 | 2015 |

Efficient shared-memory implementation of high-performance conjugate gradient benchmark and its application to unstructured matrices J Park, M Smelyanskiy, K Vaidyanathan, A Heinecke, DD Kalamkar, X Liu, ... High Performance Computing, Networking, Storage and Analysis, SC14 …, 2014 | 56 | 2014 |

Controlled vs. Automatic Processing: A Graph-Theoretic Approach to the Analysis of Serial vs. Parallel Processing in Neural Network Architectures S Musslick, B Dey, K Ozcimder, MMA Patwary, TL Willke, JD Cohen Proceedings of the 38th Annual Conference of the Cognitive Science Society …, 2016 | 55 | 2016 |

Experiments on union-find algorithms for the disjoint-set data structure M Patwary, J Blair, F Manne Experimental Algorithms, 411-423, 2010 | 52 | 2010 |

Scalable parallel OPTICS data clustering using graph algorithmic techniques MA Patwary, D Palsetia, A Agrawal, W Liao, F Manne, A Choudhary Proceedings of SC13: International Conference for High Performance Computing …, 2013 | 50 | 2013 |

Graphpad: Optimized graph primitives for parallel and distributed platforms MJ Anderson, N Sundaram, N Satish, MMA Patwary, TL Willke, P Dubey 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016 | 45 | 2016 |

Multi-core spanning forest algorithms using the disjoint-set data structure M Patwary, M Ali, P Refsnes, F Manne Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th …, 2012 | 44 | 2012 |

Parallel greedy graph matching using an edge partitioning approach MMA Patwary, RH Bisseling, F Manne Proceedings of the fourth international workshop on High-level parallel …, 2010 | 37 | 2010 |