Zhen ZHENG
Alibaba Group
Verified email at alibaba-inc.com
Title
Cited by
Year
Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer
H Fu, J Liao, W Xue, L Wang, D Chen, L Gu, J Xu, N Ding, X Wang, C He, ...
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by 26, 2016
VersaPipe: a versatile programming framework for pipelined computing on GPU
Z Zheng, C Oh, J Zhai, X Shen, Y Yi, W Chen
2017 50th Annual IEEE/ACM International Symposium on Microarchitecture …, 2017
Cited by 22, 2017
DAPPLE: A pipelined data parallel approach for training large models
S Fan, Y Rong, C Meng, Z Cao, S Wang, Z Zheng, C Wu, G Long, J Yang, ...
Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021
Cited by 14, 2021
HiWayLib: A software framework for enabling high performance communications for heterogeneous pipeline computations
Z Zheng, C Oh, J Zhai, X Shen, Y Yi, W Chen
Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019
Cited by 6, 2019
FusionStitching: boosting memory intensive computations for deep learning workloads
Z Zheng, P Zhao, G Long, F Zhu, K Zhu, W Zhao, L Diao, J Yang, W Lin
arXiv preprint arXiv:2009.10924, 2020
Cited by 5, 2020
Understanding and bridging the gaps in current GNN performance optimizations
K Huang, J Zhai, Z Zheng, Y Yi, X Shen
Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021
Cited by 4, 2021
GOPipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU
C Oh, Z Zheng, X Shen, J Zhai, Y Yi
Proceedings of the ACM International Conference on Parallel Architectures …, 2020
Cited by 3, 2020
Exploring deep reuse in Winograd CNN inference
R Wu, F Zhang, Z Zheng, X Du, X Shen
Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021
Cited by 2, 2021
Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads
S Wang, Y Rong, S Fan, Z Zheng, LS Diao, G Long, J Yang, X Liu, W Lin
arXiv preprint arXiv:2007.04069, 2020
Cited by 2, 2020
DISC: A Dynamic Shape Compiler for Machine Learning Workloads
K Zhu, WY Zhao, Z Zheng, TY Guo, PZ Zhao, JJ Bai, J Yang, XY Liu, ...
Proceedings of the 1st Workshop on Machine Learning and Systems, 89-95, 2021
Cited by 1, 2021
Optimizing distributed training deployment in heterogeneous GPU clusters
X Yi, S Zhang, Z Luo, G Long, L Diao, C Wu, Z Zheng, J Yang, W Lin
Proceedings of the 16th International Conference on emerging Networking …, 2020
Cited by 1, 2020
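Workload characterization of typical GPU programs based on the CUPTI interface
Z Zheng, J Zhai, Y Li, W Chen
Journal of Computer Research and Development 53 (6), 1249-1262, 2016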
2016