Liang Luo
Verified email at cs.washington.edu - Homepage
Title
Cited by
Year
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel. CoRR abs/2304.11277 (2023)
Y Zhao, A Gu, R Varma, L Luo, CC Huang, M Xu, L Wright, H Shojanazeri, ...
Cited by: 221*; Year: 2023
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 191*; Year: 2022
IncBricks: Toward in-network computation with an in-network cache
M Liu, L Luo, J Nelson, L Ceze, A Krishnamurthy, K Atreya
Proceedings of the Twenty-Second International Conference on Architectural …, 2017
Cited by: 176; Year: 2017
Parameter hub: a rack-scale parameter server for distributed deep neural network training
L Luo, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
Proceedings of the ACM Symposium on Cloud Computing, 41-54, 2018
Cited by: 150; Year: 2018
PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the Public Cloud
L Luo, P West, J Nelson, A Krishnamurthy, L Ceze
Proceedings of the 3rd MLSys Conference, 2020
Cited by: 87*; Year: 2020
Laser: Light, accurate sharing detection and repair
L Luo, A Sriraman, B Fugate, S Hu, G Pokam, CJ Newburn, J Devietti
2016 IEEE International Symposium on High Performance Computer Architecture …, 2016
Cited by: 41; Year: 2016
Troubleshooting Transiently-Recurring Errors in Production Systems with Blame-Proportional Logging
L Luo, S Nath, LR Sivalingam, M Musuvathi, L Ceze
2018 USENIX Annual Technical Conference (USENIX ATC 18), 321-334, 2018
Cited by: 23; Year: 2018
DHEN: A deep and hierarchical ensemble network for large-scale click-through rate prediction
B Zhang, L Luo, X Liu, J Li, Z Chen, W Zhang, X Wei, Y Hao, M Tsang, ...
arXiv preprint arXiv:2203.11014, 2022
Cited by: 21; Year: 2022
Motivating in-network aggregation for distributed deep neural network training
L Luo, M Liu, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
Workshop on Approximate Computing Across the Stack, 2017
Cited by: 18; Year: 2017
Parameter box: High performance parameter servers for efficient distributed deep neural network training
L Luo, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
MLSys 2018, 2018
Cited by: 14; Year: 2018
Wukong: Towards a Scaling Law for Large-Scale Recommendation
B Zhang, L Luo, Y Chen, J Nie, X Liu, D Guo, Y Zhao, S Li, Y Hao, Y Yao, ...
arXiv preprint arXiv:2403.02545, 2024
Cited by: 12; Year: 2024
MoMa: Efficient early-fusion pre-training with mixture of modality-aware experts
XV Lin, A Shrivastava, L Luo, S Iyer, M Lewis, G Ghosh, L Zettlemoyer, ...
arXiv preprint arXiv:2407.21770, 2024
Cited by: 10; Year: 2024
NetHint: White-Box networking for Multi-Tenant data centers
J Chen, H Zhang, W Zhang, L Luo, J Chase, I Stoica, D Zhuo
19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022
Cited by: 10; Year: 2022
Pre-train and search: Efficient embedding table sharding with pre-trained neural cost models
D Zha, L Feng, L Luo, B Bhushanam, Z Liu, Y Hu, J Nie, Y Huang, Y Tian, ...
Proceedings of Machine Learning and Systems 5, 68-88, 2023
Cited by: 9; Year: 2023
Srifty: Swift and thrifty distributed neural network training on the cloud
L Luo, P West, P Patel, A Krishnamurthy, L Ceze
Proceedings of Machine Learning and Systems 4, 833-847, 2022
Cited by: 8; Year: 2022
P4SGD: Programmable Switch Enhanced Model-Parallel Training on Generalized Linear Models on Distributed FPGAs
H Huang, Y Li, J Sun, X Zhu, J Zhang, L Luo, J Li, Z Wang
IEEE Transactions on Parallel and Distributed Systems 34 (8), 2311-2324, 2023
Cited by: 3; Year: 2023
Accelerating SpMM kernel with cache-first edge sampling for graph neural networks
CY Lin, L Luo, L Ceze
arXiv preprint arXiv:2104.10716, 2021
Cited by: 3; Year: 2021
Cloud collectives: Towards cloud-aware collectives for ML workloads with rank reordering
L Luo, J Nelson, A Krishnamurthy, L Ceze
arXiv preprint arXiv:2105.14088, 2021
Cited by: 2; Year: 2021
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large Scale Recommendation
L Luo, B Zhang, M Tsang, Y Ma, CH Chu, Y Chen, S Li, Y Hao, Y Zhao, ...
Proceedings of Machine Learning and Systems 6, 266-278, 2024
Cited by: 1; Year: 2024
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
W Liang, L Yu, L Luo, S Iyer, N Dong, C Zhou, G Ghosh, M Lewis, W Yih, ...
arXiv preprint arXiv:2411.04996, 2024
Year: 2024