PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections. H Wang, J Zhai, M Gao, Z Ma, S Tang, L Zheng, Y Li, K Rong, Y Chen, ... 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2021 | 55 | 2021 |
Toward edge-assisted video content intelligent caching with long short-term memory learning C Zhang, H Pang, J Liu, S Tang, R Zhang, D Wang, L Sun IEEE Access 7, 152832-152846, 2019 | 44 | 2019 |
BaGuaLu: targeting brain scale pretrained models with over 37 million cores Z Ma, J He, J Qiu, H Cao, Y Wang, Z Sun, L Zheng, H Wang, S Tang, ... Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of …, 2022 | 36 | 2022 |
FreeTensor: a free-form DSL with holistic optimizations for irregular tensor programs S Tang, J Zhai, H Wang, L Jiang, L Zheng, Z Yuan, C Zhang Proceedings of the 43rd ACM SIGPLAN International Conference on Programming …, 2022 | 8 | 2022 |
EINNET: Optimizing Tensor Programs with Derivation-Based Transformations L Zheng, H Wang, J Zhai, M Hu, Z Ma, T Wang, S Huang, X Miao, S Tang, ... 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 2 | 2023 |
OLLIE: Derivation-based Tensor Program Optimizer L Zheng, H Wang, J Zhai, M Hu, Z Ma, T Wang, S Tang, L Xie, K Huang, ... arXiv preprint arXiv:2208.02025, 2022 | 2 | 2022 |
Mat2Stencil: A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid H Cao, S Tang, Q Zhu, B Yu, W Chen Proceedings of the ACM on Programming Languages 7 (OOPSLA2), 686-715, 2023 | 1 | 2023 |
Optimizing DNNs with Partially Equivalent Transformations and Automated Corrections H Wang, J Zhai, M Gao, F Zhang, T Wang, Z Ma, S Tang, L Zheng, ... IEEE Transactions on Computers, 2023 | 1 | 2023 |
Unified Programming Models for Heterogeneous High-Performance Computers ZX Ma, YY Jin, SZ Tang, HJ Wang, WC Xue, JD Zhai, WM Zheng Journal of Computer Science and Technology 38 (1), 211-218, 2023 | 1 | 2023 |
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR Z Ma, H Wang, J Xing, L Zheng, C Zhang, H Cao, K Huang, S Tang, ... arXiv preprint arXiv:2307.04995, 2023 | | 2023 |
Programming Matrices as Staged Sparse Rows to Generate Efficient Matrix-free Differential Equation Solver H Cao, S Tang, B Yu, W Chen arXiv preprint arXiv:2204.13304, 2022 | | 2022 |
Student Cluster Competition 2018, Team Tsinghua University: Reproducing performance of multi-physics simulations of the Tsunamigenic 2004 Sumatra megathrust earthquake on the … J He, C Zhao, J Yu, X Yu, L Zheng, C Lou, S Tang, W Han, J Zhai Parallel Computing 90, 102570, 2019 | | 2019 |
Towards Automatic Function Call Generation for Deep Learning S Tang, J Zhai | | |