Zhen Qin
Unknown affiliation
Verified email at xd.com
Title
Cited by
Year
cosFormer: Rethinking softmax in attention
Z Qin, W Sun, H Deng, D Li, Y Wei, B Lv, J Yan, L Kong, Y Zhong
arXiv preprint arXiv:2202.08791, 2022
Cited by: 226; Year: 2022
Hierarchically gated recurrent neural network for sequence modeling
Z Qin, S Yang, Y Zhong
Advances in Neural Information Processing Systems 36, 2024
Cited by: 54; Year: 2024
The devil in linear transformer
Z Qin, X Han, W Sun, D Li, L Kong, N Barnes, Y Zhong
arXiv preprint arXiv:2210.10340, 2022
Cited by: 47; Year: 2022
Toeplitz neural network for sequence modeling
Z Qin, X Han, W Sun, B He, D Li, D Li, Y Dai, L Kong, Y Zhong
arXiv preprint arXiv:2305.04749, 2023
Cited by: 27; Year: 2023
Vicinity vision transformer
W Sun, Z Qin, H Deng, J Wang, Y Zhang, K Zhang, N Barnes, S Birchfield, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (10 …, 2023
Cited by: 26; Year: 2023
HGRN2: Gated linear RNNs with state expansion
Z Qin, S Yang, W Sun, X Shen, D Li, W Sun, Y Zhong
arXiv preprint arXiv:2404.07904, 2024
Cited by: 24; Year: 2024
Scaling TransNormer to 175 billion parameters
Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, F Yuan, X Luo, ...
arXiv preprint arXiv:2307.14995, 2023
Cited by: 20; Year: 2023
Neural architecture search on efficient transformers and beyond
Z Liu, D Li, K Lu, Z Qin, W Sun, J Xu, Y Zhong
arXiv preprint arXiv:2207.13955, 2022
Cited by: 16; Year: 2022
Fine-grained audible video description
X Shen, D Li, J Zhou, Z Qin, B He, X Han, A Li, Y Dai, L Kong, M Wang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 14; Year: 2023
Lightning attention-2: A free lunch for handling unlimited sequence lengths in large language models
Z Qin, W Sun, D Li, X Shen, W Sun, Y Zhong
arXiv preprint arXiv:2401.04658, 2024
Cited by: 6; Year: 2024
Exploring Transformer Extrapolation
Z Qin, Y Zhong, H Deng
Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 18897 …, 2024
Cited by: 5; Year: 2024
Accelerating Toeplitz neural network with constant-time inference complexity
Z Qin, Y Zhong
arXiv preprint arXiv:2311.08756, 2023
Cited by: 5; Year: 2023
Linearized Relative Positional Encoding
Z Qin, W Sun, K Lu, H Deng, D Li, X Han, Y Dai, L Kong, Y Zhong
arXiv preprint arXiv:2307.09270, 2023
Cited by: 5; Year: 2023
CO2: Efficient distributed training with full communication-computation overlap
W Sun, Z Qin, W Sun, S Li, D Li, X Shen, Y Qiao, Y Zhong
arXiv preprint arXiv:2401.16265, 2024
Cited by: 4; Year: 2024
All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation
W Sun, Y Zhang, Z Qin, Z Liu, L Cheng, F Wang, Y Zhong, N Barnes
Proceedings of the IEEE/CVF International Conference on Computer Vision, 826-837, 2023
Cited by: 4; Year: 2023
Linear video transformer with feature fixation
K Lu, Z Liu, J Wang, W Sun, Z Qin, D Li, X Shen, H Deng, X Han, Y Dai, ...
arXiv preprint arXiv:2210.08164, 2022
Cited by: 4; Year: 2022
TransNormerLLM: A faster and better large language model with improved TransNormer
Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, X Luo, Y Qiao, ...
Cited by: 3; Year: 2023
TAVGBench: Benchmarking text to audible-video generation
Y Mao, X Shen, J Zhang, Z Qin, J Zhou, M Xiang, Y Zhong, Y Dai
Proceedings of the 32nd ACM International Conference on Multimedia, 6607-6616, 2024
Cited by: 2; Year: 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Z Qin, W Sun, D Li, X Shen, W Sun, Y Zhong
arXiv preprint arXiv:2405.17381, 2024
Cited by: 2; Year: 2024
Unlocking the secrets of linear complexity sequence model from a unified perspective
Z Qin, X Shen, D Li, W Sun, S Birchfield, R Hartley, Y Zhong
arXiv preprint arXiv:2405.17383, 2024
Cited by: 2; Year: 2024