cosFormer: Rethinking softmax in attention Z Qin, W Sun, H Deng, D Li, Y Wei, B Lv, J Yan, L Kong, Y Zhong arXiv preprint arXiv:2202.08791, 2022 | 184 | 2022 |
Hierarchically gated recurrent neural network for sequence modeling Z Qin, S Yang, Y Zhong Advances in Neural Information Processing Systems 36, 2024 | 37 | 2024 |
The devil in linear transformer Z Qin, X Han, W Sun, D Li, L Kong, N Barnes, Y Zhong arXiv preprint arXiv:2210.10340, 2022 | 37 | 2022 |
Toeplitz neural network for sequence modeling Z Qin, X Han, W Sun, B He, D Li, D Li, Y Dai, L Kong, Y Zhong arXiv preprint arXiv:2305.04749, 2023 | 23 | 2023 |
Vicinity vision transformer W Sun, Z Qin, H Deng, J Wang, Y Zhang, K Zhang, N Barnes, S Birchfield, ... IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (10 …, 2023 | 19 | 2023 |
Scaling TransNormer to 175 billion parameters Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, F Yuan, X Luo, ... arXiv preprint arXiv:2307.14995, 2023 | 16 | 2023 |
Neural architecture search on efficient transformers and beyond Z Liu, D Li, K Lu, Z Qin, W Sun, J Xu, Y Zhong arXiv preprint arXiv:2207.13955, 2022 | 15 | 2022 |
HGRN2: Gated linear RNNs with state expansion Z Qin, S Yang, W Sun, X Shen, D Li, W Sun, Y Zhong arXiv preprint arXiv:2404.07904, 2024 | 14 | 2024 |
FedAPEN: personalized cross-silo federated learning with adaptability to statistical heterogeneity Z Qin, S Deng, M Zhao, X Yan Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 12 | 2023 |
Fine-grained audible video description X Shen, D Li, J Zhou, Z Qin, B He, X Han, A Li, Y Dai, L Kong, M Wang, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 10 | 2023 |
Federated full-parameter tuning of billion-sized language models with communication cost under 18 kilobytes Z Qin, D Chen, B Qian, B Ding, Y Li, S Deng arXiv preprint arXiv:2312.06353, 2023 | 9 | 2023 |
Accelerating Toeplitz neural network with constant-time inference complexity Z Qin, Y Zhong arXiv preprint arXiv:2311.08756, 2023 | 5 | 2023 |
Linearized Relative Positional Encoding Z Qin, W Sun, K Lu, H Deng, D Li, X Han, Y Dai, L Kong, Y Zhong arXiv preprint arXiv:2307.09270, 2023 | 5 | 2023 |
Lightning attention-2: A free lunch for handling unlimited sequence lengths in large language models Z Qin, W Sun, D Li, X Shen, W Sun, Y Zhong arXiv preprint arXiv:2401.04658, 2024 | 4 | 2024 |
Linear video transformer with feature fixation K Lu, Z Liu, J Wang, W Sun, Z Qin, D Li, X Shen, H Deng, X Han, Y Dai, ... arXiv preprint arXiv:2210.08164, 2022 | 4 | 2022 |
BlockDFL: A blockchain-based fully decentralized federated learning framework Z Qin, X Yan, M Zhou, P Zhao, S Deng arXiv preprint arXiv:2205.10568, 2022 | 4 | 2022 |
TransNormerLLM: A faster and better large language model with improved TransNormer Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, X Luo, Y Qiao, ... | 3 | 2023 |
All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation W Sun, Y Zhang, Z Qin, Z Liu, L Cheng, F Wang, Y Zhong, N Barnes Proceedings of the IEEE/CVF International Conference on Computer Vision, 826-837, 2023 | 3 | 2023 |
Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective Z Qin, X Shen, W Sun, D Li, S Birchfield, R Hartley, Y Zhong arXiv preprint arXiv:2405.17383, 2024 | 2 | 2024 |
BlockDFL: A Blockchain-based Fully Decentralized Peer-to-Peer Federated Learning Framework Z Qin, X Yan, M Zhou, S Deng Proceedings of the ACM on Web Conference 2024, 2914-2925, 2024 | 2 | 2024 |