Yongxin Zhu
Title
Cited by
Year
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Z Cheng, S Leng, H Zhang, Y Xin, X Li, G Chen, Y Zhu, W Zhang, Z Luo, ...
arXiv preprint arXiv:2406.07476, 2024
Cited by 82, 2024
Difformer: Empowering diffusion models on the embedding space for text generation
Z Gao, J Guo, X Tan, Y Zhu, F Zhang, J Bian, L Xu
arXiv preprint arXiv:2212.09412, 2022
Cited by 55, 2022
Sequence-to-action: Grammatical error correction with action guided sequence generation
J Li, J Guo, Y Zhu, X Sheng, D Jiang, B Ren, L Xu
Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 10974 …, 2022
Cited by 22, 2022
Span-level aspect-based sentiment analysis via table filling
M Zhang, Y Zhu, Z Liu, Z Bao, Y Wu, X Sun, L Xu
Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023
Cited by 14, 2023
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA
Y Zhu, Z Liu, Y Liang, X Li, H Liu, C Bao, L Xu
The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23) 37 …, 2023
Cited by 8, 2023
DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation
Y Zhu, Z Gao, X Zhou, Z Ye, L Xu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023
Cited by 3, 2023
Addressing representation collapse in vector quantized models with one linear layer
Y Zhu, B Li, Y Xin, L Xu
arXiv preprint arXiv:2411.02038, 2024
Cited by 2, 2024
Visual Hallucination Elevates Speech Recognition
F Zhang, Y Zhu, X Wang, H Chen, X Sun, L Xu
Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19542 …, 2024
Cited by 2, 2024
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Y Zhu, B Li, H Zhang, X Li, L Xu, L Bing
NeurIPS 2024, 2024
Cited by 1, 2024
ItrievalKD: An iterative retrieval framework assisted with knowledge distillation for noisy text-to-image retrieval
Z Liu, Y Zhu, Z Gao, X Sheng, L Xu
Pacific-Asia Conference on Knowledge Discovery and Data Mining, 257-268, 2023
Cited by 1, 2023
Summarizing Like Human: Edit-Based Text Summarization with Keywords
Y Liang, J Guo, Y Zhu, L Xu
International Conference on Artificial Neural Networks, 333-351, 2024
2024
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction
H Yan, Y Zhu, K Zheng, B Liu, H Cao, D Jiang, L Xu
Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024
2024
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer
Y Zhu, D Su, L He, L Xu, D Yu
Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024
2024