Xingjian He
Institute of Automation, Chinese Academy of Sciences (CASIA)
Verified email at nlpr.ia.ac.cn
Title
Cited by
Year
VALOR: Vision-audio-language omni-perception pretraining model and dataset
S Chen, X He, L Guo, X Zhu, W Wang, J Tang, J Liu
arXiv preprint arXiv:2304.08345, 2023
Cited by: 51; Year: 2023
Non-autoregressive image captioning with counterfactuals-critical multi-agent learning
L Guo, J Liu, X Zhu, X He, J Jiang, H Lu
arXiv preprint arXiv:2005.04690, 2020
Cited by: 49; Year: 2020
Global-local propagation network for RGB-D semantic segmentation
S Chen, X Zhu, W Liu, X He, J Liu
arXiv preprint arXiv:2101.10801, 2021
Cited by: 18; Year: 2021
VLAB: Enhancing video language pre-training by feature adapting and blending
X He, S Chen, F Ma, Z Huang, X Jin, Z Liu, D Fu, Y Yang, J Liu, J Feng
arXiv preprint arXiv:2305.13167, 2023
Cited by: 15; Year: 2023
An efficient sampling-based attention network for semantic segmentation
X He, J Liu, W Wang, H Lu
IEEE Transactions on Image Processing 31, 2850-2863, 2022
Cited by: 9; Year: 2022
Dynamic warping network for semantic video segmentation
J Li, Y Zhao, X He, X Zhu, J Liu
Complexity 2021, 1-10, 2021
Cited by: 7; Year: 2021
MAMO: Masked multimodal modeling for fine-grained vision-language representation learning
Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu
arXiv preprint arXiv:2210.04183, 2022
Cited by: 4; Year: 2022
MAMO: Fine-Grained Vision-Language Representations Learning with Masked Multimodal Modeling
Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu
Proceedings of the 46th International ACM SIGIR Conference on Research and …, 2023
Cited by: 3; Year: 2023
COSA: Concatenated sample pretrained vision-language foundation model
S Chen, X He, H Li, X Jin, J Feng, J Liu
arXiv preprint arXiv:2306.09085, 2023
Cited by: 3; Year: 2023
Consistent-separable feature representation for semantic segmentation
X He, J Liu, J Fu, X Zhu, J Wang, H Lu
Proceedings of the AAAI Conference on Artificial Intelligence 35 (2), 1531-1539, 2021
Cited by: 3; Year: 2021
CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation
W Wang, X He, Y Zhang, L Guo, J Shen, J Li, J Liu
IEEE Transactions on Multimedia, 2024
Cited by: 2; Year: 2024
MMNet: Multi-mask network for referring image segmentation
Y Yan, X He, W Wan, J Liu
arXiv preprint arXiv:2305.14969, 2023
Cited by: 2; Year: 2023
WL-MSR: Watch and Listen for Multimodal Subtitle Recognition
J Liu, H Wang, W Wang, X He, J Liu
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 1; Year: 2023
Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing
X He, W Wang, Z Xu, H Wang, J Jiang, J Liu
arXiv preprint arXiv:2109.02281, 2021
Cited by: 1; Year: 2021
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
T Yue, J Cheng, L Guo, X Dai, Z Zhao, X He, G Xiong, Y Lv, J Liu
arXiv preprint arXiv:2403.13263, 2024
2024
Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions
W Wang, Y Zhang, X He, Y Yan, Z Zhao, X Wang, J Liu
arXiv preprint arXiv:2402.11265, 2024
2024
Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation
W Wang, T Yue, Y Zhang, L Guo, X He, X Wang, J Liu
arXiv preprint arXiv:2312.08007, 2023
2023
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner
Z Liu, S Chen, L Guo, H Li, X He, J Liu
Proceedings of the 31st ACM International Conference on Multimedia, 5120-5131, 2023
2023
CSDNet: Contrastive Similarity Distillation Network for Multi-lingual Image-Text Retrieval
S Lu, L Guo, X He, X Zhu, J Liu, S Liu
International Conference on Image and Graphics, 385-395, 2023
2023
EAVL: Explicitly Align Vision and Language for Referring Image Segmentation
Y Yan, X He, W Wang, S Chen, J Liu
arXiv preprint arXiv:2308.09779, 2023
2023