Yige Li
Neural attention distillation: Erasing backdoor triggers from deep neural networks
Y Li, X Lyu, N Koren, L Lyu, B Li, X Ma
ICLR 2021
Cited by 440
Anti-backdoor learning: Training clean models on poisoned data
Y Li, X Lyu, N Koren, L Lyu, B Li, X Ma
NeurIPS 2021
Cited by 294
Reconstructive Neuron Pruning for Backdoor Defense
Y Li, X Lyu, X Ma, N Koren, L Lyu, B Li, YG Jiang
ICML 2023
Cited by 32
Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing
W Zhao, Z Li, Y Li, Y Zhang, J Sun
EMNLP 2024
Cited by 9
Multi-Trigger Backdoor Attacks: More Triggers, More Threats
Y Li, X Ma, J He, H Huang, YG Jiang
arXiv preprint arXiv:2401.15295, 2024
Cited by 7
BackdoorLLM: A comprehensive benchmark for backdoor attacks on large language models
Y Li, H Huang, Y Zhao, X Ma, J Sun
arXiv preprint arXiv:2408.12798, 2024
Cited by 4
End-to-End Anti-Backdoor Learning on Images and Time Series
Y Jiang, X Ma, SM Erfani, Y Li, J Bailey
arXiv preprint arXiv:2401.03215, 2024
Cited by 1
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Y Zhao, X Zheng, L Luo, Y Li, X Ma, YG Jiang
arXiv preprint arXiv:2410.20971, 2024
Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models
Y Li, H Huang, J Zhang, X Ma, YG Jiang
arXiv preprint arXiv:2410.19427, 2024
AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models
J Zhang, J Ye, X Ma, Y Li, Y Yang, J Sang, DY Yeung
arXiv preprint arXiv:2410.05346, 2024
Adversarial Suffixes May Be Features Too!
W Zhao, Z Li, Y Li, J Sun
arXiv preprint arXiv:2410.00451, 2024
Do Influence Functions Work on Large Language Models?
Z Li, W Zhao, Y Li, J Sun
arXiv preprint arXiv:2409.19998, 2024