1. Robust deep reinforcement learning through bootstrapped opportunistic curriculum. J. Wu, Y. Vorobeychik. International Conference on Machine Learning, pp. 24177-24211, 2022. [Cited by: 25]

2. Exact verification of ReLU neural control barrier functions. H. Zhang, J. Wu, Y. Vorobeychik, A. Clark. Advances in Neural Information Processing Systems 36, pp. 5685-5705, 2023. [Cited by: 17]

3. Neural Lyapunov control for discrete-time systems. J. Wu, A. Clark, Y. Kantaros, Y. Vorobeychik. Advances in Neural Information Processing Systems 36, pp. 2939-2955, 2023. [Cited by: 16]

4. On the exploitability of reinforcement learning with human feedback for large language models. J. Wang, J. Wu, M. Chen, Y. Vorobeychik, C. Xiao. arXiv preprint arXiv:2311.09641, 2023. [Cited by: 13]

5. Axioms for AI alignment from human feedback. L. Ge, D. Halpern, E. Micha, A. D. Procaccia, I. Shapira, Y. Vorobeychik, J. Wu. arXiv preprint arXiv:2405.14758, 2024. [Cited by: 7]

6. Manipulating elections by changing voter perceptions. J. Wu, A. Estornell, L. Kong, Y. Vorobeychik. Proceedings of the Thirty-First International Joint Conference on Artificial …, 2022. [Cited by: 7]

7. RLHFPoison: Reward poisoning attack for reinforcement learning with human feedback in large language models. J. Wang, J. Wu, M. Chen, Y. Vorobeychik, C. Xiao. Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024. [Cited by: 3]

8. Preference poisoning attacks on reward model learning. J. Wu, J. Wang, C. Xiao, C. Wang, N. Zhang, Y. Vorobeychik. arXiv preprint arXiv:2402.01920, 2024. [Cited by: 2]

9. Learning generative deception strategies in combinatorial masking games. J. Wu, C. Kamhoua, M. Kantarcioglu, Y. Vorobeychik. Decision and Game Theory for Security: 12th International Conference …, 2021. [Cited by: 2]

10. Certifying safety in reinforcement learning under adversarial perturbation attacks. J. Wu, H. Sibai, Y. Vorobeychik. 2024 IEEE Security and Privacy Workshops (SPW), pp. 57-67, 2024. [Cited by: 1]

11. Verified safe reinforcement learning for neural network dynamic models. J. Wu, H. Zhang, Y. Vorobeychik. arXiv preprint arXiv:2405.15994, 2024.