Title | Authors | Venue | Cited by | Year |
Direct preference optimization: Your language model is secretly a reward model | R Rafailov, A Sharma, E Mitchell, CD Manning, S Ermon, C Finn | Advances in Neural Information Processing Systems 36, 2024 | 1899 | 2024 |
Dynamics-aware unsupervised discovery of skills | A Sharma, S Gu, S Levine, V Kumar, K Hausman | International Conference on Learning Representations (ICLR), 2020, 2019 | 473 | 2019 |
Open X-Embodiment: Robotic Learning Datasets and RT-X Models | A Padalkar, A Pooley, A Jain, A Bewley, A Herzog, A Irpan, A Khazatsky, ... | arXiv preprint arXiv:2310.08864, 2023 | 375* | 2023 |
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | K Tian, E Mitchell, A Zhou, A Sharma, R Rafailov, H Yao, C Finn, ... | arXiv preprint arXiv:2305.14975, 2023 | 190 | 2023 |
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset | A Khazatsky, K Pertsch, S Nair, A Balakrishna, S Dasari, S Karamcheti, ... | arXiv preprint arXiv:2403.12945, 2024 | 78 | 2024 |
Variational empowerment as representation learning for goal-based reinforcement learning | J Choi, A Sharma, H Lee, S Levine, SS Gu | arXiv preprint arXiv:2106.01404, 2021 | 56* | 2021 |
Preference fine-tuning of LLMs should leverage suboptimal, on-policy data | F Tajwar, A Singh, A Sharma, R Rafailov, J Schneider, T Xie, S Ermon, ... | arXiv preprint arXiv:2404.14367, 2024 | 50 | 2024 |
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning | A Sharma, M Ahn, S Levine, V Kumar, K Hausman, S Gu | Robotics: Science and Systems (RSS), 2020 | 49 | 2020 |
Waypoint-Based Imitation Learning for Robotic Manipulation | LX Shi, A Sharma, TZ Zhao, C Finn | arXiv preprint arXiv:2307.14326, 2023 | 42 | 2023 |
Autonomous Reinforcement Learning via Subgoal Curricula | A Sharma, A Gupta, S Levine, K Hausman, C Finn | Thirty-Fifth Conference on Neural Information Processing Systems, 2021 | 37 | 2021 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections | LX Shi, Z Hu, TZ Zhao, A Sharma, K Pertsch, J Luo, S Levine, C Finn | arXiv preprint arXiv:2403.12910, 2024 | 35 | 2024 |
Autonomous Reinforcement Learning: Formalism and Benchmarking | A Sharma, K Xu, N Sardana, A Gupta, K Hausman, S Levine, C Finn | arXiv preprint arXiv:2112.09605, 2021 | 33* | 2021 |
An Emulator for Fine-Tuning Large Language Models using Small Language Models | E Mitchell, R Rafailov, A Sharma, C Finn, CD Manning | arXiv preprint arXiv:2310.12962, 2023 | 30 | 2023 |
You Only Live Once: Single-Life Reinforcement Learning | A Chen, A Sharma, S Levine, C Finn | Advances in Neural Information Processing Systems 35, 14784-14797, 2022 | 23 | 2022 |
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning | J Luo, Z Hu, C Xu, YL Tan, J Berg, A Sharma, S Schaal, C Finn, A Gupta, ... | arXiv preprint arXiv:2401.16013, 2024 | 22 | 2024 |
A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning | A Sharma, R Ahmad, C Finn | arXiv preprint arXiv:2205.05212, 2022 | 20 | 2022 |
Robot fine-tuning made easy: Pre-training rewards and policies for autonomous real-world reinforcement learning | J Yang, MS Mark, B Vu, A Sharma, J Bohg, C Finn | 2024 IEEE International Conference on Robotics and Automation (ICRA), 4804-4811, 2024 | 15 | 2024 |
When to ask for help: Proactive interventions in autonomous reinforcement learning | A Xie, F Tajwar, A Sharma, C Finn | Advances in Neural Information Processing Systems 35, 16918-16930, 2022 | 14 | 2022 |
Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning | A Sharma, AM Ahmed, R Ahmad, C Finn | arXiv preprint arXiv:2303.01488, 2023 | 13 | 2023 |
Stream of Search (SoS): Learning to Search in Language | K Gandhi, D Lee, G Grand, M Liu, W Cheng, A Sharma, ND Goodman | arXiv preprint arXiv:2404.03683, 2024 | 11 | 2024 |