Stephen McAleer
OpenAI
Verified email at openai.com
Title
Cited by
Year
Highly accurate machine fault diagnosis using deep transfer learning
S Shao, S McAleer, R Yan, P Baldi
IEEE Transactions on Industrial Informatics 15 (4), 2446-2455, 2018
Cited by: 1195; Year: 2018
Solving the Rubik’s cube with deep reinforcement learning and search
F Agostinelli*, S McAleer*, A Shmakov*, P Baldi
Nature Machine Intelligence 1 (8), 356-363, 2019
Cited by: 231; Year: 2019
Language Models can Solve Computer Tasks
G Kim, P Baldi, S McAleer
Neural Information Processing Systems (NeurIPS), 2023
Cited by: 218; Year: 2023
Mastering the game of Stratego with model-free multiagent reinforcement learning
J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub, V de Boer, ...
Science 378 (6623), 990-996, 2022
Cited by: 197; Year: 2022
Llemma: An Open Language Model for Mathematics
Z Azerbayev, H Schoelkopf, K Paster, M Dos Santos, S McAleer, AQ Jiang, ...
International Conference on Learning Representations (ICLR), 2023
Cited by: 151; Year: 2023
AI Alignment: A Comprehensive Survey
J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang, Y Duan, Z He, J Zhou, ...
arXiv preprint arXiv:2310.19852, 2023
Cited by: 136; Year: 2023
Solving the Rubik's Cube with Approximate Policy Iteration
S McAleer*, F Agostinelli*, A Shmakov*, P Baldi
International Conference on Learning Representations (ICLR), 2018
Cited by: 98*; Year: 2018
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Y Chen, Y Yang, T Wu, S Wang, X Feng, J Jiang, SM McAleer, H Dong, ...
36th Conference on Neural Information Processing Systems (NeurIPS 2022 …, 2022
Cited by: 81; Year: 2022
Pipeline PSRO: A scalable approach for finding approximate nash equilibria in large games
S McAleer*, J Lanier*, R Fox, P Baldi
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
Cited by: 78; Year: 2020
Evolutionary reinforcement learning for sample-efficient multiagent coordination
S Majumdar, S Khadka, S Miret, S McAleer, K Tumer
International Conference on Machine Learning (ICML), 2020
Cited by: 70; Year: 2020
XDO: A double oracle algorithm for extensive-form games
S McAleer, J Lanier, P Baldi, R Fox
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 55; Year: 2021
Independent Natural Policy Gradient Always Converges in Markov Potential Games
R Fox, S McAleer, W Overman, I Panageas
AISTATS 2022, 2021
Cited by: 54; Year: 2021
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 48*; Year: 2021
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Cited by: 34; Year: 2024
Online Double Oracle
LC Dinh, Y Yang, S McAleer, NP Nieves, O Slumbers, Z Tian, DH Mguni, ...
Transactions on Machine Learning Research, 2021
Cited by: 30; Year: 2021
White Paper: ARIANNA-200 high energy neutrino telescope
A Anker, P Baldi, SW Barwick, D Bergman, H Bernhoff, DZ Besson, ...
arXiv preprint arXiv:2004.09841, 2020
Cited by: 29; Year: 2020
Deep-learning-based reconstruction of the neutrino direction and energy for in-ice radio detectors
C Glaser, S McAleer, S Stjärnholm, P Baldi, SW Barwick
Astroparticle Physics 145, 102781, 2023
Cited by: 28*; Year: 2023
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games
S McAleer, JB Lanier, K Wang, P Baldi, R Fox, T Sandholm
International Conference on Learning Representations (ICLR), 2022
Cited by: 24*; Year: 2022
Reducing variance in temporal-difference value estimation via ensemble of deep networks
L Liang, Y Xu, S McAleer, D Hu, A Ihler, P Abbeel, R Fox
International Conference on Machine Learning (ICML), 2022
Cited by: 23*; Year: 2022
Curiosity-Driven Multi-Criteria Hindsight Experience Replay
J Lanier, S McAleer, P Baldi
NeurIPS 2019 Deep RL Workshop, 2019
Cited by: 23; Year: 2019