Silviu Pitis
University of Toronto, Vector Institute
Verified email at cs.toronto.edu
Title
Cited by
Year
Large language models are human-level prompt engineers
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
International Conference on Learning Representations (ICLR 2023), 2023
824 · 2023
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
S Pitis, H Chan, S Zhao, B Stadie, J Ba
International Conference on Machine Learning (ICML 2020), 2020
137 · 2020
Counterfactual data augmentation using locally factored dynamics
S Pitis, E Creager, A Garg
Neural Information Processing Systems (NeurIPS 2020), 2020
97 · 2020
Identifying the risks of LM agents with an LM-emulated sandbox
Y Ruan, H Dong, A Wang, S Pitis, Y Zhou, J Ba, Y Dubois, CJ Maddison, ...
arXiv preprint arXiv:2309.15817, 2023
56 · 2023
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
S Pitis
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019
56 · 2019
Boosted prompt ensembles for large language models
S Pitis, MR Zhang, A Wang, J Ba
arXiv preprint arXiv:2304.05970, 2023
40 · 2023
MoCoDA: Model-based Counterfactual Data Augmentation
S Pitis, E Creager, A Mandlekar, A Garg
Neural Information Processing Systems (NeurIPS 2022), 2022
39 · 2022
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
K De Asis, A Chan, S Pitis, RS Sutton, D Graves
The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020
36 · 2020
Large language models are human-level prompt engineers (2022)
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
arXiv preprint arXiv:2211.01910, 2022
24 · 2022
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
S Pitis, H Chan, K Jamali, J Ba
Eighth International Conference on Learning Representations (ICLR 2020), 2020
24 · 2020
Source Traces for Temporal Difference Learning
S Pitis
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018
21 · 2018
Calibrating language models via augmented prompt ensembles
M Jiang, Y Ruan, S Huang, S Liao, S Pitis, RB Grosse, J Ba
16 · 2023
Failure modes of learning reward models for LLMs and other sequence models
S Pitis
ICML 2023 Workshop The Many Facets of Preference-Based Learning, 2023
11 · 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
S Pitis
Neural Information Processing Systems (NeurIPS 2023), 2023
10* · 2023
Steering large language models using APE
Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis, H Chan, J Ba
NeurIPS ML Safety Workshop, 2022
5 · 2022
Return augmentation gives supervised RL temporal compositionality
K Paster, S Pitis, SA McIlraith, J Ba
Deep Reinforcement Learning Workshop NeurIPS 2022, 2022
4 · 2022
CSC 311: Introduction to Machine Learning
R Grosse, C Maddison, J Bae, S Pitis
University of Toronto, Fall 2020
4 · 2020
Objective Social Choice: Using Auxiliary Information to Improve Voting Outcomes
S Pitis, MR Zhang
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2020), 2020
3 · 2020
ProtoGE: Prototype Goal Encodings for Multi-goal Reinforcement Learning
S Pitis, H Chan, J Ba
The 4th Multidisciplinary Conference on Reinforcement Learning and Decision …, 2019
3 · 2019
Methods for retrieving alternative contract language using a prototype
S Pitis
The Sixteenth International Conference on Law and Artificial Intelligence …, 2017
3 · 2017
Articles 1–20