Tor Lattimore
DeepMind
Verified email at google.com
Title
Cited by
Year
Bandit algorithms
T Lattimore, C Szepesvári
Cambridge University Press, 2020
Cited by: 2555; Year: 2020
Unifying PAC and regret: Uniform PAC bounds for episodic reinforcement learning
C Dann, T Lattimore, E Brunskill
Advances in Neural Information Processing Systems 30, 2017
Cited by: 299; Year: 2017
Causal bandits: Learning good interventions via causal inference
F Lattimore, T Lattimore, MD Reid
Advances in Neural Information Processing Systems 29, 2016
Cited by: 254*; Year: 2016
Degenerate feedback loops in recommender systems
R Jiang, S Chiappa, T Lattimore, A György, P Kohli
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 383-390, 2019
Cited by: 208; Year: 2019
Learning with good feature representations in bandits and in RL with a generative model
T Lattimore, C Szepesvári, G Weisz
International Conference on Machine Learning, 5662-5670, 2020
Cited by: 175; Year: 2020
Behaviour suite for reinforcement learning
I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...
arXiv preprint arXiv:1908.03568, 2019
Cited by: 171; Year: 2019
PAC bounds for discounted MDPs
T Lattimore, M Hutter
Algorithmic Learning Theory: 23rd International Conference, ALT 2012, Lyon …, 2012
Cited by: 138; Year: 2012
The end of optimism? An asymptotic analysis of finite-armed linear bandits
T Lattimore, C Szepesvári
Artificial Intelligence and Statistics, 728-737, 2017
Cited by: 129; Year: 2017
Conservative bandits
Y Wu, R Shariff, T Lattimore, C Szepesvári
International Conference on Machine Learning, 1254-1262, 2016
Cited by: 119; Year: 2016
On explore-then-commit strategies
A Garivier, T Lattimore, E Kaufmann
Advances in Neural Information Processing Systems 29, 2016
Cited by: 111; Year: 2016
A geometric perspective on optimal representations for reinforcement learning
M Bellemare, W Dabney, R Dadashi, A Ali Taiga, PS Castro, N Le Roux, ...
Advances in Neural Information Processing Systems 32, 2019
Cited by: 95; Year: 2019
Model selection in contextual stochastic bandit problems
A Pacchiano, M Phan, Y Abbasi Yadkori, A Rao, J Zimmert, T Lattimore, ...
Advances in Neural Information Processing Systems 33, 10328-10337, 2020
Cited by: 86; Year: 2020
Garbage in, reward out: Bootstrapping exploration in multi-armed bandits
B Kveton, C Szepesvári, S Vaswani, Z Wen, T Lattimore, M Ghavamzadeh
International Conference on Machine Learning, 3601-3610, 2019
Cited by: 72; Year: 2019
TopRank: A practical algorithm for online stochastic ranking
T Lattimore, B Kveton, S Li, C Szepesvári
Advances in Neural Information Processing Systems 31, 2018
Cited by: 70; Year: 2018
Near-optimal PAC bounds for discounted MDPs
T Lattimore, M Hutter
Theoretical Computer Science 558, 125-143, 2014
Cited by: 69; Year: 2014
Bounded regret for finite-armed structured bandits
T Lattimore, R Munos
Cited by: 67; Year: 2014
The sample-complexity of general reinforcement learning
T Lattimore, M Hutter, P Sunehag
International Conference on Machine Learning, 28-36, 2013
Cited by: 67; Year: 2013
Linear bandits with stochastic delayed feedback
C Vernade, A Carpentier, T Lattimore, G Zappella, B Ermis, M Brueckner
International Conference on Machine Learning, 9712-9721, 2020
Cited by: 65; Year: 2020
An information-theoretic approach to minimax regret in partial monitoring
T Lattimore, C Szepesvári
Conference on Learning Theory, 2111-2139, 2019
Cited by: 59; Year: 2019
Adaptive exploration in linear contextual bandit
B Hao, T Lattimore, C Szepesvári
International Conference on Artificial Intelligence and Statistics, 3536-3545, 2020
Cited by: 58; Year: 2020