Harm van Seijen
Sony AI
Verified email at sony.com
Title
Cited by
Year
Reducing network agnostophobia
AR Dhamija, M Günther, T Boult
Advances in Neural Information Processing Systems 31, 2018
Cited by: 348; Year: 2018
A theoretical and empirical analysis of Expected Sarsa
H Van Seijen, H Van Hasselt, S Whiteson, M Wiering
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement …, 2009
Cited by: 272; Year: 2009
Hybrid reward architecture for reinforcement learning
H Van Seijen, M Fatemi, J Romoff, R Laroche, T Barnes, J Tsang
Advances in Neural Information Processing Systems 30, 2017
Cited by: 260; Year: 2017
True online TD (lambda)
H van Seijen, R Sutton
International Conference on Machine Learning, 692-700, 2014
Cited by: 126; Year: 2014
True online temporal-difference learning
H Van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
Journal of Machine Learning Research 17 (145), 1-40, 2016
Cited by: 110; Year: 2016
Systematic generalisation with group invariant predictions
F Ahmed, Y Bengio, H Van Seijen, A Courville
International Conference on Learning Representations, 2020
Cited by: 99; Year: 2020
A Deeper Look at Planning as Learning from Replay
H van Seijen, RS Sutton
International Conference on Machine Learning, 2015
Cited by: 76; Year: 2015
Planning by prioritized sweeping with small backups
H Van Seijen, R Sutton
International Conference on Machine Learning, 361-369, 2013
Cited by: 60*; Year: 2013
Modular lifelong reinforcement learning via neural composition
JA Mendez, H van Seijen, E Eaton
arXiv preprint arXiv:2207.00429, 2022
Cited by: 41; Year: 2022
Hybrid reward architecture for reinforcement learning
HH Van Seijen, SMF Booshehri, RMH Laroche, JS Romoff
US Patent 10,977,551, 2021
Cited by: 38; Year: 2021
Using a logarithmic mapping to enable lower discount factors in reinforcement learning
H Van Seijen, M Fatemi, A Tavakoli
Advances in Neural Information Processing Systems 32, 2019
Cited by: 29; Year: 2019
Exploiting Best-Match Equations for Efficient Reinforcement Learning
H van Seijen, S Whiteson, H van Hasselt, M Wiering
Journal of Machine Learning Research 12 (6), 2011
Cited by: 27; Year: 2011
On value function representation of long horizon problems
L Lehnert, R Laroche, H van Seijen
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
Cited by: 26; Year: 2018
Multi-advisor reinforcement learning
R Laroche, M Fatemi, J Romoff, H van Seijen
arXiv preprint arXiv:1704.00756, 2017
Cited by: 25; Year: 2017
Effective multi-step temporal-difference learning for non-linear function approximation
H van Seijen
arXiv preprint arXiv:1608.05151, 2016
Cited by: 23; Year: 2016
Dead-ends and secure exploration in reinforcement learning
M Fatemi, S Sharma, H Van Seijen, SE Kahou
International Conference on Machine Learning, 1873-1881, 2019
Cited by: 21; Year: 2019
Efficient abstraction selection in reinforcement learning
H van Seijen, S Whiteson, L Kester
Computational Intelligence 30 (4), 657-699, 2014
Cited by: 18; Year: 2014
Learning invariances for policy generalization
R Tachet, P Bachman, H van Seijen
arXiv preprint arXiv:1809.02591, 2018
Cited by: 17; Year: 2018
Separation of concerns in reinforcement learning
H van Seijen, M Fatemi, J Romoff, R Laroche
arXiv preprint arXiv:1612.05159, 2016
Cited by: 14; Year: 2016
Towards evaluating adaptivity of model-based reinforcement learning methods
Y Wan, A Rahimi-Kalahroudi, J Rajendran, I Momennejad, S Chandar, ...
International Conference on Machine Learning, 22536-22561, 2022
Cited by: 12; Year: 2022