Matthieu Geist
Cohere (ex Google, on leave of Professor, Université de Lorraine)
Verified email at univ-lorraine.fr
Title
Cited by
Year
What matters for on-policy deep actor-critic methods? a large-scale study
M Andrychowicz, A Raichuk, P Stańczyk, M Orsini, S Girgin, R Marinier, ...
International conference on learning representations, 2020
350* 2020
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
348 2023
A theory of regularized markov decision processes
M Geist, B Scherrer, O Pietquin
International Conference on Machine Learning, 2160-2169, 2019
285 2019
Human activity recognition using recurrent neural networks
D Singh, E Merdivan, I Psychoula, J Kropf, S Hanke, M Geist, A Holzinger
Machine Learning and Knowledge Extraction: First IFIP TC 5, WG 8.4, 8.9, 12 …, 2017
203 2017
Approximate modified policy iteration and its application to the game of Tetris.
B Scherrer, M Ghavamzadeh, V Gabillon, B Lesner, M Geist
J. Mach. Learn. Res. 16 (49), 1629-1676, 2015
148 2015
Inverse reinforcement learning through structured classification
E Klein, M Geist, B Piot, O Pietquin
Advances in neural information processing systems 25, 2012
122 2012
Kalman temporal differences
M Geist, O Pietquin
Journal of artificial intelligence research 39, 483-532, 2010
121 2010
Algorithmic survey of parametric value function approximation
M Geist, O Pietquin
IEEE Transactions on Neural Networks and Learning Systems 24 (6), 845-867, 2013
120* 2013
Sample-efficient batch reinforcement learning for dialogue management optimization
O Pietquin, M Geist, S Chandramohan, H Frezza-Buet
ACM Transactions on Speech and Language Processing (TSLP) 7 (3), 1-21, 2011
119 2011
Primal wasserstein imitation learning
R Dadashi, L Hussenot, M Geist, O Pietquin
arXiv preprint arXiv:2006.04678, 2020
117 2020
User simulation in dialogue systems using inverse reinforcement learning
S Chandramohan, M Geist, F Lefevre, O Pietquin
Interspeech 2011, 1025-1028, 2011
115 2011
On the convergence of model free learning in mean field games
R Elie, J Perolat, M Laurière, M Geist, O Pietquin
Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 7143-7150, 2020
109* 2020
IQ-Learn: Inverse soft-Q Learning for Imitation
D Garg, S Chakraborty, C Cundy, J Song, M Geist, S Ermon
arXiv preprint arXiv:2106.12142, 2022
107 2022
Off-policy learning with eligibility traces: a survey.
M Geist, B Scherrer
J. Mach. Learn. Res. 15 (1), 289-333, 2014
106 2014
Fictitious play for mean field games: Continuous time analysis and applications
S Perrin, J Pérolat, M Laurière, M Geist, R Elie, O Pietquin
Advances in neural information processing systems 33, 13199-13213, 2020
104 2020
Leverage the average: an analysis of kl regularization in reinforcement learning
N Vieillard, T Kozuno, B Scherrer, O Pietquin, R Munos, M Geist
Advances in Neural Information Processing Systems 33, 12163-12174, 2020
99* 2020
Bridging the gap between imitation learning and inverse reinforcement learning
B Piot, M Geist, O Pietquin
IEEE transactions on neural networks and learning systems 28 (8), 1814-1826, 2016
98 2016
Convolutional and recurrent neural networks for activity recognition in smart environment
D Singh, E Merdivan, S Hanke, J Kropf, M Geist, A Holzinger
Towards Integrative Machine Learning and Knowledge Extraction: BIRS Workshop …, 2017
92 2017
Boosted bellman residual minimization handling expert demonstrations
B Piot, M Geist, O Pietquin
Machine Learning and Knowledge Discovery in Databases: European Conference …, 2014
87 2014
Munchausen reinforcement learning
N Vieillard, O Pietquin, M Geist
Advances in Neural Information Processing Systems 33, 4235-4246, 2020
82 2020
Articles 1–20