Michal Valko

Cited by

	All	Since 2019
Citations	10825	9980
h-index	42	37
i10-index	97	91

Citations per year

Year	Citations
2011	36
2012	26
2013	63
2014	61
2015	108
2016	141
2017	165
2018	199
2019	318
2020	608
2021	1430
2022	2771
2023	3573
2024	1269

Public access

54 articles available
0 articles not available

Based on funding mandates

Co-authors

Rémi Munos - DeepMind - Verified email at inria.fr
Mohammad Gheshlaghi Azar - Cohere AI - Verified email at google.com
Bilal Piot - Google Deepmind - Verified email at google.com
Corentin Tallec - DeepMind - Verified email at google.com
Jean-bastien Grill - Verified email at google.com
Zhaohan Daniel Guo - DeepMind - Verified email at google.com
Daniele Calandriello - Research Scientist, DeepMind - Verified email at google.com
Florent Altché - Research Engineer, DeepMind - Verified email at google.com
Pierre Ménard - OvGU Magdeburg - Verified email at inria.fr
Alessandro Lazaric - Research Scientist, Facebook Artificial Intelligence Research - Verified email at inria.fr
Florian STRUB - DeepMind - Verified email at google.com
Pierre Richemond - Google DeepMind - Verified email at deepmind.com
Emilie Kaufmann - CNRS & Univ. Lille (CRIStAL) - Verified email at inria.fr
Omar Darwiche Domingues - Owkin - Verified email at owkin.com
Branislav Kveton - Amazon - Verified email at amazon.com
Milos Hauskrecht - Professor of Computer Science, University of Pittsburgh - Verified email at pitt.edu
Yunhao Tang - Research Scientist, DeepMind - Verified email at columbia.edu
Mark Rowland - Research Scientist, Google DeepMind - Verified email at google.com
Matteo Pirotta - Research Scientist, Meta (FAIR) - Verified email at fb.com
Carl Doersch - Research Scientist, DeepMind - Verified email at google.com

Michal Valko

Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind

Verified email at meta.com

fine-tuning LLMs, RL with human feedback, deep reinforcement learning


Title	Cited by	Year
Bootstrap your own latent: A new approach to self-supervised learning. JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ... Neural Information Processing Systems, 2020	5844	2020
Large-scale representation learning on graphs via bootstrapping. S Thakoor, C Tallec, MG Azar, R Munos, P Veličković, M Valko. International Conference on Learning Representations, 2022	331*	2022
Finite-time analysis of kernelised contextual bandits. M Valko, N Korda, R Munos, I Flaounas, N Cristianini. Uncertainty in Artificial Intelligence, 2013	256	2013
Outlier detection for patient monitoring and alerting. M Hauskrecht, I Batal, M Valko, S Visweswaran, GF Cooper, G Clermont. Journal of Biomedical Informatics, 2013	171	2013
Online influence maximization under independent cascade model with semi-bandit feedback. Z Wen, B Kveton, M Valko, S Vaswani. Neural Information Processing Systems, 2017	143*	2017
Stochastic simultaneous optimistic optimization. M Valko, A Carpentier, R Munos. International Conference on Machine Learning, 2013	138	2013
Efficient learning by implicit exploration in bandit problems with side observations. T Kocák, G Neu, M Valko, R Munos. Neural Information Processing Systems, 2014	127	2014
Spectral bandits for smooth graph functions. M Valko, R Munos, B Kveton, T Kocák. International Conference on Machine Learning, 2014	126	2014
Broaden your views for self-supervised video learning. A Recasens, P Luc, JB Alayrac, L Wang, F Strub, C Tallec, M Malinowski, ... International Conference on Computer Vision, 2021	121	2021
Black-box optimization of noisy functions with unknown smoothness. JB Grill, M Valko, R Munos. Neural Information Processing Systems, 2015	109	2015
Episodic reinforcement learning in finite MDPs: Minimax lower bounds revisited. O Darwiche Domingues, P Ménard, E Kaufmann, M Valko. Algorithmic Learning Theory, 2021	103	2021
Simple regret for infinitely many armed bandits. A Carpentier, M Valko. International Conference on Machine Learning, 2015	101	2015
BYOL works even without batch statistics. PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ... NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020	85	2020
Adaptive reward-free exploration. E Kaufmann, P Ménard, OD Domingues, A Jonsson, E Leurent, M Valko. Algorithmic Learning Theory, 2021	83	2021
Game Plan: What AI can do for Football, and What Football can do for AI. K Tuyls, S Omidshafiei, P Muller, Z Wang, J Connor, D Hennes, I Graham, ... Journal of Artificial Intelligence Research 71, 41-88, 2021	82	2021
Gaussian process optimization with adaptive sketching: Scalable and no regret. D Calandriello, L Carratino, A Lazaric, M Valko, L Rosasco. Conference on Learning Theory, 2019	80	2019
Gamification of pure exploration for linear bandits. R Degenne, P Ménard, X Shang, M Valko. International Conference on Machine Learning, 2020	78	2020
Monte-Carlo tree search as regularized policy optimization. JB Grill, F Altché, Y Tang, T Hubert, M Valko, I Antonoglou, R Munos. International Conference on Machine Learning, 2020	68	2020
Fast active learning for pure exploration in reinforcement learning. P Ménard, OD Domingues, A Jonsson, E Kaufmann, E Leurent, M Valko. International Conference on Machine Learning, 2021	67	2021
DPPy: DPP sampling with Python. G Gautier, G Polito, R Bardenet, M Valko. Journal of Machine Learning Research 20 (180), 1-7, 2019	65	2019

Articles 1–20
