Mohammad Gheshlaghi Azar
Articles
25
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
16 January 2025 by Yannis Flet-Berliac and others
Machine Learning
Averaging log-likelihoods in direct alignment
27 June 2024 by Nathan Grinsztajn and others
Machine Learning
Nash Learning from Human Feedback
11 June 2024 by Rémi Munos and others at Inria Rocquencourt
Machine Learning, Artificial Intelligence
Self-Improving Robust Preference Optimization
7 June 2024 by Eugene Choi and others
Machine Learning, Artificial Intelligence
Offline Regularised Reinforcement Learning for Large Language Models Alignment
29 May 2024 by Pierre Harvey Richemond and others at Google
Machine Learning, Artificial Intelligence
An Analysis of Quantile Temporal-Difference Learning
20 May 2024 by Mark Rowland and others
Machine Learning
A General Theoretical Paradigm to Understand Learning from Human Preferences
22 November 2023 by Mohammad Gheshlaghi Azar and others at Google
Artificial Intelligence, Machine Learning
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
22 May 2023 by Toshinori Kitamura and others
Machine Learning
Understanding Self-Predictive Learning for Reinforcement Learning
6 December 2022 by Yunhao Tang and others
Machine Learning, Artificial Intelligence
BYOL-Explore: Exploration by Bootstrapped Prediction
16 June 2022 by Zhaohan Daniel Guo and others
Machine Learning, Artificial Intelligence
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal
27 May 2022 by Tadashi Kozuno and others
Machine Learning, Artificial Intelligence
Large-Scale Representation Learning on Graphs via Bootstrapping
4 November 2021 by Shantanu Thakoor and others
Machine Learning, Social and Information Networks
Bootstrap your own latent: A new approach to self-supervised Learning
9 September 2020 by Jean-Bastien Grill and others
Machine Learning, Computer Vision and Pattern Recognition
The Advantage Regret-Matching Actor-Critic
27 August 2020 by Audrūnas Gruslys and others
Artificial Intelligence, Machine Learning
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
30 April 2020 by Daniel Guo and others
Machine Learning, Artificial Intelligence
Neural Predictive Belief Representations
19 August 2019 by Zhaohan Daniel Guo and others
Machine Learning
World Discovery Models
21 February 2019 by Mohammad Gheshlaghi Azar and others
Artificial Intelligence, Applications
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
19 June 2018 by Audrunas Gruslys and others
Artificial Intelligence
Observe and Look Further: Achieving Consistent Performance on Atari
29 May 2018 by Tobias Pohlen and others
Machine Learning, Artificial Intelligence
Noisy Networks for Exploration
15 February 2018 by Meire Fortunato and others
Machine Learning
Minimax Regret Bounds for Reinforcement Learning
1 July 2017 by Mohammad Gheshlaghi Azar and others
Machine Learning, Artificial Intelligence
Convex Relaxation Regression: Black-Box Optimization of Smooth Functions by Learning Their Convex Envelopes
16 February 2016 by Mohammad Gheshlaghi Azar and others
Machine Learning
Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback
13 February 2014 by Mohammad Gheshlaghi Azar and others
Machine Learning, Systems and Control
Sequential Transfer in Multi-armed Bandit with Finite Set of Models
25 July 2013 by Mohammad Gheshlaghi Azar and others
Machine Learning
Regret Bounds for Reinforcement Learning with Policy Advice
17 July 2013 by Mohammad Gheshlaghi Azar and others
Machine Learning
On the Sample Complexity of Reinforcement Learning with a Generative Model
27 June 2012 by Mohammad Gheshlaghi Azar and others at Inria Rocquencourt
Machine Learning
Dynamic Policy Programming
6 September 2011 by Mohammad Gheshlaghi Azar and others
Machine Learning, Artificial Intelligence
Topics
Machine Learning
Artificial Intelligence
Systems and Control
Computer Science and Game Theory
Multiagent Systems
Social and Information Networks
Computer Vision and Pattern Recognition
Applications
Optimization and Control