Bruno Scherrer
2020 – today
- 2021
- [i21] Lin Chen, Bruno Scherrer, Peter L. Bartlett:
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm. CoRR abs/2103.09847 (2021)
- 2020
- [c35] Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. AISTATS 2020: 2529-2538
- [c34] Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020
- [i20] Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020)
2010 – 2019
- 2019
- [c33] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
How to Combine Tree-Search Methods in Reinforcement Learning. AAAI 2019: 3494-3501
- [c32] Romain Postoyan, Mathieu Granzotto, Lucian Busoniu, Bruno Scherrer, Dragan Nesic, Jamal Daafouz:
Stability guarantees for nonlinear discrete-time systems controlled by approximate value iteration. CDC 2019: 487-492
- [c31] Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. ICML 2019: 2160-2169
- [i19] Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. CoRR abs/1901.11275 (2019)
- [i18] Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. CoRR abs/1910.09322 (2019)
- 2018
- [c30] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
Beyond the One-Step Greedy Approach in Reinforcement Learning. ICML 2018: 1386-1395
- [c29] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning. NeurIPS 2018: 5244-5253
- [i17] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
Beyond the One Step Greedy Approach in Reinforcement Learning. CoRR abs/1802.03654 (2018)
- [i16] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning. CoRR abs/1805.07956 (2018)
- [i15] Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
How to Combine Tree-Search Methods in Reinforcement Learning. CoRR abs/1809.01843 (2018)
- [i14] Matthieu Geist, Bruno Scherrer:
Anderson Acceleration for Reinforcement Learning. CoRR abs/1809.09501 (2018)
- 2016
- [b2] Bruno Scherrer:
Contributions algorithmiques au contrôle optimal stochastique à temps discret et horizon infini. University of Lorraine, Nancy, France, 2016
- [j11] Bruno Scherrer:
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. Math. Oper. Res. 41(3): 758-774 (2016)
- [c28] Julien Pérolat, Bilal Piot, Bruno Scherrer, Olivier Pietquin:
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games. AISTATS 2016: 893-901
- [c27] Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
Softened Approximate Policy Iteration for Markov Games. ICML 2016: 1860-1868
- 2015
- [j10] Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, Matthieu Geist:
Approximate modified policy iteration and its application to the game of Tetris. J. Mach. Learn. Res. 16: 1629-1676 (2015)
- [j9] Bruno Scherrer, Matthieu Geist:
Recherche locale de politique dans un espace convexe. Rev. d'Intelligence Artif. 29(6): 685-704 (2015)
- [c26] Julien Pérolat, Bruno Scherrer, Bilal Piot, Olivier Pietquin:
Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games. ICML 2015: 1321-1329
- [c25] Manel Tagorti, Bruno Scherrer:
On the Rate of Convergence and Error Bounds for LSTD(\(\lambda\)). ICML 2015: 1521-1529
- [c24] Boris Lesner, Bruno Scherrer:
Non-Stationary Approximate Modified Policy Iteration. ICML 2015: 1567-1575
- 2014
- [j8] Matthieu Geist, Bruno Scherrer:
Off-policy learning with eligibility traces: a survey. J. Mach. Learn. Res. 15(1): 289-333 (2014)
- [j7] Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer:
Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming. Oper. Res. Lett. 42(6-7): 429-431 (2014)
- [c23] Bruno Scherrer:
Approximate Policy Iteration Schemes: A Comparison. ICML 2014: 1314-1322
- [c22] Bruno Scherrer, Matthieu Geist:
Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search. ECML/PKDD (3) 2014: 35-50
- [i13] Bruno Scherrer:
Approximate Policy Iteration Schemes: A Comparison. CoRR abs/1405.2878 (2014)
- [i12] Manel Tagorti, Bruno Scherrer:
Rate of Convergence and Error Bounds for LSTD(λ). CoRR abs/1405.3229 (2014)
- 2013
- [j6] Bruno Scherrer:
Performance bounds for λ policy iteration and application to the game of Tetris. J. Mach. Learn. Res. 14(1): 1181-1227 (2013)
- [c21] Bruno Scherrer:
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. NIPS 2013: 386-394
- [c20] Victor Gabillon, Mohammad Ghavamzadeh, Bruno Scherrer:
Approximate Dynamic Programming Finally Performs Well in the Game of Tetris. NIPS 2013: 1754-1762
- [i11] Matthieu Geist, Bruno Scherrer:
Off-policy Learning with Eligibility Traces: A Survey. CoRR abs/1304.3999 (2013)
- [i10] Boris Lesner, Bruno Scherrer:
Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies. CoRR abs/1304.5610 (2013)
- [i9] Bruno Scherrer:
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. CoRR abs/1306.0386 (2013)
- [i8] Bruno Scherrer:
On the Performance Bounds of some Policy Search Dynamic Programming Algorithms. CoRR abs/1306.0539 (2013)
- [i7] Bruno Scherrer, Matthieu Geist:
Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee. CoRR abs/1306.1520 (2013)
- 2012
- [c19] Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh:
A Dantzig Selector Approach to Temporal Difference Learning. ICML 2012
- [c18] Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist:
Approximate Modified Policy Iteration. ICML 2012
- [c17] Bruno Scherrer, Boris Lesner:
On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes. NIPS 2012: 1835-1843
- [i6] Bruno Scherrer:
On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes. CoRR abs/1203.5532 (2012)
- [i5] Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist:
Approximate Modified Policy Iteration. CoRR abs/1205.3054 (2012)
- [i4] Bruno Scherrer, Boris Lesner:
On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes. CoRR abs/1211.6898 (2012)
- 2011
- [c16] Matthieu Geist, Bruno Scherrer:
ℓ1-Penalized Projected Bellman Residual. EWRL 2011: 89-101
- [c15] Bruno Scherrer, Matthieu Geist:
Recursive Least-Squares Learning with Eligibility Traces. EWRL 2011: 115-127
- [c14] Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Bruno Scherrer:
Classification-based Policy Iteration with a Critic. ICML 2011: 1049-1056
- 2010
- [c13] Bruno Scherrer:
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view. ICML 2010: 959-966
- [c12] Christophe Thiery, Bruno Scherrer:
Least-Squares Policy Iteration: Bias-Variance Trade-off in Control Problems. ICML 2010: 1071-1078
- [i3] Bruno Scherrer:
Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view. CoRR abs/1011.4362 (2010)
2000 – 2009
- 2009
- [j5] Christophe Thiery, Bruno Scherrer:
Building Controllers for Tetris. J. Int. Comput. Games Assoc. 32(1): 3-11 (2009)
- [j4] Christophe Thiery, Bruno Scherrer:
Improvements on Learning Tetris with Cross Entropy. J. Int. Comput. Games Assoc. 32(1): 23-33 (2009)
- [j3] Christophe Thiery, Bruno Scherrer:
Construction d'un joueur artificiel pour Tetris. Rev. d'Intelligence Artif. 23(2-3): 387-407 (2009)
- 2008
- [j2] Amine M. Boumaza, Bruno Scherrer:
Analyse d'un algorithme d'intelligence en essaim pour le fourragement. Rev. d'Intelligence Artif. 22(6): 791-816 (2008)
- [c11] Marek Petrik, Bruno Scherrer:
Biasing Approximate Dynamic Programming with a Lower Discount Factor. NIPS 2008: 1265-1272
- [c10] César Torres-Huitzil, Bernard Girau, Amine M. Boumaza, Bruno Scherrer:
Embedded Harmonic Control for Trajectory Planning in Large Environments. ReConFig 2008: 7-12
- 2007
- [c9] Amine M. Boumaza, Bruno Scherrer:
Convergence and rate of convergence of a simple ant model. AAMAS 2007: 152
- [c8] Amine M. Boumaza, Bruno Scherrer:
Convergence and rate of convergence of a foraging ant model. IEEE Congress on Evolutionary Computation 2007: 469-476
- [c7] Amine M. Boumaza, Bruno Scherrer:
Optimal control subsumes harmonic control. ICRA 2007: 2841-2846
- [i2] Bruno Scherrer:
Performance Bounds for Lambda Policy Iteration. CoRR abs/0711.0694 (2007)
- 2006
- [i1] Bruno Scherrer:
Modular self-organization. CoRR abs/cs/0609142 (2006)
- 2005
- [j1] Bruno Scherrer:
Asynchronous neurocomputing for optimal control and reinforcement learning with large state spaces. Neurocomputing 63: 229-251 (2005)
- 2003
- [b1] Bruno Scherrer:
Apprentissage de représentation et auto-organisation modulaire pour un agent autonome. Henri Poincaré University, Nancy, France, 2003
- [c6] Bruno Scherrer:
Parallel asynchronous distributed computations of optimal control in large state space Markov Decision processes. ESANN 2003: 325-330
- [c5] Iadine Chades, Bruno Scherrer, François Charpillet:
Planning Cooperative Homogeneous Multiagent Systems Using Markov Decision Processes. ICEIS (2) 2003: 426-429
- [c4] Bruno Scherrer:
Modular self-organization for a long-living autonomous agent. IJCAI 2003: 1440-1442
- 2002
- [c3] Bruno Scherrer, François Charpillet:
Coevolutive planning in markov decision processes. AAMAS 2002: 843-844
- [c2] Bruno Scherrer, François Charpillet:
Cooperative Co-Learning: A Model-Based Approach for Solving Multi Agent Reinforcement Problems. ICTAI 2002: 463-468
- [c1] Iadine Chades, Bruno Scherrer, François Charpillet:
A heuristic approach for solving decentralized-POMDP: assessment on the pursuit problem. SAC 2002: 57-62
last updated on 2024-08-05 20:20 CEST by the dblp team
all metadata released as open data under CC0 1.0 license