Huizhen Yu

2020 – today

2024
[j16]
dblp key: journals/mor/Yu24
persistent URL: https://dblp.org/rec/journals/mor/Yu24
Huizhen Yu:
On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies. Math. Oper. Res. 49(3): 1734-1760 (2024)
[i13]
dblp key: journals/corr/abs-2408-16262
persistent URL: https://dblp.org/rec/journals/corr/abs-2408-16262
Yi Wan, Huizhen Yu, Richard S. Sutton:
On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes. CoRR abs/2408.16262 (2024)
[i12]
dblp key: journals/corr/abs-2409-03915
persistent URL: https://dblp.org/rec/journals/corr/abs-2409-03915
Huizhen Yu, Yi Wan, Richard S. Sutton:
Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning. CoRR abs/2409.03915 (2024)
2023
[i11]
dblp key: journals/corr/abs-2312-15091
persistent URL: https://dblp.org/rec/journals/corr/abs-2312-15091
Huizhen Yu, Yi Wan, Richard S. Sutton:
A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays. CoRR abs/2312.15091 (2023)
2022
[j15]
dblp key: journals/mor/Yu22
persistent URL: https://dblp.org/rec/journals/mor/Yu22
Huizhen Yu:
On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs. Math. Oper. Res. 47(2): 1474-1499 (2022)
2020
[j14]
dblp key: journals/siamco/Yu20
persistent URL: https://dblp.org/rec/journals/siamco/Yu20
Huizhen Yu:
On the Minimum Pair Approach for Average Cost Markov Decision Processes with Countable Discrete Action Spaces and Strictly Unbounded Costs. SIAM J. Control. Optim. 58(2): 660-685 (2020)
[j13]
dblp key: journals/siamco/Yu20a
persistent URL: https://dblp.org/rec/journals/siamco/Yu20a
Huizhen Yu:
Average Cost Optimality Inequality for Markov Decision Processes with Borel Spaces and Universally Measurable Policies. SIAM J. Control. Optim. 58(4): 2469-2502 (2020)
[c12]
dblp key: conf/icetm/Yu20
persistent URL: https://dblp.org/rec/conf/icetm/Yu20
Huizhen Yu:
Research on the Structural Impact of the Disappearance of China's Demographic Dividend on the Education Industry. ICETM 2020: 151-154

2010 – 2019

2018
[j12]
dblp key: journals/jmlr/YuMS18
persistent URL: https://dblp.org/rec/journals/jmlr/YuMS18
Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton:
On Generalized Bellman Equations and Temporal-Difference Learning. J. Mach. Learn. Res. 19: 48:1-48:49 (2018)
[i10]
dblp key: journals/corr/abs-1805-07476
persistent URL: https://dblp.org/rec/journals/corr/abs-1805-07476
Sina Ghiassian, Huizhen Yu, Banafsheh Rafiee, Richard S. Sutton:
Two geometric input transformation methods for fast online reinforcement learning with neural nets. CoRR abs/1805.07476 (2018)
2017
[c11]
dblp key: conf/ai/YuMS17
persistent URL: https://dblp.org/rec/conf/ai/YuMS17
Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton:
On Generalized Bellman Equations and Temporal-Difference Learning. Canadian AI 2017: 3-14
[i9]
dblp key: journals/corr/MahmoodYS17
persistent URL: https://dblp.org/rec/journals/corr/MahmoodYS17
Ashique Rupam Mahmood, Huizhen Yu, Richard S. Sutton:
Multi-step Off-policy Learning Without Importance Sampling Ratios. CoRR abs/1702.03006 (2017)
[i8]
dblp key: journals/corr/YuMS17
persistent URL: https://dblp.org/rec/journals/corr/YuMS17
Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton:
On Generalized Bellman Equations and Temporal-Difference Learning. CoRR abs/1704.04463 (2017)
[i7]
dblp key: journals/corr/abs-1712-09652
persistent URL: https://dblp.org/rec/journals/corr/abs-1712-09652
Huizhen Yu:
On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning. CoRR abs/1712.09652 (2017)
2016
[j11]
dblp key: journals/jmlr/Yu16
persistent URL: https://dblp.org/rec/journals/jmlr/Yu16
Huizhen Yu:
Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize. J. Mach. Learn. Res. 17: 220:1-220:58 (2016)
[i6]
dblp key: journals/corr/Yu16b
persistent URL: https://dblp.org/rec/journals/corr/Yu16b
Huizhen Yu:
Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms. CoRR abs/1605.02099 (2016)
2015
[j10]
dblp key: journals/mor/YuB15
persistent URL: https://dblp.org/rec/journals/mor/YuB15
Huizhen Yu, Dimitri P. Bertsekas:
A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies. Math. Oper. Res. 40(4): 926-968 (2015)
[j9]
dblp key: journals/siamco/Yu15
persistent URL: https://dblp.org/rec/journals/siamco/Yu15
Huizhen Yu:
On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes. SIAM J. Control. Optim. 53(4): 1982-2016 (2015)
[c10]
dblp key: conf/colt/Yu15
persistent URL: https://dblp.org/rec/conf/colt/Yu15
Huizhen Yu:
On Convergence of Emphatic Temporal-Difference Learning. COLT 2015: 1724-1751
[i5]
dblp key: journals/corr/Yu15e
persistent URL: https://dblp.org/rec/journals/corr/Yu15e
Huizhen Yu:
On Convergence of Emphatic Temporal-Difference Learning. CoRR abs/1506.02582 (2015)
[i4]
dblp key: journals/corr/MahmoodYWS15
persistent URL: https://dblp.org/rec/journals/corr/MahmoodYWS15
Ashique Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton:
Emphatic Temporal-Difference Learning. CoRR abs/1507.01569 (2015)
[i3]
dblp key: journals/corr/Yu15g
persistent URL: https://dblp.org/rec/journals/corr/Yu15g
Huizhen Yu:
Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize. CoRR abs/1511.07471 (2015)
2013
[j8]
dblp key: journals/anor/YuB13
persistent URL: https://dblp.org/rec/journals/anor/YuB13
Huizhen Yu, Dimitri P. Bertsekas:
Q-learning and policy iteration algorithms for stochastic shortest path problems. Ann. Oper. Res. 208(1): 95-132 (2013)
[j7]
dblp key: journals/mor/YuB13
persistent URL: https://dblp.org/rec/journals/mor/YuB13
Huizhen Yu, Dimitri P. Bertsekas:
On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems. Math. Oper. Res. 38(2): 209-227 (2013)
2012
[j6]
dblp key: journals/mor/BertsekasY12
persistent URL: https://dblp.org/rec/journals/mor/BertsekasY12
Dimitri P. Bertsekas, Huizhen Yu:
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming. Math. Oper. Res. 37(1): 66-94 (2012)
[j5]
dblp key: journals/siamco/Yu12
persistent URL: https://dblp.org/rec/journals/siamco/Yu12
Huizhen Yu:
Least Squares Temporal Difference Methods: An Analysis under General Conditions. SIAM J. Control. Optim. 50(6): 3310-3343 (2012)
[i2]
dblp key: journals/corr/abs-1207-1421
persistent URL: https://dblp.org/rec/journals/corr/abs-1207-1421
Huizhen Yu:
A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies. CoRR abs/1207.1421 (2012)
[i1]
dblp key: journals/corr/abs-1207-4154
persistent URL: https://dblp.org/rec/journals/corr/abs-1207-4154
Huizhen Yu, Dimitri P. Bertsekas:
Discretized Approximations for POMDP with Average Cost. CoRR abs/1207.4154 (2012)
2011
[j4]
dblp key: journals/siamjo/BertsekasY11
persistent URL: https://dblp.org/rec/journals/siamjo/BertsekasY11
Dimitri P. Bertsekas, Huizhen Yu:
A Unifying Polyhedral Approximation Framework for Convex Optimization. SIAM J. Optim. 21(1): 333-360 (2011)
2010
[j3]
dblp key: journals/mor/YuB10
persistent URL: https://dblp.org/rec/journals/mor/YuB10
Huizhen Yu, Dimitri P. Bertsekas:
Error Bounds for Approximations from Projected Linear Equations. Math. Oper. Res. 35(2): 306-329 (2010)
[c9]
dblp key: conf/allerton/BertsekasY10
persistent URL: https://dblp.org/rec/conf/allerton/BertsekasY10
Dimitri P. Bertsekas, Huizhen Yu:
Distributed asynchronous policy iteration in dynamic programming. Allerton 2010: 1368-1375
[c8]
dblp key: conf/cdc/BertsekasY10
persistent URL: https://dblp.org/rec/conf/cdc/BertsekasY10
Dimitri P. Bertsekas, Huizhen Yu:
Q-learning and enhanced policy iteration in discounted dynamic programming. CDC 2010: 1409-1416
[c7]
dblp key: conf/icml/Yu10
persistent URL: https://dblp.org/rec/conf/icml/Yu10
Huizhen Yu:
Convergence of Least Squares Temporal Difference Methods Under General Conditions. ICML 2010: 1207-1214

2000 – 2009

2009
[j2]
dblp key: journals/tac/YuB09
persistent URL: https://dblp.org/rec/journals/tac/YuB09
Huizhen Yu, Dimitri P. Bertsekas:
Convergence Results for Some Temporal Difference Methods Based on Least Squares. IEEE Trans. Autom. Control. 54(7): 1515-1531 (2009)
[c6]
dblp key: conf/adprl/YuB09
persistent URL: https://dblp.org/rec/conf/adprl/YuB09
Huizhen Yu, Dimitri P. Bertsekas:
Basis function adaptation methods for cost approximation in MDP. ADPRL 2009: 74-81
2008
[j1]
dblp key: journals/mor/YuB08
persistent URL: https://dblp.org/rec/journals/mor/YuB08
Huizhen Yu, Dimitri P. Bertsekas:
On Near Optimality of the Set of Finite-State Controllers for Average Cost POMDP. Math. Oper. Res. 33(1): 1-11 (2008)
[c5]
dblp key: conf/allerton/YuB08
persistent URL: https://dblp.org/rec/conf/allerton/YuB08
Huizhen Yu, Dimitri P. Bertsekas:
New error bounds for approximations from projected linear equations. Allerton 2008: 1116-1123
[c4]
dblp key: conf/ewrl/YuB08
persistent URL: https://dblp.org/rec/conf/ewrl/YuB08
Huizhen Yu, Dimitri P. Bertsekas:
New Error Bounds for Approximations from Projected Linear Equations. EWRL 2008: 253-267
2006
[b1]
dblp key: phd/ndltd/Yu06
persistent URL: https://dblp.org/rec/phd/ndltd/Yu06
Huizhen Yu:
Approximate solution methods for POMDP and POSMDP. Massachusetts Institute of Technology, Cambridge, MA, USA, 2006
2005
[c3]
dblp key: conf/uai/Yu05
persistent URL: https://dblp.org/rec/conf/uai/Yu05
Huizhen Yu:
A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies. UAI 2005: 642-657
2004
[c2]
dblp key: conf/uai/YuB04
persistent URL: https://dblp.org/rec/conf/uai/YuB04
Huizhen Yu, Dimitri P. Bertsekas:
Discretized Approximations for POMDP with Average Cost. UAI 2004: 519
2001
[c1]
dblp key: conf/pcm/YuG01
persistent URL: https://dblp.org/rec/conf/pcm/YuG01
Huizhen Yu, W. Eric L. Grimson:
Combining Configurational and Statistical Approaches in Image Retrieval. IEEE Pacific Rim Conference on Multimedia 2001: 293-300
