Seminar No. 2215: Value-Gradient Formulation for Optimal Control Problem and its Machine-Learning Algorithm

Created: 2021/11/25  谭福平

Title: Value-Gradient Formulation for Optimal Control Problem and its Machine-Learning Algorithm

Speaker: Prof. 周翔 (City University of Hong Kong)

Time: Thursday, November 25, 2021, 10:30

Place: G507

Inviter: 余长君


Abstract: An optimal control problem is typically cast as a nonlinear Hamilton-Jacobi-Bellman (HJB) PDE that the value function satisfies. In this talk, we motivate focusing on the gradient of the value function and derive a PDE system for this (vector-valued) gradient, the value-gradient function. The system is closed and enjoys a nice component-decoupling property. It can be solved by the method of characteristics, in the same way as the linear HJB equation: a single characteristic curve produces data for both the value and the value-gradient. Supplemented by this additional value-gradient data, the value function is then computed by minimizing the sum of two mean squared errors between the data and the parametric function approximations. A few numerical examples show that taking the value-gradient into account improves both robustness and accuracy. Linear convergence of the iterative algorithm is proved under mild conditions. This is joint work with A. Bensoussan, P. Yam, and J. Y. Han.
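
The fitting step described in the abstract, minimizing the sum of two mean squared errors (one on values, one on value-gradients), can be illustrated with a small sketch. This is not the speaker's algorithm: the data below are synthetic rather than generated along characteristic curves, and the one-dimensional quadratic model, the function names `features` and `fit_value_with_gradient`, and the `weight` parameter are all assumptions made for illustration. Because the assumed model is linear in its parameters, the combined loss reduces to an ordinary least-squares problem.

```python
import numpy as np

def features(x):
    """Hypothetical model: V_theta(x) = theta0 + theta1*x + theta2*x^2.

    Returns the feature matrix [1, x, x^2] and its derivative [0, 1, 2x].
    """
    phi = np.stack([np.ones_like(x), x, x**2], axis=1)
    dphi = np.stack([np.zeros_like(x), np.ones_like(x), 2.0 * x], axis=1)
    return phi, dphi

def fit_value_with_gradient(x, v, g, weight=1.0):
    """Minimize mean((V_theta - v)^2) + weight * mean((V_theta' - g)^2).

    `weight` is an assumed knob balancing the two error terms.
    """
    phi, dphi = features(x)
    n = x.shape[0]
    s_v = np.sqrt(1.0 / n)      # scaling so squared residuals average over n
    s_g = np.sqrt(weight / n)   # same, with the gradient term weighted
    # Stack value rows and gradient rows into one least-squares system.
    A = np.vstack([s_v * phi, s_g * dphi])
    b = np.concatenate([s_v * v, s_g * g])
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta

# Synthetic stand-in for characteristic data: V(x) = x^2, V'(x) = 2x.
x = np.linspace(-1.0, 1.0, 21)
v = x**2
g = 2.0 * x
theta = fit_value_with_gradient(x, v, g)
```

Since the synthetic data lie exactly in the assumed model class, `theta` recovers the coefficients of x^2 up to rounding. With a neural-network parametrization the same two-term loss would instead be minimized by gradient-based training.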
