Convergence properties of policy iteration

Manuel Santos, John Rust

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

This paper analyzes asymptotic convergence properties of policy iteration in a class of stationary, infinite-horizon Markovian decision problems that arise in optimal growth theory. These problems have continuous state and control variables and must therefore be discretized in order to compute an approximate solution. The discretization may render inapplicable known convergence results for policy iteration such as those of Puterman and Brumelle [Math. Oper. Res., 4 (1979), pp. 60-69]. Under certain regularity conditions, we prove that for piecewise linear interpolation, policy iteration converges quadratically. Also, under more general conditions we establish that convergence is superlinear. We show how the constants involved in these convergence orders depend on the grid size of the discretization. These theoretical results are illustrated with numerical experiments that compare the performance of policy iteration and the method of successive approximations.
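The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a textbook growth model with log utility, Cobb-Douglas production and full depreciation, and the parameter values, grid sizes and function names (`bellman`, `policy_iteration`, `successive_approximations`) are all hypothetical. As a simplification, the maximization runs over a finite set of candidate controls rather than a continuous control variable; the value function is evaluated between grid points by piecewise linear interpolation.

```python
import numpy as np

alpha, beta = 0.3, 0.9            # assumed technology and discount parameters
n, m = 50, 250                    # state-grid size, number of candidate controls
k_ss = (alpha * beta) ** (1.0 / (1.0 - alpha))
grid = np.linspace(0.2 * k_ss, 2.0 * k_ss, n)   # capital grid (states)
cand = np.linspace(grid[0], grid[-1], m)        # candidate next-period capital

# One-period reward log(c) with c = k^alpha - k'; infeasible pairs get -inf.
c = grid[:, None] ** alpha - cand[None, :]
R = np.where(c > 0, np.log(np.maximum(c, 1e-300)), -np.inf)

# Piecewise linear interpolation weights, so that V(cand) = W @ V(grid).
j = np.clip(np.searchsorted(grid, cand, side="right") - 1, 0, n - 2)
w = (cand - grid[j]) / (grid[j + 1] - grid[j])
W = np.zeros((m, n))
W[np.arange(m), j] = 1.0 - w
W[np.arange(m), j + 1] = w


def bellman(V):
    """Return (T V, greedy policy indices) for the discretized operator T."""
    Q = R + beta * (W @ V)[None, :]
    return Q.max(axis=1), Q.argmax(axis=1)


def policy_iteration(tol=1e-10, max_iter=100):
    V = np.zeros(n)
    for it in range(1, max_iter + 1):
        _, pol = bellman(V)                            # policy improvement
        A = np.eye(n) - beta * W[pol]                  # policy evaluation solves
        V = np.linalg.solve(A, R[np.arange(n), pol])   # (I - beta W_pol) V = r_pol
        TV, _ = bellman(V)
        if np.max(np.abs(TV - V)) < tol:               # Bellman residual test
            return V, it
    return V, max_iter


def successive_approximations(tol=1e-10, max_iter=10_000):
    V = np.zeros(n)
    for it in range(1, max_iter + 1):
        TV, _ = bellman(V)
        if np.max(np.abs(TV - V)) < tol:
            return TV, it
        V = TV
    return V, max_iter


V_pi, pi_iters = policy_iteration()
V_sa, sa_iters = successive_approximations()
print("policy iteration steps:", pi_iters)
print("successive approximation steps:", sa_iters)
print("sup-norm gap between solutions:", np.max(np.abs(V_pi - V_sa)))
```

Each policy iteration step solves an n-by-n linear system, while each successive approximation step only applies the Bellman operator, so the iteration counts printed here are not a measure of total computational cost.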

Original language: English (US)
Pages (from-to): 2094-2115
Number of pages: 22
Journal: SIAM Journal on Control and Optimization
Volume: 42
Issue number: 6
DOI: 10.1137/S0363012902399824
State: Published - 2004
Externally published: Yes

Keywords

  • Complexity
  • Computational cost
  • Method of successive approximations
  • Policy iteration
  • Quadratic and superlinear convergence

ASJC Scopus subject areas

  • Mathematics(all)
  • Applied Mathematics
  • Control and Optimization

Cite this

Convergence properties of policy iteration. / Santos, Manuel; Rust, John.

In: SIAM Journal on Control and Optimization, Vol. 42, No. 6, 2004, p. 2094-2115.

@article{584040e910b147eeaaf11d6912742dc5,
title = "Convergence properties of policy iteration",
keywords = "Complexity, Computational cost, Method of successive approximations, Policy iteration, Quadratic and superlinear convergence",
author = "Manuel Santos and John Rust",
year = "2004",
doi = "10.1137/S0363012902399824",
language = "English (US)",
volume = "42",
pages = "2094--2115",
journal = "SIAM Journal on Control and Optimization",
issn = "0363-0129",
publisher = "Society for Industrial and Applied Mathematics Publications",
number = "6",

}

TY - JOUR

T1 - Convergence properties of policy iteration

AU - Santos, Manuel

AU - Rust, John

PY - 2004

Y1 - 2004

KW - Complexity

KW - Computational cost

KW - Method of successive approximations

KW - Policy iteration

KW - Quadratic and superlinear convergence

UR - http://www.scopus.com/inward/record.url?scp=10244257713&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=10244257713&partnerID=8YFLogxK

U2 - 10.1137/S0363012902399824

DO - 10.1137/S0363012902399824

M3 - Article

VL - 42

SP - 2094

EP - 2115

JO - SIAM Journal on Control and Optimization

JF - SIAM Journal on Control and Optimization

SN - 0363-0129

IS - 6

ER -