Fast LSTD using stochastic approximation: Finite time analysis and application to traffic control

L. A. Prashanth; Korda N.; Munos R.

doi:10.1007/978-3-662-44851-9_5

Published in Springer Verlag

2014

Volume: 8725 LNAI

Issue: PART 2

Pages: 66 - 81

Abstract

We propose a stochastic approximation based method with randomisation of samples for policy evaluation using the least squares temporal difference (LSTD) algorithm. Our method results in an O(d) improvement in complexity in comparison to regular LSTD, where d is the dimension of the data. We provide convergence rate results for our proposed method, both in high probability and in expectation. Moreover, we also establish that using our scheme in place of LSTD does not impact the rate of convergence of the approximate value function to the true value function. This result coupled with the low complexity of our method makes it attractive for implementation in big data settings, where d is large. Further, we also analyse a similar low-complexity alternative for least squares regression and provide finite-time bounds there. We demonstrate the practicality of our method for LSTD empirically by combining it with the LSPI algorithm in a traffic signal control application. © 2014 Springer-Verlag.
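The idea summarised in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so the version below is an assumption: a TD(0)-style stochastic approximation step applied to one transition drawn uniformly at random from a fixed batch at each iteration. The function name `fast_lstd_sa`, the tuple layout of `transitions`, and the step-size schedule `1/sqrt(n)` are hypothetical choices made for illustration only. Each iteration touches only vectors of length d, so its cost is O(d), whereas regular LSTD maintains a d x d matrix.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def fast_lstd_sa(transitions, dim, discount, n_iters, seed=0):
    """Approximate the LSTD solution by stochastic approximation.

    transitions: list of (phi, reward, phi_next) tuples, where phi and
        phi_next are feature vectors (lists of length dim). This layout
        is an assumption of this sketch, not taken from the paper.
    Returns the parameter vector theta after n_iters updates.
    """
    rng = random.Random(seed)
    theta = [0.0] * dim
    for n in range(1, n_iters + 1):
        # Randomisation of samples: pick one stored transition uniformly.
        phi, reward, phi_next = rng.choice(transitions)
        # Assumed step-size schedule; the paper's choice may differ.
        step = 1.0 / (n ** 0.5)
        # Temporal-difference error under the current parameters.
        td_error = reward + discount * dot(theta, phi_next) - dot(theta, phi)
        # O(d) update along the sampled feature vector.
        theta = [t + step * td_error * p for t, p in zip(theta, phi)]
    return theta


# Toy batch: state 0 moves to absorbing state 1 with reward 1;
# state 1 loops on itself with reward 0. Features are one-hot.
toy_batch = [
    ([1.0, 0.0], 1.0, [0.0, 1.0]),
    ([0.0, 1.0], 0.0, [0.0, 1.0]),
]
theta = fast_lstd_sa(toy_batch, dim=2, discount=0.5, n_iters=2000)
```

On this toy batch the true values are 1 for state 0 and 0 for state 1, so `theta` should end up close to `[1.0, 0.0]`. In a policy iteration loop such as LSPI, a routine of this kind would stand in for the exact LSTD solve in the policy evaluation step.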

Topics: Stochastic gradient descent (54%), Rate of convergence (53%) and Stochastic approximation (52%)

About the journal

Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher	Springer Verlag
ISSN	0302-9743
Open Access	No

Authors (1)

L. A. Prashanth
- Department of Computer Science and Engineering
