1st International ICST Workshop on Tools for solving Structured Markov Chains

Research Article

A policy iteration algorithm for Markov decision processes skip-free in one direction

  • @INPROCEEDINGS{10.4108/smctools.2007.1948,
        author={J. Lambert and B. Van Houdt and C. Blondia},
        title={A policy iteration algorithm for Markov decision processes skip-free in one direction},
        booktitle={1st International ICST Workshop on Tools for solving Structured Markov Chains},
        proceedings_a={SMCTOOLS},
        year={2010},
        month={5},
        keywords={Matrix analytic methods, Markov decision process, skip-free in one direction, optical buffer, fibre delay lines, loss rate},
        doi={10.4108/smctools.2007.1948}
    }
    
  • J. Lambert
    B. Van Houdt
    C. Blondia
    Year: 2010
    A policy iteration algorithm for Markov decision processes skip-free in one direction
    SMCTOOLS
    ICST
    DOI: 10.4108/smctools.2007.1948
J. Lambert1,*, B. Van Houdt1,*, C. Blondia1,*
  • 1: University of Antwerp, Department of Mathematics and Computer Science, Performance Analysis of Telecommunication Systems Research Group, Middelheimlaan 1, B-2020 Antwerp, Belgium
*Contact email: oke.lambert@ua.ac.be, benny.vanhoudt@ua.ac.be, chris.blondia@ua.ac.be

Abstract

In this paper we present a new policy iteration algorithm for Markov decision processes (MDPs) that are skip-free in one direction. The algorithm, which is based on matrix analytic methods, is in the same spirit as the algorithm of White (Stochastic Models, 21:785-797, 2005), which was limited to matrices that are skip-free in both directions. Optimization problems that can be solved using Markov decision processes arise in the domain of optical buffers, when trying to improve the loss rates of fibre delay line (FDL) buffers. Based on the analysis of such an FDL buffer, we present a comparative study of the different techniques available to solve an MDP. The results illustrate that exploiting the structure of the transition matrices allows us to deal with larger systems while reducing computation times.
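
For readers unfamiliar with the terminology, the following is a minimal sketch of generic policy iteration on a tiny MDP whose transition matrices are skip-free in one direction (here: the state index never drops by more than one per step). It is not the structured, matrix-analytic algorithm of the paper. The discounted-cost criterion, the three-state "buffer", the two actions, all numbers, and the names `is_skip_free_down`, `evaluate` and `policy_iteration` are assumptions made purely for illustration.

```python
# Illustrative only: textbook discounted-cost policy iteration on a toy MDP.
# All data and function names below are hypothetical, not taken from the paper.

def is_skip_free_down(P, tol=1e-12):
    """True if no transition lowers the state index by more than one."""
    n = len(P)
    return all(abs(P[i][j]) <= tol
               for i in range(n) for j in range(n) if j < i - 1)

def evaluate(policy, P, c, gamma, sweeps=2000):
    """Approximate the value of a fixed policy by repeated backups."""
    n = len(policy)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [c[policy[s]][s]
             + gamma * sum(P[policy[s]][s][t] * v[t] for t in range(n))
             for s in range(n)]
    return v

def policy_iteration(P, c, gamma=0.9):
    """P[a][s][t]: transition probabilities, c[a][s]: one-step costs."""
    n_actions, n = len(P), len(P[0])
    policy = [0] * n
    while True:
        # Policy evaluation step.
        v = evaluate(policy, P, c, gamma)

        def q(a, s):
            return c[a][s] + gamma * sum(P[a][s][t] * v[t] for t in range(n))

        # Policy improvement step: switch only on a strict improvement.
        new_policy = []
        for s in range(n):
            best = policy[s]
            for a in range(n_actions):
                if q(a, s) < q(best, s) - 1e-9:
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy

# Toy data: states are buffer levels 0..2; action 0 = slow, action 1 = fast.
P = [
    [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.0, 0.2, 0.8]],  # slow service
    [[0.5, 0.3, 0.2], [0.7, 0.2, 0.1], [0.0, 0.7, 0.3]],  # fast service
]
c = [
    [0.0, 1.0, 2.0],  # holding cost only
    [0.5, 1.5, 2.5],  # holding cost plus a surcharge for fast service
]

policy, v = policy_iteration(P, c)
```

The evaluation step in this sketch iterates over the full transition matrix and ignores the zero pattern that `is_skip_free_down` detects. According to the abstract, exploiting that structure is what makes larger systems tractable.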