In probability theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or a continuous-time Markov chain by adding a reward rate to each state. An additional variable records the reward accumulated up to the current time.[1] Features of interest in the model include the expected reward at a given time and the expected time to accumulate a given reward.[2] The model appears in Ronald A. Howard's book.[3] The models are often studied in the context of Markov decision processes, where a decision strategy can affect the rewards received.
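For a discrete-time chain, the expected reward accumulated over the first n steps is the sum, over those steps, of the transient state distribution dotted with the vector of per-state rewards. The following is a minimal sketch (not from the source), assuming a hypothetical two-state chain with transition matrix P, per-step rewards r, and initial distribution pi0:

```python
import numpy as np

# Hypothetical two-state chain: P is the transition matrix,
# r gives the reward earned per step in each state.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
r = np.array([1.0, 10.0])
pi0 = np.array([1.0, 0.0])   # start in state 0

def expected_accumulated_reward(P, r, pi0, n):
    """Expected total reward collected over the first n steps."""
    total, dist = 0.0, pi0.copy()
    for _ in range(n):
        total += dist @ r    # expected reward earned this step
        dist = dist @ P      # advance the state distribution one step
    return total
```

For example, starting in state 0, one step yields expected reward 1.0, and two steps yield 1.0 + (0.9·1.0 + 0.1·10.0) = 2.9.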
The Markov Reward Model Checker tool can be used to numerically compute transient and stationary properties of Markov reward models.
The accumulated reward at a time t can be computed numerically over the time domain, or by evaluating the linear hyperbolic system of equations that describes the accumulated reward, using transform methods or finite difference methods.[4]
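One simple time-domain approach for a continuous-time chain is to integrate the transient distribution pi(s), which satisfies dpi/ds = pi·Q, and accumulate the instantaneous expected reward rate pi(s)·r. The sketch below (an illustration, not from the source) assumes a hypothetical two-state chain with generator matrix Q and reward rates r, and uses forward Euler integration purely for simplicity:

```python
import numpy as np

# Hypothetical two-state continuous-time chain: Q is the generator,
# r gives the reward accrued per unit time in each state.
Q = np.array([[-2.0,  2.0],
              [ 3.0, -3.0]])
r = np.array([0.0, 5.0])
pi0 = np.array([1.0, 0.0])   # start in state 0

def accumulated_reward(Q, r, pi0, t, steps=20000):
    """Approximate E[Y(t)] = integral of pi(s)·r over [0, t] by forward Euler."""
    h = t / steps
    pi, total = pi0.copy(), 0.0
    for _ in range(steps):
        total += (pi @ r) * h    # reward accrued during [s, s + h)
        pi = pi + h * (pi @ Q)   # Euler step for the transient distribution
    return total
```

For this two-state example the answer is available in closed form (pi_1(s) = 0.4(1 − e^{−5s}), so E[Y(t)] = 2t − 0.4(1 − e^{−5t})), which makes it easy to check the numerical result; in general, transform or finite difference methods as cited above would be used instead.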