
Markov Decision Process Addendum

My MDP tour does not end here; it begins to branch into other fields related to machine learning and reinforcement learning, and may, over a long horizon or soon, come back to MDP or POMDP. The content is all hand-written by me, with inspiration from many of the on-line lectures in the useful-link section. The information space of the grid world and the stochasticity in value iteration are organized from ➂Sebastian Thrun's MDP Illustration, the system of 2 states is inspired by ➁Virginia Tech CS5804 Spring 2015 on Youtube, and my overview of MDP comes from ➀MIT OCW 6-825-techniques-in-artificial-intelligence-sma-5504-fall-2002 MDP.

➀MIT OCW 6-825-techniques-in-artificial-intelligence-sma-5504-fall-2002 MDP
It provides a comprehensive overview of MDP, but does not include an illustration of value iteration; the suggestion is to follow ➁ for that part.
➁Virginia Tech CS5804 Spring 2015 on Youtube
It introduces the basic concepts of MDP and the computation of value iteration with an example of a system of 2 states; a minimal value-iteration sketch for such a 2-state system is given after this list.
➂Sebastian Thrun’s MDP Illustration
Sebastian Thrun is a great scientist, not only at Stanford, but also a leader of autonomous driving at Google. This is the collection of his on-line course videos on MDP. It constructs a beautiful MDP framework, beginning from intuition, guiding you past the drawbacks of conventional planning, and leading you to the concept of policy, using the technique of value iteration to seek the optimal policy, that is, the optimal action for each state. The suggestion is to follow the videos in this series in their given order.
➃still others…
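
To make the 2-state value iteration concrete, here is a minimal sketch in Python. The states, actions, transition probabilities, rewards, and discount factor below are my own assumptions for illustration only; they are not numbers taken from the CS5804 lecture or from Sebastian Thrun's videos.

```python
# Minimal value-iteration sketch for a hypothetical 2-state MDP.
# All dynamics and rewards below are assumed for illustration only.

GAMMA = 0.9      # discount factor (assumed)
EPSILON = 1e-6   # convergence threshold

# P[s][a] = list of (probability, next_state, reward) tuples (assumed dynamics)
P = {
    0: {0: [(0.8, 0, 1.0), (0.2, 1, 0.0)],    # state 0, action 0
        1: [(1.0, 1, 0.0)]},                  # state 0, action 1
    1: {0: [(1.0, 0, 0.0)],                   # state 1, action 0
        1: [(0.9, 1, 2.0), (0.1, 0, 0.0)]},   # state 1, action 1
}

def value_iteration(P, gamma=GAMMA, epsilon=EPSILON):
    V = {s: 0.0 for s in P}  # initialize V(s) = 0 for every state
    while True:
        delta = 0.0
        for s in P:
            # Bellman optimality backup:
            # V(s) = max_a sum_{s'} p(s'|s,a) * (r(s,a,s') + gamma * V(s'))
            q = {a: sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                 for a, outcomes in P[s].items()}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < epsilon:          # stop when values barely change
            break
    # Extract the greedy (optimal) policy from the converged values
    policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                             for p, s2, r in P[s][a]))
              for s in P}
    return V, policy

V, policy = value_iteration(P)
print(V)       # converged value of each state
print(policy)  # greedy action for each state
```

The same loop extends directly to the grid world of ➂: only the dictionary P of transitions and rewards changes, while the Bellman backup and the greedy policy extraction stay the same.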