This is the third part of the Pacman AI project. In this part, I implemented a value iteration agent, a Q-learning reinforcement learning agent, and an approximate Q-learning agent.
Value Iteration Agent
The value iteration agent is an offline planner. It takes an MDP on construction and runs value iteration for a specified number of iterations before the constructor returns. We set that number of iterations ourselves as a parameter of this initial planning phase.
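To make the idea concrete, here is a minimal sketch of value iteration on a made-up two-state MDP. It does not use the course framework: the dictionary layout, the function names (`q_value`, `value_iteration`, `greedy_action`), and `TOY_MDP` are all assumptions chosen for illustration. It uses the batch form of the update, where every state's new value is computed from the previous iteration's values.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical MDP layout (an assumption for this sketch):
# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def q_value(transitions: Transitions, values: Dict[str, float],
            state: str, action: str, discount: float) -> float:
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(prob * (reward + discount * values[next_state])
               for prob, next_state, reward in transitions[state][action])


def value_iteration(transitions: Transitions, discount: float,
                    iterations: int) -> Dict[str, float]:
    """Run a fixed number of batch value-iteration sweeps, starting from 0."""
    values = {state: 0.0 for state in transitions}
    for _ in range(iterations):
        new_values = {}
        for state, actions in transitions.items():
            if not actions:
                # Terminal state: no actions available, value stays 0.
                new_values[state] = 0.0
            else:
                new_values[state] = max(
                    q_value(transitions, values, state, action, discount)
                    for action in actions
                )
        values = new_values  # swap only after the full sweep (batch update)
    return values


def greedy_action(transitions: Transitions, values: Dict[str, float],
                  state: str, discount: float) -> Optional[str]:
    """Pick the action with the highest Q-value, or None in a terminal state."""
    actions = transitions[state]
    if not actions:
        return None
    return max(actions,
               key=lambda a: q_value(transitions, values, state, a, discount))


# Toy MDP: "go" reaches the terminal goal with reward 1, "stay" does nothing.
TOY_MDP: Transitions = {
    "start": {"go": [(1.0, "goal", 1.0)], "stay": [(1.0, "start", 0.0)]},
    "goal": {},
}

if __name__ == "__main__":
    values = value_iteration(TOY_MDP, discount=0.9, iterations=5)
    print(values)
    print(greedy_action(TOY_MDP, values, "start", discount=0.9))
```

Because the planner only sweeps a fixed number of times, the values it ends up with depend on how many iterations we ask for. With zero iterations every state keeps its initial value of 0.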
Q-learning Reinforcement Learning Agent
Approximate Q Agent
This is the end of Pacman AI, Part III.
Readers of this post should not copy any of my code for their own course assignments, but feel free to be inspired and come up with your own solutions. I do not want to upload the code files to GitHub because that would make my code easy to copy and paste. The code screenshots are also blurred on purpose, because technically I should not upload solutions.
For this project, we all follow the same algorithms: value iteration, computing actions from values, computing Q-values, and so on. The reinforcement learning agent is the most AI-like agent I have built so far in this project, because we don't interfere with how it makes specific decisions. We don't tell it to try this before that; it figures that out through a trial-and-error learning process.
A current master's student at WUSTL, Department of Electrical and Systems Engineering.