This is the third part of the Pacman AI project. In this part, I implemented a value iteration agent, a Q-learning reinforcement learning agent, and an approximate Q agent.
Value Iteration Agent
The value iteration agent is an offline planner. It takes a Markov decision process (MDP) on construction and runs value iteration for a specified number of iterations before the constructor returns, so all planning is finished before the agent takes its first action.
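To make the idea concrete, here is a minimal sketch of batch value iteration on a toy MDP. It is not my project code and does not use the course framework: the dictionary encoding of the MDP and the names `q_value`, `value_iteration`, `best_action`, and `toy_mdp` are all assumptions made up for this illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical MDP encoding (an assumption for this sketch):
# transitions[state][action] is a list of (probability, next_state, reward).
# A state with no actions is treated as terminal.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def q_value(transitions: Transitions, values: Dict[str, float],
            state: str, action: str, discount: float) -> float:
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        prob * (reward + discount * values[next_state])
        for prob, next_state, reward in transitions[state][action]
    )


def value_iteration(transitions: Transitions, discount: float,
                    iterations: int) -> Dict[str, float]:
    """Run a fixed number of batch sweeps; each sweep reads only the old values."""
    values = {state: 0.0 for state in transitions}
    for _ in range(iterations):
        new_values = {}
        for state, actions in transitions.items():
            if not actions:
                # Terminal state: no action available, value stays 0.
                new_values[state] = 0.0
            else:
                new_values[state] = max(
                    q_value(transitions, values, state, a, discount)
                    for a in actions
                )
        values = new_values
    return values


def best_action(transitions: Transitions, values: Dict[str, float],
                state: str, discount: float) -> Optional[str]:
    """Greedy action with respect to `values`, or None in a terminal state."""
    actions = transitions[state]
    if not actions:
        return None
    return max(actions,
               key=lambda a: q_value(transitions, values, state, a, discount))


# Toy MDP: from "start", "go" reaches the terminal state with reward 1,
# while "stay" loops back with reward 0.
toy_mdp: Transitions = {
    "start": {
        "go": [(1.0, "end", 1.0)],
        "stay": [(1.0, "start", 0.0)],
    },
    "end": {},
}

values = value_iteration(toy_mdp, discount=0.9, iterations=10)
print(values, best_action(toy_mdp, values, "start", discount=0.9))
```

Each sweep builds a fresh `new_values` dictionary instead of updating in place, so iteration k only ever sees the values from iteration k-1. Once the sweeps are done, choosing an action is just a lookup over Q-values, which is why no further planning is needed while the agent acts.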
Q-learning Reinforcement Learning Agent
Approximate Q Agent
This is the end of Pacman AI, Part III.
Readers of this post should not copy any of my code for their own course assignments, but feel free to be inspired and come up with your own solutions.
For this project, we all follow the same algorithms: value iteration, computing the action from the values, computing Q-values, and so on. The reinforcement learning agent is the most AI-like agent I have built so far in this project, because we don’t dictate how it makes specific decisions.