This is the second part of the Pacman AI project. In this part of the project, I implemented the Reflex agent, Minimax agent, Alpha-Beta agent and Expectimax agent.
Reflex Agent evaluates the situation right now, and what would the situation become after taking certain actions. In order to implement an agent like this, we must consider carefully about the evaluation function. In my evaluation function, I take food, capsule, and ghosts into consideration.
For the food part, I only calculated the closest food left and the farthest food left. I did this because if I calculate the total distance to all food left, it is not as effective or efficient, which is also supported by the result of my test implementation. Then, I evaluated the capsule-related situation. I used two boolean values as flags for whether I found the closest food and the farthest food. Capsules disable ghosts and keep the agent perfectly safe for a certain amount of time. I consider both
Capsules disable ghosts and keep the agent perfectly safe for a certain amount of time. I calculated the closest capsule distance as I did for food. However, the agent does not always need to eat a capsule, and sometimes it needs the capsule as soon as possible. Hence I consider the scared time left of the ghost together. If ghosts are already scared, I do not want to waste a capsule. If the ghosts are about to take the agent, a capsule would be a lifesaver.
Ghost positions are life and death issue of the game. If the agent gets caught, it the end of the game. I took the distance to the closest ghost as I did for capsules.
Now assume the agent takes a legal move, what happens?
Does it get caught? If so, do not take that move. Also, if it gets very close to the ghost, it will get caught easily. Thus I want the agent to keep a safe distance to the ghosts. What if the scared time is long enough, and even if the ghost comes directly to the agent, it would be killed? Then I don’t really care about the getting caught.
Does it get food? If so, it is a fair move.
Does it get a capsule? If so, it is great. If the scared time is less than 2 seconds, it is even greater.
It took me some time to adjust the parameter for the quantification of each aspect I took into consideration. It worked fine with the “openClassic” map. It wins 10 out of 10, and the average score is above 1000. This is certainly not the best I can do, but I will improve it later.
Minimax Agent (suppose two supreme players are fighting against each other)
The logic is simple. The agent takes the best next move, and the ghosts take their best moves. The agent wants the evaluation of the game state to be the best, and the villains want the exact opposite. Two steps are calculated in one level of digging: maximization and minimization.
Alpha-Beta Agent (Minimax Agent who only explores promising branches)
Alpha-Beta algorithm is an improved version of minimax algorithm. Instead of exploring all branches at each step, Alpha-Beta algorithm avoids expanding branches that we can identify as worse choices. I wrote a global function to perform the minimax pruning.
Expectimax Agent (Minimax Agent when the opponent isn’t absolutely predictable)
As a variation of minimax search tree, the Expectimax search algorithm depends on a combination of the player’s skill and chance elements. Here we suppose the ghosts do not necessarily take the best move for them, which means they do not catch the agent on purpose, and they do not cooperate with each other. Then the mini part of the minimax algorithm is changed. The algorithm is implemented as follows:
Better Evaluation Function
It takes a lot of tuning and testing to improve the evaluation function. After 1 hour of trying and testing, my implementation’s performance got full credits of that question. It was fast (takes on average less than 30 seconds on an inst machine) and effective (wins 10 out of 10 games and gets an average score of 1500).
I used mazeDistance at first, but it slows down the algorithm significantly. Using manhattanDist is much faster, and the scores are close. The limitation of this method is that it is only tested to be fine on “smallClassice” map, which means it is not guaranteed to work well on all Pacman maps. Tuning these parameters and coming up with these calculation functions are kind of arbitrary to me because the only way to tell if the formula works is to test it on the map. It is empirically proved to be fine, but it may not represent the real optimal evaluation strategy of the game. It is slightly biased by the map. The lesson I learned from the tuning process is to involve nonlinear functions as I did for ghostEvaluation when almost all linear combinations of variables do not seem to work.
This is the end of Pacman AI, Part II. The code can be found on my GitHub.
Readers of the post should not copy any of my code for their own course assignment, but feel free to be inspired and come up with your own ones.
If you have better ideas about the evaluation function, please leave a comment on the post. (By better ideas, I mean changing the formula significantly instead of tuning parameters slightly. )
A current master student in WUSTL, department of Electrical and System Engineering.