

DFS, BFS, UCS, and A* in Pacman
Implementing graph search algorithms in the Berkeley Pacman framework. DFS, BFS, UCS, A*, plus heuristics for corners and food.
Abstract#
This was the first real assignment in CSE 5522 (Advanced AI) at Ohio State. The framework is UC Berkeley’s Pacman project, where you fill in specific Python functions and the autograder tests them. The assignment covers depth-first search, breadth-first search, uniform cost search, and A* search, then moves on to designing heuristics for corners and food search problems. What stood out to me was how identical the four search algorithms are in code. The only thing that changes is the data structure for the frontier.
The Pacman AI projects give you a working game with ghosts, food pellets, mazes, and a display. Your job is to write the search logic that controls Pacman. The framework handles everything else: parsing layouts, running the game, drawing the GUI, scoring. You just fill in functions in search.py and searchAgents.py.

The code in this post is Python 2 (that’s what the framework used). Variable names are mine, left as-is.
The graph search skeleton#
All four algorithms follow the same pattern:
- Push the start state onto the frontier
- Pop a state
- If it’s the goal, return the path
- If it hasn’t been visited, mark it visited, expand its successors, push them onto the frontier
- Repeat
The only difference is what “frontier” means. A stack gives you DFS. A queue gives you BFS. A priority queue with path cost gives you UCS. A priority queue with path cost plus a heuristic gives you A*.
I stored each frontier entry as a tuple (node, cost, path) where path is the list of actions taken so far. This way you don’t need to reconstruct the path at the end.
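To see that shared skeleton concretely, here's a minimal standalone sketch (my own toy graph and frontier classes, not the Berkeley util module) where the frontier object is the only thing that changes between DFS and BFS:

```python
from collections import deque

class Stack:
    """LIFO frontier -> DFS."""
    def __init__(self): self.items = []
    def push(self, item): self.items.append(item)
    def pop(self): return self.items.pop()
    def is_empty(self): return not self.items

class Queue:
    """FIFO frontier -> BFS."""
    def __init__(self): self.items = deque()
    def push(self, item): self.items.append(item)
    def pop(self): return self.items.popleft()
    def is_empty(self): return not self.items

def graph_search(start, is_goal, successors, frontier):
    """Generic skeleton: the frontier type decides the algorithm."""
    frontier.push((start, 0, []))            # (node, cost, path)
    explored = set()
    while not frontier.is_empty():
        node, cost, path = frontier.pop()
        if is_goal(node):
            return path
        if node not in explored:
            explored.add(node)
            for nxt, action, step in successors(node):
                frontier.push((nxt, cost + step, path + [action]))
    return []

# Toy graph: successors are (neighbor, action, step_cost) triples.
graph = {'A': [('B', 'A->B', 1), ('C', 'A->C', 1)],
         'B': [('D', 'B->D', 1)],
         'C': [('D', 'C->D', 1)],
         'D': []}
succ = lambda n: graph[n]
bfs_path = graph_search('A', lambda n: n == 'D', succ, Queue())
dfs_path = graph_search('A', lambda n: n == 'D', succ, Stack())
```

Both happen to find two-step paths on this tiny graph, but DFS follows the branch pushed last while BFS works through the branches in push order.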
DFS#
```python
def depthFirstSearch(problem):
    stack = util.Stack()
    exploredNodes = set()
    stack.push((problem.getStartState(), 0, []))
    listOfActions = []
    while not stack.isEmpty():
        (node, cost, path) = stack.pop()
        if problem.isGoalState(node):
            listOfActions = path
            break
        if node not in exploredNodes:
            exploredNodes.add(node)
            for successorNode, action, stepCost in problem.getSuccessors(node):
                successorState = (successorNode, cost + stepCost, path + [action])
                stack.push(successorState)
    return listOfActions
```

util.Stack is LIFO. The last node pushed is the first one popped, so you go deep before you go wide. The exploredNodes set is the graph-search part. Without it you'd revisit states and potentially loop forever.
The cost variable is tracked but DFS doesn’t use it for ordering. I carried it along anyway so all four algorithms could share the same tuple shape.
BFS#
```python
def breadthFirstSearch(problem):
    queue = util.Queue()
    exploredNodes = set()
    queue.push((problem.getStartState(), 0, []))
    listOfActions = []
    while not queue.isEmpty():
        (node, cost, path) = queue.pop()
        if problem.isGoalState(node):
            listOfActions = path
            break
        if node not in exploredNodes:
            exploredNodes.add(node)
            for successorNode, action, stepCost in problem.getSuccessors(node):
                successorState = (successorNode, cost + stepCost, path + [action])
                queue.push(successorState)
    return listOfActions
```

Swap Stack for Queue and you get BFS. That's it. FIFO means you explore all nodes at depth d before any node at depth d+1, which guarantees the shortest path when all step costs are equal.
I check the goal when popping, not when generating successors. Checking at generation time would be more efficient (you’d return as soon as you generate the goal rather than waiting for it to reach the front of the queue), but checking at pop time is simpler and still correct.
UCS#
```python
def uniformCostSearch(problem):
    Priorityqueue = util.PriorityQueue()
    exploredNodes = set()
    Priorityqueue.push((problem.getStartState(), 0, []), 0)
    listOfActions = []
    while not Priorityqueue.isEmpty():
        (node, cost, path) = Priorityqueue.pop()
        if problem.isGoalState(node):
            listOfActions = path
            break
        if node not in exploredNodes:
            exploredNodes.add(node)
            for successorNode, action, stepCost in problem.getSuccessors(node):
                successorState = (successorNode, cost + stepCost, path + [action])
                Priorityqueue.push(successorState, cost + stepCost)
    return listOfActions
```

Now the frontier is a priority queue ordered by cumulative path cost g(n). The node with the cheapest total cost so far gets popped first. This is Dijkstra's algorithm applied to the search problem.
The priority queue takes two arguments: the item and its priority. The priority is cost+stepCost, the total cost from the start to that successor.
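util.PriorityQueue is built on Python's heapq, and the same idea is easy to show outside the framework. Here's a hedged sketch (toy graph and function names are mine) of why ordering by cumulative cost finds the cheapest route even when a direct edge has fewer steps:

```python
import heapq
from itertools import count

def ucs(start, goal, edges):
    """Uniform cost search: frontier ordered by cumulative path cost g(n)."""
    tiebreak = count()                 # counter breaks priority ties so paths never compare
    frontier = [(0, next(tiebreak), start, [])]
    explored = set()
    while frontier:
        cost, _, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node not in explored:
            explored.add(node)
            for nxt, step in edges.get(node, []):
                heapq.heappush(frontier,
                               (cost + step, next(tiebreak), nxt, path + [nxt]))
    return None

# The direct edge A->D costs 5; the two-hop route A->C->D costs only 3.
edges = {'A': [('B', 1), ('C', 1), ('D', 5)],
         'B': [('D', 4)],
         'C': [('D', 2)]}
result = ucs('A', 'D', edges)
```

The direct edge gets pushed first but sits in the heap with priority 5 while the cheaper route through C overtakes it.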
A*#
```python
def aStarSearch(problem, heuristic=nullHeuristic):
    Priorityqueue = util.PriorityQueue()
    exploredNodes = set()
    Priorityqueue.push((problem.getStartState(), 0, []),
                       0 + heuristic(problem.getStartState(), problem))
    listOfActions = []
    while not Priorityqueue.isEmpty():
        (node, cost, path) = Priorityqueue.pop()
        if problem.isGoalState(node):
            listOfActions = path
            break
        if node not in exploredNodes:
            exploredNodes.add(node)
            for successorNode, action, stepCost in problem.getSuccessors(node):
                successorState = (successorNode, cost + stepCost, path + [action])
                Priorityqueue.push(successorState,
                                   successorState[1] + heuristic(successorNode, problem))
    return listOfActions
```

Same as UCS except the priority is g(n) + h(n) instead of just g(n). The heuristic h estimates the remaining cost to the goal. If the heuristic is admissible (never overestimates), A* is optimal. If it's also consistent (h(n) ≤ c(n, n') + h(n') for every successor n' of n), you never need to re-expand a node.
With nullHeuristic (returns 0 for everything), A* degrades to UCS.
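The payoff of a real heuristic is fewer expansions for the same answer. This standalone sketch (my own open 5x5 grid, not framework code) runs one A* loop twice, with a zero heuristic and with Manhattan distance, and counts expanded nodes:

```python
import heapq
from itertools import count

def grid_search(start, goal, heuristic, size=5):
    """A* on an open size-by-size grid with unit step costs.
    Returns (path cost, number of nodes expanded)."""
    tiebreak = count()
    frontier = [(heuristic(start), next(tiebreak), start, 0)]
    explored = set()
    while frontier:
        _, _, node, g = heapq.heappop(frontier)
        if node == goal:
            return g, len(explored)
        if node not in explored:
            explored.add(node)
            x, y = node
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                    heapq.heappush(frontier, (g + 1 + heuristic(nxt),
                                              next(tiebreak), nxt, g + 1))
    return None

goal = (4, 0)
manhattan = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
cost_ucs, expanded_ucs = grid_search((0, 0), goal, lambda p: 0)      # UCS
cost_astar, expanded_astar = grid_search((0, 0), goal, manhattan)    # A*
```

Both runs return the optimal cost of 4, but the zero-heuristic run floods outward in every direction while the Manhattan-guided run expands only the nodes along the direct row.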
The corners problem#
After the four search algorithms, the assignment moves to designing search problems. The corners problem: Pacman has to visit all four corners of the maze. The trick is that the state can’t just be Pacman’s position, because you need to know which corners have been visited.
I represented the state as (position, remaining_corners) where remaining_corners is a tuple of the corners Pacman hasn’t reached yet.
```python
def getStartState(self):
    return (self.startingPosition, self.corners)

def isGoalState(self, state):
    return len(state[1]) == 0
```

The goal test is: are there zero remaining corners? The successor function removes a corner from the tuple when Pacman steps on it:
```python
def getSuccessors(self, state):
    successors = []
    for action in [Directions.NORTH, Directions.SOUTH, Directions.EAST, Directions.WEST]:
        x, y = state[0]
        dx, dy = Actions.directionToVector(action)
        nextx, nexty = int(x + dx), int(y + dy)
        hitsWall = self.walls[nextx][nexty]
        if not hitsWall:
            corners = tuple(x for x in state[1] if x != (nextx, nexty))
            successors.append((((nextx, nexty), corners), action, 1))
    self._expanded += 1
    return successors
```

The line `tuple(x for x in state[1] if x != (nextx, nexty))` builds a new tuple without the current corner (if we're standing on one). If we're not on a corner, the tuple is unchanged. Using a tuple instead of a list matters because tuples are hashable, so the state can go into the explored set.
Corners heuristic#
For the corners problem, you need an admissible heuristic. I used the maximum Manhattan distance from Pacman’s current position to any unvisited corner:
```python
def cornersHeuristic(state, problem):
    corners = problem.corners
    walls = problem.walls
    heuristic = 0
    for corner in state[1]:
        distance = util.manhattanDistance(state[0], corner)
        if heuristic <= distance:
            heuristic = distance
    return heuristic
```

This is admissible because you have to travel at least as far as the farthest remaining corner. It's not a very tight heuristic though. It ignores the fact that after reaching one corner, you still have to reach the others. A tighter approach would be something like the MST of the remaining corners plus Pacman's position, but the max-distance version was good enough to pass the autograder.
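For the record, here's a hedged sketch of that MST idea (my own code, not what I submitted): Prim's algorithm over Manhattan distances between Pacman and the remaining corners. It stays admissible because any tour visiting all the corners contains a spanning tree of those points, and Manhattan distance never exceeds true maze distance.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mst_heuristic(position, remaining_corners):
    """Prim's MST over {position} plus remaining_corners, Manhattan edge weights."""
    nodes = [position] + list(remaining_corners)
    in_tree = {position}
    total = 0
    while len(in_tree) < len(nodes):
        # Cheapest edge from the tree to any node outside it.
        cost, nxt = min((manhattan(a, b), b)
                        for a in in_tree for b in nodes if b not in in_tree)
        total += cost
        in_tree.add(nxt)
    return total

# Pacman in the middle of a maze with all four corners left to visit.
state_corners = [(1, 1), (1, 6), (6, 1), (6, 6)]
h_mst = mst_heuristic((3, 3), state_corners)
h_max = max(manhattan((3, 3), c) for c in state_corners)
```

On this example state the MST bound is 19 while the max-distance bound is only 6, so the search would get much stronger guidance from the same admissibility guarantee.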
Food search heuristic#
The food search problem is harder: Pacman has to eat all food pellets. The state is (pacmanPosition, foodGrid) where foodGrid is a 2D boolean grid.
```python
def foodHeuristic(state, problem):
    position, foodGrid = state
    heuristic = 0
    foodCoordinates = foodGrid.asList()
    for foodCoordinate in foodCoordinates:
        distance = mazeDistance(position, foodCoordinate, problem.startingGameState)
        if heuristic <= distance:
            heuristic = distance
    return heuristic
```

Same idea as the corners heuristic: take the maximum distance to any remaining food pellet. But instead of Manhattan distance, I used mazeDistance, which runs BFS through the actual maze to get the true shortest-path distance. This gives a much tighter lower bound because it accounts for walls.
The tradeoff is cost. mazeDistance runs a full BFS for every food pellet on every call to the heuristic. With lots of remaining food, that’s expensive. But it expands far fewer nodes than a weaker heuristic would, so overall it was faster on the test mazes.
Closest dot search#
The last part was a greedy suboptimal agent that always goes to the nearest food pellet:
```python
def findPathToClosestDot(self, gameState):
    startPosition = gameState.getPacmanPosition()
    food = gameState.getFood()
    walls = gameState.getWalls()
    problem = AnyFoodSearchProblem(gameState)
    listOfActions = search.astar(problem)
    return listOfActions
```

AnyFoodSearchProblem is a search problem where the goal is to reach any food pellet:
```python
def isGoalState(self, state):
    x, y = state
    foodBool = self.food[x][y]
    return foodBool
```

The agent calls A* (with no heuristic, so it's UCS) on this problem to find the shortest path to the nearest food, eats it, and repeats. It's not optimal for eating all the food (greedy nearest-neighbor rarely is), but it's simple.
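How bad can greedy get? A toy 1-D comparison (my own example, not framework code) pits nearest-neighbor ordering against the best permutation, with travel measured as absolute distance along a corridor:

```python
from itertools import permutations

def tour_length(start, order):
    """Total travel along a corridor, visiting pellets in the given order."""
    total, pos = 0, start
    for pellet in order:
        total += abs(pellet - pos)
        pos = pellet
    return total

def greedy_order(start, pellets):
    """Always head for the nearest remaining pellet."""
    order, pos, remaining = [], start, list(pellets)
    while remaining:
        nxt = min(remaining, key=lambda p: abs(p - pos))
        order.append(nxt)
        remaining.remove(nxt)
        pos = nxt
    return order

pellets = [2, -3, 6]
greedy = tour_length(0, greedy_order(0, pellets))
best = min(tour_length(0, list(p)) for p in permutations(pellets))
```

Greedy darts right to the pellet at 2, gets pulled further right to 6, then pays for the long backtrack to -3: 15 units of travel where clearing the left side first needs only 12.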
What I took away#
The four search algorithms being structurally identical was the main lesson. You look at DFS, BFS, UCS, and A* in a textbook and they feel like separate things with separate proofs. When you implement them, the difference is literally one line: which data structure you use for the frontier. Stack, Queue, PriorityQueue ordered by g(n), PriorityQueue ordered by g(n) + h(n).
The heuristic design was more interesting. Getting an admissible heuristic is easy (Manhattan distance is always admissible on a grid). Getting a tight admissible heuristic that actually speeds up the search is the hard part. The food heuristic using true maze distance was the most effective one I wrote, but it was also the most expensive to compute per call.