Correctness Proof for Prim’s Algorithm
The proof of correctness for Prim’s minimum spanning tree (MST) algorithm holds a special place in my heart, because it was what turned me into a computer scientist. Prim’s algorithm is what we call a “greedy algorithm,” meaning it follows what seems like a good strategy to produce a good result. But “seems like” isn’t enough – does it produce an optimal solution? Always? Sometimes greedy strategies work, and sometimes they don’t, and so the most important part is whether you can prove that the solution you get is always optimal. When I first saw that proof of correctness it was a revelation, like getting a glimpse into a deep and important truth. This hooked me on computer science for good!
I first wrote the proof below when I taught data structures in Fall 2016 at UNC Greensboro. The prerequisite for data structures was discrete mathematics (called “Foundations of Computer Science” in the UNCG program), which used the textbook by James L. Hein, and so this proof used the presentation of the algorithm from that book. The pseudocode below is sightly modified from Hein’s presentation.
The intuition behind Prim’s algorithm is simple. We “grow” a tree S from some start vertex (selected in line 2 below), where each step connects a new vertex to S. We keep track of the vertices in S with the set W (initialized on line 3 and updated on line 8), and since new vertices are added to S by connecting to the existing tree, S is always connected. Therefore, when W=V and the algorithm terminates, S must be a spanning tree. Is it a minimum spanning tree? That’s what we prove in the theorem below.
There are multiple ways of proving that Prim’s algorithm always produces a minimum spanning tree. The proof below is fairly direct, and has the benefit that the same proof structure can be used to prove that Dijkstra’s algorithm computes shortest paths. A significantly different proof is in the Introduction to Algorithms book by Cormen, Leiserson, Rivest, and Stein, and while I like that proof quite a bit since it gives additional insight into minimum spanning trees, it requires building up additional terminology and properties specific to minimum spanning trees, and doesn’t generalize to problems in different domains, like Dijkstra’s algorithm (it does however, make correctness proofs for other minimum spanning tree algorithms, like Kruskal’s algorithm, easier).