During November/December months, my church conducts christmas carol rounds. Members assemble at a common location and then proceed to visit nearby houses in the area. At the end of the rounds, they then reassemble for lunch/dinner. Since I am lazy by nature, I have always wondered whether we take the shortest route while visiting all the houses. So for the mathematically inclined, let me go ahead and formally define the Christmas Carol problem.

Given a start point, stop point and N intermediate points, what is the shortest route that starts from the start point, visits all the N intermediate points exactly once and then end at the stop point? This is nothing but the classic Travelling Salesman Problem which unfortunately is NP hard i.e. there exists no algorithm than can meaningfully find the optimal solution for large values of N. The number of possible alternatives is O(N!). Since my church visits at most 12 houses in one stretch, this number turns out to be 12! or 479,001,600. Of course, our Carol captain does not perform this exhaustive search. Instead he often uses a heuristic called the Nearest Neighbor algorithm to find the route to follow. He starts from the start point and then proceeds to the nearest unvisited home. After visiting the house, he then moves to the house that is unvisited and closest to his current location.

But how good is this heuristic? The graph below shows that the heuristic provides a reasonable approximation. Even when there are 11 houses to visit, the route found by the nearest neighbor algorithm is only 50% worse than the best possible route. However this gap widens as we need to visit more and more houses.

Is there a way out? Can we find a better solution with little effort? Apparently you can - here is one way to do it. Start with the nearest neighbor solution and then randomly swap two houses to generate a new solution. So for e.g. let us say we need to visit 6 houses (A, B, C, D, E and F) and the initial solution obtained from the nearest neighbor algorithm is A, C, B, D, E, F. We will then create a new solution by randomly swapping say B and E. The new solution will then be A, C, E, D, B, F. If the new solution is better than the current best solution, we will accept it. We then proceed to mutate the new solution in the same way to get the next solution. Here is what you get when you run the above algorithm for 1,000,000 iterations.

For visiting 12 houses, this algorithm is about 500 times faster than brute force enumeration. It took 15 seconds to run in my i5 quadcore machine. You can see the law of diminishing utility at play here - improvements are plenty in the beginning and then they taper out towards the end. So it takes less than a second to reach close to 4% of the best solution but another 14 seconds to reach the best solution.

To conclude, the nearest neighbor heuristic provides a good enough approximation to the Christmas Carol problem. But you can do even better, if you are willing to spend some processor cycles.

## Recent Comments