singlethreaded

The Wikipedia Game Solver

The Wikipedia Game is simple to understand. Choose any two Wikipedia articles. Find a path between them. Compete on time, compete on finding the shortest path, or just enjoy wandering through Wikipedia articles until you get to where you need to go.

There’s variants and similar concepts. One variant is getting from any article to the article for Jesus in the shortest number of clicks. A somewhat related Wikipedia phenomenon is that clicking on the first link in the main text of an article and repeating that process for subsequent articles usually leads to the article for “Philosophy”. As of February 2016, this was true for 97% of Wikipedia articles. Here’s a cool video from mathematician, Hannah Fry, explaining it.

I just added a Wikipedia game solver on my blog here. It takes any two Wikipedia articles (the source article and the destination article) and attempts to find a short path between the two. The solver conducts a modified breadth first tree search starting from the source article. On each page, it enqueues up to the first 10 unvisited wikipedia links from the main text of the article.

A simple shortcut that this solver does to “learn” about the destination article is to identify the first two links from the destination article’s main text. If, while tree-searching from the source article, the solver encounters either of these “definition” articles anywhere in the text of an article, it adds that article to the queue and searches down that branch in hopes of finding the destination article. The idea is that the first two articles on the destination article will be a definition of some sort. If that definition is encountered on any page in our tree search, we should explore that branch.

This API is served using Django on AWS. I’m using Django-Q and Redis for multiprocessing to handle multiple concurrent search requests.

What’s coming next? I’m planning on adding a search strategy called iterative deepening depth-first search, which has the pro of better space efficiency while still visiting nodes in the same effective order as breadth-first search, so the resulting path will still be “short”. Also, I’ll be making some UI updates so the tool is a little bit more usable.