2012-01-09

Some advanced tips for programming competitions

Since the second Codesprint programming competition just closed, I thought I would leave some tips that helped me answer some of the more challenging questions (and the easier questions) there. This guide assumes that you already know the basics (ie. you can write a program that implements the substitution cipher*) and some simple optimization tips (move stuff outside loops, don't redo computations if you can help it, etc), but want to bring your skills up to the next level; I hope it will provide some insight into the kinds of strategies that lead to efficient solutions for programming challenges. I will include a hodgepodge of C++, Javascript and Python, since I'll focus not only on Codesprint but also on some other programming competitions and personal projects I've done in the past that I consider relevant. I'm including the code I used as is (with perhaps one or two comments added), so if you have trouble understanding any of it you can always visit the relevant documentation.

Try some test cases and look for patterns

To illustrate this principle, consider two problems from the Codesprint competition: Picking Cards and Coin Tosses**. In Picking Cards, there is a deck of cards, each with a number on it, and you have to figure out the number of ways to arrange the deck such that each card has at least as many cards before it as the number on the card. So, for example, a deck {A:0,B:0,C:1,D:3} could be arranged 4 ways: ABCD, BACD, ACBD or BCAD, but not CADB, since C has a value of 1 but 0 cards before it and D has a value of 3 but only 2 cards before it. I tried some decks manually, and settled on a hypothesis: progressing along a sorted list of the cards from left to right, each card multiplies the number of ways by p+1-v, where p is the card's position in the sorted list and v is its value. So, for example, in our list of {A:0,B:0,C:1,D:3} (or {A:3,B:0,C:1,D:0}; the list is sorted first so it doesn't matter), we start off with 1 since there's only one kind of empty deck; then the first card, with a position of 0 and a value of 0, gives us a p+1-v of 1, keeping the total at 1, the second card (p=1,v=0) brings us to 2, the third card (p=2,v=1) doubles us again to 4, and the fourth card (p=3,v=3) keeps us at 4. The hypothesis checked out for a few other examples, so I implemented it in code. Hit submit, 20/20 score, done.
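For the curious, here is the rule on its own (a minimal sketch, not my actual submission, which also had to handle the input format):
def picking_cards_ways(values):
    # Sort the values, then multiply together p+1-v for each card, where p
    # is the card's position in the sorted list and v is its value
    ways = 1
    for p, v in enumerate(sorted(values)):
        ways *= max(p + 1 - v, 0) # 0 ways if some card can never be placed
    return ways

print picking_cards_ways([0, 0, 1, 3]) # prints 4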

In Coin Tosses, there is a coin that has already been flipped and come out heads m times in a row, and your job is to find the expected number of tosses until the coin comes out heads n times in a row, counting those m tosses. For example, for m = 2, n = 4, you might have a value of 4 (1111), 7 (1101111) or 14 (11011000101111), and your job is to find what the value will be on average. Since you can't test expected value in your head (well, you could flip a coin some number of thousands of times until you're 95% confident that you're right, but here that would have been even harder than the mathematical approach), I immediately sat down with pen and paper and devised an unwieldy expression for the expected value, by writing the expected value in terms of itself (essentially, EV = 0.5(EV + 1) + 0.25(EV + 2) + 0.125(EV + 3) + ... + (1/2^n)(EV + n) + (1/2^n)(n), the last term being the success case where all n tosses come up heads), and then following the grade 9 algebra trick of moving all the EV terms to the left. Then I brought m into the picture, and I implemented the result in code:
cases = int(raw_input())
for case in range(cases):
    line = raw_input()
    n = int(line[:line.find(' ')])
    m = int(line[line.find(' ') + 1:])
    # Expected number of tosses to get n heads in a row starting from scratch
    evfromscratch = 0
    for i in range(1,n+1): evfromscratch += i * 2.0 ** (n-i)
    evfromscratch += n
    # Expected number of tosses given that m heads have already come up
    ev = 0
    for i in range(1,n-m+1): ev += (i + evfromscratch) / 2.0 ** i
    ev += (n-m) / 2.0 ** (n-m)
    print ("%.2f" % round(ev,2))
The variable evfromscratch gives you the expected value given 0 prior heads, and ev gives you the value for m prior heads. The algorithm was perfect, and fairly quick, but there was a practical difficulty: I was getting errors because numbers in python start to round off when they get really big. Looking back now, the reason why it failed is that the ev variable brings in decimals, which causes python to switch from integers, which are stored as exact values, to floats, which round off after about 16 significant digits (53 significant bits, to be exact). I didn't know this at the time, so I simply went looking for a better solution: I ran the code a few dozen times with different test cases, saw a pattern, and promptly implemented it in code:
cases = int(raw_input())
for case in range(cases):
    line = raw_input()
    n = int(line[:line.find(' ')])
    m = int(line[line.find(' ') + 1:])
    # The pattern found empirically: EV = 2^(n+1) - 2^(m+1), always a whole
    # number, computed here with exact integer arithmetic so nothing rounds off
    print str(2 * (2 ** n - 2 ** m)) + ".00"
Full marks. Done. I don't know why it works, but it does, no rigorous proof required. I don't know why the rule for Picking Cards works either - I could go and spend hours trying to prove both, but I won't. And that is the key: unlike in the world of mathematics, where something without a formal proof is merely a "conjecture", in computer programming you don't need to know exactly why something works - if it works, you can use it. You do need a theoretical background to be able to make educated guesses in the first place, but often a vague intuitive idea of why something might work will suffice - the insight I had that got me to the solution for the picking cards problem so quickly was "deck of cards problems tend to involve factorials". A few empirical tests from there, and I had my algorithm.
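As an aside, if you want to see for yourself the float rounding that bit me (plain python, nothing contest-specific):
print 2**53 + 1 == 2**53     # False: python integers are exact
print 2.0**53 + 1 == 2.0**53 # True: the +1 is rounded away in a 53-bit float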

Learn to love binary search

def binsearch(val,element,li):
  # Returns the index at which val would be inserted to keep li sorted,
  # where li is a list of arrays/dicts sorted ascending by the key element
  if len(li) == 0: return 0
  left = 0
  right = len(li)-1
  # Narrow [left, right] down to two adjacent candidates
  while right-left > 1:
    ave = int((left+right)/2)
    if val <= li[ave][element]: right = ave
    else: left = ave
  if val <= li[left][element]: return left
  elif val <= li[right][element]: return right
  else: return right + 1
Binary search is an algorithm that allows you to find a value (or the nearest value to a given input) in a sorted list. The naive way to search is to go through all the elements in the list one by one, but that takes O(n) time (if you don't already know this notation, O() roughly means "at high values, grows proportionally to"), which is unacceptably slow if you need to search inside one of your innermost loops. Binary search, on the other hand, takes a mere O(log n) time, which is a massive speedup for large lists: a search through a list of 1000000 items, instead of taking an average of 500000 steps, will take a mere 20 steps - at least a 1000x speedup even if each step becomes much more complex. This particular implementation takes an argument element, which allows it to search a sorted list of objects (arrays or dicts) by any one key, making the algorithm much more versatile. Taking a key function as an argument would provide even more versatility, but at the cost of speed.
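For example, here is how you might use it to keep a list of (name, score) pairs sorted by score (a made-up example, not from any of the contests):
people = [("eve", 12), ("bob", 23), ("amy", 34), ("dan", 40), ("cat", 56)]
pos = binsearch(30, 1, people)   # index where a score of 30 belongs
people.insert(pos, ("fred", 30)) # the list stays sorted by element 1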

Binary search is often overlooked because it requires a sorted list as a precondition, which seems like it would make it useless if you have a big unsorted list that you need to work with, but you can get around that by first sorting the list by every key you need it sorted by. And how do you sort a list efficiently? Binary search! Just run through your unsorted list and add elements one by one to a new, initially empty, list at the positions that binary search gives you. That's a sorting algorithm with O(n*log(n)) comparisons right there (the insertions themselves can degrade it to O(n^2) in the worst case, since inserting into an array means shifting everything after the insertion point, but those shifts are cheap operations in practice), and one that you can implement in C without worrying about data structures like lists of lists or heaps that other efficient sorts require. Here's an example from a problem that had you keep track of the ranks of specific participants in a yodelling competition as their scores change (hence the use of a "byrank" array to store positions rather than actually sorting the list).
    for (int j = 1; j < yodellers; j++) { //eg. scores 34 56 12 23 40 -> byrank 1 4 0 3 2
      //byrank[rank] holds the index of the yodeller in that place, best score first
      //Binary search for the new yodeller's place among the first j entries
      int left = 0;
      int right = j - 1;
      while (right - left > 1) {
        int ave = (left+right)/2;
        if (scores[j] > scores[byrank[ave]]) right = ave;
        else left = ave;
      }
      int place = 0;
      if (scores[j] > scores[byrank[left]]) place = left;
      else if (scores[j] > scores[byrank[right]]) place = right;
      else place = right+1;
      //Insertion: shift everyone at or below the new place down one rank
      for (int k = j-1; k >= place;k--) byrank[k+1] = byrank[k];
      byrank[place] = j;
    }
int scores[yodellers] and int byrank[yodellers] were both initialized earlier; yodellers is the number of participants. The problem was from a 2004 high school computing competition that I found and did for practice. Binary search is the one algorithm that now appears in almost every single one of my more advanced programming solutions - it simply is that powerful. If you have a problem that looks like it will take O(n^2) time to compute, chances are there's a solution involving pre-sorting and binary search that will take O(n*log(n)) time. One example of such a binary search-based solution is my approach to the Subsequence Weighting problem in the Codesprint competition; solution here. I won't go into that example in detail; those curious can go and try to understand the problem and code themselves, but it's an O(n*log(n)) algorithm. A second, and much more important, application is search trees.

Search Trees

Search trees are an incredibly powerful tool for optimization problems, where you need to find the best solution, such as the minimum cost of performing a certain task (eg. pathfinding, the traveling salesman problem, planning). Here's a naive implementation of the concept that I applied in a medieval/fantasy tactics game that I made a while back***:
def generate_distance_map(startx, starty):
  # Flood outward from (startx, starty) breadth-first; initialize_array and
  # terrain are defined elsewhere in the game code
  distmap = initialize_array(40,30,0) # 0 means unvisited
  pts = []
  pts.append((startx, starty,1)) # each entry is (x, y, distance+1)
  index = 0
  while index < len(pts): # pts doubles as a queue; index is its head
    p = pts[index]
    if (distmap[p[0]][p[1]] == 0 or distmap[p[0]][p[1]] > p[2]):
        distmap[p[0]][p[1]] = p[2]
        # Queue up the four neighbours that are in bounds and passable
        if p[0] < 39 and terrain[p[0]+1][p[1]] > 0: pts.append((p[0]+1,p[1],p[2]+1))
        if p[0] >= 1 and terrain[p[0]-1][p[1]] > 0: pts.append((p[0]-1,p[1],p[2]+1))
        if p[1] < 29 and terrain[p[0]][p[1]+1] > 0: pts.append((p[0],p[1]+1,p[2]+1))
        if p[1] >= 1 and terrain[p[0]][p[1]-1] > 0: pts.append((p[0],p[1]-1,p[2]+1))
    index += 1
  return distmap
The idea is to find the shortest path from A to B on a map (terrain) with obstacles in it. The solution I used is to first find the distance from every coordinate to B (returned in distmap), and then, starting from A, move toward the lowest nearby distance at each step (the reason things were done that way is that there may be 50 units seeking one unit, and this lets the heavy lifting be done once rather than 50 times). For another example of this kind of searching, here's a solution in C for finding how long it takes a knight to get from point A to point B on a chessboard (contest here). Of course, these are not "trees", and do not have to be since the search space is only 1200 (or 64) states, but more complex applications do need trees.
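Moving along the map afterwards can be as simple as this (a sketch of the idea with a made-up next_step helper; the game's actual movement code is more involved):
def next_step(distmap, x, y):
  # Step to the in-bounds neighbour with the smallest nonzero distance;
  # repeating this from A walks the unit along a shortest path to B
  best = None
  for nx, ny in ((x+1,y), (x-1,y), (x,y+1), (x,y-1)):
    if 0 <= nx < 40 and 0 <= ny < 30 and distmap[nx][ny] > 0:
      if best is None or distmap[nx][ny] < distmap[best[0]][best[1]]:
        best = (nx, ny)
  return best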

Meet the fifteen puzzle. The idea is to move the fifteen tiles from a randomly shuffled configuration into the solved configuration of increasing numbers, left to right then top to bottom, using just the one empty space; paste my solution code into the textbox here and run it in 3x3 or 4x4 mode (be prepared to wait up to a minute for the 4x4 mode; the code solves the easier 3x3 mode nearly instantly) to get a better feel for the puzzle. The challenge is that there are not just 1200 states to look through, but up to (in the 5x5 case) 25! states - so many that, as of the time of this writing, there is not enough hard disk space in the world to store them all, and won't be for at least another 2 decades. So what do you do? Store just some of the states in a list, continue to make moves from the best states (defining "best" for the specific problem is part of the challenge) and put the resulting states into the list as well, until you eventually arrive at the goal. Here's where binary search comes in - you want the list of candidate states to be sorted in order of how good they are, so you can easily pick out the best one to work with every round, and maintaining a sorted list dynamically is exactly what binary search is meant to do!

The way these types of route finding algorithms (whether finding solution steps for the fifteen puzzle or a literal route if you're asking your computer for driving directions) work in detail is like this (if you want, try to follow along in my code):
  1. A state is stored as an object containing the position (in the pathfinding problem, the x,y coordinate, in the fifteen puzzle, the entire map - my code represents it as a string, eg. "0123456789abcdef" for the solved configuration, "048c159d26ae37bf" for the solved configuration flipped along the diagonal, etc) and the moves needed to get to that position, and optionally the score of the position/moves combination (see the discussion on heuristic functions later).
  2. There are two lists maintained: a frontier, containing positions you're interested in evaluating, and a list of positions that have already been explored. The frontier starts off with just one element containing the initial position and an empty set of moves, and the explored list starts off empty.
  3. For each step, take the state in the frontier with the highest score, and take all possible moves from that state that lead to a legal state not yet in the explored list and put them into the frontier. For example, in the 2D pathfinding problem from (0,4) you might add (1,4) and (0,5), but not (0,3) because you had arrived at (0,4) from (0,3) so (0,3) has already been explored, and not (-1,4) because that's out of bounds. This is where binary search comes in handy - you can use it to dynamically maintain a sorted list, so the state with the highest score will always be simply the last element in the frontier.
  4. Take that state out of the frontier and put it into the explored list.
  5. If you find a state that is the desired destination, take it and end the algorithm - you've got yourself a path. Whether or not you will get the best path depends on your heuristic (or scoring) function.
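Putting those five steps together, the skeleton looks something like this (a sketch with made-up names, not my actual solver: neighbours(pos) is assumed to yield (move, newpos) pairs, score(pos, moves) is one of the scoring functions described next with higher meaning better, and python's bisect module does the binary search insertion that I rolled by hand in C above):
import bisect

def best_first_search(start, goal, neighbours, score):
  frontier = [(score(start, []), start, [])] # kept sorted, best state last
  explored = set([start])
  while frontier:
    s, pos, moves = frontier.pop()           # steps 3-4: best state out
    if pos == goal: return moves             # step 5: found the destination
    for move, newpos in neighbours(pos):
      if newpos not in explored:             # skip already-explored states
        explored.add(newpos)
        entry = (score(newpos, moves + [move]), newpos, moves + [move])
        bisect.insort(frontier, entry)       # binary search insertion
  return None
Everything then hinges on the choice of score.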
There are 3 types of scoring function:
  • Breadth first - the score is the opposite of the length of the path to that state (or the score is the length of the path, and you're looking for the lowest score rather than the highest), with no regard to the position itself. This essentially searches through states expanding out from the start in a sphere. It's guaranteed to find the best path since moves are evaluated in order of how long the path to them is.
  • Closest first - the score is how close the position is to the goal according to some distance metric, with no regard for the length of the path. This is the fastest algorithm, and its speed is necessary in some cases (like the fifteen puzzle), but it will NOT necessarily give you a very efficient solution. Here's my heuristic function for the fifteen puzzle (essentially, distance is defined as the sum of the horizontal and vertical distances of all the tiles from their "proper" locations - the "Manhattan distance"):
var heuristic = function(state) {
  var h = 0;
  var size = Math.sqrt(state.str.length); // 4 in the fifteen puzzle
  for (var x = 0; x < size; x++) {
    for (var y = 0; y < size; y++) {
      // sym (defined elsewhere) is the string "0123456789abcdef...",
      // mapping each tile character back to its number
      var val = sym.indexOf(state.str.charAt(x + y * size));
      var fx = val % size;              // where the tile belongs
      var fy = Math.floor(val / size);
      if (val > 0) h += Math.abs(fx-x) + Math.abs(fy-y); // skip the empty square
    }
  }
  return h;
}
  • Combined score (A* search) - the score combines the length of the path and the distance to the goal. If the distance estimate is always less than or equal to the true length of the remaining path to the goal (an "admissible" heuristic), then the solution you get will also be optimal, since efficient paths won't get left behind due to an erroneously high distance estimate while less efficient paths get explored. This is generally better than breadth first, the exception being situations where you have no idea how to even estimate the distance to the goal.
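Under the same "higher is better" convention as the skeleton above, an A* score might be built like this (a sketch; estimate is assumed to be whatever admissible distance metric you have):
def make_astar_score(estimate):
  # estimate(pos) must never exceed the true remaining path length,
  # otherwise the first solution found may not be the best one
  def score(pos, moves):
    return -(len(moves) + estimate(pos))
  return score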
In the particular case of the fifteen puzzle, the search tree actually is a "tree" - the solutions are sequences of moves, and places where multiple moves are possible are where the tree branches out. Search trees have their applications everywhere - solving pathfinding and optimization problems, solving games, planning, etc - they are like binary search's older brother in terms of their versatility.

Look for crazy optimizations

The magic of binary search is one way to turn many O(n^2) solutions into O(n*log(n)) solutions, but it is not a panacea. Here, I'll dissect the "direct connections" problem from the 2012 January Codesprint competition. The premise is this (direct link here): you have a list of cities c0,c1,c2,...,cn with populations p0,p1,p2,...,pn, all arranged on a line with coordinates x0,x1,x2,...,xn. You are trying to build a network connecting every city to every other city, and the cost of connecting cities ci and cj is equal to max(pi,pj) multiplied by the distance from ci to cj (ie. abs(xi - xj)). Here's the naive O(n^2) solution I had at first (the answer had to be provided mod 1,000,000,007 as part of the problem specification):
cases = int(raw_input())
for case in range(cases):
  citiesnum = int(raw_input())
  coords = [int(x) for x in raw_input().split(' ')[:citiesnum]]
  pops = [int(x) for x in raw_input().split(' ')[:citiesnum]]
  cities = []
  for x in range(citiesnum): cities.append([pops[x],coords[x]])
  cities.sort(key=lambda x:x[0],reverse=True) # most populated city first
  total = 0
  for i in range(citiesnum):
    # Every link from city i to a less populated city costs pop_i * distance
    nt = 0
    for j in range(i+1,citiesnum):
      nt += abs(cities[i][1] - cities[j][1])
    total += nt * cities[i][0]
    if total > 1000000007: total = total % 1000000007
  print total
Note that python has a built-in efficient sorting function, so you don't need to write your own (although you still do need binary search to maintain a sorted list dynamically). However, that did not save me from the horrors of O(n^2), and I got 15/35 points as a result of the time limit. The algorithm is simple: take the most populated city, and since you know that the cost of connecting it to each of the others will be based on its own population, just go through all the other cities and add up the distances to them. Then take it out, go through the next most populated city, and so on right down to the two least populated. Reflecting on this later, sorting the cities actually didn't speed up my code much at all; I could have just gone through all combinations of i and j and added up the costs directly, which would have been less efficient but still O(n^2).

To take my algorithm to the next step, I had to go beyond this naive approach and make a considerably more advanced algorithm; here is my code. Everything is somewhat explained in the comments, but I'll go over it here too. The idea is to keep a list of cities in order of population (we'll call it P), so you have an order to go through the cities in without worrying about population, a list of cities in order of position (we'll call it X), and a list of running totals of the positions (which we'll call S). For example, if the positions are [1,2,3,5,7,9,12,15], then S will be [1,3,6,11,18,27,39,54]. To find the sum of the distances from all cities to, say, X5 (the one at 9), the formula is S7 - S5 - S4 + 9*(5*2 - 7), where 7 is the number of cities minus one - try it for yourself, it works! Essentially, it adds up all the positions above 9 (that's S7 - S5), subtracts all the positions below 9 (that's the -S4), then subtracts 9 for every city above 9 and adds 9 for every city below 9 (that's the 9*(5*2 - 7) term), getting you the sum of the distances (xi - 9 for cities to the right of 9 and 9 - xi for cities to the left). Now, we can just find the total distance for each city by going through this list, multiply the result by the city's population, and do that for every city - problem solved!
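Here is the trick on its own (a standalone sketch of the formula, not my actual submission):
xs = [1, 2, 3, 5, 7, 9, 12, 15] # city positions, sorted
S = []
t = 0
for x in xs:
  t += x
  S.append(t) # S = [1, 3, 6, 11, 18, 27, 39, 54]

def distsum(k):
  # Sum of the distances from xs[k] to every other position, in O(1)
  n = len(xs)
  return S[n-1] - S[k] - (S[k-1] if k > 0 else 0) + xs[k] * (2*k - (n-1))

print distsum(5) # 36 = 8+7+6+4+2+3+6, the distances from 9 to the rest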

Well, not quite. There's the tiny little snag that you need to take cities out after you evaluate them, so you don't evaluate every pair a second time with the wrong population, and every time you take out a city you have to go through the entire list of sums to the right of it and reduce them. And back to O(n^2) land we go. Fortunately, though, I discovered a faster solution that works by splitting X and S up into buckets, maintaining a running "microsum" of the numbers within each bucket and a running "macrosum" of the buckets themselves. Thus, for our example, we would have (in my solution I maintained both X the list and X the buckets, but from now on X is just the buckets for the sake of simplicity):

X = [1 2 3] [5 7 9] [12 15]
Smic = [1 3 6] [5 12 21] [12 27]
Smac = 6 27 54

Now, look at how little you need to change to take out the 9:

Smic = [1 3 6] [5 12] [12 27]
Smac = 6 18 45

You only need to update one microbucket (O(n^0.5)) and the macrobuckets (O(n^0.5)). Hooray, an O(n^1.5) solution, and I got 35/35 points. It is actually possible to go even further, making three layers of buckets to get to O(n^1.33), four layers to get to O(n^1.25), and a dynamic number of layers to reach the holy grail of O(n*log(n)), but I didn't realize that at the time, and this was good enough anyway. As a general rule of thumb, if there's an obvious solution at O(n^2), you can get down to O(n*log(n)) if you try hard enough.
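And here is a toy version of the bucket structure itself (again a sketch with made-up names; see my linked solution for the real thing):
import math

xs = [1, 2, 3, 5, 7, 9, 12, 15]
size = int(math.ceil(math.sqrt(len(xs)))) # ~sqrt(n) positions per bucket
X = [xs[i:i+size] for i in range(0, len(xs), size)]
Smic = [[sum(b[:i+1]) for i in range(len(b))] for b in X] # per-bucket running sums
Smac = [sum(b) for b in X]
for i in range(1, len(Smac)): Smac[i] += Smac[i-1]        # running sums of buckets

def remove(val):
  # Taking out a value touches only one microbucket plus the macro sums,
  # so it's O(sqrt(n)) instead of O(n)
  for bi in range(len(X)):
    if val in X[bi]:
      X[bi].remove(val)
      Smic[bi] = [sum(X[bi][:i+1]) for i in range(len(X[bi]))]
      for i in range(bi, len(Smac)): Smac[i] -= val
      return

remove(9)
print X    # [[1, 2, 3], [5, 7], [12, 15]]
print Smic # [[1, 3, 6], [5, 12], [12, 27]]
print Smac # [6, 18, 45]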
So there you go, I hope you'll start using these ideas and others to help you solve more and more challenging programming problems, and possibly even apply these skills when programming in real life (notice how I included the game I made as an example; there are always interesting applications for these around the corner). Some other resources that might help you are:
  • Stanford's online ML and AI classes - now done, but the videos will be around forever (and, as a high-level interactive companion to the AI videos, I'll shamelessly plug my AI Challenges website, home of my above-linked fifteen puzzle challenge app); the AI class in particular helped out a lot in expanding my repertoire of algorithms and strategies.
  • Past contests, from here and here and here. Remember, don't just skim through the problems, sit down, pick a hard one and try your hardest to solve it!
  • Wikipedia and Google - in my early high school days, I spent days figuring out how to implement a flood filling algorithm (yes, as in flood fill in basic drawing software), and when I finally found a slow, but functional, solution I was prouder of my coding abilities than I had ever been in my life. The next day, however, I researched flood filling on Wikipedia to see what other people had to say on the issue and found a simple algorithm 20 times faster than what I had made. It was also from there that I made the leap to solving the pathfinding problem myself. Good programmers invent, great programmers steal.
Good luck and happy coding!

Vitalik Buterin

* print (lambda a,b: ''.join([(b+" "*6)[(ord(x)%32-1)] for x in a]))(raw_input(),raw_input()) - this one-liner reads a message and a 26-letter substitution alphabet and prints the enciphered message; the trailing spaces are there so that the space character maps to a space.
** Link to full text for all 3 problems that I talk about for those without Google/FB accounts: https://docs.google.com/document/d/1vUYNXE_TGaffG6JKN6E-EbJONHwRDaxExmVJwwKFJVc/edit?pli=1
*** This is not the code that runs in the linked game; it is from an earlier downloadable version that I made in python, while the linked game is a webapp with the frontend written in javascript. The algorithm in the linked game is essentially the same but more complex, treating a diagonal step as 1.5 rather than 2 to more closely mimic the Pythagorean distance, and making some fairly complicated but necessary adjustments to preserve efficiency; if you want, you can go to the game, hit Ctrl+U in game mode, and hunt around the source code for the newer algorithm. And you can play the game and send feedback if you really want.

1 Comment:

At 10 January, 2012 08:17, Blogger Dima said...

Excellent write-up. I wish I had more time available to spend on these things myself. Happy for you!
