Archive for October, 2008


Optimal File Arrangement Puzzle

Puzzle: Suppose you have n files that must be stored on a storage device that only supports sequential access, such as a tape drive. Because the device only supports sequential access, the time to read the i-th file on the device includes time needed to skip over all files stored before it. Assuming each file is equally likely to be needed how would you arrange the files on the device?

Solution: First let's define some notation. Let size[1.. n] be an array listing the file size of each file to be stored, so size[i] is the size of the i-th file. If the files are stored in order from 1 to n, then the cost of accessing the k-th file is the sum of accessing all previous files plus the k-th file:

Because each file is equally likely to be accessed, the expected cost for retrieving a file at random is:

But this is for a particular ordering (1..n). If we were to change the position of files around the cost of accessing some files would become more expensive and some others might become less expensive to access. To examine this let us denote a file storage permutation, f, whereby f(i) is the index of the file stored at position i in the ordering. Since the problem as stated asks us to identify the optimal f let's restate the expected cost of a permutation f.

We defined our problem formally, but by now it should be intuitive what the solution is: store the files in order from the shortest to longest. But how do we prove it? The answer, my friend, is proof by contradiction. Let's assume that our proposed solution is wrong. In that case we would have some file pair not in ascending order. That is, size[f(i)] > size[f(i+1)]. Now consider if we were to swap the position of these files. The cost of accessing f(i) would increase by size[f(i+1)], and the cost of accessing f(i+1) would decrease by size[f(i)]. Thus the net change to the expected cost for the permutation f would be (size[f(i+1)] - size[f(i)])/n. But since size[f(i)] > size[f(i+1)] the net change would be negative indicating an improvement. Hence, if any file pair is not in non-decreasing order we can improve the overall expected access time by swapping them. QED.


Birthday Paradox # 1

Problem: There are 37 students in a classroom. If it was paying even money, would you bet that at least 4 of them would have their birthday in the same month?

Source: Adapted from a puzzle on Thoughts of a Rambler.


Firstly, assume the worst case (from your betting perspective) where all the students have birthdays in different months. This is your worst case scenario because it reduces the chances of one month having a cluster of 4 or more students. But there are only 12 months in the year so the most evenly distributed assignment of students to birthday months is 3 in each month plus 4 in one month, giving 3x11 + 4 = 37. In other words, there is no possible distribution of 37 students to 12 months which does not produce at least 1 month which has 4 or more students assigned to it. Given that you cannot lose on this bet - so bet the house!!! Remember - it's not gambling if you know the odds are in your favour!


Circular Permutations

Problem: Six friends are sitting around a circular dinner table that seats six. What's the probability they're sitting in ascending order of age? What is the probability that they are sitting in either ascending or descending age order?

Solution: Six people and six chairs implies a total of 6! (= 720) different seating permutations. There is only 1 ordering of age that is ascending but because the table is circular the youngest person could sit in any of the six chairs, thus there is 6/720 = 1/120 chance that they are sitting in ascending order. In general, if there are n people they chance that they are in order, either ascending OR descending, will be n/n! = 1/(n-1)!

Similarly there is only 1 descending age ordering but again the oldest person could sit in any of the six seats, thus the probability of the group sitting in either ascending or descending order age is 12/720 = 1/60. i.e not very bloody likely.