Google Code Jam: Save the Universe
The qualification round started with a problem called Save the Universe which asks that you assign search queries to a finite number of search engines in such a way that won't allocate a query for a search engine to that search engine. When I read the problem description I had a chuckle and reflected on a hilarious episode of The IT Crowd where Jen warned her superiors not to type "Google" into the Google search engine because it would break the internet. Clearly the good folks at Google took inspiration from this humorous concept.
But back to the problem at hand...The first realization to grasp is that assigning queries to minimize the number of switches between search engines, is logically equivalent to finding the longest runs where a "run" is the use of a given search engine for a series of sequential search queries. We know we can't send an engine a query for itself so making long "runs" is simple - if you have n search engines you defer making a switch until you have received queries for n distinct search engines. At that point you have to switch to a different engine - the one not represented in the (n-1) distinct search engines you've already used.
In essence this is a greedy algorithm and the obvious question that needs to be answered for this (and indeed most greedy algorithms) is - is it globally optimal or does it only result in a local optima? The answer is Yes and you can see this by scribbling down some examples, and that is the real secret to solving this problem in a timely fashion - quickly convincing yourself that such an approach is going to guarantee optimal solutions in all cases. The optimality of the proposed algorithm can be proved using mathematical induction but I'll leave that as an exercise for the reader
In order to determine the complexity of the proposed greedy algorithm consider that:
- it would be a single-pass solution,
- according to MSDN, the List
.Contains() function internally uses EqualityComparer that uses the overrides of Object.Equals and Object.GetHashCode provided by T. i.e. it uses hash codes hence the search component is constant time - in general, the run time of greedy algorithms is O(n * O(choice(n))) where choice(n) is the complexity associated with the decision between n alternatives at each step. So in this case choice(n) is constant time, O(k) = O(1) thus the overall complexity is O(n)
The C# code is shown below:
using System; using System.Collections.Generic; using System.IO; namespace GoogleCodeJam.SaveTheUniverse { class Program { static void Main() { StreamReader infile = new System.IO.StreamReader("large.txt"); StreamWriter outfile = new System.IO.StreamWriter("output.txt"); int testCases = Int32.Parse(infile.ReadLine()); for (int caseNo = 1; caseNo <= testCases; caseNo++) { int switchCount = 0; int engineCount = Int32.Parse(infile.ReadLine()); List<string> engineNames = new List<string>(); for (int i = 0; i < engineCount; i++) { engineNames.Add(infile.ReadLine()); } int queryCount = Int32.Parse(infile.ReadLine()); List<string> query = new List<string>(); for (int i = 0; i < queryCount; i++) { query.Add(infile.ReadLine()); } // this is a distinct list of engines already used in current run List<string> processed = new List<string>(); // process the queries in the arrival order while (query.Count > 0) { // pick the first query in the queue string q = query[0]; if (processed.Contains(q.ToUpper())) { query.RemoveAt(0); continue; } else { // add this to the processed list processed.Add(q.ToUpper()); if (processed.Count == engineCount) { switchCount++; processed.Clear(); } else { query.RemoveAt(0); } } } outfile.WriteLine(String.Format("Case #{0}: {1}", caseNo, switchCount)); } infile.Close(); outfile.Flush(); outfile.Close(); } } }
18 Jul 2008 Damien Wintour








[...] from the Save The Universe puzzle the qualification round also had a train timetable problem. Here you are given a train [...]