Archive for July, 2008

Uncategorized

Google Code Jam: Text Messaging Outrage

Google Code Jam Round 1C began with the "Text Messaging Outrage" problem which basically asks that we identify the most efficient letter-to-key assignment on a phone to minimize the number of key presses needed by users. Google quite helpfully provide a frequency distribution for each letter. Given the frequency distribution it's obvious what the optimal assignment is - for the letters that are most frequently used assign them key press combinations that require the fewest key presses. Technically speaking this is a greedy approach and it should be intuitively obvious why it produces the optimal assignment.

This was one of the easiest problems encountered so far! Here's the code in C#:


using System;
using System.IO;

namespace GoogleCodeJam.TextMessage
{
    class Program
    {
        static void Main()
        {
            string basePath = "";
            StreamReader infile = new System.IO.StreamReader(basePath + "large.txt");
            StreamWriter outfile = new System.IO.StreamWriter(basePath + "output.txt");

            int testCases = Int32.Parse(infile.ReadLine());
            for (int caseNo = 1; caseNo <= testCases; caseNo++) // note this is 1-based
            {
                long minKeyPressCount = 0;

                // read in the input data
                string[] data = (infile.ReadLine()).Split(new char[] { ' ' }, StringSplitOptions.None);
                int P = Int32.Parse(data[0]); // max number of letters to place on a key
                int K = Int32.Parse(data[1]); // number of keys available
                int L = Int32.Parse(data[2]); // number of letters in the alphabet

                // read in the letter frequencies
                int[] frequencies = new int[L];
                int[] indexes = new int[L];

                string[] freqData = (infile.ReadLine()).Split(new char[] { ' ' }, StringSplitOptions.None);
                int i = 0;
                foreach (string s in freqData)
                {
                    indexes[i] = i;
                    frequencies[i] = Int32.Parse(s);
                    i++;
                }

                int nextKfree = 0;      // the one-based index of the key with the next lowest key press combo
                int nextPfree = 1;      // the one-based index of the letter on the "low key" with the next lowest key press combo

                // map the most used letters to the shortest key press routines available
                Array.Sort(frequencies,indexes);    // ~ O(n log n) on average since it uses an unstable QuickSort internally
                Array.Reverse(frequencies);         // ~ O(n)
                Array.Reverse(indexes);             // ~ O(n)
                int ii = 0;
                foreach (int x in indexes)  // from greatest- to least-used index
                {
                    // find the next available key press combo walking through from cheapest to dearest
                    nextKfree++;
                    if (nextKfree > K)
                    {
                        nextKfree = 1;
                        nextPfree++;
                    }
                    checked
                    {
                        minKeyPressCount += (frequencies[ii] * nextPfree);
                    }
                    ii++;
                }

                // write out the results
                outfile.WriteLine(String.Format("Case #{0}: {1}", caseNo, minKeyPressCount));
            }
            infile.Close();
            outfile.Flush();
            outfile.Close();
        }
    }
}

Uncategorized

Google CodeJam: Number Set

Google Code JamFollowing on from the fun of Minimum Scalar Product, Milkshakes, and Numbers in Round 1A is three more interesting but rather challenging problems in Round 1B. I first one I attempted was the "Number Sets" problem. This problem requires that we group numbers that have common prime factors thus it involves 3 challenges: managing sets, finding prime numbers within a given range, and finding common factors.

What makes this problem slightly easier is the fact that the numbers are consecutive integers, with a spread no larger than 1 million. This means that when we construct sets we can use a direct-access (indexed) data structure so lookup of a given value is O(1) since we can use it's index. As well, we are never going to have the same number in multiple sets - each number can only be in 1 set at any given time. A useful data structure in this case is Disjoint Sets (see Chapter 21 in Introduction to Algorithms). These can be implemented using an internal linked list, or a rooted tree structure. The later is slightly more efficient if you employ path compression and union by rank.

In the "large" variety of this problem the B parameter which is the upper bound on the range of integers that are to be examined is 1012. This makes for a huge number of elements to check if you go through each one, find all the factors for that number, then figure out which of those factors are prime, then check every pair of integers in the range for common prime factors. How big is huge? Well given that the range of integers to check is 1012, in the worst case, the number of integer pairs is approx. 1024/2 [= 1012.(1012-1)/2], since nCr = n! / (n-r)! r! . Thus it isn't computationally feasible to try this approach.

A better approach is to attack the problem the other way round. That is, find the prime numbers in the range, then for each prime number find every multiple of it and group them into a common set using the aforementioned disjoint set data structure. Furthermore, we know that for a given interval of consecutive integers [a,b] = {x ∈ ℜ, a ≤ x ≤ b} the "spread" of the interval is (b-a+1). Any prime number greater than or equal to the spread can only have, at most, 1 integer which is a multiple of the prime that falls within the range [a,b]. Why? Because after you have found a multiple of the prime that falls within the range, and then added that prime to this number, it must fall outside the range since the prime is greater than the spread of the interval. The fact that we can have, at most, one factor means no set union will ever take place in this case. Hence we need to only find the prime numbers less than the size of the spread of the interval. For a given upper interval bound, the bigger the interval spread, the fewer the number of prime numbers we need to evaluate.

There are several well published, efficient methods to produce a series of prime numbers, including Sieve of Eratosthenes and the Sieve of Atkin. The running time of Eratosthenes is O((nlogn)(loglogn)), although it has a memory requirement of O(n) in order to store the work-in-progress results. In other words, it's sublinear. I chose the slightly less efficient sieve of Eratosthenes over the Atkin sieve since it is simple to code.

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.IO;
   4:   
   5:  namespace GoogleCodeJam.NumberSets
   6:  {
   7:      public enum NumberType : byte
   8:      {
   9:          Unknown = 0,    //the default
  10:          Prime = 1,
  11:          Composite = 2
  12:      }
  13:   
  14:      //Sieve of Eratosthenes
  15:      public class PrimeSieve
  16:      {
  17:          NumberType[] _values;   // NB: this is a zero-based array
  18:   
  19:          public PrimeSieve(ulong max)    //umax is the upper range limit to check
  20:          {
  21:              if (max < 2) throw new ArgumentException();
  22:   
  23:              _values = new NumberType[max];      // 0 -> max-1
  24:              _values[0] = NumberType.Composite;  // one is not prime
  25:          }
  26:   
  27:          public List<ulong> ComputePrimes()
  28:          {
  29:              List<ulong> primes = new List<ulong>();
  30:   
  31:              for (ulong i = 2; i <= (ulong)_values.Length; i++)
  32:              {
  33:                  if (_values[i - 1] == NumberType.Unknown)
  34:                  {
  35:                      //The next unknown value must be prime
  36:                      _values[i - 1] = NumberType.Prime;
  37:                      primes.Add(i);
  38:   
  39:                      //All multiples of this prime must be Composite
  40:                      //but we can start at i squared
  41:                      for (ulong j = i * i; j <= (ulong)_values.Length; j += i)
  42:                      {
  43:                          _values[j - 1] = NumberType.Composite;
  44:                      }
  45:                  }
  46:              }
  47:              return primes;
  48:          }
  49:      }
  50:   
  51:      public class DisjointSets
  52:      {
  53:          private int _elementCount;   // The number of elements in the DisjointSets data structure.
  54:          private int _setCount;       // The number of sets  in the DisjointSets data structure.
  55:          private List<Node> _nodes;   // The list of nodes representing the elements
  56:   
  57:          // Create an empty DisjointSets data structure
  58:          public DisjointSets() : this(0) { }
  59:   
  60:          // Create a DisjointSet data structure with a specified
  61:          // number of elements (with element id's from 0 to count-1)
  62:          public DisjointSets(int count)
  63:          {
  64:              _elementCount = 0;
  65:              _setCount = 0;
  66:              _nodes = new List<Node>();
  67:              AddElements(count);
  68:          }
  69:   
  70:          // Returns the number of elements currently in the DisjointSets data structure.
  71:          public int ElementCount
  72:          {
  73:              get { return _elementCount; }
  74:          }
  75:   
  76:          // Returns the number of sets currently in the DisjointSets data structure.
  77:          public int SetCount
  78:          {
  79:              get { return _setCount; }
  80:          }
  81:   
  82:          // Find the set identifier that an element currently belongs to.
  83:          // Note: some internal data is modified for optimization even though this method is constant.
  84:          public int FindSet(int elementId)
  85:          {
  86:              if (elementId >= _elementCount)
  87:                  throw new ArgumentOutOfRangeException("elementId");
  88:   
  89:              // Find the root element that represents the set which `elementId` belongs to
  90:              Node curNode = _nodes[elementId];
  91:              while (curNode.Parent != null)
  92:                  curNode = curNode.Parent;
  93:   
  94:              Node representative = curNode;
  95:   
  96:              // Walk to the representative, updating the parents of `elementId`. Make those elements 
  97:              // the direct children of `representative`. This optimizes the tree for future 
  98:              // FindSet invocations.
  99:              curNode = _nodes[elementId];
 100:              while (curNode != representative)
 101:              {
 102:                  Node next = curNode.Parent;
 103:                  curNode.Parent = representative;
 104:                  curNode = next;
 105:              }
 106:   
 107:              return representative.Index;
 108:          }
 109:   
 110:          // Combine two sets into one. All elements in those two sets will share the
 111:          // same set id that can be gotten using FindSet.
 112:          public void Union(int setId1, int setId2)
 113:          {
 114:              if (setId1 >= _elementCount) throw new ArgumentOutOfRangeException("setId1");
 115:              if (setId2 >= _elementCount) throw new ArgumentOutOfRangeException("setId2");
 116:   
 117:              Node set1Rep = _nodes[setId1];
 118:              while (set1Rep.Parent != null)
 119:                  set1Rep = set1Rep.Parent;
 120:   
 121:              Node set2Rep = _nodes[setId2];
 122:              while (set2Rep.Parent != null)
 123:                  set2Rep = set2Rep.Parent;
 124:   
 125:              if (set1Rep.Index == set2Rep.Index)
 126:                  return; // already unioned
 127:   
 128:              Node set1 = _nodes[setId1];
 129:              Node set2 = _nodes[setId2];
 130:   
 131:              // Determine which node representing a set has a higher rank. The node with the higher rank
 132:              // is likely to have a bigger subtree so in order to better balance the tree representing
 133:              // the union, the node with the higher rank is made the parent of the one with the lower
 134:              // rank and not the other way around.
 135:              if (set1.Rank > set2.Rank)
 136:              {
 137:                  set2.Parent = set1;
 138:                  ++set1.Rank;
 139:              }
 140:              else if (set1.Rank < set2.Rank)
 141:              {
 142:                  set1.Parent = set2;
 143:                  ++set2.Rank;
 144:              }
 145:              else // set1.Rank == set2.Rank
 146:              {
 147:                  set2.Parent = set1;
 148:                  ++set1.Rank; // update rank
 149:              }
 150:   
 151:              // Since two sets have fused into one, there is now one less set so update the set count.
 152:              --_setCount;
 153:          }
 154:   
 155:          // Add a specified number of elements to the DisjointSets data structure. 
 156:          // The element id's of the new elements are numbered
 157:          // consecutively starting with the first never-before-used elementId.
 158:          public void AddElements(int addCount)
 159:          {
 160:              // insert and initialize the specified number of element nodes to the end of the `_nodes` array
 161:              for (int i = _elementCount; i < _elementCount + addCount; ++i)
 162:              {
 163:                  Node newNode = new Node();
 164:                  newNode.Parent = null;
 165:                  newNode.Index = i;
 166:                  newNode.Rank = 0;
 167:                  _nodes.Add(newNode);
 168:              }
 169:   
 170:              // update element and set counts
 171:              _elementCount += addCount;
 172:              _setCount += addCount;
 173:          }
 174:   
 175:          // Internal Node data structure used for representing an element.
 176:          private class Node
 177:          {
 178:              public int Rank;    // This roughly represent the max height of the node in its subtree.
 179:              public int Index;   // The index of the element the node represents.
 180:              public Node Parent; // The parent node of the node.
 181:          }
 182:      }
 183:   
 184:      public class Program
 185:      {
 186:          static void Main()
 187:          {
 188:              StreamReader file = new System.IO.StreamReader("large.txt");
 189:              StreamWriter outfile = new System.IO.StreamWriter("output.txt");
 190:   
 191:              List<ulong> primes = (new PrimeSieve(1000001)).ComputePrimes();
 192:   
 193:              int testCases = Int32.Parse(file.ReadLine());
 194:              for (int caseNo = 1; caseNo <= testCases; caseNo++) // note this is 1-based
 195:              {
 196:                  
 197:                  string[] d = (file.ReadLine()).Split(new char[] { ' ' }, StringSplitOptions.None);
 198:                  ulong a = UInt64.Parse(d[0]); // interval start
 199:                  ulong b = UInt64.Parse(d[1]); // interval end
 200:                  ulong p = UInt64.Parse(d[2]); // minimum prime factor
 201:   
 202:                  int elementCount = (int)(b - a + 1);
 203:                  if (elementCount < 2)
 204:                  {
 205:                      outfile.Write(String.Format("Case #{0}: {1}\r\n", caseNo, 1));
 206:                  }
 207:                  else
 208:                  {
 209:                      DisjointSets sets = new DisjointSets(elementCount);
 210:   
 211:                      // using a sieve find all prime numbers between P and A
 212:                      foreach (ulong i in primes)
 213:                      {
 214:                          if (i > (ulong)elementCount) break;
 215:   
 216:                          if (i >= p)
 217:                          {
 218:                              // find all integers in the given range which have this prime factor
 219:                              for (ulong x = (a % i == 0) ? a : (a + i - a % i); x + i <= b; x +=i )
 220:                              {
 221:                                  // i is a prime factor of x
 222:                                  sets.Union(sets.FindSet((int)(x-a)), sets.FindSet((int)(x+i-a)));
 223:                              }
 224:                          }
 225:                      }
 226:                      outfile.Write(String.Format("Case #{0}: {1}\r\n", caseNo, sets.SetCount));
 227:                  }
 228:              }
 229:   
 230:              file.Close();
 231:              outfile.Flush();
 232:              outfile.Close();
 233:          }
 234:      }
 235:  }
 236:   

Uncategorized

Google Code Jam: Crop Triangles

Google Code JamRound 1B kicks off with a problem called "Crop Triangles". You are given a grid of integer-valued coordinates and need to find out how many distinct triangles can be formed that have an integer-valued midpoint. The obvious answer that comes to mind - to construct every possible triangle and chceck it's midpoint - is not really feasible as the problem size increases. With n grid points, and 3 points needed for a triangle, the number of triangles to examine in the brute force approach is: nC3 = n(n-1)(n-2)/6  so when n = 100,000 , the upper limit of n in the large problem, the triangle count is approx. 1 x 1015. Ergo the brute force approach is not feasible as it has running time O(n3). So how can we reduce the search space? A little bit of number theory comes in very handy here...

We are told that: (i) all coordinate pairs are integer values, (ii) the midpoint of a triangle must be an integer value, and (iii) the x- and y-coordinates of the midpoint are the average of the x- and y-coordinates of the 3 vertices respectively. So in mathematical notation (focusing on just the x-coordinates):


 


 
The same logic applies to the y-coordinates, so we need only examine the coordinates mod 3 that I'm referring to herein as the "adjusted coordinates". Adjusting with a modulus 3 operation effectively reduces the x- and y-coordinates to values in the range [0,1,2]. The following matrix depicts this graphically.

y mod 3
0 1 2
x mod 3
0 C0,0 C0,1 C0,2
1 C1,0 C1,1 C1,2
2 C2,0 C2,1 C2,2

Since the sum of the adjusted coordinates for the x-values, and the y-values must equal 0 (the last equation given above) we only need to group together either:

  1. Three x-values that have the same mod 3 result since 3x0=0, 3x1=3, 3x2=6 and 0,3,6 are all multiples of 3; or
  2. Three x-values where each value has a different mod 3 result, since 0 + 1 + 2 =3 is a multiple of 3.

Obviously the same logic applies to the y-values, so our approach is to create a 3x3 matrix representing the mod 3 adjusted values. We iterate over all the points given and increment the appropriate counters in the matrix after we do the mod 3 operation on both the x- and y-values. Then we can examine the resulting 3x3 matrix of counts and produce the final answer by adding the number of 3-point combinations that can be made:

  1. by items within each cell;
  2. by items across columns for each row. This entails taking one point from each column for a given row in the matrix;
  3. by items across rows for each column. This entails taking one point from each row for a given column in the matrix;
  4. by items across rows and columns taking, at most, one point from each row and column in the matrix;

Here's my C# code:

using System;
using System.IO;

namespace CropTriangles
{
    class Program
    {
        static void Main()
        {
            string basePath = @"";
            var infile = new StreamReader(basePath + "large.txt");
            var outfile = new StreamWriter(basePath + "output.txt");

            int testCases = Int32.Parse(infile.ReadLine());
            for (int caseNo = 1; caseNo <= testCases; caseNo++) // note this is 1-based
            {
                // read in the input data
                string[] data = (infile.ReadLine()).Split(new char[] { ' ' }, StringSplitOptions.None);

                int treeCount = Int32.Parse(data[0]);
                long A = Int64.Parse(data[1]);
                long B = Int64.Parse(data[2]);
                long C = Int64.Parse(data[3]);
                long D = Int64.Parse(data[4]);

                long x = Int64.Parse(data[5]);
                long y = Int64.Parse(data[6]);
                long M = Int64.Parse(data[7]);

                long[,] bins = new long[3, 3];
                long answer = 0;

                bins[x%3, y%3]++;

                for(int i=1; i<treeCount; i++)
                {
                    checked
                    {
                        x = (A*x + B)%M;
                        y = (C*y + D)%M;

                        bins[x % 3, y % 3]++;
                    }
                }

                for(int i=0; i< 3; i++)
                    for(int j=0; j< 3; j++)
                        answer += (bins[i, j] * (bins[i, j] - 1) * (bins[i, j] - 2) / 6);

                // get column and row products
                for (int i = 0; i < 3; i++)
                {
                    answer += (bins[i,0] * bins[i,1] * bins[i,2]);
                    answer += (bins[0,i] * bins[1,i] * bins[2,i]);
                }

                // get 
                for (int i = 0; i < 3; i++)
                    for (int j = 0; j < 3; j++)
                        for (int k = 0; k < 3; k++)
                            if (i !=j & i!=k & j!=k)
                                answer += (bins[0,i] * bins[1,j] * bins[2,k]);

                // write out the results
                outfile.WriteLine(String.Format("Case #{0}: {1}", caseNo, answer));
            }

            infile.Close();
            outfile.Flush();
            outfile.Close();
        }
    }
}

Next »