A Great Book
Recently, I've been ploughing through Hacker's Delight by Dr Henry Warren. Despite the misleading title, and the illusion of a series of tips from an arcane science relevant only to compiler developers, this book is actually extremely useful in shedding light on various "under the covers" techniques that can actually permeate up into high-level coding and be useful to "everyday" coders when faced with performance/optimisation issues. Right from the get go, Chapter 2 has some remarkable insights into bitwise operations that led to several lightbulb moments for me.

A Familiar Problem 
For example, Section 2-19 discusses "Exchanging Registers" without using a 3rd register. This is a very common interview question reduced to it's most basic form. The interview question usually goes something like this:

How would you swap the value of 2 integers without using a temporary variable?

The Solution
The solution I learnt at university many years ago was, of course, to resort to clever mathematics... If we are given a and b which are integers, then we can apply the following 3 equations (I'm using apostrophes here merely to indicate that the variables have changed):

a' = a + b   (1)
b' = a' - b   (2)
a'' =  a' - b'  (3)

After (1), a holds the sum of the 2 integers, but we still know what b is so we can set b to a to effect the first half of the swap by employing (2). To see this just do the substitution: b' = (a + b) - b = a.

After this we can go back and set a to b to finish the swap routine by applying (3). To see why this works, again consider the substituton:  a'' = (a + b) - a = b. No temporary variable is needed because setting the sum of the 2 digits to one of the variables is enough to allow us to get around the problem with simple algebra. Here's the code in C#:

        public static void SwapWithAlgebra(ref int a, ref int b)
        {
            checked
            {
                a = a + b;
            }
            b = a - b;
            a = a - b;
        }

Note that we have to use the checked keyword since it's possible that addition of two 32-bit signed integers produces an integer which cannot be stored in a 32-bit signed integer and we'd prefer an Overflow exception be thrown rather than silent supression - which is the default behaviour for overflow exceptions in C# ! From a mathematical perspective we know that addition and subtraction are closed over the set of integers but in most programming languages this is not the case since a fixed number of bits are allocated to "integers". The inclusion of the checked keyword makes the code safer but slower. It also means we have to catch and handle any thrown exceptions which is a good reason to consider alternatives...

Dr Warren's Solution
As well as solutions based on addition and subtraction, Dr Warren gives a slightly different approach using bitwise exclusive or (XOR):

a' = a XOR b     (1)
b' = a' XOR b    (2)
a'' =  a' XOR b'  (3)

To understand how this works first note that:

  1. XOR will produce a 1 only if exactly one of it's operands is 1, and the other is zero; and
  2. XOR is commutative so  a XOR b = b XOR a; and
  3. XOR is associative so (a XOR b) XOR c = a XOR (b XOR c); and
  4. a XOR a = 0 (this should be obvious from the definition in [1] above)

After (1), the binary representation of a will have 1-bits only in the bit positions where a and b have opposing bits. That is either (ak=1, bk=0) or (ak=0, bk=1). Now when we do the substitution in (2) we get: 

b' = (a XOR b) XOR b
    = a XOR (b XOR b)     because XOR is associative
   = a XOR 0                   because of [4] above
   = a                              due to definition of XOR (see [1] above)

Now we can substitute into (3):

a'' =  (a XOR b) XOR a  
    =  (b XOR a) XOR a   because XOR is commutative
    =  b XOR (a XOR a)   because XOR is associative
    =  b XOR 0               because of [4] above
    =  b                          due to definition of XOR (see [1] above)

Here's the code in c#:

        public static void SwapWithXOR(ref int a, ref int b)
        {
            a ^= b;
            b ^= a;
            a ^= b;
        }

Measuring the Results
As always, when dealing with performance options it's imperative we collect actual data via measurement and base our recommendations on that. Accordingly, here's a simple test program that performs 1 million iterations of several solutions to this puzzle:

using System;
using System.Diagnostics;

namespace Algorithms
{
    static class SwapInts
    {

        public static void SwapWithXOR(ref int a, ref int b)
        {
            a ^= b;
            b ^= a;
            a ^= b;
        }

        public static void SwapWithAlgebra(ref int a, ref int b)
        {
            checked
            {
                a = a+b;
                b = a-b;
                a = a-b;
            }
        }

        public static void SwapWithTemp(ref int a, ref int b)
        {
            int t = a;
            a = b;
            b = t;
        }

        unsafe public static void SwapWithXORUnsafe(int* a, int* b)
        {
            *a ^= *b;
            *b ^= *a;
            *a ^= *b;
        }

        unsafe public static void SwapWithTempUnsafe(int* a, int* b)
        {
            int t = *a;
            *a = *b;
            *b = t;
        }

        static void Main()
        {

            Stopwatch sw = new Stopwatch();
            const int LoopMax = 1000000;

            int x = 455;
            int y = 1123;

            sw = new Stopwatch();
            sw.Start();
            for (int l = 0; l < LoopMax; l++)
                SwapWithXOR(ref x, ref y);
            Console.WriteLine(x);
            Console.WriteLine(y);
            Console.WriteLine(String.Format("Timing for XOR-based swap bit count: {0}", sw.ElapsedTicks));

            sw = new Stopwatch();
            sw.Start();
            for (int l = 0; l < LoopMax; l++)
                SwapWithTemp(ref x, ref y);
            Console.WriteLine(x);
            Console.WriteLine(y);
            Console.WriteLine(String.Format("Timing for TempVar-based swap bit count: {0}", sw.ElapsedTicks));

            sw = new Stopwatch();
            sw.Start();
            for (int l = 0; l < LoopMax; l++)
                SwapWithTemp(ref x, ref y);
            Console.WriteLine(x);
            Console.WriteLine(y);
            Console.WriteLine(String.Format("Timing for Algebra-based swap bit count: {0}", sw.ElapsedTicks));

            sw = new Stopwatch();
            sw.Start();
            for (int l = 0; l < LoopMax; l++)
            {
                unsafe
                {
                    SwapWithXORUnsafe(&x, &y);
                }
            }
            Console.WriteLine(x);
            Console.WriteLine(y);
            Console.WriteLine(String.Format("Timing for XOR unsafe-based swap bit count: {0}", sw.ElapsedTicks));

            sw = new Stopwatch();
            sw.Start();
            for (int l = 0; l < LoopMax; l++)
            {
                unsafe
                {
                    SwapWithTempUnsafe(&x, &y);
                }
            }
            Console.WriteLine(x);
            Console.WriteLine(y);
            Console.WriteLine(String.Format("Timing for TempVar unsafe-based swap bit count: {0}", sw.ElapsedTicks));

        }
    }
}

From this we get the following results:

Timing for XOR-based swap bit count:                   50,877
Timing for TempVar-based swap bit count: 36,061
Timing for Algebra-based swap bit count:              36,318
Timing for XOR unsafe-based swap bit count:        63,235
Timing for TempVar unsafe-based swap bit count: 37,904

Conclusion
If you get this question at an interview you can offer 2 alternatives to the problem - one based on addition and subtraction, and one based on XOR. However, it's wise to point out that the performance of the version that uses a temporary variable is better than the XOR version and here's why: the temporary variable solution uses 3 assignment operations, whereas the XOR solution uses 3 assignment operations and 3 bitwise XOR operations hence the temp. variable approach should be quicker. Interestingly, from my results it appears that the addition/substitution version is roughly as fast as the temporary version but it has the slight inconvenience of not working in all cases due to overflow exceptions. Given that my recommendation would be to use the temporary variable approach if running time is more important than space complexity, otherwise use the XOR version which is more memory efficient that the version using the temporary variable, and doesn't suffer from the overflow exceptions.

Bookmark and Share