Friday, January 6, 2012

2 Sum Problem : Given an integer array and a number T, find all unique pairs of (a, b) whose sum is equal to T

You are given an array of n integers and a target sum T. The goal is to determine whether or not there are two numbers x,y in A with x+y=T.

Example : Suppose we have an int array = {5, 3, 7, 0, 1, 4, 2} and T = 5. The unique pairs that sum up to 5 are (5, 0) (3, 2) and (1, 4).

There are three approaches to solve this problem - 1) brute force, 2) sort the array and use binary and search, and 3) Using the hashtable. 
Please scroll down, if you are only interested in the best approach i.e. approach 3 using hashtables.

Approach 1 : Brute force method
The first approach is to use brute force which gives time complexity of O(n^2) and space complexity of O(n). We basically just have to loop through the array and for each number we add it to each of other numbers and see if they sum up to 5. If so, we print out the pair. Here is the code for brute force approach:
void findPairOfSum(int arrayOfNum[], int arraySize, int sum)
  for (int i = 0; i < arraySize; i++)
    for (int j = i; j < arraySize; j++)
      if (arrayOfNum[i] + arrayOfNum[j] == sum)
        cout << "(" << arrayOfNum[i] << ", " << arrayOfNum[j] << ")" << endl;
The second approach is to use a hash table to store the difference between sum and each of the elements in the array. Then as we loop through the array, check each element against the hash table. If the element presents, then print out the key value pair. For example, if we hash the example array we'll have this hash table:
Key   Value
5        5 - 5 = 0
3        5 - 3 = 2
7        5 - 7 = -2
0        5 - 0 = 5
1        5 - 1 = 4
4        5 - 4 = 1
2        5 - 3 = 2

This approach will have the time complexity of O(n) and space complexity of O(n). Thus, chose your method wisely, depending on your need (speed or space efficiency).

2nd Approach - Use sorted array
A better way would be to sort the array. This takes O(n log n)
Then for each x in array A, use binary search to look for T-x. This will take O(nlogn).
So, overall search is  O(n log n)

Hers is how the pseudo code will look:
arr = {};//some array
sortedArr = sort(arr);
for( i = 0;i < arr.length - 1; i++)
   x = arr[i];
   bool found = binarySearch(sortedArr, T-x);//Search for T-x in sorted Arrary
    print "pair", x, T-x;

Approach 2b - Using sorting but using variant of binary search
Assuming array sorted in ascending order. Now we take first and last element, and sum them up. If sum is equal to T, we have found the pair, but if sum is greater than T, we reduce right pointer by 1, or increment left pointer otherwise.

arr = {};//some array
sortedArr = sort(arr);
left = start;
right= arr.length;
while(left < right)
   x = arr[left];
   y = arr[right];
   sum = x+y;
   if(sum == T)
   if(sum > T)
   if(sum < T)
    print "pair", x, T-x;

3rd and Best - Use hash table
I have already mentioned in problem in the application of hash table here.
The best way would be to insert every element into a hash table (without sorting). This takes O(n) as constant time insertion.
Then for every x, we can just look up its complement, T-x, which is O(1).
Overall it takes will be O(n).

Here is how the pseudocode will look.
Let arr be the given array.
And T be the given sum

for (i=0 i<arr.length - 1 ;i++)
  hash(arr[i]) = i  // key is the element and value is its index.

for (i=0 i<arr.length - 1; i++)
  if (hash(T - arr[i]) != i ) // if T - ele exists and is different we found a pair
    print "pair i , hash(T - arr[i]) has sum T"

Please let us know if you know any other approach. Thanks.



  1. how to build the hash table in the approach 3. also i think you should check if hash(T-arr[i]) is empty first.

    1. Hi Jianchen, I think we can use hashmap or we can use element as an index in the array. I think only way hash() is empty when the given array is null or has no element. In that case we will not enter into any for loop.


  2. This comment has been removed by the author.

  3. What if there are duplicates in the array? We should keep track of the number of occurrences of each number and update as we go.

    1. Still it work, as we have to just find the sum. If 1 element is in the hashmap, and say other element repeats, we won't care as their sum will not be equal to given sum T.

  4. I don't want to find unique pairs,I want all the numbers that form the sum.
    eg: array = {5, 6, 7, 4, 1, 3, 2, 8, 9, 10} and T = 15
    3+2+6+4=15 ... etc

    1. Hi Vikram, this is a different problem all together. Please refer here for the solution -

  5. The hashmap one is a fucking genius. Got excited a bit :x