Saturday, January 7, 2012

Red Black Tree – Insertion

Introduction:
Please refer to the introduction of RBT here. So, we now know the invariants, or properties, of an RBT that make it an RBT :)


Inserting an element into an RBT, or red black tree, can break one of the four invariants or properties, so we have to fix it in the most minimal way possible. To assist us we have two tools:
  • Flipping the color
  • Right and left rotation
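As a sketch of what these two tools look like in C (the struct matches the one defined just below; the helper names and the exact coloring done here are my own, since the real case functions adjust colors differently case by case):

```c
#include <assert.h>
#include <stddef.h>

enum { RED, BLACK };

struct tnode {
        int data;
        char color;
        struct tnode *left;
        struct tnode *right;
};

/* Color flip: the parent becomes red and both children become black. */
static void color_flip(struct tnode *p)
{
        p->color = RED;
        if (p->left)
                p->left->color = BLACK;
        if (p->right)
                p->right->color = BLACK;
}

/* Single left rotation: the right child becomes the new subtree root. */
static struct tnode *single_left(struct tnode *p)
{
        struct tnode *r = p->right;
        p->right = r->left;
        r->left = p;
        return r;
}

/* Single right rotation: mirror image of single_left. */
static struct tnode *single_right(struct tnode *p)
{
        struct tnode *l = p->left;
        p->left = l->right;
        l->right = p;
        return l;
}
```

Note that the rotations only rearrange pointers; recoloring is a separate step, which is why the two tools are listed separately.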

Insertion:
Insertion begins by adding the node much as binary search tree insertion does, and by coloring it red. Inserting a red node can cause a red violation, so we are going to fix it after the insertion of each new node.
Let's look at the node and tree structures.


struct tnode {

        int data;
        char color;
        struct tnode *left;
        struct tnode *right;

};

struct tree {

        struct tnode *root;
}*Tree;


Here is a helper function that returns a new red node (talloc is assumed to allocate a single struct tnode):
struct tnode *makenode(int data)/*make a new node*/
{
        struct tnode *p;
        p = talloc();
        p->data = data;
        p->color = RED;
        p->left = p->right = NULL;
        return p;
}
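The code above relies on two things it doesn't show: talloc, and the is_red test used by the fix-up code later in this post. Here is a minimal sketch of both (my stand-ins, not the original author's; the struct is repeated only to keep the sketch self-contained):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { RED, BLACK };

struct tnode {
        int data;
        char color;
        struct tnode *left;
        struct tnode *right;
};

/* Minimal stand-in for talloc(): allocate one node or abort on failure. */
static struct tnode *talloc(void)
{
        struct tnode *p = malloc(sizeof *p);
        if (p == NULL)
                abort();
        return p;
}

/* NULL links count as black, so the redness test must be NULL-safe. */
static int is_red(struct tnode *p)
{
        return p != NULL && p->color == RED;
}
```

The NULL check in is_red matters: the case-handling code calls it on children that may be null pointers, and null pointers are treated as black nodes.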


http://athiraamazhichira.wordpress.com/2011/11/07/red-black-tree-insertion/


Red Black Tree – Deletion

Deletion of a node from a red black tree is harder than insertion. During insertion we chose the color of the new node as red to make things easier: we forced a red violation and fixed it. Red violations are easy to fix, and we took full advantage of that to produce an elegant algorithm. When you want to delete a node, you don't have the option of choosing its color. If it's red, that's good: since a red node does not participate in the black height, removing it has no effect on the black height and cannot violate the rules of a red black tree.
If the node is black, that's a problem. Removing a black node is sure to cause a black violation, just as inserting a black node would (which is why we avoided that during insertion), and it could very likely cause a red violation too. So we are always going to delete a red node, since deleting a red node can't violate any of the rules. How can we always delete a red node? First we force a red node into the tree at the top. Then we push that red node down using colour flips and rotations. This way we can ensure that the node finally removed is always red.
There are only four cases for pushing a red node down the tree without a black violation. I am going to explain how to handle each case using examples. In each example a node is deleted from the tree given below. But first, look at the deletion function. It uses a sneaky deletion-by-copying technique: it saves the node to be deleted and keeps going through the loop until the inorder predecessor is found. At that point the walk-down loop terminates, the data items are copied, and the external node is spliced out of the tree.
The delete function looks like this:
struct tnode *q, *g, *p;
void delete(struct tree *tree, int data)
{
       if (tree->root != NULL) {
              struct tnode head = {0};
              struct tnode *f = NULL;
              struct tnode *child1, *child2, *sib, *next;
              int dir = 1;
              q = &head;
              g = p = NULL;
              q->right = tree->root;
              next = q->right;
              while (next != NULL ){
                    int last = dir;
                    g = p, p = q;
                    q = next;
                    dir = q->data < data;
                    if (dir == 1){
                         next = q->right;
                         if (q->data == data) {
                                child1 = q->left;
                                child2 = q->right;
                         }else {
                                child1 = q->right;
                                child2 = q->left;
                         }
                    } else if (dir == 0) {
                         next = q->left;
                         child1 = q->left;
                         child2 = q->right;
                    }
                    if (data == q->data )
                         f = q;
                  /*code for push down the red node down*/
           }
           if (f != NULL)
                  removal(f);
           tree->root = head.right;
           if ( tree->root != NULL )
                  tree->root->color = BLACK;
      }

}

In this function we use a dummy root to avoid special cases, and a few helper variables that point up the tree. head is the dummy root and f points to the target node. child1 and child2 point to the two children of the current node, sib points to the sibling of the current node, and next points to the next node to be examined. q points to the current node, and g and p are the grandparent and parent of q respectively. We initialize q to point to the dummy root and make the right child of the dummy root the real root of the tree. The variable dir indicates which subtree should be searched for the target node. Suppose the target node is in the right subtree: once we reach the target node, dir becomes 0 and the loop keeps going through the left subtree of the target node in order to find its inorder predecessor. On the way down we look for and handle all four cases for pushing a red node down. After the deletion of the target node, the root of the tree is updated and its colour set to black.
The code for handling the four cases is given below:
if (!is_red(q) && !is_red(child1)){
       if(is_red(child2)) {
              case2(dir, last);
       }else  if (!is_red(child2)){
               if (last == 0)
                       sib = p->right;
               else if(last == 1)
                       sib = p->left;
               if ( sib != NULL ) {
                       if ( !is_red ( sib->left ) && !is_red ( sib->right ) ) {
                              case1(sib);
                       }else {
                              int dir2 = g->right == p;
                              case3_4(sib, dir2, last);
                              proper_coloring(dir2);
                        }
               }
         }
}


First we check whether the current node and one of its children are black. If both are black, we check whether the other child is black too. If the other child is red, it is case 2, the dreaded red sibling case, and we call the function case2 to handle it. If the other child is black, we check the colour of the children of the current node's sibling. If both of the sibling's children are black, it is case 1, a simple reverse colour flip, and we call case1 to handle it. If one of the sibling's children is red, it is case 3 or case 4: if the right child is red we perform a double rotation (case 3), and if the left child is red we perform a single rotation (case 4). The function case3_4 is called to handle both. Next we look at each case. Consider the tree given below:
Case 1:
The first case is a simple reverse color flip. If a node and its sibling are black, and all four of their children are black, make the parent black and both children red:
void case1(struct tnode *sib)
{
       p->color = BLACK;
       sib->color = RED;
       q->color = RED;
}


Let's delete the node 25 from the red black tree above. Since node 25 is red and its left child 22 and right child 27 are black, this is case 1. After handling case 1, f points to node 25 and the current node q points to node 22 (the inorder predecessor of node 25). Now we replace the data of the node pointed to by f with the data of the node pointed to by q. After the deletion of node 25 the tree looks like this:

Case 2:
The second case is the dreaded red sibling case. If a node is black and its sibling is red, we reduce it with a single rotation.

void case2(int dir, int last)
{
        if (dir == 1) {
                if (last == 1)
                        p = p->right = single_right(q);
                else if(last == 0)
                        p =  p->left = single_right(q);
        }else if (dir == 0){
                if (last == 1)
                        p = p->right = single_left(q);
                else if(last == 0)
                        p =  p->left = single_left(q);
        }
}

Let's delete the node 17 from the red black tree above. It is black, its right child is red, and its left child is black, so this is case 2. After handling case 2, f points to node 17 and the current node q points to node 15 (the inorder predecessor of node 17). Since node 15 and both of its children are black (null pointers), we check the colour of the sibling of node 15. The sibling of node 15 is node 22 and it is black, so we handle this as case 1. After handling case 1 we replace the data of f with the data of the current node q.
After case 2 the tree looks like this:

After case 1 and the removal, the tree looks like this:

Case 3 and case 4:
If the node and both of its children are black, we look at the colour of its sibling's children. If the right child of the sibling is red, we reduce it with a double rotation: that is case 3. If the sibling's left child is red, a single rotation is enough: that is case 4.
Case 3:


Case 4:

void case3_4(struct tnode *sib, int dir2, int last)
{
       if ( is_red ( sib->right ) ){/*case 3*/
              if (dir2 == 0 && last == 0)
                     g->left = double_right(p);
              else if (dir2 == 0 && last == 1)
                     g->left = double_left(p);
              else if (dir2 == 1 && last == 0)
                     g->right = double_right(p);
              else if (dir2 == 1 && last == 1)
                     g->right = double_left(p);
       } else if ( is_red ( sib->left ) ){/*case 4*/
              if (dir2 == 0 && last == 0)
                     g->left = single_left(p);
              else if (dir2 == 0 && last == 1)
                     g->left = single_right(p);
              else if (dir2 == 1 && last == 0)
                     g->right = single_left(p);
              else if (dir2 == 1 && last == 1)
                     g->right = single_right(p);
       }
}


Now delete node 1 from the red black tree above. Since node 1 is in the left subtree of the root, the loop traverses the left subtree. When it reaches node 8, both of its children are black, so we check the colour of its sibling's children. The sibling of node 8 is node 17, and its right child, node 25, is red. Hence it is case 3 and we perform a double rotation. After the double rotation we force the correct colouring of all affected nodes by calling the proper_coloring routine.
After the double rotation:

after proper colouring:

Now the loop continues. When it reaches node 1, one of its children is black (a null pointer) and the other is red, so it is case 2. After handling case 2 we remove node 1. Now the tree looks like this:

The removal function looks like this:

void removal(struct tnode *f)
{
        int flag1 = p->right == q;
        int flag2 = q->left == NULL;
        f->data = q->data;
        if (flag1 == 0 && flag2 == 0)
                p->left = q->left;
        else if (flag1 == 0 && flag2 == 1)
                p->left = q->right;
        else if (flag1 == 1 && flag2 == 0)
                p->right = q->left;
        else if (flag1 == 1 && flag2 == 1)
                p->right = q->right;
        free ( q );

}























Bloom filters

The Bloom filter is a data structure designed in 1970. It is similar to a hash table: more space efficient, but it raises false positives at times.


Supported Operations
A bloom filter supports super fast lookup and insertion, just like a hash table. So why have this data structure?

Pros
  1. It is much more space efficient.
    It can take even less space than the object itself, because we don't store the object at all: we only care about whether we have seen that object or not. Compare something like HashSet in Java, which you may have used, and which stores the objects themselves.
Cons
  1. You can't store an associated object. You are only storing a value to indicate whether or not an object is in the data structure. 
  2. Deletions are not allowed. (Deletion is possible in variants such as counting Bloom filters, but this point holds for the vanilla or simple Bloom filter.)
  3. There is also a small false positive probability. This means that even if you haven't inserted object x into the Bloom filter, it may say that x has been inserted.
Applications
  1. Early spellcheckers (of 1970s) - you add every possible word into the filter. Then you just check word by word if something is in the filter or not. If it is not in the filter, then that is an incorrect word. With small probability, this will say that an incorrectly spelled word is actually correct.
  2. List of forbidden passwords (still relevant) - This could be used to prevent the use of common or too-easily guessed passwords. 
  3. Software used in network routers (modern) - A router has limited memory and needs a super-fast data structure that can manage packets arriving at a torrential rate: keeping track of denial of service attacks, keeping track of banned IP addresses, and so on.
When to use Bloom filters?
They are lighter weight than a hash table, but they have some probability of false positive error.
So they can be used where speed and size are the constraints, but a small amount of error is allowed.

Implementing Bloom filters
Array of n bits
Like a hash table, a Bloom filter needs an array for fast access. So it uses an array of bits and a couple of hash functions.

To implement this data structure, you use an array of n bits (each bit is either 0 or 1). The space occupied by the Bloom filter is measured in bits per object inserted. So, if you have inserted a dataset S, and the filter uses n bits, then the size per object = n / |S|, where |S| is the size of the set.

For example, if the ratio is 8, then the Bloom filter is using 8 bits per object, which is awesome (way less than even storing a pointer to the object).

k hash functions

We also need k hash functions, where k is somewhere around 2-5. (You could also make just 2 hash functions and derive the other k hash values as linear combinations of those 2.)

Insert(x)
Insert(x)
{
   for i=1,2,...,k 
      set A[hi(x)]=1
}

So, we set these k positions in array of bits equal to 1.

Lookup(x)
Then for lookup, you just return true if A[hi(x)] == 1 for i = 1, 2, ..., k.
You can't get a false negative, since we never reset an entry in the array back to 0.
However, you can get a false positive if other insertions set all k positions hi(x) to 1.

For this data structure to be good, we need the probability of a false positive to be relatively low while keeping the space needed small. So we have to find the trade-off curve between the two.
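That trade-off curve is well known (the derivation is not in this post, but the result is standard): for a filter with n bits, |S| inserted elements, and k hash functions, the false positive probability and the optimal choice of k are approximately

```latex
p_{\mathrm{fp}} \approx \left(1 - e^{-k|S|/n}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{n}{|S|}\ln 2
\;\Rightarrow\;
p_{\mathrm{fp}} \approx 2^{-k_{\mathrm{opt}}}
```

For the 8-bits-per-object ratio mentioned above, k_opt is about 5.5 and the false positive rate works out to roughly 2%, which is why a byte per object is often good enough in practice.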


Implementation in CPP

Bloom filters are very simple: at the core is a big array of bits, initially all zero.

char *bitarray;
bitarray = malloc(MAXSIZE * sizeof(char));
for (i = 0; i < MAXSIZE; i++)
        bitarray[i] = 0;


There must also be k different hash functions, each of which maps (hashes) a set element to one of the m array positions with a uniform random distribution. Let us look at how an element is added to a Bloom filter. Here I used two different hash functions, namely Hash1 and Hash2. Suppose we are going to represent a dictionary of words using a Bloom filter. First we feed each word into the two hash functions to get two array positions, which are used to set the corresponding bits in the bit array. Sometimes there will be clashes, where a bit will already be set from some other word. This doesn't matter.
The insert function looks like this:
void insert()
{
        char word[ARRSIZE];
        unsigned int h1, h2;

        scanf("%s", word);
        h1 = Hash1(word, strlen(word));
        add(h1);
        h2 = Hash2(word, strlen(word));
        add(h2);
}


Here the insert function uses the Hash1 and Hash2 hash functions to produce two array positions. The function add sets the corresponding bit in the bit array. The add function looks like this:

#define BITSIZE 8
void add(unsigned int h)
{
        bitarray[h / BITSIZE] |= (1 << (7 - h % BITSIZE));
}

The Hash1 and Hash2 functions look like this. (The original Hash1 multiplied by a factor that stayed at 1, which degenerated the hash into a plain character sum; a fixed multiplier fixes that.)
unsigned int Hash1(char *str, unsigned int len)
{
        unsigned int seed = 31; /* fixed polynomial-hash multiplier */
        unsigned int hash = 0;
        unsigned int i    = 0;

        for(i = 0; i < len; str++, i++){
               hash = (hash * seed) + (*str);
        }
        return hash;
}



hash2:
unsigned int Hash2(char* str, unsigned int len)
{
       unsigned int seed = 13;
       unsigned int hash = 0;
       unsigned int i    = 0;

       for(i = 0; i < len; str++, i++){
              hash = (hash * seed) + (*str);
       }
       return hash;
}

Next, how do we check whether an element is in the set? First feed the element into the two hash functions to get two array positions. If any of the bits at these positions is 0, the element is not in the set; if it were, all the bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits were set to 1 during the insertion of other elements (the Bloom filter reports a false positive). The lookup function looks like this:
void lookup()
{
        unsigned int h1, h2;
        int true1, true2;
        char word[ARRSIZE];

        scanf("%s", word);
        h1 = Hash1(word, strlen(word));
        true1 = bitarray[h1 / BITSIZE] & (1 << (7 - h1 % BITSIZE));
        h2 = Hash2(word, strlen(word));
        true2 = bitarray[h2 / BITSIZE] & (1 << (7 - h2 % BITSIZE));
        if (true1 && true2)
                printf("%s is in the set\n", word);
        else
               printf("%s is not in the set\n", word);

}
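Putting the pieces together as one self-contained sketch (my variant, not the post's exact code: I replaced the scanf I/O with plain functions and added a modulo by the table size in bits, which the code above leaves implicit and which is needed to keep the index inside the array):

```c
#include <assert.h>
#include <string.h>

#define MAXSIZE 1024            /* bytes in the bit array (a toy size) */
#define BITSIZE 8

static char bitarray[MAXSIZE];  /* static storage: all bits start at 0 */

static unsigned int Hash1(const char *str, unsigned int len)
{
        unsigned int hash = 0, i;
        for (i = 0; i < len; i++)
                hash = hash * 31 + (unsigned char)str[i];
        return hash % (MAXSIZE * BITSIZE);   /* keep the index in range */
}

static unsigned int Hash2(const char *str, unsigned int len)
{
        unsigned int hash = 0, i;
        for (i = 0; i < len; i++)
                hash = hash * 13 + (unsigned char)str[i];
        return hash % (MAXSIZE * BITSIZE);
}

static void add(unsigned int h)
{
        bitarray[h / BITSIZE] |= (1 << (7 - h % BITSIZE));
}

static int test_bit(unsigned int h)
{
        return bitarray[h / BITSIZE] & (1 << (7 - h % BITSIZE));
}

static void insert_word(const char *word)
{
        add(Hash1(word, strlen(word)));
        add(Hash2(word, strlen(word)));
}

static int lookup_word(const char *word)
{
        return test_bit(Hash1(word, strlen(word))) &&
               test_bit(Hash2(word, strlen(word)));
}
```

With 8192 bits and only a handful of words inserted, false positives are very unlikely, but the lookup can still only answer "possibly in the set" or "definitely not in the set".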


Thanks

Friday, January 6, 2012

AVL Tree : Deleting a node from AVL Tree

Deleting a node works the same way: when we delete the node, if the deletion upsets the balance, the tree has to be re-balanced. E.g. consider the tree

AVL Tree : Rotations

A tree rotation can be an intimidating concept at first. You end up in a situation where you're juggling nodes, and these nodes have trees attached to them, and it can all become confusing very fast. I find it helps to block out what's going on with any of the subtrees which are attached to the nodes you're fumbling with, but that can be hard.

Left Rotation (LL) or Single Left rotation

Imagine we have this situation:

a
 \
  b
   \
    c
To fix this, we must perform a left rotation, rooted at a. This is done in the following steps:

b becomes the new root.
a takes ownership of b's left child as its right child, or in this case, null.
b takes ownership of a as its left child.

temp = p->right;
p->right = temp->left;
temp->left = p;
p = temp;

The tree now looks like this:
Figure 1-2
  b
 / \
a   c
Right Rotation (RR) or Single Right Rotation

A right rotation is a mirror of the left rotation operation described above. Imagine we have this situation:
Figure 1-3
    c
   /
  b
 /
a
To fix this, we will perform a single right rotation, rooted at c. This is done in the following steps:

b becomes the new root.
c takes ownership of b's right child as its left child. In this case, that value is null.
b takes ownership of c as its right child.

temp = p->left;
p->left = temp->right;
temp->right = p;
p = temp;

The resulting tree:
Figure 1-4
  b
 / \
a   c
Left-Right Rotation (LR) or "Double left"

Sometimes a single left rotation is not sufficient to balance an unbalanced tree. Take this situation:
Figure 1-5
a
 \
  c
Perfect. It's balanced. Let's insert 'b'.
Figure 1-6
a
 \
  c
 / 
b

Our initial reaction here is to do a single left rotation. Let's try that.
Figure 1-7
  c
 /
a
 \
  b
Our left rotation has completed, and we're stuck in the same situation. If we were to do a single right rotation in this situation, we would be right back where we started. What's causing this? The answer is that this is a result of the right subtree having a negative balance. In other words, because the right subtree was left heavy, our rotation was not sufficient. What can we do? The answer is to perform a right rotation on the right subtree. Read that again. We will perform a right rotation on the right subtree. We are not rotating on our current root. We are rotating on our right child. Think of our right subtree, isolated from our main tree, and perform a right rotation on it:

Before:
Figure 1-8
  c
 /
b
After:
Figure 1-9
b
 \
  c
After performing a rotation on our right subtree, we have prepared our root to be rotated left. Here is our tree now:
Figure 1-10
a
 \
  b
   \
    c
Looks like we're ready for a left rotation. Let's do that:
Figure 1-11
  b
 / \
a   c
Voila. Problem solved.
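The two-step dance above is exactly what a "double left" helper does: first right-rotate the right subtree, then left-rotate the root. A sketch in C (helper names are mine; the single rotations are the same pointer surgery shown in the snippets above):

```c
#include <assert.h>
#include <stddef.h>

struct node {
        int data;
        struct node *left;
        struct node *right;
};

static struct node *rotate_left(struct node *p)
{
        struct node *r = p->right;
        p->right = r->left;
        r->left = p;
        return r;
}

static struct node *rotate_right(struct node *p)
{
        struct node *l = p->left;
        p->left = l->right;
        l->right = p;
        return l;
}

/* Left-Right (double left): right-rotate the right child, then left-rotate. */
static struct node *double_left(struct node *p)
{
        p->right = rotate_right(p->right);
        return rotate_left(p);
}
```

The Right-Left rotation described in the next section is the mirror image: left-rotate the left child, then right-rotate the root.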



Right-Left Rotation (RL) or "Double right"


A double right rotation, or right-left rotation, or simply RL, is a rotation that must be performed when attempting to balance a tree which has a left subtree, that is right heavy. This is a mirror operation of what was illustrated in the section on Left-Right Rotations, or double left rotations. Let's look at an example of a situation where we need to perform a Right-Left rotation.
Figure 1-12
  c
 /
a
 \
  b
In this situation, we have a tree that is unbalanced. The left subtree has a height of 2, and the right subtree has a height of 0. This makes the balance factor of our root node, c, equal to -2. What do we do? Some kind of right rotation is clearly necessary, but a single right rotation will not solve our problem. Let's try it:
Figure 1-13
a
 \
  c
 /
b
Looks like that didn't work. Now we have a tree that has a balance of 2. It would appear that we did not accomplish much. That is true. What do we do? Well, let's go back to the original tree, before we did our pointless right rotation:
Figure 1-14
  c
 /
a
 \
  b
The reason our right rotation did not work is that the left subtree, or 'a', has a positive balance factor, and is thus right heavy. Performing a right rotation on a tree that has a right-heavy left subtree will result in the problem we just witnessed. What do we do? The answer is to make our left subtree left-heavy. We do this by performing a left rotation on our left subtree. Doing so leaves us with this situation:
Figure 1-15
    c
   /
  b
 /
a

This is a tree which can now be balanced using a single right rotation. We can now perform our right rotation rooted at c. The result:
Figure 1-16
  b
 / \
a   c
Balance at last.

Bitonic Sort

http://xlinux.nist.gov/dads/HTML/bitonicSort.html

Bidirectional Bubble Sort

http://sanjaal.com/java/198/java-data-structure/java-code-implementation-of-bidirectional-bubble-sort/

Sorting in Hungarian Way

Checked out some videos on YouTube. You should have a look too:

Quick Sort:



Insertion Sort



Bubble Sort



Shell sort



Merge Sort


How to rotate an array?

One of the classic problems with arrays is to perform an in-place rotation of the array. For example, if we have an array of integers such as int[] array = new int[]{1, 2, 3, 4, 5, 6} and we want to rotate it around index 2, then the rotated array is {3, 4, 5, 6, 1, 2}. So, how can we rotate an array around a given index? Here is the algorithm, with the explanation following after:
public static void reverseArray(int[] array, int start, int end)
{
 int i;
 int temp;

 while (start < end)
 {
  temp = array[start];
  array[start] = array[end];
  array[end] = temp;
  start++;
  end--;
 }
}

private static void rotateArrayByIndex(int array[], int rotatePos, int arraySize)
{
 reverseArray(array, 0, rotatePos - 1);
 reverseArray(array, rotatePos, arraySize - 1);
 reverseArray(array, 0, arraySize - 1);
}
The strategy consists of three steps(consider the array - 1, 2, 3, 4, 5, 6).
  • First, we reverse the first half of the array--starting from index 0 to (rotate index - 1).
    rotateIndex = 2, hence array becomes = 2,1....3,4,5,6
  • Then we reverse the second half of the array--starting from rotate index to the last index of the array. 2,1....6 5 4 3
  • Lastly, we reverse the whole array.
    3, 4, 5, 6, ...1,2
The first function reverses a specific portion of an array. It's simple. We just swap the last element with the first element, the second to last element with the second element and so on until the start and end indices meet.
Once we have the reverse function, the rotate function is trivial. rotateArrayByIndex does the three steps that we outlined above in order to rotate an input array by a specified index.
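The same three-step reversal trick can be sketched in C (the original above is Java; the names here are mine):

```c
#include <assert.h>

/* Reverse a[start..end] in place by swapping from both ends. */
static void reverse_range(int a[], int start, int end)
{
        while (start < end) {
                int tmp = a[start];
                a[start] = a[end];
                a[end] = tmp;
                start++;
                end--;
        }
}

/* Rotate so that a[rotate_pos] becomes the first element. */
static void rotate_array(int a[], int rotate_pos, int size)
{
        reverse_range(a, 0, rotate_pos - 1);   /* reverse the first part  */
        reverse_range(a, rotate_pos, size - 1);/* reverse the second part */
        reverse_range(a, 0, size - 1);         /* reverse the whole array */
}
```

This runs in O(n) time and O(1) extra space, which is what makes the trick worth remembering.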

More and source - http://stackoverflow.com/questions/4457277/algorithm-to-rotate-an-array-in-linear-time

2 Sum Problem : Given an integer array and a number T, find all unique pairs of (a, b) whose sum is equal to T

You are given an array of n integers and a target sum T. The goal is to determine whether or not there are two numbers x,y in A with x+y=T.

Example : Suppose we have an int array = {5, 3, 7, 0, 1, 4, 2} and T = 5. The unique pairs that sum up to 5 are (5, 0) (3, 2) and (1, 4).

There are three approaches to solve this problem: 1) brute force, 2) sort the array and use binary search, and 3) use a hash table.
Please scroll down if you are only interested in the best approach, i.e. approach 3 using hash tables.

Approach 1 : Brute force method
The first approach is brute force, which gives a time complexity of O(n^2) and a space complexity of O(1). We basically loop through the array, and for each number we add it to each of the other numbers and see if they sum up to T. If so, we print out the pair. Here is the code for the brute force approach:
void findPairOfSum(int arrayOfNum[], int arraySize, int sum)
{
  for (int i = 0; i < arraySize; i++)
  {
    for (int j = i + 1; j < arraySize; j++)
    {
      if (arrayOfNum[i] + arrayOfNum[j] == sum)
        cout << "(" << arrayOfNum[i] << ", " << arrayOfNum[j] << ")" << endl;
    }
  }
} 
Another approach is to use a hash table to store, for each element, the difference between the sum and that element. Then, as we loop through the array, we check each element against the hash table; if the element is present as a stored difference, we print out the key-value pair. For example, if we hash the example array (T = 5) we'll have this hash table:
Key   Value
5        5 - 5 = 0
3        5 - 3 = 2
7        5 - 7 = -2
0        5 - 0 = 5
1        5 - 1 = 4
4        5 - 4 = 1
2        5 - 2 = 3

This approach has a time complexity of O(n) and a space complexity of O(n). Thus, choose your method wisely, depending on your need (speed or space efficiency).

2nd Approach - Use sorted array
A better way would be to sort the array, which takes O(n log n).
Then, for each x in array A, use binary search to look for T-x. This takes O(n log n) in total.
So the overall search is O(n log n).

Here is how the pseudocode will look:
arr = {};//some array
sortedArr = sort(arr);
for( i = 0; i < arr.length; i++)
{
   x = arr[i];
   bool found = binarySearch(sortedArr, T-x);//Search for T-x in sorted Arrary
   if(found)
    print "pair", x, T-x;
}

Approach 2b - Using sorting but using variant of binary search
Assume the array is sorted in ascending order. Now we take the first and last elements and sum them up. If the sum is equal to T, we have found a pair; if the sum is greater than T, we decrement the right pointer; otherwise we increment the left pointer.

arr = {};//some array
sortedArr = sort(arr);
left = 0;
right = sortedArr.length - 1;
while(left < right)
{
   x = sortedArr[left];
   y = sortedArr[right];
   sum = x + y;
   if(sum == T)
   {
      print "pair", x, y;
      left++;   //move both pointers past the found pair
      right--;
   }
   else if(sum > T)
      right--;
   else
      left++;
}
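In C, the same two-pointer scan can be sketched like this (the function names are mine; note that after a hit both pointers advance, so the loop always terminates):

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for ints, safe against overflow. */
static int cmp_int(const void *a, const void *b)
{
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
}

/* Sorts the array in place and returns the number of pairs summing to target. */
static int find_pairs_sorted(int a[], int n, int target)
{
        int left = 0, right = n - 1, pairs = 0;

        qsort(a, n, sizeof a[0], cmp_int);
        while (left < right) {
                int sum = a[left] + a[right];
                if (sum == target) {
                        pairs++;
                        left++;        /* advance both ends past the pair */
                        right--;
                } else if (sum > target) {
                        right--;
                } else {
                        left++;
                }
        }
        return pairs;
}
```

On the example array {5, 3, 7, 0, 1, 4, 2} with T = 5 this finds the three pairs (0, 5), (1, 4), and (2, 3).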

3rd and Best - Use hash table
This problem was already mentioned among the applications of hash tables here.
The best way is to insert every element into a hash table (without sorting). This takes O(n), since insertion is constant time.
Then for every x, we can just look up its complement, T-x, which is O(1).
Overall it will be O(n).

Here is how the pseudocode will look.
Let arr be the given array.
And T be the given sum

for (i = 0; i < arr.length; i++)
{
  hash(arr[i]) = i  // key is the element and value is its index.
}

for (i = 0; i < arr.length; i++)
{
  if (hash contains T - arr[i] && hash(T - arr[i]) != i) // if T - ele exists and is a different element, we found a pair
    print "pair i , hash(T - arr[i]) has sum T"
}
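As a concrete toy sketch of approach 3 in C, here is a version backed by a tiny open-addressing hash set (all names, sizes, and the sentinel are my own; it assumes distinct elements, fewer of them than TABLE_SIZE, and that the sentinel value never appears in the data):

```c
#include <assert.h>

#define TABLE_SIZE 64           /* power of two, larger than the input */
#define EMPTY 0x7fffffff        /* sentinel assumed absent from the data */

static int table[TABLE_SIZE];

/* Knuth-style multiplicative hash, reduced to a table slot. */
static unsigned slot(int key)
{
        return ((unsigned)key * 2654435761u) & (TABLE_SIZE - 1);
}

static void put(int key)
{
        unsigned i = slot(key);
        while (table[i] != EMPTY && table[i] != key)
                i = (i + 1) & (TABLE_SIZE - 1);   /* linear probing */
        table[i] = key;
}

static int contains(int key)
{
        unsigned i = slot(key);
        while (table[i] != EMPTY) {
                if (table[i] == key)
                        return 1;
                i = (i + 1) & (TABLE_SIZE - 1);
        }
        return 0;
}

/* O(n): insert everything, then look up each element's complement. */
static int count_pairs(const int a[], int n, int target)
{
        int i, pairs = 0;

        for (i = 0; i < TABLE_SIZE; i++)
                table[i] = EMPTY;
        for (i = 0; i < n; i++)
                put(a[i]);
        for (i = 0; i < n; i++)
                if (a[i] * 2 != target && contains(target - a[i]))
                        pairs++;
        return pairs / 2;       /* each pair was counted from both ends */
}
```

The `a[i] * 2 != target` guard stops an element from pairing with itself, and the final division by 2 compensates for each pair being found once from each end.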


Please let us know if you know any other approach. Thanks.

Printing "Hello World" to screen without using a single semicolon in C/C++

When I first encountered this question, I thought it was impossible. However, there are ways to do so, though arguably they exploit corners of C/C++. Anyway, here are the most common solutions:
int main()
{
    if (printf("Hello World \n")) { }
}



Or you can use the while loop like this:
int main()
{
    while (printf("Hello World") > 20) { }
}
If you don't think this code will work, try it out. It is weird, but works well in C/C++.
Notice: These don't work in Java. Not sure about other languages, so use them responsibly. After all, this is just for fun :)

Construct a full binary tree from a pre-order list of marked nodes



Question: construct a full binary tree from a list of nodes given in pre-order. The nodes are marked 'i' if they are inner nodes and 'l' if they are leaf nodes.
Example: given a list of nodes marked as (i, i, i, l, l, i, l, l, i, l, l), we need to construct a tree from those nodes so that the result looks like this:






Notice how all "l" nodes are leaf nodes and "i" nodes are inner nodes? Moreover, since the nodes are given in pre-order, we must take the node order into account when we put them into the tree. The first node in the list will always be the root node, because in pre-order the root comes first, then the left subtree, then the right subtree. Also, since this is a full binary tree, we know that every node except a leaf node must have exactly 2 children.
With the information provided by the question, here is the code to solve this problem:

#include<iostream>
#include<stack>
using namespace std;

struct Node
{
  Node* left;   // must be initialized to NULL before construction
  Node* right;  // must be initialized to NULL before construction
  char mark;
};

Node* constructFromPreorder(Node* preorder[], int treeSize)
{
  stack<Node*> stackOfNodes;

  Node* root = preorder[0]; //pointer to root node
  
  stackOfNodes.push(preorder[0]);

  int count = 1;
  
  while (count < treeSize && !stackOfNodes.empty())
  {
    if (stackOfNodes.top()->left == NULL || stackOfNodes.top()->right == NULL)
    {
      Node* currentNode = preorder[count];
      
      if (stackOfNodes.top()->left == NULL)
        stackOfNodes.top()->left = currentNode;
      else if (stackOfNodes.top()->right == NULL)
        stackOfNodes.top()->right = currentNode;

      if (currentNode->mark == 'i')
        stackOfNodes.push(currentNode);
        
      count++;
    }
    else
    {
      stackOfNodes.pop();
    }
  }

  return root;
}
 
 
 
Explanation: assuming that we have the Node struct like the one in the code, then the left pointer of a node points to the left child of that node while the right pointer points to the right child. Hence, we can now put those nodes in a tree by connecting parent nodes to child nodes.
  1. Declare a stack to keep track of inner nodes. Whenever we encounter an inner node, we'll push it into the stack. And, we'll pop it when it has two children. This makes sure that every inner node is connected to its two children.
  2. Add the root which is the first node in the list into the stack. Because we'll add and push the nodes until the stack is empty or until we run out of nodes, we need to add the root to initialize the stack.
  3. Set a count flag to 1. This count flag is used to exit the loop. When count reaches the total number of nodes in the list, it means we've run out of nodes to add into our tree. Moreover, we initialize the flag to 1 since we have already processed one node(root node) from the list in step 2.
  4. Loop until either we run out of nodes to add or until the stack is empty.
  5. For every loop:

    a) If the top of the stack is not connected to its two children yet, we get a node out of our list and attach it to the top node in the stack as that node's child. Then, we push that newly retrieved node into the stack if it is an inner node. We also increase the count flag by 1, so we can retrieve the next node in the list in the next loop.

    b) If the top of the stack has already been connected to two children, we pop it and do nothing.

  6. When the loop is done, we simply return the pointer to the root. Whichever function uses this pointer will have access to the entire tree that we have just constructed.
Performance: this algorithm has a time complexity of O(n), where n is the number of nodes in the list, because we loop through the list only once. In terms of space, it has a space complexity of O(m), where m is the depth of the tree we construct, because in the worst case the stack holds an inner node from every level along the current path.
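The fragment above shows only the core loop of the construction; for reference, here is a complete, self-contained sketch of the same stack-based idea in Java. The Node class with an 'i'/'l' mark and the method name build are our assumptions, mirroring the C++ fragment:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PreorderTreeBuilder {
    static class Node {
        int data;
        char mark;              // 'i' = inner node, 'l' = leaf
        Node left, right;
        Node(int data, char mark) { this.data = data; this.mark = mark; }
    }

    // Rebuilds a binary tree, in which every inner node has exactly two
    // children, from a preorder array of marked nodes.
    static Node build(Node[] preorder) {
        if (preorder.length == 0) return null;
        Node root = preorder[0];
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);                       // step 2: seed the stack with the root
        int count = 1;                          // step 3: the root is already consumed

        while (count < preorder.length && !stack.isEmpty()) {
            Node top = stack.peek();
            if (top.left == null || top.right == null) {
                Node current = preorder[count]; // next node in preorder order
                if (top.left == null) top.left = current;
                else top.right = current;
                if (current.mark == 'i')        // inner nodes wait for their children
                    stack.push(current);
                count++;
            } else {
                stack.pop();                    // top already has both children
            }
        }
        return root;
    }
}
```

For the tree whose preorder listing is inner(1), leaf(2), inner(3), leaf(4), leaf(5), build returns node 1 with children 2 and 3, where 3 has children 4 and 5.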
 

Converting integers less than 100 to words

This is a classic problem in which we're given an integer less than 100 and we have to output the words corresponding to the integer's value. For instance, if the input is 90, then the output is "ninety". Moreover, if the input is -99, then the output is "incorrect input".
To solve this problem, we should have arrays of unique names such as "one" and "thirteen", because every other integer's name is made up of those unique names. For example, "nine" and "ninety" are unique names, while "ninety nine" is just "ninety" followed by "nine". The trick is to arrange our arrays so that it is easy to access the names. Here is one solution to this problem:
string spellIntLessThanOneHundred(int number)
{
  if (number >= 100 || number < 0)
    return "Incorrect Input";
  
  string tensNames[] = {"", "ten", "twenty", 
"thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};
  
  string lessThanTwentyNames[] = {"zero", "one", 
"two", "three", "four", "five", "six", "seven", "eight", 
"nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", 
"sixteen", "seventeen", "eighteen", "nineteen"};

  string result;

  //numbers >= 20
  if ((number / 10) > 1)
  {
    result = tensNames[number / 10];

    number %= 10;
    
    if (number > 0)
      result = result + " " + lessThanTwentyNames[number];
  }
  //numbers < 20
  else
    result = lessThanTwentyNames[number];

  return result;
}



Notice that we have two separate arrays of unique names, one for the numbers less than twenty and one for the numbers divisible by 10 such as 20 and 30. With this arrangement, it is very easy to access a name.
  1. Our algorithm first checks for valid input, which must be less than 100 and greater than or equal to 0.
  2. Next, we look at the first digit. If dividing the input by 10 gives a number greater than one, the input must be greater than or equal to 20. Dividing the input by 10 extracts the first digit: for example, dividing 21 by 10 gives 2 (integer division). We then use the first digit to access its name in the tensNames[] array. So, if the first digit is 2, its name is twenty.
  3. After that, we look at the last digit. Taking the remainder of the input divided by 10 extracts the last digit. If it is greater than 0, we just need to get its name from the lessThanTwentyNames[] array. For example, 21 % 10 gives 1, so its name is one and the result becomes "twenty one". However, if the remainder equals 0, we add nothing to the result because the input is divisible by 10 and already has a unique name.
  4. If dividing the input by 10 gives a number less than or equal to 1, then the input is less than 20 and has a unique name, so we just look it up in the lessThanTwentyNames[] array.

Find all subsets of a given set OR Find power set of a given set

Problem
Write a method that returns all subsets of a set.
Example
If we're given a set of integers such that S = {1, 2, 3}, how can we find all of its subsets? For S, the set of all subsets, i.e. P(S), is {}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}.

Here {} is the empty set, denoted by Φ.

If S has n elements, then P(S) has 2^n elements. P(S) is also called the power set.

Solution
This can be solved in a couple of ways, for example using recursion or backtracking.

Method 1 - Using recursion
If we have to find the power set of {a, b, c}:
Recursively, P(abc) = {abc} + P(ab) + P(ac) + P(bc), and so on
Suppose we are given a set A = {1,2,3}, which is the originalSet. We want to get the outputSet.
Pseudocode
set powerSet(A){
  add set A to outputSet;
  for each item in A {
   let subset = A - item;
   add powerSet(subset) to outputSet;
  }
  return outputSet;      
}

Here is the code in Java:
public static <T> Set<Set<T>> powerSet(Set<T> originalSet) {
    Set<Set<T>> sets = new HashSet<Set<T>>();
    if (originalSet.isEmpty()) {
     sets.add(new HashSet<T>());
     return sets;
    }
    List<T> list = new ArrayList<T>(originalSet);
    T head = list.get(0);
    Set<T> rest = new HashSet<T>(list.subList(1, list.size())); 
    for (Set<T> set : powerSet(rest)) {
     Set<T> newSet = new HashSet<T>();
     newSet.add(head);
     newSet.addAll(set);
     sets.add(newSet);
     sets.add(set);
    }  
    return sets;
}

And a test, given your example input:
 Set<Integer> mySet = new HashSet<Integer>();
 mySet.add(1);
 mySet.add(2);
 mySet.add(3);
 for (Set<Integer> s : SetUtils.powerSet(mySet)) {
     System.out.println(s);
 }

Can we get rid of loops? Yes we can, here is the pseudocode:
powerset(originalSet) {
  add originalSet to set;
  add powerset2(originalSet,0) to set;
  return set
}

powerset2(originalSet, pos) {
  if pos < length(originalSet) then
    let subSet = originalSet excluding the element at pos
    add powerset(subSet) to set
    add powerset2(originalSet, pos+1) to set
  else 
    add {} to set
  endif
  return set
}

Here is the code in Java:
public static <T> Set<Set<T>> powerSet2(Set<T> originalSet) {
    Set<Set<T>> sets = new HashSet<Set<T>>();
    if (originalSet.isEmpty()) {
        sets.add(new HashSet<T>());
        return sets;
    }
    List<T> list = new ArrayList<T>(originalSet);
    T head = list.get(0);
    Set<T> rest = new HashSet<T>(list.subList(1, list.size()));
    addSets(sets, powerSet2(rest), head);
    return  sets;
}

private static <T> void addSets(Set<Set<T>> sets, Set<Set<T>> setsToAdd, T head) {
    Iterator<Set<T>> iterator = setsToAdd.iterator();
    if (iterator.hasNext()) {
        Set<T> set = iterator.next();
        iterator.remove();
        Set<T> newSet = new HashSet<T>();
        newSet.add(head);
        newSet.addAll(set);
        sets.add(newSet);
        sets.add(set);
        addSets(sets, setsToAdd, head);
    }
}

Time complexity - O(2^n)

Method 2 - Using bit strings

Consider the example
Set  = [a,b,c]
power set size = pow(2, 3) = 8
Run for binary counter = 000 to 111

Value of Counter          Subset
--------------------------------------
   000                      Empty set
   001                      c
   010                      b
   011                      bc
   100                      a
   101                      ac
   110                      ab
   111                      abc

To solve this problem there are two techniques that we need to understand:
  1. Determine the number of subsets using bit strings: the number of subsets equals 2 to the power of the number of elements in the superset: #subsets = 2 ^ #elements. For instance, if our superset is S = {1, 2, 3}, then it has 2^3 = 8 subsets; a superset with four elements has 2^4 = 16 subsets. Moreover, shifting the bit representation of 1 left by n gives 2^n. Thus, if we shift 1 left by the number of elements in the superset, we get the number of subsets. For example, if S = {1, 2, 3}, then S has 1 << 3 = 2^3 = 8 subsets.
  2. Keeping track of data using bit manipulation: for this problem, we will use a bit string to keep track of subsets and their elements. Take S = {1, 2, 3} as an example. If the bit string is 100 then the subset we're working on has only one element which is 1. If the bit string is 110 then the subset we're working on has two elements which are 1 and 2. We will use & and << bit operators to figure out which bit is 1 in the bit string.
If you are not familiar with bit manipulation please read "Bit manipulation tips & tricks" part 1 and part 2. Here is the code for the algorithm written in Java:


public static void findSubsets(int array[])
{
  int numOfSubsets = 1 << array.length; //2^n

  for (int i = 0; i < numOfSubsets; i++)
  {
    int pos = array.length - 1;
    int bitmask = i;

    System.out.print("{");
    while (bitmask > 0)
    {
      if ((bitmask & 1) == 1)
        System.out.print(array[pos] + ",");
      bitmask >>= 1;
      pos--;
    }
    System.out.println("}");
  }
}
Code explanation:
  1. First, we find out the number of subsets that the superset has by shifting the bit representation of 1 by the number of elements in the superset.
  2. Next, we just loop through each subset and generate its elements accordingly.
  3. Inside the loop, we use pos and bitmask to keep track of the element. Specifically, bitmask is the bit string that represents the elements in the current subset, and we use pos to retrieve the correct element from the superset.
  4. The while loop adds the correct elements to the subset. Note that bitmask & 1 equals 1 only when bitmask has a '1' bit at the last position. For example, bitmask = "001" or "011" makes bitmask & 1 equal 1. That's when we add an element to the subset. Why does it work? Well, on each iteration of the while loop, we shift bitmask one bit position to the right (bitmask >>= 1) and decrement pos by 1 (pos--). Thus, whenever there is a '1' bit at the last bit position, we know exactly which element to add to the subset. If you are confused, don't worry because we'll do an example!
  5. After finishing one subset, we go to a new line and continue processing the rest.
Dry running the above code
set is {1, 2, 3, 4, 5}
i from 0 to 31:
i = 00000 => print {}
i = 00001 => print {5} (the '1' bit is at the last position and pos starts at the last index, so array[4] = 5 is printed)
i = 00010 => print {4}
    00011 => print {5, 4}
    00100 => print {3}
    00101 => print {5, 3}
    00110 => print {4, 3}
    00111 => print {5, 4, 3}
        ...
    11111 => print {1, 2, 3, 4, 5}


Time complexity - O(n·2^n)

References
http://stackoverflow.com/questions/2779094/what-algorithm-can-calculate-the-power-set-of-a-given-set
http://stackoverflow.com/questions/1670862/obtaining-powerset-of-a-set-in-java
http://stackoverflow.com/questions/14224953/get-all-subsets-of-a-set
http://stackoverflow.com/questions/4640034/calculating-all-of-the-subsets-of-a-set-of-numbers
http://www.geeksforgeeks.org/power-set/

Turning a bit off without affecting other bits

Hello and welcome to the second part of "Bit Manipulation Tips & Tricks." If you haven't read the first part yet, here is the link for you. Read it! :)
1. Turning a bit off without affecting other bits
Turning a bit off means changing a 1 bit (on) to a 0 bit (off); if the bit is already off, it stays off. This can be accomplished by using the ~, << and & operators. Here is the general formula:

int turnBitOff (int bitString, int bitPosition)
{
  return bitString & ~(1 << bitPosition);
}
  • bitString: is the set of bits we want to change. For instance, we want to turn off a bit in the bit string 1110.
  • bitPosition: is the position of the bit in the bit string that we want to turn off. For instance, to turn off the 3rd bit of 1110, the bit position is 3.
  • 1 is just our help bit string. Its binary value is 0001.
We can now do an example to see how our formula works. To turn off the 3rd bit of the bit string 1100, we do this: 1100 & ~(0001 << 3) = 1100 & ~(1000) = 1100 & 0111 = 0100. It works! :)
2. Turning a bit on without affecting other bits
If we can turn a bit off, then can we turn a bit on? Of course we can. It's just a matter of replacing the & operator with the | (bitwise OR) operator and getting rid of the ~ operator. Here is the formula:
int turnBitOn (int bitString, int bitPosition)
{
  return bitString | (1 << bitPosition);
}
To convince ourselves that this works, let's do two examples. To turn on the 2nd bit of the bit string 1011, we do this: 1011 | (0001 << 2) = 1011 | 0100 = 1111. It works! How about turning a bit on even if it's already on? To turn on the 3rd bit of the bit string 1111 (which is already on), we follow the formula: 1111 | (0001 << 3) = 1111 | 1000 = 1111. As you can see, nothing changed because the bit was already on.
3. Keeping track of boolean values of a data set using bits
Let's say we have an array of pictures and we want to keep track of them to see which one is in stock and which one is out of stock. How do we keep track of those pictures with least amount of memory? We can use a bit string.
If we have an array of 4 pictures to keep track of, we will have a bit string of 4 bits, 0000. Whichever picture is in stock will have its corresponding bit on. For instance, if picture no. 0 is in stock then the 0th bit is on, giving the bit string 0001. If picture no. 3 is in stock then the 3rd bit is on, giving the bit string 1000. If all pictures are in stock then all bits are on -> 1111.
We use the picture selling business as an example for convenience, but this technique can be applied to many different situations. For instance, we can use it to keep track of repeating letters in a word instead of using a hash table.
To make this whole scheme work, we employ the 1st and 2nd techniques above. We also need a way to check whether a bit is on or off, to find out whether a picture is in stock or not. For that we use a technique discussed in the first part of this tutorial: checking if a bit is on or off. If you don't remember, here is the formula for that technique:
int isAvailable (int bitString, int targetNum)
{
  return bitString & (1 << targetNum);
}
Let's go back to our example. We have the bit string of 0001 that represents our stock. And, we want to know if picture no. 3 is in stock or not. By applying the formula, we can see that:
  • 1 = 0001 is our helper string and targetNum = 3 because we need to know about picture no. 3
  • bitString & (1 << targetNum) = 0001 & (0001 << 3) = 0001 & 1000 = 0000 = 0 (in decimal)
  • a return value of 0 means not in stock, while any number > 0 means in stock :)
If the picture is in stock again, we just have to turn the bit on to 1 by using the "turning on a bit" technique. Moreover, to indicate a picture is out of stock, we use the "turning off a bit" technique.
As you can see, if we use this method to keep track of a large quantity of data, we can save a lot of memory, because bits take the least amount of memory possible! Besides, computers are fast at executing bit operations.
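Putting the three techniques together, here is a small runnable sketch in Java. The class and method names follow the formulas above but are otherwise our own choice:

```java
class BitTricks {
    // Turns the bit at bitPosition off; all other bits are untouched.
    static int turnBitOff(int bits, int bitPosition) {
        return bits & ~(1 << bitPosition);
    }

    // Turns the bit at bitPosition on; note the bitwise |, not the logical ||.
    static int turnBitOn(int bits, int bitPosition) {
        return bits | (1 << bitPosition);
    }

    // A non-zero result means the bit is on (the picture is in stock).
    static int isAvailable(int bits, int bitPosition) {
        return bits & (1 << bitPosition);
    }
}
```

For example, BitTricks.turnBitOff(0b1100, 3) gives 0b0100, and BitTricks.isAvailable(0b0001, 3) gives 0, matching the worked examples in the text.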

Thursday, January 5, 2012

Removing a loop in a singly linked list

Problem:
Remove a loop from a singly linked list if the list has a loop.

Example
For example, 1->2->3->4->1 (4 points to the head node 1) should become 1->2->3->4 while 1->2->3->4 should stay the same 1->2->3->4.

Solution 
 The problem has 3 steps :
  1. Detect if there is a loop in the list (already solved here)
  2. Identify the start of the loop (already discussed here)
  3. Delete the loop, which we will discuss here. Please go through 1 and 2.

Once you know where the loop starts, it's easy to identify the last element in the list, since it is the element inside the loop whose next pointer points back to the start of the loop. It is then trivial to set this element's next pointer/reference to null to correct the cyclic linked list (not a circular linked list, which is the specific case where the last element points back to the first).
  1. Floyd's cycle detection algorithm, also called the tortoise and hare algorithm because it uses two pointers/references that move at different speeds, is one way of detecting the cycle. If there is a cycle, the two pointers (say p1 and p2) will end up pointing to the same element after a finite number of steps. Interestingly, the element at which they meet is the same distance from the start of the loop (continuing to traverse the loop in the forward direction) as the start of the loop is from the head of the list. That is, if the linear part of the list has k elements, the two pointers meet inside the loop of length m at a point m - k from the start of the loop, i.e. k elements before the loop comes back around to its start. And that gives us a way to find the start of the loop:
  2. Once a cycle has been detected, let p2 remain pointing to the element where the previous step terminated, but reset p1 so that it points back to the head of the list. Now move each pointer one element at a time. Since p2 began inside the loop, it will keep looping. After k steps (equal to the distance of the start of the loop from the head of the list), p1 and p2 will meet again. This gives you a reference to the start of the loop.
  3. It is now easy to set p1 (or p2) to point to the element starting the loop and traverse the loop until p1 ends up pointing back to the starting element. At this point p1 references the 'last' element of the list, and its next pointer can be set to null.


Here is the C++ implementation of the algorithm:
struct Node
{
  int data;
  Node* next;
};

void removeLoop(Node* head)
{
  if (head == NULL)
    return;

  Node* slow = head;
  Node* fast = head;
  Node* last = NULL;
  
  bool hasLoop = false;

  while (fast->next != NULL && fast->next->next != NULL) 
  {
    last = slow;
    slow = slow->next;
    fast = fast->next->next;
    
    if (slow == fast)
    {
      hasLoop = true;
      break;
    }
  }

  
  if (hasLoop)
  {
    slow = head;

    while (slow->next != fast->next)
    {
      slow = slow->next;
      fast = fast->next;
    }
    
    if (slow == head && fast == head )
      last->next = NULL;
    else
      fast->next = NULL;
  }
}

Explanation: our method takes only a reference to the list's head node as an argument. The first while loop determines whether the list has a loop, and the second while loop points the loop's end node to null, breaking the loop:
  1. If the list is empty (head points to NULL), there is nothing to do so return.
  2. Next, we need three different pointers. slow will go through the list one node after another. fast moves through the list two nodes at a time, so it goes twice as fast as the slow pointer. And, last refers to the previous node visited by the slow pointer.
  3. We just have to loop through the list until fast reaches NULL, because if the linked list doesn't have a loop, fast will reach NULL eventually. Moreover, if the list does have a loop, we'll break out of the while loop as soon as we detect it.
  4. As the while loop runs, we check whether the slow and fast pointers point to the same node. When they do, the list has a loop. This is a very common technique for detecting a loop in a linked list; here is a post about that technique if you are not familiar with it.
  5. As soon as we know that the list has a loop, we set the flag hasLoop to true and break out of the while loop immediately, because fast will never reach NULL and the while loop would never end.
  6. If the list has a loop, as indicated by the flag hasLoop, we enter the second while loop to remove it.

    a) First, we set the slow pointer back to the list's head node.

    b) Then, we move both the slow and fast pointer one node at a time. They will eventually meet each other because the list still has a loop.

    c) We stop the loop right before the pointers meet. That's how we get the reference to the end node of the loop:

    d) If the slow and fast pointers both point to the head of the list, then the list is circular, meaning the end node points back to the head node (i.e. 1->2->3->4->1). In that case we get the reference to the end node through the last pointer: last always trails slow by one node in the first while loop, so when slow meets fast at the head, last must be pointing at the end node.

    e) On the other hand, if the end node points to a node other than the head (i.e. 1->2->3->4->2), then fast stops at the end node of the loop (the node just before the loop's start, coming around the loop). Thus, we use the fast pointer instead of the last pointer to break the loop.

This algorithm takes O(n) time complexity and O(1) space complexity. It's very efficient!!

Finding an integer that repeats odd number of times in an array of positive integers

Question: In an array of positive integers, every integer repeats an even number of times except one, which repeats an odd number of times. Can you find that integer in O(n) time complexity?

Solutions
Answer: in order to solve this problem in O(n) time, we use bitwise manipulation. Since there is only one integer that repeats an odd number of times, we can use the XOR operator to find it. A number XORed with itself becomes 0; thus, a number that appears an even number of times yields a result of 0.

For example, given the array {2, 3, 2, 3}, both 2 and 3 repeat two times (even). Thus, if we XOR all of them together we get 0 as the result. However, if there is an odd-repeated number, the result will be the value of that number!

Code in Java
public static int getIntOddlyOccured(int[] inputArr)
  {
    int oddNum = inputArr[0];
   
    for(int i = 1; i < inputArr.length; i++)
      oddNum = oddNum ^ inputArr[i];
   
    return oddNum;
  }

C implementation
int getOddOccurringNumber(int arr[], int arr_size)
{
     int i;
     int result= 0; 
     for (i=0; i < arr_size; i++)     
        result= result^ arr[i];
      
     return result;
}

Explanation: our method takes an integer array as its argument. It assumes that there is one and only one odd-occurring number (a condition given by the question), so it returns that number and does no validation to check whether the input in fact has exactly one odd-repeated number. In the body, the method loops through the array and XORs all the elements together; the result is the oddly repeated number.

The caret '^' sign means XOR (exclusive OR). Our algorithm works because the XOR of two equal numbers is 0. For example, 2 XOR 2 = 0 because 0010 XOR 0010 = 0000. Thus, all the even-repeating numbers cancel out to 0, while the odd-repeating number yields itself as the result (because x XOR 0 = x).

Example: let's do an example with this array {1, 4, 3, 4, 1}. The method first initializes the result oddNum to 1 and then does the for loop:
  1. First iteration: oddNum = 1(0001) ^ 4(0100) = 5(0101) and i = 1
  2. Second iteration: oddNum = 5(0101) ^ 3(0011) = 6(0110) and i = 2
  3. Third iteration: oddNum = 6(0110) ^ 4(0100) = 2(0010) and i = 3
  4. Fourth iteration: oddNum = 2(0010) ^ 1(0001) = 3(0011) and i = 4
  5. Loop ends because i = 5, so we return oddNum = 3 which is the oddly repeated number in the array
This algorithm takes O(n) time complexity because it loops through the array only once. The space complexity is O(1) because we only need an additional integer for storage. Very efficient! That's all we have for now.

Approach 2 - Using the hash table
For example, we can use a hash table to maintain a count of repetitions for each number in the array, and then look for the number with an odd count. But what if someone asks us not to use any data structure / additional storage?
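The hash-table variant just described can be sketched as follows (Java; the class and method names are ours for illustration):

```java
import java.util.HashMap;
import java.util.Map;

class OddOccurrence {
    // O(n) time, but O(n) extra space, unlike the XOR trick.
    static int findOddOccurring(int[] arr) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int x : arr)                          // count each number's occurrences
            counts.merge(x, 1, Integer::sum);
        for (Map.Entry<Integer, Integer> e : counts.entrySet())
            if (e.getValue() % 2 != 0)             // the one with an odd count
                return e.getKey();
        throw new IllegalArgumentException("no odd-occurring number");
    }
}
```

On the earlier example {1, 4, 3, 4, 1}, this returns 3, the same answer as the XOR approach, at the cost of extra storage.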

Thanks. Please let me know if you have a better solution.

Maximum continuous sum subarray problem

Problem

You are given an array of integers (both positive and negative). Find the continuous sequence with the largest sum. Return the sum.
EXAMPLE
Input: {2, -8, 3, -2, 4, -10}
Output: 5 (i.e., {3, -2, 4})

Solution 

Method 1 - Brute force
The outer loop picks the beginning element, the inner loop finds the maximum possible sum starting with the element picked by the outer loop, and we compare each such sum with the overall maximum. Finally, return the overall maximum. The time complexity of this naive method is O(n^2).
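A direct rendering of this brute-force idea (Java; the class and method names are ours):

```java
class MaxSubarrayBruteForce {
    // Outer loop picks the start; inner loop extends the sum to every
    // possible end index. O(n^2) time, O(1) space.
    static int maxSubArraySum(int[] arr) {
        int best = arr[0];
        for (int start = 0; start < arr.length; start++) {
            int sum = 0;
            for (int end = start; end < arr.length; end++) {
                sum += arr[end];                // running sum of arr[start..end]
                best = Math.max(best, sum);     // compare with the overall maximum
            }
        }
        return best;
    }
}
```

On the example input {2, -8, 3, -2, 4, -10}, this returns 5, the sum of {3, -2, 4}.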

Method 2 - Divide and conquer
1) Divide the given array in two halves
2) Return the maximum of the following three:
  a) Maximum subarray sum in the left half (make a recursive call)
  b) Maximum subarray sum in the right half (make a recursive call)
  c) Maximum subarray sum such that the subarray crosses the midpoint
Steps 2.a and 2.b are simple recursive calls. How do we find the maximum subarray sum such that the subarray crosses the midpoint? We can find the crossing sum in linear time: find the maximum sum starting from the midpoint and ending at some point on the left of mid, then find the maximum sum starting from mid + 1 and ending at some point on the right of mid + 1. Finally, combine the two and return.

int maxCrossingSum(int arr[], int l, int m, int h)
{
    // Include elements on left of mid.
    int sum = 0;
    int leftSum = Integer.MIN_VALUE;
    for (int i = m; i >= l; i--)
    {
        sum = sum + arr[i];
        if (sum > leftSum)
          leftSum = sum;
    }
 
    // Include elements on right of mid
    sum = 0;
    int rightSum = Integer.MIN_VALUE;
    for (int i = m+1; i <= h; i++)
    {
        sum = sum + arr[i];
        if (sum > rightSum)
          rightSum = sum;
    }
 
    // Return sum of elements on left and right of mid
    return leftSum + rightSum;
}
 
// Returns the sum of the maximum sum subarray in arr[l..h]
int maxSubArraySum(int arr[], int l, int h)
{
   // Base Case: Only one element
   if (l == h)
     return arr[l];
 
   // Find middle point
   int m = (l + h)/2;
 
   /* Return maximum of following three possible cases
      a) Maximum subarray sum in left half
      b) Maximum subarray sum in right half
      c) Maximum subarray sum such that the subarray crosses the midpoint */
   return Math.max(Math.max(maxSubArraySum(arr, l, m),
                            maxSubArraySum(arr, m+1, h)),
                   maxCrossingSum(arr, l, m, h));
}

Time Complexity: O(n log n)
maxSubArraySum() is a recursive method and its time complexity can be expressed with the following recurrence relation:
T(n) = 2T(n/2) + Θ(n)
This recurrence is the same as Merge Sort's and can be solved using either the recurrence tree method or the Master method. It falls in case II of the Master method, and the solution of the recurrence is Θ(n log n).

Method 3 - Dynamic Programming using Kadane's algorithm (Best)
This problem can be solved by using a modified version of Kadane's Algorithm. The strategy is to calculate the sum of each subarray and keep track of the sum, the start index and the end index of the subarray that gives the largest sum. Moreover, we calculate the sum of a subarray by adding the sum of the previous subarray with an additional element. For example, to calculate the sum of {5, 2, -1}, we add the sum of {5, 2}, which is 7, to the value of the next element, which is -1.

Kadane's algorithm/pseudocode
Initialize:
    maxSoFar = 0
    maxEndingHere = 0

Loop for each element of the array
  (a) maxEndingHere = maxEndingHere + a[i]
  (b) if(maxEndingHere < 0)
            maxEndingHere = 0
  (c) if(maxSoFar < maxEndingHere)
            maxSoFar = maxEndingHere
return maxSoFar
In simpler terms, iterate over the array with 2 checks:
  • if a greater intermediate sum is found, store its constituents as the temporary result;
  • independently, if the sum goes negative, reset it to 0 and start building a new sequence from the next scan position.
Code in Java
public static void findMaxSumSequence (int inputArray[])
{
    if (inputArray == null || inputArray.length == 0)
        throw new IllegalArgumentException("Array is empty");

    int size = inputArray.length;

    int maxSum = inputArray[0];
    int maxStartIndex = 0;
    int maxEndIndex = 0;

    int curSum = inputArray[0];
    int curStartIndex = 0;


    for (int i = 1; i < size; i++)
    {
        if (curSum < 0)
        {
            curSum = 0;
            curStartIndex = i;
        }

        curSum = curSum + inputArray[i];

        if (curSum > maxSum)
        {
            maxSum = curSum;
            maxStartIndex = curStartIndex;
            maxEndIndex = i;
        }
    } 

    out.println( "Start index: " + maxStartIndex + " End index: " 
            + maxEndIndex + " Max sum: " + maxSum );
}

Time Complexity: O(n)

Explanation
This method accepts an array as its argument. It prints out the start index, end index and the sum of the subarray that yields the max sum. Here are the main steps:
  1. First, check whether the array is null or empty. If so, we throw an exception because there is nothing to do.
  2. Then, initialize variables: maxSum is the largest sum found. maxStartIndex and maxEndIndex are respectively the start index and end index of the subarray that yields the max sum. curSum is the sum of the current subarray that we're examining. curStartIndex is the start index of the current subarray we're checking.
  3. Next, we loop through the array and start calculating the sum of subarrays one after another:

    If the sum of the current subarray is less than 0, we reset curSum to 0. Why? If the previous subarray's sum is negative, adding it to the next element can only decrease the next subarray's sum. For example, if the previous subarray's sum is -2 and the next element is 3, it's better to reset the sum to 0 and add 3 to 0 than to add -2 to 3.

    curSum = curSum + inputArr[i] calculates the sum of the current subarray by adding the sum of the previous subarray with the next value in the array.

    After that, if the sum of the current subarray is greater than the max sum then we replace the max sum with the sum of the current subarray. We also change the maxStartIndex to the start index of the current subarray and the maxEndIndex to the current index i.

    When the loop ends, maxSum will contain the largest sum found. maxEndIndex and maxStartIndex contain respectively the end and start index of the subarray that gives the largest sum. Thus, we just have to print out those values.
If you have any comment or another solution to this problem, please post it in the comments below. I would love to learn about it. Thanks for reading and until the next post!

References

Andersson Trees

http://eternallyconfuzzled.com/tuts/datastructures/jsw_tut_andersson.aspx

Sorted Linear Hash Table

http://www.concentric.net/~ttwang/tech/sorthash.htm

Search trees