Friday, October 28, 2011

Find 4 Elements in An Array that Sum to A Target Value

Problem: Given an array and a target value, find 4 elements in the array that sum to this target value.

Solution: Here introduce an O(n^2) algorithm. Only briefly explain the idea:

  • Calculate the sums of all pairs in the array and form a new array. Each element in the new array is a struct that stores the sum and the two elements who make the sum. Sort the array by the sum. The cost is  O(n^2) till now (may need radix sort).
  • The problem is converted to find 2 elements in the new array that sum to the target value. The cost for this operation is the length of the new array, which is also O(n^2). So total cost is O(n^2).

Some Problem Solved by Suffix Tree

Problem 1: Find the longest common substring among k strings.

Solution: If we use DP, it takes O(n1*n2*...nk). With suffix tree, it takes O(n1+n2+...nk). Basically, we build the suffix tree for n1 first, then add n2 to this tree, then n3... All the strings share the same generalized suffix tree. We need to track if each node (path) is shared by all the string or not. When the building is done, the longest common substring will be found.


Problem 2: Find the longest repeated substring in a string.

Solution: First, build the suffix tree for this string. Then find the deepest internal node, from the root to that node is the substring we are looking for. The "deepest" is decided by the number of characters.

Thursday, October 27, 2011

Serialize a Binary Tree

We know with pre-order and mid-order we can decide a binary tree. Besides, tree have array representation. However, to have better space efficiency, we can just store a binary tree according to its pre-order, but with null node also stored!

Find Two Substrings with Longest Distance from Two Strings

Problem: Given two strings s1 and s2, s1' and s2' are two substrings of s1 and s2, respectively. Find the s1' and s2' with longest distance. Distance is defined as \sum_i| s1' [i] -s2' [i]|.

Solution: Obviously,  s1' and s2' should be of equal length. Here give a O(n*m) algorithm where n and m are the length of  s1 and s2,  respectively.

  • Calculate c[i][j] = |s1[i]-s2[j]|, then we have a matrix.
  • Swipe the matrix with a diagonal line. For example, s1 = "35645",  s2 ="2475", we have matrix c as follow:
    1 1 4 2
    3 1 2 0
    4 2 1 1
    2 0 3 1
    3 1 2 0
  • First we scan "2", then "4 0",  then "1 2 1". You calculate the sum of the numbers in the line. Later you will find the maximum is "3 2 3 0" or "3 2 3", which stands for the distance of "5645" and "2475".

Find the Kth (smallest) Pair from the Elements in Two Sorted Arrays

Problem: Given two sorted array, by choosing one element from each array we have a pair. A pair can be ranked by the sum of the two numbers within it. Find the Kth smallest pair.

Solution: Naive approach takes O(n*m). Here we can borrow the idea that is used to merge multiple arrays. Basically we need a heap or a priority-queue. The main idea is as follows:

  • Put the first pair (which is definitely the smallest) {a[0], b[0]} in the heap as bootstrapping.
  • Loop over the heap, if heap is not empty, pop the top of the heap {a[i], b[j]}. If i<n-1, push  {a[i+1], b[j]};  If j<m-1, push  {a[i], b[j+1]}.
  • pop k times we will get the kth smallest pair. The complexity is  O(k*log(n+m)).

Tuesday, October 25, 2011

Volume of Water Held by a Histogram

Problem: Pour water on a histogram, calculate the water volume the histogram can hold.

Solution: This is a relatively simple DP problem. Here we only give the main idea.
  • For a particular bar bi, if we know the highest bar on its left Li and highest bar on its right Ri.  If the height of bi is smaller than both Li and Ri, the water volume can be held on this bar is min(Li, Ri) - hi; otherwise, it can't hold water.
  • To calculate Li and Ri, we just need to record the maximum height we had observed so far from the left (and from the right). Therefore, a O(n) algorithm is straightforward here.

Maximum Rectangle Area within a Histogram

Problem: Given a histogram, find the maximum rectangle area within it.

Solution:  It is easy to find a O(n^2) algorithm by comparing the height of a bar with the rest bars. Here we give a O(n) algorithm.
  • For each bar bi, we need to know the number of adjacent bars on the left Li and on the right Ri that are higher than it. Then the maximum rectangle within the histogram with the height hi will be  (Li+Ri+1)*hi. Then we just need to select the largest one.
  • When deciding the number of adjacent bars on the left of bi, the key observation is that we don't need to inspect every bar on its left. The key here is to use a stack to track the bars that had been inspected in a smart way. The bars in the stack are in ascending order (from base to top). Besides, before pushing new bar, we need to pop the bars in the stacks that are higher than it. Then we can achieve calculate Li in O(n). Calculating Ri will be the same.
The following only shows how to calculate Li:
//bar[] represents the height of each bar in the histogram
//left[] stores the number of adjacent bars that are taller than bi 
//this stack is used to track inspected bars
stack s; 

for(int i=0; i<N; i++)
{

  int num = 0;
  int idx;
  while(!s.empty())
  {
    idx = s.top();
    if(bar[idx]>bar[i])
    {
       num++;
       s.pop();
       left[i]+= left[idx];  
    }
    else break;
  }

  s.push(i);
  left[i]= num ? num + left[i]:0;

}

Monday, October 24, 2011

Find the Convex Hull

Problem: a series of points are on plain (X>0 and Y>0), find the convex hull of these points, a.k.a, the points which can form a hull that include all the other points.

Solution: Here only main idea is given.Basically we can use Graham scan.

  • Find the point p0 with the smallest y coordinate; if multiple exist, choose the one with the smallest x coordinate, this point must be one of the vertices on the convex hull.
  • Sort the other points based on the angle of p0->pi
  • Put p0, p1 and p2 in a stack (we need at least three to form a hull), then we inspect the rest points based on the order. Assume px is the stack top and py is the one next to the top, if py->pi is at the right side of py->px, pop px. We apply the same check to the new px and py until the previous condition doesn't hold. Then we push pi if previous there are any pop operations. If before push pi, there are only two elements in stack, we can push pi directly. 
  • After we iterate all the remaining points, the points on the stack are those that form the convex hull. The complexity is O(nlogn), primarily the sorting cost.
More: Alternatively, we can use Jarvis’s march which is asymptotically faster than Graham’s scan. The time complexity is O(nh) where h is the number of the vertices on the hull. The main idea is as follow:
  • Find the highest point p_high and lowest point p_low among all the points, which takes O(n).
  • Start from p_low, find the next point that has the smallest polar angle (using +x axis) with respect to p_low. Assume this point is p', then find the next point that has the smallest polar angle with respect to p'. Repeat such process until we find p_high. Up to now, we had found the left chain of the hull.
  • Then proceed to find the right chain of the hull. It is similar to finding the left chain. We start from p_high and stop when we reach p_low. The only difference is that when we calculate the polar angle, we use -x axis instead of +x axis.

Output a set's all sub sets

Problem: Given a set, output all its subsets. For example, for set (a,b,c), we need to output (), (a), (b), (c), (a,b), (a, c), (b,c), (a,b,c).

Solution: iterative approach is not difficult. Here give a recursive approach:
void _print(const vector<int>& set)
{
    i; i<< set[i] << " ";
    cout << endl;
}


void print_set(const vector <int>&set, int i, vector<int>& output)
{
   if(i >= set.size()) _print(output);
   else {
     output.push_back(set[i]);
     print_set(set, i+1, output);
   // code to handle duplicates
     int tmp = output.back();
     output.pop_back();
     
     if(output.size()>0 && tmp== output.back()) return;
          
     print_set(set, i+1, output);
   }
}

int main()
{
    vector<int> set,output;
    set.push_back(1);
    set.push_back(2);
    set.push_back(3);
    set.push_back(4);

    print_set(set, 0, output);
}


Why we need virtirtual destructor?

When you release the resource of an class object, usually you leverage the object's destructor. Assume we have two class here: base Class A and derived Class B, for the following code:
class A{
  public:
    ~A(){}
};

class B: public A{
   public:
    ~B(){}
};

B *b = new B();
delete b;
First B's destructor will be called, followed by A's. On the other side, if we have the following code:
class A{
  public:
    ~A(){}
};

class B: public A{
   public:
    ~B(){}
};

A *a = new B();
delete a;
Only A's destructor will be called. In order to also call B's destructor in the second code example, the ~A() should be declared as virtual.