Monday, December 15, 2014

Data scientist / Machine Learning Engineer 相关面试题(转)


去年我找工作的时候发现板上针对data scientist,machine learning engineer面试
题总结很少,所以尽量申请了很多公司面试相关职位,想看看行业里这个方向都在问什
么。有幸去过不少地方面试,现在把那些题目整理整理(全部来自Amazon, Microsoft,
Yelp, Pinterest,
Square, Google, Glassdoor, Groupon的电面和onsite),希望能帮助在找相关工作的
同学们。

题目写的简略,请大家见谅

====================


1. Given a coin you don’t know it’s fair or unfair. Throw it 6 times and
get 1 tail and 5 head. Determine whether it’s fair or not. What’s your
confidence value?

2. Given Amazon data, how to predict which users are going to be top
shoppers in this holiday season.

3. Which regression methods are you familiar? How to evaluate regression
result?

4. Write down the formula for logistic regression. How to determine the
coefficients given the data?

5. How do you evaluate regression?
For example, in this particular case:
item click-through-rate  predicted rate
1       0.04        0.06
2       0.68        0.78
3       0.27        0.19
4       0.52        0.57


6. What’s the formula for SVM? What is decision boundary?

7. A field with unknown number of rabbits. Catch 100 rabbits and put a label
on each of them. A few days later, catch 300 rabbits and found 60 with
labels. Estimate how many rabbits are there? 

8. Given 10 coins with 1 unfair coin and 9 fair coins. The unfair coin has &
#8532; prob. to be head. Now random select 1 coin and throw it 3 times. You
observe head, head, tail. What’s the probability that the selected coin is
the unfair one?

9. What’s the formula for Naive Bayesian classifier? What’s the assumption
in the formula? What kind of data is Naive Bayesian good at? What is not?

10. What is the real distribution of click-through rate of items? If you
want to build a predictor/classifier for this data, how do you do it? How do
you divide the data?

11. You have a stream of data coming in, in the format as the following:
item_id, views, clicks, time
1            100     10         2013-11-28
1            1000   350       2013-11-29
1            200     14         2013-11-30
2            127     13         2013-12-1


The same id are consecutive.

Click through rate = clicks / views.
On every day, I want to output the item id when its click through rate is
larger than a given threshold.
For example, at day 1, item 1’s rate is 10/100=10%, day2, its (10+350)/(100
+1000)=0.32. day3 it is (10+350+14)/(100+1000+200)=0.28.
If my threshold is 0.3, then at day 1, I don’t output. On day2 I output. On
day3, I don’t output.

11. Given a dictionary and a string. Write a function, if every word is in
the dictionary return true, otherwise return false.

12. Generate all the permutation of a string.
For example, abc, acb, cba, …

13. We want to add a new feature to our product. How to determine if people
like it?
A/B testing. How to do A/B testing? How many ways? pros and cons?

14. 44.3% vs 47.2% is it significant? 

15. Design a function to calculate people’s interest to a place against the
distance to the place.

16. How to encourage people to write more reviews on Yelp? How to determine
who are likely to write reviews? How to increase the registration rate of
Yelp? What features to add for a better Yelp app? We are expanding to other
countries. Which country we should enter first?

17. What’s the difference between classification and regression?

18. Can you explain how decision tree works? How to build a decision tree
from data?

19. What is regularization in regression? Why do regularization? How to do
regularization?

20. What is gradient descent? stochastic gradient descent?

21. We have a database of <product_id, name, description, price>. When user
inputs a product name, how to return results fast?

22. If user gives a budget value, how to find the most expensive product
under budget? Assume the data fits in memory. What data structure, or
algorithm you use to find the product quickly? Write the program for it.

23. Given yelp data, how to find top 10 restaurants in America?

24. Given a large file that we don’t know how many lines are there. It
doesn’t fit into memory. We want to sample K lines from the file uniformly.
Write a program for it.

25. How to determine if one advertisement is performing better than the
other?

26. How to evaluate classification result? What if the results are in
probability mode?
If I want to build a classifier, but the data is very unbalanced. I have a
few positive samples but a lot of negative samples. What should I do?

27. Given a lot of data, I want to random sample 1% of them. How to do it
efficiently?

28. When a new user signs up Pinterest, we want to know its interests. We
decide to show the user a few pins, 2 pins at a time. Let the user choose
which pin s/he likes. After the user clicks on one of the 2, we select
another 2 pins.
Question: how to design the system and select the pins so that we can
achieve our goal?

29. Write a function to compute sqrt(X). Write a function to compute pow(x,
n) [square root and power)


30. Given a matrix
a b c  d
e f  g  h
i  j  k   l
Print it in this order:
a  f  k
b g l
c h
d
e j
i

31. Given a matrix and an array of words, find if the words are in the
matrix. You can search the

matrix in all directions:  from left to right, right to left, up to down,
down to up, or diagonally.
For example
w o r x b
h  e l  o v
i   n d e m

then the word “world” is in the matrix.


32. Given a coordinates, and two points A and B. How many ways to go from A
to B? You can only move up or right.
For example, from (1, 1) to (5, 7), one possible way is 1,1 -> 2, 1… 5, 1 -
> 5,2 -> ..5, 7


33. In a city where there are only vertical and horizontal streets. There
are people on the cross point. These people want to meet. Please find a
cross point to minimize the cost for all the people to move.

34. Design a job search ranking algorithm on glassdoor

35. How to identify review spam?

36. Glassdoor has this kind of data about a job : (position, company,
location, salary). For example (Software Engineer, Microsoft, Seattle, $125K
). For some records, all four entires are available. But for others, the
salary is missing. Design a way to estimate salary for those records.

37. When to send emails to users in a day can get maximum click through rate?

38. Youtube has video play log like this:
Video ID, time
vid1        t1
vid2        t2
...           ...
The log is super large.
Find out the top 10 played videos on youtube in a given week.

39. Write a program to copy a graph

40. A bank has this access log:
IP address, time
ip1      t1
ip2      t2
...        ...

If one ip accessed K times within m seconds, it may be an attack.
Given the log, identify all IPs that may cause attack.

Tuesday, December 2, 2014

Singleton Design Pattern – An Introspection w/ Best Practices

Singleton is a part of Gang of Four design pattern and it is categorized under creational design patterns.

In this article we are going to take a deeper look into the usage of the Singleton pattern. It is one of the most simple design pattern in terms of the modelling but on the other hand this is one of the most controversial pattern in terms of complexity of usage.

In Java the Singleton pattern will ensure that there is only one instance of a class is created in the Java Virtual Machine. It is used to provide global point of access to the object. In terms of practical use Singleton patterns are used in logging, caches, thread pools, configuration settings, device driver objects. Design pattern is often used in conjunction with Factory design pattern. This pattern is also used in Service Locator JEE pattern.

Structure:
Singleton Class Diagram Singleton Class Diagram
  • Static member : This contains the instance of the singleton class.
  • Private constructor : This will prevent anybody else to instantiate the Singleton class.
  • Static public method : This provides the global point of access to the Singleton object and returns the instance to the client calling class.

Implementation Example: Lazy initialization

Let us look into a singleton implementation example in Java. The below code uses Lazy initialization process.

public class SingletonExample {
 
    // Static member holds only one instance of the
    // SingletonExample class
    private static SingletonExample singletonInstance;
 
    // SingletonExample prevents any other class from instantiating
    private SingletonExample() {
    }
 
    // Providing Global point of access
    public static SingletonExample getSingletonInstance() {
        if (null == singletonInstance) {
            singletonInstance = new SingletonExample();
        }
        return singletonInstance;
    }
 
    public void printSingleton(){
        System.out.println("Inside print Singleton");
    }
}
Singleton Pattern Code Explanation Singleton Pattern Code Explanation
When this class will be called from the client side using SingletonExample.getSingletonInstance().printSingleton(); then at the first time only an instance will be created. During second time onwards for all subsequent calls we will be referring to the same object and the getSingletonInstance() method returns the same instance of the SingletonExample class which was created during the first time. You can test this by adding a print statement the following code as:
public static SingletonExample getSingletonInstance() {
        if (null == singletonInstance) {
            singletonInstance = new SingletonExample();
            System.out.println("Creating new instance");
        }
        return singletonInstance;
    }
If we now call Singleton class from the client class as:
SingletonExample.getSingletonInstance().printSingleton();
SingletonExample.getSingletonInstance().printSingleton();
SingletonExample.getSingletonInstance().printSingleton();
SingletonExample.getSingletonInstance().printSingleton();
The output of the above call is:
Creating new instance
Inside print Singleton
Inside print Singleton
Inside print Singleton
Inside print Singleton

Implementation Example: Double check locking

The above code works absolutely fine in a single threaded environment and processes the result faster because of lazy initialization. However the above code might create some abrupt behaviour in the results in a multithreaded environment as in this situation multiple threads can possibly create multiple instance of the same SingletonExample class if they try to access the getSingletonInstance() method at the same time. Imagine a practical scenario where we have to create a log file and update it or while using a shared resource like Printer. To prevent this we must use some locking mechanism so that the second thread cannot use this getInstance() method until the first thread has completed the process.
Singleton in heap Singleton in heap
In Figure 3 we show how multiple threads access the singleton instance. Since the singleton instance is a static class variable in the stored in the PermGen space of the heap. This applies to getSingletonInstance() instance method as well since it is static too. In the multithreading environment to prevent each thread to create another instance of singleton object and thus creating concurrency issue we will need to use locking mechanism. This can be achieved by synchronized keyword. By using this synchronized keyword we prevent Thread2 or Thread3 to access the singleton instance while Thread1 inside the method getSingletonInstance().
From code perspective it means:

public static synchronized SingletonExample getSingletonInstance() {
        if (null == singletonInstance) {
            singletonInstance = new SingletonExample();
        }
        return singletonInstance;
    }
So this means that every time the getSingletonInstance() is called it gives us an additional overhead . To prevent this expensive operation we will use double checked locking so that the synchronization happens only during the first call and we limit this expensive operation to happen only once. It is only required for:
singletonInstance = new SingletonExample();
Code example:

public static volatile SingletonExample getSingletonInstance() {
        if (null == singletonInstance) {
            synchronized (SingletonExample.class){
                    if (null == singletonInstance) {
            singletonInstance = new SingletonExample();
            }
        }
        }
        return singletonInstance;
    }
In the above code snippet imagine that multiple threads comes concurrently and tries to create the new instance. In such situation the may be three or more threads are waiting on the synchronized block to get access. Since we have used synchronized only one thread will be given access. All the remaining threads which were waiting on the synchronized block will be given access when first thread exits this block. However when the remaining concurrent thread enters the synchronized block they are prevented to enter further due to the double check : null check. Since the first thread has already created an instance no other thread will enter this loop.
All the remaining threads that were not lucky to enter the synchronized block along with the first thread will be blocked at the first null check. This mechanism is called double checked locking and it provides significant performance benefit and also it is cost effective solution.

Implementation Example: Volatile Keyword

We can also use the volatile keyword to the instance variable declaration.
private volatile static SingletonExample singletonInstance;
The volatile keyword helps as concurrency control tool in a multithreaded environment and provides the latest update in a most accurate manner.However please note that double check locking might not work before Java 5. In such situation we can use early loading mechanism. If we remember about the original sample code we had used lazy loading. In case of early loading we will instantiate the SingletonExample class at the start and this will be referred to the private static instance field.

public class SingletonExample {
    private static final SingletonExample singletonInstance = new SingletonExample;
 
    // SingletonExample prevents any other class from instantiating
    private SingletonExample() {
    }
 
    // Providing Global point of access
    public static SingletonExample getSingletonInstance() {
 
        return singletonInstance;
    }
 
    public void printSingleton(){
        System.out.println("Inside print Singleton");
    }
}
In this approach the singleton object is created before it is needed. The JVM takes care of the static variable initialization and ensures that the process is thread safe and that the instance is created before the threads tries to access it. In case of Lazy loading the singletonInstance is created when the client class calls the getSingleInstance() whereas in case of the early loading the singletonInstance is create when the class is loaded in the memory.

Implementation Example: Using enum

Implementing Singleton in Java 5 or above version using Enum:
Enum is thread safe and implementation of Singleton through Enum ensures that your singleton will have only one instance even in a multithreaded environment. Let us see a simple implementation:

public enum SingletonEnum {
 INSTANCE;
 public void doStuff(){
     System.out.println("Singleton using Enum");
 }
}
And this can be called from clients :
public static void main(String[] args) {
        SingletonEnum.INSTANCE.doStuff();
 
    }

FAQs:

Question: Why can’t we use a static class instead of singleton?
Answer:
  • One of the key advantages of singleton over static class is that it can implement interfaces and extend classes while the static class cannot (it can extend classes, but it does not inherit their instance members). If we consider a static class it can only be a nested static class as top level class cannot be a static class. Static means that it belongs to a class it is in and not to any instance. So it cannot be a top level class.
  • Another difference is that static class will have all its member as static only unlike Singleton.
  • Another advantage of Singleton is that it can be lazily loaded whereas static will be initialized whenever it is first loaded.
  • Singleton object stores in Heap but, static object stores in stack.
  • We can clone the object of Singleton but, we can not clone the static class object.
  • Singleton can use the Object Oriented feature of polymorphism but static class cannot.
Question: Can the singleton class be subclassed?
Answer: Frankly speaking singleton is just a design pattern and it can be subclassed. However it is worth to understand the logic or requirement behind subclassing a singleton class as the child class might not inherit the singleton pattern objective by extending the Singleton class. However the subclassing can be prevented by using the final keyword in the class declaration.
Question: Can there be multiple instance of singleton using cloning?
Answer: That was a good catch! What do we do now? To prevent the another instance to be created of the singleton instance we can throw exception from inside the clone() method.
Question: What is the impact if we are creating another instance of singleton using serialization and deserialization?
Answer: When we serialize a class and deserialize it then it creates another instance of the singleton class. Basically as many times as you deserialize the singleton instance it will create multiple instance. Well in this case the best way is to make the singleton as enum. In that way the underlying Java implementation takes care of all the details. If this is not possible then we will need to override the readobject() method to return the same singleton instance.
Question: Which other pattern works with Singleton?
Answer:There are several other pattern like Factory method, builder and prototype pattern which uses Singleton pattern during the implementation.
Question: Which classes in JDK uses singleton pattern?
Answer: java.lang.Runtime : In every Java application there is only one Runtime instance that allows the application to interface with the environment it is running. The getRuntime is equivalent to the getInstance() method of the singleton class.

Uses of Singleton design pattern:

Various usages of Singleton Patterns:
  • Hardware interface access: The use of singleton depends on the requirements. However practically singleton can be used in case external hardware resource usage limitation required e.g. Hardware printers where the print spooler can be made a singleton to avoid multiple concurrent accesses and creating deadlock.
  • Logger : Similarly singleton is a good potential candidate for using in the log files generation. Imagine an application where the logging utility has to produce one log file based on the messages received from the users. If there is multiple client application using this logging utility class they might create multiple instances of this class and it can potentially cause issues during concurrent access to the same logger file. We can use the logger utility class as a singleton and provide a global point of reference.
  • Configuration File: This is another potential candidate for Singleton pattern because this has a performance benefit as it prevents multiple users to repeatedly access and read the configuration file or properties file. It creates a single instance of the configuration file which can be accessed by multiple calls concurrently as it will provide static config data loaded into in-memory objects. The application only reads from the configuration file at the first time and there after from second call onwards the client applications read the data from in-memory objects.
  • Cache: We can use the cache as a singleton object as it can have a global point of reference and for all future calls to the cache object the client application will use the in-memory object.

Hands-on:

Let us take a small practical example to understand the Singleton design pattern in more details.

Problem Statement:
Design a small ATM printing application which can generate multiple types of statements of the transaction including Mini Statement, Detailed statement etc. However the customer should be aware of the creation of these statements. Ensure that the memory consumption is minimized.

Design Solution:
The above requirement can be addressed using two core Gang of four design pattern – Factory design pattern and Singleton design pattern. In order to generate multiple types of statements for the ATM transactions in the ATM machine we can create a Statement Factory object which will have a factory method of createStatements(). The createStatements will create DetailedStatement or MiniStatement objects.

The client object will be completely unware of the object creation since it will interact with the Factory interface only. We will also create an interface called StatementType. This will allow further statement type objects e.g. Credit card statement etc to be added. So the solution is scalable and extensible following the object oriented Open/Closed design principle.

The second requirement of reducing the memory consumption can be achieved by using Singleton design pattern. The Statement Factory class need not be initiated multiple times and a single factory can create multiple statement objects. Singleton pattern will create a single instance of the StatementFactory class thus saving memory.
ATM example ATM example


  • Factory: Factory is an abstract class which is a single point of contact for the client. All the concrete factory classes needs to implement the abstract factory method.

  • StatementFactory: This is the Factory implementation class which consist of the creator method. This class extends from the Factory abstract class.This is the main creator class for all the products e.g. Statements.

  • StatementType: This is a product interface which provides abstraction to the various types of products which needs to be created by the Factory class.

  • DetailedStatement: This is a concrete implementation of the StatementType interface which will print the detailed statements.

  • MiniStatement: This is a concrete implementation of the StatementType interface which will print the mini statements.

  • Client: This is the client class which will call the StatementFactory and StatementType and request for object creation.

Assumption:
The design solution caters to only single ATM machine.

Sample Code:

Factory.java
public abstract class Factory {
 protected abstract StatementType createStatements(String selection);
 }
StatementFactory.java
public class StatementFactory extends Factory {
    private static StatementFactory uniqueInstance;
 
    private StatementFactory() {
    }
 
    public static StatementFactory getUniqueInstance() {
        if (uniqueInstance == null) {
            uniqueInstance = new StatementFactory();
            System.out.println("Creating a new StatementFactory instance");
        }
        return uniqueInstance;
 
    }
 
    public StatementType createStatements(String selection) {
        if (selection.equalsIgnoreCase("detailedStmt")) {
            return new DetailedStatement();
        } else if (selection.equalsIgnoreCase("miniStmt")) {
            return new MiniStatement();
        }
        throw new IllegalArgumentException("Selection doesnot exist");
    }
}
StatementType.java

public interface StatementType {
    String print();
}
DetailedStatement.java
public class DetailedStatement implements StatementType {
    @Override
    public String print() {
        System.out.println("Detailed Statement Created");
        return "detailedStmt";
    }
}
MiniStatement.java
public class MiniStatement implements StatementType {
    @Override
    public String print() {
        System.out.println("Mini Statement Created");
        return "miniStmt";
    }
}
Client.java
public class Client {
 
    public static void main(String[] args) {
        System.out.println("Enter your selection:");
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String selection = null;
        try {
            selection = br.readLine();
        } catch (IOException e) {
            e.printStackTrace();
        }
        Factory factory = StatementFactory.getUniqueInstance();
        StatementType objStmtType = factory.createStatements(selection);
        System.out.println(objStmtType.print());
    }
 
}
Conclusion:
In the above articles we have gone into the details of the Singleton pattern, how to implement singleton pattern along with a practical application. Though singleton pattern looks a simple implementation we should resist ourselves from using it until and unless there is a strong requirement for it. You can blame the unpredictable nature of the results in the multi-threading environment. Though we can use enum in Java 5 and above it sometimes get difficult to implement your logic always in enum or there might be legacy code before Java 5. Hope our readers have liked this article.

Saturday, November 29, 2014

Sunday, November 23, 2014

Java consumer producers problem using mutex

http://java67.blogspot.com/2012/12/producer-consumer-problem-with-wait-and-notify-example.html?m=1

156. Binary Tree Upside Down ---M ~~~~做了3遍的, 竟然忘了~~~~

Given the root of a binary tree, turn the tree upside down and return the new root.

You can turn a binary tree upside down with the following steps:

  1. The original left child becomes the new root.
  2. The original root becomes the new right child.
  3. The original right child becomes the new left child.

The mentioned steps are done level by level, it is guaranteed that every node in the given tree has either 0 or 2 children.

 

Example 1:



Input: root = [1,2,3,4,5]
Output: [4,5,2,null,null,3,1]

Example 2:

Input: root = []
Output: []

Example 3:

Input: root = [1]
Output: [1]

 

Constraints:

  • The number of nodes in the tree will be in the range [0, 10].
  • 1 <= Node.val <= 10
  • Every node has either 0 or 2 children.


A:

这个题目描述的很烂。
应该是: binary tree upside down along left most edge


Iterative bottom up.
就是先做左子树,再连接修改该节点。
/**
 * Definition for a binary tree node.
 * struct TreeNode {
 *     int val;
 *     TreeNode *left;
 *     TreeNode *right;
 *     TreeNode() : val(0), left(nullptr), right(nullptr) {}
 *     TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
 *     TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {}
 * };
 */
class Solution {
public:
    TreeNode* upsideDownBinaryTree(TreeNode* root) {
        if(not root){ // if nullptr , or no child
            return root;
        }
        stack<TreeNode*> S;
        TreeNode *runner = root;
        while(runner->left ){
            S.push(runner);
            runner = runner->left;
        }
        TreeNode * newRoot = runner;
        while(not S.empty()){
            auto p = S.top();
            S.pop();
            runner->right = p;
            runner->left = p->right;
            p->left = nullptr;
            p->right = nullptr;
            runner = p;
        }
        return newRoot;
    }
};

错误: 
没有先把  p.right = nullptr; 
这个题比较tricky的点就是,要事先找到return 的newRoot.




------------Recursive approach--------先找到 返回TreeNode, 这样就不需要在helper()中管了--------


public class Solution {
    public TreeNode upsideDownBinaryTree(TreeNode root) {
        if(root == null)
            return null;
        TreeNode leftMost = root;
        while( leftMost.left != null)
            leftMost = leftMost.left;
            
        helper(root);
        return leftMost;
    }
    private void helper(TreeNode  root){
        if(root.left == null)
            return;
        upsideDownBinaryTree(root.left);
        
        TreeNode newRoot = root.left;
        newRoot.left = root.right;
        newRoot.right = root;
        
        root.left = null;
        root.right = null;
    }
}

这里利用了一个性质,就是指针指向的Node只做了child 指针的更改操作。原本的Object 位置并没有更改。

Mistakes:
1: base case中,忘了把root.left==null的情况加进来(否则会有null pointer的Exception)


public class Solution {
    public TreeNode UpsideDownBinaryTree(TreeNode root) {
        if(root == null || root.left == null)
            return root;
        TreeNode newRoot = UpsideDownBinaryTree(root.left);
        
        TreeNode r= root.right;
        root.left =null;
        root.right = null;
        TreeNode runner = newRoot;
        while(runner!=null && runner.right!=null){
            runner = runner.right;
        }
        if(runner!=null){
            runner.left = r;
            runner.right = root;
        }
        return newRoot;
    }
}












Saturday, November 22, 2014

155. Min Stack (easy)

Q:

Design a stack that supports push, pop, top, and retrieving the minimum element in constant time.
  • push(x) -- Push element x onto stack.
  • pop() -- Removes the element on top of the stack.
  • top() -- Get the top element.
  • getMin() -- Retrieve the minimum element in the stack. 
Example:
MinStack minStack = new MinStack();
minStack.push(-2);
minStack.push(0);
minStack.push(-3);
minStack.getMin();   --> Returns -3.
minStack.pop();
minStack.top();      --> Returns 0.
minStack.getMin();   --> Returns -2.

A:

思想很简单。
后来出现memory out of limit, minStack中就没有存多余的最小值。

class MinStack {
public:
    /** initialize your data structure here. */
    vector<int> stack;
    vector<int> minStack;
    MinStack() {
        
    }
    
    void push(int x) {
        stack.push_back(x);
        if(minStack.empty() || minStack.back() > x)
            minStack.push_back(x);
        else
            minStack.push_back(minStack.back());
    }
    
    void pop() {
        stack.pop_back();
        minStack.pop_back();
    }
    
    int top() {
        return stack.back();
    }
    
    int getMin() {
        return minStack.back();
    }
};

/**
 * Your MinStack object will be instantiated and called as such:
 * MinStack* obj = new MinStack();
 * obj->push(x);
 * obj->pop();
 * int param_3 = obj->top();
 * int param_4 = obj->getMin();
 */

Learned:

1;  minStack中,重复的最小值也要存。
2:即使Stack<Integer>的返回值是一个对象,因此要用.equals()来比较大小

Saturday, October 25, 2014

Amazon

. You will be given the number of pairs of parenthesis. Find out the total possible valid unique combinations and there should not be any duplicity. Write code

这个应该有计算公式吧???

Saturday, October 18, 2014

count number of bits of '1' from 1 to n

Q:
from 1 to n, count the number of bits that were set to 1.
For example,
n == 1,  return 1,
n == 2, return 3,
n == 3, return 5
n==4, return 6



A:
solution  1:  一个个数,
Solution 2:把n 从2^j 一个个整除,来计算。
 private static int numberOf1Bits(int n) {
        int mask = 0; // set the mask as the left part of nPlusOne
        int nBits = 0; // counts the bits from 1 to n in their binary form

        for (int i = 30; i >= 0; i--) {
            int power2 = 1 << i;
            if(n/power2<1)  // this two lines can be commented
                continue;    // can be commented
            mask |= (n & power2);
            int nRepeatOnes = n / (power2<<1) * power2 ;
            int nLeftOnes = ( n>>i & 1)  * (n - mask+1);

            nBits += nRepeatOnes + nLeftOnes;
        }
        return nBits;
    }



Mistakes:
1:  这里,要除的时候,跟n+1无关,而应该是n / 上一位的power,因此要power2<<1


Thursday, October 16, 2014

Amazon面试题

Question 2: A sorted array has been rotated r times to the left. Find r in least possible time.
Question 1: There is a big file of words which is dynamically changing. We are continuously adding some words into it. How would you keep track of top 10 trending words at each moment?
Question 1: There is a binary tree of size N. All nodes are numbered between 1-N(inclusive). There is a N*N integer matrix Arr[N][N], all elements are initialized to zero. So for all the nodes A and B, put Arr[A][B] = 1 if A is an ancestor of B (NOT just the immediate ancestor).
Count total set bits in all numbers from 1 to n
Given a positive integer n, count the number of 1 bits.
Question 1: You are given many slabs each with a length and a breadth. A slab i can be put on slab j if both dimensions of i are less than that of j. In this similar manner, you can keep on putting slabs on each other. Find the maximum stack possible which you can create out of the given slabs.
Question 2: The above question was raised to 3 dimensions.
Question 3: The above question was then raised to k dimensions.
Questions :   Then there were many questions asked on compilers and dynamic memory allocation.
estion 1: You are given pairs of numbers. In a pair the first number is smaller with respect to the second number. Suppose you have two sets (a, b) and (c, d), the second set can follow the first set if b<c.So you can form a long chain in the similar fashion. Find the longest chain which can be formed.

Find the longest increasing subsequence in O(nlogn). Proof and full code was required.

Amazon2

Replace each element of an array with its greatest next integer in O(n).



题一:
有N个node,每个都不停的向外发送timestamps,具体发送哪些timestamp是每个node决定
的,从其他node来说是随机的.现在要收集这些node发送的所有timestamp.如果某个
timestamp被发现从超过99%的node上发送出来,记录下来.需要怎么做?这些timestamp很
多,是不能完全放进去内存里面的.如果node非常多,怎么scale?


题一:
有N个node,每个都不停的向外发送timestamps,具体发送哪些timestamp是每个node决定
的,从其他node来说是随机的.现在要收集这些node发送的所有timestamp.如果某个
timestamp被发现从超过99%的node上发送出来,记录下来.需要怎么做?这些timestamp很
多,是不能完全放进去内存里面的.如果node非常多,怎么scale?

Answer:  如果每个node不会发送相同的时间戳的话: hashmap <timestamp, 这个时间
戳收到的数量> 如果相同node会发送相同的时间戳:  hashmap <timestamp, 一个set
存储哪些node发送过这个时间戳> 然后应该有个数字代表99%的 node吧 如果hashmap的
value (对于第二种情况就是value的size) 超过了这个数字 就记录这个时间戳。

对于超多数据 根据时间戳进行hash 这样相同时间戳的数据会被分到同一台机器上 就
可以了吧












Saturday, October 11, 2014

facebook面试准备及内推 (转)


update:发给我的基本都给推了,有些已经开始面试了,少量被HR筛掉了。继续提供内
推。

提供内推,社招校招皆可,天朝美帝皆可,如有兴趣,请发简历到fbrefer@163.com
申请码工职位的,请至少保证刷完一遍leetcode,或同等水平
最近一年申请过的,由于公司政策,请不要再申请
如果需要,本人可以提供一次mock interview



关于面试流程
社招的话
电面1-2轮,一般就是coding
onsite一般是4轮,2轮coding,1轮design,1轮behavior+coding
校招的话,那轮design也变成coding了



关于准备
1) algo/coding
建议大家刷一下leetcode,基本上cover到了大多数常见面试题,而且有可能碰到原题
。需要注意的是,仅仅解出来,做到bug free可能是不够的。代码的质量和速度也非常
重要。网上有一些别人给出的答案可以参考,尽量做到代码简洁清晰。速度上leetcode
上所有题都做到10分钟以内写完。


2) design
解这种题是个*交流*的过程,或者说是给出方案然后获取反馈的不断循环的过程。
一般的流程:
首先你要问清楚requirement;
然后可以讲一下high level architecture,就是分成哪几个component,互相之间如果
interact,在白板上画一画;
之后面试官可能会让你深入某个component detail讨论;
也有可能变换requirement让你重新设计

另外,f家还喜欢让你估算机器之类的,做一些back-of-envelopme calculation。所以
最好对一些计算机相关的基本常数,fb的用户量等等有个大概的了解。

准备的时候建议看看fb的design高频题。一方面有可能面试的时候刚好碰到这几个
topic,另一方面其实很多design都是相通的。
之前有个帖子讲这个,原帖已经被删了,这儿有个备份http://blog.csdn.net/sigh1988/article/details/9790337

另外补充一点我收集的材料

a) 首先你可以从整体上了解一下facebook的architecture
http://www.quora.com/Facebook-Engineering/What-is-Facebooks-arc
http://www.ece.lsu.edu/hpca-18/files/HPCA2012_Facebook_Keynote.
http://www.quora.com/Facebook-Engineering/What-have-been-Facebo
除了下面给出的一些资料,fb engineering page里还有很多不错的内容
https://www.facebook.com/Engineering

b) news feed
这里有个talk
http://www.infoq.com/presentations/Facebook-News-Feed
对应的slides
http://readme.skplanet.com/wp-content/uploads/2012/11/0-3_Faceb
还有一些quora上的讨论
http://www.quora.com/Activity-Streams/What-are-the-scaling-issu
http://www.quora.com/What-are-best-practices-for-building-somet
http://www.quora.com/What-is-the-best-storage-solution-for-buil

c) facebook chat
这里有两个notes,其中第二个里面还有相应的tech talk links
https://www.facebook.com/notes/facebook-engineering/facebook-chat/
14218138919
https://www.facebook.com/notes/facebook-engineering/chat-stability-and-
scalability/51412338919

d) typeahead search & graph search
关于typeahead search的tech talk和notes
https://www.facebook.com/video/video.php?v=432864835468
https://www.facebook.com/note.php?note_id=365915113919
https://www.facebook.com/note.php?note_id=389105248919

关于graph search的paper, tech talk, notes。其中paper很值得一看。
http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p871-cur
https://newsroom.fb.com/Photos-and-B-Roll/4362/Graph-Search-Whiteboard
https://www.facebook.com/note.php?note_id=10151240856103920
https://www.facebook.com/note.php?note_id=10151347573598920
https://www.facebook.com/note.php?note_id=10151361720763920
https://www.facebook.com/note.php?note_id=10151432733048920
https://www.facebook.com/note.php?note_id=10151755593228920

e) facebook messages
两个tech talks
http://www.youtube.com/watch?v=XAuwAHWpzPc
http://www.infoq.com/presentations/HBase-at-Facebook
以及eng notes
https://www.facebook.com/note.php?note_id=10150148835363920
https://www.facebook.com/note.php?note_id=10150162742108920

f) photo storage
相关的papers和notes
https://www.usenix.org/conference/osdi10/finding-needle-haystack-facebooks-
photo-storage
https://www.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf
https://www.usenix.org/legacy/events/osdi10/tech/slides/beaver.pdf
https://www.facebook.com/note.php?note_id=76191543919

g) social graph data store
相关的note, video, paper
https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-
graph/10151525983993920
https://www.usenix.org/conference/atc13/technical-sessions/presentation/
bronson
http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/117

h) tiny URL
这里有一些讨论
http://n00tc0d3r.blogspot.com/2013/09/big-data-tinyurl.html
http://stackoverflow.com/questions/742013/how-to-code-a-url-sho
http://stackoverflow.com/questions/3376163/what-are-the-things-

i) POI
参考这里
http://www.slideshare.net/mmalone/scaling-gis-data-in-nonrelati
http://www.mitbbs.ca/article_t/JobHunting/32476139.html


3) behavior,建议大家了解一下fb的culture,准备一下常见的behavior questions,
面试之前rehearsal一下。

最后面试临近的时候,可以再刷刷面经,找找感觉。像glassdoor, mitbbs/jobhunting
, careercup,这些上面就有很多。

如果有其它疑问,欢迎回复或者PM我。

Friday, October 10, 2014

LC不能直接pass的题目,需要多写几遍的 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

strstr()

Substring with  of All Words


Permutations II ( 这个以前自己写了三种思路,刚刚竟然一种都没想起来,艹)

 

Wildcard Matching


N-Queens II



Largest Rectangle in Histogram


Permutation Sequence


Valid Number



sqrt()







 



 



 



 



 



 



 



 

Sunday, October 5, 2014

152. Maximum Product Subarray ~~~~~~~~~~~最终看了别人的代码

Given an integer array nums, find the contiguous subarray within an array (containing at least one number) which has the largest product.

Example 1:

Input: [2,3,-2,4]
Output: 6
Explanation: [2,3] has the largest product 6.

Example 2:

Input: [-2,0,-1]
Output: 0
Explanation: The result cannot be 2, because [-2,-1] is not a subarray.

A:
----------------------自己的代码-------略---------------
就是先  用0 把所有的A[i]分开, 然后,只算有正负的。这样,每次相乘的结果,绝对值都变大了。因此,可以直接maxPosProduct 和minNegProduct互相转换。
但其实,这样是没有必要的-----------------


------------------------参考了别人的geeksforgeeks的代码--------------------
public class Solution {
    public int maxProduct(int[] A) {
        assert(A!=null && A.length >0);
        if(A.length == 1)
            return A[0];
        int result = Integer.MIN_VALUE, maxEndHere = 1;// max positive product ending at the current position
        int minEndHere = 1;// min negative product ending at the current position
        for (int i = 0; i < A.length; i++) {
            if (A[i] > 0) {
                maxEndHere = maxEndHere * A[i];
                minEndHere = Math.min(minEndHere * A[i],1);
                result = Math.max(result,maxEndHere);
            } else if(A[i]==0) {
                maxEndHere = 1;
                minEndHere = 1;
                result = Math.max(result,0);// 0 can also be a product result
            }else{ // A[i]<0
                result = Math.max(minEndHere*A[i],result);
                int tmp = maxEndHere;
                maxEndHere = Math.max( minEndHere*A[i],1);
                minEndHere = tmp*A[i];
            }
        }
        return result;
    }
}
Mistakes:
1: 就是上面的这个思路了。  每次用maxEndHere 等,每次初始值都设为1
2:   when A[i]< 0 , we need calculate the result at the beginning,  因此, 我们不能简单地把result = Math.max(  ) 放到if 条件的外面。

3:  切记: A.length ==1的时候要单独列出来。


Learned:  ----------

1:  java中, given 
int a =3;
int A = {1,2,3};
运行语句:  A[a/2]是正确的
2:函数要求的返回值是double,  我们可以return一个int类型。语法是正确的

3







Friday, August 22, 2014

几题排列组合看能不能把你绕晕

(来自买买提论坛。)







n个一样的球,放到m个一样的盒子里,有几种方法
n个一样的球,放到m个都不一样的盒子里,有几种方法
n个一样的球,放到m个除了颜色可能不一样其他都一样的盒子里,有几种方法
n个都不一样的球,放到m个一样的盒子里,有几种方法
n个都不一样的球,放到m个都不一样的盒子里,有几种方法
n个都不一样的球,放到m个除了颜色可能不一样其他都一样的盒子里,有几种方法
n个除了颜色可能不一样其他都一样的球,放到m个一样的盒子里,有几种方法
n个除了颜色可能不一样其他都一样的球,放到m个都不一样的盒子里,有几种方法
n个除了颜色可能不一样其他都一样的球,放到m个除了颜色可能不一样其他都一样的盒
子里,有几种方法






----------------------------------------Tiger's solution---------------------------请大家指正-------


n个一样的球,放到m个一样的盒子里,有几种方法

--------1------第一反应是DP方法。就是n分成最多m个子集,这样需要 m*n的时间和空间。
--------2------先把所有的求排成一串。把所有的盒子排成一串。然后每个球看放到哪个里面就是。就转换成了10块糖,每天至少吃一块儿的问题。 但是,这里的糖是球。最多要吃m天


n个一样的球,放到m个都不一样的盒子里,有几种方法






-----------------------先把所有的求排成一串。只有1种方法。


n个一样的球,放到m个除了颜色可能不一样其他都一样的盒子里,有几种方法


n个都不一样的球,放到m个一样的盒子里,有几种方法


n个都不一样的球,放到m个都不一样的盒子里,有几种方法


n个都不一样的球,放到m个除了颜色可能不一样其他都一样的盒子里,有几种方法



n个除了颜色可能不一样其他都一样的球,放到m个一样的盒子里,有几种方法




n个除了颜色可能不一样其他都一样的球,放到m个都不一样的盒子里,有几种方法





n个除了颜色可能不一样其他都一样的球,放到m个除了颜色可能不一样其他都一样的盒子里,有几种方法









这里有个wiki的链接 。貌似就是讲这样的问题。














Thursday, April 10, 2014

Construct a tree from Inorder and Level order traversals (待写)

原始链接


Given inorder and level-order traversals of a Binary Tree, construct the Binary Tree. Following is an example to illustrate the problem.
BinaryTree
Input: Two arrays that represent Inorder
       and level order traversals of a 
       Binary Tree
in[]    = {4, 8, 10, 12, 14, 20, 22};
level[] = {20, 8, 22, 4, 12, 10, 14};

Output: Construct the tree represented 
        by the two arrays.
        For the above two arrays, the 
        constructed tree is shown in 
        the diagram on right side
 思想就是每次在level order中,提取first node 作为当前的root。
然后,在in order中找到所有在其前面的值(作为左子树)。同时在level order的数组中抽取所有左子树的value。
same for the right subtree.
repeat above .
 
 
 
We strongly recommend to minimize the browser and try this yourself first.
The following post can be considered as a prerequisite for this.
Construct Tree from given Inorder and Preorder traversals
Let us consider the above example.
in[] = {4, 8, 10, 12, 14, 20, 22};
level[] = {20, 8, 22, 4, 12, 10, 14};
In a Levelorder sequence, the first element is the root of the tree. So we know ’20′ is root for given sequences. By searching ’20′ in Inorder sequence, we can find out all elements on left side of ‘20’ are in left subtree and elements on right are in right subtree. So we know below structure now.
             20
           /    \
          /      \ 
 {4,8,10,12,14}  {22}   
Let us call {4,8,10,12,14} as left subarray in Inorder traversal and {22} as right subarray in Inorder traversal.
In level order traversal, keys of left and right subtrees are not consecutive. So we extract all nodes from level order traversal which are in left subarray of Inorder traversal. To construct the left subtree of root, we recur for the extracted elements from level order traversal and left subarray of inorder traversal. In the above example, we recur for following two arrays.
// Recur for following arrays to construct the left subtree
In[]    = {4, 8, 10, 12, 14}
level[] = {8, 4, 12, 10, 14} 
Similarly, we recur for following two arrays and construct the right subtree.
// Recur for following arrays to construct the right subtree
In[]    = {22}
level[] = {22} 
Following is C++ implementation of the above approach.
/* program to construct tree using inorder and levelorder traversals */
#include <iostream>
using namespace std;
/* A binary tree node */
struct Node
{
    int key;
    struct Node* left, *right;
};
/* Function to find index of value in arr[start...end] */
int search(int arr[], int strt, int end, int value)
{
    for (int i = strt; i <= end; i++)
        if (arr[i] == value)
            return i;
    return -1;
}
// n is size of level[], m is size of in[] and m < n. This
// function extracts keys from level[] which are present in
// in[].  The order of extracted keys must be maintained
int *extrackKeys(int in[], int level[], int m, int n)
{
    int *newlevel = new int[m], j = 0;
    for (int i = 0; i < n; i++)
        if (search(in, 0, m-1, level[i]) != -1)
            newlevel[j] = level[i], j++;
    return newlevel;
}
/* function that allocates a new node with the given key  */
Node* newNode(int key)
{
    Node *node = new Node;
    node->key = key;
    node->left = node->right = NULL;
    return (node);
}
/* Recursive function to construct binary tree of size n from
   Inorder traversal in[] and Level Order traversal level[].
   inSrt and inEnd are start and end indexes of array in[]
   Initial values of inStrt and inEnd should be 0 and n -1.
   The function doesn't do any error checking for cases
   where inorder and levelorder do not form a tree */
Node* buildTree(int in[], int level[], int inStrt, int inEnd, int n)
{
    // If start index is more than the end index
    if (inStrt > inEnd)
        return NULL;
    /* The first node in level order traversal is root */
    Node *root = newNode(level[0]);
    /* If this node has no children then return */
    if (inStrt == inEnd)
        return root;
    /* Else find the index of this node in Inorder traversal */
    int inIndex = search(in, inStrt, inEnd, root->key);
    // Extract left subtree keys from level order traversal
    int *llevel  = extrackKeys(in, level, inIndex, n);
    // Extract right subtree keys from level order traversal
    int *rlevel  = extrackKeys(in + inIndex + 1, level, n-inIndex-1, n);
    /* construct left and right subtress */
    root->left = buildTree(in, llevel, inStrt, inIndex-1, n);
    root->right = buildTree(in, rlevel, inIndex+1, inEnd, n);
    // Free memory to avoid memory leak
    delete [] llevel;
    delete [] rlevel;
    return root;
}
/* Uti;ity function to print inorder traversal of binary tree */
void printInorder(Node* node)
{
    if (node == NULL)
       return;
    printInorder(node->left);
    cout << node->key << " ";
    printInorder(node->right);
}
/* Driver program to test above functions */
int main()
{
    int in[]    = {4, 8, 10, 12, 14, 20, 22};
    int level[] = {20, 8, 22, 4, 12, 10, 14};
    int n = sizeof(in)/sizeof(in[0]);
    Node *root = buildTree(in, level, 0, n - 1, n);
    /* Let us test the built tree by printing Insorder traversal */
    cout << "Inorder traversal of the constructed tree is \n";
    printInorder(root);
    return 0;
}
Output:
Inorder traversal of the constructed tree is
4 8 10 12 14 20 22
An upper bound on time complexity of above method is O(n3). In the main recursive function, extractNodes() is called which takes O(n2) time.
The code can be optimized in many ways and there may be better solutions. Looking for improvements and other optimized approaches to solve this problem.

Suffix Array | Set 2 (nLogn Algorithm)

原始链接

A suffix array is a sorted array of all suffixes of a given string. The definition is similar to Suffix Tree which is compressed trie of all suffixes of the given text.
Let the given string be "banana".

0 banana                          5 a
1 anana     Sort the Suffixes     3 ana
2 nana      ---------------->     1 anana  
3 ana        alphabetically       0 banana  
4 na                              4 na   
5 a                               2 nana

The suffix array for "banana" is {5, 3, 1, 0, 4, 2} 
We have discussed Naive algorithm for construction of suffix array. The Naive algorithm is to consider all suffixes, sort them using a O(nLogn) sorting algorithm and while sorting, maintain original indexes. Time complexity of the Naive algorithm is O(n2Logn) where n is the number of characters in the input string.
In this post, a O(nLogn) algorithm for suffix array construction is discussed. Let us first discuss a O(n * Logn * Logn) algorithm for simplicity. The idea is to use the fact that strings that are to be sorted are suffixes of a single string.
We first sort all suffixes according to first character, then according to first 2 characters, then first 4 characters and so on while the number of characters to be considered is smaller than n. The important point is, if we have sorted suffixes according to first 2i characters, then we can sort suffixes according to first 2i+1 characters in O(nLogn) time using a nLogn sorting algorithm like Merge Sort. This is possible as two suffixes can be compared in O(1) time (we need to compare only two values, see the below example and code).
The sort function is called O(Logn) times (Note that we increase number of characters to be considered in powers of 2). Therefore overall time complexity becomes O(nLognLogn). See http://www.stanford.edu/class/cs97si/suffix-array.pdf for more details.
Let us build suffix array the example string “banana” using above algorithm.
Sort according to first two characters Assign a rank to all suffixes using ASCII value of first character. A simple way to assign rank is to do “str[i] – ‘a’” for ith suffix of strp[]
Index     Suffix            Rank
 0        banana             1   
 1        anana              0 
 2        nana               13 
 3        ana                0
 4        na                 13
 5        a                  0 
For every character, we also store rank of next adjacent character, i.e., the rank of character at str[i + 1] (This is needed to sort the suffixes according to first 2 characters). If a character is last character, we store next rank as -1
Index    Suffix            Rank          Next Rank 
 0       banana             1              0
 1       anana              0              13    
 2       nana               13             0
 3       ana                0              13
 4       na                 13             0 
 5       a                  0             -1 
Sort all Suffixes according to rank and adjacent rank. Rank is considered as first digit or MSD, and adjacent rank is considered as second digit.
Index    Suffix            Rank          Next Rank 
 5        a                  0              -1
 1        anana              0               13    
 3        ana                0               13
 0        banana             1               0
 2        nana               13              0
 4        na                 13              0  
Sort according to first four character
Assign new ranks to all suffixes. To assign new ranks, we consider the sorted suffixes one by one. Assign 0 as new rank to first suffix. For assigning ranks to remaining suffixes, we consider rank pair of suffix just before the current suffix. If previous rank pair of a suffix is same as previous rank of suffix just before it, then assign it same rank. Otherwise assign rank of previous suffix plus one.
Index       Suffix          Rank       
  5          a               0     [Assign 0 to first]        
  1          anana           1     (0, 13) is different from previous
  3          ana             1     (0, 13) is same as previous     
  0          banana          2     (1, 0) is different from previous      
  2          nana            3     (13, 0) is different from previous      
  4          na              3     (13, 0) is same as previous      
For every suffix str[i], also store rank of next suffix at str[i + 2]. If there is no next suffix at i + 2, we store next rank as -1
Index       Suffix          Rank        Next Rank
  5          a               0             -1
  1          anana           1              1      
  3          ana             1              0 
  0          banana          2              3
  2          nana            3              3 
  4          na              3              -1       
Sort all Suffixes according to rank and next rank.
Index       Suffix          Rank        Next Rank
  5          a               0             -1
  3          ana             1              0 
  1          anana           1              1      
  0          banana          2              3
  4          na              3             -1
  2          nana            3              3       
We stop here as the next value is 8 which is greater than number of characters in “banana”.
// C++ program for building suffix array of a given text
#include <iostream>
#include <cstring>
#include <algorithm>
using namespace std;
 
// Structure to store information of a suffix
struct suffix
{
    int index; // To store original index
    int rank[2]; // To store ranks and next rank pair
};
 
// A comparison function used by sort() to compare two suffixes
// Compares two pairs, returns 1 if first pair is smaller
int cmp(struct suffix a, struct suffix b)
{
    return (a.rank[0] == b.rank[0])? (a.rank[1] < b.rank[1] ?1: 0):
               (a.rank[0] < b.rank[0] ?1: 0);
}
 
// This is the main function that takes a string 'txt' of size n as an
// argument, builds and return the suffix array for the given string
int *buildSuffixArray(char *txt, int n)
{
    // A structure to store suffixes and their indexes
    struct suffix suffixes[n];
 
    // Store suffixes and their indexes in an array of structures.
    // The structure is needed to sort the suffixes alphabatically
    // and maintain their old indexes while sorting
    for (int i = 0; i < n; i++)
    {
        suffixes[i].index = i;
        suffixes[i].rank[0] = txt[i] - 'a';
        suffixes[i].rank[1] = ((i+1) < n)? (txt[i + 1] - 'a'): -1;
    }
 
    // Sort the suffixes using the comparison function
    // defined above.
    sort(suffixes, suffixes+n, cmp);
 
    // At his point, all suffixes are sorted according to first
    // 2 characters.  Let us sort suffixes according to first 4
    // characters, then first 8 and so on
    int ind[n];  // This array is needed to get the index in suffixes[]
                 // from original index.  This mapping is needed to get
                 // next suffix.
    for (int k = 4; k < n; k = k*2)
    {
        // Assigning rank and index values to first suffix
        int rank = 0;
        suffixes[0].rank[0] = rank;
        ind[suffixes[0].index] = 0;
        int prev_rank = suffixes[0].rank[0];
 
        // Assigning rank to suffixes
        for (int i = 1; i < n; i++)
        {
            // If first rank and next ranks are same as that of previous
            // suffix in array, assign the same new rank to this suffix
            if (suffixes[i].rank[0] == prev_rank &&
                    suffixes[i].rank[1] == suffixes[i-1].rank[1])
            {
                prev_rank = suffixes[i].rank[0];
                suffixes[i].rank[0] = rank;
            }
            else // Otherwise increment rank and assign
            {
                prev_rank = suffixes[i].rank[0];
                suffixes[i].rank[0] = ++rank;
            }
            ind[suffixes[i].index] = i;
        }
 
        // Assign next rank to every suffix
        for (int i = 0; i < n; i++)
        {
            int nextindex = suffixes[i].index + k/2;
            suffixes[i].rank[1] = (nextindex < n)?
                                  suffixes[ind[nextindex]].rank[0]: -1;
        }
 
        // Sort the suffixes according to first k characters
        sort(suffixes, suffixes+n, cmp);
    }
 
    // Store indexes of all sorted suffixes in the suffix array
    int *suffixArr = new int[n];
    for (int i = 0; i < n; i++)
        suffixArr[i] = suffixes[i].index;
 
    // Return the suffix array
    return  suffixArr;
}
 
// A utility function to print an array of given size
void printArr(int arr[], int n)
{
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    cout << endl;
}
 
// Driver program to test above functions
int main()
{
    char txt[] = "banana";
    int n = strlen(txt);
    int *suffixArr = buildSuffixArray(txt,  n);
    cout << "Following is suffix array for " << txt << endl;
    printArr(suffixArr, n);
    return 0;
}
Output:
Following is suffix array for banana
5 3 1 0 4 2
Note that the above algorithm uses standard sort function and therefore time complexity is O(nLognLogn). We can use Radix Sort here to reduce the time complexity to O(nLogn).
Please note that suffx arrays can be constructed in O(n) time also. We will soon be discussing O(n) algorithms.
References:
http://www.stanford.edu/class/cs97si/suffix-array.pdf
http://www.cbcb.umd.edu/confcour/Fall2012/lec14b.pdf