Wednesday, October 2, 2013

Edit Distance !!!!! 第二遍一上来没想出DP算法来,丢人

Q:
Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)
You have the following 3 operations permitted on a word:
a) Insert a character
b) Delete a character
c) Replace a character
A:
 ---------------------3rd pass----------------rolling table----------

public class Solution {
    public int minDistance(String word1, String word2) {
        int n1 = word1.length(),n2 = word2.length();
        if(n1==0 || n2 == 0)
            return n1>n2?n1:n2;
        int[] A = new int[n1+1]; // use rolling table
        // initialize the case that when word2 is empty
        for(int i =0;i<=n1;i++)
            A[i] = i;
        
        for(int i =0;i<n2;i++){
            int lastPre = i; // last time, of A[j-1]
            A[0] = i;// after adding ch from word2, and word1 is empty
            for(int j =0;j<n1;j++){
                int tmp = A[j+1];
                if(word2.charAt(i) == word1.charAt(j)){
                    A[j+1] = lastPre;
                }else{
                    A[j+1] = 1+ Math.min(Math.min(lastPre,A[j]),A[j+1]);// replace, delete,
                }
                lastPre = tmp;
            }
        }
        return A[n1];
    }
}


仔细端详一维数组A, 假设一个2×2的矩阵,我们要计算右下角的值,只需要知道另外三个即可。
因此,这里,我们用一个lastPre来记住上次的,匹配了的值。

Mistakes:

当两个char 相等的时候,真是,没动脑,写成了A[j+1] = A[j] // 这样是错误的!!!!





------------当递归调用不行的时候,直接就应该开始用DP--------------
                       用一个2D array 保存中间结果。
public class EditDistance {

    public int minDistance(String word1, String word2) {
        // use a matrix to record their simularities
        int n1 = word1.length();
        int n2 = word2.length();
        int[][] minDist = new int[n1 + 1][n2 + 1];
        // for minDist[i][j], i is the end index of word1(exclusive ) ,
        // while j is the end index of work2( exclusive)
        // STEP 1: fillin the edge when word1 is empty or word2 is empty
        // at minDist[0][0] , both are empty, we need zero operation
        for (int i = 0; i <= n1; i++) {
            minDist[i][0] = i; // target word2 is empty, operation to given
                                // word[0...i] to word 2
        }
        for(int j =0;j<=n2;j++){
            minDist[0][j] = j;// word1 is empty, number of operations to build word 2
        }
        for(int i =1; i<=n1;i++){
            for(int j =1;j<=n2;j++){
                if(word1.charAt(i-1) == word2.charAt(j-1)){
                    minDist[i][j] = minDist[i-1][j-1];
                }else{
                    // delete the last of s1
                    int deleteLastWord1 = 1 + minDist[i-1][j];
                    // replace the last of word1 to the last of word2
                    int replaceLastWord1 = 1 + minDist[i-1][j-1];
                    // insert the last of word2 to the end of word1
                    int insertLastWord1 = 1 + minDist[i][j-1];
                    int tmp = Math.min(deleteLastWord1, replaceLastWord1);
                    minDist[i][j] = Math.min(tmp, insertLastWord1);                    
                }
                
            }            
        }
        return minDist[n1][n2];

    }
}


--------------------------------2 rd Pass---------------------------------------------------------------------------Rolling table----------------

基本的rolling table只需要知道上边和左边的操作就行了。(对应delete and insert)
但这里还给了第三种操作:replace。这样就导致了不能直接用一个1D array了,需要用2个。
一个就是还用普通的rolling table,(基本更动也照常)
另一个记录上一次的值。(因此需要每次跑完一行之后,把其重新copy)

public class Solution {
    public int minDistance(String word1, String word2) {
        int n1 = word1.length(), n2 = word2.length();
        if( n1 ==0 || n2 == 0)
            return Math.max(n1, n2);

        int A[] = new int[n2+1];//rolling table--> index j would be the number of chars in word2 used
        int lastA[] = new int[n2+1];
        for(int j=0;j<= n2;j++)//using zero characters in word 1
            A[j] = j;

        for(int i =0;i<n1;i++){
            for(int j =0;j<= n2;j++)
                lastA[j] = A[j]; // remember that in the last time( or last line), the A[j] value
            for(int j = 0;j<= n2;j++)
                if(j==0){
                    A[j] = i+1; // # of chars in word1 used,
                }else{
                    if(word1.charAt(i)== word2.charAt(j-1))
                        A[j] = lastA[j-1];
                    else
                        A[j] = 1+ Math.min(lastA[j-1], Math.min(A[j],A[j-1])); // min(replace, min(insert, delete))
                }
        }
        return A[n2];
    }
}
这道题,即使是转化成longgest common substring,  也是上面的复杂度。 除非用suffix tree来找。--但是那样的话,建tree的空间,时间开销也要O(n*log(n))  
Mistakes: (for the 2nd pass)
1: 刚开始的时候,觉得既然只需要记住lastA的话, 那么只用一个变量即可。   但那样可能会搞得比较复杂。想想为什么呢?  
2: 每次update lastA的时候,都是放在for(int j ....)里面了,想那样写,多省几行,  哎, 多无謂的浪费阿~~~~
3: update lastA的时候, 虽然也是写了。但是放在了 upate A 之后, 这样是不对的,应该是放在之前。 (其实主要区别也就是在于 初始化 lastA的时候的值不一样)

 Learned:


-------------递归方法---------考虑到重复计算的浪费, 因此也用了2D array来记录  中间结果-------
public class Solution { 
    public int minDistance(String word1, String word2) {
        int A[][]  = new int[word1.length()+1][word2.length()+1];
        for(int i =0;i< word1.length()+1;i++)
            Arrays.fill(A[i],-1000);
        return helper(A,word1,word2);
    }
    private int helper(int[][] A, String word1, String word2){
        int i = word1.length(), j = word2.length();
        if(A[i][j] >= 0){
        }else if(i==0 || j == 0){
            A[i][j] = Math.max(i,j);
        }else if(word1.charAt(0) == word2.charAt(0))
            A[i][j] =  helper(A,word1.substring(1), word2.substring(1));
        else{
            int replace = helper(A,word1.substring(1), word2.substring(1));
            int delete =  helper(A,word1.substring(1),word2);
            int insert =  helper(A,word1,word2.substring(1));
            A[i][j] = 1+ Math.min(replace,Math.min(delete,insert));
        }
        return A[i][j];
    }
} 

No comments:

Post a Comment