There are two sorted arrays A and B of size m and n respectively.
Find the median of the two sorted arrays.
The overall run time complexity should be O(log (m+n)).
A:
方法的核心是将原问题转变成一个寻找第k小数的问题(假设两个原序列升序排列),这样中位数实际上是第(m+n)/2小的数。所以只要解决了第k小数的问题,原问题也得以解决。
首先假设数组A和B的元素个数都大于k/2,我们比较A[k/2-1]和B[k/2-1]两个元素,这两个元素分别表示A的第k/2小的元素和B的 第k/2小的元素。这两个元素比较共有三种情况:>、<和=。如果A[k/2-1]<B[k/2-1],这表示A[0]到A[k /2-1]的元素都在A和B合并之后的前k小的元素中。换句话说,A[k/2-1]不可能大于两数组合并之后的第k小值,所以我们可以将其抛弃。
-------简单的说,就是或者丢弃较大中位数的右区间,或者丢弃较小中位数的左区间。
-----------呵呵,自己想出来的, 可是,还是不自信, 没有code完,就去看别人的提示了。
---------------哎~~~~~~~ 踩到地雷了,以为很简单,没想到是难度为5 的题目------------
public class Solution { public double findMedianSortedArrays(int A[], int B[]) { int m = A.length, n = B.length; assert (!(n == m && n == 0)); // if both array are empty, WRONG int[] C = null;// C point to the non-empty array, while the other is // empty if (m == 0) { C = B; } else if (n == 0) { C = A; } if (C != null) { // if A is empty return (C.length % 2 == 1) ? C[C.length / 2] : (C[C.length / 2] + C[C.length / 2 - 1]) / 2.0; } // NOW neither A or B is empty if ((m + n) % 2 == 1) { return findKth(A, 0, m - 1, B, 0, n - 1, (m + n) / 2 + 1); } else { int leftMedian = findKth(A, 0, m - 1, B, 0, n - 1, (m + n) / 2); int rightMedian = findKth(A, 0, m - 1, B, 0, n - 1, (m + n) / 2 + 1); return (leftMedian + rightMedian) / 2.0; } } private int findKth(int[] a, int lowA, int highA, int[] b, int lowB, int highB, int k) { // GOAL: each time, we either cut the half left of the smaller median // or the right part of the bigger median array if (k == 1) { return Math.min(a[lowA], b[lowB]); } else { // if one array is empty if (lowA > highA) { // A is empty, search in B return b[lowB + k - 1]; } if (lowB > highB) {// B is empty, search in A return a[lowA + k - 1]; } int midA = (lowA + highA) / 2; int midB = (lowB + highB) / 2; // two pointers pointing , according to the value of the median int[] medianSmall = a; int smallLow = lowA; int smallMid = midA; int smallHigh = highA; int[] medianBig = b; int bigLow = lowB; int bigMid = midB; int bigHigh = highB; // make sure that small point to small if (medianSmall[smallMid] > medianBig[bigMid]) { medianSmall = b; medianBig = a; smallLow = lowB; smallMid = midB; smallHigh = highB; bigLow = lowA; bigMid = midA; bigHigh = highA; } // check value of k with the size of their half int smallLeftLength = smallMid - smallLow + 1; // smallMid is counted, and != 0 int bigLeftLength = bigMid - bigLow + 1; // bigMid is counted; if (k <= smallLeftLength + bigLeftLength) { // discard the right part of the medianBig array // NOTE that the length of the right part of the medianBig array // could be ZERO. if (bigMid < bigHigh) { // right part is not of length 0 return findKth(medianSmall, smallLow, smallHigh, medianBig, bigLow, bigMid, k); } else { // right part is of length 0, i.e. bigLow == bigHigh // now,we can (imaginely) insert this value into the // range of medianSmall array int biggerMedian = medianBig[bigMid];// first candicates if (k == smallLeftLength + 1) { return (smallMid == smallHigh) ? biggerMedian : Math .min(biggerMedian, medianSmall[smallMid + 1]); } else { return medianSmall[smallLow + k - 1]; } } } else { // discard the left part of the medianSmall array ( // length>0) return findKth(medianSmall, smallMid + 1, smallHigh, medianBig, bigLow, bigHigh, k - smallLeftLength); } } } }
Mistakes:
TMD: median 的定义是: If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values
1: 因为我们后来在处理一个数组结束之后,只寻找另一个数组的情况的时候。 用了一个唯一的值等于Interger.Max_VALUE。 所以, 当k == 1 时,我们要考虑Interger.MAX_VALUE 对我们的影响。 因此,不能简单得返回 Math.max(a[lowA],b[lowB]) , 要改成代码中所示。
2: 当我面在递归函数里, 处理 k == 1的时候,忘记了先检验,是否数组不为空。
3: 当递归的时候, 需要处理一个情况, 就是, pointerSmall 的左边 和pointerBig的右边,如果为空的话,就会导致out of stack exception.
因此,需要先处理那些情况。
4: 要记得检测, 总数是奇数还是偶数--------导致的计算方法是不一样的
5: 哎呀, TMD,疯了, 就因为,当时, 也不知道怎么想的。可能是半夜,脑子不清醒。
当k==1的时候,我们应该取两个数组里面,各自第一个里的值的较小的呀。
应该是Math.min ------可是自己当时,写的是Math.max 花费了一次半值班的时间, 大约是4,5个小时的宝贵时间。 得不偿失啊。 Tiger
6: when discarding the right part of the medianBig array, we should check first the length of the to-be-discarded part.
因此这种情况下, medianBig 数组的bigHigh == bigSmall,我们就把它的值单独列出来,并对比medianSmall即可。 (否则会出现死循环调用,导致out of stack memory ,)
Learned:
1: 当 一个int 值, 例如,
int a = 3;
a/2 的值,默认还是int类型。 因此,当我们需要返回一个带小数的double类型的话,需要除以2.0
---------------------------第二遍 ---------------------------
思路同上, 只是没有用medianBig, medianSmall来区分罢了。
public class Solution { public double findMedianSortedArrays(int A[], int B[]) { int m = A.length, n = B.length; return (findKth(A,0,m-1,B,0,n-1,(m+n)/2)+findKth(A,0,m-1,B,0,n-1,(m+n)/2+1))/2.0; } public int findKth(int[] A, int lowA,int highA, int[] B, int lowB,int highB,int kth){ if(lowA>highA) return B[lowB+kth-1]; if(lowB>highB) return A[lowA+kth-1]; int leftACount = (lowA+highA)/2 - lowA; int leftBCount = (lowB+highB)/2 - lowB; if( A[(lowA+highA)/2] >= B[(lowB+highB)/2] ){ // median of A >= median of B if(leftACount+leftBCount+1 >= kth)// discard right part of A, including median A return findKth(A,lowA, (lowA+highA)/2 -1,B,lowB,highB,kth); else // discard left part of B return findKth(A,lowA, highA, B, (lowB+highB)/2 +1, highB, kth - leftBCount -1 ); }else{ // median of A < median of B if(leftACount+leftBCount+1 >= kth)// discard right part of B, including median B return findKth(A,lowA, highA ,B,lowB,(lowB+highB)/2-1,kth); else // discard left part of A return findKth(A,(lowA+highA)/2+1, highA, B, lowB, highB, kth - leftACount -1 ); } } }Mistakes:
1: kth这个参数, 第一遍是按照0 based考虑的。 那么, 在对比leftCount+rightCount 和 kth的大小的时候, 就不需要加1了。 ----------(这个问题很快发现, 然后就调整了)
2:由于在 if 语句中(20行左右),写 (lowB+highB)/2] 的时候,拷贝了上面, 然后发现多了个 ‘】’, 就随手把if (这里的】 也去掉了, ) complier error, 就又随手加上了 ] ,。 结果加到/2 后面去了 。 哎, 结果 傻看了40多分钟,才弄明白。 哎, 人生阿~~~~~~~
状态不好的时候, 就不要去coding, 犯的SB错误,都会搞死我的时间
############ 3 rd pass##################
public class Solution { public double findMedianSortedArrays(int[] nums1, int[] nums2) { int m = nums1.length, n = nums2.length; return ( helper(nums1,0,m-1,nums2,0,n-1,(m+n)/2 ) + helper(nums1,0,m-1,nums2,0,n-1,(m+n-1)/2) )/2.0; } public int helper(int[] A, int startA, int endA, int B[], int startB, int endB, int k ){ // find the kth in A and B ( k start from ==0 ) if(startA > endA) return B[startB+k]; if(startB > endB) return A[startA + k]; int midA = (startA+endA)/2, midB = (startB+endB)/2; int nLeft = midA-startA + midB - startB +1; // inclusive the smaller middle value if( A[midA] <= B[midB]){ if( nLeft >= k+1 ){// exclude the right part of B, include midB return helper(A,startA,endA, B, startB,midB-1,k); }else{//delete the left part of A, include midA return helper(A,midA+1, endA, B, startB, endB, k-(midA-startA+1)); } }else{ // switch A and B return helper(B,startB,endB, A, startA, endA, k); } } }
改进是, 倒数第二个if, 交换了A,B 两个数组,这样,少写一遍最容易出错的代码
Mistakes:
这里最容易犯的错误, 第一个if里,开始写成了 nLeft >= index
No comments:
Post a Comment