Friday, February 27, 2015

187. Repeated DNA Sequences ----M ~

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]
A:
就是像String matching的 Rabin–Karp algorithm。
class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> res;
        unordered_map<string,int> M;
        for(int i =0;i <= int(s.length())-10; i++)
            M[s.substr(i,10)] +=1;
        
        auto iter = M.begin();
        while(iter != M.end()){
            if(iter->second > 1){
                res.push_back(iter->first);
            }
            iter++;
        }
        return res;
    }
};


Mistakes:

auto len = s.length()  return type is size_t, 
which is non-negative, thus len-10 can be negative, but would be treated a huge positive number





No comments:

Post a Comment