0187 - Repeated DNA Subsequences
A DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.
For example, "ACGAATTCCG" is a DNA sequence.
When studying DNA, it is useful to identify repeated sequences within the DNA.
Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.
Examples
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" Output: ["AAAAACCCCC","CCCCCAAAAA"]
Input: s = "AAAAAAAAAAAAA" Output: ["AAAAAAAAAA"]
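The second example relies on overlapping windows: a 13-character string contains 13 - 10 + 1 = 4 windows of length 10, and here all four are identical. The following is a minimal sketch of that enumeration; the class and method names (WindowSketch, windows) are hypothetical and not part of the original solution.

```java
import java.util.ArrayList;
import java.util.List;

class WindowSketch {
    // Returns every substring of the given length, in left-to-right order.
    // Consecutive windows overlap: each one starts one character after the previous.
    static List<String> windows(String s, int length) {
        List<String> result = new ArrayList<>();
        for (int start = 0; start + length <= s.length(); start++) {
            result.add(s.substring(start, start + length));
        }
        return result;
    }
}
```

Calling WindowSketch.windows("AAAAAAAAAAAAA", 10) yields four copies of "AAAAAAAAAA", which is why that sequence counts as occurring more than once. A string shorter than 10 characters yields no windows at all.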
Constraints
1 <= s.length <= 10^5
s[i] is either 'A', 'C', 'G', or 'T'.
Java Solution
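import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        int L = 10, n = s.length(); // L is the fixed window length
        // seen: every window encountered so far; output: windows encountered at least twice
        Set<String> seen = new HashSet<>(), output = new HashSet<>();
        for (int start = 0; start < n - L + 1; ++start) {
            String tmp = s.substring(start, start + L);
            if (seen.contains(tmp)) output.add(tmp);
            seen.add(tmp);
        }
        return new ArrayList<String>(output);
    }
}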
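The solution above stores a 10-character string for every window. One possible alternative, sketched here as an assumption rather than as part of the original solution, packs each window into an int: with four nucleotides, each character fits in 2 bits, so a 10-letter window fits in 20 bits. The class name BitmaskDnaSketch, the helper code, and the particular 2-bit assignment are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BitmaskDnaSketch {
    private static final int L = 10;                     // window length from the problem statement
    private static final int MASK = (1 << (2 * L)) - 1;  // keeps only the lowest 20 bits

    // Maps a nucleotide to a 2-bit code; the assignment itself is arbitrary.
    private static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // 'T'; input is assumed to contain only A, C, G, T
        }
    }

    static List<String> findRepeatedDnaSequences(String s) {
        Set<Integer> seen = new HashSet<>();   // encoded windows encountered so far
        Set<String> output = new HashSet<>();  // windows encountered at least twice
        int window = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new character and drop the one that left the window.
            window = ((window << 2) | code(s.charAt(i))) & MASK;
            // The first full window ends at index L - 1; add() returns false on a repeat.
            if (i >= L - 1 && !seen.add(window)) {
                output.add(s.substring(i - L + 1, i + 1));
            }
        }
        return new ArrayList<>(output);
    }
}
```

With this approach a substring is built only when a repeat is detected, and the set of seen windows holds boxed integers instead of strings. Whether that is worth the extra code depends on the input size; for short inputs the string-based version is simpler to read.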