Minimum Window Substring the time complexity of my solution

Minimum Window Substring
This is a problem from LeetCode: https://leetcode.com/problems/minimum-window-substring/
I found a solution based on the sliding window algorithm, but I cannot figure out its time complexity. Some people say it is O(N), but I don't think it is. Please help me, thanks!
public class Solution {
// Minimum Window Algorithm, the algorithm must fit for specific problem, this problem is diff from ...words
// 348ms
public String minWindow(String s, String t) {
int N = s.length(), M = t.length(), count = 0;
String res = "";
if (N < M || M == 0) return res;
int[] lib = new int[256], cur = new int[256]; // assumes every character code is below 256 (extended ASCII)
for (int i = 0; i < M; lib[t.charAt(i++)]++); // count each characters in t
for (int l = 0, r = 0; r < N; r++) {
char c = s.charAt(r);
if (lib[c] != 0) {
cur[c]++;
if (cur[c] <= lib[c]) count++;
if (count == M) {
char tmp = s.charAt(l);
while (lib[tmp] == 0 || cur[tmp] > lib[tmp]) {
cur[tmp]--;
tmp = s.charAt(++l);
}
if (res.length() == 0 || r - l + 1 < res.length())
res = s.substring(l, r + 1);
count--; // these three lines drop the leftmost required character so the window can slide on and look for a shorter match
cur[s.charAt(l)]--;
l++;
}
}
}
return res;
}
}

The outer loop runs N times, once for each position r in s.
The while loop does O(N) work in total over the whole run: l only ever increases, so there are at most N iterations that execute ++l, plus at most N evaluations of the while condition that fail immediately.
So the overall complexity is linear, if we ignore s.substring.
Note that the substring call should be moved out of the loop: keep only the best index pair during the scan and take the substring once at the very end. In Java versions where substring copies its characters, calling it inside the loop can cost up to O(N) per call.
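A minimal sketch of that last point, assuming the same 256-entry count array as the question (so it only handles characters with codes below 256). The class name MinWindowIndices and the variables bestL and bestLen are illustrative and not from the original post.

```java
class MinWindowIndices {
    // Same sliding window idea, but only the best (start, length) pair is
    // tracked during the scan; substring() is called once at the end.
    static String minWindow(String s, String t) {
        if (t.isEmpty() || s.length() < t.length()) return "";
        int[] need = new int[256];          // assumption: all char codes are < 256
        for (int i = 0; i < t.length(); i++) need[t.charAt(i)]++;
        int missing = t.length();           // characters of t not yet covered
        int bestL = 0, bestLen = Integer.MAX_VALUE;
        for (int l = 0, r = 0; r < s.length(); r++) {
            // need[c] goes negative for surplus characters in the window
            if (need[s.charAt(r)]-- > 0) missing--;
            while (missing == 0) {          // window [l, r] covers t: try to shrink it
                if (r - l + 1 < bestLen) { bestLen = r - l + 1; bestL = l; }
                if (++need[s.charAt(l++)] > 0) missing++;
            }
        }
        return bestLen == Integer.MAX_VALUE ? "" : s.substring(bestL, bestL + bestLen);
    }
}
```

Both l and r only move forward, so the two loops together perform at most 2N steps, and the single substring call adds at most N more.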

check out my solution:
public class Solution {
public String minWindow(String S, String T) {
Map<Character, Integer> pattern = new HashMap<Character, Integer>();
Map<Character, Integer> cur = new HashMap<Character, Integer>();
Queue<Integer> queue = new LinkedList<Integer>();
int min = Integer.MAX_VALUE;
int begin = 0, end = 0;
// fill in pattern by T
for(int i = 0;i < T.length();i++) addToMap(pattern, T.charAt(i));
// initialize current set
for(int i = 0;i < T.length();i++) cur.put(T.charAt(i), 0);
// go through S to match the pattern by minimum length
for(int i = 0;i < S.length();i++){
if(pattern.containsKey(S.charAt(i))){
queue.add(i);
addToMap(cur, S.charAt(i));
// check if pattern is matched
while(isMatch(pattern, cur)){ /* Important Code! */
if(i - queue.peek() < min){
min = i - queue.peek();
begin = queue.peek();
end = i+1;
}
cur.put(S.charAt(queue.peek()), cur.get(S.charAt(queue.peek()))-1);
queue.poll();
}
}
}
return end > begin?S.substring(begin, end):"";
}
private void addToMap(Map<Character, Integer> map, Character c){
if(map.containsKey(c))
map.put(c, map.get(c)+1);
else
map.put(c,1);
}
private boolean isMatch(Map<Character, Integer> p, Map<Character, Integer> cur){
for(Map.Entry<Character, Integer> entry: p.entrySet())
if(cur.get((char)entry.getKey()) < (int)entry.getValue()) return false;
return true;
}
}

Related

How could I improve the speed/performance for this problem, Java

I saw this challenge for beginners on https://www.topcoder.com/ and I really want to complete it. I have come close after many failed attempts, but now I am stuck and don't know what else to try. Here is what I mean.
Question:
Read the input one line at a time and output the current line if and only if you have already read at least 1000 lines greater than the current line and at least 1000 lines less than the current line. (Again, greater than and less than are with respect to the ordering defined by String.compareTo().)
Link to the Challenge
My Solution:
public static void doIt(BufferedReader r, PrintWriter w) throws IOException {
SortedSet<String> linesThatHaveBeenRead = new TreeSet<>();
int lessThan =0;
int greaterThan =0;
Iterator<String> itr;
for (String currentLine = r.readLine(); currentLine != null; currentLine = r.readLine()){
itr = linesThatHaveBeenRead.iterator();
while(itr.hasNext()){
String theCurrentLineInTheSet = itr.next();
// compareTo only guarantees the sign of its result, not the values -1 and 1
if(theCurrentLineInTheSet.compareTo(currentLine) < 0)++lessThan;
else if(theCurrentLineInTheSet.compareTo(currentLine) > 0)++greaterThan;
}
if(lessThan >= 1000 && greaterThan >= 1000){
w.println(currentLine);
}
// reset the counters for every line, not only after a match
lessThan = 0;
greaterThan =0;
linesThatHaveBeenRead.add(currentLine);
}
}
PROBLEM
I think the problem with my solution is the nested loops, which make it a lot slower, but I've tried other approaches and none worked. At this point I'm stuck. The whole point of this challenge is to pick the most suitable data structure for the problem.
GOAL:
The goal is to use the most efficient data structure for this problem.

Let me try to present just an accessible refinement of what to do.
public static void
doIt(java.io.BufferedReader r, java.io.PrintWriter w)
throws java.io.IOException {
feedNonExtremes(r, (line) -> { w.println(line);}, 1000, 1000);
}
/** Read <code>r</code> one line at a time and
* output the current line if and only if there already were<br/>
* at least <code>nHigh</code> lines greater than the current line <br/>
* and at least <code>nLow</code> lines less than the current line.<br/>
* @param r to read lines from
* @param sink to feed lines to
* @param nLow number of lines comparing too small to process
* @param nHigh number of lines comparing too great to process
*/
static void feedNonExtremes(java.io.BufferedReader r,
Consumer<String> sink, int nLow, int nHigh) {
// collect nLow+nHigh lines into firstLowHigh; instantiate
// - a PriorityQueue(firstLowHigh) highest
// - a PriorityQueue(nLow, (a, b) -> b.compareTo(a)) lowest
// remove() nLow elements from highest and insert each into lowest
// for each remaining line
// if greater than the head of highest
// add to highest and remove head
// else if smaller than the head of lowest
// add to lowest and remove head
// else feed to sink
}
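The outline above can be sketched with two java.util.PriorityQueue instances. This is a simplified variant under stated assumptions, not the answer's exact procedure: every line is offered to both heaps and the heaps are trimmed to size, instead of pre-filling them from the first nLow+nHigh lines. The class name NonExtremes and the method filter are hypothetical, the input is taken as an Iterable<String> rather than a BufferedReader to keep the sketch self-contained, and nLow and nHigh are assumed to be at least 1.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class NonExtremes {
    // Returns the lines that, at the moment they were read, already had at
    // least nLow strictly smaller and nHigh strictly greater lines before them.
    static List<String> filter(Iterable<String> lines, int nLow, int nHigh) {
        // min-heap holding the nHigh greatest lines seen so far
        PriorityQueue<String> highest = new PriorityQueue<>();
        // max-heap holding the nLow smallest lines seen so far
        PriorityQueue<String> lowest = new PriorityQueue<>(Comparator.reverseOrder());
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            // the heads are the nHigh-th greatest and nLow-th smallest lines
            if (highest.size() == nHigh && lowest.size() == nLow
                    && line.compareTo(highest.peek()) < 0
                    && line.compareTo(lowest.peek()) > 0) {
                out.add(line);
            }
            highest.add(line);
            if (highest.size() > nHigh) highest.poll(); // evict the smallest
            lowest.add(line);
            if (lowest.size() > nLow) lowest.poll();    // evict the greatest
        }
        return out;
    }
}
```

Each line costs a constant number of heap operations on heaps of bounded size, so the work per line no longer grows with the number of lines already read.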

I made you a little example using binary search, in Java. It only runs the binary search when newLine falls inside the range kept in one of the sorted boundary lists.
public static void main(String[] args) {
// Create random lines
ArrayList<String> lines = new ArrayList<String>();
Random rn = new Random();
for (int i = 0; i < 50000; i++) {
int lenght = rn.nextInt(100);
char[] newString = new char[lenght];
for (int j = 0; j < lenght; j++) {
newString[j] = (char) rn.nextInt(255);
}
lines.add(new String(newString));
}
// Here starts logic
ArrayList<String> lowerCompared = new ArrayList<String>();
ArrayList<String> higherCompared = new ArrayList<String>();
int lowBoundry = 1000, highBoundry = 1000;
int k = 0;
int firstLimit = Math.min(lowBoundry, highBoundry);
// the first firstLimit lines are inserted, sorted, into both lists
for (; k < firstLimit; k++) {
int index = Collections.binarySearch(lowerCompared, lines.get(k));
if (index < 0)
index = ~index;
lowerCompared.add(index, lines.get(k));
higherCompared.add(index, lines.get(k));
}
for (; k < lines.size(); k++) {
String newLine = lines.get(k);
boolean lowBS = newLine.compareTo(lowerCompared.get(lowBoundry - 1)) < 0;
boolean highBS = newLine.compareTo(higherCompared.get(0)) > 0;
if (lowerCompared.size() == lowBoundry && higherCompared.size() == highBoundry && !lowBS && !highBS) {
System.out.println("Time to print: " + newLine);
continue;
}
if (lowBS) {
int lowerIndex = Collections.binarySearch(lowerCompared, newLine);
if (lowerIndex < 0)
lowerIndex = ~lowerIndex;
lowerCompared.add(lowerIndex, newLine);
if (lowerCompared.size() > lowBoundry)
lowerCompared.remove(lowBoundry);
}
if (highBS) {
int higherIndex = Collections.binarySearch(higherCompared, newLine);
if (higherIndex < 0)
higherIndex = ~higherIndex;
higherCompared.add(higherIndex, newLine);
if (higherCompared.size() > highBoundry)
higherCompared.remove(0);
}
}
}

You need to implement binary search and also handle duplicates.
Here is a code sample that does what you want (it may contain bugs).
public class CheckRead1000 {
public static void main(String[] args) {
// generate strings in reverse order to get the worst case
List<String> aaa = new ArrayList<String>();
for (int i = 50000; i > 0; i--) {
aaa.add("some string 123456789" + i);
}
// fast solution
ArrayList<String> sortedLines = new ArrayList<>();
long st1 = System.currentTimeMillis();
for (String a : aaa) {
checkIfRead1000MoreAndLess(sortedLines, a);
}
System.out.println(System.currentTimeMillis() - st1);
// doIt solution
TreeSet<String> linesThatHaveBeenRead = new TreeSet<>();
long st2 = System.currentTimeMillis();
for (String a : aaa) {
doIt(linesThatHaveBeenRead, a);
}
System.out.println(System.currentTimeMillis() - st2);
}
// solution doIt
public static void doIt(SortedSet<String> linesThatHaveBeenRead, String currentLine) {
int lessThan = 0;
int greaterThan = 0;
Iterator<String> itr = linesThatHaveBeenRead.iterator();
while (itr.hasNext()) {
String theCurrentLineInTheSet = itr.next();
// compareTo only guarantees the sign of its result, not the values -1 and 1
if (theCurrentLineInTheSet.compareTo(currentLine) < 0) ++lessThan;
else if (theCurrentLineInTheSet.compareTo(currentLine) > 0) ++greaterThan;
}
if (lessThan >= 1000 && greaterThan >= 1000) {
// System.out.println(currentLine);
lessThan = 0;
greaterThan = 0;
}
linesThatHaveBeenRead.add(currentLine);
}
// returns true if we have already read at least 1000 strings greater than and at least 1000 strings less than the new string
private static boolean checkIfRead1000MoreAndLess(List<String> sortedLines, String newLine) {
//adding string to list and calculating its index and the last search range
int indexes[] = addNewString(sortedLines, newLine);
int index = indexes[0]; // index of element
int low = indexes[1];
int high = indexes[2];
//we need to check if this string already was in list for instance
// 1,2,3,4,5,5,5,5,5,6,7 for 5 we need to count 'less' as 4 and 'more' is 2
int highIndex = index;
for (int i = highIndex + 1; i < high; i++) {
if (sortedLines.get(i).equals(newLine)) {
highIndex++;
} else {
//no more duplicates
break;
}
}
int lowIndex = index;
for (int i = lowIndex - 1; i > low; i--) {
if (sortedLines.get(i).equals(newLine)) {
lowIndex--;
} else {
//no more duplicates
break;
}
}
// just calculating how many we did read more and less
if (sortedLines.size() - highIndex - 1 >= 1000 && lowIndex >= 1000) { // "at least 1000" on each side
return true;
}
return false;
}
// simple binary search will insert string and return its index and ranges in sorted list
// first int is index,
// second int is start of range - will be used to find duplicates,
// third int is end of range - will be used to find duplicates,
private static int[] addNewString(List<String> sortedLines, String newLine) {
if (sortedLines.isEmpty()) {
sortedLines.add(newLine);
return new int[]{0, 0, 0};
}
// int index = Integer.MAX_VALUE;
int low = 0;
int high = sortedLines.size() - 1;
int mid = 0;
while (low <= high) {
mid = (low + high) / 2;
if (sortedLines.get(mid).compareTo(newLine) < 0) {
low = mid + 1;
} else if (sortedLines.get(mid).compareTo(newLine) > 0) {
high = mid - 1;
} else if (sortedLines.get(mid).compareTo(newLine) == 0) {
// index = mid;
break;
}
if (low > high) {
mid = low;
}
}
if (mid == sortedLines.size()) {
sortedLines.add(newLine);
} else {
sortedLines.add(mid, newLine);
}
return new int[]{mid, low, high};
}
}

Most efficient way to search for unknown patterns in a string?

I am trying to find patterns that:
occur more than once
are more than 1 character long
are not substrings of any other known pattern
without knowing any of the patterns that might occur.
For example:
The string "the boy fell by the bell" would return 'ell', 'the b', 'y '.
The string "the boy fell by the bell, the boy fell by the bell" would return 'the boy fell by the bell'.
Using double for-loops, it can be brute forced very inefficiently:
ArrayList<String> patternsList = new ArrayList<>();
int length = string.length();
for (int i = 0; i < length; i++) {
int limit = (length - i) / 2;
for (int j = limit; j >= 1; j--) {
int candidateEndIndex = i + j;
String candidate = string.substring(i, candidateEndIndex);
if(candidate.length() <= 1) {
continue;
}
if (string.substring(candidateEndIndex).contains(candidate)) {
boolean notASubpattern = true;
for (String pattern : patternsList) {
if (pattern.contains(candidate)) {
notASubpattern = false;
break;
}
}
if (notASubpattern) {
patternsList.add(candidate);
}
}
}
}
However, this is incredibly slow when searching large strings with tons of patterns.

You can build a suffix tree for your string in linear time:
https://en.wikipedia.org/wiki/Suffix_tree
The patterns you are looking for are the strings corresponding to internal nodes that have only leaf children.

You could use n-grams to find patterns in a string. It would take O(n) time to scan the string for n-grams. When you find a substring by using a n-gram, put it into a hash table with a count of how many times that substring was found in the string. When you're done searching for n-grams in the string, search the hash table for counts greater than 1 to find recurring patterns in the string.
For example, in the string "the boy fell by the bell, the boy fell by the bell", a 6-gram will find the substring "the boy fell by the bell" (provided punctuation such as the comma is stripped first, so that both occurrences produce the same key). The hash table entry for that substring will have a count of 2 because it occurs twice in the string. Varying the number of words in the n-gram will help you discover different patterns in the string.
// C#; str is the input string. Requires: using System.Collections.Generic;
Dictionary<string, int> dict = new Dictionary<string, int>();
int count = 0;
const int ngramSize = 6;
// Add entries to the hash table
while (count < str.Length) {
// advance count past the next ngramSize words
int start = count;
int wordsLeft = ngramSize;
while (wordsLeft > 0 && count < str.Length) {
if (str[count] == ' ')
wordsLeft--;
count++;
}
// Trim returns a new string; it removes the trailing blank
string substring = str.Substring(start, count - start).Trim();
// Update the dictionary (hash table) with the substring
if (dict.ContainsKey(substring)) // substring is already in the hash table, so increment the count
dict[substring]++;
else
dict[substring] = 1;
}
// Find the most commonly occurring pattern in the string
// by searching the hash table for the greatest count.
int maxCount = 0;
string mostCommonPattern = "";
foreach (KeyValuePair<string, int> pair in dict) {
if (pair.Value > maxCount) {
maxCount = pair.Value;
mostCommonPattern = pair.Key;
}
}

I've written this just for fun. I hope I have understood the problem correctly, this is valid and fast enough; if not, please be easy on me :) I might optimize it a little more I guess, if someone finds it useful.
private static IEnumerable<string> getPatterns(string txt)
{
char[] arr = txt.ToArray();
BitArray ba = new BitArray(arr.Length);
for (int shingle = getMaxShingleSize(arr); shingle >= 2; shingle--)
{
char[] arr1 = new char[shingle];
int[] indexes = new int[shingle];
HashSet<int> hs = new HashSet<int>();
Dictionary<int, int[]> dic = new Dictionary<int, int[]>();
for (int i = 0, count = arr.Length - shingle; i <= count; i++)
{
for (int j = 0; j < shingle; j++)
{
int index = i + j;
arr1[j] = arr[index];
indexes[j] = index;
}
int h = getHashCode(arr1);
if (hs.Add(h))
{
int[] indexes1 = new int[indexes.Length];
Buffer.BlockCopy(indexes, 0, indexes1, 0, indexes.Length * sizeof(int));
dic.Add(h, indexes1);
}
else
{
bool exists = false;
foreach (int index in indexes)
if (ba.Get(index))
{
exists = true;
break;
}
if (!exists)
{
int[] indexes1 = dic[h];
if (indexes1 != null)
foreach (int index in indexes1)
if (ba.Get(index))
{
exists = true;
break;
}
}
if (!exists)
{
foreach (int index in indexes)
ba.Set(index, true);
int[] indexes1 = dic[h];
if (indexes1 != null)
foreach (int index in indexes1)
ba.Set(index, true);
dic[h] = null;
yield return new string(arr1);
}
}
}
}
}
private static int getMaxShingleSize(char[] arr)
{
for (int shingle = 2; shingle <= arr.Length / 2 + 1; shingle++)
{
char[] arr1 = new char[shingle];
HashSet<int> hs = new HashSet<int>();
bool noPattern = true;
for (int i = 0, count = arr.Length - shingle; i <= count; i++)
{
for (int j = 0; j < shingle; j++)
arr1[j] = arr[i + j];
int h = getHashCode(arr1);
if (!hs.Add(h))
{
noPattern = false;
break;
}
}
if (noPattern)
return shingle - 1;
}
return -1;
}
private static int getHashCode(char[] arr)
{
unchecked
{
int hash = (int)2166136261;
foreach (char c in arr)
hash = (hash * 16777619) ^ c.GetHashCode();
return hash;
}
}
Edit
My previous code has serious problems. This one is better:
private static IEnumerable<string> getPatterns(string txt)
{
Dictionary<int, int> dicIndexSize = new Dictionary<int, int>();
for (int shingle = 2, count0 = txt.Length / 2 + 1; shingle <= count0; shingle++)
{
Dictionary<string, int> dic = new Dictionary<string, int>();
bool patternExists = false;
for (int i = 0, count = txt.Length - shingle; i <= count; i++)
{
string sub = txt.Substring(i, shingle);
if (!dic.ContainsKey(sub))
dic.Add(sub, i);
else
{
patternExists = true;
int index0 = dic[sub];
if (index0 >= 0)
{
dicIndexSize[index0] = shingle;
dic[sub] = -1;
}
}
}
if (!patternExists)
break;
}
List<int> lst = dicIndexSize.Keys.ToList();
lst.Sort((a, b) => dicIndexSize[b].CompareTo(dicIndexSize[a]));
BitArray ba = new BitArray(txt.Length);
foreach (int i in lst)
{
bool ok = true;
int len = dicIndexSize[i];
for (int j = i, max = i + len; j < max; j++)
{
if (ok) ok = !ba.Get(j);
ba.Set(j, true);
}
if (ok)
yield return txt.Substring(i, len);
}
}
The text of this book took 3.4 sec on my computer.

Suffix arrays are the right idea, but there's a non-trivial piece missing, namely, identifying what are known in the literature as "supermaximal repeats". Here's a GitHub repo with working code: https://github.com/eisenstatdavid/commonsub . Suffix array construction uses the SAIS library, vendored in as a submodule. The supermaximal repeats are found using a corrected version of the pseudocode from findsmaxr in Efficient repeat finding via suffix arrays
(Becher–Deymonnaz–Heiber).
static void FindRepeatedStrings(void) {
// findsmaxr from https://arxiv.org/pdf/1304.0528.pdf
printf("[");
bool needComma = false;
int up = -1;
for (int i = 1; i < Len; i++) {
if (LongCommPre[i - 1] < LongCommPre[i]) {
up = i;
continue;
}
if (LongCommPre[i - 1] == LongCommPre[i] || up < 0) continue;
for (int k = up - 1; k < i; k++) {
if (SufArr[k] == 0) continue;
unsigned char c = Buf[SufArr[k] - 1];
if (Set[c] == i) goto skip;
Set[c] = i;
}
if (needComma) {
printf("\n,");
}
printf("\"");
for (int j = 0; j < LongCommPre[up]; j++) {
unsigned char c = Buf[SufArr[up] + j];
if (iscntrl(c)) {
printf("\\u%.4x", c);
} else if (c == '\"' || c == '\\') {
printf("\\%c", c);
} else {
printf("%c", c);
}
}
printf("\"");
needComma = true;
skip:
up = -1;
}
printf("\n]\n");
}
Here's a sample output on the text of the first paragraph:
Davids-MBP:commonsub eisen$ ./repsub input
["\u000a"
," S"
," as "
," co"
," ide"
," in "
," li"
," n"
," p"
," the "
," us"
," ve"
," w"
,"\""
,"–"
,"("
,")"
,". "
,"0"
,"He"
,"Suffix array"
,"`"
,"a su"
,"at "
,"code"
,"com"
,"ct"
,"do"
,"e f"
,"ec"
,"ed "
,"ei"
,"ent"
,"ere's a "
,"find"
,"her"
,"https://"
,"ib"
,"ie"
,"ing "
,"ion "
,"is"
,"ith"
,"iv"
,"k"
,"mon"
,"na"
,"no"
,"nst"
,"ons"
,"or"
,"pdf"
,"ri"
,"s are "
,"se"
,"sing"
,"sub"
,"supermaximal repeats"
,"te"
,"ti"
,"tr"
,"ub "
,"uffix arrays"
,"via"
,"y, "
]

I would use Knuth–Morris–Pratt algorithm (linear time complexity O(n)) to find substrings. I would try to find the largest substring pattern, remove it from the input string and try to find the second largest and so on. I would do something like this:
String pattern = input.substring(0, input.length() / 2);
String toMatchString = input.substring(pattern.length());
List<String> matches = new ArrayList<String>();
while(pattern.length() > 0)
{
int index = KMP(pattern, toMatchString);
if(index >= 0) // 0 is a valid match position
{
matches.add(pattern);
// remove the matched pattern occurrences from the input string
// I would do something like this:
// 0 to pattern.length() gets removed
// check for all occurrences of pattern in toMatchString and remove them
// get the remaining shrunken input, reassign values for pattern & toMatchString
// keep looking for the next largest substring
}
else
{
pattern = input.substring(0, pattern.length() - 1);
toMatchString = input.substring(pattern.length());
}
}
Here KMP implements the Knuth–Morris–Pratt algorithm. You can find Java implementations of it at GitHub or Princeton, or write it yourself.
PS: I don't code in Java, and this is a quick attempt because my first bounty is about to close. So please go easy on me if I missed something trivial or made an off-by-one error.
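For reference, a minimal sketch of such a KMP helper, using the same argument order as the call KMP(pattern, toMatchString) above. The class name Kmp and the method name indexOf are illustrative; it returns -1 when there is no match, which is why a result of 0 has to count as a hit.

```java
class Kmp {
    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(String pattern, String text) {
        int m = pattern.length();
        if (m == 0) return 0;
        // fail[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of pattern[0..i]
        int[] fail = new int[m];
        for (int i = 1, k = 0; i < m; i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) k = fail[k - 1];
            if (pattern.charAt(i) == pattern.charAt(k)) k++;
            fail[i] = k;
        }
        // scan the text once; k is the number of pattern characters matched so far
        for (int i = 0, k = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) k = fail[k - 1];
            if (text.charAt(i) == pattern.charAt(k)) k++;
            if (k == m) return i - m + 1;
        }
        return -1;
    }
}
```

The text index never moves backwards, which gives the linear running time mentioned above (O(n + m) including the failure table).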

Finding shortest possible substring that contains a String

This was a question asked in a recent programming interview.
Given a random string S and another string T with unique elements, find the minimum consecutive sub-string of S such that it contains all the elements in T.
Say,
S='adobecodebanc'
T='abc'
Answer='banc'
I've come up with a solution,
public static String completeSubstring(String T, String S){
String minSub = T;
StringBuilder sb = new StringBuilder();
for (int i = 0; i <T.length()-1; i++) {
for (int j = i + 1; j <= T.length() ; j++) {
String sub = T.substring(i,j);
if(stringContains(sub, S)){
if(sub.length() < minSub.length()) minSub = sub;
}
}
}
return minSub;
}
private static boolean stringContains(String t, String s){
//if(t.length() <= s.length()) return false;
int[] arr = new int[256];
for (int i = 0; i <t.length() ; i++) {
char c = t.charAt(i);
arr[c -'a'] = 1;
}
boolean found = true;
for (int i = 0; i <s.length() ; i++) {
char c = s.charAt(i);
if(arr[c - 'a'] != 1){
found = false;
break;
}else continue;
}
return found;
}
This algorithm has O(n^3) complexity, which naturally isn't great. Can someone suggest a better algorithm?

Here's the O(N) solution.
The important thing to note re: complexity is that each unit of work involves incrementing either start or end, they don't decrease, and the algorithm stops before they both get to the end.
public static String findSubString(String s, String t)
{
//algorithm moves a sliding "current substring" through s
//in this map, we keep track of the number of occurrences of
//each target character there are in the current substring
Map<Character,int[]> counts = new HashMap<>();
for (char c : t.toCharArray())
{
counts.put(c,new int[1]);
}
//how many target characters are missing from the current substring
//current substring is initially empty, so all of them
int missing = counts.size();
//don't waste my time
if (missing<1)
{
return "";
}
//best substring found
int bestStart = -1, bestEnd = -1;
//current substring
int start=0, end=0;
while (end<s.length())
{
//expand the current substring at the end
int[] cnt = counts.get(s.charAt(end++));
if (cnt!=null)
{
if (cnt[0]==0)
{
--missing;
}
cnt[0]+=1;
}
//while the current substring is valid, remove characters
//at the start to see if a shorter substring that ends at the
//same place is also valid
while(start<end && missing<=0)
{
//current substring is valid
if (end-start < bestEnd-bestStart || bestEnd<0)
{
bestStart = start;
bestEnd = end;
}
cnt = counts.get(s.charAt(start++));
if (cnt != null)
{
cnt[0]-=1;
if (cnt[0]==0)
{
++missing;
}
}
}
//current substring is no longer valid. we'll add characters
//at the end until we get another valid one
//note that we don't need to add back any start character that
//we just removed, since we already tried the shortest valid string
//that starts at start-1
}
// bestStart stays -1 when no valid substring was found; substring(-1,-1) would throw
return(bestStart>=0 ? s.substring(bestStart,bestEnd) : null);
}

I know that there already is an adequate O(N) complexity answer, but I tried to figure it out on my own without looking it up, just because it's a fun problem to solve and thought I would share. Here's the O(N) solution that I came up with:
public static String completeSubstring(String S, String T){
int min = S.length()+1, index1 = -1, index2 = -1;
ArrayList<ArrayList<Integer>> index = new ArrayList<ArrayList<Integer>>();
HashSet<Character> targetChars = new HashSet<Character>();
for(char c : T.toCharArray()) targetChars.add(c);
//reduce initial sequence to only target chars and keep track of index
//Note that the resultant string does not allow the same char to be consecutive
StringBuilder filterS = new StringBuilder();
for(int i = 0, s = 0 ; i < S.length() ; i++) {
char c = S.charAt(i);
if(targetChars.contains(c)) {
if(s > 0 && filterS.charAt(s-1) == c) {
index.get(s-1).add(i);
} else {
filterS.append(c);
index.add(new ArrayList<Integer>());
index.get(s).add(i);
s++;
}
}
}
//Not necessary to use regex, loops are fine, but for readability's sake
//Note: the pattern below is hard-coded for T = "abc"
String regex = "([abc])((?!\\1)[abc])((?!\\1)(?!\\2)[abc])";
Matcher m = Pattern.compile(regex).matcher(filterS.toString());
for(int i = 0, start = -1, p1, p2, tempMin, charSize = targetChars.size() ; m.find(i) ; i = start+1) {
start = m.start();
ArrayList<Integer> first = index.get(start);
p1 = first.get(first.size()-1);
p2 = index.get(start+charSize-1).get(0);
tempMin = p2-p1;
if(tempMin < min) {
min = tempMin;
index1 = p1;
index2 = p2;
}
}
return S.substring(index1, index2+1);
}
I'm pretty sure the complexity is O(N); please correct me if I'm wrong.

Alternative implementation of the O(N) algorithm proposed by @MattTimmermans, which uses Map<Integer, Integer> to count occurrences and Set<Integer> to store chars from T that are present in the current substring:
public static String completeSubstring(String s, String t) {
Map<Integer, Integer> occ
= t.chars().boxed().collect(Collectors.toMap(c -> c, c -> 0));
Set<Integer> found = new HashSet<>(); // characters from T found in current match
int start = 0; // current match
int bestStart = Integer.MIN_VALUE, bestEnd = -1;
for (int i = 0; i < s.length(); i++) {
int ci = s.charAt(i); // current char
if (!occ.containsKey(ci)) // not from T
continue;
occ.put(ci, occ.get(ci) + 1); // add occurrence
found.add(ci);
for (int j = start; j < i; j++) { // try to reduce current match
int cj = s.charAt(j);
Integer c = occ.get(cj);
if (c != null) {
if (c == 1) { // cannot reduce anymore
start = j;
break;
} else
occ.put(cj, c - 1); // remove occurrence
}
}
if (found.size() == occ.size() // all chars found
&& (i - start < bestEnd - bestStart)) {
bestStart = start;
bestEnd = i;
}
}
return bestStart < 0 ? null : s.substring(bestStart, bestEnd + 1);
}

design a method in Java which receives 2D array and find the most repetitive value for each column

I am designing a Java method that receives a 2D array and finds the most frequent value in each column. The output is a one-dimensional array containing the most repeated value of each column of the 2D array.
It can be summarised like this:
Count the repeated values in each column.
Save the results in an array, where each element is the most repeated value of the corresponding column.
This is the code I started with:
static int choseAction(int[][] Actions, int ColNumber) {
int action = 0;
int c = 0;
int d = 0;
int n = 0;
for (int i = 0; i < Actions.length; i++) {
for (int j = 0; j < Actions[0].length; j++) {
if (Actions[ColNumber][i] == 1) {
c = +1;
} else if (Actions[ColNumber][i] == -1) {
d = +1;
}
else if (Actions[ColNumber][i] == 0) {
n = +1;
}
}
}
action = ActionCompare(c, d, n);
return action;
}
static int ActionCompare(int a, int b, int c) {
int r;
if ((a > b) && (a > c)) {
r = a;
System.out.println("\n cc ");
} else if ((b > a) && (b > c)) {
r = b;
System.out.println("\n dd ");
} else {
r = c;
System.out.println("\n do nn ");
}
return r;
}
My question is: what is an easier way to do this?

Here is an approach from the answer here
Use a HashMap<Integer, Integer>
For multiple occurrences, increment the corresponding value of the integer key;
public static int[] getFrequencies(int[][] Actions){
int[] output = new int[Actions[0].length]; // one result per column
for(int j = 0; j < Actions[0].length; j++){
// fresh map for each column, so counts do not leak between columns
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
for (int[] row : Actions) {
Integer count = map.get(row[j]);
map.put(row[j], count != null ? count+1 : 1); // first occurrence counts as 1
}
Then append the number with maximum frequency from hash map to output array:
output[j] = Collections.max(map.entrySet(),
new Comparator<Map.Entry<Integer, Integer>>() {
@Override
public int compare(Map.Entry<Integer, Integer> o1, Map.Entry<Integer, Integer> o2) {
return o1.getValue().compareTo(o2.getValue());
}
}).getKey().intValue();
}
And return output at the end:
return output;
}
That's it!
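Put together as one unit, the idea might look like the sketch below. It assumes a rectangular array, and it tracks the best value while counting instead of calling Collections.max afterwards. The names ColumnModes and mostFrequentPerColumn are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

class ColumnModes {
    // For each column of a rectangular matrix, returns the value that occurs
    // most often in that column. On a tie, the value that reaches the top
    // count first wins.
    static int[] mostFrequentPerColumn(int[][] m) {
        if (m.length == 0) return new int[0];
        int[] out = new int[m[0].length];
        for (int col = 0; col < out.length; col++) {
            Map<Integer, Integer> freq = new HashMap<>();
            int best = m[0][col], bestCount = 0;
            for (int[] row : m) {
                // merge() inserts 1 or adds 1, and returns the updated count
                int count = freq.merge(row[col], 1, Integer::sum);
                if (count > bestCount) { bestCount = count; best = row[col]; }
            }
            out[col] = best;
        }
        return out;
    }
}
```

For the rows {1, 0, -1}, {1, 0, 0} and {-1, 0, -1} this returns {1, 0, -1}.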

You can sum all the values in a column: the sign of the sum tells you whether there are more 1s or more -1s, but it tells you nothing about the zeros. Count the zeros the way you did before, and your code shrinks to about 2/3 of its length.
public class CounterPro {
public String sumAll(int[] array){
int nn = 0; // neutral numbers (zeros)
int sum = 0; // sum of all values
// counters are local, so the method can be called more than once
for(int i = 0; i < array.length; i++){
sum += array[i];
if(array[i]==0){
nn += 1;
}
}
// pn + mn = array.length - nn and pn - mn = sum
int pn = (array.length - nn + sum)/2; // positive numbers (+1)
int mn = (array.length - sum -nn)/2; // minus numbers (-1)
String r; // result
if(pn > mn && pn > nn){
r = "woaa lots of positive 1";
}
else if(mn > pn && mn > nn){
r = "woaa lots of minus 1";
}
else {
r = "woaa lots of neutral numbers";
}
return r;
}
}

How to get the count of unmatched character in two strings?

I need to get the count of unmatched characters in two strings. For example:
string 1 "hari", string 2 "malar"
Now I need to remove the characters common to both strings: 'a' and 'r' occur in both, so one occurrence of each is removed from each string. String 1 then contains "hi" and string 2 contains "mla".
Remaining count = 5
I tried this code. It works fine when no character repeats within a string, but here 'a' occurs twice in string 2, so my code does not work properly.
for (int i = 0; i < first.length; i++) {
for (int j = 0; j < second.length; j++) {
if(first[i] == second[j])
{
getstrings = new ArrayList<String>();
count=count+1;
Log.d("Matches", "string char that matched "+ first[i] +"==" + second[j]);
}
}
}
int tot=(first.length + second.length) - count;
here first & second refers to
char[] first = nameone.toCharArray();
char[] second = nametwo.toCharArray();
The code works for string 1 "sri" and string 2 "hari", because no character repeats within either string. How can I handle repeated characters?

Here is my solution,
public static void RemoveMatchedCharsInnStrings(String first,String second)
{
StringBuilder a = new StringBuilder(first);
StringBuilder b = new StringBuilder(second);
int i = 0;
while(i < a.length())
{
int j = b.indexOf(String.valueOf(a.charAt(i)));
if(j != -1)
{
// cancel one occurrence from each string; do not advance i,
// because the next character has shifted into position i
a.deleteCharAt(i);
b.deleteCharAt(j);
}
else
{
i++;
}
}
System.out.println(a);
System.out.println(b);
System.out.println(a.length() + b.length());
}
I hope this is what you need. If not, I'll update my answer.

I saw the other answers and thought: There must be a more declarative and composable way of doing this!
There is, but it's far longer...
public static void main(String[] args) {
String first = "hari";
String second = "malar";
Map<Character, Integer> differences = absoluteDifference(characterCountOf(first), characterCountOf(second));
System.out.println(sumOfCounts(differences));
}
public static Map<Character, Integer> characterCountOf(String text) {
Map<Character, Integer> result = new HashMap<Character, Integer>();
for (int i=0; i < text.length(); i++) {
Character c = text.charAt(i);
result.put(c, result.containsKey(c) ? result.get(c) + 1 : 1);
}
return result;
}
public static <K> Set<K> commonKeys(Map<K, ?> first, Map<K, ?> second) {
Set<K> result = new HashSet<K>(first.keySet());
result.addAll(second.keySet());
return result;
}
public static <K> Map<K, Integer> absoluteDifference(Map<K, Integer> first, Map<K, Integer> second) {
Map<K, Integer> result = new HashMap<K, Integer>();
for (K key: commonKeys(first, second)) {
Integer firstCount = first.containsKey(key) ? first.get(key) : 0;
Integer secondCount = second.containsKey(key) ? second.get(key) : 0;
Integer resultCount = Math.max(firstCount, secondCount) - Math.min(firstCount, secondCount);
if (resultCount > 0) result.put(key, resultCount);
}
return result;
}
public static Integer sumOfCounts(Map<?, Integer> map) {
Integer sum = 0;
for (Integer count: map.values()) {
sum += count;
}
return sum;
}
This is the solution I prefer - but it's lot longer. You've tagged the question with Android, so I didn't use any Java 8 features, which would reduce it a bit (but not as much as I would have hoped for).
However it produces meaningful intermediate results. But it's still so much longer :-(
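For comparison, a much shorter sketch of the same counting idea, using a plain int array indexed by character instead of maps. It needs no Java 8 features. The names Unmatched and count are illustrative, and it treats each UTF-16 code unit as one character.

```java
class Unmatched {
    // Number of characters left over after cancelling matching characters
    // pairwise between the two strings.
    static int count(String first, String second) {
        int[] diff = new int[Character.MAX_VALUE + 1]; // one slot per UTF-16 code unit
        for (int i = 0; i < first.length(); i++) diff[first.charAt(i)]++;
        for (int i = 0; i < second.length(); i++) diff[second.charAt(i)]--;
        int total = 0;
        for (int d : diff) total += Math.abs(d); // surplus on either side
        return total;
    }
}
```

For "hari" and "malar" this gives 5, matching the expected result in the question. The trade-off is that the intermediate per-character differences are no longer available as a map.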

Try out this code:
String first = "hari";
String second = "malar";
String tempFirst = "";
String tempSecond = "";
int maxSize = ((first.length() > second.length()) ? (first.length()) : (second.length()));
for (int i = 0; i < maxSize; i++) {
if (i >= second.length()) {
tempFirst += first.charAt(i);
} else if (i >= first.length()) {
tempSecond += second.charAt(i);
} else if (first.charAt(i) != second.charAt(i)) {
tempFirst += first.charAt(i);
tempSecond += second.charAt(i);
}
}
first = tempFirst;
second = tempSecond;

You need to break as soon as the match is found:
public static void main(String[] args) {
String nameone="hari";
String nametwo="malar";
char[] first = nameone.toCharArray();
char[] second = nametwo.toCharArray();
List<String>getstrings=null;
int count=0;
for (int i = 0; i < first.length; i++) {
for (int j = 0; j < second.length; j++) {
if(first[i] == second[j])
{
getstrings = new ArrayList<String>();
count++;
System.out.println("Matches"+ "string char that matched "+ first[i] +"==" + second[j]);
break;
}
}
}
//System.out.println(count);
int tot=(first.length-count )+ (second.length - count);
System.out.println("Remaining after match from both strings:"+tot);
}
prints:
Remaining after match from both strings:5

You are missing two things here.
When the two characters match in the if condition, you need to increment count by 2, not 1, because you are eliminating a character from both strings.
You need to put a break inside the if block, because each character should only be matched against its first occurrence.
I made those two changes to your code below, and now it prints the result you expected.
for (int i = 0; i < first.length; i++) {
for (int j = 0; j < second.length; j++) {
if(first[i] == second[j])
{
count=count+2;
break;
}
}
}
int tot=(first.length + second.length) - count;
System.out.println("Result = "+tot);

You just need to loop over the two strings; whenever characters match, add 2 to the count, then subtract that count from the total length of the two strings.
s = 'hackerhappy'
t = 'hackerrank'
count = 0
for i in range(len(s)):
    for j in range(len(t)):
        if s[i] == t[j]:
            count += 2
            break
char_unmatched = (len(s)+len(t)) - count
char_unmatched contains the number of characters from both strings that have no match.
