How to find longest substring without repeating characters? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am solving a problem on LeetCode. Here is the link to the question:
https://leetcode.com/problems/longest-substring-without-repeating-characters/
Below is my solution, which fails some of the test cases:
abcabcbb -- not correct
pwwkew -- correct
bbbbb -- not correct
Any help would be appreciated. :)
I am also new here, so suggestions on how to improve my problem statement are welcome.
class Solution {
public int lengthOfLongestSubstring(String s) {
int i,max=0;
List<Character> list = new ArrayList<>();
String x = "";
for(i=0;i<s.length();i++)
{
if(!list.contains(s.charAt(i)))
{
x += s.charAt(i);
list.add(s.charAt(i));
System.out.println(x);
}
else
{
if(x != null && x.length() > max)
{
max = x.length();
System.out.println(max);
x = "";
list.clear();
System.out.println("x : "+ x);
System.out.println("list : "+ list);
}
// else
// {
// list.add(s.charAt(i));
// x += s.charAt(i);
// System.out.println("x in else : "+ x);
// System.out.println("list in else : "+ list);
// }
list.add(s.charAt(i));
x += s.charAt(i);
System.out.println("x in else : "+ x);
System.out.println("list in else : "+ list);
}
}
return max;
}
}

Sometimes it's helpful to remain in the problem domain as much as possible. This approach creates a solution before any thought of coding. This approach leaves us with a set of minimally complex logical operations which then require implementation.
First, our initial condition. The columns should be clear: Input (always the same), Current (the current substring without repeating characters), Answer (the current answer in String form) and Logic (the logic applied at this step):
The first iteration starts the same as all the rest: get the next character in Input. Check whether it is in the Current substring and, since it is not, add it to Current. Here we also ask the question: is Answer shorter than Current? If so, set Answer to Current.
Note in the Logic column we are developing operations which we'll need to implement in the solution.
Repeat for second character input (no new operations):
And again for third - (no new operations):
OK, now we find the next CH in the current substring, so we need a new operation: "Remove chars in Current up to but not including CH". Note that "add CH to Current" is done in this case as well. Note also some new logic (Answer was as long as or longer than Current, so "Do Nothing").
And finish things out - no new operations.
So now we reach the end of input and simply ask the question "How long is the Answer" and that is the result.
So now looking at the Logic column we see operations to perform:
// Initial condition
String answer = "";
String current = "";
Let's work completely in Strings to keep things simple; optimization can come later.
Let's define the "next CH (nextChar)" operation:
// Get the ith character (0-based) from 's' as a String.
private static String nextChar(String s, int i) {}
We'll need an operation which "checks if 'Current contains CH'":
// Does the 'current' String contain the 'nextChar' String?
private static boolean currentContainsCh(String current, String nextChar) {}
We'll need to check if current Answer is shorter than Current:
// True if the 'answer' String is shorter in length than the 'current' String.
private static boolean isAnswerShorterThanCurrent(String current, String answer) {}
And the ability to append the nextChar to Current:
// Append the 'nextChar' to the 'current' String and return the String.
private static String addChToCurrent(String current, String nextChar) {}
And finally the ability to remove all characters up to but not including current char in Current:
// @return a String which has all characters removed from 'current' up to but not including 'ch'
private static String removeUpToChar(String current, String ch) {}
So putting it all together (essentially still in the problem domain since we haven't implemented any operations but just projected the problem into structure):
public int lengthOfLongestSubstring(String s) {
String answer = "";
String current = "";
for (int i = 0; i < s.length(); i++) {
String nextChar = nextChar(s,i);
if (currentContainsCh(current, nextChar)) {
current = removeUpToChar(current, nextChar);
}
current = addChToCurrent(current,nextChar);
if (isAnswerShorterThanCurrent(current,answer)) {
answer = new String(current);
}
}
return answer.length();
}
And now implementing the "operations" becomes easier (and fun) since they are isolated and not complex. This is left for you. When implemented it passes the problem.
The next logical step after verifying correctness is to consider optimizations - if needed.
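For readers who want to compare notes on the operations left as an exercise, here is one possible sketch, not the answerer's own implementation. It makes one assumption loudly: "remove up to but not including CH" is read as dropping everything through the earlier occurrence of the character, so that appending the new CH afterwards leaves Current free of repeats. The class name LongestSubstringOps is hypothetical.

```java
final class LongestSubstringOps {

    // Get the ith character (0-based) from 's' as a String.
    private static String nextChar(String s, int i) {
        return s.substring(i, i + 1);
    }

    // Does the 'current' String contain the 'nextChar' String?
    private static boolean currentContainsCh(String current, String nextChar) {
        return current.contains(nextChar);
    }

    // True if the 'answer' String is shorter than the 'current' String.
    private static boolean isAnswerShorterThanCurrent(String current, String answer) {
        return answer.length() < current.length();
    }

    // Append 'nextChar' to 'current' and return the result.
    private static String addChToCurrent(String current, String nextChar) {
        return current + nextChar;
    }

    // ASSUMPTION: drop everything through the earlier occurrence of 'ch',
    // because the caller appends the new 'ch' right afterwards.
    private static String removeUpToChar(String current, String ch) {
        return current.substring(current.indexOf(ch) + ch.length());
    }

    // Same control flow as the method shown in the answer, made static for easy calling.
    static int lengthOfLongestSubstring(String s) {
        String answer = "";
        String current = "";
        for (int i = 0; i < s.length(); i++) {
            String nextChar = nextChar(s, i);
            if (currentContainsCh(current, nextChar)) {
                current = removeUpToChar(current, nextChar);
            }
            current = addChToCurrent(current, nextChar);
            if (isAnswerShorterThanCurrent(current, answer)) {
                answer = current;
            }
        }
        return answer.length();
    }
}
```

Like the original outline, this works on char units and builds many short Strings, so it favours clarity over speed.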

Although the pictures in answer by Andy are useful and mostly correct, the code is sub-optimal.
The code in the question, as well as in both current answers, builds a lot of substrings, using string concatenation. That is detrimental to performance.
Here is an O(n) solution that doesn't build any substrings:
public static int lengthOfLongestSubstring(String s) {
Map<Character, Integer> lastPos = new HashMap<>();
int start = 0, maxLen = 0, i = 0;
for (; i < s.length(); i++) {
Integer pos = lastPos.put(s.charAt(i), i);
if (pos != null && pos >= start) {
if (i > start + maxLen)
maxLen = i - start;
start = pos + 1;
}
}
return (i > start + maxLen ? i - start : maxLen);
}
This works by remembering the last position of each character, so the potential longest substring's starting position can be adjusted to start right after the previous position of a repeating character.
Since HashMap.put(...) is O(1) (amortized), the solution is O(n).
Since it would be nice to see the substring, we can easily modify the code to return the first (see note 1 below) longest substring:
public static String longestSubstring(String s) {
Map<Character, Integer> lastPos = new HashMap<>();
int start = 0, maxStart = 0, maxLen = 0, i = 0;
for (; i < s.length(); i++) {
Integer pos = lastPos.put(s.charAt(i), i);
if (pos != null && pos >= start) {
if (i > start + maxLen) {
maxStart = start;
maxLen = i - start;
}
start = pos + 1;
}
}
return (i > start + maxLen ? s.substring(start) : s.substring(maxStart, maxStart + maxLen));
}
1) To return the last of multiple longest substrings, change both i > to i >=
The above solutions cannot handle strings with Unicode characters from the supplementary planes, e.g. emoji characters.
Emoji such as 😀 and 😁 are stored as the surrogate pairs "\uD83D\uDE00" and "\uD83D\uDE01", so the char value '\uD83D' will be seen as a repeating character.
To make it correctly handle all Unicode characters, we need to change it to:
public static String longestSubstring(String s) {
Map<Integer, Integer> lastPos = new HashMap<>();
int start = 0, maxStart = 0, maxLen = 0, i = 0;
for (int cp; i < s.length(); i += Character.charCount(cp)) {
cp = s.codePointAt(i);
Integer pos = lastPos.put(cp, i);
if (pos != null && pos >= start) {
if (i > start + maxLen) {
maxStart = start;
maxLen = i - start;
}
start = pos + Character.charCount(cp);
}
}
return (i > start + maxLen ? s.substring(start) : s.substring(maxStart, maxStart + maxLen));
}
Test
for (String s : new String[] { "abcabcbb", "pwwkew", "bbbbb", "aab", "abba", "xabycdxefghy", "aXbXcdXefgXh", "😀😁😂😀😁😂" }) {
String substr = longestSubstring(s);
System.out.printf("%s: %s (%d)%n", s, substr, substr.length());
}
Output
abcabcbb: abc (3)
pwwkew: wke (3)
bbbbb: b (1)
aab: ab (2)
abba: ab (2)
xabycdxefghy: abycdxefgh (10)
aXbXcdXefgXh: cdXefg (6)
😀😁😂😀😁😂: 😀😁😂 (6)

Related

Find the most repeated word in a string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Can you give me some pointers on how to find the most frequent word in a String? I cannot use Maps, Lists and so on. I should achieve this using only for loops, if statements and some built-in methods.

Split the String into an array, sort the array, then iterate over the sorted array, counting runs of equal strings and updating the maximal count. Example:
public static void main(String[] args) {
String myStr = "how can I find the most frequent word in an string how can I find how how how string";
String[] splited = myStr.split(" ");
Arrays.sort(splited);
System.out.println(Arrays.toString(splited));
int max = 1; // the first word has already been seen once (0 would be reported for a single-word input)
int count= 1;
String word = splited[0];
String curr = splited[0];
for(int i = 1; i<splited.length; i++){
if(splited[i].equals(curr)){
count++;
}
else{
count =1;
curr = splited[i];
}
if(max<count){
max = count;
word = splited[i];
}
}
System.out.println(max + " x " + word);
}

Sample idea (there are a thousand ways to solve this):
1: A B B C B (< String with words, separated by blanks)
'A' is your start position
2: count the A (1) and save the pos of A (0). You always iterate from pos until the end of the String.
3: continue counting until you iterated over the entire String. When you reached the end of the String save the count by assigning it to another variable (e.g. oldCount).
4: move on to the next word and start counting B's (new position = 1). You are about to count 3 B's. If newer count > older count replace the older count.
5: count the next word and update the position to your current position, which is 3. (which is the last position of the String).
6: you are not gonna update the counter, B is the most used word in the String.

For the purists - just loops and String.
private String mostFrequentWord(String words) {
// Where my current word starts.
int wordStart = 0;
// How many I counted.
int wordCount = 0;
// The currently most frequent.
String word = "";
for (int wordEnd = wordStart; wordEnd <= words.length(); wordEnd++) {
// Is this the end of a word (a space, or the end of the input)?
if (wordEnd == words.length() || words.charAt(wordEnd) == ' ') {
// We have a word! How many times does it occur?
String thisWord = words.substring(wordStart, wordEnd);
// How many times this word occurs.
int thisWordCount = 0;
// Current start of search.
int search = -1;
// Count them. Empty words (from repeated or trailing spaces) are skipped.
// Note: indexOf also counts matches inside longer words.
while (!thisWord.isEmpty() && (search = words.indexOf(thisWord, search + 1)) >= 0) {
thisWordCount += 1;
}
// Is it more frequent?
if (thisWordCount > wordCount) {
// Keep track.
word = thisWord;
wordCount = thisWordCount;
}
// Move start to the next word.
wordStart = wordEnd + 1;
}
}
return word;
}
private void test() {
String words = "Now is the time for all good men to come to the aid of the party";
System.out.println("Most frequent word in \"" + words + "\" is " + mostFrequentWord(words));
}

public static void main(String...strings) {
String para = "Paris in the the spring.Not that that is related.Why are you laughing? Are my my regular expressions THAT bad??";
String[] words = para.split("\\s+");
int finalCount = 0;
int tempCount = 0;
String mostlyUsedWord = null;
for (String word: words) {
tempCount = 0;
for (String w: words) {
if (word.equalsIgnoreCase(w)) {
tempCount++;
}
}
if (tempCount >= finalCount) {
finalCount = tempCount;
mostlyUsedWord = word;
}
}
System.out.println("mostlyUsedWord:: = " + mostlyUsedWord + " ,count:: = " + finalCount);
}

Divide string into several substrings

I have strings that contain only digits. A string would look like "0011112222111000" or "1111111000". I'd like to know how to get an array of substrings, each consisting of a run of a single repeated digit.
For example, if I have the string "00011111122233322211111111110000000", I'd like it to end up in a string array (String[]) which contains ["000","111111","222","333","222","1111111111","0000000"].
This is what I've tried:
for (int i = (innerHierarchy.length()-1); i >= 1; i--) {
Log.e("Point_1", "innerHierarchy " + innerHierarchy.charAt(i));
c = Character.toChars(48 + max);
Log.e("Point_1", "c " + c[0]);
if (innerHierarchy.charAt(i) < c[0] && innerHierarchy.charAt(i - 1) == c[0]) {
Log.e("Point_1", "Start " + string.charAt(i));
o = i;
} else if (innerHierarchy.charAt(i) == c[0] && innerHierarchy.charAt(i - 1) < c[0]) {
Log.e("Point_1", "End " + string.charAt(i));
o1 = i;
string[j] = string.substring(o1,o);
j=j+1;
}
}
But this code won't work if the string looks like "111111000".
Thank you.

I have "00011111122233322211111111110000000" string, I 'd like it to
be in string array(string[]) which contains
["000","111111","222","333","222","1111111111","0000000"]
One approach I can think of right now (O(n); it might not be the most efficient, but it would solve your problem) is to traverse the string of digits ("00011111122233322211111111110000000" in your case)
and, whenever the char at the current position differs from the char at the previous position, cut off everything since the last cut as one string and continue.
(approach)
considering str= "00011111122233322211111111110000000"
//starting from position 1 (ie from 2nd char which is '0')
//which is same as prev character ( i.e 1st char which is '0')
// continue in traversal
// now char at pos 2 which is again '0'
// keep traversing
// but then char at position 3 is 1
// so stop here and
//make substring till here-1 as one string
//so "000" came as one string
//continue in same manner.
code
import java.util.*;
public class A {
public static void main(String []args){
String str = "00011111122233322211111111110000000";
str+='-'; //appended '-' to get last 0000000 as well into answer
//otherwise it misses last string which i guess was your problem
String one_element ="";
int start=0;
for(int i=1;i<str.length();i++){
if(str.charAt(i)== str.charAt(i-1) )
{
}
else{
one_element = str.substring(start,i);
start = i;
System.out.println(one_element);//add one_element into ArrayList if required.
}
}
}
}
I have printed each element here as a string; if you need an array of them, you can simply use an ArrayList and add one_element to it instead of printing.
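As a sketch of that last suggestion, the same single pass can collect the runs into a list and return a String[]. This variant is an illustration rather than the answerer's code: it checks for the end of the input inside the loop instead of appending a '-' sentinel, and the names DigitRuns and splitIntoRuns are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DigitRuns {

    // Splits s into maximal runs of one repeated character, in order.
    static String[] splitIntoRuns(String s) {
        List<String> runs = new ArrayList<>();
        int start = 0; // index where the current run began
        for (int i = 1; i <= s.length(); i++) {
            // A run ends at the end of the input or where the character changes.
            if (i == s.length() || s.charAt(i) != s.charAt(i - 1)) {
                runs.add(s.substring(start, i));
                start = i;
            }
        }
        return runs.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String run : splitIntoRuns("00011111122233322211111111110000000")) {
            System.out.println(run);
        }
    }
}
```

Because the end-of-input check sits in the loop condition, inputs such as "111111000" and the empty string need no special handling.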

Finding Balanced Parenthesis Involving Math

I've been trying to solve this question for the past couple of hours and I just don't understand it. I know there must be some mathematical formula for this, but I don't know exactly how to calculate it. I know my code does not make sense because I'm completely lost; I would appreciate any hints that get me closer to the solution.
I asked my professor, and he gave me a hint that it is similar to a permutation/combination over an alphabet, such as 26^3 for 3 different combinations, but this did not help me much.
What I know:
There are 796 characters for the input given in the string and I must find ALL possible ways that 796 characters can be in a balanced parenthesis form.
Since it must start with '(' and end with ')' there must be 2 brackets for each case. So it can be '()()(xc)(cvs)'. Thus that means the mathematical calculation must involve 2*(something) per char(s) since it has to be balanced.
I need to use the remainder (%) operator to recursively find every case, but how do I do that when I take in a char and not an int?
What I don't know:
How will I analyze each case? Won't that take a long time or a lot of code without a simple formula to calculate the input?
Would I need a lot of if-statements or recursion?
Question:
Let Σ = {), (}. Let L ⊆ Σ* be the set of strings of correctly balanced parentheses. For example, (())() is in L and (()))( is not in L. Formally, L is defined recursively as follows.
ε ∈ L
A string x ≠ ε is in L if and only if x is of the form (y)z, where y and z are in L.
n is a specific 3 digit number between 0 and 999.
Compute f(n) mod 997
Some facts you might find useful: if n1, n2 ∈ N (the natural numbers), then
(n1 × n2) mod 997 = ((n1 mod 997) × (n2 mod 997)) mod 997 and
(n1 + n2) mod 997 = ((n1 mod 997) + (n2 mod 997)) mod 997
n = 796 (this is specific for me and this will be the given input in this case)
So I must compute "f(796) mod 997 = ?" using a program. I will simply use Java for this question.
Code:
import java.util.*;
public class findBrackets
{
public static void main(String[] args)
{
String n;
int answer = 0;
Scanner input = new Scanner(System.in);
System.out.println("Input String");
n = input.nextLine();
// probably wrong because a string can start as x(d))c(()...
for(int i = 0; i < n; i++)
{
if(n[i] != '(' || n[i] != ')' || n[i] != null || n[i] != " ") {
answer = 2 * (Integer.parseInt(n[i]); // how can i calculate if its a char
// i have to use mod % operator somewhere but I don't know where?
}
}
System.out.println("f(796) mod 997 = " + answer);
}
}

You might find the following fact useful: the number of strings of n pairs of balanced parentheses is given by the nth Catalan number and its exact value is
(2n)! / (n! (n + 1)!)
You should be able to directly compute this value mod 997 by using the hint about how products and sums distribute over modulus.
Hope this helps!
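One caveat when applying the closed form: sums and products distribute over the modulus, but division does not, so the factorial formula cannot be reduced mod 997 term by term without modular inverses. A way around that, sketched below, is the Catalan recurrence C(0) = 1, C(k) = sum over i of C(i) × C(k−1−i), which mirrors the recursive definition x = (y)z and needs only sums and products. The question never defines f, so this assumes f counts the balanced strings with a given number of pairs; the names CatalanMod and catalanMod are hypothetical.

```java
final class CatalanMod {

    // Number of balanced strings with 'pairs' pairs of parentheses, reduced modulo 'mod'.
    // Built from the decomposition x = (y)z: if y has i pairs, z has k - 1 - i pairs.
    static int catalanMod(int pairs, int mod) {
        long[] c = new long[pairs + 1];
        c[0] = 1 % mod; // the empty string
        for (int k = 1; k <= pairs; k++) {
            long sum = 0;
            for (int i = 0; i < k; i++) {
                // Reduce after every step so the intermediate values stay small.
                sum = (sum + c[i] * c[k - 1 - i]) % mod;
            }
            c[k] = sum;
        }
        return (int) c[pairs];
    }

    public static void main(String[] args) {
        // ASSUMPTION: whether 796 is a number of pairs or a number of characters
        // (398 pairs) depends on how f is defined in the assignment.
        System.out.println(catalanMod(796, 997));
        System.out.println(catalanMod(398, 997));
    }
}
```

The double loop is quadratic in the number of pairs, which is negligible for inputs below 1000.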

I'm still not quite sure exactly what you're asking, but you can check whether the parentheses in a string are validly placed using the following method. I used a similar one in the old days to go through hundred-page papers and ensure all parentheses were closed properly.
public static boolean isValid(String s) {
int openParens = 0;
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) == '(') {
// we found an open paren
openParens++;
} else if (s.charAt(i) == ')') {
// we can close a paren
openParens--;
}
if (openParens < 0) {
// we closed a paren but there was nothing to close!
return false;
}
}
if (openParens > 0) {
// we didn't close all parens!
return false;
}
// we did!
return true;
}

You need to implement something like this:
public static void main (String[]args) {
String str = "((1+2)*(3+4))-5";
if(isValid(str)){
expandString(str);
}
}
public static boolean isValid(String s) {
int totalParenthesis = 0;
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) == '(') {
totalParenthesis++;
} else if (s.charAt(i) == ')') {
totalParenthesis--;
}
if (totalParenthesis < 0) {
return false;
}
}
if (totalParenthesis != 0) {
return false;
}
return true;
}
private static void expandString(String str) {
System.out.println("Called with : "+str);
if(!(str.contains("("))){
evalueMyExpresstion(str);
return;
}
String copyString=str;
int count=-1,positionOfOpen=0,positionOfClose=0;
for(Character character : str.toCharArray()) {
count++;
if(count==str.toCharArray().length){
evalueMyExpresstion(str);
return;
} else if(character.equals('(')) {
positionOfOpen=count+1;
} else if(character.equals(')')) {
positionOfClose=count;
copyString = str.substring(0, positionOfOpen - 1) + evalueMyExpresstion(
str.substring(positionOfOpen, positionOfClose)) + str.substring(positionOfClose + 1);
System.out.println("Call again with : "+copyString);
expandString(copyString);
return;
}
}
}
private static String evalueMyExpresstion(String str) {
System.out.println("operation : "+str);
String[] operation;
int returnVal =0;
if(str.contains("+")){
operation = str.split("\\+");
returnVal=Integer.parseInt(operation[0])+ Integer.parseInt(operation[1]);
System.out.println("+ val : "+returnVal);
return Integer.toString(returnVal);
} else if (str.contains("*")){
operation = str.split("\\*");
returnVal=Integer.parseInt(operation[0])* Integer.parseInt(operation[1]);
System.out.println("* val : "+returnVal);
return Integer.toString(returnVal);
} else if (str.contains("-")){
operation = str.split("\\-");
returnVal=Integer.parseInt(operation[0])- Integer.parseInt(operation[1]);
System.out.println("- val : "+returnVal);
return Integer.toString(returnVal);
}
System.out.println(str);
return Integer.toString(returnVal);
}
Output looks like:
Called with : ((1+2)*(3+4))-5
operation : 1+2
+ val : 3
Call again with : (3*(3+4))-5
Called with : (3*(3+4))-5
operation : 3+4
+ val : 7
Call again with : (3*7)-5
Called with : (3*7)-5
operation : 3*7
* val : 21
Call again with : 21-5
Called with : 21-5
operation : 21-5
- val : 16

Substring between two same or different delimiters (when delimiters occur multiple times)

I need to fetch a substring that lies between two delimiters, which may be the same or different. The delimiters will occur multiple times in the string, so I need to extract the substring that lies between the mth occurrence of delimiter1 and the nth occurrence of delimiter2.
For example:
myString : Ron_CR7_MU^RM^_SAF_34^
What should I do here if I need to extract the substring that lies between the 3rd occurrence of '_' and the 3rd occurrence of '^'?
Substring = SAF_34
Or I could look for the substring that lies between the 2nd '^' and the 4th '_', i.e.:
Substring = _SAF
An SQL equivalent would be:
substr(myString, instr(myString, '_',1,3)+1,instr(myString, '^',1,3)-1-instr(myString, '_',1,3))

I would use,
public static int findNth(String text, String toFind, int count) {
int pos = -1;
do {
pos = text.indexOf(toFind, pos+1);
} while(--count > 0 && pos >= 0);
return pos;
}
int from = findNth(text, "_", 3);
int to = findNth(text, "^", 3);
String found = text.substring(from+1, to);
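The snippet above assumes both delimiters are found and that the first lies before the second. A slightly more defensive sketch wraps findNth in a helper that returns null otherwise; the helper name between, the class name BetweenDelimiters and the null convention are assumptions made for illustration, not part of the answer.

```java
final class BetweenDelimiters {

    // Index of the count-th occurrence (1-based) of toFind in text, or -1 if there is none.
    static int findNth(String text, String toFind, int count) {
        if (count < 1) {
            return -1;
        }
        int pos = -1;
        do {
            pos = text.indexOf(toFind, pos + 1);
        } while (--count > 0 && pos >= 0);
        return pos;
    }

    // Text between the m-th occurrence of d1 and the n-th occurrence of d2,
    // or null if either is missing or they appear in the wrong order.
    static String between(String text, String d1, int m, String d2, int n) {
        int from = findNth(text, d1, m);
        int to = findNth(text, d2, n);
        if (from < 0 || to < 0) {
            return null;
        }
        int begin = from + d1.length(); // skip the whole opening delimiter
        return begin <= to ? text.substring(begin, to) : null;
    }

    public static void main(String[] args) {
        String text = "Ron_CR7_MU^RM^_SAF_34^";
        System.out.println(between(text, "_", 3, "^", 3)); // SAF_34
        System.out.println(between(text, "^", 2, "_", 4)); // _SAF
    }
}
```

Using d1.length() rather than a fixed +1 keeps the helper correct for multi-character delimiters.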

If you can use a solution without regex, you can find the indexes in your string where the result needs to start and where it needs to end. Then simply call myString.substring(start, end) to get your result.
The biggest problem is finding start and end. To do that you can repeat this N (or M) times, adding up the offsets as you go:
int pos = myString.indexOf(delimiterX);
myString = myString.substring(pos + delimiterX.length()); // skip past the delimiter; you may want to work on a copy of myString
Hope you get the idea.

You could create a little method that simply hunts for such substrings between delimiters sequentially, using (as noted) String.indexOf(String). You do need to decide whether you want all substrings, overlapping or not (which is what your question indicates), or only non-overlapping ones. Here is a first attempt at such code:
import java.util.Vector;
public class FindDelimitedStrings {
public static void main(String[] args) {
String[] test = getDelimitedStrings("Ron_CR7_MU'RM'_SAF_34'", "_", "'");
if (test != null) {
for (int i = 0; i < test.length; i++) {
System.out.println(" " + (i + 1) + ". |" + test[i] + "|");
}
}
}
public static String[] getDelimitedStrings(String source,
String leftDelimiter, String rightDelimiter) {
String[] answer = null;
Vector<String> results = new Vector<String>();
if (source == null || leftDelimiter == null || rightDelimiter == null) {
return null;
}
int loc = 0;
int begin = source.indexOf(leftDelimiter, loc);
int end;
while (begin > -1) {
end = source
.indexOf(rightDelimiter, begin + leftDelimiter.length());
if (end > -1) {
results.add(source.substring(begin, end));
// loc = end + rightDelimiter.length(); if strings must be
// returned as pairs
loc = begin + 1;
if (loc < source.length()) {
begin = source.indexOf(leftDelimiter, loc);
} else {
begin = -1;
}
} else {
begin = -1;
}
}
if (results.size() > 0) {
answer = new String[results.size()];
results.toArray(answer);
}
return answer;
}
}

Algorithm for duplicated but overlapping strings

I need to write a method where I'm given a string s and I need to return the shortest string which contains s as a contiguous substring twice.
However two occurrences of s may overlap. For example,
aba returns ababa
xxxxx returns xxxxxx
abracadabra returns abracadabracadabra
My code so far is this:
import java.util.Scanner;
public class TwiceString {
public static String getShortest(String s) {
int index = -1, i, j = s.length() - 1;
char[] arr = s.toCharArray();
String res = s;
for (i = 0; i < j; i++, j--) {
if (arr[i] == arr[j]) {
index = i;
} else {
break;
}
}
if (index != -1) {
for (i = index + 1; i <= j; i++) {
String tmp = new String(arr, i, i);
res = res + tmp;
}
} else {
res = res + res;
}
return res;
}
public static void main(String args[]) {
Scanner inp = new Scanner(System.in);
System.out.println("Enter the string: ");
String word = inp.next();
System.out.println("The requires shortest string is " + getShortest(word));
}
}
I know I'm probably wrong at the algorithmic level rather than at the coding level. What should be my algorithm?

Use a suffix tree. In particular, after you've constructed the tree for s, go to the leaf representing the whole string and walk up until you see another end-of-string marker. This will be the leaf of the longest suffix that is also a prefix of s.

As @phs already said, part of the problem can be translated to "find the longest prefix of s that is also a suffix of s", and a solution without a tree may be this:
public static String getShortest(String s) {
int i = s.length();
while(i > 0 && !s.endsWith(s.substring(0, --i)))
;
return s + s.substring(i);
}

Once you've found your index, and even if it's -1, you just need to append to the original string the substring going from index + 1 (since index is the last matching character index) to the end of the string. There's a method in String to get this substring.

I think you should have a look at the Knuth-Morris-Pratt algorithm; the partial match table it uses is pretty much what you need (and by the way, it's a very nice algorithm ;)
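To make that concrete, here is a minimal sketch of the Knuth-Morris-Pratt partial match (failure) table applied to this problem. The last table entry is the length of the longest proper prefix of s that is also a suffix, so appending everything after that prefix gives the shortest string containing s twice. The names ShortestDoubled, longestBorder and shortestContainingTwice are hypothetical.

```java
final class ShortestDoubled {

    // Length of the longest proper prefix of s that is also a suffix of s,
    // taken from the last entry of the KMP partial match table.
    static int longestBorder(String s) {
        int n = s.length();
        if (n == 0) {
            return 0;
        }
        int[] fail = new int[n];
        int k = 0; // length of the border currently being extended
        for (int i = 1; i < n; i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = fail[k - 1]; // fall back to the next shorter border
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail[n - 1];
    }

    // s followed by the part of s that the overlapping second copy still needs.
    static String shortestContainingTwice(String s) {
        return s + s.substring(longestBorder(s));
    }

    public static void main(String[] args) {
        System.out.println(shortestContainingTwice("aba"));         // ababa
        System.out.println(shortestContainingTwice("xxxxx"));       // xxxxxx
        System.out.println(shortestContainingTwice("abracadabra")); // abracadabracadabra
    }
}
```

The table is built in time linear in the length of s, unlike approaches that test every prefix with endsWith.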

If your input string s is, say, "abcde" you can easily build a regex like the following (notice that the last character "e" is missing!):
a(b(c(d)?)?)?$
and run it on the string s. This will return the starting position of the trailing repeated substring. You would then just append the missing part (i.e. the last N-M characters of s, where N is the length of s and M is the length of the match), e.g.
aba
^ match "a"; append the missing "ba"
xxxxx
^ match "xxxx"; append the missing "x"
abracadabra
^ match "abra"; append the missing "cadabra"
nooverlap
--> no match; append "nooverlap"
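A rough sketch of how that regex could be generated and applied in Java follows. It deviates from the text in two labelled ways: each character is passed through Pattern.quote so that regex metacharacters in s are taken literally, and \z is used instead of $ so that only the true end of input matches. The names RegexOverlap, buildPattern and shortestContainingTwice are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexOverlap {

    // For "abcde" this builds the quoted equivalent of a(b(c(d)?)?)?\z
    // (the last character of s is deliberately left out).
    static String buildPattern(String s) {
        StringBuilder sb = new StringBuilder();
        int n = s.length() - 1;
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append('(');
            }
            sb.append(Pattern.quote(String.valueOf(s.charAt(i))));
        }
        for (int i = 1; i < n; i++) {
            sb.append(")?");
        }
        return sb.append("\\z").toString();
    }

    static String shortestContainingTwice(String s) {
        if (s.isEmpty()) {
            return s;
        }
        Matcher m = Pattern.compile(buildPattern(s)).matcher(s);
        // The leftmost match is the longest suffix of s that is also a prefix of s.
        int matched = m.find() ? m.end() - m.start() : 0;
        return s + s.substring(matched);
    }

    public static void main(String[] args) {
        System.out.println(shortestContainingTwice("aba"));
        System.out.println(shortestContainingTwice("nooverlap"));
    }
}
```

The pattern nests one group per character, so for very long inputs the regex engine may backtrack heavily or recurse deeply; a table-based approach is likely the safer choice there.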

From my understanding you want to do this:
input: dog
output: dogdog
--------------
input: racecar
output: racecaracecar
So this is how i would do that:
public String change(String input)
{
StringBuilder outputBuilder = new StringBuilder(input);
int patternLocation = input.length();
for(int x = 1;x < input.length();x++)
{
StringBuilder check = new StringBuilder(input);
for(int y = 0; y < x;y++)
check.deleteCharAt(check.length() - 1);
if(input.endsWith(check.toString()))
{
patternLocation = x;
break;
}
}
outputBuilder.delete(0, input.length() - patternLocation);
return input + outputBuilder.toString(); // original input followed by the part that does not overlap
}
Hope this helped!
