Maximum repeating sequence instead of longest repeating sequence - java

I am trying to get the most repeated sequence of characters in a string.
For example :
Input:
s = "abccbaabccba"
Output:
2
I have used dynamic programming to figure out the repeating sequence, but this returns the longest repeating character sequence. For example:
Input:
s = "abcabcabcabc"
Output:
2
2(abcabc,abcabc) instead of 4(abc,abc,abc,abc)
Here is the part of the code where I'm filling the DP table and extracting repeating sequence. Can anyone suggest how I can get the most repeating sequence?
//Run through the string and fill the DP table.
char[] chars = s.toCharArray();
for(int i = 1; i <= length; i++){
for(int j = 1; j <= length; j++){
if( chars[i-1] == chars[j-1] && Math.abs(i-j) > table[i-1][j-1]){
table[i][j] = table[i-1][j-1] + 1;
if(table[i][j] > max_length_sub){
max_length_sub = table[i][j];
array_index = Math.min(i, j);
}
}else{
table[i][j] = 0;
}
}
}
//Check if there was a repeating sequence and return the number of times it occurred.
if( max_length_sub > 0 ){
String temp = s;
String subSeq = "";
for(int i = (array_index - max_length_sub); i< max_length_sub; i++){
subSeq = subSeq + s.charAt(i);
}
System.out.println( subSeq );
Pattern pattern = Pattern.compile(subSeq);
Matcher matcher = pattern.matcher(s);
int count = 0;
while (matcher.find())
count++;
// To find left overs - doesn't seem to matter
String[] splits = temp.split(subSeq);
if (splits.length == 0){
return count;
}else{
return 0;
}
}

Simple and dumb: the smallest sequence to be considered is a pair of characters (*):
loop over the whole String and get every consecutive pair of characters, e.g. using a for loop and substring();
count the occurrences of that pair in the String, e.g. with a countOccurrences() method built on indexOf(String, int) or on regular expressions; and
store the greatest count: keep one variable maxCount outside the loop and use an if (or Math.max()) to check whether the current count is greater.
(*) if "abc" occurs 5 times, then "ab" (and "bc") will occur at least 5 times too, so it is enough to search for "ab" and "bc"; there is no need to check "abc".
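The pair-counting steps above can be sketched roughly as follows. The class name PairCount and the method maxPairCount are made up for illustration, and counting non-overlapping occurrences (advancing by the pair length after each hit) is an assumption; the answer does not say whether overlaps should count.

```java
class PairCount {
    // Counts non-overlapping occurrences of part in text using indexOf(String, int).
    // Assumption: after a hit, the search resumes behind the matched pair.
    static int countOccurrences(String text, String part) {
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(part, from)) != -1) {
            count++;
            from += part.length();
        }
        return count;
    }

    // Takes every consecutive pair of characters and keeps the greatest count.
    static int maxPairCount(String s) {
        int maxCount = 0;
        for (int i = 0; i + 2 <= s.length(); i++) {
            String pair = s.substring(i, i + 2);
            maxCount = Math.max(maxCount, countOccurrences(s, pair));
        }
        return maxCount;
    }

    public static void main(String[] args) {
        System.out.println(maxPairCount("abcabcabcabc")); // 4
        System.out.println(maxPairCount("abccbaabccba")); // 2
    }
}
```

This version is quadratic in the length of the string, which matches the "simple and dumb" spirit of the answer.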
Edit (version without leftovers, see comments). Summary:
check if the first character is repeated over the whole string; if not,
check if the first 2 characters are repeated over the whole string; if not,
check if the first 3 ...
At least 2 counters/loops are needed: one for the number of characters to test, and a second for the position being tested. Some arithmetic can be used to improve performance: the length of the string must be divisible by the number of repeated characters without remainder.
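A minimal sketch of the "without leftovers" idea, assuming the goal is the number of times the shortest prefix has to be repeated to rebuild the whole string. The names MaxRepeat and maxRepeatCount are hypothetical, and returning 0 when nothing repeats is an assumption borrowed from the question's own code.

```java
class MaxRepeat {
    // Outer loop: candidate unit length. Inner loop: position being tested.
    static int maxRepeatCount(String s) {
        int n = s.length();
        for (int len = 1; len <= n / 2; len++) {
            if (n % len != 0) {
                continue; // the unit length must divide the string length
            }
            String unit = s.substring(0, len);
            boolean repeats = true;
            for (int pos = len; pos < n; pos += len) {
                if (!s.startsWith(unit, pos)) {
                    repeats = false;
                    break;
                }
            }
            if (repeats) {
                return n / len; // shortest unit gives the highest count
            }
        }
        return 0; // assumption: 0 means "no repeating sequence"
    }

    public static void main(String[] args) {
        System.out.println(maxRepeatCount("abcabcabcabc")); // 4
        System.out.println(maxRepeatCount("abccbaabccba")); // 2
    }
}
```

Because candidate lengths are tried from shortest to longest, the first unit that fits is also the one with the most repetitions.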

Related

How to increment conditionally?

I'm a complete beginner and I was wondering whether it is possible to increment a counter conditionally. I am trying to count the letter “I” in a sentence: every time I pass an “I”, I want the counter to increment by 1, but if several of them appear together, as in “III”, it should still increment only once until another character follows. For example, “IIIaI” would count as 2 instances.
Is this possible?
Sorry guys, here is my code:
public static int countTheIs(String sentence){
    int iCounter = 0;
    String iCount = "iI"; //both cases included
    for (int j = 0; j < sentence.length(); j++){
        char ch = sentence.charAt(j);
        if (iCount.indexOf(ch) != -1){
            iCounter++; // currently counts every "I", including repeated ones
        }
    }
    return iCounter;
}
You are actually quite far already, all you need to do is to check the previous character. This can be done the following way:
String sentence = "Test i two II three iIi";
int iCounter = 0;
String iCount = "iI";
for (int j = 0; j < sentence.length(); j++){
char current = sentence.charAt(j);
char previous; //1
if (j==0) {
previous = 'Z'; //2
} else {
previous = sentence.charAt(j-1); //3
}
if (iCount.indexOf(current) != -1 && iCount.indexOf(previous) == -1 ){ //4
iCounter++;
}
}
Let me explain to you what I have done, according to my // tags
//1 We make a new char variable holding the previous character.
//2 Because the first index of the String has no previous characters, we will set it to a random, non-matching character to prevent errors at the start. I picked Z in this example.
//3 If there is a previous character, we get this by subtracting 1 from j
//4 We check in the if statement whether the current character is in iCount and the previous character is not in iCount. If this is the case, the counter will increase.
When the above code is run, the resulting count is 3.
OK, I'm going to assume that you have a string input, that you are counting by using a loop with charAt(x) (where x is the loop variable), and then comparing.
Simply check whether charAt(x-1) is also an I. If it is, don't increment the counter. Also, make sure x>0 before calling charAt(x-1); otherwise it will throw an exception.
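As a rough sketch of that previous-character check, wrapped in a method: the class name CountTheIs is made up for illustration, and treating "i" and "I" as the same letter is an assumption taken from the question's iCount = "iI".

```java
class CountTheIs {
    // Counts runs of 'i'/'I': a run is counted once, at its first character.
    static int countTheIs(String sentence) {
        int counter = 0;
        for (int x = 0; x < sentence.length(); x++) {
            boolean currentIsI = Character.toLowerCase(sentence.charAt(x)) == 'i';
            // The x > 0 guard avoids calling charAt(-1) on the first character.
            boolean previousIsI = x > 0
                    && Character.toLowerCase(sentence.charAt(x - 1)) == 'i';
            if (currentIsI && !previousIsI) {
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(countTheIs("IIIaI")); // 2
    }
}
```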
Please run the below code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class CountI {
public static void main(String[] args) {
String input = "IIiaIii";
String regex = "([A-Za-z])\\1+";
Pattern pattern = Pattern.compile(regex , Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(input);
String output = matcher.replaceAll("$1");
int result = 0;
for(int i = 0; i < output.length(); i++){
if(output.charAt(i) == 'I' || output.charAt(i) == 'i'){
result++;
}
}
System.out.println(result);
}
}
Output:
2
Process finished with exit code 0
You want Regular Expressions and the Java Pattern class (https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html).
In my sample below I used "w" instead of "l" because it's easier to distinguish. Using regular expressions, define a pattern that will capture one or more consecutive occurrences of the letter: w+, then use a matcher, count the number of times it matches.
String input = "wwowwee w w w";
Pattern p = Pattern.compile("w+");
Matcher matcher = p.matcher(input);
int count = 0;
while(matcher.find()) {
count++;
}
System.out.println("Count: " + count);
Or, simply split the string and count the number of splits:
String input = "wwowwee w w w";
Pattern p = Pattern.compile("w+");
String[] tokens = p.split(input);
System.out.println("token count: " + tokens.length);
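Both print 5 for this input. Note that the split variant agrees with the match count here only by coincidence: split keeps the empty leading token and drops trailing empty tokens, so for other inputs (for example "owwo", which has one match but two tokens) the two numbers differ.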
Edit: This doesn't answer the question about incrementing a counter conditionally, but it solves the problem that question was posed to address.
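To compare the two variants on more than one input, here is a small sketch. The class and method names are hypothetical; the pattern w+ is the one used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunCountDemo {
    private static final Pattern RUN = Pattern.compile("w+");

    // Counts runs by walking through the matches.
    static int countWithFind(String input) {
        Matcher matcher = RUN.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Counts the tokens between runs; this is not always the number of runs.
    static int countWithSplit(String input) {
        return RUN.split(input).length;
    }

    public static void main(String[] args) {
        for (String input : new String[] {"wwowwee w w w", "owwo"}) {
            System.out.println(input + ": find=" + countWithFind(input)
                    + ", split=" + countWithSplit(input));
        }
    }
}
```

For "owwo" the find loop reports 1 run while split reports 2 tokens, so the find loop is the safer way to count runs.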

How to define number sequence in string?

I have a task to create a string of undefined length (digits are entered from the keyboard until the user presses "Enter"), and then I have to determine how many of the digits are in sequence. Unfortunately I can't manage this. I think I'm almost there, but not quite. I've created the string, which I hoped to copy character by character into an array and then compare each digit with the next one, but I have trouble copying the characters into an array.
Here's my code:
int sum = 0;
String someSymbols = sc.nextLine();
int array [] = new int[someSymbols.length()];
for(int i=0; i<someSymbols.length(); i++){
for (int j=0; j<=array.length; j++){
array[j] = someSymbols.charAt(i);
}
sum++;
}
Not sure of what you want to achieve but here are 2 examples for inspiration:
Taking digits until reaching a different digit. Ignoring non digits
String s = "22u223r5";
String digitsOnly = s.replaceAll("[^\\d]", ""); // remove everything that is not a digit
int firstDifferentDigit = -1;
for(int i = 1; i < digitsOnly.length(); i++) {
if(digitsOnly.charAt(i) != digitsOnly.charAt(i-1)) {
firstDifferentDigit = i;
break;
}
}
System.out.println("firstDifferentDigit:"+firstDifferentDigit);
System.out.println(digitsOnly.substring(0,firstDifferentDigit));
Outputs
firstDifferentDigit:4
2222
Taking digits until first non digit
String s = "124g35h6j3lk4kj56";
int firstNonDigitCharacter = -1;
for(int i = 0; i < s.length(); i++) {
if(s.charAt(i) < '0' || s.charAt(i) > '9') {
firstNonDigitCharacter = i;
break;
}
}
System.out.println("firstNonDigitCharacter:"+firstNonDigitCharacter);
System.out.println(s.substring(0,firstNonDigitCharacter));
Outputs
firstNonDigitCharacter:3
124
EDIT
This works for how you described the exercise:
String someSymbols = "72745123";
List<String> sequences = new ArrayList<>();
boolean inSequence = false; // will flag if we are currently within a sequence
StringBuilder currentSequence = new StringBuilder(); // this will store the numbers of the sequence
for(int i = 0; i < someSymbols.length(); i++) {
char currentChar = someSymbols.charAt(i);
char nextChar = 0;
if(i < someSymbols.length()-1)
nextChar = someSymbols.charAt(i+1);
// if next number is 1 more than the current one, we are in a sequence
if(currentChar == nextChar-1) {
inSequence = true;
currentSequence.append(String.valueOf(currentChar));
// if next number is NOT 1 more than the current one and we are in a sequence, it is the last of the sequence
} else if(inSequence) {
currentSequence.append(String.valueOf(currentChar));
sequences.add(currentSequence.toString());
currentSequence = new StringBuilder();
inSequence = false;
}
}
System.out.println(sequences);
Outputs
[45, 123]
Thanks a lot! I made it with your help! It turns out that the task was to count how many numbers occur in the string of any symbols. As simple as that. My bad! But I'm grateful to be part of this forum :)
Here's the code:
String f = sc.nextLine();
int count = 0;
for(int i=0; i<f.length(); i++){
if((f.charAt(i)>='0') && (f.charAt(i)<='9')){
count++;
}
}
System.out.println("The numbers in the row are : " + count);
I deleted my first answer, because I got the question wrong, thought it's about a character sequence, of which some happen to be digits.
Trying to wrap my head around the new functional style in Java 8, but the conversion is complicated and full of pitfalls. Surely this isn't canonical. I guess a collector could be appropriate here, but I broke out and did half of the work in a recursive method.
import java.util.*;
import java.util.stream.*;
String s = "123534567321468"
List<Integer> li = IntStream.range (0, s.length()-2).filter (i -> (s.charAt(i+1) != s.charAt(i)+1)).collect (ArrayList::new, List::add, List::addAll);
li.add (s.length()-1);
int maxDiff (int last, List<Integer> li , int maxdiff) {
    if (li.isEmpty ())
        return maxdiff;
    // subList's end index is exclusive, so li.size () keeps the last element
    return maxDiff (li.get(0), li.subList (1, li.size ()), Math.max (li.get(0) - last, maxdiff));
}
int result = maxDiff (0, li, 0);
It starts elegantly.
IntStream.range (0, s.length()-2).filter (i -> (s.charAt(i+1) != s.charAt(i)+1))
| Expression value is: java.util.stream.IntPipeline$9#5ce81285
| assigned to temporary variable $20 of type IntStream
-> $20.forEach (i -> System.out.println (i));
2
3
8
9
10
11
12
That's the List of indexes, where chains of numbers are broken.
String s = "123534567321468"
123 is from 0 to 2, 5 is just 3, 34567, the later winner from 4 to 8, ...
Note that we needn't transform the String into Numbers, since the characters are chained in ASCII or UTF-X one by one, like numbers.
To convert it into a List, the more complicated three-argument collect method is used, because an array of primitive int doesn't work well with List.
For the last interval, li.add (s.length()-1) has to be added; adding elements wouldn't work with an array.
maxDiff keeps track of the maximum so far and of the last element, and repeatedly takes the head of the list to compare it with the last element and compute the current difference.
The code was tested in the jshell of Java 9, which is an amazing tool and needs neither an enclosing class nor a 'main' for snippets. :)
Just for comparison, this is my solution in scala:
val s = "123534567321468"
val cuts = (0 to s.length-2).filter (i => {s.charAt(i+1) != s.charAt(i)+1}).toList ::: s.length-1 :: Nil
(0 :: cuts).sliding (2, 1).map {p => p(1) - p(0)}.max
Sliding(a,b) defines a window of width a=2 which moves forward by b=1.
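For comparison with the stream and sliding-window versions, the same quantity can be sketched with a plain loop. This assumes that what is wanted is the length of the longest chain in which each digit is exactly one more than the previous one; the names LongestRun and longestAscendingRun are made up for this example.

```java
class LongestRun {
    // Length of the longest run where each character is the previous one + 1.
    // Works on the char codes directly, as the digits are consecutive in ASCII/Unicode.
    static int longestAscendingRun(String s) {
        if (s.isEmpty()) {
            return 0;
        }
        int best = 1;
        int current = 1;
        for (int i = 1; i < s.length(); i++) {
            if (s.charAt(i) == s.charAt(i - 1) + 1) {
                current++;   // chain continues
            } else {
                current = 1; // chain is broken, start a new one here
            }
            best = Math.max(best, current);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestAscendingRun("123534567321468")); // 5 (for "34567")
    }
}
```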

Fastest way to calculate all the substrings of a string and check it for a given condition

What is the fastest possible way to calculate all the possible substrings of a given string and check them for the following condition.
The condition is:
If the first and the last Character of the generated substring is same then count is incremented by one. We need to find all such possible substrings of a given very large string.
I have tried the naive brute force approach but it did not work for strings with lengths 10^7.
Please help :(
for(int c = 0 ; c < length ; c++ )
{
for( i = 3 ; i <= length - c ; i++ )
{
String sub = str.substring(c, c+i);
System.out.println(sub);
if(sub.charAt(0) == sub.charAt(sub.length()-1)){
count++;
}
}
}
Your current solution is quadratic for the size of the input string or O(n^2)
You can solve this more efficiently by counting the occurrence of each character in the string, and then counting the number of substrings that can be created with this character.
E.g. if a character occurs 4 times, then this leads to 3 + 2 + 1 = 6 substrings.
You can use the following formula for this: ((n-1) * n) / 2
This brings the complexity of the algorithm down to O(n), because for counting each character you only need to traverse the String once.
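I believe this code should work:
import java.util.HashMap;
import java.util.Map;
public class Main {
    public static void main(String[] args) {
        String str = "xyzxyzxyzxyz";
        // count how often each character occurs
        Map<Character, Integer> map = new HashMap<>();
        for (char c : str.toCharArray())
        {
            Integer count = map.get(c);
            if (count == null)
                count = 0;
            map.put(c, count + 1);
        }
        // long, because the sum overflows int for strings of length 10^7
        long sum = 0;
        for (int n : map.values())
            sum += ((long) (n - 1) * n) / 2;
        System.out.println(sum);
    }
}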
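One way to gain confidence in the (n-1) * n / 2 formula is to compare it with a brute-force count on small inputs. The sketch below assumes that only substrings of length 2 or more are counted (a single character trivially starts and ends with itself); the class and method names are hypothetical. It uses long because the result can exceed the int range for strings of length 10^7.

```java
class SameEndsCount {
    // O(n): count each character, then add n * (n - 1) / 2 per character.
    static long countFast(String str) {
        long[] counts = new long[Character.MAX_VALUE + 1];
        for (int i = 0; i < str.length(); i++) {
            counts[str.charAt(i)]++;
        }
        long sum = 0;
        for (long n : counts) {
            sum += n * (n - 1) / 2;
        }
        return sum;
    }

    // O(n^2) reference: every pair of positions i < j with equal characters
    // corresponds to one substring with the same first and last character.
    static long countBruteForce(String str) {
        long count = 0;
        for (int i = 0; i < str.length(); i++) {
            for (int j = i + 1; j < str.length(); j++) {
                if (str.charAt(i) == str.charAt(j)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "xyzxyzxyzxyz";
        System.out.println(countFast(str));       // 18
        System.out.println(countBruteForce(str)); // 18
    }
}
```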

Matching subsequence of length 2 (at same index) in two strings

Given 2 strings, a and b, return the number of positions where they contain the same length-2 substring. For instance, if a and b are "xxcaazz" and "xxbaaz" respectively, the result is 3, since the "xx", "aa", and "az" substrings appear in the same place in both strings.
What is wrong with my solution?
int count=0;
for(int i=0;i<a.length();i++)
{
for(int u=i; u<b.length(); u++)
{
String aSub=a.substring(i,i+1);
String bSub=b.substring(u,u+1);
if(aSub.equals(bSub))
count++;
}
}
return count;
}
In order to fix your solution, you really don't need the inner loop. Since the index should be same for the substrings in both string, only one loop is needed.
Also, you should iterate till 2nd last character of the smaller string, to avoid IndexOutOfBounds. And for substring, give i+2 as second argument instead.
Overall, you would have to change your code to something like this:
int count=0;
// stop at the second-to-last character of the shorter string
for(int i=0; i < Math.min(a.length(), b.length())-1; i++)
{
    String aSub=a.substring(i,i+2);
    String bSub=b.substring(i,i+2);
    if(aSub.equals(bSub))
        count++;
}
return count;
The reason I asked about the length of the strings is that it might become expensive to create substrings of length 2 in a loop. For a shorter string of length n, you would be creating about 2 * n substrings.
I would rather not create substring, and just match character by character, while keeping track of whether previous character matched or not. This will work perfectly fine in your case, as length of substring to match is 2. Code would be like:
String a = "iaxxai";
String b = "aaxxaaxx";
boolean lastCharacterMatch = false;
int count = 0;
for (int i = 0; i < Math.min(a.length(), b.length()); i++) {
if (a.charAt(i) == b.charAt(i)) {
if (lastCharacterMatch) {
count++;
} else {
lastCharacterMatch = true;
}
} else {
lastCharacterMatch = false;
}
}
System.out.println(count);
The heart of the problem lies with your usage of the substring method. The important thing to note is that the beginning index is inclusive, and the end index is exclusive.
As an example, dissecting your usage: with String aSub=a.substring(i,i+1); in the first iteration of the loop i = 0, so this line becomes String aSub=a.substring(0,1);. From the javadocs and my explanation above, this results in a substring containing only the first character, i.e. String aSub="x";. Changing this to i+2 and u+2 will get you the desired behavior, but beware of index out of bounds errors with the way your loops are currently written.
String a = "xxcaazz";
String b = "xxbaaz";
int count = 0;
for (int i = 0; i < (a.length() > b.length() ? b : a).length() - 1; i++) {
String aSub = a.substring(i, i + 2);
String bSub = b.substring(i, i + 2);
if (aSub.equals(bSub)) {
count++;
}
}
System.out.println(count);

Count the number of dots[.] , ! and ? in a text

I'm working on a Java project and I need to count the number of dots, ! and ? in a string. My current approach is to use regex. I used the following code but it's not giving the correct result.
for(int i=0; i<words.length; i++){
String w = words[i];
if(w.matches("(.)+[.!?]")){
count++; //increasing the count.
}
}
For some other function I have converted the string into array of words. So I'm using it in this.
I want to increase the count by one for each occurrence of dot,! or ? indicating a terminating point of a sentence. For example
test. - count increases by 1
test.. - count increase by 1
test?. - count increases by 1
Repeated use of symbols shouldn't increase the count.
Can you tell me what is wrong in here?
Use a wildcard in the regex.
int count = 0;
for( int i = 0; i < words.length; i++ )
if( words[i].matches(".*[.!?]") )
count++;
.*[.!?] will match all strings that end in a period, exclamation point, or question mark.
The first . is unescaped and stands for any character. The * means 0 or more of the previous thing, so 0 or more of any character. The . inside the brackets needs no escaping: within a character class it is just a literal period.
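The easiest way is this one-liner:
int count = str.length() - str.replaceAll("[.!?]+", "").length();
Rather than counting matches, it deletes them and compares lengths. Note that this counts every '.', '!' and '?' character individually, so a repeated run such as "test.." adds 2, not 1.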
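If repeated symbols should count only once, as the question asks, one option is to count runs rather than characters. The sketch below is only an illustration with made-up names (SentenceEnds, countRuns, countChars); it puts the run count next to the length-difference count from the one-liner.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceEnds {
    // Counts each run of '.', '!' or '?' once, so "?." is a single terminator.
    static int countRuns(String text) {
        Matcher matcher = Pattern.compile("[.!?]+").matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Length difference: counts every single '.', '!' and '?' character.
    static int countChars(String text) {
        return text.length() - text.replaceAll("[.!?]+", "").length();
    }

    public static void main(String[] args) {
        String text = "test. test.. test?.";
        System.out.println(countRuns(text));  // 3
        System.out.println(countChars(text)); // 5
    }
}
```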
You can do -
public static void main(String args[])
{
String str = "Test, test!.\tTEST:\nTeST?;";
Pattern p = Pattern.compile("[.!?]");
Matcher matcher = p.matcher(str);
int count = 0;
while(matcher.find()) {
count++;
}
System.out.println("Count : " + count);
}
and the output is - 3 as expected.
Can you tell me why the same regex in str.matches("[.!?]") is not giving the expected result?
Because str.matches("[.!?]") checks whether the whole String matches the regex, not whether the regex is found somewhere in the String. The above regex will only match if the string is exactly ".", "?" or "!":
String s = ".";
System.out.println(s.matches("[.!?]"));
will give true.
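The difference between matching the whole string and finding a match inside it can be seen side by side in a short sketch. The helper names wholeStringMatches and containsMatch are hypothetical wrappers around String.matches and Matcher.find.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the entire string matches the regex.
    static boolean wholeStringMatches(String s, String regex) {
        return s.matches(regex);
    }

    // True if the regex matches anywhere inside the string.
    static boolean containsMatch(String s, String regex) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeStringMatches("test.", "[.!?]"));   // false
        System.out.println(wholeStringMatches("test.", ".*[.!?]")); // true
        System.out.println(containsMatch("test.", "[.!?]"));        // true
    }
}
```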
