count directly repeated substring occurence - java

I am trying to count the number of directly repeatings of a substring in a string.
String s = "abcabcdabc";
String t = "abc";
int count = 2;
EDIT:
because some people are asking, i try to clarify this: there are 3 times t in s but i need the number of times t is repeated without any other character. that would result in 2, because the d in my example is not the starting character of t. ('d' != 'a').
Another example to clarify this:
String s = "fooabcabcdabc";
String t = "abc";
int count = 0;
I know how to count the number of occurrences in the string, i need it to be repeating from left to right without interruption!
Here is what i have so far, but i think i made a simple mistake in it...
public static int countRepeat(String s, String t){
if(s.length() == 0 || t.length() == 0){
return 0;
}
int count = 0;
if(t.length() == 1){
System.out.println(s+" | " + t);
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) != t.charAt(0)){
return count;
}
count++;
}
}else{
System.out.println(s+" | " + t);
for (int i = 0; i < s.length(); i++) {
int tchar = (i- (count*(t.length()-1)));
System.out.println(i+ " | " + tchar);
if (s.charAt(i) != t.charAt(tchar)){
return count;
}
if(tchar >= t.length()-1){
count++;
}
}
}
return count;
}
what am i doing wrong? And is there a better/faster way to do this?

There exists a str.indexOf(substring,index) method in the String API.
In pseudocode this would mean something like this:
declare integer variable as index
declare integer variable as count
while index <= (length of string - length of substring)
index = indexOf substring from index
if index >= 0
increment count
end if
end while

Using indexOf() makes the code much easier:
public static int startingRepeats(final String haystack, final String needle)
{
String s = haystack;
final int len = needle.length();
// Special case...
if (len == 0)
return 0;
int count = 0;
while (s.startsWith(needle)) {
count++;
s = s.subString(len);
}
return count;
}

This version does not allocate new objects (substrings, etc) and just look for the characters where they are supposed to be.
public static void main(String[] args) {
System.out.println(countRepeat("abcabcabc", "abc")); // 3
System.out.println(countRepeat("abcabcdabc", "abc")); // 2
System.out.println(countRepeat("abcabcabcx", "abc")); // 3
System.out.println(countRepeat("xabcabcabc", "abc")); // 0
}
public static int countRepeat(String s, String t){
int n = 0; // Ocurrences
for (int i = 0; i < s.length(); i ++) { // i is index in s
int j = i % t.length(); // corresponding index in t
boolean last = j == t.length() - 1; // this should be the last char in t
if (s.charAt(i) == t.charAt(j)) { // Matches?
if (last) { // Matches and it is the last
n++;
}
} else { // Do not match. finished!
break;
}
}
return n;
}

Here is the another calculator:
String s = "abcabcdabc";
String t = "abc";
int index = 0;
int count = 0;
while ((index = s.indexOf(t, index)) != -1) {
index += t.length();
count++;
}
System.out.println("count = " + count);

Related

How to count all the matching sub-string using regex?

I am trying to count all instances of a substring from .PBAAP.B with P A B in that sequence and can have 1-3 symbols in between them (inclusive).
The output should be 2
.P.A...B
.P..A..B
What I've tried so far is
return (int) Pattern
.compile("P.{0,2}A.{0,2}B")
.matcher(C)
.results()
.count();
But I only get output 1. My guess is that in both cases, the group is PBAAP.B. So instead of 2, I get 1.
I could write an elaborate function to achieve what I am trying to do, but I was wondering if there was a way to do it with regex.
int count = 0;
for (int i = 0; i < C.length(); i++) {
String p = Character.toString(C.charAt(i));
if (p.equalsIgnoreCase("P")) {
for (int j = X; j <= Y; j++) {
if (i + j < C.length()) {
String a = Character.toString(C.charAt(i + j));
if (a.equals("A")) {
for (int k = X; k <= Y; k++) {
if (i + j + k < C.length()) {
String b = Character.toString(C.charAt(i + j + k));
if (b.equalsIgnoreCase("B")) {
count++;
}
}
}
}
}
return count;
To my knowledge, you will only get a boolean response out of a regex match - either it is a match or it isn't. Thus, I can't think of a solution solving your problem using regex.
count() is used, if you want to check if there are multiple matches at different indices - which is not what you want. For instance, the following snippet will return 2 as th is found at 11-12 and at 20-21:
Pattern
.compile("(th)")
.matcher("let's test this together")
.results()
.count();
In order to keep the solution without regex readable and extensible, you may want to use regression.
Class to keep track of the latest match of each letter:
public class Letter {
private char letter;
private int latestMatch;
// constructor
// getters + setters
}
Class to detect different matches:
public class MatchFinder {
final static String C = "......PAA.BBBadsfjksPeAkBB";
final static Letter[] LETTERS = new Letter[3];
final static int SYMBOLS_THRESHOLD = 4;
public static void main(String[] args) {
LETTERS[0] = new Letter('P', -1);
LETTERS[1] = new Letter('A', -1);
LETTERS[2] = new Letter('B', -1);
int count = countMatches(0, C.length(), 0, 0, -1);
System.out.println(count);
}
public static int countMatches(int start, int end, int letterIndex, int currentCount, int latestMatch) {
for (int i = start; i < end; i++) {
if (i < C.length()) {
Character c = Character.toLowerCase(C.charAt(i));
if (c.equals(LETTERS[letterIndex].getLowercaseLetter()) && i > latestMatch) {
LETTERS[letterIndex].setLatestMatch(i);
if (letterIndex + 1 < LETTERS.length) {
int childStart = LETTERS[letterIndex].getLatestMatch() + 1;
return countMatches(childStart, childStart + SYMBOLS_THRESHOLD, letterIndex + 1, currentCount, -1);
}
currentCount++;
}
}
}
if (letterIndex > 0) {
int parentLetterIndex = letterIndex - 1;
int latestParentMatch = LETTERS[parentLetterIndex].getLatestMatch();
if (letterIndex > 1) {
int parentStart = LETTERS[letterIndex - 2].getLatestMatch() + 1;
return countMatches(parentStart, parentStart + SYMBOLS_THRESHOLD, parentLetterIndex, currentCount, latestParentMatch);
} else {
return countMatches(LETTERS[parentLetterIndex].getLatestMatch() + 1, C.length(), 0, currentCount, latestParentMatch);
}
}
return currentCount;
}
}

Efficient way to find longest streak of characters in string

This code works fine but I'm looking for a way to optimize it. If you look at the long string, you can see 'l' appears five times consecutively. No other character appears this many times consecutively. So, the output is 5. Now, the problem is this method checks each and every character and even after the max is found, it continues to check the remaining characters. Is there a more efficient way?
public class Main {
public static void main(String[] args) {
System.out.println(longestStreak("KDDiiigllllldddfnnlleeezzeddd"));
}
private static int longestStreak(String str) {
int max = 0;
for (int i = 0; i < str.length(); i++) {
int count = 0;
for (int j = i; j < str.length(); j++) {
if (str.charAt(i) == str.charAt(j)) {
count++;
} else break;
}
if (count > max) max = count;
}
return max;
}
}
We could add variable for previous char count in single iteration. Also as an additional optimisation we stop iteration if i + max - currentLenght < str.length(). It means that max can not be changed:
private static int longestStreak(String str) {
int maxLenght = 0;
int currentLenght = 1;
char prev = str.charAt(0);
for (int index = 1; index < str.length() && isMaxCanBeChanged(str, maxLenght, currentLenght, index); index++) {
char currentChar = str.charAt(index);
if (currentChar == prev) {
currentLenght++;
} else {
maxLenght = Math.max(maxLenght, currentLenght);
currentLenght = 1;
}
prev = currentChar;
}
return Math.max(maxLenght, currentLenght);
}
private static boolean isMaxCanBeChanged(String str, int max, int currentLenght, int index) {
return index + max - currentLenght < str.length();
}
Here is a regex magic solution, which although a bit brute force perhaps gets some brownie points. We can iterate starting with the number of characters in the original input, decreasing by one at a time, trying to do a regex replacement of continuous characters of that length. If the replacement works, then we know we found the longest streak.
String input = "KDDiiigllllldddfnnlleeezzeddd";
for (int i=input.length(); i > 0; --i) {
String replace = input.replaceAll(".*?(.)(\\1{" + (i-1) + "}).*", "$1");
if (replace.length() != input.length()) {
System.out.println("longest streak is: " + replace);
}
}
This prints:
longest streak is: lllll
Yes there is. C++ code:
string str = "KDDiiigllllldddfnnlleeezzeddd";
int longest_streak = 1, current_streak = 1; char longest_letter = str[0];
for (int i = 1; i < str.size(); ++i) {
if (str[i] == str[i - 1])
current_streak++;
else current_streak = 1;
if (current_streak > longest_streak) {
longest_streak = current_streak;
longest_letter = str[i];
}
}
cout << "The longest streak is: " << longest_streak << " and the character is: " << longest_letter << "\n";
LE: If needed, I can provide the Java code for it, but I think you get the idea.
public class Main {
public static void main(String[] args) {
System.out.println(longestStreak("KDDiiigllllldddfnnlleeezzeddd"));
}
private static int longestStreak(String str) {
int longest_streak = 1, current_streak = 1; char longest_letter = str.charAt(0);
for (int i = 1; i < str.length(); ++i) {
if (str.charAt(i) == str.charAt(i - 1))
current_streak++;
else current_streak = 1;
if (current_streak > longest_streak) {
longest_streak = current_streak;
longest_letter = str.charAt(i);
}
}
return longest_streak;
}
}
The loop could be rewritten a bit smaller, but mainly the condition can be optimized:
i < str.length() - max
Using Stream and collector. It should give all highest repeated elements.
Code:
String lineString = "KDDiiiiiiigllllldddfnnlleeezzeddd";
String[] lineSplit = lineString.split("");
Map<String, Integer> collect = Arrays.stream(lineSplit)
.collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(e -> 1)));
int maxValueInMap = (Collections.max(collect.values()));
for (Entry<String, Integer> entry : collect.entrySet()) {
if (entry.getValue() == maxValueInMap) {
System.out.printf("Character: %s, Repetition: %d\n", entry.getKey(), entry.getValue());
}
}
Output:
Character: i, Repetition: 7
Character: l, Repetition: 7
P.S I am not sure how efficient this code it. I just learned Streams.

Searching LinkedList, comparing two "Strings"?

I have a LinkedList where each node contains a word. I also have a variable that contains 6 randomly generated letters. I have a code the determines all possible letter combinations of those letters. I need to traverse through the linked list and determine the best "match" among the nodes.
Example:
-Letters generated: jghoot
-Linked list contains: cat, dog, cow, loot, hooter, ghlooter (I know ghlooter isn't a word)
The method would return hooter because it shares the most characters and is most similar to it. Any ideas?
I guess you could say I am looking for the word that the generated letters are a substring of.
use a nested for loop, compare each letter of your original string with the one you are comparing it to, and for each match, increase a local int variable. at the end of the loop, compare local int variable to global int variable that holds the "best" match till that one, and if bigger, store local int into global one, and put your found node into global node. at the end you should have a node which matches closest.
something like this
int currBest = 0;
int currBestNode = firstNodeOfLinkedList;
while(blabla)
{
int localBest = 0;
for(i= 0; i < currentNodeWord.length; i++)
{
for(j=0; j < originalWord.length;j++)
{
if(currentNodeWord[i] == originalWord[j])
{
localBest++
}
}
}
if(localBest > currBest)
{
currBest = localBest;
currBestNode = currentNodeStringBeingSearched;
}
}
// here u are out of ur loop, ur currBestNode should be set to the best match found.
hope that helps
If you're considering only character counts, first you need a method to count chars in a word
public int [] getCharCounts(String word) {
int [] result = new int['z' - 'a' + 1];
for(int i = 0; i<word.length(); i++) result[word.charAt(i) - 'a']++;
return result;
}
and then you need to compare character counts of two words
public static int compareCounts(int [] count1, int [] count2) [
int result = 0;
for(int i = 0; i<count1.length; i++) {
result += Math.min(count1[i], count2[i]);
}
return result;
}
public static void main(String[] args) {
String randomWord = "jghoot";
int [] randomWordCharCount = getCharCounts(randomWord);
ArrayList<String> wordList = new ArrayList();
String bestElement = null;
int bestMatch = -1;
for(String word : wordList) {
int [] wordCount = getCharCounts(word);
int cmp = compareCounts(randomWordCharCount, wordCount);
if(cmp > bestMatch) {
bestMatch = cmp;
bestElement = word;
}
}
System.out.println(word);
}
I think this works.
import java.util.*;
import java.io.*;
class LCSLength
{
public static void main(String args[])
{
ArrayList<String> strlist=new ArrayList<String>();
strlist.add(new String("cat"));
strlist.add(new String("cow"));
strlist.add(new String("hooter"));
strlist.add(new String("dog"));
strlist.add(new String("loot"));
String random=new String("jghoot"); //Your String
int maxLength=-1;
String maxString=new String();
for(String s:strlist)
{
int localMax=longestSubstr(s,random);
if(localMax>maxLength)
{
maxLength=localMax;
maxString=s;
}
}
System.out.println(maxString);
}
public static int longestSubstr(String first, String second) {
if (first == null || second == null || first.length() == 0 || second.length() == 0) {
return 0;
}
int maxLen = 0;
int fl = first.length();
int sl = second.length();
int[][] table = new int[fl+1][sl+1];
for(int s=0; s <= sl; s++)
table[0][s] = 0;
for(int f=0; f <= fl; f++)
table[f][0] = 0;
for (int i = 1; i <= fl; i++) {
for (int j = 1; j <= sl; j++) {
if (first.charAt(i-1) == second.charAt(j-1)) {
if (i == 1 || j == 1) {
table[i][j] = 1;
}
else {
table[i][j] = table[i - 1][j - 1] + 1;
}
if (table[i][j] > maxLen) {
maxLen = table[i][j];
}
}
}
}
return maxLen;
}
}
Credits: Wikipedia for longest common substring algorithm.
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring

Finding the intersection between two list of string candidates

I wrote the following Java code, to find the intersection between the prefix and the suffix of a String in Java.
// you can also use imports, for example:
// import java.math.*;
import java.util.*;
class Solution {
public int max_prefix_suffix(String S) {
if (S.length() == 0) {
return 1;
}
// prefix candidates
Vector<String> prefix = new Vector<String>();
// suffix candidates
Vector<String> suffix = new Vector<String>();
// will tell me the difference
Set<String> set = new HashSet<String>();
int size = S.length();
for (int i = 0; i < size; i++) {
String candidate = getPrefix(S, i);
// System.out.println( candidate );
prefix.add(candidate);
}
for (int i = size; i >= 0; i--) {
String candidate = getSuffix(S, i);
// System.out.println( candidate );
suffix.add(candidate);
}
int p = prefix.size();
int s = suffix.size();
for (int i = 0; i < p; i++) {
set.add(prefix.get(i));
}
for (int i = 0; i < s; i++) {
set.add(suffix.get(i));
}
System.out.println("set: " + set.size());
System.out.println("P: " + p + " S: " + s);
int max = (p + s) - set.size();
return max;
}
// codility
// y t i l i d o c
public String getSuffix(String S, int index) {
String suffix = "";
int size = S.length();
for (int i = size - 1; i >= index; i--) {
suffix += S.charAt(i);
}
return suffix;
}
public String getPrefix(String S, int index) {
String prefix = "";
for (int i = 0; i <= index; i++) {
prefix += S.charAt(i);
}
return prefix;
}
public static void main(String[] args) {
Solution sol = new Solution();
String t1 = "";
String t2 = "abbabba";
String t3 = "codility";
System.out.println(sol.max_prefix_suffix(t1));
System.out.println(sol.max_prefix_suffix(t2));
System.out.println(sol.max_prefix_suffix(t3));
System.exit(0);
}
}
Some test cases are:
String t1 = "";
String t2 = "abbabba";
String t3 = "codility";
and the expected values are:
1, 4, 0
My idea was to produce the prefix candidates and push them into a vector, then find the suffix candidates and push them into a vector, finally push both vectors into a Set and then calculate the difference. However, I'm getting 1, 7, and 0. Could someone please help me figure it out what I'm doing wrong?
I'd write your method as follows:
public int max_prefix_suffix(String s) {
final int len = s.length();
if (len == 0) {
return 1; // there's some dispute about this in the comments to your post
}
int count = 0;
for (int i = 1; i <= len; ++i) {
final String prefix = s.substring(0, i);
final String suffix = s.substring(len - i, len);
if (prefix.equals(suffix)) {
++count;
}
}
return count;
}
If you need to compare the prefix to the reverse of the suffix, I'd do it like this:
final String suffix = new StringBuilder(s.substring(len - i, len))
.reverse().toString();
I see that the code by #ted Hop is good..
The question specify to return the max number of matching characters in Suffix and Prefix of a given String, which is a proper subset. Hence the entire string is not taken into consideration for this max number.
Ex. "abbabba", prefix and suffix can have abba(first 4 char) - abba (last 4 char),, hence the length 4
codility,, prefix(c, co,cod,codi,co...),, sufix (y, ty, ity, lity....), none of them are same.
hence length here is 0.
By modifying the count here from
if (prefix.equals(suffix)) {
++count;
}
with
if (prefix.equals(suffix)) {
count = prefix.length();// or suffix.length()
}
we get the max length.
But could this be done in O(n).. The inbuilt function of string equals, i believe would take O(n), and hence overall complexity is made O(n2).....
i would use this code.
public static int max_prefix_suffix(String S)
{
if (S == null)
return -1;
Set<String> set = new HashSet<String>();
int len = S.length();
StringBuilder builder = new StringBuilder();
for (int i = 0; i < len - 1; i++)
{
builder.append(S.charAt(i));
set.add(builder.toString());
}
int max = 0;
for (int i = 1; i < len; i++)
{
String suffix = S.substring(i, len);
if (set.contains(suffix))
{
int suffLen = suffix.length();
if (suffLen > max)
max = suffLen;
}
}
return max;
}
#ravi.zombie
If you need the length in O(n) then you just need to change Ted's code as below:
int max =0;
for (int i = 1; i <= len-1; ++i) {
final String prefix = s.substring(0, i);
final String suffix = s.substring(len - i, len);
if (prefix.equals(suffix) && max < i) {
max =i;
}
return max;
}
I also left out the entire string comparison to get proper prefix and suffixes so this should return 4 and not 7 for an input string abbabba.

Permutate a String to upper and lower case

I have a string, "abc". How would a program look like (if possible, in Java) who permute the String?
For example:
abc
ABC
Abc
aBc
abC
ABc
abC
AbC
Something like this should do the trick:
void printPermutations(String text) {
char[] chars = text.toCharArray();
for (int i = 0, n = (int) Math.pow(2, chars.length); i < n; i++) {
char[] permutation = new char[chars.length];
for (int j =0; j < chars.length; j++) {
permutation[j] = (isBitSet(i, j)) ? Character.toUpperCase(chars[j]) : chars[j];
}
System.out.println(permutation);
}
}
boolean isBitSet(int n, int offset) {
return (n >> offset & 1) != 0;
}
As you probably already know, the number of possible different combinations is 2^n, where n equals the length of the input string.
Since n could theoretically be fairly large, there's a chance that 2^n will exceed the capacity of a primitive type such as an int. (The user may have to wait a few years for all of the combinations to finish printing, but that's their business.)
Instead, let's use a bit vector to hold all of the possible combinations. We'll set the number of bits equal to n and initialize them all to 1. For example, if the input string is "abcdefghij", the initial bit vector values will be {1111111111}.
For every combination, we simply have to loop through all of the characters in the input string and set each one to uppercase if its corresponding bit is a 1, else set it to lowercase. We then decrement the bit vector and repeat.
For example, the process would look like this for an input of "abc":
Bits:   Corresponding Combo:
111    ABC
110    ABc
101    AbC
100    Abc
011    aBC
010    aBc
001    abC
000    abc
By using a loop rather than a recursive function call, we also avoid the possibility of a stack overflow exception occurring on large input strings.
Here is the actual implementation:
import java.util.BitSet;
public void PrintCombinations(String input) {
char[] currentCombo = input.toCharArray();
// Create a bit vector the same length as the input, and set all of the bits to 1
BitSet bv = new BitSet(input.length());
bv.set(0, currentCombo.length);
// While the bit vector still has some bits set
while(!bv.isEmpty()) {
// Loop through the array of characters and set each one to uppercase or lowercase,
// depending on whether its corresponding bit is set
for(int i = 0; i < currentCombo.length; ++i) {
if(bv.get(i)) // If the bit is set
currentCombo[i] = Character.toUpperCase(currentCombo[i]);
else
currentCombo[i] = Character.toLowerCase(currentCombo[i]);
}
// Print the current combination
System.out.println(currentCombo);
// Decrement the bit vector
DecrementBitVector(bv, currentCombo.length);
}
// Now the bit vector contains all zeroes, which corresponds to all of the letters being lowercase.
// Simply print the input as lowercase for the final combination
System.out.println(input.toLowerCase());
}
public void DecrementBitVector(BitSet bv, int numberOfBits) {
int currentBit = numberOfBits - 1;
while(currentBit >= 0) {
bv.flip(currentBit);
// If the bit became a 0 when we flipped it, then we're done.
// Otherwise we have to continue flipping bits
if(!bv.get(currentBit))
break;
currentBit--;
}
}
String str = "Abc";
str = str.toLowerCase();
int numOfCombos = 1 << str.length();
for (int i = 0; i < numOfCombos; i++) {
char[] combinations = str.toCharArray();
for (int j = 0; j < str.length(); j++) {
if (((i >> j) & 1) == 1 ) {
combinations[j] = Character.toUpperCase(str.charAt(j));
}
}
System.out.println(new String(combinations));
}
You can also use backtracking to solve this problem:
public List<String> letterCasePermutation(String S) {
List<String> result = new ArrayList<>();
backtrack(0 , S, "", result);
return result;
}
private void backtrack(int start, String s, String temp, List<String> result) {
if(start >= s.length()) {
result.add(temp);
return;
}
char c = s.charAt(start);
if(!Character.isAlphabetic(c)) {
backtrack(start + 1, s, temp + c, result);
return;
}
if(Character.isUpperCase(c)) {
backtrack(start + 1, s, temp + c, result);
c = Character.toLowerCase(c);
backtrack(start + 1, s, temp + c, result);
}
else {
backtrack(start + 1, s, temp + c, result);
c = Character.toUpperCase(c);
backtrack(start + 1, s, temp + c, result);
}
}
Please find here the code snippet for the above :
public class StringPerm {
public static void main(String[] args) {
String str = "abc";
String[] f = permute(str);
for (int x = 0; x < f.length; x++) {
System.out.println(f[x]);
}
}
public static String[] permute(String str) {
String low = str.toLowerCase();
String up = str.toUpperCase();
char[] l = low.toCharArray();
char u[] = up.toCharArray();
String[] f = new String[10];
f[0] = low;
f[1] = up;
int k = 2;
char[] temp = new char[low.length()];
for (int i = 0; i < l.length; i++)
{
temp[i] = l[i];
for (int j = 0; j < u.length; j++)
{
if (i != j) {
temp[j] = u[j];
}
}
f[k] = new String(temp);
k++;
}
for (int i = 0; i < u.length; i++)
{
temp[i] = u[i];
for (int j = 0; j < l.length; j++)
{
if (i != j) {
temp[j] = l[j];
}
}
f[k] = new String(temp);
k++;
}
return f;
}
}
You can do something like
```
import java.util.*;
public class MyClass {
public static void main(String args[]) {
String n=(args[0]);
HashSet<String>rs = new HashSet();
helper(rs,n,0,n.length()-1);
System.out.println(rs);
}
public static void helper(HashSet<String>rs,String res , int l, int n)
{
if(l>n)
return;
for(int i=l;i<=n;i++)
{
res=swap(res,i);
rs.add(res);
helper(rs,res,l+1,n);
res=swap(res,i);
}
}
public static String swap(String st,int i)
{
char c = st.charAt(i);
char ch[]=st.toCharArray();
if(Character.isUpperCase(c))
{
c=Character.toLowerCase(c);
}
else if(Character.isLowerCase(c))
{
c=Character.toUpperCase(c);
}
ch[i]=c;
return new String(ch);
}
}
```

Categories

Resources