Finding Count of Pattern in a String (Overlap Inclusive) - java

So I'm trying to write an algorithm that counts the number of occurrences of some pattern, say "aa", within a string, say "aaabca." The number of patterns in that string should return an integer, in this case 2, because the first three characters contain two occurrences of the pattern.
What I have counts the occurrences under the assumption that they do NOT overlap:
import java.util.Scanner;

public class Pattern{
public static void main(String[] args){
Scanner scan = new Scanner(System.in);
System.out.println("Enter the string: ");
String s = scan.nextLine();
String[] splittedInput = s.split(";");
String pattern = splittedInput[0];
String blobs = splittedInput[1];
count(pattern, blobs); // count is static, so no Pattern instance is needed
}
public static void count(String pattern, String blobs){
String[] substrings = blobs.split("[|]");
int numOccurences = 0;
int[] instances = new int[substrings.length];
int patternLength = pattern.length();
for (int i = 0; i < instances.length; i++){
int length = substrings[i].length();
String temp = substrings[i];
temp = temp.replaceAll(pattern, "");
int postLength = temp.length();
numOccurences = (length - postLength) / pattern.length();
instances[i] = numOccurences;
numOccurences = 0;
}
int sum = 0;
for (int i = 0; i < instances.length; i++){
System.out.print(instances[i] + "|");
sum += instances[i];
}
System.out.print(sum);
}
}
Any suggestions?

I would personally compare the pattern as a substring in this case. For example a run of a single String from your array would look like this:
//Initial values
String blobs = "aaaabcaaa";
String pattern = "aa";
String[] substrings = blobs.split("[|]");
//The code I added that should be placed into the loop
int numOccurences = 0;
String str = substrings[0];
for (int k = 0; k <= (str.length() - pattern.length()); k++)
{
if (str.substring(k, k + pattern.length()).equals(pattern))
{
numOccurences++;
}
}
System.out.println(numOccurences);
If you want to run this on each String in your array simply modify String str = substrings[0] to String str = substrings[i] and iterate over the array storing the final numOccurences as you please.
Example Run:
String is aaaabcaaa
Pattern is aa
Output is 5 occurrences
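To tie this back to the input format in the question, here is a minimal sketch that runs the same substring comparison on every '|'-separated blob and builds the per-blob counts followed by the total. The class name OverlapCount, the method names and the sample input are made up for illustration, and treating an empty pattern as "no match" is my own assumption.

```java
class OverlapCount {
    // Counts occurrences of pattern in text, allowing overlaps.
    static int countOverlapping(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // assumption: an empty pattern counts as no match
        }
        int count = 0;
        for (int k = 0; k <= text.length() - pattern.length(); k++) {
            if (text.substring(k, k + pattern.length()).equals(pattern)) {
                count++;
            }
        }
        return count;
    }

    // Builds "n1|n2|...|sum", the same layout the question's code prints.
    static String report(String pattern, String blobs) {
        StringBuilder out = new StringBuilder();
        int sum = 0;
        for (String blob : blobs.split("[|]")) {
            int n = countOverlapping(blob, pattern);
            out.append(n).append('|');
            sum += n;
        }
        return out.append(sum).toString();
    }

    public static void main(String[] args) {
        // hypothetical input: pattern "aa" and three blobs
        System.out.println(report("aa", "aaabca|aaaabcaaa|xyz")); // prints 2|5|0|7
    }
}
```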

For a single String (theStr), where match is the String you're looking for:
int len = theStr.length ();
int start = 0;
int pos;
int count = 0;
while ((start < len) && ((pos = theStr.indexOf (match, start)) >= 0))
{
++count;
start = pos + 1;
}
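For completeness, the same indexOf loop wrapped into a method might look like the sketch below. The class name IndexOfCount and the empty-pattern guard are my own additions; the parameter names follow the snippet above.

```java
class IndexOfCount {
    // Counts overlapping occurrences of match in theStr using indexOf.
    static int countOverlapping(String theStr, String match) {
        if (match.isEmpty()) {
            return 0; // assumption: an empty pattern counts as no match
        }
        int count = 0;
        int pos = theStr.indexOf(match);
        while (pos >= 0) {
            count++;
            // restart one character after the last hit so overlaps are found
            pos = theStr.indexOf(match, pos + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOverlapping("aaabca", "aa")); // prints 2
    }
}
```

Because indexOf jumps straight to the next hit, this skips over stretches of the string that cannot match instead of testing every position.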

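If you use Java 8 you can count this value in the following way. Note that this version pairs each character with its neighbour, so it only works for two-character patterns.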
Example:
String blobs = "aaabcaaa";
String pattern = "aa";
List<String> strings = Arrays.asList(blobs.split(""));
long count = IntStream.range(0, strings.size())
.mapToObj(index -> index < strings.size() - 1 ? strings.get(index) + strings.get(index + 1) : strings.get(index - 1))
.filter(str -> str.equals(pattern))
.count();
System.out.println("Result count: " + count);
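If the pattern can have any length, a stream over the start indexes is one way to generalise this. The following is only a sketch: the class name StreamCount and the method name are hypothetical, and it relies on String.startsWith(prefix, offset) to test each position.

```java
import java.util.stream.IntStream;

class StreamCount {
    // Counts overlapping occurrences of pattern in blobs for any pattern length.
    static long countOverlapping(String blobs, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // assumption: an empty pattern counts as no match
        }
        // rangeClosed yields an empty stream when the pattern is longer than the text
        return IntStream.rangeClosed(0, blobs.length() - pattern.length())
                .filter(i -> blobs.startsWith(pattern, i))
                .count();
    }

    public static void main(String[] args) {
        System.out.println("Result count: " + countOverlapping("aaabcaaa", "aa")); // prints Result count: 4
    }
}
```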

Continually taking substrings and using the startsWith method seems to work pretty well.
String pat = "ss";
String str = "kskslsksaaaslsslssskssssllsssss";
int count = 0;
while (str.length() >= pat.length()) {
count += str.startsWith(pat) ? 1 : 0;
str = str.substring(1);
}
System.out.println("count = " + count);
You can also take a similar approach with streams.
long count = IntStream.range(0, str.length()).mapToObj(
n -> str.substring(n)).filter(n -> n.startsWith(pat)).count();
System.out.println("count = " + count);
But in this case I actually prefer the non-stream approach.

Related

How to convert remainder int value to string (Characters)

How can I turn an int value back into a string (its characters)? My string is 42646 characters long; after taking the modulus, the rounded-down value is 42600. How do I get and print the characters up to that value?
int count = image_length.length(); //count=42646
System.out.println(count);
int mod = count % length; //46
int rem = count - mod; //42600
String rem_value = String.valueOf(rem);
// I want to get the string for the remainder value 42600, but how?
String[] split = rem_value.split("[^a-zA-Z/]", length);
getSaltString();
photoName = randStr + "_IMG.jpg";
StringBuilder stringBuilder = new StringBuilder();
for (int i = 0; i < split.length; i++) {
url_part = String.valueOf(stringBuilder.append(split[i]));
new RegisterImageThread(ActivityRegisterUploadPhoto.this).execute(photoName, url_part + i);
}
new RegisterImageThread(ActivityRegisterUploadPhoto.this).execute(photoName, url_part+rem_value);
So you want to split the numeric String "42600" into an array, right?
Replace the line:
String[] split = rem_value.split("[^a-zA-Z/]", length);
with something like this:
String rem_value = "42600"; //your rem_value
int[] split = new int[rem_value.length()];
for (int i = 0; i < rem_value.length(); i++) {
split[i] = Character.getNumericValue(rem_value.charAt(i));
}
//print the result
Arrays.stream(split).forEach(s -> System.out.println(s));

Convert comma separated string to array without string split?

Using Java, how would I convert "Paul, John, Ringo" to
Paul
John
Ringo
But while using a loop that searches for the commas and pulls out the words between them? I can't use anything like string split, strictly a loop and pretty simple java. Thanks!
String str = "Paul, John, Ringo";
List<String> words = new ArrayList<String>();
int cIndex = 0;
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == ',') {
String temp = str.substring(cIndex, i).trim();
cIndex = i + 1;
words.add(temp);
}
}
String temp = str.substring(str.lastIndexOf(',')+1,str.length()).trim();
words.add(temp);
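List<String> list = new ArrayList<String>(); // List is an interface, so instantiate ArrayList
String text = "your, text, here";
int indexTraversed = 0;
while(true){
    int i = text.indexOf(",", indexTraversed);
    if(i<0) break;
    list.add(text.substring(indexTraversed, i).trim());
    indexTraversed = i + 1; // continue just past this comma
}
list.add(text.substring(indexTraversed).trim()); // last word, after the final comma
String[] array = list.toArray(new String[0]);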
And if you can't use List:
String[] list = new String[100]; // assumes at most 100 words
int counter = 0;
String text = "your, text, here";
int indexTraversed = 0; // start of the current word
while(true){
    int i = text.indexOf(",", indexTraversed);
    if(i<0) break;
    list[counter] = text.substring(indexTraversed, i).trim();
    indexTraversed = i + 1; // continue just past this comma
    counter++;
}
list[counter++] = text.substring(indexTraversed).trim(); // last word, after the final comma
Then read the list array up to counter.

How to split string at every nth occurrence of character in Java

I would like to split a string at every 4th occurrence of a comma ,.
How to do this? Below is an example:
String str = "1,,,,,2,3,,1,,3,,";
Expected output:
array[0]: 1,,,,
array[1]: ,2,3,,
array[2]: 1,,3,,
I tried using Google Guava like this:
Iterable<String> splitdata = Splitter.fixedLength(4).split(str);
output: [1,,,, ,,2,, 3,,1, ,,3,, ,]
I also tried this:
String [] splitdata = str.split("(?<=\\G.{" + 4 + "})");
output: [1,,,, ,,2,, 3,,1, ,,3,, ,]
Yet this is not the output I want. I just want to split the string at every 4th occurrence of a comma.
Thanks.
Use two int variables. The first counts the commas: it is incremented each time a ',' occurs and reset to 0 when it reaches 4. The second marks where the next piece starts: it begins at 0, and after each piece is cut, the end of that piece becomes the start of the next one. Cut each piece with substring from the start point to the current end point (i+1, so that the 4th comma is included), and finally add the piece to the array list. Here is some sample code. Hope this helps.
String str = "1,,,,,2,3,,1,,3,,";
int k = 0;
int startPoint = 0;
ArrayList<String> arrayList = new ArrayList<>();
for (int i = 0; i < str.length(); i++)
{
if (str.charAt(i) == ',')
{
k++;
if (k == 4)
{
String ab = str.substring(startPoint, i+1);
System.out.println(ab);
arrayList.add(ab);
startPoint = i+1;
k = 0;
}
}
}
Here's a more flexible function, using an idea from this answer:
static List<String> splitAtNthOccurrence(String input, int n, String delimiter) {
List<String> pieces = new ArrayList<>();
// *? is the reluctant quantifier
String regex = Strings.repeat(".*?" + delimiter, n);
Matcher matcher = Pattern.compile(regex).matcher(input);
int lastEndOfMatch = -1;
while (matcher.find()) {
pieces.add(matcher.group());
lastEndOfMatch = matcher.end();
}
if (lastEndOfMatch != -1) {
pieces.add(input.substring(lastEndOfMatch));
}
return pieces;
}
This is how you call it using your example:
String input = "1,,,,,2,3,,1,,3,,";
List<String> pieces = splitAtNthOccurrence(input, 4, ",");
pieces.forEach(System.out::println);
// Output:
// 1,,,,
// ,2,3,,
// 1,,3,,
I use Strings.repeat from Guava.
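If you would rather not pull in Guava, the following is a rough sketch of the same idea using String.repeat, which assumes Java 11 or newer. The class name NthSplit is made up. Compared with the version above, I also quote the delimiter with Pattern.quote so that delimiters like "." or "|" are taken literally, and I only add the leftover when it is non-empty; both are my own changes, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NthSplit {
    static List<String> splitAtNthOccurrence(String input, int n, String delimiter) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> pieces = new ArrayList<>();
        // .*? is reluctant, so each repetition stops at the nearest delimiter
        String regex = (".*?" + Pattern.quote(delimiter)).repeat(n);
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        int lastEndOfMatch = 0;
        while (matcher.find()) {
            pieces.add(matcher.group());
            lastEndOfMatch = matcher.end();
        }
        if (lastEndOfMatch < input.length()) {
            // whatever follows the last complete group (or the whole input if none matched)
            pieces.add(input.substring(lastEndOfMatch));
        }
        return pieces;
    }

    public static void main(String[] args) {
        splitAtNthOccurrence("1,,,,,2,3,,1,,3,,", 4, ",").forEach(System.out::println);
    }
}
```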
Try this too if you want the result in an array:
String str = "1,,,,,2,3,,1,,3,,";
System.out.println(str);
char c[] = str.toCharArray();
int ptnCnt = 0;
for (char d : c) {
if(d==',')
ptnCnt++;
}
String result[] = new String[ptnCnt/4];
int i=-1;
int beginIndex = 0;
int cnt=0,loopcount=0;
for (char ele : c) {
loopcount++;
if(ele==',')
cnt++;
if(cnt==4){
cnt=0;
result[++i]=str.substring(beginIndex,loopcount);
beginIndex=loopcount;
}
}
for (String string : result) {
System.out.println(string);
}
This works perfectly and was tested in Java 8:
public String[] split(String input,int at){
String[] out = new String[2];
String p = String.format("((?:[^/]*/){%s}[^/]*)/(.*)",at);
Pattern pat = Pattern.compile(p);
Matcher matcher = pat.matcher(input);
if (matcher.matches()) {
out[0] = matcher.group(1);// left
out[1] = matcher.group(2);// right
}
return out;
}
//Ex: D:/folder1/folder2/folder3/file1.txt
//if at = 2, group(1) = D:/folder1/folder2 and group(2) = folder3/file1.txt
The accepted solution above by Saqib Rezwan does not add the leftover string to the list: if the string is split after every 4th comma and characters remain after the last split (for example a 9th character in a 9-character string), they are dropped and the returned list is incomplete.
A complete solution would be:
private static ArrayList<String> splitStringAtNthOccurrence(String str, int n) {
int k = 0;
int startPoint = 0;
ArrayList<String> list = new ArrayList<>();
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == ',') {
k++;
if (k == n) {
String ab = str.substring(startPoint, i + 1);
list.add(ab);
startPoint = i + 1;
k = 0;
}
}
// if there is no comma left and there are still some character in the string
// add them to list
else if (!str.substring(i).contains(",")) {
list.add(str.substring(startPoint));
break;
}
}
return list;
}

Java String: split String

I have this String:
String string="NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
How can I split it into an array every 4 commas?
I would like something like this:
String[] a=string.split("d{4}");
a[0]="NNP,PERSON,true,?";
a[1]="IN,O,false,pobj";
a[2]="NNP,ORGANIZATION,true,?";
a[3]="p";
Keep it simple. There is no need to use regex. Simply count the commas, and when four commas have been found, use String.substring() to cut out the value.
Finally, store the values in an ArrayList<String> instead of printing them.
String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
int count = 0;      // commas seen since the last cut
int beginIndex = 0; // start of the current group
int endIndex = 0;   // index of the character being examined
for (char ch : string.toCharArray()) {
    if (ch == ',') {
        count++;
    }
    if (count == 4) {
        // endIndex is on the 4th comma, so the group ends just before it
        System.out.println(string.substring(beginIndex, endIndex));
        beginIndex = endIndex + 1; // skip the comma itself
        count = 0;
    }
    endIndex++;
}
if (beginIndex < endIndex) {
    // leftover values after the last complete group
    System.out.println(string.substring(beginIndex, endIndex));
}
output:
NNP,PERSON,true,?
IN,O,false,pobj
NNP,ORGANIZATION,true,?
p
If you really have to use split you can use something like
String[] array = string.split("(?<=\\G[^,]{1,100},[^,]{1,100},[^,]{1,100},[^,]{1,100}),");
An explanation of the idea is in my previous answer on a similar but simpler topic.
Demo:
String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
String[] array = string.split("(?<=\\G[^,]{1,100},[^,]{1,100},[^,]{1,100},[^,]{1,100}),");
for (String s : array)
System.out.println(s);
output:
NNP,PERSON,true,?
IN,O,false,pobj
NNP,ORGANIZATION,true,?
p
But if you don't have to use split and still want to use regex, then I encourage you to use the Pattern and Matcher classes with a simple regex that finds the parts you are interested in, rather than a complicated regex that finds the parts you want to get rid of. I mean something like
1. any xx,xxx,xxx,xxx part where x is not ,
2. any xx or xx,xx or xxx,xxx,xxx parts if they are placed at the end of the string (to catch the rest of the data left unmatched by the regex from point 1.)
So
Pattern p = Pattern.compile("[^,]+(,[^,]+){3}|[^,]+(,[^,]+){0,2}$");
should do the trick.
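As a minimal sketch of how that pattern could be used with Matcher.find(), collecting every match into a list (the class name RegexGroups and the method name are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexGroups {
    // Four comma-separated values, or up to three values at the very end of the string.
    private static final Pattern GROUP =
            Pattern.compile("[^,]+(,[^,]+){3}|[^,]+(,[^,]+){0,2}$");

    static List<String> split(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = GROUP.matcher(s);
        while (m.find()) {
            // each match is one complete group; the comma between groups is skipped
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        String string = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
        split(string).forEach(System.out::println);
    }
}
```

Keep in mind that [^,]+ needs at least one character per value, so this only suits data without empty fields.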
Another solution, probably the fastest (and quite easy to write), would be to create your own parser that iterates over all characters of your string, stores them in a buffer, counts how many , have occurred so far, and whenever that number reaches a multiple of 4 writes the buffer's content to an array (or better, a dynamic collection like a list) and clears the buffer. Such a parser can look like
public static List<String> parse(String s){
List<String> tokens = new ArrayList<>();
StringBuilder sb = new StringBuilder();
int commaCounter = 0;
for (char ch: s.toCharArray()){
if (ch==',' && ++commaCounter == 4){
tokens.add(sb.toString());
sb.delete(0, sb.length());
commaCounter = 0;
}else{
sb.append(ch);
}
}
if (sb.length()>0)
tokens.add(sb.toString());
return tokens;
}
You can later convert List to array if you need but I would stay with List.
StringTokenizer tizer = new StringTokenizer(string, ",");
int total = tizer.countTokens();
int count = total / 4;         // number of complete groups of four
int overflowCount = total % 4; // tokens left over after the last complete group
String[] a = new String[overflowCount > 0 ? count + 1 : count];
int x = 0;
for (; x < count; x++) {
    a[x] = tizer.nextToken() + "," + tizer.nextToken() + "," + tizer.nextToken() + "," + tizer.nextToken();
}
if (overflowCount > 0) {
    // join the remaining tokens into the last slot
    StringBuilder rest = new StringBuilder();
    while (tizer.hasMoreTokens()) {
        if (rest.length() > 0) rest.append(",");
        rest.append(tizer.nextToken());
    }
    a[x] = rest.toString();
}
Try this:
String str = "NNP,PERSON,true,?,IN,O,false,pobj,NNP,ORGANIZATION,true,?,p";
String[] arr = str.split(",");
ArrayList<String> result = new ArrayList<String>();
StringBuilder s = new StringBuilder();
for (int i = 0; i < arr.length; i++) {
    if (i > 0 && i % 4 == 0) {
        // four values collected: store the group and start a new one
        result.add(s.toString());
        s.setLength(0);
    }
    if (i % 4 != 0) {
        s.append(",");
    }
    s.append(arr[i]);
}
result.add(s.toString()); // last group, possibly shorter than four values
for (String part : result) {
    System.out.println(part);
}
output:
NNP,PERSON,true,?
IN,O,false,pobj
NNP,ORGANIZATION,true,?
p

Fancy looping in Java

I have a problem wherein I have two strings, the length of one of which I will know only upon execution of my function. I want to write my function such that it takes these two strings and, based upon which one is longer, computes a final string as follows:
finalString = longerStringChars1AND2
+ shorterStringChar1
+ longerStringChars3and4
+ shorterStringChar2
+ longerStringChars5AND6
...and so on till the time the SHORTER STRING ENDS.
Once the shorter string ends, I want to append the remaining characters of the longer string to the final string, and exit. I have written some code, but there is too much looping for my liking. Any suggestions?
Here is the code I wrote - very basic -
public static byte [] generateStringToConvert(String a, String b){
// String b's length is always known to be 14.
StringBuffer stringToConvert = new StringBuffer();
int longer = (a.length()>14) ? a.length() : 14;
int shorter = (longer > 14) ? 14 : a.length();
int iteratorForLonger = 0;
int iteratorForShorter = 0;
while(iteratorForLonger < longer) {
int count = 2;
while(count>0){
stringToConvert.append(b.charAt(iteratorForLonger));
iteratorForLonger++;
count--;
}
if(iteratorForShorter < shorter && iteratorForLonger >= longer){
iteratorForLonger = 0;
}
if(iteratorForShorter<shorter){
stringToConvert.append(a.charAt(iteratorForShorter));
iteratorForShorter++;
}
else{
break;
}
}
if(stringToConvert.length()<32 | iteratorForLonger<b.length()){
String remainingString = b.substring(iteratorForLonger);
stringToConvert.append(remainingString);
}
System.out.println(stringToConvert);
return stringToConvert.toString().getBytes();
}
You can use StringBuilder to achieve this. Please find below source code.
public static void main(String[] args) throws InterruptedException {
int MAX_ALLOWED_LENGTH = 14;
String str1 = "yyyyyyyyyyyyyyyy";
String str2 = "xxxxxx";
StringBuilder builder = new StringBuilder(MAX_ALLOWED_LENGTH);
builder.append(str1);
char[] shortChar = str2.toCharArray();
int index = 2;
for (int charCount = 0; charCount < shortChar.length;) {
if (index < builder.length()) {
// insert 1 character from short string to long string
builder.insert(index, shortChar, charCount, 1);
}
// 2+1 because the insertion shifts the following characters by one
index = index + 3;
charCount = charCount + 1;
}
String trimmedString = builder.substring(0, MAX_ALLOWED_LENGTH);
System.out.println(trimmedString);
}
Output
yyxyyxyyxyyxyy
String one = "longwordorsomething";
String two = "short";
String shortString = "";
String longString = "";
if(one.length() > two.length()) {
shortString = two;
longString = one;
} else {
shortString = one;
longString = two;
}
StringBuilder newString = new StringBuilder();
int j = 0;
for(int i = 0; i < shortString.length(); i++) {
if((j + 2) <= longString.length()) { // <= so the final pair of the longer string is still used
newString.append(longString.substring(j, j + 2));
j += 2;
}
newString.append(shortString.substring(i, i + 1));
}
// Append last part
newString.append(longString.substring(j));
System.out.println(newString);
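The same idea as a reusable method, in case that is easier to drop into your code. This is only a sketch under my reading of the question (two characters from the longer string, then one from the shorter, then the rest of the longer one); the class name Interleave and the method name are made up, and when the longer string runs out early the remaining characters of the shorter string are simply appended one by one.

```java
class Interleave {
    static String interleave(String one, String two) {
        boolean oneIsLonger = one.length() >= two.length();
        String longer = oneIsLonger ? one : two;
        String shorter = oneIsLonger ? two : one;

        StringBuilder out = new StringBuilder(one.length() + two.length());
        int j = 0; // next unread index in the longer string
        for (int i = 0; i < shorter.length(); i++) {
            int end = Math.min(j + 2, longer.length());
            out.append(longer, j, end);    // up to two chars from the longer string
            j = end;
            out.append(shorter.charAt(i)); // then one char from the shorter string
        }
        out.append(longer, j, longer.length()); // remainder of the longer string
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(interleave("abcdefgh", "xy")); // prints abxcdyefgh
    }
}
```

There is a single loop over the shorter string, and the order of the arguments does not matter.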
