Extract an int from a large string - java

I'm trying to get an int from a String. The String will always come as:
"mombojumbomombojumbomombojumbomombojumbomombojumbomombojumbohello=1?fdjaslkd;fdsjaflkdjfdklsa;fjdklsa;djsfklsa;dfjklds;afj=124214fdsamf=352"
The only constant in all of this is that I will have a "hello=" followed by a number. With just that, I can't figure out how to pull out the number after the "hello=". This is what I have tried so far, with no luck.
EDIT: The number will always be followed by a "?"
String[] tokens = s.split("hello=");
for (String t : tokens)
System.out.println(t);
I can't figure out how to isolate it from both sides of the int.

Pattern p = Pattern.compile("hello=(\\d+)");
Matcher m = p.matcher (s);
while (m.find())
System.out.println(m.group(1));
This sets up a search for anywhere in s that contains hello= followed by one or more digits (\\d+ means one or more digits). The loop looks for each occurrence of this pattern, and then whenever it finds a match, m.group(1) extracts the digits (since those are grouped in the pattern).
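As a minimal sketch of the same idea, the captured digits can be converted straight to an int. The class and method names (HelloValue, extract) are made up for illustration, and returning an OptionalInt for the "not found" case is just one possible design choice:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HelloValue {
    // Compiled once: "hello=" followed by one or more digits (captured as group 1).
    private static final Pattern HELLO = Pattern.compile("hello=(\\d+)");

    // Returns the first number after "hello=", or an empty OptionalInt if there is none.
    static OptionalInt extract(String s) {
        Matcher m = HELLO.matcher(s);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        // Note: parseInt throws NumberFormatException if the digits overflow an int.
        return OptionalInt.of(Integer.parseInt(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(extract("mombojumbohello=1?fdjaslkd;afj=124214fdsamf=352"));
    }
}
```

Only the first occurrence is read here; the while loop above would be the better fit if "hello=" can appear more than once.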

You should use a regular expression for this:
String str = "mombojumbomombojumbomombojumbomombojumbomombojumbomombojumbohello=1fdjaslkd;fdsjaflkdjfdklsa;fjdklsa;djsfklsa;dfjklds;afj=124214fdsamf=352";
Pattern p = Pattern.compile("hello=(\\d+)");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println(m.group(1)); // prints 1
}

Try this:
String r = null;
int col = s.indexOf("hello="); // find the starting column of the marker string
if (col >= 0) {
String s2 = s.substring(col + 6); // get digits and the rest (add length of marker)
col = 0;
// now find the end of the digits (assume no plus or comma or dot chars)
while (col < s2.length() && Character.isDigit(s2.charAt(col))) {
col++;
}
if (col > 0) {
r = s2.substring(0, col); // get the digits off the front
}
}
r will be the string you want or it will be null if no number was found.

Here is another non-regex approach aimed at performance, wrapped in a method for your convenience.
Helper method
public static Integer getIntegerForKey(String key, String s)
{
int startIndex = s.indexOf(key);
if (startIndex == -1)
return null;
startIndex += key.length();
int endIndex = startIndex;
int len = s.length();
while(endIndex < len && Character.isDigit(s.charAt(endIndex))) {
++endIndex;
}
if (endIndex > startIndex)
return Integer.valueOf(s.substring(startIndex, endIndex)); // new Integer(String) is deprecated
return null;
}
Usage
Integer result = getIntegerForKey("hello=", yourInputString);
if (result != null)
System.out.println(result);
else
System.out.println("Key-integer pair not found.");

Yet another non regex solution:
String str = "mombojumbomombojumbomombojumbomombojumbomombojumbomombojumbohello=142?fdjaslkd;fdsjaflkdjfdklsa;fjdklsa;djsfklsa;dfjklds;afj=124214fdsamf=352";
char arr[] = str.substring(str.indexOf("hello=")+6).toCharArray();
String buff ="";
int i=0;
while(i < arr.length && Character.isDigit(arr[i])){ // bounds check in case the digits run to the end of the string
buff += arr[i++];
}
int result = Integer.parseInt(buff);
System.out.println(result);

Related

How to compare two char arrays comparing characters in the same chronology but given an extra sign which stands for every possible sign?

char [] text = {'H','e','l','L','o','H','e','l','L','o'};
char[] pat = {'H','e','?','l','o'}; //'?' stands for every possible sign
We can ignore if the letters are upper or lower case.
Now I need to output how often it occurs.
Output: He?lo is in HelLoHelLo 2x
I know that you can use string methods like "contain" but how can I consider the question mark ?
public int matchCount(char[] text, char[] pattern) {
int consecCharHits = 0, matchCount = 0;
for (int i = 0; i < text.length; i++) {
if (Character.toLowerCase(text[i]) == Character.toLowerCase(pattern[consecCharHits]) || '?' == pattern[consecCharHits]) { // if char matches (ignoring case) or the pattern has the wildcard
consecCharHits++;
if (consecCharHits == pattern.length) { // if the whole pattern matches
matchCount++;
i -= consecCharHits - 1; // return to the next position to be evaluated
consecCharHits = 0; // reset consecutive char hits
}
} else {
i -= consecCharHits;
consecCharHits = 0;
}
}
return matchCount;
}
The way I would naively implement it without thinking too much about it
create inputIndex and set it to 0
create matchIndex and set it to 0
iterate over the input by incrementing the inputIndex one by one
compare the char in the input at inputIndex with the char in the match at matchIndex
if they "match" increment the matchIndex by one - if they don't set matchIndex to 0
if the matchIndex equals to your pat length increment the actual count of matches by one and set matchIndex back to 0
Where I wrote "match" you need to implement your custom match logic, ignoring the case and considering everything a match if the pattern at this place is a ?.
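The steps above can be sketched roughly as follows. This version restarts the comparison at every start position instead of resetting a single matchIndex, so a failed partial match cannot hide a later one, and it counts overlapping occurrences. The names WildcardCount, countMatches and charMatches are invented for this sketch:

```java
class WildcardCount {
    // Custom "match" logic: '?' in the pattern matches anything,
    // otherwise the two characters are compared ignoring case.
    static boolean charMatches(char textChar, char patChar) {
        return patChar == '?'
                || Character.toLowerCase(textChar) == Character.toLowerCase(patChar);
    }

    // Counts every start position in text where the whole pattern matches.
    static int countMatches(char[] text, char[] pat) {
        if (pat.length == 0) {
            return 0; // assumption: an empty pattern counts as "no matches"
        }
        int count = 0;
        for (int start = 0; start + pat.length <= text.length; start++) {
            int matchIndex = 0;
            while (matchIndex < pat.length
                    && charMatches(text[start + matchIndex], pat[matchIndex])) {
                matchIndex++;
            }
            if (matchIndex == pat.length) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        char[] text = {'H','e','l','L','o','H','e','l','L','o'};
        char[] pat = {'H','e','?','l','o'};
        // prints: He?lo is in HelLoHelLo 2x
        System.out.println(new String(pat) + " is in " + new String(text)
                + " " + countMatches(text, pat) + "x");
    }
}
```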
@Test
public void match() {
char [] text = {'H','e','l','L','o','H','e','l','L','o'};
char[] pat = {'H','e','?','l','o'}; //'?' stands for every possible sign
printMatch(text, pat);
}
private void printMatch(char[] text, char[] pat) {
String textStr = new String(text);
String patStr = new String(pat);
final String regexPattern = patStr.replace('?', '.').toLowerCase();
final Pattern pattern = Pattern.compile(regexPattern);
final Matcher matcher = pattern.matcher(textStr.toLowerCase());
while (matcher.find()) {
System.out.println(patStr + " is in " + textStr );
}
}
What about this ?
static int countPatternOccurences (char [] text, char [] pat)
{
int i = 0;
int j = 0;
int k = 0;
while ( i < text.length)
{
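// compare case-insensitively; getNumericValue returns -1 for every non-alphanumeric char, so it would treat them all as equal
int a = Character.toLowerCase(pat[j]);
int b = Character.toLowerCase(text[i]);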
if (a == b || pat[j] =='?')
{
j++;
}
else
{
j=0;
//return 0;
}
if(j == pat.length)
{
k++;
j = 0;
}
i++;
}
return k; // returns occurrences of pat in text
}

How to split string at every nth occurrence of character in Java

I would like to split a string at every 4th occurrence of a comma ,.
How to do this? Below is an example:
String str = "1,,,,,2,3,,1,,3,,";
Expected output:
array[0]: 1,,,,
array[1]: ,2,3,,
array[2]: 1,,3,,
I tried using Google Guava like this:
Iterable<String> splitdata = Splitter.fixedLength(4).split(str);
output: [1,,,, ,,2,, 3,,1, ,,3,, ,]
I also tried this:
String [] splitdata = str.split("(?<=\\G.{" + 4 + "})");
output: [1,,,, ,,2,, 3,,1, ,,3,, ,]
Yet this is not the output I want. I just want to split the string at every 4th occurrence of a comma.
Thanks.
Use two int variables. The first counts the commas: it is incremented each time a ',' occurs and reset to 0 when it reaches 4. The second marks where the next piece starts: it begins at 0, and after each piece is cut it moves to the position just past that piece's last comma. Each piece is the substring from the start point to i + 1 (the end index of substring is exclusive, so i + 1 includes the comma at i). Finally, add each piece to the array list. Here is some sample code; I hope it helps.
String str = "1,,,,,2,3,,1,,3,,";
int k = 0;
int startPoint = 0;
ArrayList<String> arrayList = new ArrayList<>();
for (int i = 0; i < str.length(); i++)
{
if (str.charAt(i) == ',')
{
k++;
if (k == 4)
{
String ab = str.substring(startPoint, i+1);
System.out.println(ab);
arrayList.add(ab);
startPoint = i+1;
k = 0;
}
}
}
Here's a more flexible function, using an idea from this answer:
static List<String> splitAtNthOccurrence(String input, int n, String delimiter) {
List<String> pieces = new ArrayList<>();
// *? is the reluctant quantifier
String regex = Strings.repeat(".*?" + Pattern.quote(delimiter), n); // quote so the delimiter is matched literally
Matcher matcher = Pattern.compile(regex).matcher(input);
int lastEndOfMatch = 0;
while (matcher.find()) {
pieces.add(matcher.group());
lastEndOfMatch = matcher.end();
}
// add the remainder after the last full group, if there is one
if (lastEndOfMatch < input.length()) {
pieces.add(input.substring(lastEndOfMatch));
}
return pieces;
}
This is how you call it using your example:
String input = "1,,,,,2,3,,1,,3,,";
List<String> pieces = splitAtNthOccurrence(input, 4, ",");
pieces.forEach(System.out::println);
// Output:
// 1,,,,
// ,2,3,,
// 1,,3,,
I use Strings.repeat from Guava.
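If Guava is not available, a similar function can be sketched with String.repeat (Java 11+). This is a variation on the code above rather than the answer's original: it quotes the delimiter with Pattern.quote so regex meta-characters are taken literally, rejects n < 1, and keeps a non-empty remainder. The class name NthSplit is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NthSplit {
    static List<String> splitAtNthOccurrence(String input, int n, String delimiter) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> pieces = new ArrayList<>();
        // Reluctant ".*?" followed by the literal delimiter, repeated n times.
        String regex = (".*?" + Pattern.quote(delimiter)).repeat(n);
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int lastEndOfMatch = 0;
        while (matcher.find()) {
            pieces.add(matcher.group());
            lastEndOfMatch = matcher.end();
        }
        // Keep whatever follows the last complete group, if anything.
        if (lastEndOfMatch < input.length()) {
            pieces.add(input.substring(lastEndOfMatch));
        }
        return pieces;
    }

    public static void main(String[] args) {
        splitAtNthOccurrence("1,,,,,2,3,,1,,3,,", 4, ",").forEach(System.out::println);
    }
}
```

Because "." does not match line terminators by default, input containing newlines would need the Pattern.DOTALL flag.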
Try this as well if you want the result in an array:
String str = "1,,,,,2,3,,1,,3,,";
System.out.println(str);
char c[] = str.toCharArray();
int ptnCnt = 0;
for (char d : c) {
if(d==',')
ptnCnt++;
}
String result[] = new String[ptnCnt/4];
int i=-1;
int beginIndex = 0;
int cnt=0,loopcount=0;
for (char ele : c) {
loopcount++;
if(ele==',')
cnt++;
if(cnt==4){
cnt=0;
result[++i]=str.substring(beginIndex,loopcount);
beginIndex=loopcount;
}
}
for (String string : result) {
System.out.println(string);
}
This works perfectly and was tested in Java 8 (it splits a path at the nth "/" into a left and a right part):
public String[] split(String input,int at){
String[] out = new String[2];
String p = String.format("((?:[^/]*/){%s}[^/]*)/(.*)",at);
Pattern pat = Pattern.compile(p);
Matcher matcher = pat.matcher(input);
if (matcher.matches()) {
out[0] = matcher.group(1);// left
out[1] = matcher.group(2);// right
}
return out;
}
//Ex: D:/folder1/folder2/folder3/file1.txt
//if at = 2, group(1) = D:/folder1/folder2 and group(2) = folder3/file1.txt
The accepted solution above by Saqib Rezwan does not add the leftover string to the list. If the string is split after every 4th comma and characters remain after the last complete group, those characters are dropped and the returned list is incomplete.
A complete solution would be:
private static ArrayList<String> splitStringAtNthOccurrence(String str, int n) {
int k = 0;
int startPoint = 0;
ArrayList<String> list = new ArrayList<>();
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == ',') {
k++;
if (k == n) {
String ab = str.substring(startPoint, i + 1);
list.add(ab);
startPoint = i + 1;
k = 0;
}
}
}
// add whatever remains after the last complete group, if anything
// (this also covers a leftover that itself ends in a comma)
if (startPoint < str.length()) {
list.add(str.substring(startPoint));
}
return list;
}

Detecting multiple instances of word with .contains

I am trying to count the number of instances a certain string occurs within another string. The input string that I am searching is not formatted in any fashion.
I am currently doing the below, but it obviously counts each keyword only once because .contains just returns a boolean. What is the most efficient way to count multiple instances?
public String computeBestFitKey() {
if(this.inputText == null)
return null;
String answer = null;
int bestCount = 0, tempCount = 0;
for(String s: logicTemplate.getLogicMap().keySet()) {
String[] keyWords = this.logicTemplate.getKeyWords(s);
for(String word: keyWords) {
if(this.inputText.toLowerCase().contains(word.toLowerCase())) {
System.out.println("word found: "+word.toLowerCase());
tempCount++;
}
}
if(tempCount > bestCount) {
bestCount = tempCount;
answer = s;
}
tempCount = 0;
}
return answer;
}
If you just need to count occurrences of a word, and it's not a homework assignment where you are restricted from using some of the standard facilities, then you can just do
int numOccurrences = 0;
Matcher m = Pattern.compile(word, Pattern.LITERAL).matcher(input);
while (m.find()) numOccurrences++;
Pattern.LITERAL makes the pattern treat every character literally, ignoring any special meaning it has in a regex.
You should be using indexOf(String str, int fromIndex).
Replace this line: if(this.inputText.toLowerCase().contains(word.toLowerCase())) {
With these:
int lastIndex = -1;
String lowerTextInput = this.inputText.toLowerCase();
String lowerWord = word.toLowerCase();
while ((lastIndex = lowerTextInput.indexOf(lowerWord, lastIndex + 1)) != -1) {
This assigns lastIndex the index of the next occurrence of the substring. If the string contains no further occurrence, indexOf yields -1 and the while loop ends. If one exists, the search resumes from lastIndex + 1.
If you would like to improve on this, especially when searching large strings, I recommend advancing lastIndex by the length of the matched substring instead of by 1.
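A minimal sketch of that variant, assuming non-overlapping, case-insensitive counting is what is wanted. The names WordCounter and countOccurrences are illustrative and not part of the question's code:

```java
import java.util.Locale;

class WordCounter {
    // Counts non-overlapping, case-insensitive occurrences of word in text.
    static int countOccurrences(String text, String word) {
        if (word.isEmpty()) {
            return 0; // avoids an endless loop on an empty search string
        }
        String lowerText = text.toLowerCase(Locale.ROOT);
        String lowerWord = word.toLowerCase(Locale.ROOT);
        int count = 0;
        int index = lowerText.indexOf(lowerWord);
        while (index != -1) {
            count++;
            // Skip past the whole match instead of advancing by one character.
            index = lowerText.indexOf(lowerWord, index + lowerWord.length());
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("Hello hello HELLO", "hello"));
    }
}
```

Skipping by the word length means overlapping occurrences (such as "aa" in "aaa") are counted once; advancing by 1 instead would count them all.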
static int countOccurences(String haystack, String needle)
{
int index, lastIndex = -1, count = 0;
while ((index = haystack.indexOf(needle, lastIndex + 1)) != -1)
{
lastIndex = index;
++count;
}
return count;
}

Java read integer from middle of the string

Is it possible in Java to efficiently read an integer from random position of the string? For instance, I have a
String s = "(34";
if (s.charAt(0) == '(')
{
// How to read a number from position = 1 to the end of the string?
// Of course, I can do something like
String s1 = s.substring(1);
int val = Integer.parseInt(s1);
}
but it dynamically creates a new instance of string and seems to be too slow and performance hitting.
UPDATE
Well, to be precise: I have an array of strings in the form "(ddd" where d is a digit. So I do know that a number always starts at pos = 1. How do I efficiently read these numbers?
Integer.parseInt(s1.replaceAll("[\\D]", ""))
Answered before the update:
I'm not an expert in regex, but hope this "\\d+" is useful to you. Invoke the below method with pattern: "\\d+".
public static int returnInt(String pattern,String inputString){
Pattern intPattern = Pattern.compile(pattern);
Matcher matcher = intPattern.matcher(inputString);
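if (!matcher.find()) { // group() would throw IllegalStateException without a successful match
throw new NumberFormatException("No match for " + pattern + " in: " + inputString);
}
String input = matcher.group();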
return Integer.parseInt(input);
}
Answered after the update:
String is immutable, so you cannot edit it in place. If you want to get a numeric value out of it, you have two options:
1. Use your code, which works fine. If you care about performance, try the second option.
2. Walk the string digit by digit and accumulate the result:
public static void main(String[] args) {
String input = "(123456";
if(input.charAt(0) == '(') {
System.out.println(getDigit(input));
}
}
private static int getDigit(String s) {
int result = 0;
int increase = 10;
for(int i = 1; i < s.length(); i++) {
int digit = Character.getNumericValue(s.charAt(i));
result*=increase;
result += digit;
}
return result;
}
Output:
123456
If you don't want to allocate a new String then you can use the code in this other SO answer:
int charArrayToInt(char[] data, int start, int end) throws NumberFormatException {
int result = 0;
for (int i = start; i < end; i++) {
int digit = data[i] - '0'; // masking with 0xF would let non-digit characters such as 'a' pass the range check
if ((digit < 0) || (digit > 9)) throw new NumberFormatException();
result *= 10;
result += digit;
}
return result;
}
You can call it with charArrayToInt(s.toCharArray(), 1, s.length())
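On Java 9 or later, another option that avoids both substring and toCharArray is the Integer.parseInt(CharSequence, int, int, int) overload, which parses a range of the string in place. A minimal sketch, where the class and method names are illustrative and the input is assumed to have the "(ddd" form from the question:

```java
class ParenNumber {
    // Parses the digits after a leading '(' without creating an intermediate String.
    static int parse(String s) {
        if (s.isEmpty() || s.charAt(0) != '(') {
            throw new NumberFormatException("Expected input of the form \"(ddd\": " + s);
        }
        // Java 9+: parse the range [1, s.length()) of s as a base-10 integer.
        return Integer.parseInt(s, 1, s.length(), 10);
    }

    public static void main(String[] args) {
        System.out.println(parse("(34"));
    }
}
```

Whether this is measurably faster than substring followed by parseInt depends on the workload, so it is worth benchmarking before committing to it.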

Find the Number of Occurrences of a Substring in a String

Why is the following algorithm not halting for me?
In the code below, str is the string I am searching in, and findStr is the string occurrences of which I'm trying to find.
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
while (lastIndex != -1) {
lastIndex = str.indexOf(findStr,lastIndex);
if( lastIndex != -1)
count++;
lastIndex += findStr.length();
}
System.out.println(count);
How about using StringUtils.countMatches from Apache Commons Lang?
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
System.out.println(StringUtils.countMatches(str, findStr));
That outputs:
3
Your lastIndex += findStr.length(); was placed outside the braces, causing an infinite loop: when no occurrence was found, lastIndex became -1 + findStr.length(), so it was never -1 when the loop condition was checked.
Here is the fixed version :
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
while (lastIndex != -1) {
lastIndex = str.indexOf(findStr, lastIndex);
if (lastIndex != -1) {
count++;
lastIndex += findStr.length();
}
}
System.out.println(count);
A shorter version. ;)
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
System.out.println(str.split(findStr, -1).length-1);
The last line was creating a problem. lastIndex would never be at -1, so there would be an infinite loop. This can be fixed by moving the last line of code into the if block.
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
while(lastIndex != -1){
lastIndex = str.indexOf(findStr,lastIndex);
if(lastIndex != -1){
count ++;
lastIndex += findStr.length();
}
}
System.out.println(count);
Do you really have to handle the matching yourself? Especially if all you need is the number of occurrences, regular expressions are tidier:
String str = "helloslkhellodjladfjhello";
Pattern p = Pattern.compile("hello");
Matcher m = p.matcher(str);
int count = 0;
while (m.find()){
count +=1;
}
System.out.println(count);
I'm very surprised no one has mentioned this one liner. It's simple, concise and performs slightly better than str.split(target, -1).length-1
public static int count(String str, String target) {
return (str.length() - str.replace(target, "").length()) / target.length();
}
Here it is, wrapped up in a nice and reusable method:
public static int count(String text, String find) {
int index = 0, count = 0, length = find.length();
while( (index = text.indexOf(find, index)) != -1 ) {
index += length; count++;
}
return count;
}
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
count++;
lastIndex += findStr.length(); // advancing by length - 1 would loop forever for a one-character findStr
}
System.out.println(count);
At the end of the loop, count is 3. Hope it helps.
public int countOfOccurrences(String str, String subStr) {
return (str.length() - str.replaceAll(Pattern.quote(subStr), "").length()) / subStr.length();
}
A lot of the given answers fail on one or more of:
Patterns of arbitrary length
Overlapping matches (such as counting "232" in "23232" or "aa" in "aaa")
Regular expression meta-characters
Here's what I wrote:
static int countMatches(Pattern pattern, String string)
{
Matcher matcher = pattern.matcher(string);
int count = 0;
int pos = 0;
while (matcher.find(pos))
{
count++;
pos = matcher.start() + 1;
}
return count;
}
Example call:
Pattern pattern = Pattern.compile("232");
int count = countMatches(pattern, "23232"); // Returns 2
If you want a non-regular-expression search, just compile your pattern appropriately with the LITERAL flag:
Pattern pattern = Pattern.compile("1+1", Pattern.LITERAL);
int count = countMatches(pattern, "1+1+1"); // Returns 2
You can count the number of occurrences using a library function from Spring:
import org.springframework.util.StringUtils;
StringUtils.countOccurrencesOf(result, "R-")
Increment lastIndex whenever you look for next occurrence.
Otherwise it's always finding the first substring (at position 0).
The answer given as correct is no good for counting things like line returns and is far too verbose. Later answers are better but all can be achieved simply with
str.split(findStr).length
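For the example in the question this returns 3, but only because the leading empty string offsets the trailing empty string that split drops; for a general count, str.split(findStr, -1).length - 1 is safer.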
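To illustrate how split treats trailing empty strings, here is a small sketch (the class name SplitCount and both helper names are invented). It compares the plain split length with the split(regex, -1).length - 1 form used in other answers, quoting the search string so it is taken literally:

```java
import java.util.regex.Pattern;

class SplitCount {
    // Plain split: trailing empty strings are removed from the result.
    static int plainSplitLength(String str, String findStr) {
        return str.split(Pattern.quote(findStr)).length;
    }

    // A negative limit keeps trailing empty strings, so pieces = matches + 1.
    static int countWithLimit(String str, String findStr) {
        return str.split(Pattern.quote(findStr), -1).length - 1;
    }

    public static void main(String[] args) {
        String findStr = "hello";
        for (String s : new String[] {"helloslkhellodjladfjhello", "hellohello", "xhellox"}) {
            System.out.println(s + ": split length = " + plainSplitLength(s, findStr)
                    + ", count = " + countWithLimit(s, findStr));
        }
    }
}
```

The two agree on the question's string (both give 3), but for "hellohello" the plain length is 0 while the count is 2, and for "xhellox" the plain length is 2 while the count is 1.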
public int indexOf(int ch, int fromIndex)
Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.
So your lastindex value is always 0 and it always finds hello in the string.
Try adding lastIndex += findStr.length() to the end of your loop; otherwise you will end up in an endless loop, because once you have found the substring you keep searching for it again from the same position.
Try this one. It replaces all the matches with a -.
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int numberOfMatches = 0;
while (str.contains(findStr)){
str = str.replaceFirst(findStr, "-");
numberOfMatches++;
}
And if you don't want to destroy your str you can create a new string with the same content:
String str = "helloslkhellodjladfjhello";
String strDestroy = str;
String findStr = "hello";
int numberOfMatches = 0;
while (strDestroy.contains(findStr)){
strDestroy = strDestroy.replaceFirst(findStr, "-");
numberOfMatches++;
}
After executing this block these will be your values:
str = "helloslkhellodjladfjhello"
strDestroy = "-slk-djladfj-"
findStr = "hello"
numberOfMatches = 3
As @Mr_and_Mrs_D suggested:
String haystack = "hellolovelyworld";
String needle = "lo";
return haystack.split(Pattern.quote(needle), -1).length - 1;
Based on the existing answer(s) I'd like to add a "shorter" version without the if:
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int count = 0, lastIndex = 0;
while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
lastIndex += findStr.length(); // advancing by length - 1 would loop forever for a one-character findStr
count++;
}
System.out.println(count); // output: 3
Here is the advanced version for counting how many times the token occurred in a user entered string:
public class StringIndexOf {
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
System.out.println("Enter a sentence please: \n");
String string = scanner.nextLine();
int atIndex = 0;
int count = 0;
while (atIndex != -1)
{
atIndex = string.indexOf("hello", atIndex);
if(atIndex != -1)
{
count++;
atIndex += 5;
}
}
System.out.println(count);
}
}
The method below shows how many times a substring occurs in the whole string, including overlapping occurrences. I hope it is useful to you:
String searchPattern="aaa"; // search string
String str="aaaaaababaaaaaa"; // whole string
int searchLength = searchPattern.length();
int totalLength = str.length();
int k = 0;
for (int i = 0; i < totalLength - searchLength + 1; i++) {
String subStr = str.substring(i, searchLength + i);
if (subStr.equals(searchPattern)) {
k++;
}
}
This solution prints the total number of occurrences of a given substring in the string, including cases where matches overlap.
class SubstringMatch{
public static void main(String []args){
//String str = "aaaaabaabdcaa";
//String sub = "aa";
//String str = "caaab";
//String sub = "aa";
String str="abababababaabb";
String sub = "bab";
int n = str.length();
int m = sub.length();
// index=-1 in case of no match, otherwise >=0(first match position)
int index=str.indexOf(sub), i=index+1, count=(index>=0)?1:0;
System.out.println(i+" "+index+" "+count);
// i only needs to traverse up to position (n-m)
while(index!=-1 && i<=(n-m)){
index=str.substring(i, n).indexOf(sub);
count=(index>=0)?count+1:count;
i=i+index+1;
System.out.println(i+" "+index);
}
System.out.println("count: "+count);
}
}
Matcher.results()
You can find the number of occurrences of a substring in a string using Java 9 method Matcher.results() with a single line of code.
It produces a Stream of MatchResult objects which correspond to captured substrings, and the only thing needed is to apply Stream.count() to obtain the number of elements in the stream.
public static long countOccurrences(String source, String find) {
return Pattern.compile(find) // Pattern
.matcher(source) // Matcher
.results() // Stream<MatchResult>
.count();
}
main()
public static void main(String[] args) {
System.out.println(countOccurrences("helloslkhellodjladfjhello", "hello"));
}
Output:
3
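Because Pattern.compile(find) treats the search string as a regular expression, a string containing meta-characters such as "+" or "." would not be counted literally. A hedged variation (the class and method names ResultsCount and countLiteral are made up for this sketch) quotes the search string first:

```java
import java.util.regex.Pattern;

class ResultsCount {
    // Counts non-overlapping literal occurrences of find in source (requires Java 9+).
    static long countLiteral(String source, String find) {
        if (find.isEmpty()) {
            return 0; // assumption: an empty search string counts as zero occurrences
        }
        return Pattern.compile(Pattern.quote(find))
                .matcher(source)
                .results()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countLiteral("helloslkhellodjladfjhello", "hello"));
        System.out.println(countLiteral("a.b.c", "."));
    }
}
```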
Here is another solution that uses neither regex (Pattern/Matcher) nor StringUtils:
String str = "helloslkhellodjladfjhelloarunkumarhelloasdhelloaruhelloasrhello";
String findStr = "hello";
int count =0;
int findStrLength = findStr.length();
for(int i=0;i<str.length();i++){
if(findStr.startsWith(Character.toString(str.charAt(i)))){
if(str.substring(i).length() >= findStrLength){
if(str.substring(i, i+findStrLength).equals(findStr)){
count++;
}
}
}
}
System.out.println(count);
If you need the index of each substring within the original string, you can do something with indexOf like this:
private static List<Integer> getAllIndexesOfSubstringInString(String fullString, String substring) {
int pointIndex = 0;
List<Integer> allOccurences = new ArrayList<Integer>();
while(fullString.indexOf(substring,pointIndex) >= 0){
allOccurences.add(fullString.indexOf(substring, pointIndex));
pointIndex = fullString.indexOf(substring, pointIndex) + substring.length();
}
return allOccurences;
}
public static int getCountSubString(String str , String sub){
int n = 0, m = 0, counter = 0, counterSub = 0;
while(n < str.length()){
counter = 0;
m = 0;
while(n < str.length() && m < sub.length() && str.charAt(n) == sub.charAt(m)){ // n check avoids reading past the end on a partial match
counter++;
m++; n++;
}
if (counter == sub.length()){
counterSub++;
continue;
}
else if(counter > 0){
continue;
}
n++;
}
return counterSub;
}
🍑 Just a little more peachy answer
public int countOccurrences(String str, String sub) {
if (str == null || str.length() == 0 || sub == null || sub.length() == 0) return 0;
int count = 0;
int i = 0;
while ((i = str.indexOf(sub, i)) != -1) {
count++;
i += sub.length();
}
return count;
}
I was asked this question in an interview just now and I went completely blank. (As always, I told myself that the moment the interview ended I'd get the solution.) Which I did, 5 minutes after the call ended :(
int subCounter=0;
int count =0;
for(int i=0; i<str.length(); i++) {
if((subCounter==0 && "a".equals(str.substring(i,i+1)))
|| (subCounter==1 && "b".equals(str.substring(i,i+1)))
|| (subCounter==2 && "c".equals(str.substring(i,i+1)))) {
++subCounter;
}
if(subCounter==3) {
count = count+1;
subCounter=0;
}
}
System.out.println(count);
