How to split a String of numbers and chars, only by chars - java

I need to split a String into number sequences and the chars between them. Something like this:
input: "123+34/123(23*12)/100"
output[]:["123","+","34","/","123","(","23","*","12",")","/","100"]
Is this somehow possible, or is it possible to split a String by multiple chars? Otherwise, is it possible to loop through a String in Java?

You can use a regular expression.
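import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "123+34/123(23*12)/100";
// Match either a run of digits or a single operator/parenthesis.
Pattern pattern = Pattern.compile("\\d+|[\\+\\-\\/\\*\\(\\)]");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println(matcher.group()); // prints one token per line
}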
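Since the question asks for an array rather than printed lines, here is a minimal sketch that collects the matches instead. The class name ExpressionTokenizer and the method tokenize are made up for illustration, and the character class is written as [-+/*()], which should be equivalent to the escaped version above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class ExpressionTokenizer {
    // A run of digits, or a single operator/parenthesis character.
    // The hyphen comes first in the class so it is a literal '-'.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[-+/*()]");

    // Returns the tokens in order of appearance; characters that match
    // neither alternative (e.g. spaces or letters) are silently skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] output = tokenize("123+34/123(23*12)/100").toArray(new String[0]);
        System.out.println(String.join(" ", output));
    }
}
```

Note that unknown characters are dropped rather than reported, so if the input may contain anything unexpected you would probably want to validate it separately.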

Use a regex based on lookaround assertions and word boundaries for splitting the input string.
String input = "123+34/123(23*12)/100";
System.out.println(Arrays.toString(input.split("(?<=[/)+*])\\B(?=[/)+*])|\\b")));
Output:
[123, +, 34, /, 123, (, 23, *, 12, ), /, 100]
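If every non-digit character should become its own token, a shorter lookaround pattern may be enough. This is a sketch with a different regex than the one above, splitting at every position that is directly after or directly before a non-digit; the class and method names are only for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class LookaroundSplit {
    // Split at each zero-width position that has a non-digit on either side.
    // Nothing is consumed, so the operators stay in the result.
    static String[] splitAroundNonDigits(String input) {
        return input.split("(?<=\\D)|(?=\\D)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAroundNonDigits("123+34/123(23*12)/100")));
    }
}
```

Unlike the word-boundary version, this does not depend on which operator characters are listed, but it also treats letters and spaces as single-character tokens.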

Related

extract specific word from comma separated string in java using regex

input - [1, 1111, 2020, BMW, Frontier, EXTENDED CAB PICKUP 2-DR, Silver, 16558]
I want to extract BMW here, and I am using this regex: (^(?:[^\\,]*\\,){3})
This results in: BMW, Frontier, EXTENDED CAB PICKUP 2-DR, Silver, 16558].
Could anyone help me with this? Thanks in advance.
Since you can only enter a pattern and cannot make use of groups, you could use finite repetition, for example {0,1000}, inside the positive lookbehind, because Java requires a lookbehind to have a bounded length.
(?<=^\\[[^,]{0,1000},[^,]{0,1000},[^,]{0,1000},\\h{0,10})\\w{3,10}(?=[^\\]\\[]*\\])
Explanation
(?<= Positive lookbehind, assert what is on the left is
^\[ Start of string, match [
[^,]{0,1000},[^,]{0,1000},[^,]{0,1000}, Match 3 times any char except , followed by the ,
\h{0,10} Match 0-10 times a horizontal whitespace char
) Close lookbehind
\w{3,10} Match 3-10 word chars
(?= Positive lookahead, assert what is on the right is
[^\]\[]*\] Match until the ]
) Close lookahead
Code example
final String regex = "(?<=^\\[[^,]{0,1000},[^,]{0,1000},[^,]{0,1000},\\h{0,10})\\w{3,10}(?=[^\\]\\[]*\\])";
final String string = "[1, 1111, 2020, BMW, Frontier, EXTENDED CAB PICKUP 2-DR, Silver, 16558]";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}
Output
BMW
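For comparison, if capture groups can be used after all, the lookbehind is not needed. The following is only a sketch under that assumption: it skips the first three comma-terminated fields and captures the fourth. The names FourthFieldExtractor and fourthField are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes capture groups are allowed.
class FourthFieldExtractor {
    // "[" at the start, three fields each ending in a comma,
    // optional whitespace, then capture everything up to the next comma or "]".
    private static final Pattern FOURTH_FIELD =
            Pattern.compile("^\\[(?:[^,]*,){3}\\s*([^,\\]]+)");

    // Returns the fourth field, or null if the input has fewer than four fields.
    static String fourthField(String input) {
        Matcher matcher = FOURTH_FIELD.matcher(input);
        return matcher.find() ? matcher.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String input = "[1, 1111, 2020, BMW, Frontier, EXTENDED CAB PICKUP 2-DR, Silver, 16558]";
        System.out.println(fourthField(input));
    }
}
```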
If it is a comma-separated string, you could just use the split function of the String class, which converts the comma-separated string to an array.
The string split() method breaks a given string around matches of the
given regular expression.
Syntax - public String[] split(String regex, int limit)
Input String: 016-78967
Regular Expression: -
Output : {"016", "78967"}
Then you could search the array to find the particular keyword in it.
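Applied to the input from the question, the split approach could look roughly like this. It is a sketch that assumes the surrounding brackets should be dropped and that fields never contain a comma themselves; CommaSplitLookup and fieldAt are illustrative names.

```java
import java.util.Arrays;

// Hypothetical helper; assumes no field contains a comma.
class CommaSplitLookup {
    // Strips one leading "[" and one trailing "]" (if present),
    // then splits on commas surrounded by optional whitespace.
    static String[] fields(String input) {
        String body = input.replaceAll("^\\[|\\]$", "");
        return body.split("\\s*,\\s*");
    }

    static String fieldAt(String input, int index) {
        return fields(input)[index];
    }

    public static void main(String[] args) {
        String input = "[1, 1111, 2020, BMW, Frontier, EXTENDED CAB PICKUP 2-DR, Silver, 16558]";
        System.out.println(fieldAt(input, 3));
        // Searching for a known keyword instead of using a fixed position:
        System.out.println(Arrays.asList(fields(input)).indexOf("BMW"));
    }
}
```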

separate mathematical expression in java

Would anyone be able to help me with separating this mathematical expression 125*1*4*4+82*1*10+2*59+2+4 in Java? I want to get the numbers from the expression, and I'm not sure how to use the split() method here.
You can use split with this regex [*+] :
String[] numbers = "125*1*4*4+82*1*10+2*59+2+4".split("[*+]");
Outputs
[125, 1, 4, 4, 82, 1, 10, 2, 59, 2, 4]
In case the mathematical expression contains spaces, you can remove them first and then split:
String[] numbers = "125*1*4*4+82*1*10+2*59+2+4".replaceAll("\\s+", "").split("[*+]");
//---------------------------------------------^----------------------^
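Note that you can add other arithmetic operators, for example [-*+/]. Put the hyphen first or last in the character class; written as [*+-/], the +-/ part is read as a range that also matches , and .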
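The hyphen placement is easy to get wrong, so here is a small sketch that compares the two spellings of the character class. The class name OperatorClassDemo and the sample expressions are made up for the demonstration.

```java
import java.util.Arrays;

// Hypothetical demo of hyphen placement inside a character class.
class OperatorClassDemo {
    // Hyphen first, so it is a literal '-' and not a range operator.
    static String[] splitOnOperators(String expression) {
        return expression.replaceAll("\\s+", "").split("[-*+/]");
    }

    public static void main(String[] args) {
        // Decimal point is preserved: the class only contains - * + /
        System.out.println(Arrays.toString(splitOnOperators("1.5 + 2 - 3 * 4 / 5")));
        // Here "+-/" is a range from '+' to '/', which includes ',' '-' and '.',
        // so the decimal number is split in two.
        System.out.println(Arrays.toString("1.5+2".split("[*+-/]")));
    }
}
```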
Another solution from eparvan:
In case you are not sure what the expression can contain you can use :
String[] numbers = "125*1*4*4+82*1*10+2*59+2+4".replaceAll("\\s+", "").split("[^0-9]");
//----------------------------------------------------------------------------^----^
Edit
What if I want to get the "+" and "*" ?
Input: 125*1*4*4+82*1*10+2*59+2+4 Output: ***+**+*++
In this case you can split with \d+ like this (the resulting array starts with an empty string, because the input begins with a number):
String[] numbers = "125*1*4*4+82*1*10+2*59+2+4".replaceAll("\\s+", "").split("\\d+");
But I would prefer to use Pattern, which is more practical than split here. For example:
String str = "125*1/4*4+82*1*10+2/59-2+4";
String regex = "[^\\d]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group());
}
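To get the operators as one string, as in the requested output, the matches can be appended to a StringBuilder. This is a sketch with a slightly different pattern, [^\d\s], so that spaces are ignored too; OperatorExtractor and operatorsOf are illustrative names.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class OperatorExtractor {
    // Any single character that is neither a digit nor whitespace.
    private static final Pattern OPERATOR = Pattern.compile("[^\\d\\s]");

    // Concatenates all operator characters in order of appearance.
    static String operatorsOf(String expression) {
        StringBuilder operators = new StringBuilder();
        Matcher matcher = OPERATOR.matcher(expression);
        while (matcher.find()) {
            operators.append(matcher.group());
        }
        return operators.toString();
    }

    public static void main(String[] args) {
        System.out.println(operatorsOf("125*1*4*4+82*1*10+2*59+2+4"));
        // For comparison: split("\\d+") keeps a leading empty string
        // when the input starts with a number.
        System.out.println(Arrays.toString("1+2".split("\\d+")));
    }
}
```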

Splitting and Parsing formula String

I have below formula
(Trig01:BAO)/(((Trig01:COUNT*86400)-Trig01:UPI-Trig01:SOS)*2000)
I want to split it and get the string values which are before the colon only.
Final output need as -
{ "BAO","COUNT","UPI","SOS" }
Thanks in advance,
You can try a positive lookbehind, as in the regex pattern below, to get the run of word characters after each colon:
(?<=:)[^\W]+
Pattern explanation:
(?<= look behind to see if there is:
: ':'
) end of look-behind
[^\W]+ any character except: non-word characters
(all but a-z, A-Z, 0-9, _) (1 or more times)
Sample code:
String str = "(Trig01:BAO)/(((Trig01:COUNT*86400)-Trig01:UPI-Trig01:SOS)*2000)";
Pattern p = Pattern.compile("(?<=:)[^\\W]+");
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.println(m.group());
}
Use Regex, try this:
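import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Collects capture group 1 of every match of pattern in sourceString.
public static List<String> extractSubstringsFromAllMatches(String sourceString, String pattern) {
    Pattern regexPattern = Pattern.compile(pattern);
    Matcher matcher = regexPattern.matcher(sourceString);
    List<String> matches = new ArrayList<String>();
    while (matcher.find()) {
        matches.add(matcher.group(1));
    }
    return matches;
}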
Get the results you require by calling:
extractSubstringsFromAllMatches(YourString,":(\\w*)\\W")
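One thing to watch with the pattern ":(\\w*)\\W" is that the trailing \W needs a character after the value, so a value at the very end of the string would be missed. A sketch of a variant without that requirement follows; ColonValueExtractor and valuesAfterColon are hypothetical names, and the pattern ":(\\w+)" is a substitution, not the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the helper above.
class ColonValueExtractor {
    // A colon followed by one or more word characters; group 1 is the value.
    private static final Pattern AFTER_COLON = Pattern.compile(":(\\w+)");

    static List<String> valuesAfterColon(String formula) {
        List<String> values = new ArrayList<>();
        Matcher matcher = AFTER_COLON.matcher(formula);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String formula = "(Trig01:BAO)/(((Trig01:COUNT*86400)-Trig01:UPI-Trig01:SOS)*2000)";
        System.out.println(valuesAfterColon(formula));
    }
}
```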
Try this one-line solution:
String[] arr = str.replaceAll("^.*?(?=\\w+:)|:[^:]*$", "").split(":.*?(?=\\w+(:|$))");
This works by first stripping off the leading and trailing non-target chars, then splitting on the intervening chars. Matching is done using lookaheads, which assert, but don't consume, that a word followed by a colon (or by the end of the input) comes next.
Here's some test code:
String str = "(Trig01:BAO)/(((Trig02:COUNT*86400)-Trig03:UPI-Trig04:SOS)*2000)";
String[] arr = str.replaceAll("^.*?(?=\\w+:)|:[^:]*$", "").split(":.*?(?=\\w+(:|$))");
System.out.println(Arrays.toString(arr));
Output:
[Trig01, Trig02, Trig03, Trig04]

Java: extract the single matching groups from a string with regular expression [duplicate]

This question already has answers here:
How to split a string between letters and digits (or between digits and letters)?
(8 answers)
Closed 8 years ago.
I have this kind of string: 16B66C116B or 222A3*C10B
It's a number (with an unknown number of digits) followed either by a letter ("A") or by a star and a letter ("*A"). This pattern is repeated 3 times.
I want to split this string to have: [number,text,number,text,number,text]
[16, B, 66, C, 116, B]
or
[16, B, 66, *C, 116, B]
I wrote this:
String tmp = "16B66C116B";
String tmp2 = "16B66*C116B";
String pattern = "(\\d+)(\\D{1,2})(\\d+)(\\D{1,2})(\\d+)(\\D{1,2})";
boolean q = tmp.matches(pattern);
String a[] = tmp.split(pattern);
The pattern matches correctly, but the splitting doesn't work.
(I'm open to improving my pattern string; I think it could be written better.)
You are misunderstanding what split does. split divides the string at each occurrence of the given regular expression; since your expression matches the whole string, it returns an empty array.
What you want is to extract the individual matching groups (the parts in parentheses) from the match. To achieve this you have to use the Pattern and Matcher classes.
Here is a code snippet which prints out all the groups:
Pattern regex = Pattern.compile("(\\d+)(\\D{1,2})(\\d+)(\\D{1,2})(\\d+)(\\D{1,2})");
Matcher matcher = regex.matcher("16B66C116B");
while (matcher.find()) {
    for (int i = 1; i <= matcher.groupCount(); ++i) {
        System.out.println(matcher.group(i));
    }
}
Of course you can improve the regular expression (like another user suggested)
(\\d+)([A-Z]+)(\\d+)(\\*?[A-Z]+)(\\d+)([A-Z]+)
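As a rough sketch of how a tightened pattern could be used, the following collects the six groups into a list and also shows the empty array that split returns. It assumes each text part is a single uppercase letter with an optional leading star, which is an interpretation of the question; TripleGroupParser and parse are illustrative names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the regex is an assumption about the input format.
class TripleGroupParser {
    // number, then optional star plus one uppercase letter, repeated three times
    static final String REGEX = "(\\d+)(\\*?[A-Z])(\\d+)(\\*?[A-Z])(\\d+)(\\*?[A-Z])";
    private static final Pattern PATTERN = Pattern.compile(REGEX);

    // Returns the six groups, or an empty list if the whole input does not match.
    static List<String> parse(String input) {
        Matcher matcher = PATTERN.matcher(input);
        if (!matcher.matches()) {
            return Collections.emptyList();
        }
        List<String> parts = new ArrayList<>();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            parts.add(matcher.group(i));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("16B66C116B"));
        System.out.println(parse("16B66*C116B"));
        // split removes what the regex matches, so nothing is left over:
        System.out.println("16B66C116B".split(REGEX).length);
    }
}
```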
Try with this pattern (\\d)+|(\\D)+ and use Matcher#find() to find the next subsequence of the input sequence that matches the pattern.
Add them all to a List and, if needed, convert it to an array at the end.
String tmp = "16B66C116B";
String tmp2 = "16B66*C116B";
String pattern = "((\\d)+|(\\D)+)";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(tmp);
while (m.find()) {
    System.out.println(m.group());
}

Regex to match repeated punctuation in Java

I have some punctuation [] punctuation = {'.', ',' , '!', '?'};. I want to create a regex that can match words made up of those punctuation characters.
For example, some strings I want to find: "....???", "!!!!!......", "??.....!", and so on.
Thanks for any advice.
Use String.matches() with the POSIX character class for "punctuation":
str.matches("\\p{Punct}+");
FYI according to the Pattern javadoc, \p{Punct} is one of
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Also, the ^ and $ aren't needed in the expression, because matches() must match the whole input to return true, so start and end are implied.
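Since \p{Punct} accepts every ASCII punctuation character, not only the four from the question, an explicit character class may be closer to what was asked. A minimal sketch, with PunctuationRun and isPunctuationRun as made-up names:

```java
// Hypothetical helper restricted to the four characters from the question.
class PunctuationRun {
    // True only if the whole string consists of one or more of . , ! ?
    static boolean isPunctuationRun(String s) {
        return s.matches("[.,!?]+");
    }

    public static void main(String[] args) {
        System.out.println(isPunctuationRun("....???"));     // true
        System.out.println(isPunctuationRun("??.....!"));    // true
        System.out.println(isPunctuationRun("#$%"));         // false
        System.out.println("#$%".matches("\\p{Punct}+"));    // true
    }
}
```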
Try this, it should match and group all the symbols written between []:
([.,!?]+)
Tested it with
??..,..!fsdgsdfgsdfgsdfg
And output was
??..,..!
Also tested with this:
String s = "??.....!fsdgsdfgsdfgsdfg?.,!0000a";
Pattern p = Pattern.compile("([.,!?]+)");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}
And output was
??.....!
?.,!
You can try the POSIX punctuation class \p{Punct} (ASCII punctuation by default in Java) and a while loop to find the matches in your input, as such:
String test = "!...abcd??...!!efgh....!!??abc!";
Pattern pattern = Pattern.compile("\\p{Punct}{2,}");
Matcher matcher = pattern.matcher(test);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output:
!...
??...!!
....!!??
Note: this has the advantage of matching only punctuation sequences longer than 1 character (hence, the last "!" is not matched by design). To change the minimum length of the punctuation sequence, just adjust the {2,} part of the Pattern.
