Java Regex get all numbers - java

I need to retrieve all numbers from a String, for example:
"a: 1 | b=2 ; c=3.2 / d=4,2"
I want to get this result:
1
2
3.2
4,2
I don't know how to express that as a regex in Java.
Currently, I have this:
(?<=\D)(?=\d)|(?<=\d)(?=\D)
It splits letters from numbers (but decimal values are not respected), and the result is:
1
2
3
2 (problem)
4
2 (problem)
Can you help me?
Thanks :D

You might use a capturing group with a character class:
[a-z][:=]\h*(\d+(?:[.,]\d+)?)
Explanation
[a-z] Match a char a-z
[:=] Match either : or =
\h* Match 0+ horizontal whitespace chars
( Capture group 1
\d+(?:[.,]\d+)? Match 1+ digits with an optional decimal part with either . or ,
) Close group
For example
String regex = "[a-z][:=]\\h*(\\d+(?:[.,]\\d+)?)";
String string = "a: 1 | b=2 ; c=3.2 / d=4,2";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
1
2
3.2
4,2

You can do it as follows:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) throws InterruptedException {
// Test
String s = "a: 1 | b=2 ; c=3.2 / d=4,2";
showNumbers(s);
}
static void showNumbers(String s) {
Pattern regex = Pattern.compile("\\d[\\d,.]*");
Matcher matcher = regex.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
Output:
1
2
3.2
4,2
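One caveat worth illustrating: the pattern \d[\d,.]* also swallows punctuation that directly follows a number. The sketch below compares it with the stricter \d+(?:[.,]\d+)? suggested in other answers. The class name, the helper findAll, and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberPatterns {
    // Loose pattern from the answer above: a digit, then any run of digits, commas, or dots.
    static final Pattern LOOSE = Pattern.compile("\\d[\\d,.]*");
    // Stricter alternative: digits, optionally followed by one separator and more digits.
    static final Pattern STRICT = Pattern.compile("\\d+(?:[.,]\\d+)?");

    // Collects every match of the given pattern, in order of appearance.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String sample = "x=1, y=2."; // hypothetical input with trailing punctuation
        System.out.println(findAll(LOOSE, sample));  // [1,, 2.]
        System.out.println(findAll(STRICT, sample)); // [1, 2]
    }
}
```

For the input in the question both patterns give the same four results; they only differ when a comma or dot follows a number without another digit after it.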

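You can use
\d+(?:[.,]\d+)?
(written /\d+(?:[.,]\d+)?/g in slash-delimited notation; in Java, the global flag corresponds to calling Matcher.find() in a loop).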

You may just need a regex like (\d(?:[.,]\d)?) that
finds a digit
optionally followed by a dot or comma and another digit
The multi-digit version is (\d+(?:[.,]\d+)?)
String value = "a: 1 | b=2 ; c=3.2 / d=4,2";
Pattern p = Pattern.compile("(\\d(?:[.,]\\d)?)");
Matcher m = p.matcher(value);
while (m.find()) {
System.out.println(m.group());
}
// 1
// 2
// 3.2
// 4,2
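To see why the multi-digit version matters, here is a small sketch that runs both patterns on a number with more than one digit on each side of the separator. It assumes Java 9+ for Matcher.results(); the class name, the helper, and the sample input are illustrative only.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class DigitWidth {
    // Single-digit version: one digit, optionally a separator and one more digit.
    static final Pattern SINGLE = Pattern.compile("\\d(?:[.,]\\d)?");
    // Multi-digit version: one or more digits on each side of the separator.
    static final Pattern MULTI = Pattern.compile("\\d+(?:[.,]\\d+)?");

    // Streams all matches (Java 9+) and collects the matched text.
    static List<String> findAll(Pattern pattern, String input) {
        return pattern.matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String sample = "c=13.25"; // hypothetical input with a multi-digit number
        System.out.println(findAll(SINGLE, sample)); // [1, 3.2, 5]
        System.out.println(findAll(MULTI, sample));  // [13.25]
    }
}
```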

Related

Add all the numbers which have + symbol and replace the same with the added value

I would like to group all the numbers to add if they are supposed to be added.
Test String: 82+18-10.2+3+37=6 + 7
Here 82+18 can be added and replaced with the value 100.
Then the test string becomes: 100-10.2+3+37=6 + 7
Again, 2+3+37 can be added and replaced in the test string as
follows: 100-10.42=6 + 7
Now 6 + 7 cannot be added because there is a space after the value
6.
My idea was to extract the numbers which are supposed to be added like below:
82+18
2+3+37
And then add them up and replace each expression with its sum using the String replace() method
Tried Regex:
(?=([0-9]{1,}[\\+]{1}[0-9]{1,}))
Sample Input:
82+18-10.2+3+37=6 + 7
Java Code for identifying the groups to be added and replaced:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ReplaceAddition {
static String regex = "(?=([0-9]{1,}[\\+]{1}[0-9]{1,}))";
static String testStr = "82+18-10.2+3+37=6 + 7 ";
public static void main(String[] args) {
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(testStr);
while (matcher.find()) {
System.out.println(matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println(matcher.group(i));
}
}
}
}
Output:
82+18
2+18
2+3
3+37
I can't figure out what I'm missing. Any help would be appreciated.
I tried simplifying the regexp by removing the positive lookahead operator
(?=...)
And the enclosing parenthesis
(...)
After these changes, the regexp is as follows
static String regex = "[0-9]{1,}[\\+]{1}[0-9]{1,}";
When I run it, I'm getting the following result:
82+18
2+3
This is closer to the expected, but still not perfect, because we're getting "2+3" instead of 2+3+37. In order to handle any number of added numbers instead of just two, the expression can be further tuned up to:
static String regex = "[0-9]{1,}(?:[\\+]{1}[0-9]{1,})+";
What I added here is a non-capturing group
(?:...)
with a plus sign meaning one or more repetitions. Now the program produces the output
82+18
2+3+37
as expected.
Another solution is like so:
public static void main(String[] args)
{
final var p = Pattern.compile("(?:\\d+(?:\\+\\d+)+)");
var text = new StringBuilder("82+18-10.2+3+37=6 + 7 ");
var m = p.matcher(text);
while(m.find())
{
var sum = 0;
var split = m.group(0).split("\\+");
for(var str : split)
{
sum += Integer.parseInt(str);
}
text.replace(m.start(0),m.end(0),""+sum);
m.reset();
}
System.out.println(text);
}
The regex (?:\\d+(?:\\+\\d+)+) finds:
(?: Noncapturing
\\d+ One or more digits, followed by
(?: Noncapturing
\\+ A plus symbol, and
\\d+ One or more digits
)+ One or more times
) Once
So this regex matches a run of two or more integers separated by '+'.
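As a possible variation on the loop above, the same pattern can be combined with Matcher.replaceAll(Function), available since Java 9, so that each matched sum is replaced in a single pass without resetting the matcher. This is only a sketch; the class and method names are invented for the example, and it uses long arithmetic rather than int.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class SumReplacer {
    // Two or more integers joined by '+', with no spaces in between.
    private static final Pattern SUM = Pattern.compile("\\d+(?:\\+\\d+)+");

    // Replaces every matched addition with its total, scanning the input once.
    static String collapseSums(String input) {
        return SUM.matcher(input).replaceAll(SumReplacer::total);
    }

    // Splits one match such as "2+3+37" on '+' and adds up the terms.
    private static String total(MatchResult match) {
        long sum = 0;
        for (String term : match.group().split("\\+")) {
            sum += Long.parseLong(term);
        }
        return Long.toString(sum);
    }

    public static void main(String[] args) {
        System.out.println(collapseSums("82+18-10.2+3+37=6 + 7")); // 100-10.42=6 + 7
    }
}
```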

Java Regex: Grouping consecutive 1 or 0 in a binary string

I want to capture all the consecutive groups in a binary string
1000011100001100111100001
should give me
1
0000
111
0000
11
00
1111
0000
1
I wrote the regex ([1?|0?]+) in my Java application to group the consecutive 1s or 0s in a string like 10000111000011.
But when I run it, nothing is printed to the console:
String name ="10000111000011";
regex("(\\[1?|0?]+)" ,name);
public static void regex(String regex, String searchedString) {
Pattern pattern = Pattern.compile(regex);
Matcher regexMatcher = pattern.matcher(searchedString);
while (regexMatcher.find())
if (regexMatcher.group().length() > 0)
System.out.println(regexMatcher.group());
}
To avoid a regex syntax error at runtime, I changed ([1?|0?]+) to (\\[1?|0?]+)
Why does this regex find no groups?
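First, as an explanation: your original regex ([1?|0?]+) defines a character class ([ ... ]) that matches any of the characters 1, ?, | or 0 one or more times (+). I think you meant to use ( ... ) instead, which would make the | an alternation between an optional 1 and an optional 0. But that's not what you want either (I think ;). In the version you actually run, \\[ escapes the bracket, so the pattern looks for a literal [ or ] character, which never occurs in a binary string; that is why nothing is printed.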
Now, the solution might be this:
([01])\1*
which matches a 0 or a 1 and captures it. Then it matches the same digit any number of times (\1 is a back reference to whatever is captured in the first capture group, in this case the 0 or the 1).
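A minimal Java sketch of the back-reference approach, using the longer string from the question; the class and method names are placeholders:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BinaryRuns {
    // Captures one digit (0 or 1), then repeats that same digit via the back reference \1.
    private static final Pattern RUN = Pattern.compile("([01])\\1*");

    // Returns each maximal run of identical digits, in order.
    static List<String> runs(String bits) {
        List<String> result = new ArrayList<>();
        Matcher matcher = RUN.matcher(bits);
        while (matcher.find()) {
            result.add(matcher.group()); // whole match, not just the captured digit
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runs("1000011100001100111100001"));
        // [1, 0000, 111, 0000, 11, 00, 1111, 0000, 1]
    }
}
```

Note that group() returns the whole run, while group(1) would only return the single captured digit.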
You can try this:
(1+|0+)
Sample Code:
final String regex = "(1+|0+)";
final String string = "10000111000011\n"
+ "11001111110011";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Group " + 1 + ": " + matcher.group(1));
}

Find All Word between < and > with Regex

I want to find the words between < and > in a String.
For example:
String str = "your mobile number is <A> and username is <B> thanks <C>";
I want to get A, B, C from the String.
I have tried
import java.util.regex.*;
public class Main
{
public static void main (String[] args)
{
String example = "your mobile number is <A> and username is <B> thanks <C>";
Matcher m = Pattern.compile("\\<([^)]+)\\>").matcher(example);
while(m.find()) {
System.out.println(m.group(1));
}
}
}
What's wrong with what I am doing?
Use the following idiom with a capturing group to get the values for your A, B and C placeholders:
String example = "your mobile number is <A> and username is <B> thanks <C>";
//                           ┌ left delimiter - no need to escape here
//                           | ┌ group 1: 1+ of any character, reluctantly quantified
//                           | |   ┌ right delimiter
//                           | |   |
Matcher m = Pattern.compile("<(.+?)>").matcher(example);
while (m.find()) {
System.out.println(m.group(1));
}
Output
A
B
C
Note
If you favor a solution with no indexed group that uses look-arounds instead, you can achieve the same with the following code:
String example = "your mobile number is <A> and username is <B> thanks <C>";
//                           ┌ positive look-behind for left delimiter
//                           |     ┌ 1+ of any character, reluctantly quantified
//                           |     |  ┌ positive look-ahead for right delimiter
//                           |     |  |
Matcher m = Pattern.compile("(?<=<).+?(?=>)").matcher(example);
while (m.find()) {
// no group index here: group() returns the whole match
System.out.println(m.group());
}
I personally find the latter less readable in this instance.
You need to use > or <> inside the negated character class. [^)]+ in your regex matches any character except ), one or more times, so it also matches the < and > symbols.
Matcher m = Pattern.compile("<([^<>]+)>").matcher(example);
while(m.find()) {
System.out.println(m.group(1));
}
OR
Use lookarounds.
Matcher m = Pattern.compile("(?<=<)[^<>]*(?=>)").matcher(example);
while(m.find()) {
System.out.println(m.group());
}
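To make the difference concrete, the following sketch runs the question's pattern and the corrected one side by side on the same input. The class name and the groups helper are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AngleBrackets {
    // Returns capture group 1 of every match of the given regex.
    static List<String> groups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "your mobile number is <A> and username is <B> thanks <C>";
        // [^)]+ is greedy and may cross '<' and '>', so it runs up to the last '>'.
        System.out.println(groups("<([^)]+)>", text));  // [A> and username is <B> thanks <C]
        // [^<>]+ cannot cross a bracket, so each placeholder is matched separately.
        System.out.println(groups("<([^<>]+)>", text)); // [A, B, C]
    }
}
```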
Can you please try this?
public static void main(String[] args) {
String example = "your mobile number is <A> and username is <B> thanks <C>";
Matcher m = Pattern.compile("\\<(.+?)\\>").matcher(example);
while(m.find()) {
System.out.println(m.group(1));
}
}

Extract substring that appears after certain pattern

I need to extract a substring that appears after a certain pattern in the input string. I have been trying various combinations but not getting expected output.
The input string can be in following 2 forms
1. 88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507 BISCAYNE BLVD STE
2. 88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507
I need to write a regex that will be applicable to above 2 variations and extract '149IF1007JMO2507' part that follows 'SNDR REF:'.
Please find below the sample program that I have written.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTester {
private static final String input = "88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507 BISCAYNE BLVD STE";
private static Pattern pattern = Pattern.compile(".*SNDR REF:(.*?)(\\s.)*");
private static Matcher matcher = pattern.matcher(input);
public static void main (String[] args) {
if (matcher.matches()) {
System.out.println(matcher.group(1));
}
}
}
Output: 149IF1007JMO2507 BISCAYNE BLVD STE
I want the output to be '149IF1007JMO2507'
Thank you.
You can use the following idiom to find your sub-string:
String[] examples = {
"88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507 BISCAYNE BLVD STE",
"88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507"
};
//                           ┌ look-behind for "SNDR REF:"
//                           |               ┌ anything, reluctantly quantified
//                           |               |  ┌ lookahead for
//                           |               |  | whitespace or end of input
Pattern p = Pattern.compile("(?<=SNDR\\sREF:).+?(?=\\s|$)");
// iterating examples
for (String s: examples) {
Matcher m = p.matcher(s);
// iterating single matches (one per example here)
while (m.find()) {
System.out.printf("Found: %s%n", m.group());
}
}
Output
Found: 149IF1007JMO2507
Found: 149IF1007JMO2507
Note
I expect you don't know in advance it's going to be "149IF1007JMO2507", hence the contextual matching.
You can use this regexp:
private static Pattern pattern = Pattern.compile(".*SNDR REF:([^\\s]+).*");
This captures everything after "SNDR REF:" up to the next whitespace character.
You can do it with replaceAll
str = str.replaceAll(".*(REF:(\\S+)).*", "$2");
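If the input might not contain "SNDR REF:" at all, a find()-based helper that returns an Optional may be easier to work with than replaceAll, which returns the input unchanged when nothing matches. The sketch below is one way to do that; the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SenderRef {
    // Captures the run of non-whitespace characters right after "SNDR REF:".
    private static final Pattern REF = Pattern.compile("SNDR REF:(\\S+)");

    // Returns the reference if the marker is present, otherwise an empty Optional.
    static Optional<String> extract(String line) {
        Matcher matcher = REF.matcher(line);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String withSuffix = "88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507 BISCAYNE BLVD STE";
        String withoutSuffix = "88,TRN:2014091900217161 SNDR REF:149IF1007JMO2507";
        System.out.println(extract(withSuffix));    // Optional[149IF1007JMO2507]
        System.out.println(extract(withoutSuffix)); // Optional[149IF1007JMO2507]
        System.out.println(extract("88,TRN:2014091900217161")); // Optional.empty
    }
}
```

Because find() searches anywhere in the input, the leading and trailing .* needed with matches() can be dropped.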

Regex for 2 different strings accounting for optional elements

I have two strings, "2007 AL PLAIN TEXT 5567 (NS)" and "5567". From both strings I want to extract a single group, which is 5567. How do I write a Java regex for this? The format is a 4-digit year, a 2-character jurisdiction, the string plain text, then the number I want to extract, and finally (NS). The problem is that everything except the number can be optional. How do I write a regex that captures only the number 5567 in a group?
You can do it in one line:
String num = input.replaceAll("(.*?)?(\\b\\w{4,}\\b)(\\s*\\(NS\\))?$", "$2");
This assumes your target is "a word at least 4 alphanumeric characters long".
You need to use the ? quantifier, which makes the preceding element optional. (?:...) groups part of the pattern without creating a capturing group. Here is the code:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class Regexp
{
public static void main(String args[])
{
String x = "2007 AL PLAIN TEXT 5567 (NS)";
String y = "5567";
Pattern pattern = Pattern.compile( "(?:.*[^\\d])?(\\d{4,}){1}(?:.*)?");
Matcher matcher = pattern.matcher(x);
while (matcher.find())
{
System.out.format("Text found in x: => \"%s\"\n",
matcher.group(1));
}
matcher = pattern.matcher(y);
while (matcher.find())
{
System.out.format("Text found in y: => \"%s\"\n",
matcher.group(1));
}
}
}
Output:
$ java Regexp
Text found in x: => "5567"
Text found in y: => "5567"
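Another option is to spell out the format from the question, with every part except the number wrapped in an optional non-capturing group. The sketch below assumes that "PLAIN TEXT" is a fixed literal and that the jurisdiction is two uppercase letters; neither is confirmed by the question, and the class and method names are invented.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalParts {
    // Assumed layout: [year] [jurisdiction] [PLAIN TEXT] number [(NS)], all but the number optional.
    private static final Pattern FORMAT = Pattern.compile(
            "(?:\\d{4}\\s+)?"        // optional 4-digit year
            + "(?:[A-Z]{2}\\s+)?"    // optional 2-letter jurisdiction (assumption)
            + "(?:PLAIN TEXT\\s+)?"  // optional literal text (assumption)
            + "(\\d+)"               // the number to extract, group 1
            + "(?:\\s*\\(NS\\))?");  // optional "(NS)" suffix

    // Returns the number if the whole input fits the assumed layout.
    static Optional<String> number(String input) {
        Matcher matcher = FORMAT.matcher(input);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(number("2007 AL PLAIN TEXT 5567 (NS)")); // Optional[5567]
        System.out.println(number("5567"));                         // Optional[5567]
    }
}
```

One limitation: an input consisting only of a year, such as "2007", is indistinguishable from a bare number under this layout and would be returned as the number.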
