Find two substrings using one regex - java

An external app sends the following line:
U999;U999;$SMS=;client: John Doe; A$ABC12345;, SHA:12345ABCDE
I need to extract two values from it: John Doe and 12345ABCDE.
Right now I can extract the two values separately, using one regex for each:
(?=client:(.*?);) for John Doe
(?=SHA:(.*?)$) for 12345ABCDE
Is it possible to extract both values with a single regex in a Pattern and get them back as a list of two values?

You could use a pattern matcher with two capture groups:
String input = "U999;U999;$SMS=;client: John Doe; A$ABC12345;, SHA:12345ABCDE";
String pattern = "^.*;\\s*client: ([^;]+);.*;.*\\bSHA:([^;]+).*$";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
if (m.find()) {
    System.out.println("client: " + m.group(1));
    System.out.println("SHA: " + m.group(2));
}
This prints:
client: John Doe
SHA: 12345ABCDE
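Since the question asks for the two values as a list, the matcher above can be wrapped in a small helper. This is only a minimal sketch: the class name ClientShaExtractor and the method extract are hypothetical, and it assumes that client: always appears before SHA: on the line, as in the sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClientShaExtractor {
    // Same pattern as in the answer above, compiled once and reused.
    private static final Pattern LINE =
            Pattern.compile("^.*;\\s*client: ([^;]+);.*;.*\\bSHA:([^;]+).*$");

    // Hypothetical helper: returns [client, sha], or an empty list if the line does not match.
    static List<String> extract(String line) {
        List<String> values = new ArrayList<>();
        Matcher m = LINE.matcher(line);
        if (m.find()) {
            // Group 0 is the whole match, so start at group 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                values.add(m.group(i));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String input = "U999;U999;$SMS=;client: John Doe; A$ABC12345;, SHA:12345ABCDE";
        System.out.println(extract(input)); // [John Doe, 12345ABCDE]
    }
}
```

Returning an empty list for a non-matching line is one possible choice; throwing an exception would work just as well if a malformed line should be treated as an error.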

Related

How to get only First Name Last Name from LDAP CN when format is last name\, first name

CN=Belzile\, Pierre,OU=LaptopUser,OU=Users,DC=Company,DC=local
I need only "Belzile Pierre" to be returned.
I need help with the regex syntax
For the regular expression we use Java syntax https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html.
Expected Result:
Belzile Pierre
You can use this regex, which captures the last name in group 1 and the first name in group 2:
CN=([a-zA-Z]+)\\,\s+([a-zA-Z]+)
Java code:
String s = "CN=Belzile\\, Pierre,OU=LaptopUser,OU=Users,DC=Company,DC=local";
Pattern p = Pattern.compile("CN=([a-zA-Z]+)\\\\,\\s+([a-zA-Z]+)");
Matcher m = p.matcher(s);
if (m.find()) {
    System.out.println(m.group(1) + " " + m.group(2));
}
This prints your expected output:
Belzile Pierre
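If the numbered groups are hard to keep apart, the same regex can be written with named groups, which Java supports since version 7. The sketch below is one possible variant: the group names last and first, the class name LdapCnName and the method displayName are all made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LdapCnName {
    // Same regex as above, with hypothetical group names "last" and "first".
    private static final Pattern CN =
            Pattern.compile("CN=(?<last>[a-zA-Z]+)\\\\,\\s+(?<first>[a-zA-Z]+)");

    // Returns "Last First", or null when the CN does not have the expected shape.
    static String displayName(String dn) {
        Matcher m = CN.matcher(dn);
        return m.find() ? m.group("last") + " " + m.group("first") : null;
    }

    public static void main(String[] args) {
        String dn = "CN=Belzile\\, Pierre,OU=LaptopUser,OU=Users,DC=Company,DC=local";
        System.out.println(displayName(dn)); // Belzile Pierre
    }
}
```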

java parse regex multiple capture groups

Hi, I need to be able to handle both of these scenarios:
John, party of 4
william, party of 6 dislikes John, jeff
What I want to capture is:
From string 1: John, 4
From string 2: william, 6, John, jeff
I'm pretty stumped on how to achieve this.
I know that ([^,])+ gives me the first group (just the name before the comma, without the comma itself), but I have no clue how to add the rest of the expression.
You may use
(\w+)(?:,\s*party of (\d+)|(?![^,]))
Details
(\w+) - Group 1: one or more word characters
(?:,\s*party of (\d+)|(?![^,])) - a non-capturing group matching either
,\s*party of (\d+) - a comma, then zero or more whitespace characters, then the literal text "party of " (with a trailing space), and then Group 2 capturing one or more digits
| - or
(?![^,]) - a position that is followed by a comma or by the end of the string
Java demo:
String regex = "(\\w+)(?:,\\s*party of (\\d+)|(?![^,]))";
List<String> strings = Arrays.asList("John, party of 4", "william, party of 6 dislikes John, jeff");
Pattern pattern = Pattern.compile(regex);
for (String s : strings) {
    System.out.println("-------- Testing '" + s + "':");
    Matcher matcher = pattern.matcher(s);
    while (matcher.find()) {
        System.out.println(matcher.group(1) + ": " + (matcher.group(2) != null ? matcher.group(2) : "N/A"));
    }
}
Output:
-------- Testing 'John, party of 4':
John: 4
-------- Testing 'william, party of 6 dislikes John, jeff':
william: 6
John: N/A
jeff: N/A
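To get the flat lists the question asks for (John, 4 and william, 6, John, jeff), the same loop can collect the groups instead of printing them. This is a sketch only: PartyParser and parse are hypothetical names, and the regex is the one from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PartyParser {
    // Regex taken unchanged from the answer above.
    private static final Pattern TOKEN =
            Pattern.compile("(\\w+)(?:,\\s*party of (\\d+)|(?![^,]))");

    // Hypothetical helper: collects each name, plus the party size when group 2 took part in the match.
    static List<String> parse(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            out.add(m.group(1));
            if (m.group(2) != null) {
                out.add(m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("John, party of 4"));                        // [John, 4]
        System.out.println(parse("william, party of 6 dislikes John, jeff")); // [william, 6, John, jeff]
    }
}
```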

Java pattern capture word(s)

I'm trying to capture word or words from a string like this:
input: "aa bb"
pattern: "(.*) bb"
expected group: "aa"
input: "aa yy bb xx"
pattern: "(.*) bb (.*)"
expected groups: "aa yy, xx"
But in my attempts it always captures the whole string. Where is my mistake?
String patternString = "(.*) bb";
Log("patternString: " + patternString);
Pattern p = Pattern.compile(patternString);
Matcher m = p.matcher("aa bb");
while (m.find()) {
    Log("group: " + m.group());
    //Log: group: aa bb
}
You want the first group, not the entire match. Use m.group(1) for this instead of m.group(), which returns the entire match.
See the documentation of Matcher for the available API. Matcher#groupCount() returns the number of capturing groups in the pattern.
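The difference between the whole match and the individual groups can be seen by looping from 1 to groupCount(). The following is a minimal sketch using the two patterns from the question; GroupDemo and captures are hypothetical names, and System.out.println stands in for the asker's Log method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns groups 1..groupCount() of the first match, or an empty list if nothing matches.
    static List<String> captures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            // m.group() and m.group(0) are the entire match; the captures start at index 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(captures("(.*) bb", "aa bb"));            // [aa]
        System.out.println(captures("(.*) bb (.*)", "aa yy bb xx")); // [aa yy, xx]
    }
}
```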

avoid using multiple split method

I have a string like this:
//Locaton;RowIndex;maxRows=New York, NY_10007;1;4
From this I need to get only the country name, New York.
How can this be done in a single step?
I used:
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
str = str.split("=")[1];
str = str.split(",")[0];
The above code contains a lot of splits. How can I avoid this?
I want to get the country name with a single expression.
Try the regular expression "=(.*?)," like this:
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
Pattern pattern = Pattern.compile("=(.*?),");
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
New York
matcher.group(1) returns the text captured by the first capturing group. Parentheses in a regex create a numbered capturing group, which stores the part of the string matched by the sub-expression inside the parentheses.
Input: "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 "
Group 1: "New York"
Capture groups are a good fit for extracting specific data from a string:
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
String pattern = "(.*?=)(.*?)(,.*)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(str);
if (m.find()) {
    System.out.println("Group 1: " + m.group(1));
    System.out.println("Group 2: " + m.group(2));
    System.out.println("Group 3: " + m.group(3));
}
Here is the output:
Group 1: Locaton;RowIndex;maxRows=
Group 2: New York
Group 3: , NY_10007;1;4
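If a single expression is the goal, one alternative that is not part of the answers above is String#replaceFirst with a back-reference to the captured group. A rough sketch, with hypothetical names FirstValue and afterEqualsBeforeComma:

```java
class FirstValue {
    // Replaces the whole string with group 1: the text between the first '=' and the next ','.
    // Caveat: if the regex does not match, replaceFirst returns the input unchanged.
    static String afterEqualsBeforeComma(String s) {
        return s.replaceFirst("^[^=]*=([^,]*),.*$", "$1");
    }

    public static void main(String[] args) {
        String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
        System.out.println(afterEqualsBeforeComma(str)); // New York
    }
}
```

The Matcher versions above are usually the safer choice, because find() reports explicitly whether anything matched, while replaceFirst silently hands back the original string.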

Getting overlapping matches with multiple patterns in Java regex

I have the same problem as in this link, but with multiple patterns. My regex is:
Pattern word = Pattern.compile("([\\w]+ [\\d]+)|([\\d]+ suite)|([\\w]+ road)");
If my sample text is
XYZ Road 123 Suite
my desired output is
XYZ Road 123
123 suite
but I am only getting
XYZ Road 123
Thanks in advance!
You could try the regex below, which uses a positive lookahead assertion.
(?=(\b\w+ Road \d+\b)|(\b\d+ suite\b))
The Java version adds the (?i) flag so that suite also matches Suite:
String s = "XYZ Road 123 Suite";
Matcher m = Pattern.compile("(?i)(?=(\\b\\w+ Road \\d+\\b)|(\\b\\d+ suite))").matcher(s);
while (m.find()) {
    if (m.group(1) != null) System.out.println(m.group(1));
    if (m.group(2) != null) System.out.println(m.group(2));
}
Output:
XYZ Road 123
123 Suite
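The reason this works is that a lookahead matches without consuming any characters, so the next find() can start inside the text that a previous group captured. Below is a sketch of the same idea as a reusable helper; OverlappingMatches and findAll are hypothetical names, and the pattern is the one from this answer with a trailing \b on both alternatives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingMatches {
    // The whole pattern is a lookahead, so every match is empty and only the groups carry text.
    private static final Pattern P = Pattern.compile(
            "(?i)(?=(\\b\\w+ Road \\d+\\b)|(\\b\\d+ suite\\b))");

    // Collects every non-null group from every match position.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                if (m.group(i) != null) {
                    found.add(m.group(i));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("XYZ Road 123 Suite")); // [XYZ Road 123, 123 Suite]
    }
}
```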
(?=(\b[\w]+ [\d]+))|(?=(\b[\d]+ suite))|(?=(\b[\w]+ road))
Try this and grab the captures. See the demo:
https://regex101.com/r/dU7oN5/16
It uses a positive lookahead so that the matched text is not consumed.
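Something like this, maybe?
String input = "XYZ Road 123 Suite";
// [\\w ]+ needs the + quantifier, otherwise group 1 would hold a single character before " Road"
Pattern p = Pattern.compile("([\\w ]+ Road) (\\d+) (Suite)");
Matcher m = p.matcher(input);
if (m.find()) {
    System.out.println(m.group(1) + " " + m.group(2));
    System.out.println(m.group(2) + " " + m.group(3));
}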
