Regular Expression with parenthesis [duplicate] - java

This question already has answers here:
List of all special characters that need to be escaped in a regex
(10 answers)
Closed 4 years ago.
I wrote a Java program to find the pattern "Type": "Value" in a string. This is the pattern I wrote: "Type[\W]*Value". "Value" is replaced with the actual value at run time.
It works for strings like "Type":"Name" or "Type":"Name<", but it fails when the value contains parentheses:
"Type":"Name(" - java.util.regex.PatternSyntaxException: Unclosed group near index.
"Type":"Name()" - No match found.
I don't have much experience writing regular expressions. Can someone please help me understand the issue?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringUtils;
public class PatternFinder {
public static void main(String args[]) {
System.out.println(findPattern("Name","abcd7<","{\"response\":{\"Name\":\"abcd7<\"}}"));
System.out.println(findPattern("Name","abcd7(","{\"response\":{\"Name\":\"abcd7(\"}}"));
}
private static int findPattern(String InputType, String InputValue, String responseString) {
System.out.println("extractEncode" + responseString);
int indexOfInputPattern = -1;
String InputStringPattern = "InputType[\\W]*InputValue";
InputStringPattern = StringUtils.replaceEach(InputStringPattern, new String[] { "InputType", "InputValue" },
new String[] { InputType, InputValue });
System.out.println(InputStringPattern);
if (!StringUtils.isBlank(InputStringPattern)) {
Pattern pattern = Pattern.compile(InputStringPattern);
indexOfInputPattern = indexOf(pattern, responseString);
}
return indexOfInputPattern;
}
private static int indexOf(Pattern pattern, String s) {
System.out.println(pattern.toString()+" "+s);
Matcher matcher = pattern.matcher(s);
return matcher.find() ? matcher.end() : -1;
}
}

You can either escape each parenthesis by putting a backslash in front of it, or use Pattern.quote to escape the parts you don't want interpreted as regex syntax.
Read more here: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html#quote(java.lang.String)
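A minimal sketch of the Pattern.quote approach, assuming the type and the value arrive as plain strings that should be matched literally. The class name QuotedPatternFinder and the method findEnd are made up for illustration; only Pattern.quote, Pattern.compile and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPatternFinder {
    // Returns the index just past the match, or -1 if there is none.
    static int findEnd(String type, String value, String response) {
        // Quote both run-time parts so characters such as ( are treated as literals.
        String regex = Pattern.quote(type) + "\\W*" + Pattern.quote(value);
        Matcher matcher = Pattern.compile(regex).matcher(response);
        return matcher.find() ? matcher.end() : -1;
    }

    public static void main(String[] args) {
        String response = "{\"response\":{\"Name\":\"abcd7(\"}}";
        System.out.println(findEnd("Name", "abcd7(", response));
    }
}
```

Quoting only the substituted parts keeps \W* working as a regex while the value is matched character for character.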

If you're searching for parentheses ( ) in a regular expression, you have to escape them with a backslash \. So you search for Name\(. This is because parentheses have a special meaning in regular expressions: they open and close a group.
To make things more complicated, \ is also a special character in Java string literals, so you have to escape that too. In Java source, your final expression will look like Name\\(.
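The difference between the escaped and unescaped forms can be checked with a small sketch like the following. The class and helper names (EscapedParenDemo, compiles, finds) are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EscapedParenDemo {
    // True if the regex compiles, false if it has a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // True if the regex matches somewhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Unescaped: the ( opens a group that is never closed.
        System.out.println(compiles("Name("));
        // Escaped: \\( in Java source is the regex \(, a literal parenthesis.
        System.out.println(compiles("Name\\("));
        System.out.println(finds("Name\\(", "\"Type\":\"Name(\""));
    }
}
```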

Related

Fix Regular Expression to allow optional fields

A data-line looks like this:
$POSL,VEL,SPL,,,4.1,0.0,4.0*12
The 7th field (4.1) is extracted to the named field SPEED using this Java Regexp.
\\$POSL,VEL,SPL,,,(?<SPEED>\\d+.\\d+),.*
New data has slightly changed. The fields in 4,5,6 may now contain data:
$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d
But, the Regexp is now returning zero. Note: fields 4, 5, 6 may contain letters or numbers. But, they will not contain quoted Strings (so we don't need to worry about quoted commas).
Can someone offer a fix please?
You can match the optional letters and digits in each of those fields with ,[A-Za-z0-9]*
As there is one more comma in the second string, you can make that part optional.
If you are only interested in the capturing group and not in the rest of the line, you can omit the .* at the end. If the value can also occur at the end of the string, you can end the pattern with the alternation (?:,|$)
Remember to escape the dot in this part: \\d+\\.\\d+
\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?(?<SPEED>\d+\.\d+)(?:,|$)
In Java with double escaped backslashes
String regex = "\\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?(?<SPEED>\\d+\\.\\d+)(?:,|$)";
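As a rough usage sketch, the pattern above can be applied to both sample lines from the question. The class name SpeedExtractor and the choice to return null when nothing matches are assumptions for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpeedExtractor {
    private static final Pattern SPEED_LINE = Pattern.compile(
        "\\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?(?<SPEED>\\d+\\.\\d+)(?:,|$)");

    // Returns the text of the SPEED group, or null if the line does not match.
    static String extractSpeed(String line) {
        Matcher matcher = SPEED_LINE.matcher(line);
        return matcher.find() ? matcher.group("SPEED") : null;
    }

    public static void main(String[] args) {
        System.out.println(extractSpeed("$POSL,VEL,SPL,,,4.1,0.0,4.0*12"));
        System.out.println(extractSpeed("$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d"));
    }
}
```

Reading the group by name with group("SPEED") avoids depending on the position of the capturing group.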
You may use \w* for the fields 4, 5, 6; it matches any run of letters, digits or underscores, including an empty one
\\$POSL,VEL,SPL,\\w*,\\w*,\\w*,(?<SPEED>\\d+\\.\\d+),.*
Note that in your post, the example line and the regex both seem to be missing a comma for the number to be the seventh field.
Assuming one , was missing in the first input:
package arraysAndStrings;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexGroupCapture {
public static void main(String[] args) {
String inputArr[] = { "$POSL,VEL,SPL,,,,4.1,0.0,4.0*12",
"$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d" };
for (String input : inputArr) {
System.out.println(extractSpeed(input));
}
}
private static float extractSpeed(String input) {
float speed = 0;
try {
String regex = "\\$POSL,VEL,SPL,.*?,.*?,.*?,(?<SPEED>\\d+\\.\\d+),.*"; // dot escaped so it only matches a literal '.'
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
speed = Float.parseFloat(matcher.group(1));
}
} catch (Exception e) {
e.printStackTrace();
}
return speed;
}
}
Output
=====
4.1
4.0

Finding the any number of characters between brackets using regex? [duplicate]

This question already has answers here:
Print regex matches in java
(2 answers)
Closed 3 years ago.
I've been struggling with some regex parsing to get a particular substring out of a string. I want to get the {0} out of a string. The caveat is that the substring can have any number of 0's within the {} and there can be many instances of {0} in the string. A few example inputs are:
{0} should print {0}
F-{000} print {000}
F-{00000000}-{0000} print {00000000} & {0000}
Here is the code I have:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main
{
public static void main(String[] args) {
String displayFormat = "A-{0000}";
printValue("^\\{[0]+}$", displayFormat); // this searches for a string beginning with {, ends with }, and has 1 or more instances of 0 in it
printValue("\\{[0]+\\}", displayFormat); // same as above but without the strict requirement of beginning and ending with {}, rather looks for the literal {}
}
public static void printValue(String regex, String displayFormat) {
final Matcher matcher = Pattern.compile(regex).matcher(displayFormat);
String zeroSection = null;
while(matcher.find()) {
if(matcher.groupCount() > 0) {
System.out.println("Group: " + matcher.group(1));
}
}
}
}
Any help would be appreciated. Thanks!
I was able to get this working. The regex that was posted in the answer above seems to work; the real issue was how I was printing the result. The patterns contain no capturing groups, so groupCount() is 0 and nothing was printed. I should have been printing group(0), the whole match, instead of group(1). Silly mistake.
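A minimal sketch of the corrected approach, collecting the whole match on each find(). The class name ZeroSectionFinder and the method findSections are hypothetical; the regex is the unanchored one from the question with the redundant character class dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroSectionFinder {
    // Returns every {0...0} section found in the input, in order.
    static List<String> findSections(String displayFormat) {
        Matcher matcher = Pattern.compile("\\{0+\\}").matcher(displayFormat);
        List<String> sections = new ArrayList<>();
        while (matcher.find()) {
            // group() is shorthand for group(0): the entire match.
            sections.add(matcher.group());
        }
        return sections;
    }

    public static void main(String[] args) {
        System.out.println(findSections("{0}"));
        System.out.println(findSections("F-{000}"));
        System.out.println(findSections("F-{00000000}-{0000}"));
    }
}
```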

String Split using a regular expression in Java?

I am trying to split a string on a regular expression, "[.,?!]+'", that should cover all of these characters plus a single space, but the split is not happening.
Here's my class:
public class splitStr {
public static void main(String[] args) {
String S="He is a very very good boy, isn't he?";
S.trim();
if(1<=S.length() && S.length()<=400000){
String delim ="[ .,?!]+'";
String []s=S.split(delim);
System.out.println(s.length);
for(String d:s)
{
System.out.println(d);
}
}
}
}
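The reason it's not working is that not all the delimiters are inside the square brackets. With the apostrophe outside, the pattern only matches a run of the bracketed characters that is immediately followed by an apostrophe, which never occurs in this input.
String delim ="[ .,?!]+'"; // you wrote this
Change it to this:
String delim ="[ .,?!']";
Without a + after the closing bracket, consecutive delimiters such as ", " produce empty strings in the result; append + if you want them treated as one separator.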
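Here is a small sketch of the split with the apostrophe moved inside the character class. It assumes that a run of delimiters should count as a single separator, which is why a + follows the closing bracket; the class name SplitDemo is made up for illustration.

```java
class SplitDemo {
    // Splits on runs of space, period, comma, question mark, exclamation mark or apostrophe.
    static String[] tokens(String sentence) {
        return sentence.trim().split("[ .,?!']+");
    }

    public static void main(String[] args) {
        String[] parts = tokens("He is a very very good boy, isn't he?");
        System.out.println(parts.length);
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

String.split drops trailing empty strings, so the final ? does not add an empty token at the end.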
Must the characters +, ', [ and ] be part of the split?
I'm asking because the plus sign and square brackets have special meaning in regular expressions, and if you want them matched literally, they must be escaped with \
So, if you want an expression that includes all these characters, it should be:
delim = "[\\[ .,\\?!\\]\\+']"
Note that I had to write \\ because the backslash needs to be escaped inside Java string literals. Inside a character class, ? and + are already literal, so the backslashes before them are optional; [ and ] are the ones that need escaping there.
I'm not in front of a computer right now, so I haven't tested it, but I believe it should work.
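Whether ? and + need a backslash inside square brackets can be checked with a short sketch such as this one. The class and method names are hypothetical; both patterns are expected to split the same way, because these two characters are literal inside a character class.

```java
import java.util.Arrays;

class CharClassEscapeDemo {
    // ? and + written without backslashes inside the class.
    static String[] splitPlain(String s) {
        return s.split("[?+]");
    }

    // The same class with redundant backslashes.
    static String[] splitEscaped(String s) {
        return s.split("[\\?\\+]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitPlain("a+b?c")));
        System.out.println(Arrays.toString(splitEscaped("a+b?c")));
    }
}
```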
import java.util.*;
import java.util.stream.Collectors;
public class StringToken {
public static void main(String[] args) {
String S="He is a very very good boy, isn't he?";
S = S.trim(); // strings are immutable, so the trimmed result must be assigned back
if(1<=S.length() && S.length()<=400000){
String delim = "[ .,?!']";
String []s=S.split(delim);
List<String> d = Arrays.asList(s);
d= d.stream().filter(item-> (item.length() > 0)).collect(Collectors.toList());
System.out.println(d.size());
for(String m:d)
{
System.out.println(m);
}
}
}
}

Why do strings with newlines not match regular expressions in Java?

I have a String string that contains a newline (\n). When I try to match it with a regular expression pattern it returns false, although there should be a match.
package com.stackoverflow;
public class ExgExTest {
public static void main(String[] args) {
String pattern = ".*[0-9]{2}[A-Z]{2}.*";
String string = "123ABC\nDEF";
if (string.matches(pattern)) {
System.out.println("Matches.");
} else {
System.out.println("Does not match.");
}
} // END: main()
} // END: class
How can I match multiline strings with a regular expression?
You need to use the DOTALL (s) flag for this:
String pattern = "(?s).*[0-9]{2}[A-Z]{2}.*";
Note the (?s), which makes the dot match line terminators as well. By default . stops at a newline, so .* cannot span the \n and matches() fails.
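The effect of the flag can be seen in a short sketch that runs the same pattern three ways: without the flag, with the inline (?s), and with Pattern.DOTALL passed to Pattern.compile. The class name DotallDemo and its helper methods are made up for illustration.

```java
import java.util.regex.Pattern;

class DotallDemo {
    static final String REGEX = ".*[0-9]{2}[A-Z]{2}.*";

    // Default behaviour: . does not match line terminators.
    static boolean matchesPlain(String s) {
        return s.matches(REGEX);
    }

    // Inline flag: (?s) turns on DOTALL for the whole pattern.
    static boolean matchesInlineFlag(String s) {
        return s.matches("(?s)" + REGEX);
    }

    // Same effect using the compile-time flag.
    static boolean matchesCompiledFlag(String s) {
        return Pattern.compile(REGEX, Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        String input = "123ABC\nDEF";
        System.out.println(matchesPlain(input));
        System.out.println(matchesInlineFlag(input));
        System.out.println(matchesCompiledFlag(input));
    }
}
```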
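Pattern.quote(pattern) escapes all special characters in a pattern so that it is matched as literal text. That is useful when the pattern is plain text, but it would not help here: quoting this pattern would turn .* and [0-9]{2} into literals.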

Can someone explain the regular expression part of String#replaceAll(..) method? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
//Its a question on replacement of duplicate characters
public class RemoveDuplicateChars {
static String testcase1 = "DPMD Jayawardene";
public static void main(String args[]){
RemoveDuplicateChars testInstance= new RemoveDuplicateChars();
String result = testInstance.remove(testcase1);
System.out.println(result);
}
//write your code here
public String remove(String str){
return str.replaceAll("(.)(?=.*\\1)", "");//how this replacement working
}
}
As you can see from the name of the class, it removes characters that repeat in a string, keeping only the last occurrence of each.
Breakdown:
(.) - matches any character; the parentheses capture it as a group, so we can reference it later using \1
(?=) - lookahead
(?=.*\\1) - we look ahead in the string
.* - skipping over any number of characters and looking for the captured character \1
If the lookahead succeeds, the captured character is replaced with the empty string.
From java.util.regex.Pattern:
(.) : Match any character in a capture group (basically a variable named \1)
(?= : Zero-width positive lookahead (make sure the rest of the string matches)
.* any number of characters followed by
\\1 the captured group
In other words, it matches any character that also appears later in the string (i.e. is a duplicate). In Java, this would be:
for(int i=0; i<str.length(); i++) {
char captured = str.charAt(i); // (.)
if (str.indexOf(captured, i + 1) >= 0) { // (?=.*\1): the same char occurs again later
// the char is a duplicate, replace it with ""
}
}
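To see the replacement in action, here is a small runnable sketch using the same pattern. The class name DuplicateRemover is hypothetical. Note that the comparison is case-sensitive, so D and d count as different characters.

```java
class DuplicateRemover {
    // Removes every character that occurs again later in the string,
    // so only the last occurrence of each character survives.
    static String remove(String str) {
        return str.replaceAll("(.)(?=.*\\1)", "");
    }

    public static void main(String[] args) {
        System.out.println(remove("banana"));
        System.out.println(remove("DPMD Jayawardene"));
    }
}
```

For "banana" the first three characters after b each reappear later and are dropped, which leaves "bna".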
