Regular Expression in java- code error - java

I wrote a regex program in Java. I think it is correct, but the result is different from what I expect. Please help me fix it.
my code:
String text ="My wife back me up over my decision to quit my job";
String patternString = "[/w/s]*back(\\s\\w+\\s)*up[/w/s]*.";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
System.out.println("matches = " + matches);
output:
matches = false
I'm new to Java programming. I want to write a program that uses a regex to test whether the input sentence contains "back up".
Thanks for your attention.

I think your pattern should be like this:
String patternString = "[\\w\\s]*back(\\s\\w+\\s)*up[\\w\\s]*.";

You are using forward slashes instead of backslashes:
String patternString = "[/w/s]*back(\\s\\w+\\s)*up[/w/s]*.";
                         ^ ^                       ^ ^
The two are not interchangeable (and don't forget that backslashes need to be doubled up).
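A minimal sketch of the corrected pattern in a runnable class. The class name BackUpDemo and the helper names are hypothetical. The second helper is an alternative I am assuming fits the stated goal (detecting "back ... up" anywhere in a sentence): it uses find() instead of matches(), so the pattern does not have to describe the whole sentence. Note that the corrected matches() pattern appears to require at least one word between "back" and "up", so it rejects a plain "back up".

```java
import java.util.regex.Pattern;

class BackUpDemo {
    // Corrected pattern from the answers above; matches() must consume the whole text.
    static boolean matchesWhole(String text) {
        String patternString = "[\\w\\s]*back(\\s\\w+\\s)*up[\\w\\s]*.";
        Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
        return pattern.matcher(text).matches();
    }

    // Assumed alternative: "back", optionally some words in between, then "up",
    // located anywhere in the text with find().
    static boolean containsBackUp(String text) {
        Pattern pattern = Pattern.compile(
                "\\bback\\b(?:\\s+\\w+)*?\\s+up\\b", Pattern.CASE_INSENSITIVE);
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "My wife back me up over my decision to quit my job";
        System.out.println("matches = " + matchesWhole(text));   // matches = true
        System.out.println("find = " + containsBackUp(text));    // find = true
    }
}
```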

Related

java.util.regex matching parentheses

I am using java.util.regex for matching, as shown below:
public static void main(String[] args) {
String input = "<b>I love you (LT): </b>xxxxxxxxxxxxxxxxxxxxxxxxx";
String patternStr = "I love you (LT):";
String noParentStr = "I love you";
Pattern pattern = Pattern.compile(patternStr);
Pattern noParentPattern = Pattern.compile(noParentStr);
Matcher matcher = pattern.matcher(input);
Matcher noParrentTheseMatcher = noParentPattern.matcher(input);
System.out.println("result:" + matcher.find());
System.out.println("result no parenthese:" + noParrentTheseMatcher.find());
}
I can see that the input string contains patternStr "I love you (LT):", but I get this result:
result:false
result no parenthese:true
How can I match a string that contains the parentheses '(' and ')'?
In regex, parentheses are metacharacters, i.e., they are reserved for special use, specifically for a feature called "capture groups".
Try escaping them with a \ before each parenthesis (written as \\ inside a Java string literal):
I love you \(LT\):
List of all special characters that need to be escaped in a regex
As has been pointed out in the comments, you don't need a regex to check whether your input String contains I love you (LT):. There is no actual pattern to represent here, only a character-by-character comparison between a portion of your input and the string you're looking for.
To achieve what you want, you could use the contains method of the String class, which suits your needs perfectly.
String input = "<b>I love you (LT): </b>xxxxxxxxxxxxxxxxxxxxxxxxx";
String strToLookFor = "I love you (LT):";
System.out.println("Result w Contains: " + input.contains(strToLookFor)); //Returns true
If you actually need to use a regex because it is a requirement, then, as @Yarin already said, you need to escape the parentheses, since they are characters with a special meaning: they are used for capturing groups.
String input = "<b>I love you (LT): </b>xxxxxxxxxxxxxxxxxxxxxxxxx";
// Each parenthesis is escaped with \\ so that it is matched literally
String strPattern = "I love you \\(LT\\):";
Pattern pattern = Pattern.compile(strPattern);
Matcher matcher = pattern.matcher(input);
System.out.println("Result w Pattern: " + matcher.find()); //Returns true
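When the search text is fixed, another option is to let the library do the escaping with Pattern.quote, which makes every character in the given string match literally. This is a sketch; the class name QuoteDemo and the helper containsLiteral are hypothetical.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Pattern.quote returns a regex that matches the given text literally,
    // so parentheses and other metacharacters need no manual escaping.
    static boolean containsLiteral(String input, String literal) {
        Pattern pattern = Pattern.compile(Pattern.quote(literal));
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "<b>I love you (LT): </b>xxxxxxxxxxxxxxxxxxxxxxxxx";
        System.out.println("Result w quote: " + containsLiteral(input, "I love you (LT):"));
    }
}
```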

Extracting a string using Regex

I have the following code to extract the string within double quotes using Regex.
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
The output I get now is Java and programming. But from the String str I want only the content of the second pair of double quotes, which is programming. Can anyone tell me how to do that using regex?
If you take your example, and change it slightly to:
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
int i = 0;
while(matcher.find()){
System.out.println("match " + ++i + ": " + matcher.group(1));
}
You should find that it prints:
match 1: Java
match 2: programming
This shows that you are able to loop over all of the matches. If you only want the last match, then you have a number of options:
Store the match in the loop, and when the loop is finished, you have the last match.
Change the regex to ignore everything until your pattern, with something like: Pattern.compile(".*\"([^\"]*)\"")
If you really want explicitly the second match, then the simplest solution is something like Pattern.compile("\"([^\"]*)\"[^\"]*\"([^\"]*)\""). This gives two matching groups.
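The first option (keep the most recent match while looping) can be sketched as follows. The class QuotedTokens and its method names are hypothetical; collecting all matches into a list is my own addition so that the second match can also be read by index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokens {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Collects the content of every double-quoted token, in order.
    static List<String> all(String str) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(str);
        while (matcher.find()) {
            tokens.add(matcher.group(1));
        }
        return tokens;
    }

    // Overwrites the stored value on each match; returns null if nothing matched.
    static String last(String str) {
        String lastMatch = null;
        Matcher matcher = QUOTED.matcher(str);
        while (matcher.find()) {
            lastMatch = matcher.group(1);
        }
        return lastMatch;
    }

    public static void main(String[] args) {
        String str = "\"Java\",\"programming\"";
        System.out.println(last(str));        // programming
        System.out.println(all(str).get(1));  // programming
    }
}
```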
If you want the last token inside double quotes, add an end-of-line anchor ($):
final Pattern pattern = Pattern.compile("\"([^\"]*)\"$");
In this case, you can replace while with if if your input is a single line.
Great answer from Paul. You can also try this pattern:
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
Java program
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
Explanation
,\": matches a comma, followed by a quotation mark "
(\\w+): matches one or more word characters (letters, digits, or underscores)
\": matches the last quotation mark "
The group (\\w+) is then captured (as group 1, to be precise)
Output
programming

How to parse a range input in java

I want to parse a range of data (e.g. 100-2000) in Java. Is this code correct:
String patternStr = "^(\\\\d+)-(\\\\d+)$";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
if(matcher.find()){
// Doing some parser
}
Too many backslashes, and you can use matches() without anchors (^$).
String inputStr = "100-2000";
String patternStr = "(\\d+)-(\\d+)";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
if (matcher.matches()) {
System.out.println(matcher.group(1) + " - " + matcher.group(2));
}
As for your question "Is this code correct", all you had to do was wrap the code in a class with a main method and run it, and you'd get the answer: No.
No, you're double (well, quadruple)-escaping the digits.
It should be: "^(\\d+)-(\\d+)$".
Meaning:
Start of input: ^
Group 1: 1+ digit(s): (\\d+)
Hyphen literal: -
Group 2: 1+ digit(s): (\\d+)
End of input: $
Notes
Groups are useful for back-references and for extracting the matched parts with matcher.group(n). If you use neither, you can drop the parentheses around the \\d+ expressions.
You are parsing the representation of a range in this example.
If you want an actual character range inside a regex, you can use the [min-max] idiom in a character class, where "min" and "max" are single characters, for instance [0-9].
As mentioned by Andreas, you can use String.matches without the Pattern-Matcher idiom and the ^ and $, if you want to match the whole input.
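A sketch of how the two groups could be turned into numbers, assuming the goal is to obtain both bounds as int values. The class name RangeParser and the choice to throw IllegalArgumentException on bad input are assumptions, not something the question specifies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeParser {
    // No ^ and $ needed: matches() already requires the whole input to match.
    private static final Pattern RANGE = Pattern.compile("(\\d+)-(\\d+)");

    // Returns {min, max}. Throws IllegalArgumentException for malformed input;
    // Integer.parseInt throws NumberFormatException (a subclass of it) when
    // the digits do not fit in an int.
    static int[] parse(String input) {
        Matcher matcher = RANGE.matcher(input);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a range: " + input);
        }
        int min = Integer.parseInt(matcher.group(1));
        int max = Integer.parseInt(matcher.group(2));
        return new int[] { min, max };
    }

    public static void main(String[] args) {
        int[] range = parse("100-2000");
        System.out.println(range[0] + " - " + range[1]);  // 100 - 2000
    }
}
```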

Regex - to accept latin/ucs2 characters

I am trying to write a regex that accepts Latin/UCS2 characters, but I am getting an error. In the following code, 'text1' should match the pattern. I am still working on this. Can anyone please help me fix it?
String text1 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz !\"#$%&'()*+,-./:;<=>?#"
+ "{|}~¡ ";
String pattern = "^[a-zA-Z0-9\\*\\?\\$\\[\\]\\(\\)\\|\\{\\}\\/\\'\\#\\~\\.,;\"\\<=\\>-#%&!+:~¡ ]+$";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(text1);
if (m.find()) {
System.out.println("true");
}
What is not working? Is the pattern not matching or is there an error message?
The first thing I see is that you have escaped many characters that don't need escaping, while an important one is not escaped.
In a character class only a few characters have a special meaning: [, ], \, -, and ^ when it is in the first position. You haven't escaped the -, so it is read as a range operator, which can cause an error. Try:
String pattern = "^[a-zA-Z0-9*?$\\[\\]()|{}/'#~.,;\"<=>\\-#%&!+:~¡ £¤¥ §¿ ÄÅÆÇÉÑÖØÜßàäåæ èéìñòöøùü ]+$";
The next thing is: Have a look at Unicode Properties/Scripts. You can e.g. use \\p{L} to match a letter in any language.
String pattern = "^[\\p{L}\\p{M}0-9*?$\\[\\]()|{}/'#~.,;\"<=>\\-#%&!+:~¡ £¤¥ §¿]+$";
Would match all letters you had in your class and more!
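A small sketch of both points: an unescaped hyphen between '>' and '#' is read as a reversed range, which Pattern.compile rejects, and \p{L} accepts letters outside ASCII. The class name CharClassDemo, the helper names, and the shortened character class are hypothetical and only for illustration; the non-ASCII test string is written with \u escapes to avoid source-encoding problems.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CharClassDemo {
    // Letters and combining marks from any script, plus ASCII digits and spaces.
    private static final Pattern TEXT = Pattern.compile("^[\\p{L}\\p{M}0-9 ]+$");

    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    static boolean isAllowed(String s) {
        return TEXT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(compiles("[>-#]"));    // false: '-' forms the reversed range '>' to '#'
        System.out.println(compiles("[>\\-#]"));  // true: the escaped '-' is a literal hyphen
        System.out.println(isAllowed("\u00E9t\u00E9 2024"));  // true: accented letters are \p{L}
        System.out.println(isAllowed("a-b"));     // false: '-' is not in this class
    }
}
```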

forming correct regular expression in dynamic string

I have a FileInputStream that reads a file which somewhere contains a section looking like this:
...
OperatorSpecific(XXX)
{
Customer(someContent)
SaveImage()
{
...
I would like to identify the Customer(someContent) part of the string and switch the someContent inside the parenthesis for something else.
someContent will be a dynamic parameter and will contain a string of maybe 5-10 chars.
I have used regEx before, like once or twice, but I feel that in a context such as this where I don't know what value will be inside the parenthesis I'm at a loss of how I should express it...
In summary I want to have a string returned to me which has my someContent value inside the Customer-parenthesis.
Does anyone have any bright ideas of how to get this done?
Try this one (double the backslashes when you use it in a Java string literal!)
(?<=Customer\()[^\)]*
And replace with your content.
See it here at Regexr
(?<=Customer\() is a lookbehind assertion. At every position it checks whether there is a "Customer(" immediately to the left; if so, [^\)]* matches all following characters that are not a ")". That match is the part that will be replaced.
Some working java code
Pattern p = Pattern.compile("(?<=Customer\\()[^\\)]*");
String original = "Customer(someContent)";
String Replacement = "NewContent";
Matcher m = p.matcher(original);
String result = m.replaceAll(Replacement);
System.out.println(result);
This will print
Customer(NewContent)
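Because the new value is dynamic, it may contain $ or \, which replaceAll treats specially in the replacement string. A sketch that guards against this with Matcher.quoteReplacement, applied to a multi-line input like the one in the question; the class name CustomerReplacer and the method replaceCustomer are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomerReplacer {
    // Matches everything between "Customer(" and the next ")".
    private static final Pattern CUSTOMER_ARG = Pattern.compile("(?<=Customer\\()[^)]*");

    // quoteReplacement makes '$' and '\' in newValue literal, so a dynamic
    // value is inserted as-is instead of being read as a group reference.
    static String replaceCustomer(String content, String newValue) {
        return CUSTOMER_ARG.matcher(content)
                .replaceAll(Matcher.quoteReplacement(newValue));
    }

    public static void main(String[] args) {
        String content =
                "OperatorSpecific(XXX)\n{\n"
                + "Customer(someContent)\n"
                + "SaveImage()\n{";
        System.out.println(replaceCustomer(content, "NewContent"));
    }
}
```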
Using a capturing group with a non-greedy quantifier also works:
String s =
"OperatorSpecific(XXX)\n {\n" +
" Customer(someContent)\n" +
" SaveImage() {";
Pattern p = Pattern.compile("Customer\\((.*?)\\)");
Matcher matcher = p.matcher(s);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
will print
someContent
Untested, but something like the following should work:
Pattern pattern = Pattern.compile("\\s+Customer\\(\\s*(\\w+)\\s*\\)\\s*");
Matcher matcher = pattern.matcher(input);
// find() searches anywhere in the input; matches() would require the whole input to match
if (matcher.find()) {
System.out.println(matcher.group(1));
}
EDIT
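This of course won't work in all possible cases. \w matches only letters, digits, and underscores, so a value containing any other character, such as $, is not matched:
// legal variable name that \w+ does not match
Customer($some_Content)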
