Java Regular Expression Match on +/-/*/% - java

So I have a string I would like to parse and I can not get my regular expression to work. I am using https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp as my regular expression guide.
I would like my regular expression to match on any of the following symbols.
+ - * % /
My code as follows. Input String: D[1]+D[0]. Should print true...but prints false.
String tmp = "D[1]+D[0]";
if(tmp.matches("[\\+\\-\\*\\/\\%]"))
System.out.println("true");
else
System.out.println("false");
Any ideas?

This is because matches wants the entire string to be matched, not just any part of it.
Most characters do not need escaping inside square brackets, but a hyphen between two characters defines a range, so put it first or last.
String str = "D[1]+D[0]";
Pattern p = Pattern.compile("[-+*/%]");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Found: " + m.group());
}

matches() must match the entire input, but all you need do is add .* to each end:
if (tmp.matches(".*[-+*/%].*"))
Note: None of these characters needs escaping inside [] as long as the hyphen comes first or last; anywhere else it would define a range.
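To put the two answers side by side, here is a minimal sketch that assumes the goal is only to detect whether any of + - * / % occurs in the input. The class and method names (OperatorCheck, containsOperator) are made up for illustration.

```java
import java.util.regex.Pattern;

class OperatorCheck {
    // The hyphen comes first in the class, so it is a literal and not a range operator.
    private static final Pattern OPERATOR = Pattern.compile("[-+*/%]");

    // find() succeeds if any part of the input matches the pattern.
    static boolean containsOperator(String input) {
        return OPERATOR.matcher(input).find();
    }

    public static void main(String[] args) {
        String tmp = "D[1]+D[0]";
        System.out.println(tmp.matches("[-+*/%]"));     // false: the whole string is not a single operator
        System.out.println(tmp.matches(".*[-+*/%].*")); // true: .* absorbs the rest of the string
        System.out.println(containsOperator(tmp));      // true
    }
}
```

Compiling the pattern once and calling find() avoids the leading and trailing .* and is probably the easier form to reuse if the check runs many times.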

Related

What is wrong in regexp in Java

I want to get the word text2, but it returns null. Could you please correct it ?
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR\\w+&&(\\w+)'\\)\\)");
Matcher matcher = patter1.matcher(str);
String result = null;
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
One way to do it is to match everything inside the parentheses explicitly:
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR[(]{2}&&\\w+\\s*'&&(\\w+)'[)]{2}");
Matcher matcher = patter1.matcher(str);
String result = "";
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
You can also use [^()]* inside the parentheses to just get to the value inside single apostrophes:
Pattern patter1 = Pattern.compile("SETVAR[(]{2}[^()]*'&&(\\w+)'[)]{2}");
Let me break down the regex for you:
SETVAR - match SETVAR literally, then...
[(]{2} - match 2 ( literally, then...
[^()]* - match 0 or more characters other than ( or ) up to...
'&& - match a single apostrophe and two & symbols, then...
(\\w+) - match and capture into Group 1 one or more word characters
'[)]{2} - match a single apostrophe and then 2 ) symbols literally.
Your regex doesn't match your string because it never accounts for the opening parentheses. Also, \\w+ matches only word characters, so it cannot match the spaces, & or ' that sit between SETVAR and text2.
Instead you can use a negated character class, [^']+, which matches one or more characters other than a single quote:
String str = "Text SETVAR((&&text1 '&&text2'))";
"SETVAR\\(\\([^']+'&&(\\w+)'\\)\\)"
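For reference, the [^()]* variant from the answers above can be wrapped in a small helper. This is only a sketch: the class name SetVarExtract and the method quotedName are hypothetical, and it assumes the input always uses the SETVAR((... '&&name')) shape shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SetVarExtract {
    // SETVAR, two literal "(", anything that is not a parenthesis,
    // then '&&name' (name is captured) and two literal ")".
    private static final Pattern SETVAR =
            Pattern.compile("SETVAR[(]{2}[^()]*'&&(\\w+)'[)]{2}");

    // Returns the captured name, or null when the input does not have that shape.
    static String quotedName(String input) {
        Matcher m = SETVAR.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(quotedName("Text SETVAR((&&text1 '&&text2'))")); // text2
        System.out.println(quotedName("Text SETVAR(&&text1)"));             // null
    }
}
```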

Java Regular Expression: match any number of digits in round brackets if the closing bracket is the last char in the String

I need some help to save my day (or my night). I would like to match:
Any number of digits
Enclosed by round brackets "()" [The brackets contain nothing else than digits]
If the closing bracket ")" is the last character in the String.
Here's the code I have come up with:
// this how the text looks, the part I want to match are the digits in the brackets at the end of it
String text = "Some text 45 Some text, text and text (1234)";
String regex = "[no idea how to express this.....]"; // this is where the regex should be
Pattern regPat = Pattern.compile(regex);
Matcher matcher = regPat.matcher(text);
String matchedText = "";
if (matcher.find()) {
matchedText = matcher.group();
}
Please help me out with the magic expression. I have only managed to match any number of digits, but not when they are enclosed in brackets and sit at the end of the line.
Thanks!
You can try this regex:
String regex = "\\(\\d+\\)$";
If you need to extract just the digits, you can use this regex:
String regex = "\\((\\d+)\\)$";
and get the value of matcher.group(1). (Explanation: The ( and ) characters preceded by backslashes match the round brackets literally; the ( and ) characters not preceded by backslashes tell the matcher that the part inside, i.e. just the digits, forms a capture group, and the part matching the group can be obtained by matcher.group(1), since this is the first, and only, capture group in the regex.)
This is the required regex for your condition
\\(\\d+\\)$
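A short sketch of the capture-group version, assuming you want the digits back as a String and an empty string when there is no match. The names TrailingDigits and trailingNumber are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingDigits {
    // "(" digits ")" anchored at the end of the input; group 1 holds the digits only.
    private static final Pattern TRAILING = Pattern.compile("\\((\\d+)\\)$");

    // Returns the digits inside the final pair of brackets, or "" if the input does not end that way.
    static String trailingNumber(String text) {
        Matcher m = TRAILING.matcher(text);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(trailingNumber("Some text 45 Some text, text and text (1234)")); // 1234
        System.out.println(trailingNumber("Some text (1234) and more text").isEmpty());     // true
    }
}
```

One thing to be aware of: in Java, $ also matches just before a trailing line terminator, so \\z in place of $ may be the stricter choice if the closing bracket must be the very last character.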

Iterating through String with .find() in Java regex

I'm currently trying to solve a problem from codingbat.com with regular expressions.
I'm new to this, so step-by-step explanations would be appreciated. I could solve this with String methods relatively easily, but I am trying to use regular expressions.
Here is the prompt:
Given a string and a non-empty word string, return a string made of each char just before and just after every appearance of the word in the string. Ignore cases where there is no char before or after the word, and a char may be included twice if it is between two words.
wordEnds("abcXY123XYijk", "XY") → "c13i"
wordEnds("XY123XY", "XY") → "13"
wordEnds("XY1XY", "XY") → "11"
etc
My code thus far:
String regex = ".?" + word+ ".?";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
String newStr = "";
while(m.find())
newStr += m.group().replace(word, "");
return newStr;
The problem is that when there are multiple instances of word in a row, the program misses the character preceding the word because m.find() progresses beyond it.
For example: wordEnds("abc1xyz1i1j", "1") should return "cxziij", but my method returns "cxzij", not repeating the "i"
I would appreciate a non-messy solution with an explanation I can apply to other general regex problems.
This is a one-liner solution:
String wordEnds = input.replaceAll(".*?(.)" + word + "(?:(?=(.)" + word + ")|(.).*?(?=$|." + word + "))", "$1$2$3");
This matches your edge case as a look ahead within a non-capturing group, then matches the usual (consuming) case.
Note that your requirements don't require iteration, only your question title assumes it's necessary, which it isn't.
Note also that to be absolutely safe, you should escape all characters in word in case any of them are special "regex" characters, so if you can't guarantee that, you need to use Pattern.quote(word) instead of word.
Here's a test of the usual case and the edge case, showing it works:
public static String wordEnds(String input, String word) {
word = Pattern.quote(word); // add this line to be 100% safe
return input.replaceAll(".*?(.)" + word + "(?:(?=(.)" + word + ")|(.).*?(?=$|." + word + "))", "$1$2$3");
}
public static void main(String[] args) {
System.out.println(wordEnds("abcXY123XYijk", "XY"));
System.out.println(wordEnds("abc1xyz1i1j", "1"));
}
Output:
c13i
cxziij
Use a positive lookbehind and a positive lookahead, which are zero-width assertions:
(?<=(.)|^)1(?=(.)|$)
(?<=(.)|^) - looks for a character just before 1 and captures it in group 1. This is a zero-width assertion: it does not consume any input, it is only a test, which is what lets us capture the value without moving forward.
1 - matches 1; you can replace it with any word.
(?=(.)|$) - looks for a character after 1 and captures it in group 2.
$1 and $2 contain your values. Keep calling find() until the end of the input.
So this should be like
String s1 = "abcXY123XYiXYjk";
String s2 = java.util.regex.Pattern.quote("XY");
String s3 = "";
String r = "(?<=(.)|^)"+s2+"(?=(.)|$)";
Pattern p = Pattern.compile(r);
Matcher m = p.matcher(s1);
while(m.find()) s3 += m.group(1)+m.group(2);
//s3 now contains c13iij
Use a regex as follows, where a is the input string, b is the word, and c is a String that accumulates the result:
Matcher m = Pattern.compile("(.|)" + Pattern.quote(b) + "(?=(.?))").matcher(a);
while (m.find()) c += m.group(1) + m.group(2);
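The lookaround approach can be written as a complete method. The sketch below is one possible version, not the only one. It adds null checks, because a group that sits in the untaken branch of (.)|^ or (.)|$ is null when the word starts or ends the string, and concatenating it would append the text "null". The class name WordEnds is arbitrary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordEnds {
    static String wordEnds(String str, String word) {
        // Pattern.quote treats word as literal text. The neighbours are captured inside
        // lookarounds, so they are not consumed and can be reused by the next match.
        Pattern p = Pattern.compile("(?<=(.)|^)" + Pattern.quote(word) + "(?=(.)|$)");
        Matcher m = p.matcher(str);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            if (m.group(1) != null) out.append(m.group(1)); // null when the word starts the string
            if (m.group(2) != null) out.append(m.group(2)); // null when the word ends the string
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wordEnds("abcXY123XYijk", "XY")); // c13i
        System.out.println(wordEnds("XY1XY", "XY"));         // 11
        System.out.println(wordEnds("abc1xyz1i1j", "1"));    // cxziij
    }
}
```

Because only the word itself is consumed, the character between two adjacent occurrences is seen twice: once by the lookahead of the first match and once by the lookbehind of the second.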

Java String matches and replaceAll differ in matching parentheses

I have strings with parentheses and also escaped characters. I need to match against these characters and also delete them. In the following code, I use matches() and replaceAll() with the same regex, but the matches() returns false, while the replaceAll() seems to match just fine, because the replaceAll() executes and removes the characters. Can someone explain?
String input = "(aaaa)\\b";
boolean matchResult = input.matches("\\(|\\)|\\\\[a-z]+");
System.out.printf("matchResult=%s\n", matchResult);
String output = input.replaceAll("\\(|\\)|\\\\[a-z]+", "");
System.out.printf("INPUT: %s --> OUTPUT: %s\n", input, output);
Prints out:
matchResult=false
INPUT: (aaaa)\b --> OUTPUT: aaaa
matches matches the whole input, not part of it.
The regular expression \(|\)|\\[a-z]+ doesn't describe the whole word, but only parts of it, so in your case it fails.
What matches is doing has already been explained by Binyamin Sharet. I want to extend this a bit.
Java does not have a "findall" method or a "g" modifier, as some other languages do, to get all matches at once (Java 9 added Matcher.results(), which streams them).
The Java Matcher class has two main methods for applying a pattern to a string without replacing anything:
matches(): matches the whole string against the pattern
find(): returns the next match
If you want to get everything that fits your pattern, you need to use find() in a loop, something like this:
Pattern p = Pattern.compile("\\(|\\)|\\\\[a-z]+");
Matcher m = p.matcher(input);
while(m.find()){
System.out.println(m.group(0));
}
or, if you only want to know whether your pattern occurs in the string:
if (m.find()) {
System.out.println(m.group());
} else {
System.out.println("not found");
}
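To see the three behaviours next to each other, here is a small self-contained sketch using the regex and input from the question. The class name AllMatches and the helper findAll are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatches {
    private static final Pattern TOKEN = Pattern.compile("\\(|\\)|\\\\[a-z]+");

    // Collects every substring matched by the pattern, in order of appearance.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "(aaaa)\\b"; // the eight characters ( a a a a ) \ b
        System.out.println(TOKEN.matcher(input).matches());      // false: the whole string is not one token
        System.out.println(findAll(input));                      // [(, ), \b]
        System.out.println(TOKEN.matcher(input).replaceAll("")); // aaaa
    }
}
```

replaceAll() behaves like the find() loop: it replaces each partial match in turn, which is why it removes the characters even though matches() returns false.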

Regular expression to match unescaped special characters only

I'm trying to come up with a regular expression that can match only characters not preceded by a special escape sequence in a string.
For instance, in the string Is ? stranded//? I want to be able to replace the ? that hasn't been escaped with another string, so that I get this result: Is Dave stranded?
But for the life of me I have not been able to figure out a way. I have only come up with regular expressions that eat all the replaceable characters.
How do you construct a regular expression that matches only characters not preceded by an escape sequence?
Use a negative lookbehind, it's what they were designed to do!
(?<!//)[?]
To break it down:
(?<!//) - the negative lookbehind. It checks that the two slashes do not appear immediately before the current position.
[\?] - your special character list (the backslash is optional inside a character class).
The match proceeds only if the // is not found immediately before the character.
I think in Java it will need to be escaped again as a string something like:
Pattern p = Pattern.compile("(?<!//)[\\?]");
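As a rough sketch of how the lookbehind could be used end to end, assuming // is the only escape sequence and that the escape markers should be dropped from the result. The names UnescapedReplace and fill are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapedReplace {
    // A "?" that is not immediately preceded by the two-character escape "//".
    private static final Pattern UNESCAPED = Pattern.compile("(?<!//)[?]");

    static String fill(String template, String value) {
        // quoteReplacement keeps "$" and "\" in value from being read as group references.
        String filled = UNESCAPED.matcher(template).replaceAll(Matcher.quoteReplacement(value));
        // Drop the escape marker from the question marks that were protected.
        return filled.replace("//?", "?");
    }

    public static void main(String[] args) {
        System.out.println(fill("Is ? stranded//?", "Dave")); // Is Dave stranded?
    }
}
```

Note that the last step would also rewrite a //? that happens to occur inside the replacement value itself, so this only holds up if the values never contain the escape sequence.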
Try this Java code:
String str = "Is ? stranded//?";
Pattern p = Pattern.compile("(?<!//)([?])");
Matcher m = p.matcher(str);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group(1).replace("?", "Dave"));
}
m.appendTail(sb);
String s = sb.toString().replace("//", "");
System.out.println("Output: " + s);
OUTPUT
Output: Is Dave stranded?
I was thinking about this and have a second, simpler solution that avoids regexes. The other answers are probably better, but I thought I might post it anyway.
String input = "Is ? stranded//?";
String output = input
.replace("//?", "a717efbc-84a9-46bf-b1be-8a9fb714fce8")
.replace("?", "Dave")
.replace("a717efbc-84a9-46bf-b1be-8a9fb714fce8", "?");
Just protect the "//?" by replacing it with something unique (like a guid). Then you know any remaining question marks are fair game.
Use grouping. Here's one example:
import java.util.regex.*;
class Test {
public static void main(String[] args) {
Pattern p = Pattern.compile("([^/][^/])(\\?)");
String s = "Is ? stranded//?";
Matcher m = p.matcher(s);
String result = s;
if (m.find()) // find() looks for the pattern anywhere in s; matches() would require all of s to match
result = m.replaceAll("$1XXX").replace("//", "");
System.out.println(s + " -> " + result);
}
}
Output:
$ java Test
Is ? stranded//? -> Is XXX stranded?
In this example, I'm:
first replacing any non-escaped ? with "XXX",
then, removing the "//" escape sequences.
EDIT Use if (m.find()) to ensure that you handle non-matching strings properly.
This is just a quick-and-dirty example. You need to flesh it out, obviously, to make it more robust. But it gets the general idea across.
Match on a set of characters OTHER than an escape sequence, then a regex special character. You could use an inverted character class ([^/]) for the first bit. Special case an unescaped regex character at the front of the string.
String aString = "Is ? stranded//?";
String regex = "(?<!//)[^a-z^A-Z^\\s^/]";
System.out.println(aString.replaceAll(regex, "Dave"));
The [^a-z^A-Z^\\s^/] part of the regular expression matches any character that is not a letter, whitespace, a forward slash, or a caret.
The (?<!//) part is a negative lookbehind.
This gives the output Is Dave stranded//?
try matching:
(^|(^.)|(.[^/])|([^/].))[special characters list]
I used this one:
((?:^|[^\\])(?:\\\\)*[ESCAPABLE CHARACTERS HERE])
Demo: https://regex101.com/r/zH1zO3/4
