Regex matches during look behind - java

I am using the regex below to match strings; I was expecting the following results
Regex ^.*(?<!abc)(?<!def)(?<!ghi).xyz.co.*
Not Match
ghi.xyz.org
ghi-hipqr.xyz.org
abc-hipqr.xyz.org
Match
qrs.xyz.org
qrs-hipqr.xyz.org
However, ghi-hipqr.xyz.org is matching the regex (it shouldn't, since there is a lookbehind for the string ghi, which is present in the string).
How can I fix it?

It is failing because ghi is not immediately before .xyz. in your string: each lookbehind only inspects the three characters directly in front of .xyz. Java (like many other regex engines) doesn't support lookbehind assertions of unbounded length.
You can use this negative lookahead expression instead:
^(?!.*\b(?:abc|def|ghi)\b).*\.xyz\.org.*$
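A minimal sketch of how the lookahead pattern above behaves on the five sample strings from the question. The class and method names (HostFilterDemo, isAllowed) are made up for illustration; only the regex comes from the answer.

```java
import java.util.regex.Pattern;

class HostFilterDemo {
    // Regex from the answer: reject the input if abc, def or ghi occurs
    // anywhere as a separate word, then require ".xyz.org" somewhere.
    static final Pattern ALLOWED =
            Pattern.compile("^(?!.*\\b(?:abc|def|ghi)\\b).*\\.xyz\\.org.*$");

    // Hypothetical helper: true when the whole host string is accepted.
    static boolean isAllowed(String host) {
        return ALLOWED.matcher(host).matches();
    }

    public static void main(String[] args) {
        String[] hosts = {
            "ghi.xyz.org", "ghi-hipqr.xyz.org", "abc-hipqr.xyz.org",
            "qrs.xyz.org", "qrs-hipqr.xyz.org"
        };
        for (String host : hosts) {
            System.out.println(host + " -> " + isAllowed(host));
        }
    }
}
```

The first three hosts are rejected and the last two are accepted, which is the result the question asked for.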

Related

Regex Match Reset \K Equivalent In Java

I have come up with a regex pattern to match part of a JSON value, but only the PCRE engine supports it. I want to know the Java equivalent of this regex.
Simplified version
cif:\K.*(?=(.+?){4})
Matches part of the value, leaving the last 4 characters.
cif:test1234
Matched value will be test
https://regex101.com/r/xV4ZNa/1
Note: I can only define the regex and the replacement text. I don't have access to the Java code since it's handled by a proprietary log masking framework.
You can simplify the pattern to:
(?<=cif:).*(?=....)
Explanation
(?<=cif:) Positive lookbehind, assert cif: to the left
.* Match any character except newlines, 0 or more times
(?=....) Positive lookahead, assert 4 characters (which can include spaces)
See a regex demo.
If you don't want to match empty strings, then you can use .+ instead
(?<=cif:).+(?=....)
You can use a lookbehind assertion instead:
(?<=cif:).*(?=(.+?){4})
Demo: https://regex101.com/r/xV4ZNa/3
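As a rough illustration, the simplified lookbehind pattern can be tried in plain Java like this. The class and helper names (CifMaskDemo, extract, mask) and the "****" replacement text are assumptions for the sketch; the masking framework from the question is not modelled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CifMaskDemo {
    // Simplified pattern from the answer: everything after "cif:"
    // except the last four characters.
    static final Pattern CIF_VALUE = Pattern.compile("(?<=cif:).*(?=....)");

    // Hypothetical helper: returns the matched part, or null if nothing matches.
    static String extract(String input) {
        Matcher m = CIF_VALUE.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: replaces the matched part with a fixed mask.
    static String mask(String input) {
        return CIF_VALUE.matcher(input).replaceFirst("****");
    }

    public static void main(String[] args) {
        System.out.println(extract("cif:test1234")); // test
        System.out.println(mask("cif:test1234"));    // cif:****1234
    }
}
```

With .* the pattern also produces an empty match for an input such as cif:1234, which is the case the .+ variant avoids.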

Regex: match everything the other regex left

I am struggling with the following issue: say there's a regex 1 and there's regex 2 which should match everything the regex 1 does not.
Let's have the regex 1:
/\$\d+/ (i.e. a dollar sign followed by one or more digits).
Given a string like foo$12___bar___$34wilma buzz, it detects $12 and $34.
How should regex 2 look in order to match the remaining parts of that string, i.e. foo, ___bar___ and wilma buzz? In other words, it should pick up all the "remaining" chunks of the source string.
You may use String#split to split on given regex and get remaining substrings in an array:
String[] arr = str.split( "\\$\\d+" );
//=> ["foo", "___bar___", "wilma buzz"]
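A self-contained version of the split approach, as a sketch. The class and method names (SplitDemo, remainder) are illustrative only.

```java
import java.util.Arrays;

class SplitDemo {
    // Split on every "$<digits>" occurrence and keep what lies between.
    static String[] remainder(String input) {
        return input.split("\\$\\d+");
    }

    public static void main(String[] args) {
        String str = "foo$12___bar___$34wilma buzz";
        System.out.println(Arrays.toString(remainder(str)));
        // [foo, ___bar___, wilma buzz]
    }
}
```

One thing to watch for: if the input starts with a match (for example $1abc), the resulting array begins with an empty string.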
It was tricky to get this working, but this regex will match everything besides \$\d+ for you. EDIT: no longer erroneously matches $44$444 or similar.
(?!\$\d+)(.+?)\$\d+|\$\d+|(?!\$\d+)(.+)
Breakdown
(?!\$\d+)(.+?)\$\d+
(?! ) negative lookahead: assert that the text at this position does not match
\$\d+ your pattern - can be replaced with another pattern
(.+?) capture at least one character, as few as possible
\$\d+ match your pattern without capturing it
OR
\$\d+ match one instance of your pattern without capturing it
OR
(?!\$\d+)(.+)
(?!\$\d+) negative lookahead so the match does not start on your pattern
(.+) capture at least one character, as many as possible
GENERIC FORM
(?!<pattern>)(.+?)<pattern>|<pattern>|(?!<pattern>)(.+)
By replacing <pattern>, you can match anything that doesn't match your pattern. The leftover text ends up in capture group 1 or capture group 2, depending on which alternative matched.
Good luck!
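To show how the capture groups of this regex can be read back in Java, here is a small sketch. The class and method names (LeftoverDemo, leftovers) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeftoverDemo {
    // Regex from the answer, with backslashes doubled for a Java string.
    static final Pattern LEFTOVER = Pattern.compile(
            "(?!\\$\\d+)(.+?)\\$\\d+|\\$\\d+|(?!\\$\\d+)(.+)");

    // Collect whatever landed in group 1 or group 2 on each match.
    static List<String> leftovers(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = LEFTOVER.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                parts.add(m.group(1));   // text before a $<digits> token
            } else if (m.group(2) != null) {
                parts.add(m.group(2));   // trailing text after the last token
            }
            // Otherwise the middle alternative matched a bare token: skip it.
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(leftovers("foo$12___bar___$34wilma buzz"));
        // [foo, ___bar___, wilma buzz]
    }
}
```

For an input made only of tokens, such as $44$444, the list stays empty.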
Try this one
[a-zA-Z_]+
Or even better
[^\$\d]+ -> The ^ symbol at the start of a character class negates it, like ! in Java means "not". Note that this excludes every $ and every digit, not only complete $\d+ sequences.

How to match a substring following after a string satisfying the specific pattern

Imagine, that I have the string 12.34some_text.
How can I match the substring that follows the second character (4 in my case) after the . character? In this particular case the string I want to match is some_text.
For the string 56.78another_text it will be another_text and so on.
All accepted strings have the pattern \d\d\.\d\d\w*
If you wish to match everything from the second character after a specific one (i.e. the dot) you can use a lookbehind, like this:
(?<=[.]\d{2})(\w*)
(?<=[.]\d{2}) is a positive lookbehind that asserts a dot [.] followed by two digits \d{2} immediately before the current position; (\w*) then captures the word characters that follow.
Since you are using Java and the given pattern is \d\d\.\d\d\w*, the prefix is always five characters long, so you can get some_text from 12.34some_text by using
String s = "12.34some_text";
String rest = s.substring(5); // drops the fixed-length prefix "12.34"
and then compare the substring as needed.
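Both suggestions side by side, as a quick sketch. The class and method names (SuffixDemo, viaRegex, viaSubstring) are hypothetical, and the substring variant assumes every input really follows \d\d\.\d\d\w*.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SuffixDemo {
    // Lookbehind from the first answer: start right after ".dd".
    static final Pattern AFTER_DECIMALS = Pattern.compile("(?<=[.]\\d{2})(\\w*)");

    // Returns the captured suffix, or null if the input has no ".dd" part.
    static String viaRegex(String s) {
        Matcher m = AFTER_DECIMALS.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    // Relies on the fixed five-character prefix "dd.dd".
    static String viaSubstring(String s) {
        return s.substring(5);
    }

    public static void main(String[] args) {
        System.out.println(viaRegex("12.34some_text"));        // some_text
        System.out.println(viaSubstring("56.78another_text")); // another_text
    }
}
```

The regex version tolerates a prefix of different length before the dot, while the substring version is simpler but only safe for the exact format.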

regular expression to match string of characters only if it is not followed by a specific character

I need a regular expression which will match a string only if it isn't followed by a forward slash (/) character, and I want to match the whole string.
For example below string should match
/Raj/details/002-542545-1145457
but not this one
/Raj/details/002-542545-1145457/show
I tried to use Negative Lookahead to achieve this as specified by this answer.
My code is like this.
pattern = Pattern.compile("/.*/details/(?!.*/)");
matcher = pattern.matcher(text);
if (matcher.matches()) {
    System.out.println("success");
} else {
    System.out.println("failure");
}
It is giving failure. But if I use matcher.find() then it is giving success.
Please help me understand why it is not matching, and suggest a regex that achieves this.
This
^/[^/]+/details/[^/]+$
will match
/Raj/details/002-542545-1145457
but not
/Raj/details/002-542545-1145457/show
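A short check of that anchored pattern against the two sample paths. The class and method names (DetailsPathDemo, isDetailsPath) are placeholders.

```java
import java.util.regex.Pattern;

class DetailsPathDemo {
    // One segment, then "details", then exactly one more segment.
    static final Pattern DETAILS = Pattern.compile("^/[^/]+/details/[^/]+$");

    static boolean isDetailsPath(String path) {
        return DETAILS.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDetailsPath("/Raj/details/002-542545-1145457"));      // true
        System.out.println(isDetailsPath("/Raj/details/002-542545-1145457/show")); // false
    }
}
```

Because [^/]+ cannot cross a slash, any extra path segment makes the match fail without needing a lookahead.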
I think your regex is not doing what you are expecting it to do. You are missing a part of the expression that captures the numerical data after /details/.
Your regex is positive for .find() because there is a match inside the string for your current expression, but the string does not match the expression entirely which is why .matches() doesn't work.
The lookahead (?!.*/) in your current expression is zero-width: it checks what follows but consumes no characters, so the expression stops consuming at /details/. The lookahead rejects the input if there is a / after /details/, but the characters after /details/ (in your examples, the numerical data) are never consumed. That is why .matches() fails even though .find() still finds a match.
If you want it to match the whole string up to and including the numbers but nothing afterwards, the following regex should work: /.*/details/[0-9\-]*(?!.*/) - with that both .find() and .matches() will return positive, as the expression is now matching everything up to the potential /.
Your regex looks OK. matches() only succeeds if the whole String matches, while find() looks for the next matching substring inside your String.
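The difference between matches() and find() can be seen with a small experiment. This is only a sketch: the class name MatchesVsFindDemo and the constant names are made up, while both regexes are taken from the question and the answers above.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    // Pattern from the question: stops consuming at "/details/".
    static final Pattern ORIGINAL = Pattern.compile("/.*/details/(?!.*/)");
    // Variant from the answer: also consumes the digits and dashes.
    static final Pattern EXTENDED = Pattern.compile("/.*/details/[0-9\\-]*(?!.*/)");

    static boolean whole(Pattern p, String s) {
        return p.matcher(s).matches();   // entire input must match
    }

    static boolean partial(Pattern p, String s) {
        return p.matcher(s).find();      // any matching substring is enough
    }

    public static void main(String[] args) {
        String ok = "/Raj/details/002-542545-1145457";
        String bad = "/Raj/details/002-542545-1145457/show";
        System.out.println(whole(ORIGINAL, ok));    // false
        System.out.println(partial(ORIGINAL, ok));  // true
        System.out.println(whole(EXTENDED, ok));    // true
        System.out.println(whole(EXTENDED, bad));   // false
        System.out.println(partial(EXTENDED, bad)); // false
    }
}
```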

Java String validation only one alphanumeric with Regex

I want to validate a String which can contain only alphanumeric characters and a single special character. I tried (\\W).{1,1}(\\w+).
But it is true only when I start with a special character. But I can have one special character at any place in String.
Use the ^ and $ anchors to instruct the regex engine to start matching from the beginning of the string and stop matching at the end of the string, so taking your regex:
^(\\W).{1,1}(\\w+)$
Please take a look at this Oracle (Java) tutorial on regular expressions.
Try this regexp: \w*\W?\w* (Java string: "\\w*\\W?\\w*")
This expression has a drawback of matching zero-length strings. If your input must have exactly one special character, remove the question mark ? from the expression.
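A minimal sketch of both variants of this expression, checked with matches(). The class and method names (OneSpecialCharDemo, atMostOne, exactlyOne) are assumptions for illustration. Keep in mind that \w also accepts the underscore.

```java
import java.util.regex.Pattern;

class OneSpecialCharDemo {
    // Word characters with at most one non-word character anywhere.
    static final Pattern AT_MOST_ONE = Pattern.compile("\\w*\\W?\\w*");
    // Same, but the non-word character is mandatory.
    static final Pattern EXACTLY_ONE = Pattern.compile("\\w*\\W\\w*");

    static boolean atMostOne(String s) {
        return AT_MOST_ONE.matcher(s).matches();
    }

    static boolean exactlyOne(String s) {
        return EXACTLY_ONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(atMostOne("ab-c"));   // true
        System.out.println(atMostOne("a-b-c"));  // false
        System.out.println(atMostOne(""));       // true (the drawback noted above)
        System.out.println(exactlyOne("abc"));   // false
        System.out.println(exactlyOne("ab#c"));  // true
    }
}
```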
Use matcher.find() rather than matcher.matches(), search for \\w, and remove the plus (+), because \\w+ matches a whole run of alphanumeric characters in your string. If your string contains only such characters, your regex will match the whole string.
If I understand your regex correctly, this could solve your problem:
([\w]+)([^\w])([\w]+)
