Regex with Whitespace

I am trying to write a regex to match the following:
act=MATCHME
act=Match me too
I have the following regexes, each of which matches one of them but not both. Here is my effort:
matches MATCHME: act=(\w+)
matches Match me too: (\w+\s\w+\s\w+)
Is there any way I can combine the two with OR, or am I looking at this wrong?
I am using the Java regex engine.

You may use an optional non-capturing group:
act=(\w+(?:\s+\w+\s+\w+)?)
        ^^^^^^^^^^^^^^^^^
The ? matches 1 or 0 occurrences of the quantified subpattern. When it is applied to a grouping construct, the quantification is applied to the whole pattern sequence, so (?:\s+\w+\s+\w+)? matches 1 or 0 sequences of 1+ whitespaces, 1+ word chars, 1+ whitespaces and again 1+ word chars.
You may split the pattern into further capturing groups if you need to capture the two words after act= separately.
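A minimal Java sketch of the optional non-capturing group described above, assuming each input string holds a single act= entry. The class name ActDemo and the helper extractAct are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ActDemo {
    // One word, optionally followed by exactly two more whitespace-separated words.
    private static final Pattern ACT =
            Pattern.compile("act=(\\w+(?:\\s+\\w+\\s+\\w+)?)");

    // Returns the value captured after "act=", or null when nothing matches.
    static String extractAct(String input) {
        Matcher m = ACT.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractAct("act=MATCHME"));
        System.out.println(extractAct("act=Match me too"));
    }
}
```

Note that a two-word value such as act=Match me yields only the first word, because the optional group requires two additional words.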

Surely you know how to compose regular expressions by alternation.

This regular expression may help you
^[a-zA-Z ]*$

Related

Regex Match Reset \K Equivalent In Java

I have come up with a regex pattern to match part of a JSON value, but only the PCRE engine supports it. I want to know the Java equivalent of this regex.
Simplified version
cif:\K.*(?=(.+?){4})
Matches part of the value, leaving the last 4 characters.
cif:test1234
Matched value will be test
https://regex101.com/r/xV4ZNa/1
Note: I can only define the regex and the replacement text. I don't have access to the Java code since it's handled by a proprietary log masking framework.

You can simplify the pattern to:
(?<=cif:).*(?=....)
Explanation
(?<=cif:) Positive lookbehind, assert cif: to the left
.* Match any character except newlines, 0+ times
(?=....) Positive lookahead, assert 4 characters (which can include spaces)
If you don't want to match empty strings, then you can use .+ instead
(?<=cif:).+(?=....)
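To check how the lookbehind version behaves in Java, here is a small sketch. It assumes the masking framework applies the pattern in the same way as String#replaceAll, which the question does not confirm; the class name CifMaskDemo, the helper mask, and the replacement text **** are hypothetical.

```java
class CifMaskDemo {
    // Replaces everything after "cif:" except the last four characters of the line.
    static String mask(String input) {
        return input.replaceAll("(?<=cif:).+(?=....)", "****");
    }

    public static void main(String[] args) {
        System.out.println(mask("cif:test1234"));
    }
}
```

Values shorter than five characters are left unchanged, because .+ needs at least one character in front of the four that the lookahead requires.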

You can use a lookbehind assertion instead:
(?<=cif:).*(?=(.+?){4})
Demo: https://regex101.com/r/xV4ZNa/3

Regex: match everything the other regex left

I am struggling with the following issue: say there's a regex 1 and there's regex 2 which should match everything the regex 1 does not.
Let's have the regex 1:
/\$\d+/ (i.e. a dollar sign followed by one or more digits).
Given a string like foo$12___bar___$34wilma buzz, it detects $12 and $34.
What should regex 2 look like in order to match the remaining parts of that string, i.e. foo, ___bar___ and wilma buzz? In other words, it should pick up all the "remaining" chunks of the source string.

You may use String#split to split on given regex and get remaining substrings in an array:
String[] arr = str.split( "\\$\\d+" );
//=> ["foo", "___bar___", "wilma buzz"]
RegEx Demo
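A runnable sketch of the split approach. One detail worth knowing: when the input starts with a match, String#split returns an empty leading string, so this version filters empty pieces out. The class name SplitDemo and the helper remainder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Returns the chunks of the input that lie between matches of \$\d+.
    static List<String> remainder(String input) {
        List<String> parts = new ArrayList<>();
        for (String part : input.split("\\$\\d+")) {
            if (!part.isEmpty()) {   // skip the empty piece before a leading match
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(remainder("foo$12___bar___$34wilma buzz"));
    }
}
```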

It was tricky to get this working, but this regex will match everything besides \$\d+ for you. EDIT: no longer erroneously matches $44$444 or similar.
(?!\$\d+)(.+?)\$\d+|\$\d+|(?!\$\d+)(.+)
Breakdown
(?!\$\d+)(.+?)\$\d+
(?! ) negative lookahead: assert that the following text does not match
\$\d+ your pattern - can be replaced with another pattern
(.+?) capture at least one character, as few as possible
\$\d+ match your pattern without capturing it
OR
\$\d+ match one instance of your pattern without capturing it
OR
(?!\$\d+)(.+)
(?!\$\d+) negative lookahead so that the chunk does not start with your pattern
(.+) capture at least one character, as many as possible
GENERIC FORM
(?!<pattern>)(.+?)<pattern>|<pattern>|(?!<pattern>)(.+)
By replacing <pattern>, you can match anything that doesn't match your pattern. The unmatched chunks end up in capture groups 1 and 2.
Good luck!
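A sketch of how the generic form could be used from Java, reading the unmatched chunks out of capture groups 1 and 2. The class name ComplementDemo and the helper unmatched are hypothetical, and the behaviour has only been traced for simple inputs like the one in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ComplementDemo {
    // Generic form with <pattern> replaced by \$\d+.
    private static final Pattern REST = Pattern.compile(
            "(?!\\$\\d+)(.+?)\\$\\d+|\\$\\d+|(?!\\$\\d+)(.+)");

    // Collects every chunk that is not part of a \$\d+ match.
    static List<String> unmatched(String input) {
        List<String> chunks = new ArrayList<>();
        Matcher m = REST.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                chunks.add(m.group(1));   // chunk followed by a pattern match
            } else if (m.group(2) != null) {
                chunks.add(m.group(2));   // trailing chunk after the last match
            }
            // A bare pattern match (second alternative) sets neither group.
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(unmatched("foo$12___bar___$34wilma buzz"));
    }
}
```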

Try this one
[a-zA-Z_]+
Or even better
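[^\$\d]+ -> With the ^ symbol at the start of a character class you can negate the set, much like ! means "not" in Java. Note that this excludes every $ and every digit individually, not only the sequence $ followed by digits.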

Java regular expressions for specific name\value format

I'm not yet familiar with Java regular expressions. I want to validate a string that has the following format:
String INPUT = "[name1 value1];[name2 value2];[name3 value3];";
Each name and value is a string that may contain any characters except whitespace.
I tried with this expression:
String REGEX = "([\\S*\\s\\S*];)*";
But if I call matches() I always get false, even for a valid string.
What's the best regular expression for it?

This does the trick:
(?:\[\w.*?\s\w.*?\];)*
If you want to only match three of these, replace the * at the end with {3}.
Explanation:
(?:: Start of non-capturing group
\[: Escapes the [ sign, which is a meta-character in regex. This allows it to be matched literally.
\w.*?: Matches one word character ([a-zA-Z0-9_]) and then lazily matches any further characters. Lazy matching means it attempts to match as few characters as possible, in this case meaning that it will stop matching once it finds the following \s.
\s: Matches one whitespace
\]: See \[
;: Matches one semicolon
): End of non-capturing group
*: Matches any number of what is contained in the preceding non-capturing group.

You should escape the square brackets. Also, if your aim is to match exactly three, replace * with {3}. As a Java string literal:
"(\\[\\S*\\s\\S*\\];){3}"
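A minimal sketch of the validation with escaped brackets, assuming names and values must be non-empty (hence \S+ rather than \S*) and that any number of entries is allowed. The class name NameValueDemo and the helper isValid are hypothetical.

```java
import java.util.regex.Pattern;

class NameValueDemo {
    // Zero or more entries of the form [name value]; with no whitespace inside name or value.
    private static final Pattern FORMAT =
            Pattern.compile("(?:\\[\\S+\\s\\S+\\];)*");

    // matches() requires the whole input to fit the pattern.
    static boolean isValid(String input) {
        return FORMAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("[name1 value1];[name2 value2];[name3 value3];"));
        System.out.println(isValid("[name1value1];"));
    }
}
```

Because of the trailing *, the empty string is also accepted; replace * with + or {3} if at least one or exactly three entries are required.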

Regular Expressions \w character class and equals sign

I am creating a regular expression to match the string
#servername:port:databasename
and through https://regex101.com/ I came up with
\#(((\w+.*-*)+)?\w+)(:\d+)(:\w+)
which matches
e.g. #CORA-PC:1111:databasename or #111.111.1.111:111:databasename
However when I use this regular expression to pattern match in my java code the String #CORA-PC:1111:database=name is also matched.
Why is \w matching the = equals sign? I also tried [0-9a-zA-Z] but it also matched the = equals sign?
Can anyone help me with this?
Thanks!

The .* is a greedy dot-matching subpattern that matches the whole line and then backtracks to accommodate the subsequent subpatterns. That is why the pattern can match a = symbol (Group 3 matches the part containing =).
Your pattern is rather fragile, as the first part contains nested quantifiers with optional subpatterns that slow down regex execution and cause other issues. You need to make it more linear.
#(\w+(?:[-.]\w+)*)?(:\d+)(:\w+)
The regex will match
# - # symbol
(\w+(?:[-.]\w+)*)? - an optional group matching
\w+ - 1+ word chars
(?:[-.]\w+)* - 0+ sequences of a - or . ([-.]) followed with 1+ word chars
(:\d+) - a : symbol followed with 1+ digits
(:\w+) - a : symbol followed with 1+ word chars
If you need to avoid partial matching, use String#matches(), which requires the whole string to match.
NOTE: In Java, backslashes must be doubled.
Code example (Java):
String s = "#CORA-PC:1111:databasename";
String rx = "#(?:\\w+(?:[-.]\\w+)*)?:\\d+:\\w+";
System.out.println(s.matches(rx));
Code example (JS):
var str = '#CORA-PC:1111:databasename';
alert(/^#(?:\w+(?:[-.]\w+)*)?:\d+:\w+$/.test(str));
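The difference between a partial match and a whole-string match can be sketched in Java as follows. The class name ServerRefDemo and both helper names are hypothetical; the pattern is the one from the Java example above.

```java
import java.util.regex.Pattern;

class ServerRefDemo {
    private static final Pattern REF =
            Pattern.compile("#(?:\\w+(?:[-.]\\w+)*)?:\\d+:\\w+");

    // find() succeeds if any substring fits the pattern.
    static boolean foundAnywhere(String s) {
        return REF.matcher(s).find();
    }

    // matches() succeeds only if the entire string fits the pattern.
    static boolean matchesWhole(String s) {
        return REF.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "#CORA-PC:1111:database=name";
        System.out.println(foundAnywhere(s));
        System.out.println(matchesWhole(s));
    }
}
```

With find(), the input containing = still reports a hit because the prefix up to database fits the pattern; matches() rejects it.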

How to reject repetition of character within Java regular expression

I am looking for help with a Java regular expression please.
My regular expression should accept a string of length 5 only, with characters matching [BDILMOP] only.
No repeated characters are allowed - eg. BDILM is allowed, but BDILL or BDLLL are not.
Please help - I'm new to regex and so would appreciate any advice that you could throw my way.
Thanks!

You can use this negative lookahead based regex:
^(?!.*(.).*\1)[BDILMOP]{5}$
(?!.*(.).*\1) is a negative lookahead which fails the match if any character is repeated in the input. (.) captures a character in group #1 and \1 is a back-reference to the same group, thus checking for repetition.
RegEx Demo
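A short Java sketch that exercises this pattern against the examples from the question. The class name UniqueLettersDemo and the helper isValid are hypothetical.

```java
import java.util.regex.Pattern;

class UniqueLettersDemo {
    // Exactly five characters from BDILMOP, with no character repeated.
    private static final Pattern CODE =
            Pattern.compile("^(?!.*(.).*\\1)[BDILMOP]{5}$");

    static boolean isValid(String s) {
        return CODE.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"BDILM", "BDILL", "BDLLL"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```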
