Regular expression for both length with whitespaces - java

I am trying to write a regular expression with the following conditions.
Allow empty at any position in string.
First three are characters-range (1-3)
Next six are numeric (must) -range (6)
Next optional to have characters - range (1-3)
After that optional to have numeric - range(0-2)
I tried a lot of things for this, but nothing works.
^[a-zA-Z]{1,3}[0-9]{6}[a-zA-Z]{0,3}[0-9]{0,2}
This expression works fine for matching all criteria but it is not allowing empty strings. Thanks in advance.
I just want to validate the string like "AB 123456 ADF 12".
As I mentioned in the first point, the string may contain spaces at any position, like "AB 123 456 ADF 12".

You have to wrap your pattern in parentheses and make it optional using ?:
^(?:[a-zA-Z]{1,3}[0-9]{6}[a-zA-Z]{0,3}[0-9]{0,2})?$
^ Assert beginning of string
(?: Start of non-capturing group
[a-zA-Z]{1,3}[0-9]{6}[a-zA-Z]{0,3}[0-9]{0,2} Your pattern
)? End of NCG, optional
$ Assert end of string
If you also want to match strings that consist only of whitespace characters, add \s* as an alternative (written \\s* in a Java string literal) and drop the trailing ? quantifier:
^(?:[a-zA-Z]{1,3}[0-9]{6}[a-zA-Z]{0,3}[0-9]{0,2}|\s*)$
Live demo
Update
Based on comment:
^(?:[a-zA-Z](?:\s*[a-zA-Z]){0,2}\s*\d(?:\s*\d){5}(?:\s*[a-zA-Z](?:\s*[a-zA-Z]){0,2})?\s*(?:\d\s*\d?)?)$
Live demo
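A minimal Java sketch of how the whitespace-tolerant pattern above could be used for validation. The class and method names (CodeValidator, isValid) are made up for illustration, and the sketch assumes the whole input has to match.

```java
import java.util.regex.Pattern;

class CodeValidator {
    // Whitespace-tolerant pattern from the update above, escaped for a Java string literal.
    private static final Pattern CODE = Pattern.compile(
        "^(?:[a-zA-Z](?:\\s*[a-zA-Z]){0,2}\\s*\\d(?:\\s*\\d){5}"
        + "(?:\\s*[a-zA-Z](?:\\s*[a-zA-Z]){0,2})?\\s*(?:\\d\\s*\\d?)?)$");

    // Hypothetical helper: true if the whole input fits the pattern.
    static boolean isValid(String input) {
        return CODE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("AB 123456 ADF 12"));  // true
        System.out.println(isValid("AB 123 456 ADF 12")); // true
        System.out.println(isValid("AB 12345"));          // false: only five digits
    }
}
```

Note that, unlike the earlier patterns, this version requires at least one leading letter, so it does not accept the empty string.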

Related

Regex pattern matching with multiple strings

Forgive me, I am not very familiar with regex patterns.
I have created a regex pattern as below.
String regex = Pattern.quote(value) + ", [NnoneOoff0-9\\-\\+\\/]+|[NnoneOoff0-9\\-\\+\\/]+, "
+ Pattern.quote(value);
This regex pattern behaves differently for 2 sets of strings.
value = "207e/160";
Use Case 1 -
When channelStr = "207e/160, 149/80"
Then channelStr.matches(regex), returns "true".
Use Case 2 -
When channelStr = "207e/160, 149/80, 11"
Then channelStr.matches(regex), returns "false".
I am not able to figure out why. As far as I can understand, it may be because of the multiple spaces involved when more than 2 comma-separated strings are present.
I am not sure what the correct pattern is for more than 2 strings.
Any help will be appreciated.
If you print your pattern, it is:
\Q207e/160\E, [NnoneOoff0-9\-\+\/]+|[NnoneOoff0-9\-\+\/]+, \Q207e/160\E
It consists of an alternation (|) whose left side and right side each require exactly one comma.
matches() has to match the whole string, which is the case for 207e/160, 149/80, so that is a match.
The string 207e/160, 149/80, 11 contains 2 commas, so you only get a partial match for the first part of the string; because the whole string is not matched, matches() returns false.
See the matches in this regex demo.
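The difference between a partial and a full match can be reproduced with a small sketch. The class and helper names (MatchesVersusFind, buildRegex, firstPartialMatch) are made up for illustration; the pattern is the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVersusFind {
    // Rebuilds the pattern from the question for a given value.
    static String buildRegex(String value) {
        return Pattern.quote(value) + ", [NnoneOoff0-9\\-\\+\\/]+|[NnoneOoff0-9\\-\\+\\/]+, "
            + Pattern.quote(value);
    }

    // Returns the first partial match in the input, or null if there is none.
    static String firstPartialMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String regex = buildRegex("207e/160");
        String input = "207e/160, 149/80, 11";
        System.out.println(input.matches(regex));            // false: whole string must match
        System.out.println(firstPartialMatch(regex, input)); // 207e/160, 149/80
    }
}
```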
To match all the values, you can use a repeating pattern:
^[NnoeOf0-9+/-]+(?:,\h*[NnoeOf0-9+/-]+)*$
^ Start of string
[NnoeOf0-9+/-]+ Match 1+ of any of the characters listed in the character class
(?: Non capture group
,\h* Match a comma and optional horizontal whitespace chars
[NnoeOf0-9+/-]+ Match 1+ of any of the characters listed in the character class
)* Close the non capture group and optionally repeat it (if there should be at least 1 comma, then the quantifier can be + instead of *)
$ End of string
Regex demo
Example using matches():
String channelStr1 = "207e/160, 149/80";
String channelStr2 = "207e/160, 149/80, 11";
String regex = "^[NnoeOf0-9+/-]+(?:,\\h*[NnoeOf0-9+/-]+)*$";
System.out.println(channelStr1.matches(regex));
System.out.println(channelStr2.matches(regex));
Output
true
true
Note that in the character class you can put - at the end so that it does not have to be escaped, and + and / do not have to be escaped either.
You can use regex101 to test your regex. It describes everything the pattern does, which helps with debugging, and the quick reference section at the bottom right lists the available tokens with examples.
A few things: you can match a literal character by escaping it with \, so \" for a literal double quote.
If you want the pattern to be one or more of something, you would use +. These are called quantifiers and can be applied to groups, tokens, etc. The token for a whitespace character is \s. So, one or more whitespace characters would be \s+.
It's difficult to tell exactly what you're trying to do, but hopefully pointing you to regex101 will help. If you want to provide examples of the current RegEx you have, what you want to match and then the strings you're using to test it I'll be happy to provide you with an example.
^(?:[NnoneOoff0-9\\-\\+\\/]+ *(?:, *(?!$)|$))+$
^ Start
(?: ... ) Non-capturing group that defines an item and its separator. After each item, except the last, the separator (,) must appear. Spaces (one, several, or none) can appear before and after the comma, which is specified with *. This group can appear one or more times to the end of the string, as specified by the + quantifier after the group's closing parenthesis.
Regex101 Test
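As a rough check, the pattern above can be dropped into Java as is, since it is already written with Java string escaping. The class and method names (ChannelListValidator, isValid) are made up for illustration.

```java
class ChannelListValidator {
    // Pattern from the answer above, as a Java string literal.
    private static final String REGEX = "^(?:[NnoneOoff0-9\\-\\+\\/]+ *(?:, *(?!$)|$))+$";

    // Hypothetical helper: true if the whole string is a comma-separated list of items.
    static boolean isValid(String channelStr) {
        return channelStr.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(isValid("207e/160, 149/80"));     // true
        System.out.println(isValid("207e/160, 149/80, 11")); // true
        System.out.println(isValid("207e/160,"));            // false: trailing comma
    }
}
```

The (?!$) lookahead is what rejects a trailing comma: a separator is only accepted when something still follows it.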

Make regular expression fail for invalid input [duplicate]

Upon validation using regular expression in Java, I need to return true for height having values :
80cm
80.2cm
80.25cm
My regular expression is as follows :
(\d)(\d?)(.?)(\d?)(\d?)(c)(m)
However if I pass in height as 71-80cm , the regular expression returns true too.
What change should I make to the regular expression to return false when height is 71-80cm ?
. matches any character, so you need \\. (in a Java string literal) or just \. (in the raw regex) to match a literal dot. Check out: Java RegEx meta character (.) and ordinary dot?
Furthermore, additional changes need to be made such that e.g. 8025cm is not accepted if that is what you want.
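A short sketch of both points, using the pattern from the question with and without the escaped dot. The class and constant names (DotEscapeDemo, UNESCAPED, ESCAPED) are made up for illustration.

```java
class DotEscapeDemo {
    // Pattern from the question: the bare dot matches any character, including '-'.
    static final String UNESCAPED = "(\\d)(\\d?)(.?)(\\d?)(\\d?)(c)(m)";
    // Same pattern with the dot escaped so it only matches a literal '.'.
    static final String ESCAPED = "(\\d)(\\d?)(\\.?)(\\d?)(\\d?)(c)(m)";

    public static void main(String[] args) {
        System.out.println("71-80cm".matches(UNESCAPED)); // true: '.' consumed the '-'
        System.out.println("71-80cm".matches(ESCAPED));   // false
        System.out.println("80.25cm".matches(ESCAPED));   // true
        System.out.println("8025cm".matches(ESCAPED));    // true: still accepted
    }
}
```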
I assume that the OP wishes to match substrings of the form
abcm
where:
"cm" is a literal;
"cm" is not followed by a letter;
"b" is the string representation of a non-negative float or integer (e.g., "80" or "80.25", but not "08" or ".25"); and
"a" is a character other than "-", "+" and ".", unless "b" is at the beginning of the string, in which case "a" is an empty string.
If my assumptions are correct you could use the following regex to match "bcm" in abcm:
(?<![-+.\d])[1-9]\d*(?:\.\d+)?cm(?![a-zA-Z])
Demo
The regex engine performs the following operations:
(?<! # begin negative lookbehind
[-+.\d] # match '-', '+', '.' or a digit
) # end negative lookbehind
[1-9] # match digit other than zero
\d* # match 0+ digits
(?:\.\d+) # match '.' followed by 1+ digits in a non-cap grp
? # optionally match non-cap grp
cm # match 'cm'
(?![a-zA-Z]) # negative lookahead: next char must not be a letter
If my assumptions about what is required are not correct it may be evident how my answer could be adjusted appropriately.
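Because this regex is meant to find a height inside a larger string, it would be used with find() rather than matches() in Java. A minimal sketch, with made-up class and method names (HeightFinder, findHeight):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeightFinder {
    // Lookaround-based pattern from the answer above, escaped for a Java string literal.
    private static final Pattern HEIGHT =
        Pattern.compile("(?<![-+.\\d])[1-9]\\d*(?:\\.\\d+)?cm(?![a-zA-Z])");

    // Hypothetical helper: returns the first height found in the text, or null if there is none.
    static String findHeight(String text) {
        Matcher m = HEIGHT.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findHeight("height: 80.25cm")); // 80.25cm
        System.out.println(findHeight("71-80cm"));         // null: 80 is preceded by '-'
        System.out.println(findHeight("08cm"));            // null: leading zero
    }
}
```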
Ok, let's take your expression and clean it up a little. You don't need all the capturing groups (..), since all you're interested in is validating the complete string. For that reason you should also enclose the expression in line beginning ^ and line end $ anchors, so your expression can't match inside a larger string. Lastly, you can group the period and trailing digits together (?:), since you won't get one without the other as per your example data. Which gets us:
^\d\d?(?:\.\d\d?)?cm$
See regex demo.
Then in Java, that check could look like this:
boolean foundMatch = subjectString.matches("^\\d\\d?(?:\\.\\d\\d?)?cm$");

Regular expression to determine if the String consists of more than 4 numbers

I want to extract URL strings from a log which looks like below:
<13>Mar 27 11:22:38 144.0.116.31 AgentDevice=WindowsDNS AgentLogFile=DNS.log PluginVersion=X.X.X.X Date=3/27/2019 Time=11:22:34 AM Thread ID=11BC Context=PACKET Message= Internal packet identifier=0000007A4843E100 UDP/TCP indicator=UDP Send/Receive indicator=Snd Remote IP=X.X.X.X Xid (hex)=9b01 Query/Response=R Opcode=Q Flags (hex)=8081 Flags (char codes)=DR ResponseCode=NOERROR Question Type=A Question Name=outlook.office365.com
I am looking to extract the Name text when it contains 5 or more digits.
A possible way suggested is (\d.*?){5,} but it does not seem to work. Kindly suggest another way to get the field.
Example of string match:
outlook12.office345.com
outlook.office12345.com
You can look for the following expression:
Name=([^ ]*\d{5,}[^ ]*)
Explanation:
Name= look for text that starts with "Name=", then capture if:
[^ ]* any number of characters which are not a space
\d{5,} then 5 or more digits in a row
[^ ]* then again, any number of non-space characters, up to the next space
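A minimal Java sketch of the capture-group approach above. The class and method names (NameExtractor, extractName) and the shortened log lines are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameExtractor {
    // Pattern from the answer above; group 1 holds the value after "Name=".
    private static final Pattern NAME = Pattern.compile("Name=([^ ]*\\d{5,}[^ ]*)");

    // Hypothetical helper: returns the captured name, or null if it has no run of 5+ digits.
    static String extractName(String logLine) {
        Matcher m = NAME.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened, illustrative log lines.
        System.out.println(extractName("Question Type=A Question Name=outlook.office12345.com"));
        // prints outlook.office12345.com
        System.out.println(extractName("Question Type=A Question Name=outlook.office365.com"));
        // prints null
    }
}
```

Since \d{5,} needs consecutive digits, a name such as outlook12.office345.com is not captured by this pattern.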
This regular expression:
(?<=Name=).*\d{5,}.*?(?=\s|$)
would extract strings like outlook.office365666.com (with 5 or more consecutive digits) from your example input.
Demo: https://regex101.com/r/YQ5l2w/1
Try this pattern: (?=\b.*(?:\d[^\d\s]*){5,})\S*
Explanation:
(?=...) - positive lookahead, assures that pattern inside it is matched somewhere ahead :)
\b - word boundary
(?:...) - non-capturing group
\d[^\d\s]* - match digit \d, then match zero or more of any characters other than whitespace \s or digit \d
{5,} - match preceding pattern 5 or more times
\S* - match zero or more of any characters other than space to match the string if assertion is true, but I think you just need assertion :)
Demo
If you want only consecutive numbers use simplified pattern (?=\b.*\d{5,})\S*.
Another demo
Of course, you have to add a positive lookbehind, (?<=Name=), to assert that the match is preceded by the Name= string.
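Putting the lookbehind and the lookahead together, a sketch in Java could look like this. The class and method names (ScatteredDigitsExtractor, extract) and the shortened log lines are made up for illustration, and the sketch assumes the Name field is the last field on the line, because the .* inside the lookahead can also run across spaces into later fields.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScatteredDigitsExtractor {
    // Lookbehind for "Name=" plus the lookahead pattern from the answer above.
    private static final Pattern NAME =
        Pattern.compile("(?<=Name=)(?=\\b.*(?:\\d[^\\d\\s]*){5,})\\S*");

    // Hypothetical helper: returns the name if it holds 5+ digits (not necessarily adjacent), else null.
    static String extract(String logLine) {
        Matcher m = NAME.matcher(logLine);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Shortened, illustrative log lines with Name as the last field.
        System.out.println(extract("Question Type=A Question Name=outlook12.office345.com"));
        // prints outlook12.office345.com
        System.out.println(extract("Question Type=A Question Name=outlook.office365.com"));
        // prints null
    }
}
```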
Try this regex
([a-z0-9]{5,}.[a-z0-9]{5,})+.com
https://regex101.com/r/OzsChv/3
It matches:
outlook.office365.com
outlook12.office345.com
and other URL strings of the same form

regular expression retrieve a portion of a string that not contain a string

I have some strings like the following:
it.mycompany.db.beans.str1.PD_T_CLASS
it.mycompany.db.beans.join.PD_T_CLASS
it.mycompany.db.beans.str2.PD_T_CLASS_1
it.mycompany.db.beans.join.PD_T_CLASS_1
PD_T_CLASS myVar = new PD_T_CLASS();
myVar.setPD_T_CLASS(something);
and I want to select the "PD_" part and replace it with "" (the empty string), but only if the entire line does not contain the string ".join."
what I want to achieve is:
it.mycompany.db.beans.str1.T_CLASS
it.mycompany.db.beans.join.PD_T_CLASS
it.mycompany.db.beans.str2.T_CLASS_1
it.mycompany.db.beans.join.PD_T_CLASS_1
T_CLASS myVar = new T_CLASS();
myVar.setT_CLASS(something);
The substitution is not a problem since I'm using the Eclipse search tool and will hit replace as soon as it shows me the right result.
I have tried:
^((?!\.join\.).)*(PD_)*$ // whole string selected
^((?!\.join\.).)*(\bPD_\b)*$ // whole string selected
I'm starting to get frustrated since I've searched around a bit (the ^((?!join ... part comes from those searches).
Can you help me?
You may use the following regex:
(?m)(?:\G(?!\A)|^(?!.*\.join\.))(.*?)PD_
and replace with
$1
See the regex demo
Details:
(?m) - a Pattern.MULTILINE inline modifier flag that will force ^ to match the beginning of a line rather than a whole string
(?:\G(?!\A)|^(?!.*\.join\.)) - either of the two alternatives:
\G(?!\A) - the end of the previous successful match
| - or
^(?!.*\.join\.) - start of a line that has no .join. text in it (as the (?!.*\.join\.) is a negative lookahead that will fail the match if it matches any 0+ chars other than line break chars (.*) and then .join.)
(.*?) - Capturing group #1 (referred to with the $1 backreference in the replacement pattern): any 0+ chars other than line breaks, as few as possible, up to the first occurrence of ...
PD_ - a literal PD_
The replacement is a $1 backreference to the first capturing group that will restore any text matched before PD_s.
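Outside Eclipse, the same search and replace can be tried with String.replaceAll. This is only a sketch: the class and method names (PrefixStripper, stripPrefix) are made up, and the input is a shortened version of the sample lines from the question.

```java
class PrefixStripper {
    // Regex from the answer above, escaped for a Java string literal.
    private static final String REGEX = "(?m)(?:\\G(?!\\A)|^(?!.*\\.join\\.))(.*?)PD_";

    // Hypothetical helper: removes every PD_ on lines that do not contain ".join.".
    static String stripPrefix(String source) {
        return source.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        String source = String.join("\n",
            "it.mycompany.db.beans.str1.PD_T_CLASS",
            "it.mycompany.db.beans.join.PD_T_CLASS",
            "PD_T_CLASS myVar = new PD_T_CLASS();");
        System.out.println(stripPrefix(source));
        // it.mycompany.db.beans.str1.T_CLASS
        // it.mycompany.db.beans.join.PD_T_CLASS
        // T_CLASS myVar = new T_CLASS();
    }
}
```

The third line shows the \G alternative at work: after the first PD_ is removed, the next match continues from where the previous one ended, so the second PD_ on the same line is removed too.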

Java regular expressions for specific name\value format

I'm not familiar yet with java regular expressions. I want to validate a string that has the following format:
String INPUT = "[name1 value1];[name2 value2];[name3 value3];";
namei and valuei are Strings that may contain any characters except whitespace.
I tried with this expression:
String REGEX = "([\\S*\\s\\S*];)*";
But if I call matches() I always get false, even for a valid String.
What's the best regular expression for it?
This does the trick:
(?:\[\w.*?\s\w.*?\];)*
If you want to only match three of these, replace the * at the end with {3}.
Explanation:
(?:: Start of non-capturing group
\[: Escapes the [ sign which is a meta-character in regex. This allows it to be used for matching.
\w.*?: Matches one word character ([a-zA-Z0-9_]) and then lazily matches any characters. Lazy matching means it attempts to match as few characters as possible, in this case meaning that it will stop matching once it finds the following \s.
\s: Matches one whitespace character
\]: See \[
;: Matches one semicolon
): End of non-capturing group
*: Matches any number of what is contained in the preceding non-capturing group.
See this link for demonstration
You should escape the square brackets. Also, if your aim is to match exactly three, replace * with {3} (shown here as a Java string literal):
(\\[\\S*\\s\\S*\\];){3}
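A small sketch comparing the pattern from the question with a version that escapes the brackets, written with Java string escaping throughout. The class and constant names (PairListValidator, BROKEN, FIXED) are made up for illustration.

```java
class PairListValidator {
    // Pattern from the question: the unescaped [...] is read as a character class.
    static final String BROKEN = "([\\S*\\s\\S*];)*";
    // Brackets escaped so they match literally; exactly three pairs are required.
    static final String FIXED = "(\\[\\S*\\s\\S*\\];){3}";

    public static void main(String[] args) {
        String input = "[name1 value1];[name2 value2];[name3 value3];";
        System.out.println(input.matches(BROKEN)); // false
        System.out.println(input.matches(FIXED));  // true
    }
}
```

In the broken version, [\S*\s\S*] matches a single character of any kind, so the group only accepts one character followed by a semicolon.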
