Java regular expression to validate numeric comma separated number and hyphen - java

Valid1: 2
Valid2: 3-5
Valid3: 2,4-6
Valid4: 2,4,5
Valid5: 2-7,8-9
Valid6: 2,5-7,9-13,15,17-20
All of the expressions above should be accepted by a single regex.
The number on the left side of a hyphen should be smaller than the number on the right side.

First, as @MikeFHay suggested above, regular expressions were not made to check whether one number is bigger than another (for that you'll have to parse the expression). If we ignore that requirement, the rest can be achieved with the following regex:
((\d\,(?=\d))|(\d\-(?=\d))|\d)+
in Java:
"((\\d\\,(?=\\d))|(\\d\\-(?=\\d))|\\d)+"
Explanation:
This regex uses lookahead to validate that each comma or dash is preceded and followed by a digit, for example (\d\,(?=\d)), so each "substring" that contains a dash or comma has to be in the format digit,digit or digit-digit.
Of course, a number that doesn't contain commas or dashes is also valid, hence the rightmost side of the alternation, which is simply \d.
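The two-step approach described above (regex for the shape, parsing for the ordering) could be sketched in Java roughly as follows. The class and method names are hypothetical, and rejecting chained ranges such as 2-3-4 is an assumption, since the question does not say how they should be treated.

```java
import java.math.BigInteger;
import java.util.regex.Pattern;

public class RangeListValidator {
    // Regex from the answer: checks only the shape (digits, commas, hyphens).
    private static final Pattern SHAPE =
            Pattern.compile("((\\d\\,(?=\\d))|(\\d\\-(?=\\d))|\\d)+");

    // Hypothetical helper: shape check first, then the ordering check by parsing.
    static boolean isValid(String input) {
        if (!SHAPE.matcher(input).matches()) {
            return false;
        }
        for (String item : input.split(",")) {
            String[] bounds = item.split("-");
            if (bounds.length > 2) {
                return false; // assumption: chained ranges like "2-3-4" are rejected
            }
            if (bounds.length == 2) {
                // BigInteger avoids overflow on very long digit strings.
                BigInteger left = new BigInteger(bounds[0]);
                BigInteger right = new BigInteger(bounds[1]);
                if (left.compareTo(right) >= 0) {
                    return false; // left side must be strictly smaller
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("2,5-7,9-13,15,17-20"));
        System.out.println(isValid("5-3"));
    }
}
```

Splitting only after the shape check has passed means every piece is guaranteed to be a non-empty run of digits, so the parsing step does not need its own error handling.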

Related

Java Regex to validate group field pattern example - abc.def.gh1

I am writing some Java code where I need to validate a groupId (Maven) passed by the user.
For example: com.fb.test1.
I have written a regex which says the string should not start or end with '.' and can have alphanumeric characters delimited by '.'
[^\.][[a-zA-Z0-9]+\\.{0,1}]*[a-zA-Z0-9]$
But this regex is not able to detect consecutive '.' characters, for example com..fb.test. I added {0,1} after the dot to limit it to one occurrence, but it didn't work.
Any leads would be highly appreciated.
The quantifier {0,1} and the dot should not be in the character class: there they are treated as literal characters (so the class also matches {, } and a comma), and repeating the whole class with * allows any number of consecutive dots.
You can also exclude a dot to the left using a negative lookbehind instead of matching an actual character that is not a dot.
In Java you could write the pattern as
(?<!\\.)[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)+[a-zA-Z0-9]$
Note that the $ makes sure that the match is at the end of the string.
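As a minimal sketch of the same idea, the check can also be written with Matcher#matches(), which only succeeds when the entire input matches, so neither the lookbehind nor the $ is needed. This is a simplified pattern rather than the one above; the class and method names are hypothetical, and accepting a single segment such as com is an assumption.

```java
import java.util.regex.Pattern;

public class GroupIdValidator {
    // One alphanumeric segment, then zero or more ".segment" groups.
    // Because the dot sits outside any repeated class, ".." cannot match.
    private static final Pattern GROUP_ID =
            Pattern.compile("[a-zA-Z0-9]+(?:\\.[a-zA-Z0-9]+)*");

    // Hypothetical helper: true only if the whole string is a valid groupId.
    static boolean isValidGroupId(String candidate) {
        return GROUP_ID.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidGroupId("com.fb.test1"));
        System.out.println(isValidGroupId("com..fb.test"));
    }
}
```

Using matches() rather than find() matters here: find() would happily locate a valid-looking fragment inside an otherwise invalid string.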

Regular expression to determine if the String consists of more than 4 numbers

I want to extract URL strings from a log which looks like below:
<13>Mar 27 11:22:38 144.0.116.31 AgentDevice=WindowsDNS AgentLogFile=DNS.log PluginVersion=X.X.X.X Date=3/27/2019 Time=11:22:34 AM Thread ID=11BC Context=PACKET Message= Internal packet identifier=0000007A4843E100 UDP/TCP indicator=UDP Send/Receive indicator=Snd Remote IP=X.X.X.X Xid (hex)=9b01 Query/Response=R Opcode=Q Flags (hex)=8081 Flags (char codes)=DR ResponseCode=NOERROR Question Type=A Question Name=outlook.office365.com
I am looking to extract the Name text when it contains 5 or more digits.
One suggested approach is (\d.*?){5,}, but it does not seem to work. Kindly suggest another way to get the field.
Example of string match:
outlook12.office345.com
outlook.office12345.com
You can look for the following expression:
Name=([^ ]*\d{5,}[^ ]*)
Explanation:
Name= look for anything that starts with "Name=", then capture if:
[^ ]* any number of characters that are not a space
\d{5,} then 5 or more digits in a row
[^ ]* then again, any number of non-space characters, up to the next space
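In Java, the pattern explained above could be applied with Matcher#find() and the first capturing group, along these lines. This is only a sketch; the class and method names are hypothetical, and the log lines in main are shortened versions of the one in the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameFieldExtractor {
    // "Name=", then a space-free token containing 5 or more digits in a row.
    private static final Pattern NAME_WITH_DIGIT_RUN =
            Pattern.compile("Name=([^ ]*\\d{5,}[^ ]*)");

    // Hypothetical helper: returns the captured name, or empty if nothing qualifies.
    static Optional<String> extract(String logLine) {
        Matcher m = NAME_WITH_DIGIT_RUN.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("Question Type=A Question Name=outlook.office12345.com"));
        System.out.println(extract("Question Type=A Question Name=outlook.office365.com"));
    }
}
```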
This regular expression:
(?<=Name=).*\d{5,}.*?(?=\s|$)
would extract strings like outlook.office365666.com (with 5 or more consecutive digits) from your example input.
Demo: https://regex101.com/r/YQ5l2w/1
Try this pattern: (?=\b.*(?:\d[^\d\s]*){5,})\S*
Explanation:
(?=...) - positive lookahead, assures that pattern inside it is matched somewhere ahead :)
\b - word boundary
(?:...) - non-capturing group
\d[^\d\s]* - match digit \d, then match zero or more of any characters other than whitespace \s or digit \d
{5,} - match preceding pattern 5 or more times
\S* - match zero or more of any characters other than space to match the string if assertion is true, but I think you just need assertion :)
If you want only consecutive numbers use simplified pattern (?=\b.*\d{5,})\S*.
Of course, you have to add a positive lookbehind, (?<=Name=), to assert that the Name= string precedes the match.
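For the case where the 5 digits do not have to be consecutive, a Java sketch might look like the following. It is a variant of the idea above rather than the exact pattern: it matches Name= literally and captures the value in a group instead of using lookarounds. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScatteredDigitsExtractor {
    // "Name=", then at least five (non-digit run + digit) units, then the rest of the token.
    private static final Pattern NAME_WITH_FIVE_DIGITS =
            Pattern.compile("Name=((?:[^\\d\\s]*\\d){5,}\\S*)");

    // Hypothetical helper: returns the name if it holds 5 or more digits anywhere in it.
    static Optional<String> extract(String logLine) {
        Matcher m = NAME_WITH_FIVE_DIGITS.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("Question Name=outlook12.office345.com"));
        System.out.println(extract("Question Name=outlook.office365.com"));
    }
}
```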
Try this regex:
([a-z0-9]{5,}.[a-z0-9]{5,})+.com
https://regex101.com/r/OzsChv/3
It matches
outlook.office365.com
outlook12.office345.com
as well as all other URL strings.

Regex pattern loses last character

I have the following regex to extract a domain from a URL: "^(http:\\/\\/|https:\\/\\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-zA-Z0-9]+)?$" When I get the 3rd group, the domain is missing its last character. For example, facebook becomes faceboo.
I'm using Java 8.
The regex works fine when the path (group 4) doesn't contain any numbers.
If I put a number into the 4th group, it cuts off the domain's last character.
You need to escape the dot characters
"^(http:\\/\\/|https:\\/\\/)?(www\\.)?([a-zA-Z0-9]+)\\.[a-zA-Z0-9]*\\.[a-z]{3}\\.?([a-zA-Z0-9]+)?$"
An unescaped dot is a special character in regex that means "any character", so it matches a literal dot but also any letter or digit.
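The difference between an unescaped and an escaped dot can be seen in a small Java sketch. The class name, helper method, and sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Hypothetical helper: true if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unescaped dot: any character is accepted in that position.
        System.out.println(matchesWhole("example.com", "exampleXcom"));
        // Escaped dot: only a literal '.' is accepted.
        System.out.println(matchesWhole("example\\.com", "exampleXcom"));
        // Pattern.quote turns a whole string into a literal pattern.
        System.out.println(matchesWhole(Pattern.quote("example.com"), "example.com"));
    }
}
```

In a Java string literal the escape needs two backslashes (\\.), because the first one is consumed by the Java compiler before the regex engine sees the pattern.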

Regex-How to prevent repeated special characters?

I don't have experience with regular expressions. I need a regular expression that doesn't allow special characters (+-*/& etc.) to be repeated.
The string can contain digits, letters, and special characters.
This should be valid : abc,df
This should be invalid : abc-,df
I would really appreciate it if you could help me! Thanks in advance.
The two solutions presented so far match a string that is not allowed.
But the title is "How to prevent...", so I assume that the regex
should match the allowed string. This means that the regex should:
match the whole string if it does not contain 2 consecutive special characters,
not match otherwise.
You can achieve this putting together the following parts:
^ - start of string anchor,
(?!.*[...]{2}) - a negative lookahead for 2 consecutive special
characters (marked here as ...), in any place,
a regex matching the whole (non-empty) string,
$ - end of string anchor.
So the whole regex should be:
^(?!.*[!@#$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2}).+$
Note that within a character class (between [ and ]) a backslash
must be placed before the following characters: - (if in
the middle of the sequence), the closing square bracket,
a backslash itself, and / (the regex terminator).
Or if you want to apply the regex to individual words (not the whole
string), then the regex should be:
\b(?!\S*[!@#$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2})\S+
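A Java adaptation of the whole-string version might look like the sketch below. Two things differ from the pattern as written above and are assumptions of this sketch: the opening square bracket inside the class is escaped, because Java treats an unescaped [ inside a character class as the start of a nested class, and / is left unescaped, since Java has no regex terminator. The class is assumed to begin with !@#$, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class RepeatedSpecialsValidator {
    // Negative lookahead rejects any two consecutive characters from the special set.
    private static final Pattern NO_REPEATED_SPECIALS = Pattern.compile(
            "^(?!.*[!@#$%^&*()\\-_+={}\\[\\]|\\\\;:'\",<.>/?]{2}).+$");

    // Hypothetical helper: true if the (non-empty) string has no two specials in a row.
    static boolean isAllowed(String input) {
        return NO_REPEATED_SPECIALS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("abc,df"));
        System.out.println(isAllowed("abc-,df"));
    }
}
```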
[\,\+\-\*\/\&]{2,} Add more characters in the square bracket if you want.
Demo https://regex101.com/r/CBrldL/2
Use the following regex to match the invalid string.
[^A-Za-z0-9]{2,}
[^\w!\s]{2,} This would be the shortest version to match any two consecutive special characters (ignoring whitespace and, as written, the ! character).
If you want to include whitespace, please use [^\w]{2,}
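The opposite approach, detecting the invalid case, can be sketched in Java with Matcher#find(), which looks for the offending pair anywhere in the string. The class and method names are hypothetical. Note that with [^A-Za-z0-9] every non-alphanumeric character counts as special, including spaces.

```java
import java.util.regex.Pattern;

public class RepeatedSpecialsDetector {
    // Two or more consecutive characters that are not ASCII letters or digits.
    private static final Pattern REPEATED_SPECIALS = Pattern.compile("[^A-Za-z0-9]{2,}");

    // Hypothetical helper: true if the input contains a run of 2+ special characters.
    static boolean hasRepeatedSpecials(String input) {
        return REPEATED_SPECIALS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedSpecials("abc-,df"));
        System.out.println(hasRepeatedSpecials("abc,df"));
    }
}
```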

Particular java regular expression

How would I check that a String input in Java has the format:
xxxx-xxxx-xxxx-xxxx
where x is a digit 0..9?
Thanks!
To start, this is a great source of regexps: http://www.regular-expressions.info. Visit it, poke and play around. Furthermore, the java.util.regex.Pattern API has a concise overview of regex patterns.
Now, back to your question: you want to match four consecutive groups of four digits separated by a hyphen. A single group of 4 digits can in regex be represented as
\d{4}
Four of those separated by a hyphen can be represented as:
\d{4}-\d{4}-\d{4}-\d{4}
To make it shorter you can also represent a single group of four digits and three consecutive groups of four digits prefixed with a hyphen:
\d{4}(-\d{4}){3}
Now, in Java you can use String#matches() to test whether a string matches the given regex.
boolean matches = value.matches("\\d{4}(-\\d{4}){3}");
Note that I escaped each backslash \ with another backslash \, because backslashes have a special meaning in Java string literals. To represent an actual backslash, you have to write \\.
String objects in Java have a matches method which can check against a regular expression:
myString.matches("^\\d{4}(-\\d{4}){3}$")
This particular expression checks for four digits, and then three times (a hyphen and four digits), thus representing your required format.
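If the check runs often, the pattern can be compiled once and reused. The following is a minimal sketch with hypothetical class and method names. Since Matcher#matches() (like String#matches()) only succeeds when the whole input matches, the ^ and $ anchors are optional here.

```java
import java.util.regex.Pattern;

public class GroupedDigitsFormat {
    // Four digits, then three more groups of "-dddd".
    private static final Pattern FORMAT = Pattern.compile("\\d{4}(-\\d{4}){3}");

    // Hypothetical helper: null-safe check for the xxxx-xxxx-xxxx-xxxx format.
    static boolean hasFormat(String value) {
        return value != null && FORMAT.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasFormat("1234-5678-9012-3456"));
        System.out.println(hasFormat("1234-5678-9012"));
    }
}
```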
