One Regular Expression to validate Zip code not working - java

Below regex (see code snippet) will satisfy all following four conditions:
12345
12345-6789
12345_6789
12345 1234
I need to include a 5th condition, 123456789 (nine digits only, no separator). I've tried changing the current regex to ^[0-9]{5}(_|-|\s){0,1}|[0-9]{4}, but this doesn't work.
public static boolean isZipCodeValid(String zipcode) {
return zipcode.matches("^\\d{5}(?:[-_\\s]\\d{4})?$");
}

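I think you shouldn't have a | there:
^[0-9]{5}(_|-|\s){0,1}|[0-9]{4}
                      ^
                      here!
Delete that. Note that the four trailing digits then become mandatory, so wrap the separator and the digits in an optional group, e.g. (?:[-_\s]?[0-9]{4})?, if a plain 5-digit code must still pass.
With that | there, [0-9]{4} becomes an alternative to everything before it. So the regex matches either 5 digits (plus an optional separator) or any 4 digits. That's why you end up with 2 matches.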

Use the following regex:
^\\d{5}(?:[-_\\s]?\\d{4})?$
It has been tested on https://regex101.com
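As a rough check of the pattern above, the sketch below runs it against the five inputs from the question plus one invalid input. The class name ZipCodeCheck is made up for illustration. Note that the doubled backslashes are Java string-literal escaping; on regex101 you would type single backslashes.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original code.
class ZipCodeCheck {
    // 5 digits, optionally followed by an optional separator (-, _ or whitespace) and 4 more digits.
    private static final Pattern ZIP = Pattern.compile("^\\d{5}(?:[-_\\s]?\\d{4})?$");

    static boolean isZipCodeValid(String zipcode) {
        return zipcode != null && ZIP.matcher(zipcode).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12345", "12345-6789", "12345_6789", "12345 1234", "123456789", "1234567"};
        for (String s : samples) {
            System.out.println(s + " -> " + isZipCodeValid(s));
        }
    }
}
```

Only the last sample (seven digits) should be rejected.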

Try to break down your expression and it should become easier to see:
always 5 digits (I assumed the first is not 0; real US ZIP codes can start with 0, so use \d{5} if those must pass),
a group of 4 digits preceded by either nothing or one of 3 characters.
If you translate that to regex you should get something like (I included the non-0 first digit as well):
[1-9]\d{4}
(x\d{4})? - that's the optional (?) group ((...)) with x being a placeholder. Now translate "either nothing or one of 3 characters" into an expression and you get [-_ ]?. (Note that I replaced \s with a space because \s includes tabs and other whitespace.)
If you put that all together you get [1-9]\d{4}([-_ ]?\d{4})?.
Side note: in some engines (e.g. .NET, or Java with Pattern.UNICODE_CHARACTER_CLASS) \d matches other digits as well, e.g. Arabic-Indic ones. Java's default \d is equivalent to [0-9], but writing [0-9] makes the intent explicit.
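In Java specifically, whether \d matches non-ASCII digits depends on the Pattern.UNICODE_CHARACTER_CLASS flag; without it, \d behaves like [0-9]. A small sketch of the difference (the class and method names are made up for illustration):

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class DigitClassDemo {
    // Default mode: the digit shorthand is ASCII-only, the same as [0-9].
    static boolean asciiDigits(String s) {
        return Pattern.compile("\\d+").matcher(s).matches();
    }

    // With UNICODE_CHARACTER_CLASS the shorthand follows Unicode's notion of a digit.
    static boolean unicodeDigits(String s) {
        return Pattern.compile("\\d+", Pattern.UNICODE_CHARACTER_CLASS).matcher(s).matches();
    }

    public static void main(String[] args) {
        String arabicIndic = "\u0661\u0662\u0663"; // Arabic-Indic digits one, two, three
        System.out.println(asciiDigits("123"));
        System.out.println(asciiDigits(arabicIndic));
        System.out.println(unicodeDigits(arabicIndic));
    }
}
```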

I came up with a second method that validates the 9-digits-only case, so somewhere in my code I do something like the below. It's not the most elegant way, but it does what it's supposed to...
String zipCode = bo.getZipCode();
// Accept either the 5(+4) format or a plain 9-digit code; report an error otherwise.
if (!isUsaZipCodeValid(zipCode) && !isValidDigit9Only(zipCode)) {
    errors.add("zipCode", new ActionMessage("error.label.zipcode.usa.digits.only"));
}

// And these are the two methods.
public static boolean isUsaZipCodeValid(String zipcode) {
    /* The regex below satisfies all four conditions for a zip code, e.g.
       12345
       12345-6789
       12345_6789
       12345 1234
    */
    return zipcode.matches("^\\d{5}(?:[-_\\s]\\d{4})?$");
}

public static boolean isValidDigit9Only(String zipcode) {
    /* The regex below accepts a zip code of exactly nine digits, e.g.
       123456789
    */
    return zipcode.matches("[0-9]{9}");
}

Related

Complicated regex and possible simple way to do it [duplicate]

I don't write many regular expressions, so I'm going to need some help on this one.
I need a regular expression that can validate that a string is an alphanumeric comma delimited string.
Examples:
123, 4A67, GGG, 767 would be valid.
12333, 78787&*, GH778 would be invalid
fghkjhfdg8797< would be invalid
This is what I have so far, but isn't quite right: ^(?=.*[a-zA-Z0-9][,]).*$
Any suggestions?
Sounds like you need an expression like this:
^[0-9a-zA-Z]+(,[0-9a-zA-Z]+)*$
Posix allows for the more self-descriptive version:
^[[:alnum:]]+(,[[:alnum:]]+)*$
^[[:alnum:]]+([[:space:]]*,[[:space:]]*[[:alnum:]]+)*$ // allow whitespace
If you're willing to admit underscores, too, search for entire words (\w+):
^\w+(,\w+)*$
^\w+(\s*,\s*\w+)*$ // allow whitespaces around the comma
Try this pattern: ^([a-zA-Z0-9]+,?\s*)+$
I tested it with your cases, as well as just a single number "123". I don't know if you will always have a comma or not.
The [a-zA-Z0-9]+ means match 1 or more of these symbols
The ,? means match 0 or 1 commas (basically, the comma is optional)
The \s* handles 0 or more whitespace characters after the comma
and finally the outer + says match 1 or more of the pattern.
This will also match 123 123 abc (no commas), which might be a problem.
It will also match 123, (ends with a comma), which might be a problem.
Try the following expression:
/^([a-z0-9\s]+,)*([a-z0-9\s]+){1}$/i
This will work for:
test
test, test
test123,Test 123,test
I would strongly suggest trimming the whitespaces at the beginning and end of each item in the comma-separated list.
You seem to be lacking repetition. How about:
^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$
I'm not sure how you'd express that in VB.Net, but in Python:
>>> import re
>>> x = [ "123, 4A67, GGG, 767", "12333, 78787&*, GH778" ]
>>> r = '^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$'
>>> for s in x:
... print re.match( r, s )
...
<_sre.SRE_Match object at 0xb75c8218>
None
>>>
You can use shortcuts instead of listing the [a-zA-Z0-9 ] part, but this is probably easier to understand.
Analyzing the highlights:
[a-zA-Z0-9 ]+ : match one or more (but not zero) characters from the listed ranges, or a space.
(?:[...]+,)* : In non-capturing parentheses, match one or more of the characters, plus a comma at the end. Match such sequences zero or more times. Matching zero times allows for no comma.
[...]+ : match at least one of these. This does not include a comma, which ensures that a trailing comma is not accepted. If a trailing comma is acceptable, then the expression is easier: ^[a-zA-Z0-9 ,]+$
Yes, when you want to catch comma separated things where a comma at the end is not legal, and the things match to $LONGSTUFF, you have to repeat $LONGSTUFF:
$LONGSTUFF(,$LONGSTUFF)*
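The $LONGSTUFF(,$LONGSTUFF)* shape can be generated rather than typed twice. A minimal Java sketch, assuming a hypothetical helper named commaSeparated (not from the original answer) that takes the item pattern as a string:

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answers.
class ListPatternBuilder {
    // Wraps an item pattern into "item(,item)*" so the item is written only once.
    static Pattern commaSeparated(String item) {
        String group = "(?:" + item + ")"; // isolate the item so inner alternations stay contained
        return Pattern.compile(group + "(?:," + group + ")*");
    }

    public static void main(String[] args) {
        Pattern keyValues = commaSeparated("[a-z]+=[a-z]+");
        System.out.println(keyValues.matcher("a=b,c=d").matches());
        System.out.println(keyValues.matcher("a=b,").matches());
    }
}
```

Wrapping the item in a non-capturing group is a design choice: it keeps an item pattern that contains | from leaking into the surrounding expression.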
If $LONGSTUFF is really long and contains comma repeated items itself etc., it might be a good idea to not build the regexp by hand and instead rely on a computer for doing that for you, even if it's just through string concatenation. For example, I just wanted to build a regular expression to validate the CPUID parameter of a XEN configuration file, of the ['1:a=b,c=d','2:e=f,g=h'] type. I... believe this mostly fits the bill: (whitespace notwithstanding!)
import re
xend_fudge_item_re = r"""
e[a-d]x= #register of the call return value to fudge
(
0x[0-9A-F]+ | #either hardcode the reply
[10xks]{32} #or edit the bitfield directly
)
"""
xend_string_item_re = r"""
(0x)?[0-9A-F]+: #leafnum (the contents of EAX before the call)
%s #one fudge
(,%s)* #repeated multiple times
""" % (xend_fudge_item_re, xend_fudge_item_re)
xend_syntax = re.compile(r"""
\[ #a list of
'%s' #string elements
(,'%s')* #repeated multiple times
\]
$ #and nothing else
""" % (xend_string_item_re, xend_string_item_re), re.VERBOSE | re.MULTILINE)
Try ^(?!,)((, *)?([a-zA-Z0-9]+)\b)*$
Step by step description:
Don't match a beginning comma (good for the upcoming "loop").
Match optional comma and spaces.
Match characters you like.
The word boundary makes sure that a comma is required between consecutive items in the string.
Please use - ^((([a-zA-Z0-9\s]){1,45},)+([a-zA-Z0-9\s]){1,45})$
Here I have set the maximum item length to 45, as the longest word in English has 45 characters; change it as required.

Extract exactly n digits in a sentence using REGEX

Example
The no.s 1234 65
Input: n
For n=4, the output should be 1234
For n=2, the output should be : 65 (not 12)
Tried \d{n} which gives 12 and \d{n,} gives 1234 but i want the exact matching one.
Pattern p = Pattern.compile("\\d{" + n + ",}");
You need negative lookaround assertions: (?<!...) is a negative lookbehind and (?!...) is a negative lookahead:
(?<!\d)\d{4}(?!\d)
However, not all regex engines support them. A workaround is to also match the preceding and the following character (unlike lookarounds, which are zero-width matches) and capture only the digits (\D matches anything except a digit):
(?:^|\D)(\d{4})(?:\D|$)
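In Java, where lookarounds are supported, the pattern can be built from n at runtime. A minimal sketch, assuming a hypothetical helper named findExact; the sample sentence is the one from the question:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answers.
class ExactDigits {
    // Finds every run of exactly n digits: no digit directly before or after the run.
    static List<String> findExact(String text, int n) {
        Pattern p = Pattern.compile("(?<!\\d)\\d{" + n + "}(?!\\d)");
        Matcher m = p.matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String sentence = "The no.s 1234 65";
        System.out.println(findExact(sentence, 4));
        System.out.println(findExact(sentence, 2));
    }
}
```

For n=2 the run 1234 is skipped entirely, because every two-digit window inside it has a digit on at least one side.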
I think what you meant is the \b character.
Hence, the regex you're looking for would be (for n=2):
\b\d{2}\b
From what I understand, you're looking for a regex that will match a number in a string which has n digits, taking into account the spacing between the numbers. If that's the case, you're looking for something like this:
\b\d{4}\b
The \b ensures the match is constrained to the start/end of a 'word', i.e. the boundary between anything matched by \w (which includes digits) and anything matched by its opposite, \W (which includes spaces).
I don't code in java but I can try to answer this using regex in general.
If your number is in the format d1d2d3d4 d5d6 and you want to extract the digits d5d6, create 3 groups as r'([0-9]+)(\s)([0-9]+)'; each set of parentheses () represents one group. Then extract only the third group, which is your required output.

Regex: How to find exact value length?

I got these several cases of given String:
key1=12345
key1=12345&key2=12345
key1=12345123456789
key1=12345123456789&key2=123456789
Using this pattern: (key1)=([^&]{5})(|$)).
The expected results are:
12345
12345, 12345
nothing
nothing
And while running, the results were:
12345
12345, 12345
12345
12345, 12345
Which means that the {5} is actually cutting the text by the given length which is 5 and not looking for exact 5.
How can I make it to look for exact 5 and not to cut the text ?
This pattern will do it:
=([^&]{5})(?:&|$)
It finds =, followed by 5 captured characters that are not &, which must immediately be followed by either & or the end of the string.
Test
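import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Class name chosen for illustration only.
public class ExactLengthTest {
    public static void main(String[] args) {
        test("key1=12345");
        test("key1=12345&key2=12345");
        test("key1=12345123456789");
        test("key1=12345123456789&key2=123456789");
    }

    private static void test(String input) {
        // Capture 5 non-& characters that are followed by & or the end of the input.
        Matcher m = Pattern.compile("=([^&]{5})(?:&|$)").matcher(input);
        List<String> list = new ArrayList<>();
        while (m.find())
            list.add(m.group(1));
        System.out.println(list);
    }
}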
Output
[12345]
[12345, 12345]
[]
[]
([^=&]+)=(?<![^=&])([^=&]{5})(?![^=&])
key, then =, then five characters that are not = or &, not surrounded by more characters that are not = or &. (\b is useful in limited circumstances where your values are guaranteed to only consist of letters, numbers and underscores; negative lookaround is much more general).
Basically, you must add boundaries somehow. One can use anchors for that (such as \b for "word boundary", or ^ and $ for string/line boundaries). Another way is to let the match run until a given character appears, e.g. [^&\n]+ (everything up to the next & sign or newline), and then check the length programmatically in Java.
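The "check the length programmatically" route can be sketched without any regex at all. The helper below is hypothetical (its name and signature are not from the question); it splits the input on & and keeps only values of the requested length:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original answers.
class QueryValueFilter {
    // Splits "k=v&k=v" pairs and keeps only values whose length is exactly `length`.
    static List<String> valuesOfLength(String query, int length) {
        List<String> result = new ArrayList<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            // Skip pairs without '=' and values of the wrong length.
            if (eq >= 0 && pair.length() - eq - 1 == length) {
                result.add(pair.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(valuesOfLength("key1=12345&key2=12345", 5));
        System.out.println(valuesOfLength("key1=12345123456789&key2=123456789", 5));
    }
}
```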
I came up with a regex of my own:
(?:=)(\d{5})(?:[&|\n])
First of all I look for a = sign, but do not capture it.
Then I look for 5 digits...
... followed either by & sign or a newline (which I do not capture either)
If you need to look only for key1, key2 values, just use (?:key\d=) instead of (?:=).
Then the only matches found are of the given length.
@stribizhev's regex might be more robust against false positives, though.

Regex to allow only 10 or 16 digit comma separated number

I want to validate a textfield in a Java based app where I want to allow only comma separated numbers and they should be either 10 or 16 digits. I have a regex that ^[0-9,;]+$ to allow only numbers, but it doesn't work for 10 or 16 digits only.
You can use {n,m} to specify length.
So matching one number with either 10 or 16 digits would be
^(\d{10}|\d{16})$
Meaning: match exactly 10 or 16 digits, with the start of the line directly before them and the end of the line directly after.
Now add the separator:
^((\d{10}|\d{16})[,;])*(\d{10}|\d{16})$
That is: zero or more 10-or-16-digit numbers, each followed by either , or ;, and then one final 10-or-16-digit number at the end of the line.
You need to escape the backslashes in Java.
public static void main(String[] args) {
    String regex = "^((\\d{10}|\\d{16})[,;])*(\\d{10}|\\d{16})$";
    String y = "0123456789,0123456789123456,0123456789";
    System.out.println(y.matches(regex)); // should be true
    String n = "0123456789,01234567891234567,0123456789";
    System.out.println(n.matches(regex)); // should be false
}
I would probably use this regex:
(\d{10}(?:\d{6})?,?)+
Explanation:
( - Begin capture group
\d{10} - Match exactly 10 digits
(?: - Begin non-capturing group
\d{6} - Match 6 more digits
)? - End group, mark as optional using ?
,? - Optionally match a comma
)+ - End outer capture group, require 1 or more repetitions (maybe change to * for 0 or more)
The following inputs match this regex
1234567890123456,1234567890
1234567890123456
1234567890
these inputs do not match
123,1234567890
12355
123456789012
You need to have both anchors and word boundaries:
/^(?:\b(?:\d{10}|\d{16})\b,?)*$/
The anchors are necessary so you don't get false positives for partial matches and the word boundaries are necessary so you don't get false positives for 20, 26, 30, 32 digit numbers.
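The false positive described here can be reproduced in Java. The sketch below compares the pattern without word boundaries from the previous answer against the anchored, word-bounded one; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class TenOrSixteen {
    // Without word boundaries, two adjacent 10-digit runs can pass as one 20-digit number.
    private static final Pattern LOOSE = Pattern.compile("(\\d{10}(?:\\d{6})?,?)+");
    // With anchors and word boundaries, each number must be exactly 10 or 16 digits long.
    private static final Pattern STRICT = Pattern.compile("^(?:\\b(?:\\d{10}|\\d{16})\\b,?)*$");

    static boolean loose(String s) {
        return LOOSE.matcher(s).matches();
    }

    static boolean strict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String twentyDigits = "12345678901234567890";
        System.out.println(loose(twentyDigits));
        System.out.println(strict(twentyDigits));
        System.out.println(strict("1234567890,1234567890123456"));
    }
}
```

Note that the strict pattern, as written, still accepts an empty string and a trailing comma, since both the repetition and the comma are optional.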
Here is my version
(?:\d+,){9}\d+|(?:\d+,){15}\d+
Let's review it. First, a single quantifier cannot express "10 or 16", so I actually need two expressions with | between them.
Second, the expression itself. Your version just says that you allow digits and commas. However this is not what you really want because for example string like ,,, will match your regex.
So, the regex should be like (?:\d+,){n}\d+ that means: sequence of several digits terminated by comma and then sequence of several digits, e.g. 123,45,678 (where 123,45 match the first part and 678 match the second part)
Finally we get regex that I have written in the beginning of my answer:
(?:\d+,){9}\d+|(?:\d+,){15}\d+
And do not forget that when you write the regex in your Java code you have to double the backslashes, like this:
Pattern.compile("(?:\\d+,){9}\\d+|(?:\\d+,){15}\\d+")
EDIT: I have just added non-capturing group (?: ...... )

regex can't get special constructs (?=x) to work

I'm trying to get a valid regex to use in java (java.util.regex) that validates the following format:
a number that has at most 15 digits, of which at most 3 may be decimal digits, preceded by a separator (,)
So, valid would be:
123456789012345 (15 digits, ok)
12345678901234,1
[EDIT], these should also be valid:
1234567890123,45
123456789012,345
So far i've come up with the following regex pattern:
Pattern = [0-9]{1,15}(\,[0-9]{1,3})?
This checks for a range of 1 to 15 digits, followed by an optional separator followed by up to 3 more digits. However, this doesn't check the total length of the input. With this regex, a value of 123456789012345,123 would be valid.
So, i thought i'd add an extra expression that checks the total length, and use the "(?=" construct to simulate the logical AND behaviour.
So i started with adding that to my existing regex expression as follows:
Pattern = (?= [0-9]{1,15}(\,[0-9]{1,3})?)
This, however, results in basically everything I throw at it failing, and I can't get it to work further. I don't see what I'm doing wrong here. Once this works, I'd add another expression to check the total length, something like (?=.{16}) I think.
[EDIT]
Realised you wanted to accept total length of 16 if there is a ,, and also that you don't really need to use lookaround here, since you only have two cases. This works just fine:
public static boolean isValid(String input) {
return input.matches("^(\\d{0,15}|\\d{1,12},\\d{1,3})$");
}
This returns valid if one of these is true
input consists of 0-15 numbers or
input consists of 1-12 numbers, followed by a ,, followed by 1-3 numbers
[EDIT2]
Ok, new try:
public static boolean isValid(String input) {
return input.matches("^(\\d{0,15}|(?=.{3,16}$)\\d+,\\d{1,3})$");
}
This returns valid if one of these is true
input consists of 0-15 numbers or
input consists of 3-16 characters, consisting of at least one digit, followed by a ,, followed by 1-3 numbers
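One detail that is easy to miss with a length lookahead: (?=.{3,16}) on its own only requires that at least 3 characters follow, because nothing stops it from matching a prefix of a longer input. Anchoring it with $ makes the upper bound effective. A sketch comparing the two (class and method names are made up for illustration):

```java
// Hypothetical demo class, not part of the original answers.
class LookaheadLength {
    // Unanchored lookahead: satisfied by any input of 3 or more characters.
    static boolean unanchored(String s) {
        return s.matches("^(\\d{0,15}|(?=.{3,16})\\d+,\\d{1,3})$");
    }

    // End-anchored lookahead: the whole remaining input must be 3 to 16 characters long.
    static boolean anchored(String s) {
        return s.matches("^(\\d{0,15}|(?=.{3,16}$)\\d+,\\d{1,3})$");
    }

    public static void main(String[] args) {
        String tooLong = "123456789012345,123"; // 15 digits, separator, 3 digits
        System.out.println(unanchored(tooLong));
        System.out.println(anchored(tooLong));
        System.out.println(anchored("1234567890123,45"));
    }
}
```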
What about this one?
\d{12,15},?\d{3}
this worked for me.
boolean a = Pattern.matches("\\d{15}|\\d{12},\\d{3}", "123456789012345");
System.out.println(a);//true
boolean b = Pattern.matches("\\d{15}|\\d{12},\\d{3}", "123456789012,345");
System.out.println(b);//true
boolean c = Pattern.matches("\\d{15}|\\d{12},\\d{3}", "1234567890123,456");
System.out.println(c);//false
so your regEx is:
\d{15}|\d{12},\d{3}
Try this regex:
^\d{1,12}(?:\d{0,3}|\d{0,2},\d|\d{0,1},\d\d|,\d\d\d)$
