Regex for postcode with - allowed - java

This is my regex for postcodes
^[a-zA-Z0-9]{1,9}$
but A-12345 is not allowed. How to change the regex that - will be also allowed?

Add - at the beginning or at the end of the character set ([...]):
^[-a-zA-Z0-9]{1,9}$
Why at the beginning or at the end?: If - is placed as the first or the last character, it will match - literally instead of matching range of characters.

Try this:
^[a-zA-Z0-9-]{1,9}$
This will match strings consisting of 1 to 9 Latin letters, decimal digits or hyphens. If you use the CASE_INSENSITIVE flag, you can simplify this to:
^[a-z0-9-]{1,9}$

Related

Regex-How to prevent repeated special characters?

I don't have an experience on Regular Expressions. I need to a regular expression which doesn't allow to repeat of special characters (+-*/& etc.)
The string can contain digits, alphanumerics, and special characters.
This should be valid : abc,df
This should be invalid : abc-,df
i will be really appreciated if you can help me ! Thanks for advance.
Two solutions presented so far match a string that is not allowed.
But the tilte is How to prevent..., so I assume that the regex
should match the allowed string. It means that the regex should:
match the whole string if it does not contain 2
consecutive special characters,
not match otherwise.
You can achieve this putting together the following parts:
^ - start of string anchor,
(?!.*[...]{2}) - a negative lookahead for 2 consecutive special
characters (marked here as ...), in any place,
a regex matching the whole (non-empty) string,
$ - end of string anchor.
So the whole regex should be:
^(?!.*[!##$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2}).+$
Note that within a char class (between [ and ]) a backslash
escaping the following char should be placed before - (if in
the middle of the sequence), closing square bracket,
a backslash itself and / (regex terminator).
Or if you want to apply the regex to individual words (not the whole
string), then the regex should be:
\b(?!\S*[!##$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2})\S+
[\,\+\-\*\/\&]{2,} Add more characters in the square bracket if you want.
Demo https://regex101.com/r/CBrldL/2
Use the following regex to match the invalid string.
[^A-Za-z0-9]{2,}
[^\w!\s]{2,} This would be a shortest version to match any two consecutive special characters (ignoring space)
If you want to consider space, please use [^\w]{2,}

Regex for only two comma separated values, keeping second value optional

I am creating regex for two comma separated values (example - coordinates), i am using regex like below -
^(\-?\d+(\.\d+)?),\s*(\-?\d+(\.\d+)?)$
The above regex mandates two comma separated values, but i want the second value as optional including comma, so i tried changing the regex like this -
^(\-?\d+(\.\d+)?)(,\s*(\-?\d+(\.\d+)?)$)?
This is working but and keeping the second value optional, but it is also allowing comma without any second value like below -
3456,
What can be added in the regex to not allowing comma if second value is not present ? Thanks.
You misplaced the quantifier with the anchor.
Use
^(-?\d+(\.\d+)?)(,\s*(-?\d+(\.\d+)?))?$
^^
See the regex demo.
You may adjust the number of capturing groups in your pattern and convert the optional group into non-capturing by adding ?:after the opening (. I'd use it like
^(-?\d+(?:\.\d+)?)(?:,\s*(-?\d+(?:\.\d+)?))?$
See another demo.
Also note you do not need to escape a hyphen outside a character class.
When using it in Java, do not forget to use double backslashes to define a literal backslash in the string literal and omit ^ and $ if you use the pattern with .matches() method:
s.matches("-?\\d+(?:\\.\\d+)?(?:,\\s*-?\\d+(?:\\.\\d+)?)?")
Details:
^ - start of string anchor
(-?\d+(\.\d+)?) - Group 1 matching an optional hyphen, 1+ digits, then an optional sequence (Group 2) of a dot followed with one or more digits
(,\s*(-?\d+(\.\d+)?))? - an optional sequence (Group 3) matching one or zero occurrences of:
, - comma
\s* - zero or more whitespaces
(-?\d+(\.\d+)?) - Group 4 matching
-? - an optional hyphen
\d+ - one or more digits
(\.\d+)? - Group 5 matching an optional sequence of a dot followed with 1 or more digits
$ - end of string

Restrict consecutive characters using Java Regex

I need to allow alphanumeric characters , "?","." , "/" and "-" in the given string. But I need to restrict consecutive - only.
For example:
www.google.com/flights-usa should be valid
www.google.com/flights--usa should be invalid
currently I'm using ^[a-zA-Z0-9\\/\\.\\?\\_\\-]+$.
Please suggest me how to restrict consecutive - only.
You may use grouping with quantifiers:
^[a-zA-Z0-9/.?_]+(?:-[a-zA-Z0-9/.?_]+)*$
See the regex demo
Details:
^ - start of string
[a-zA-Z0-9/.?_]+ - 1 or more characters from the set defined in the character class (can be replaced with [\w/.?]+)
(?:-[a-zA-Z0-9/.?_]+)* - zero or more sequences ((?:...)*) of:
- - hyphen
[a-zA-Z0-9/.?_]+ - see above
$ - end of string.
Or use a negative lookahead:
^(?!.*--)[a-zA-Z0-9/.?_-]+$
^^^^^^^^^
See the demo here
Details:
^ - start of string
(?!.*--) - a negative lookahead that will fail the match once the regex engine finds a -- substring after any 0+ chars other than a newline
[a-zA-Z0-9/.?_-]+ - 1 or more chars from the set defined in the character class
$ - end of string.
Note that [a-zA-Z0-9_] = \w if you do not use the Pattern.UNICODE_CHARACTER_CLASS flag. So, the first would look like "^[\\w/.?]+(?:-[\\w/.?]+)*$" and the second as "^(?!.*--)[\\w/.?-]+$".
One approach is to restrict multiple dashes with negative look-behind on a dash, like this:
^(?:[a-zA-Z0-9\/\.\?\_]|(?<!-)-)+$
The right side of the |, i.e. (?<!-)-, means "a dash, unless preceded by another dash".
Demo.
I'm not sure of the efficiency of this, but I believe this should work.
^([a-zA-Z0-9\/\.\?\_]|\-([^\-]|$))+$
For each character, this regex checks if it can match [a-zA-Z0-9\/\.\?\_], which is everything you included in your regex except the hyphen. If that does not match, it instead tries to match \-([^\-]|$), which matches a hyphen not followed by another hyphen, or a hyphen at the end of the string.
Here's a demo.

How to replace all non-digit charaters in a string?

I need to replace all non-digit charaters in the string. For instance:
String: 987sdf09870987=-0\\\`42
Replaced: 987**sdf**09870987**=-**0**\\\`**42
That's all non-digit char-sequence wrapped into ** charaters. How can I do that with String::replaceAll()?
(?![0-9]+$).*
the regex doesn't match what I want. How can I do that?
(\\D+)
You can use this and replace by **$1**.See demo.
https://regex101.com/r/fM9lY3/2
You can use a negated character class for a non-digit and use the 0th group back-reference to avoid overhead with capturing groups (it is minimal here, but still is):
String x = "987sdf09870987=-0\\\\\\`42";
x = x.replaceAll("[^0-9]+", "**$0**");
System.out.println(x);
See demo on IDEONE. Output: 987**sdf**09870987**=-**0**\\\`**42.
Also, in Java regex, character classes look neater than multiple escape symbols, that is why I prefer this [^0-9]+ pattern meaning match 1 or more (+) symbols other than (because of ^) digits from 0 to 9 ([0-9]).
A couple of words about your (?![0-9]+$).* regex. It consists of a negative lookahead (?![0-9]+$) that checks if from the current position onward there are no digits only (if there are only digits up to the end of string, the match fails), and .* matching any characters but a newline. You can see example of what it is doing here. I do not think it can help you since you need to actually match non-numbers, not just check if digits are absent.

Match first occurrence of semicolon in string, only if not preceded by '--'

I'm trying to write a regular expression for Java that matches if there is a semicolon that does not have two (or more) leading '-' characters.
I'm only able to get the opposite working: A semicolon that has at least two leading '-' characters.
([\-]{2,}.*?;.*)
But I need something like
([^([\-]{2,})])*?;.*
I'm somehow not able to express 'not at least two - characters'.
Here are some examples I need to evaluate with the expression:
; -- a : should match
-- a ; : should not match
-- ; : should not match
--; : should not match
-;- : should match
---; : should not match
-- semicolon ; : should not match
bla ; bla : should match
bla : should not match (; is mandatory)
-;--; : should match (the first occuring semicolon must not have two or more consecutive leading '-')
It seems that this regex matches what you want
String regex = "[^-]*(-[^-]+)*-?;.*";
DEMO
Explanation: matches will accept string that:
[^-]* can start with non dash characters
(-[^-]+)*-?; is a bit tricky because before we will match ; we need to make sure that each - do not have another - after it so:
(-[^-]+)* each - have at least one non - character after it
-? or - was placed right before ;
;.* if earlier conditions ware fulfilled we can accept ; and any .* characters after it.
More readable version, but probably little slower
((?!--)[^;])*;.*
Explanation:
To make sure that there is ; in string we can use .*;.* in matches.
But we need to add some conditions to characters before first ;.
So to make sure that matched ; will be first one we can write such regex as
[^;]*;.*
which means:
[^;]* zero or more non semicolon characters
; first semicolon
.* zero or more of any characters (actually . can't match line separators like \n or \r)
So now all we need to do is make sure that character matched by [^;] is not part of --. To do so we can use look-around mechanisms for instance:
(?!--)[^;] before matching [^;] (?!--) checks that next two characters are not --, in other words character matched by [^;] can't be first - in series of two --
[^;](?<!--) checks if after matching [^;] regex engine will not be able to find -- if it will backtrack two positions, in other words [^;] can't be last character in series of --.
How about just splitting the string along -- and if there are two or more sub strings, checking if the last one contains a semicolon?
How about using this regex in Java:
[^;]*;(?<!--[^;]{0,999};).*
Only caveat is that it works with up to 999 character length between -- and ;
Java Regex Demo
I think this is what you're looking for:
^(?:(?!--).)*;.*$
In other words, match from the start of the string (^), zero or more characters (.*) followed by a semicolon. But replacing the dot with (?:(?!--).) causes it to match any character unless it's the beginning of a two-hyphen sequence (--).
If performance is an issue, you can exclude the semicolon as well, so it never has to backtrack:
^(?:(?!--|;).)*;.*$
EDIT: I just noticed your comment that the regex should work with the matches() method, so I padded it out with .*. The anchors aren't really necessary, but they do no harm.
You need a negative lookahead!
This regex will match any string which does not contain your original match pattern:
(?!-{2,}.*?;.*).*?;.*
This Regex matches a string which contains a semicolon, but not one occuring after 2 or more dashes.
Example:

Categories

Resources