Pattern
^\\d{1}-\\d{10}|\\d{1,9}|^TWC([0-9){12})$
should validate any of these
1-23232445
1-232323
1-009121212
12
12222
TWC12222
TWC1222324
When I test it, the TWC pattern doesn't match. I added "|" as an OR condition, followed by the digits 0-9 limited to 12 digits. What am I missing?
TWC([0-9)
I think this is where it might be not working??
You need
TWC([0-9]{12})
Complete answer...
(\d{1}-\d{1,12})|^TWC(\d{1,12})$
even nicer answer ..
^(\\d-|TWC|)(\\d{1,12})$ // I believe this syntax will match your needs.
tested :)
^([0-9]-|TWC|)([0-9]{1,12})$ // or
^(\d-|TWC|)(\d{1,12})$
breakdown
^
this denotes the start of the string
\d or [0-9]
denotes one character from the digits 0 through 9 (note that \d might not work in some languages, or might require different syntax!)
|
is essentially an OR
{1,12}
accepts the preceding pattern 1 to 12 times; in my code that pattern is \d or [0-9]
$
is the end of the line
This essentially checks whether the line starts with a digit followed by a dash, with TWC, or with nothing at all (the empty alternative covers a missing prefix), and then reads up to 12 digits. It should work for all your cases.
NOTE:
Check the syntax rules of the language you are using. In some cases you need to escape characters with \ for them to work; in a Java, C or C++ string literal every backslash has to be doubled (\\d instead of \d). Be very wary of these language-specific syntaxes.
Sorry for all the confusion, and also for lying a whole bunch apparently. The issue you're having is that you are using exact quantifiers in a couple of places you don't mean to, namely the {10} and {12}. This requires exactly ten or twelve digits in those spots. What you presumably want is for those to be {1,10} and {1,12} respectively.
What I would do is something like this, using parentheses and quantifiers to clean everything up and repeating yourself as little as possible, to avoid confusion. You've got three possible prefixes (a digit and a dash, or "TWC", or nothing). I'd put those possibilities all together, and then add the rest. This makes the regex much easier to look at.
^(\\d-|TWC){0,1}\\d{1,12}$
The breakdown:
^ is at the beginning, always.
(\\d-|TWC){0,1} Next comes either a single digit followed by a dash, or the string "TWC". This prefix occurs either zero times (for no prefix) or one time.
\\d{1,12}$ Finally, there is a string of one to twelve digits, followed by the end of the line/input (depending on your MULTILINE setting, of course).
Of course you won't be able to simplify it quite this much if the different prefixes can only allow certain numbers of digits, but this is the basic idea.
You've also got what looks like a typo; TWC([0-9){12}) should be TWC([0-9]{12}). I'm guessing this was just a typo when writing out the question though, since what you have right now would blow up at runtime when you tried to use it otherwise, and it sounds like it's working for some of your inputs.
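As a quick check, here is a minimal Java sketch that runs the combined pattern against the sample inputs from the question. The class and method names (PrefixedNumberCheck, isValid) are made up for illustration, and ? is used as shorthand for {0,1}.

```java
import java.util.regex.Pattern;

final class PrefixedNumberCheck {
    // Optional prefix (one digit and a dash, or "TWC"), then 1 to 12 digits.
    private static final Pattern VALID = Pattern.compile("^(\\d-|TWC)?\\d{1,12}$");

    // Hypothetical helper: true when the whole input fits the pattern.
    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1-23232445", "1-232323", "1-009121212",
            "12", "12222", "TWC12222", "TWC1222324"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

If the prefixes need different digit limits, the single \\d{1,12} would have to be split per prefix again.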
Related
I am trying to figure out a regex to match a password that contains
one upper case letter.
one number
one special character.
and at least 4 characters of length
the regex that I wrote is
^((?=.*[0-9])(?=.*[A-Z])(?=.*[^A-Za-z0-9])){4,}
however it is not working, and I couldn't figure out why.
So please can someone tell me why this code is not working, where did I mess up, and how to correct this code.
Your regex can be rewritten as
^(
(?=.*[0-9])
(?=.*[A-Z])
(?=.*[^A-Za-z0-9])
){4,}
As you can see, {4,} applies to a group that cannot consume any characters, because look-arounds are zero-width; it effectively means "4 or more of nothing".
You need to add . before {4,} to let your regex handle "and at least 4 characters of length" point (rest is handled by look-around).
You can remove that capturing group since you don't really need it.
So try with something like
^(?=.*[0-9])(?=.*[A-Z])(?=.*[^A-Za-z0-9]).{4,}
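A minimal Java sketch of that corrected pattern, assuming the password is checked as a whole string with matches(). The class and method names (PasswordCheck, isStrong) are illustrative only.

```java
import java.util.regex.Pattern;

final class PasswordCheck {
    // The three lookaheads assert a digit, an upper-case letter and a
    // non-alphanumeric character without consuming anything;
    // ".{4,}" then consumes at least four characters.
    private static final Pattern STRONG =
            Pattern.compile("^(?=.*[0-9])(?=.*[A-Z])(?=.*[^A-Za-z0-9]).{4,}");

    // Hypothetical helper: true when the whole password satisfies all rules.
    static boolean isStrong(String password) {
        return STRONG.matcher(password).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"Test123!", "A1!", "weakone"}) {
            System.out.println(candidate + " -> " + isStrong(candidate));
        }
    }
}
```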
You could come up with something like:
^(?=.*[A-Z])(?=.*\d)(?=.*[!"§$%&/()=?`]).{4,}$
In multiline mode, see a demo on regex101.com.
This approach specifies the special characters directly (which could be extended, obviously).
From the following list, only Test123!, StrongPassword34? and Tabaluga"12??? satisfy these criteria:
test
Test123!
StrongPassword34?
weakone
Tabaluga"12???
You can still enhance this expression by being more specific and requiring contrary pairs. Just to remind you, the dot-star (.*) brings you down the line and then backtracks eventually. This will almost always require more steps than to directly look for contrary pairs.
Consider the following expression:
^ # bind the expression to the beginning of the string
(?=[^A-Z\n\r]*[A-Z]) # look ahead for sth. that is not A-Z, or newline and require one of A-Z
(?=[^\d\n\r]*\d) # same construct for digits
(?=\w*[^\w\n\r]) # same construct for special chars (\w = _A-Za-z0-9)
.{4,}
$
You'll see a significant reduction in steps, as the regex engine does not have to backtrack every time.
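For Java, the same expression could be written roughly as below. This is only a sketch: the class and method names (ContraryClassPasswordCheck, isStrong) are invented, the regex comments are moved into Java comments, and it does not measure the step count. Note that this version treats the underscore as a word character, so it does not count as a special character here.

```java
import java.util.regex.Pattern;

final class ContraryClassPasswordCheck {
    private static final Pattern STRONG = Pattern.compile(
            "^"
            + "(?=[^A-Z\\n\\r]*[A-Z])"   // skip non-upper-case chars, then require A-Z
            + "(?=[^\\d\\n\\r]*\\d)"     // skip non-digits, then require a digit
            + "(?=\\w*[^\\w\\n\\r])"     // skip word chars, then require a non-word char
            + ".{4,}"
            + "$");

    // Hypothetical helper: true when the whole password satisfies all rules.
    static boolean isStrong(String password) {
        return STRONG.matcher(password).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"test", "Test123!", "StrongPassword34?", "weakone", "Tabaluga\"12???"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isStrong(candidate));
        }
    }
}
```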
I'm looking for a regex that will match a period character, ONLY if none of that period's surrounding characters are also periods.
Fine by me... leave! FAIL
Okay.. You win. SUCCEED
Okay. SUCCEED //Note here, the period is the last char in the string.
I was thinking do:
[^\\.*]\\.
But that is just wrong and probably not at all in the right direction. I hope this question helps others in the same situation as well.
Thanks.
You need to wrap the dot in negative look arounds:
(?<![.])[.](?![.])
I prefer [.] over \\., because:
It's easier to read - there are too many back slashes in java literals already
[.] looks a bit like an X wing fighter from Star Wars ™
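A small Java sketch of the look-around version, run against the examples from the question. The class and method names (LonePeriodFinder, lonePeriodIndexes) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LonePeriodFinder {
    // A period with no period directly before it and none directly after it.
    private static final Pattern LONE_PERIOD = Pattern.compile("(?<![.])[.](?![.])");

    // Hypothetical helper: returns the index of every lone period in the text.
    static List<Integer> lonePeriodIndexes(String text) {
        List<Integer> indexes = new ArrayList<>();
        Matcher matcher = LONE_PERIOD.matcher(text);
        while (matcher.find()) {
            indexes.add(matcher.start());
        }
        return indexes;
    }

    public static void main(String[] args) {
        String[] samples = {"Fine by me... leave!", "Okay.. You win.", "Okay."};
        for (String sample : samples) {
            System.out.println(sample + " -> " + lonePeriodIndexes(sample));
        }
    }
}
```

Because the look-arounds are zero-width, only the period itself is part of the match, which matters if you want to replace it.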
You can use negative look ahead and look behind or this alternative regex:
String regex = "(^\\.[^\\.]|[^\\.]\\.[^\\.]|[^\\.]\\.$)";
The first alternative checks the beginning ^ of the string (in case it starts with a dot), the second looks for any dot inside the string, and the third looks for a dot at the end of the string $.
That regex will still match any period that isn't preceded by another period.
[^\.]\.[^\.] Takes care of both sides of the target period.
EDIT: Java doesn't have a raw string like Python, so you would need full escapes: [^.]\\.[^.]|^\\.[^.]|[^.]\\.$
I have a regex like this (which is thanks to you guys in a big way):
(?<=( |\\s|\\A|^))#(!)[\\w]{3,}+ ?[\\w]*
This works great, but I now need to match one more case and I can't work out how to do it. I need a minimum of 3 chars after the #, which I've done, but I also need to allow a form with at least two chars before a space and at least one after it; the space is optional. So I need to match these patterns:
#tst
#tst test
#ts t
How can I enforce a minimum of three chars if there's no space, or a minimum of two chars, a space and then at least one more char? I can do it as two separate expressions, but I'm hoping it's possible to do it with one.
Can anyone point me in the right direction..
Thanks.
EDIT:
Ok I think I've kind of achieved what I want with:
(?<=( |\\s|\\A|^))#{1}(([\\w]{3,}+ ?[\\w]*)|([\\w]{2,} {1}[\\w]{1,}))
Is there a more efficient way or is this how it should be done?
I think you can simplify your expression a bit:
String regex = "(?<=(\\s|\\A|^))#(\\w{2,} ?\\w+)";
What I have done:
Removed the redundant space from the first part.
Simplified the last expression. It now accepts, as per your description, a minimum of 2 characters, followed by an optional whitespace, followed by at least one more character.
I'm not sure what the point of the (!) part was, so it is removed in this version to match your test cases.
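Here is a minimal Java sketch that tries this simplified regex on the three patterns from the question. The names HashTagMatcher and findTag are invented for the example; group 2 is used because the group inside the look-behind counts as group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HashTagMatcher {
    private static final Pattern TAG =
            Pattern.compile("(?<=(\\s|\\A|^))#(\\w{2,} ?\\w+)");

    // Hypothetical helper: returns the text after '#', or null if nothing matches.
    static String findTag(String input) {
        Matcher matcher = TAG.matcher(input);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"#tst", "#tst test", "#ts t", "#ts"}) {
            System.out.println(sample + " -> " + findTag(sample));
        }
    }
}
```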
The following one should suit your needs:
(?<=(?<!\w)#)\w{2,} ?\w+
Don't forget to escape the backslashes in Java, since the pattern goes inside a string literal:
(?<=(?<!\\w)#)\\w{2,} ?\\w+
The simplest regex I can think of is:
(?<=\s|^)#(\s*\w){3,}
I've been pulling my hair out over this, and I know it's a simple solution that just seems to escape me at the moment.
I am attempting to perform a match using a Regex code (client side, character classes only) that will match "looking for" within 20 spaces (any character) of "male".
I don't care what the characters or spaces are, it must not find a match if the two words/phrases are more than 20 characters apart.
I have the code set up to match the phrases; I just need to know how to set the parameter for a distance search: "Only match Looking for with Male if they are within zero to twenty characters of each other."
(?i).*looking for.{0,20}male.*
The (?i) flag is just "ignore case".
EDIT:
with the suggestions:
Pattern.compile("(?is).*\\blooking for\\b.{0,20}\\bmale\\b.*");
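A short Java sketch of the distance check, assuming the second word is "male" as in the question. The class and method names (ProximityCheck, mentionsNearby) are made up for illustration.

```java
import java.util.regex.Pattern;

final class ProximityCheck {
    // (?i) ignores case, (?s) lets '.' cross line breaks;
    // .{0,20} allows at most 20 characters between the two phrases.
    private static final Pattern NEAR =
            Pattern.compile("(?is).*\\blooking for\\b.{0,20}\\bmale\\b.*");

    // Hypothetical helper: true when "male" follows "looking for" within 20 characters.
    static boolean mentionsNearby(String text) {
        return NEAR.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(mentionsNearby("Looking for a single male"));
        System.out.println(mentionsNearby("looking for someone who is preferably a male"));
    }
}
```

The word boundaries also keep "female" from counting as "male".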
Maybe you shouldn't pull your hair out, but instead start with the root of the issue? I mean, can't you structure your code/application more logically, so you wouldn't need such a weird string search with even weirder distance matching?
I have two regular expressions that I use to validate Colorado driver's license formats.
[0-9]{2}[-][0-9]{3}[-][0-9]{4}
and
[0-9]{9}
We have to allow for only 9 digits but the user is free to enter it in as 123456789 or 12-345-6789.
Is there a way I can combine these into one? Like a regex conditional statement of sorts? Right now I am simply enumerating through all the available formats and breaking out once one is matched. I could always strip the hyphens out before I do the compare and only use [0-9]{9}, but then I won't be learning anything new.
For a straight combine,
(?:[0-9]{2}[-][0-9]{3}[-][0-9]{4}|[0-9]{9})
or to merge the logic (allowing dashes in one position without the other, which may not be desired),
[0-9]{2}-?[0-9]{3}-?[0-9]{4}
(The brackets around the hyphens in your first regex aren't doing anything.)
Or merging the logic so that both hyphens are required if one is present,
(?:\d{2}-\d{3}-|\d{5})\d{4}
(Your [0-9]s can also be replaced with \ds.)
How about using a backreference to match the second hyphen only if the first is given:
\d{2}(-?)\d{3}\1\d{4}
I've never used regexes in Java, though, so even if this is supported the syntax might be different. I've only tried it out in Ruby.
A neat version which will allow either dashes or not dashes is:
\d{2}(-?)\d{3}\1\d{4}
The capture (the brackets) will capture either '-' or nothing. The \1 will match again whatever was captured.
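In Java the backreference works with the same syntax once the backslashes are doubled. A minimal sketch, with invented names (LicenseFormatCheck, isValid):

```java
import java.util.regex.Pattern;

final class LicenseFormatCheck {
    // Group 1 captures "-" or the empty string; \1 must then repeat
    // exactly what was captured, so the hyphens are all-or-nothing.
    private static final Pattern LICENSE =
            Pattern.compile("\\d{2}(-?)\\d{3}\\1\\d{4}");

    // Hypothetical helper: true when the whole input is a valid license number.
    static boolean isValid(String input) {
        return LICENSE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123456789", "12-345-6789", "12-3456789", "12345-6789"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```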
I think this should work:
\d{2}-?\d{3}-?\d{4}