Minimum amount of chars excluding space - java

I have a regex like this (which is thanks to you guys in a big way):
(?<=( |\\s|\\A|^))#(!)[\\w]{3,}+ ?[\\w]*
This works great, but I now need to match one more case and I can't work out how to do it. I already require a minimum of 3 chars after the #, but I also need to allow a minimum of 3 chars split around a space: at least two before the space and at least one after it. The space itself is optional. So I need to match these patterns:
#tst
#tst test
#ts t
How can I enforce a minimum of three chars if there's no space, or a minimum of two chars, a space and then at least one more char? I can do it as two separate expressions but I'm hoping it's possible to do it with one.
Can anyone point me in the right direction?
Thanks.
EDIT:
Ok I think I've kind of achieved what I want with:
(?<=( |\\s|\\A|^))#{1}(([\\w]{3,}+ ?[\\w]*)|([\\w]{2,} {1}[\\w]{1,}))
Is there a more efficient way or is this how it should be done?

I think you can simplify your expression a bit:
String regex = "(?<=(\\s|\\A|^))#(\\w{2,} ?\\w+)";
What I have done:
Removed the redundant space from the first part.
Simplified the last expression. It now accepts, as per your description, a minimum of 2 characters, followed by an optional space, followed by at least one more character.
I'm not sure what the point of the (!) part was, so it is removed in this version to match your test cases.
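As a rough check, the simplified expression can be run against the three patterns from the question. This is only a sketch: the class name HashtagMinChars and the helper isValid are made up for illustration, and it assumes the whole input is a single tag (hence matches() rather than find()).

```java
import java.util.regex.Pattern;

public class HashtagMinChars {
    // Regex from the answer above, written as a Java string literal.
    private static final Pattern TAG =
            Pattern.compile("(?<=(\\s|\\A|^))#(\\w{2,} ?\\w+)");

    // Hypothetical helper: true when the entire input is one valid tag.
    static boolean isValid(String input) {
        return TAG.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The first three are the question's patterns; the last two are too short.
        String[] samples = {"#tst", "#tst test", "#ts t", "#ts", "#t s"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

The three patterns from the question are accepted, while "#ts" and "#t s" are rejected because fewer than two word characters come before the optional space or fewer than three are present overall.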

The following one should suit your needs:
(?<=(?<!\w)#)\w{2,} ?\w+
Don't forget to escape the backslashes when you put it in a Java string literal:
(?<=(?<!\\w)#)\\w{2,} ?\\w+

The simplest regex I can think of is:
(?<=\s|^)#(\s*\w){3,}
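One thing worth checking with this shorter form is how loose it is: it counts word characters and lets whitespace appear before any of them. The sketch below (class name LooseHashtag and helper isValid are hypothetical, and whole-input matching is an assumption) shows that it accepts the question's patterns but also inputs such as "#t s t".

```java
import java.util.regex.Pattern;

public class LooseHashtag {
    // The shortest regex above, escaped for a Java string literal.
    private static final Pattern TAG =
            Pattern.compile("(?<=\\s|^)#(\\s*\\w){3,}");

    // Hypothetical helper: true when the entire input matches.
    static boolean isValid(String input) {
        return TAG.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "#t s t" and "# tst" are not among the question's patterns,
        // yet this regex accepts them because \s* may precede every \w.
        String[] samples = {"#tst", "#tst test", "#ts t", "#ts", "#t s t", "# tst"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

If "at least two characters before the space" is a hard requirement, one of the stricter expressions is probably the safer choice.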

Related

Regular expression for the name

I need to build a regex for a name with the following pattern, so
John D.E. would pass the regex test.
Basically what I want is:
N number of chars(a-zA-Z) goes first
Then there's exactly one space
Exactly one char(a-zA-Z)
Exactly one dot
Exactly one char(a-zA-Z)
Exactly one dot
I wrote this regex ^([a-zA-Z]*)+( {1})+([a-zA-Z]{1})+(\.)+([a-zA-Z]{1})+(\.), but it doesn't seem to work properly (the expression still allows n number of spaces, for example). How do I restrict it? {1} doesn't work.
Try this:
^([a-zA-Z])+([ ]{1})([a-zA-Z]{1})([.])([a-zA-Z]{1})([.])
I've put the space and the dots into character classes ([]). An unescaped dot outside a class matches any character. Also, the pluses after every group are redundant: + means one or more repetitions, which is why extra spaces were allowed.
P.S.: @f1sh correctly points out that {1} doesn't change a thing, so the shorter form would be:
^([a-zA-Z])+([ ])([a-zA-Z])([.])([a-zA-Z])([.])
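A quick way to confirm that the shorter form restricts the space count is to run it in Java. This is a sketch only; the class name NameInitials and the helper isValid are invented here, and matches() is used so that trailing text is rejected as well.

```java
import java.util.regex.Pattern;

public class NameInitials {
    // Shorter form from the answer above, escaped for a Java string literal.
    private static final Pattern NAME =
            Pattern.compile("^([a-zA-Z])+([ ])([a-zA-Z])([.])([a-zA-Z])([.])");

    // Hypothetical helper: the whole input must have the shape "Name X.Y.".
    static boolean isValid(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("John D.E."));   // one space
        System.out.println(isValid("John  D.E."));  // two spaces
        System.out.println(isValid("John DxEx"));   // dots replaced by letters
    }
}
```

Only the first input is accepted: the single [ ] allows exactly one space, and [.] matches a literal dot rather than any character.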

Regex to match a period but not if there is a period on either side (Java)

I'm looking for a regex that will match a period character, ONLY if none of that period's surrounding characters are also periods.
Fine by me... leave! FAIL
Okay.. You win. SUCCEED
Okay. SUCCEED //Note here, the period is the last char in the string.
I was thinking do:
[^\\.*]\\.
But that is just wrong and probably not at all in the right direction. I hope this question helps others in the same situation as well.
Thanks.
You need to wrap the dot in negative look arounds:
(?<![.])[.](?![.])
I prefer [.] over \\., because:
It's easier to read - there are too many back slashes in java literals already
[.] looks a bit like an X wing fighter from Star Wars ™
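The lookaround version can be tried against the three sample sentences from the question. The sketch below uses find(), since the period may sit anywhere in the string; the class name LonePeriod and the helper hasLonePeriod are assumptions made for the example.

```java
import java.util.regex.Pattern;

public class LonePeriod {
    // A period that has no period directly before or after it.
    private static final Pattern LONE = Pattern.compile("(?<![.])[.](?![.])");

    // Hypothetical helper: true if the text contains at least one lone period.
    static boolean hasLonePeriod(String text) {
        return LONE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLonePeriod("Fine by me... leave!")); // only "..."
        System.out.println(hasLonePeriod("Okay.. You win."));      // final "." stands alone
        System.out.println(hasLonePeriod("Okay."));                // "." at end of input
    }
}
```

Because lookarounds are zero-width, the match is just the period itself and it works at the very start or end of the input without extra alternatives.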
You can use negative look ahead and look behind or this alternative regex:
String regex = "(^\\.[^\\.]|[^\\.]\\.[^\\.]|[^\\.]\\.$)";
The first alternative checks the beginning ^ of the string (in case it starts with a dot), the second looks for any dot inside, and the third looks for a dot at the end of the string $.
That regex will still match any period that isn't preceded by another period.
[^\.]\.[^\.] Takes care of both sides of the target period.
EDIT: Java doesn't have a raw string like Python, so you would need full escapes: [^.]\\.[^.]|^\\.[^.]|[^.]\\.$

Java Regexp pattern check

Pattern
^\\d{1}-\\d{10}|\\d{1,9}|^TWC([0-9){12})$
should validate any of these
1-23232445
1-232323
1-009121212
12
12222
TWC12222
TWC1222324
When I test a TWC value, the pattern doesn't match. I added "|" as an OR condition and then allowed digits 0-9, limited to 12 digits. What am I missing?
TWC([0-9)
I think this is where it might be not working??
You need
TWC([0-9]{12})
Complete answer...
(\d{1}-\d{1,12})|^TWC(\d{1,12})$
even nicer answer ..
^(\\d-|TWC|)(\\d{1,12})$ // this syntax i believe will match your needs.
tested :)
^([0-9]-|TWC|)([0-9]{1,12})$ // or
^(\d-|TWC|)(\d{1,12})$
breakdown
^
this denotes the start of the string
\d or [0-9]
denotes one character of the numbers 0 through 9 (note \d might not work in some languages or require different syntax!)
|
is essentially an OR
{1,12}
will only accept the preceding pattern 1-12 times; in my code that pattern would be \d or [0-9]
$
is the end of the line
This essentially checks whether the line starts with a digit followed by a -, with TWC, or with nothing at all (the empty alternative), and then reads up to 12 digits. It should work for all your cases.
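Since this version wraps the prefix and the digits in capturing groups, the two parts can be pulled out separately. A minimal sketch, assuming a made-up class PrefixGroups and helper split that returns null when the input does not match:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixGroups {
    // Group 1: "d-", "TWC" or empty; group 2: one to twelve digits.
    private static final Pattern ID = Pattern.compile("^(\\d-|TWC|)(\\d{1,12})$");

    // Hypothetical helper: returns {prefix, digits}, or null if there is no match.
    static String[] split(String input) {
        Matcher m = ID.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2)};
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1-232323", "12", "TWC12222", "TWC"}) {
            String[] parts = split(s);
            System.out.println(s + " -> "
                    + (parts == null ? "no match" : "[" + parts[0] + "][" + parts[1] + "]"));
        }
    }
}
```

For an input without a prefix, such as "12", group 1 is the empty string rather than null, because the empty alternative inside the group is what matched.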
NOTE:
Check the syntax rules of the language you are using: in some cases you need to escape the backslashes for these patterns to work. In Java and C/C++ string literals that means writing \\ where the regex has \, so please be careful about the particular syntax.
Sorry for all the confusion, and also for lying a whole bunch apparently. The issue you're having is that you are using exact quantifiers in a couple of places you don't mean to, namely the {10} and {12}. This requires exactly ten or twelve digits in those spots. What you presumably want is for those to be {1,10} and {1,12} respectively.
What I would do is something like this, using parentheses and quantifiers to clean everything up and repeating yourself as little as possible, to avoid confusion. You've got three possible prefixes (a digit and a dash, or "TWC", or nothing). I'd put those possibilities all together, and then add the rest. This makes the regex much easier to look at.
^(\\d-|TWC){0,1}\\d{1,12}$
The breakdown:
^ is at the beginning, always.
(\\d-|TWC){0,1} Next comes either a single digit followed by a dash, or the string "TWC". This prefix occurs either zero times (for no prefix) or one time.
\\d{1,12}$ Finally, there is a string of one to twelve digits, followed by the end of the line/input (depending on your MULTILINE setting, of course).
Of course you won't be able to simplify it quite this much if the different prefixes can only allow certain numbers of digits, but this is the basic idea.
You've also got what looks like a typo; TWC([0-9){12}) should be TWC([0-9]{12}). I'm guessing this was just a typo when writing out the question though, since what you have right now would blow up at runtime when you tried to use it otherwise, and it sounds like it's working for some of your inputs.
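The combined pattern from this answer can be checked against the sample values listed in the question. The following is a sketch under assumptions: the class name AccountPattern and helper isValid are illustrative, and the rejected inputs are extra cases chosen here, not taken from the question.

```java
import java.util.regex.Pattern;

public class AccountPattern {
    // Optional prefix ("d-" or "TWC"), then one to twelve digits.
    private static final Pattern ID = Pattern.compile("^(\\d-|TWC){0,1}\\d{1,12}$");

    // Hypothetical helper: true when the whole input is a valid identifier.
    static boolean isValid(String input) {
        return ID.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Sample values from the question.
        String[] expectedValid = {
            "1-23232445", "1-232323", "1-009121212",
            "12", "12222", "TWC12222", "TWC1222324"
        };
        for (String s : expectedValid) {
            System.out.println(s + " -> " + isValid(s));
        }
        // Extra cases (assumed here): bare prefixes and a 13-digit number.
        for (String s : new String[] {"TWC", "1-", "1234567890123"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

All seven values from the question are accepted, while a bare prefix or a run of thirteen digits is rejected.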

Need regular expression for pattern this

I need a regular expression for below pattern
It can start with / or number
It can only contain numbers, no text
Numbers can have space in between them.
It can contain /*, at least 1 number and space or numbers and /*
Valid Strings:
3232////33 43/323//
3232////3343/323//
/3232////343/323//
Invalid Strings:
/sas/3232/////dsds/
/ /34343///// /////
///////////
My problem is that a space is allowed between numbers, as in /3232 323/, but not between slashes, as in / /.
How can I validate it?
I have tried so far:
(\\d[\\d ]*/+) , (/*\\d[\\d ]*/+) , (/*)(\\d*)(/*)
This regex should work for you:
^/*(?:\\d(?: \\d)*/*)+$
Live Demo: http://www.rubular.com/r/pUOYFwV8SQ
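To see how this pattern behaves on the valid and invalid strings from the question, it can be wrapped in a small Java check. This is just a sketch; the class name SlashNumbers and the helper isValid are hypothetical.

```java
import java.util.regex.Pattern;

public class SlashNumbers {
    // Optional leading slashes, then one or more runs of
    // "digit (space digit)* slashes*"; a space is only legal between digits.
    private static final Pattern VALID = Pattern.compile("^/*(?:\\d(?: \\d)*/*)+$");

    // Hypothetical helper: true when the whole input satisfies the rules.
    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "3232////33 43/323//", "3232////3343/323//", "/3232////343/323//",
            "/sas/3232/////dsds/", "/ /34343///// /////", "///////////"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

The first three strings are accepted and the last three are rejected: letters never match, a space next to a slash never matches, and the + on the group demands at least one digit.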
My solution is not so simple but it works
^(((\d[\d ]*\d)|\d)|/)*((\d[\d ]*\d)|\d)(((\d[\d ]*\d)|\d)|/)*$
Just use lookarounds for the last criteria.
^(?=.*?\\d)([\\d/]*(?:/ ?(?!/)|\\d ?))+$
The best would have been to use conditional regex, but I think Java doesn't support them.
Explanation:
Basically, numbers or slashes, followed by one number and a space, or one slash and a space which is not followed by another slash. Repeat that. The space is made optional because I assume there's none at the end of your string.
Try this java regex
/*(\\d[\\d ]*(?<=\\d)/+)+
It meets all your criteria.
Although you didn't specifically state it, I have assumed that a space may not appear as the first or last character for a number (ie spaces must be between numbers)
"(?![A-z])(?=.*[0-9].*)(?!.*/ /.*)[0-9/ ]{2,}(?![A-z])"
this will match what you want but keep in mind it will also match this
/3232///// from /sas/3232/////dsds/
this is because part of the invalid string is correct
If you are reading line by line, anchor the pattern with ^ and $; if you are reading an entire block of text, look for \r\n around the regex above to match each line.

two regex patterns, can they be one?

I have two regular expressions that I use to validate Colorado driver's license formats.
[0-9]{2}[-][0-9]{3}[-][0-9]{4}
and
[0-9]{9}
We have to allow for only 9 digits but the user is free to enter it in as 123456789 or 12-345-6789.
Is there a way I can combine these into one? Like a regex conditional statement of sorts? Right now I am simply enumerating through all the available formats and breaking out once one is matched. I could always strip the hyphens out before I do the compare and only use [0-9]{9}, but then I won't be learning anything new.
For a straight combine,
(?:[0-9]{2}[-][0-9]{3}[-][0-9]{4}|[0-9]{9})
or to merge the logic (allowing dashes in one position without the other, which may not be desired),
[0-9]{2}-?[0-9]{3}-?[0-9]{4}
(The brackets around the hyphens in your first regex aren't doing anything.)
Or merging the logic so that both hyphens are required if one is present,
(?:\d{2}-\d{3}-|\d{5})\d{4}
(Your [0-9]s can also be replaced with \ds.)
How about using a backreference to match the second hyphen only if the first is given:
\d{2}(-?)\d{3}\1\d{4}
Although I've never used regexes in Java so if it's supported the syntax might be different. I've just tried this out in Ruby.
A neat version which will allow either dashes or not dashes is:
\d{2}(-?)\d{3}\1\d{4}
The capture (the brackets) will capture either '-' or nothing. The \1 will match again whatever was captured.
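Java's java.util.regex does support the \1 backreference, so the expression carries over once the backslashes are doubled in the string literal. The sketch below contrasts it with the plain optional-hyphen form; the class name LicenseFormat and both helper names are invented for the example.

```java
import java.util.regex.Pattern;

public class LicenseFormat {
    // \1 must repeat whatever group 1 captured: "-" or the empty string.
    private static final Pattern BACKREF =
            Pattern.compile("\\d{2}(-?)\\d{3}\\1\\d{4}");
    // Each hyphen is optional on its own, so mixed forms slip through.
    private static final Pattern OPTIONAL =
            Pattern.compile("\\d{2}-?\\d{3}-?\\d{4}");

    // Hypothetical helpers: whole-input validation with each pattern.
    static boolean strict(String input) {
        return BACKREF.matcher(input).matches();
    }

    static boolean lenient(String input) {
        return OPTIONAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123456789", "12-345-6789", "12-3456789", "12345-6789"}) {
            System.out.println(s + " -> strict=" + strict(s) + ", lenient=" + lenient(s));
        }
    }
}
```

Both patterns accept 123456789 and 12-345-6789, but only the lenient one accepts the half-hyphenated forms 12-3456789 and 12345-6789.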
I think this should work:
\d{2}-?\d{3}-?\d{4}
