At least 1 capital, followed by numeric, optional dashes and space - java

I want to implement regex which allows only capital, numeric, optional dash and space.
Format is: At least 1 capital, followed by numeric and optional dash and space.
I tried
/^[A-Z0-9- ]+$/
But it's not working. Can anyone please help. Thanks in advance.

You wrote: "At least 1 capital, followed by numeric and optional dash and space."
This regular expression requires a capital letter in the first position, followed by any number of letters, digits, hyphens and spaces:
^[A-Z][a-zA-Z0-9 -]*$
In case you meant a capital letter at any position, this regular expression requires at least one capital letter anywhere in the string and allows only letters (upper- and lowercase), digits, spaces and hyphens in the other positions:
^[a-zA-Z0-9 -]*[A-Z][a-zA-Z0-9 -]*$
This is valid
This is valid 2
This is n*t valid
thiS is valid 3
this Is -4 valid
This + is not VALID.
--- this is Valid 2 --
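To check the two patterns against the sample strings above, here is a minimal sketch. The class and method names are made up for illustration. Matcher.matches() already requires the whole string to match, so the ^ and $ anchors are kept only to mirror the patterns as written.

```java
import java.util.regex.Pattern;

class CapitalCheck {
    // Pattern 1: a capital letter first, then letters, digits, spaces or hyphens.
    static final Pattern STARTS_WITH_CAPITAL =
            Pattern.compile("^[A-Z][a-zA-Z0-9 -]*$");
    // Pattern 2: at least one capital letter anywhere, same allowed characters.
    static final Pattern CONTAINS_CAPITAL =
            Pattern.compile("^[a-zA-Z0-9 -]*[A-Z][a-zA-Z0-9 -]*$");

    static boolean startsWithCapital(String s) {
        return STARTS_WITH_CAPITAL.matcher(s).matches();
    }

    static boolean containsCapital(String s) {
        return CONTAINS_CAPITAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "This is valid", "This is valid 2", "This is n*t valid",
            "thiS is valid 3", "this Is -4 valid", "This + is not VALID.",
            "--- this is Valid 2 --"
        };
        for (String s : samples) {
            System.out.printf("%-26s first=%b anywhere=%b%n",
                    s, startsWithCapital(s), containsCapital(s));
        }
    }
}
```

The strings containing * , + or a trailing period are rejected by both patterns because those characters are not in the character class.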

Related

How to insert spaces after full stops at the end of sentences, but not in abbreviations or floating point numbers?

I have a JTextArea in which I want to replace all full stops without a space next to them e.g in "This is a sentence.This is another C.O.D sentence.This is yet another C.A.T. sentence." to "This is a sentence. This is another C.O.D sentence. This is yet another C.A.T. sentence.". But I don't want the abbreviations or floating point numbers to gain extra spaces e.g "This is a C.A.T. float 5.5" should not become "This is a C. A. T. float 5. 5"! I am using string.replaceAll(".",". ") for this which is not proving to be sufficient.
Keeping it simple, without negative look-behinds and such:
s = s.replaceAll("([^A-Z0-9.])\\.([^0-9 \t])", "$1. $2");
Replace the period when not:
after a capital itself (U.N.C. or M.Twain)
after a digit (1. - hoping the sentence does not end in a digit)
after a period (...)
before a digit (.5 - hoping the next sentence does not start with a digit)
before a space or tab
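A minimal sketch of that replacement applied to the strings from the question. The class and method names are invented for the example.

```java
class SentenceSpacing {
    // Inserts a space after a period unless the period follows a capital,
    // a digit or another period, or precedes a digit, a space or a tab.
    static String addSpaces(String s) {
        return s.replaceAll("([^A-Z0-9.])\\.([^0-9 \t])", "$1. $2");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces(
                "This is a sentence.This is another C.O.D sentence."
                + "This is yet another C.A.T. sentence."));
        System.out.println(addSpaces("This is a C.A.T. float 5.5"));
    }
}
```

Because the pattern needs a character on both sides of the period, a period at the very end of the string is left alone.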
You can use the regex
([^A-Z])\.(?!\d)
which matches every "." that is not followed by a digit and not preceded by an uppercase letter. Because the preceding character is captured in group 1, the replacement has to put it back, e.g. "$1. ".
(You should edit your question to clearly state your requirement, e.g. handling of abbreviation)
You could replace (?<!\b[A-Z])\.(?!\d) with .<space>
Demonstration: https://regex101.com/r/g1g7Yg/1
Explanation:
(?<! ) negative look-behind group
\b[A-Z] word boundary followed by one uppercase character
(i.e. a single uppercase character at the start of a word)
\. a dot
(?!\d) negative look-ahead group, containing a single digit
This basically means: replace a dot if it is NOT preceded by a single uppercase character and NOT followed by a digit.
There are still some flaws. For example, it will not insert a space in "Hello world.1 apple 1 day", because the dot is followed by a digit. It shouldn't be difficult to adjust the regex to fix this once you understand how it works.
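A small sketch of the look-around version in Java. The method asPosted uses the regex exactly as given above. The method skipExistingSpaces is my own assumption about one possible refinement (it also refuses to match before whitespace or at the end of the input) and is not part of the original answer. All names are hypothetical.

```java
class LookaroundSpacing {
    // The regex as posted: dot not preceded by a lone capital, not followed by a digit.
    static String asPosted(String s) {
        return s.replaceAll("(?<!\\b[A-Z])\\.(?!\\d)", ". ");
    }

    // Assumed refinement: additionally skip dots that are already followed
    // by whitespace or that end the input, to avoid doubled or trailing spaces.
    static String skipExistingSpaces(String s) {
        return s.replaceAll("(?<!\\b[A-Z])\\.(?![\\d\\s]|$)", ". ");
    }

    public static void main(String[] args) {
        System.out.println(asPosted("a sentence.This is C.O.D stuff 5.5 ok"));
        // The posted regex also appends a space after a final period:
        System.out.println("[" + asPosted("Done.") + "]");
        System.out.println(skipExistingSpaces("End.Next one. Done."));
    }
}
```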

Emailid validation using java regex

Can anyone please help me to find out a solution of this problem using "java regex".
Question: The EmailId should be in the following format <<1st part>>.<<2nd part>>#<<3rd part>><<4th part>>
1st part should contain alpha numeric characters and it must contain at least 1 uppercase alphabet, 1 lowercase alphabet, and 1 number.
2nd part should contain alpha numeric characters.
3rd part should be an alphabetical value of length 3 to 8.
4th part can be “.com” or “.co.in”
My solution is:
if(EmailId.matches("^(?=.*\\d)(?=.*[a-z])(?=.*[A-Z]).{3,}\\.[\\w&&[^_]]+#[\\w&&[^_]]{3,8}\\.(com|co\\.in)")){
return true;
}
But this solution accepts the email ID "RAKESH1.Roshan#infy.co.in", which should be rejected.
I don't know where I am going wrong.
Please help!!!!!!!!
Your regex is not working because of the . used in your pattern, which matches any character. If you do not want to allow every character, stick to character classes that contain only the characters you allow.
I suggest:
.matches("(?=\\p{Alnum}*\\p{Upper})(?=\\p{Alnum}*[0-9])(?=\\p{Alnum}*\\p{Lower})\\p{Alnum}*[.]\\p{Alnum}+#\\p{Alpha}{3,8}[.]co(m|[.]in)"))
See the Java demo. Since the pattern is used with .matches(), the ^ anchor at the start and the $ anchor at the end are not necessary.
Details:
(?=\\p{Alnum}*\\p{Upper}) - Right from the start of the string, there must be an uppercase letter after 0+ alphanumeric chars
(?=\\p{Alnum}*[0-9]) - Right from the start of the string, there must be a digit after 0+ alphanumeric chars
(?=\\p{Alnum}*\\p{Lower}) - Right from the start of the string, there must be a lowercase letter after 0+ alphanumeric chars
\\p{Alnum}* - 0 or more alphanumeric chars (replace * with + if you need to require at least 1)
[.] - a literal . char
\\p{Alnum}+ - 1 or more alphanumeric chars
# - a literal # char
\\p{Alpha}{3,8} - 3 to 8 alphabetic chars
[.]co(m|[.]in) - .com or .co.in at the end of the string.
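The pattern described above can be tried out with a sketch like the following. The class name, method name and sample IDs other than the one from the question are invented for illustration.

```java
class EmailIdCheck {
    // Lookaheads use \p{Alnum}* so they cannot look past the first '.'.
    static boolean isValid(String emailId) {
        return emailId.matches(
                "(?=\\p{Alnum}*\\p{Upper})(?=\\p{Alnum}*[0-9])(?=\\p{Alnum}*\\p{Lower})"
                + "\\p{Alnum}*[.]\\p{Alnum}+#\\p{Alpha}{3,8}[.]co(m|[.]in)");
    }

    public static void main(String[] args) {
        System.out.println(isValid("Rakesh1.Roshan#infy.co.in")); // true
        System.out.println(isValid("RAKESH1.Roshan#infy.co.in")); // false: no lowercase in part 1
        System.out.println(isValid("Rakesh1.Roshan#in.com"));     // false: part 3 too short
    }
}
```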
The problem is that your lookaheads aren't limited to the first part.
For example, with the input RAKESH1.Roshan#infy.co.in, the lookahead (?=.*[a-z]) will skip RAKESH1.R and find the following lowercase o.
You can fix this by changing .* to [^.]* in all lookaheads.
Another problem is the .{3,}. This will match any character, not just alphanumeric ones. Change this to [\\w&&[^_]]{3,} (or just [\\w&&[^_]]+).
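Applying both changes to the regex from the question gives something like the sketch below. The class and method names are hypothetical. Note that the third part is left as [\\w&&[^_]]{3,8}, as in the question, so it still accepts digits even though the requirement says "alphabetical".

```java
class EmailIdFixedLookaheads {
    // The asker's regex with .* replaced by [^.]* in the lookaheads
    // and .{3,} replaced by [\w&&[^_]]{3,}.
    static boolean isValid(String emailId) {
        return emailId.matches(
                "^(?=[^.]*\\d)(?=[^.]*[a-z])(?=[^.]*[A-Z])[\\w&&[^_]]{3,}"
                + "\\.[\\w&&[^_]]+#[\\w&&[^_]]{3,8}\\.(com|co\\.in)");
    }

    public static void main(String[] args) {
        System.out.println(isValid("RAKESH1.Roshan#infy.co.in")); // false
        System.out.println(isValid("Rakesh1.Roshan#infy.co.in")); // true
    }
}
```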

I need help on regular expression to allow number with character

condition:
123 not valid
123 A valid
abc123 valid
abc123Ab valid
I need a regular expression that makes at least one letter compulsory alongside the digits.
The following matches any string that starts with an optional run of digits followed by a combination of whitespace, letters and digits. But it still matches "123 " (that is, 123 followed by a space), and it also matches plain 123, because the character class contains digits too:
^\d*[\sa-zA-Z0-9]+$
The following will check if you have at least one letter in your string combined with optional digits, white spaces and letters.
[a-zA-Z\s\d]*[a-zA-Z]+?[a-zA-Z\s\d]*
[a-zA-Z\s\d] match a single character present in [].
Quantifier * : Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
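As a quick check of the "at least one letter" pattern against the four cases from the question, here is a minimal sketch (class and method names are invented):

```java
class LetterRequired {
    // At least one letter, surrounded by any mix of letters, whitespace and digits.
    static boolean hasLetter(String s) {
        return s.matches("[a-zA-Z\\s\\d]*[a-zA-Z]+?[a-zA-Z\\s\\d]*");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "123 A", "abc123", "abc123Ab"}) {
            System.out.println(s + " -> " + hasLetter(s));
        }
    }
}
```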
(([a-zA-Z\s])*(\d{1,})([a-zA-Z\s]){1,}|([a-zA-Z\s]){1,}(\d{1,})([a-zA-Z\s])*)
The first alternative lets the string start with zero or more letters (or whitespace), requires at least one digit, and requires at least one letter (or whitespace character) at the end. The second alternative requires at least one letter (or whitespace character) at the start, followed by at least one digit, followed by zero or more letters.
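A short sketch of this alternation in Java, with hypothetical names. Unlike the previous pattern, it requires a digit, so a letters-only string such as "abc" is rejected.

```java
class DigitsAndLetters {
    // Either: [letters] digits letters+   or: letters+ digits [letters]
    static boolean matches(String s) {
        return s.matches(
                "(([a-zA-Z\\s])*(\\d{1,})([a-zA-Z\\s]){1,}"
                + "|([a-zA-Z\\s]){1,}(\\d{1,})([a-zA-Z\\s])*)");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "123 A", "abc123", "abc123Ab", "abc"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```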

Java Regex - What does each of these parts do?

if(password.matches("(?=.*[0-9].*[0-9])(\\w{8,})") )
System.out.println("Valid Password");
else
System.out.println("Invalid Password");
I am checking a password to ensure it has at least 8 characters in length, which can be letters or digits and it must have at least 2 digits. This appears to work for me but I just wanted to confirm I was doing this right. Also, I have been trying to research and figure out exactly what each piece is doing. Below is what I believe each piece to be doing, but if I am incorrect, would you please explain what the particular portion is actually doing. Thanks
?= tells the program to remember if the digits [0-9] which I am looking for are found ?
.* says for any number of [0-9]?
[0-9] Specifies any number from 0-9.
.*[0-9] Then the regex looks for another number from 0-9 ?
(\\w{8,}) looks for any letters (uppercase or lowercase) and digits, with a minimum length of 8 characters?
That regex has two main parts:
1. (?=.*[0-9].*[0-9])
2. (\\w{8,})
Part 1 is a positive lookahead, which has the form (?=pattern). "Lookarounds" (positive/negative lookbehinds/lookaheads) assert, without consuming (or capturing), that the adjacent input matches a certain pattern. In this case, it asserts that the input following the current point contains (at least) 2 digits (.* meaning 0-n chars, [0-9] meaning any digit). Incidentally, it could be expressed more succinctly as (?=(.*[0-9]){2})
Part 2 means "at least 8 word characters"; a word character is any letter, any digit or an underscore. The brackets around it (unnecessarily) capture the 8+ word characters as group 1.
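To see both parts at work, here is a small sketch comparing the original pattern with the more succinct lookahead. The class and method names and the sample passwords are made up for the example.

```java
class PasswordCheck {
    // Original pattern: lookahead for two digits, then 8+ word characters.
    static boolean isValid(String password) {
        return password.matches("(?=.*[0-9].*[0-9])(\\w{8,})");
    }

    // Same check with the compact lookahead and without the capturing group.
    static boolean isValidCompact(String password) {
        return password.matches("(?=(.*[0-9]){2})\\w{8,}");
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcdef12"));  // true
        System.out.println(isValid("abcdefg1"));  // false: only one digit
        System.out.println(isValid("abc_def12")); // true: underscore is a word character
        System.out.println(isValidCompact("abcdef12")); // true
    }
}
```

Since \w includes the underscore, passwords containing _ are accepted, which may or may not be what "letters or digits" was meant to allow.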
?= is a positive lookahead. That means it checks for something ahead of the current position without consuming it.
http://www.regular-expressions.info/lookaround.html
For more info on the look ahead.
http://rubular.com/
Great for testing out any regex.

Java Regular Expression: what is " '- "

I came up to a line in java that uses regular expressions.
It needs a user input of Last Name
return lastName.matches( "[a-zA-z]+([ '-][a-zA-Z]+)*" );
I would like to know what is the function of the [ '-].
Also why do we need both a "+" and a "*" at the same time, and the [ '-][a-zA-Z] is in brackets?
Your RE is: [a-zA-z]+([ '-][a-zA-Z]+)*
I'll break it down into its component parts:
[a-zA-Z]+
The string must begin with any letter, a-z or A-Z, repeated one or more times (+).
([ '-][a-zA-Z]+)*
[ '-]
Any single character of <space>, ', or -.
[a-zA-Z]+
Again, any letter, a-z or A-Z, repeated once or more times.
This combination (a separator from [ '-] followed by letters a-zA-Z) may then be repeated zero or more times.
Why [ '-]? To allow for hyphenated names, such as Higgs-Boson, names with apostrophes, such as O'Reilly, and names with spaces, such as Van Dyke.
The expression [ '-] means "one of <space>, ', or -". The position of the dash matters: here it comes last (it could also come first or be escaped); otherwise the character class would treat it as a range, and other characters with code points between the space and the quote ' would be accepted as well.
+ means "one or more repetitions"; * means "zero or more repetitions", referring to the term of the regular expression preceding the + or * modifier.
Overall, the expression matches groups of lowercase and uppercase letters separated by spaces, dashes, or single quotes.
It means it can be any of the characters space, ' or - (space, quote, dash).
The - can also be written as \- because an unescaped dash can mean a range, like a-z.
This looks like it is a pattern to match double-barreled (space or hyphen) or I-don't-know-what-to-call-it names like O'Grady... for example:
It would match
counter-terrorism
De'ville
O'Grady
smith-jones
smith and wesson
But it will not match
jones-
O'Learys'
#hashtag
Bob & Sons
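The two lists above can be verified with a minimal sketch like this one. The class and method names are invented, and the pattern uses [a-zA-Z] in both places rather than the [a-zA-z] from the question.

```java
class LastNameCheck {
    // Letters, optionally followed by groups of (separator + letters).
    static boolean isValid(String lastName) {
        return lastName.matches("[a-zA-Z]+([ '-][a-zA-Z]+)*");
    }

    public static void main(String[] args) {
        String[] samples = {
            "counter-terrorism", "De'ville", "O'Grady", "smith-jones",
            "smith and wesson", "jones-", "O'Learys'", "#hashtag", "Bob & Sons"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```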
The idea is, after the first [A-Za-z]+ consumes all the letters it can, the match will end right there unless the next character is a space, an apostrophe, or a hyphen ([ '-]). If one of those characters is present, it must be followed by at least one more letter.
A lot of people have difficulty with this. They naively write something like [A-Za-z]+[ '-]?[A-Za-z]*, figuring both the separator and the extra chunks of letters are optional. But they're not independently optional; if there is a separator ([ '-]), it must be followed by at least one more letter. Otherwise it would treat strings like R'- j'-' as valid. Your regex doesn't have that problem.
By the way, you've got a typo in your regex: [a-zA-z]. You want to watch out for that, because [A-z] does match all the uppercase and lowercase letters, so it will seem to be working correctly as long as the inputs are valid. But it also matches several non-letter characters whose code points happen to lie between those of Z and a. And very few IDEs or regex tools will catch that error.
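A quick way to see the effect of the typo is to compare the two character classes on a string containing one of the in-between characters. This is only an illustrative sketch with made-up names; the characters between Z and a in ASCII are [ \ ] ^ _ and the backtick.

```java
class RangeTypoDemo {
    // Class with the typo: the A-z range also covers [ \ ] ^ _ and `.
    static boolean withTypo(String s) {
        return s.matches("[a-zA-z]+");
    }

    // Intended class: ASCII letters only.
    static boolean corrected(String s) {
        return s.matches("[a-zA-Z]+");
    }

    public static void main(String[] args) {
        System.out.println(withTypo("Smith_"));  // true
        System.out.println(corrected("Smith_")); // false
        System.out.println(withTypo("Smith"));   // true
        System.out.println(corrected("Smith"));  // true
    }
}
```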
