Using regex in Java to validate an email - java

I am trying to validate a certain subset of the e-mail format with regular expressions, but what I've tried so far doesn't quite work. This is my regex (Java):
boolean x = l.matches(
    "^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*#"
    + "[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$"
);
These are the conditions that the string has to match:
Mail domain is from the list:
www.fightclub.uk
www.fightclub.lk
www.fightclub.sa
www.fightclub.cc
www.fightclub.jp
www.fightclub.se
www.fightclub.xy
www.fightclub.gi
www.fightclub.rl
www.fightclub.ss
The username has 3 to 6 characters (only lowercase English letters and digits)
Examples:
sonia6#fightclub.com is valid
am#fightclub2.lk is invalid

You can use:
^[a-z0-9]{3,6}#fightclub\.(?:uk|lk|sa|cc|jp|se|xy|gi|rl|ss)$
^ indicates the start of the string
[a-z0-9]{3,6} matches 3 to 6 lowercase letters or digits
followed by #fightclub
followed by a literal period \.
followed by the list of allowed domain endings; (?: ... ) is a non-capturing group that lists all your domain extensions
$ indicates the end of the string
DEMO: https://regex101.com/r/rYYXYA/1
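As a rough sketch of how that pattern could be wired into Java (the class and method names here are made up for illustration, and the # separator is kept exactly as in the question's examples):

```java
import java.util.regex.Pattern;

class FightclubAddressCheck {
    // Compiled once and reused. The backslash before the dot is doubled
    // because this is a Java string literal.
    private static final Pattern ADDRESS = Pattern.compile(
            "^[a-z0-9]{3,6}#fightclub\\.(?:uk|lk|sa|cc|jp|se|xy|gi|rl|ss)$");

    // Hypothetical helper: true only if the whole string fits the pattern.
    static boolean isValid(String s) {
        return ADDRESS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("sonia6#fightclub.uk"));  // true
        System.out.println(isValid("am#fightclub2.lk"));     // false: name too short, wrong domain
        System.out.println(isValid("sonia6#fightclub.com")); // false: .com is not in the list
    }
}
```

Note that the question's first example uses .com, which is not one of the listed endings, so this pattern rejects it. If .com should be accepted, it would have to be added to the alternation.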

Related

How to remove everything after specific character in string using Java

I have a string that looks like this:
analitics#gmail.com#5
And it represents my userId.
I have to pass that userId as a parameter to a function, but first I need to remove the number 5 after the second # and append a new number.
I started with something like this:
userService.getUser(user.userId.substring(0, userAfterMigration.userId.indexOf("#") + 1) + 3)
What is the best way of removing everything that comes after the second # character in the string above using Java?
Here is a splitting option:
String input = "analitics#gmail.com#5";
String output = String.join("#", input.split("#")[0], input.split("#")[1]) + "#";
System.out.println(output); // analitics#gmail.com#
Assuming your input only ever has two # characters, you could use a regex replacement here:
String input = "analitics#gmail.com#5";
String output = input.replaceAll("#[^#]*$", "#");
System.out.println(output); // analitics#gmail.com#
You can capture in group 1 what you want to keep, and match what comes after it to be removed.
In the replacement use capture group 1 denoted by $1
^((?:[^#\s]+#){2}).+
^ Start of string
( Capture group 1
(?:[^#\s]+#){2} Repeat 2 times matching 1+ chars other than #, and then match the #
) Close group 1
.+ Match 1 or more characters that you want to remove
String s = "analitics#gmail.com#5";
System.out.println(s.replaceAll("^((?:[^#\\s]+#){2}).+", "$1"));
Output
analitics#gmail.com#
If the string can also start with ##1 and you want to keep ## then you might also use:
^((?:[^#]*#){2}).+
The simplest way that would seem to work for you:
str = str.replaceAll("#[^.]*$", "");
This matches the last # together with any non-dot characters up to the end of the string and replaces them with an empty string, so the second # is removed as well.
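For comparison, the same trimming can be done without a regex by calling indexOf twice. This is only a sketch: the class and method names are hypothetical, and it assumes that a string with fewer than two # characters should be returned unchanged.

```java
class KeepThroughSecondHash {
    // Returns the input up to and including the second '#'.
    // Assumption: with fewer than two '#' characters the input is returned as-is.
    static String keepThroughSecondHash(String s) {
        int first = s.indexOf('#');
        int second = first < 0 ? -1 : s.indexOf('#', first + 1);
        return second < 0 ? s : s.substring(0, second + 1);
    }

    public static void main(String[] args) {
        String userId = "analitics#gmail.com#5";
        // Keep "analitics#gmail.com#" and append the new number.
        System.out.println(keepThroughSecondHash(userId) + 3); // analitics#gmail.com#3
    }
}
```

The second indexOf call starts searching just after the first #, which is why a string such as ##1 still keeps both leading # characters.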

Restrict particular domain in email regular expression

I have an existing regex which validates the email input field.
[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!$%&'*+/=?^_`{|}~-]+)*(\\.)?#(?:[a-zA-Z0-9ÄÖÜäöü](?:[a-zA-Z0-9-_ÄÖÜäöü]*[a-zA-Z0-9_ÄÖÜäöü])?\\.)+[a-zA-Z]{2,}
Now I want this regex to reject two particular domains: wt.com and des.net.
To do that, I changed the expression as follows:
[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!$%&'*+/=?^_`{|}~-]+)*(\\.)?#(?!wt\\.com)(?!des\\.net)(?:[a-zA-Z0-9ÄÖÜäöü](?:[a-zA-Z0-9-_ÄÖÜäöü]*[a-zA-Z0-9_ÄÖÜäöü])?\\.)+[a-zA-Z]{2,}
With this change it no longer matches any email ID that ends with wt.com or des.net, which is right.
The problem is that it also fails to match wt.comm, or any other address with extra letters after the restricted string.
I only want to reject emails whose domain is exactly wt.com or des.net.
How do I do that?
Below are sample emails and whether they should match.
ajcom#wt.com: no match
ajcom#aa.an: match
ajcom#wt.coms: match
ajcom#des.net: no match
ajcom#des.neta: match
If you want to reject wt.com and des.net only when no characters follow them, add the $ anchor (which represents the end of the string) to the end of each negative lookahead.
So instead of (?!wt\\.com)(?!des\\.net) use (?!wt\\.com$)(?!des\\.net$)
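To see the effect of the anchored lookaheads, here is a small sketch run against the sample addresses from the question. The pattern is a deliberately simplified stand-in for the full expression (lowercase ASCII only, # as the separator), and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class RestrictedDomainCheck {
    // Simplified stand-in for the question's pattern. The two negative
    // lookaheads reject a domain only when it is exactly wt.com or des.net.
    private static final Pattern ALLOWED = Pattern.compile(
            "[a-z0-9._-]+#(?!wt\\.com$)(?!des\\.net$)(?:[a-z0-9-]+\\.)+[a-z]{2,}");

    static boolean isAllowed(String email) {
        return ALLOWED.matcher(email).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "ajcom#wt.com", "ajcom#aa.an", "ajcom#wt.coms", "ajcom#des.net", "ajcom#des.neta"
        };
        for (String s : samples) {
            System.out.println(s + ": " + (isAllowed(s) ? "match" : "no match"));
        }
    }
}
```

Without the $ inside each lookahead, wt.coms and des.neta would be rejected too, because the lookahead would succeed as soon as it saw the wt.com or des.net prefix.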

Java replace all occurrences of regex with another regex

Let's say I have a string containing XML with many occurrences of <tagA>:
String example = " (...) some xml here (...)
<tagA>283940</tagA>
(...) some xml here (...)
<tagA>& 9940</tagA>
<tagA>- 99440</tagA>
<tagA>< 99440</tagA>
<tagA>99440</tagA>
(...) more xml here (...) "
The content should contain only digits, but sometimes it has a random character followed by a whitespace and then the digits.
I want to remove the unwanted character and the whitespace. How to do that?
So far I know I should be looking for a regex "<tagA>. [0-9]*<\/tagA>" but I am stuck here.
I want to replace the characters because among those characters there are "&", ">", "<" signs which make the xml invalid (which prevents me from treating this as an XML).
The regex that you're looking for is:
<(\w+)>(\D{0,})(\d+)
In each match, Group 1 holds the tag name, Group 2 holds the unwanted characters (everything that is not a digit), and Group 3 holds the number.
There's an "enhanced version" of this regex that might work in more situations: (\s{0,})(<\w+>)(\D{0,})(\d+)(\D{0,})(<\/\w+>)(\s{0,})
Group 1 captures any whitespace before the tag, and Group 7 captures any trailing whitespace.
Groups 2 and 6 match the opening tag and the closing tag.
Groups 3 and 5 match any unwanted characters around your value.
Group 4 contains your value.
With String::replaceAll, you can sanitize the input by emitting only groups 2, 4 and 6 and dropping the rest.
//input data
String s = "<tagA>283940</tagA>\n" +
" <tagA>& 9940<</tagA>\n" +
" <tagA>- 99440</tagA>\n" +
" <tagA>< 99440</tagA>\n" +
" <tagA>99440</tagA>"
+ "<13243> asdfasdf </>";
String replaced = s.replaceAll("(\\s{0,})(<\\w+>)(\\D{0,})(\\d+)(\\D{0,})(<\\/\\w+>)(\\s{0,})", "$2$4$6");
System.out.println(replaced);
Output: <tagA>283940</tagA><tagA>9940</tagA><tagA>99440</tagA><tagA>99440</tagA><tagA>99440</tagA><13243> asdfasdf </>
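If only <tagA> needs cleaning and other tags should stay untouched, a narrower variant may be enough. This is a sketch, not a general XML fix: it assumes every <tagA> element contains at least one digit, and the class and method names are made up.

```java
class TagACleaner {
    // Keeps only the digits inside each <tagA>...</tagA> pair.
    // Assumption: each tagA element contains at least one digit; an element
    // without digits could let \D* run on into the following element.
    static String clean(String xml) {
        return xml.replaceAll("<tagA>\\D*(\\d+)\\D*</tagA>", "<tagA>$1</tagA>");
    }

    public static void main(String[] args) {
        String s = "<tagA>& 9940</tagA>\n<tagA>< 99440</tagA>\n<other>- 7</other>";
        // The two tagA values are reduced to digits; <other> is left alone.
        System.out.println(clean(s));
    }
}
```

Unlike the broader pattern above, this one leaves the whitespace between elements as it was, since nothing outside the tag pair is matched.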

Weird password check matching using regex in Java

I'm trying to check a password against the following constraints:
at least 9 characters
at least 1 upper case
at least 1 lower case
at least 1 special character from the following list:
~ ! # # $ % ^ & * ( ) _ - + = { } [ ] | : ; " ' < > , . ?
no accented letters
Here's the code I wrote:
Pattern pattern = Pattern.compile(
"(?!.*[âêôûÄéÆÇàèÊùÌÍÎÏÐîÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ€£])"
+ "(?=.*\\d)"
+ "(?=.*[a-z])"
+ "(?=.*[A-Z])"
+ "(?=.*[`~!##$%^&*()_\\-+={}\\[\\]\\\\|:;\"'<>,.?/])"
+ ".{9,}");
Matcher matcher = pattern.matcher(myNewPassword);
if (matcher.matches()) {
//do what you've got to do when you
}
The issue is that some characters like € or £ don't cause the password to be rejected.
I don't understand why it works that way, since I explicitly exclude € and £ from the allowed characters.
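Rather than trying to disallow those non-ASCII characters, why not make your regex accept only ASCII characters, like this:
Pattern pattern = Pattern.compile(
"(?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\\p{Print})\\p{ASCII}{9,}");
Also note the use of \p{Print} instead of the big character class. I believe that should suffice for you. Be aware that \p{Print} matches any printable ASCII character, letters and digits included, so if a special character must be present, \p{Punct} is the closer fit.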
Check Javadoc for more details
This allows only printable ASCII. Note that it also allows the space character; you can disallow spaces by starting the range at \x21 instead of \x20.
Edit - I didn't see a number in the requirement, saw it in your regex, but wasn't sure.
# "^(?=.*[A-Z])(?=.*[a-z])(?=.*[`~!##$%^&*()_\\-+={}\\[\\]|:;\"'<>,.?])[\\x20-\\x7E]{9,}$"
^
(?= .* [A-Z] )
(?= .* [a-z] )
(?= .* [`~!##$%^&*()_\-+={}\[\]|:;"'<>,.?] )
[\x20-\x7E]{9,}
$
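A quick way to try that expression from Java might look like the following sketch. The class and method names are hypothetical, the special-character class is copied from the pattern above, and like that pattern it does not require a digit.

```java
import java.util.regex.Pattern;

class PasswordCheck {
    // Upper case, lower case and one listed special character are required;
    // the whole password must be 9+ printable ASCII characters (\x20-\x7E).
    private static final Pattern PASSWORD = Pattern.compile(
            "^(?=.*[A-Z])(?=.*[a-z])"
            + "(?=.*[`~!##$%^&*()_\\-+={}\\[\\]|:;\"'<>,.?])"
            + "[\\x20-\\x7E]{9,}$");

    static boolean isAcceptable(String password) {
        return PASSWORD.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("Abcdefgh!"));       // true
        System.out.println(isAcceptable("Abcdefgh!\u20AC")); // false: euro sign is not ASCII
        System.out.println(isAcceptable("abcdefgh!"));       // false: no upper case letter
    }
}
```

Because the final character class only admits the range \x20 to \x7E, accented letters, € and £ are rejected without having to be listed one by one.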

Java String Replace Regex

I am doing some string replace in SQL on the fly.
MySQLString = " a.account=b.account ";
MySQLString = " a.accountnum=b.accountnum ";
Now if I do this
MySQLString.replaceAll("account", "account_enc");
the result will be
a.account_enc=b.account_enc
(This is good)
But look at 2nd result
a.account_encnum=b.account_encnum
(This is not good; it should be a.accountnum_enc=b.accountnum_enc)
Please advise how can I achieve what I want with Java String Replace.
Many Thanks.
From your comment:
Is there anyway to tell in Regex only replace a.account=b.account or a.accountnum=b.accountnum. I do not want accountname to be replace with _enc
If I understand correctly, you want to append _enc only to account or accountnum. To do this you can use
MySQLString = MySQLString.replaceAll("\\baccount(num)?\\b", "$0_enc");
(num)? means that num is optional, so the regex accepts account or accountnum
\\b at the start means that no letter, digit or "_" may come directly before account, so it won't affect something like myaccount or my_account.
\\b at the end prevents other letters, digits or "_" from directly following account or accountnum.
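The word-boundary version can be checked against the three cases mentioned so far with a short sketch like this one (the class and method names are only for illustration):

```java
class AccountRename {
    // Appends _enc to the whole words "account" and "accountnum" only.
    // $0 in the replacement refers to the entire match.
    static String addEncSuffix(String sql) {
        return sql.replaceAll("\\baccount(num)?\\b", "$0_enc");
    }

    public static void main(String[] args) {
        System.out.println(addEncSuffix(" a.account=b.account "));         //  a.account_enc=b.account_enc
        System.out.println(addEncSuffix(" a.accountnum=b.accountnum "));   //  a.accountnum_enc=b.accountnum_enc
        System.out.println(addEncSuffix(" a.accountname=b.accountname ")); // unchanged
    }
}
```

Strings are immutable in Java, so the result of replaceAll has to be assigned or returned; calling it without using the return value leaves the original string as it was.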
It's hard to extrapolate from so few examples, but maybe what you want is:
MySQLString = MySQLString.replaceAll("account\\w*", "$0_enc");
which will append _enc to any sequence of letters, digits, and underscores that starts with account.
Try
String s = " a.accountnum=b.accountnum ".replaceAll("(account[^ =]*)", "$1_enc");
This replaces the word "account" plus any following characters that are not ' ' or '=' with the matched sequence followed by "_enc".
$1 is a reference to group 1 in the regex; group 1 is the expression in parentheses, (account[^ =]*), i.e. our sequence
See http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for details
