Masking of email address in Java

I am trying to mask an email address with "*", but I am bad at regex.
input : nileshxyzae#gmail.com
output : nil********#gmail.com
My code is
String maskedEmail = email.replaceAll("(?<=.{3}).(?=[^#]*?.#)", "*");
but it gives me the output nil*******e#gmail.com. I don't understand what is going wrong here. Why is the last character not converted?
Also, can someone explain what all the parts of these regexes mean?

Your look-ahead (?=[^#]*?.#) requires at least 1 character to be there in front of # (see the dot before #).
If you remove it, you will get all the expected symbols replaced:
(?<=.{3}).(?=[^#]*?#)
Here is the regex demo (replace with *).
However, the regex is not a proper regex for the task. You need a regex that will match each character after the first 3 characters up to the first #:
(^[^#]{3}|(?!^)\G)[^#]
See another regex demo, replace with $1*. Here, [^#] matches any character that is not #, so we do not match addresses like abc#example.com. Only those emails will be masked that have 4+ characters in the username part.
See IDEONE demo:
String s = "nileshkemse#gmail.com";
System.out.println(s.replaceAll("(^[^#]{3}|(?!^)\\G)[^#]", "$1*"));
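To see how this pattern treats short user names, here is a small self-contained sketch. The class name EmailMaskDemo and the method mask are made up for illustration; the pattern is the one from this answer, and # is the separator because that is what the examples in this thread use.

```java
public class EmailMaskDemo {
    // Hypothetical helper: masks everything after the first 3 characters of the
    // part before '#', using the \G-based pattern from this answer.
    static String mask(String email) {
        return email.replaceAll("(^[^#]{3}|(?!^)\\G)[^#]", "$1*");
    }

    public static void main(String[] args) {
        System.out.println(mask("nileshkemse#gmail.com")); // nil********#gmail.com
        System.out.println(mask("abc#example.com"));       // unchanged: only 3 characters before '#'
        System.out.println(mask("ab#example.com"));        // unchanged: the domain is never touched
    }
}
```

The first alternative can only match at the very start of the input, and \G only continues directly after a previous match, so nothing to the right of # can ever be replaced.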

If you're bad at regular expressions, don't use them :) I don't know if you've ever heard the quote:
Some people, when confronted with a problem, think
"I know, I'll use regular expressions." Now they have two problems.
(source)
You might get a working regular expression here, but will you understand it today? tomorrow? in six months' time? And will your colleagues?
An easy alternative is using a StringBuilder, and I'd argue that it's a lot more straightforward to understand what is going on here:
StringBuilder sb = new StringBuilder(email);
for (int i = 3; i < sb.length() && sb.charAt(i) != '#'; ++i) {
sb.setCharAt(i, '*');
}
email = sb.toString();
"Starting at the fourth character (index 3), replace the characters with a * until you reach the end of the string or #."
(You don't even need to use StringBuilder: you could simply manipulate the elements of email.toCharArray(), then construct a new string at the end).
Of course, this doesn't work correctly for email addresses where the local part is shorter than 3 characters - it would actually then mask the domain.
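One way around that limitation is to look up the position of # first and mask only when more than 3 characters precede it. This is a sketch rather than the code above; SafeEmailMask and mask are illustrative names, and leaving short or malformed addresses unchanged is an assumption about the desired behaviour.

```java
public class SafeEmailMask {
    // Hypothetical variant of the StringBuilder loop that never touches the domain.
    static String mask(String email) {
        int at = email.indexOf('#');
        // at == -1 (no separator) and at <= 3 (nothing to mask) both leave the input as it is
        if (at <= 3) {
            return email;
        }
        StringBuilder sb = new StringBuilder(email);
        for (int i = 3; i < at; i++) {
            sb.setCharAt(i, '*');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("nileshkemse#gmail.com")); // nil********#gmail.com
        System.out.println(mask("ab#gmail.com"));          // ab#gmail.com
    }
}
```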

Your Look-ahead is kind of complicated. Try this code :
public static void main(String... args) throws Exception {
String s = "nileshkemse#gmail.com";
s= s.replaceAll("(?<=.{3}).(?=.*#)", "*");
System.out.println(s);
}
Output:
nil********#gmail.com

I like this one because I only want to hide up to 4 characters; it also dynamically reduces the number of hidden characters to 2 if the email address is too short:
public static String maskEmailAddress(final String email) {
final String mask = "*****";
final int at = email.indexOf("#");
if (at > 2) {
final int maskLen = Math.min(Math.max(at / 2, 2), 4);
final int start = (at - maskLen) / 2;
return email.substring(0, start) + mask.substring(0, maskLen) + email.substring(start + maskLen);
}
return email;
}
Sample outputs:
my.email#gmail.com > my****il#gmail.com
info#mail.com > i**o#mail.com

//In Kotlin
val email = "nileshkemse#gmail.com"
val maskedEmail = email.replace(Regex("(?<=.{3}).(?=.*#)"), "*")

public static string GetMaskedEmail(string emailAddress)
{
string _emailToMask = emailAddress;
try
{
if (!string.IsNullOrEmpty(emailAddress))
{
var _splitEmail = emailAddress.Split(Char.Parse("#"));
var _user = _splitEmail[0];
var _domain = _splitEmail[1];
if (_user.Length > 3)
{
var _maskedUser = _user.Substring(0, 3) + new String(Char.Parse("*"), _user.Length - 3);
_emailToMask = _maskedUser + "#" + _domain;
}
else
{
_emailToMask = new String(Char.Parse("*"), _user.Length) + "#" + _domain;
}
}
}
catch (Exception) { }
return _emailToMask;
}

Related

Remove elements from Date Format String using a Regular Expression

I want to remove elements from a supplied Date Format String, for example convert the format "dd/MM/yyyy" to "MM/yyyy" by removing any non-M/y element.
What I'm trying to do is create a localised month/year format based on the existing day/month/year format provided for the Locale.
I've done this using regular expressions, but the solution seems longer than I'd expect.
An example is below:
public static void main(final String[] args) {
System.out.println(filterDateFormat("dd/MM/yyyy HH:mm:ss", 'M', 'y'));
System.out.println(filterDateFormat("MM/yyyy/dd", 'M', 'y'));
System.out.println(filterDateFormat("yyyy-MMM-dd", 'M', 'y'));
}
/**
* Retains only the {@code charsToRetain} elements of {@code format}, dropping
* everything else, including any redundant separators.
*/
private static String filterDateFormat(final String format, final char...charsToRetain) {
// Match e.g. "ddd-"
final Pattern pattern = Pattern.compile("[" + new String(charsToRetain) + "]+\\p{Punct}?");
final Matcher matcher = pattern.matcher(format);
final StringBuilder builder = new StringBuilder();
while (matcher.find()) {
// Append each match
builder.append(matcher.group());
}
// If the last match is "mmm-", remove the trailing punctuation symbol
return builder.toString().replaceFirst("\\p{Punct}$", "");
}
Let's try a solution for the following date format strings:
String[] formatStrings = { "dd/MM/yyyy HH:mm:ss",
"MM/yyyy/dd",
"yyyy-MMM-dd",
"MM/yy - yy/dd",
"yyabbadabbadooMM" };
The following will analyze strings for a match, then print the first group of the match.
Pattern p = Pattern.compile(REGEX);
for(String formatStr : formatStrings) {
Matcher m = p.matcher(formatStr);
if(m.matches()) {
System.out.println(m.group(1));
}
else {
System.out.println("Didn't match!");
}
}
Now, there are two separate regular expressions I've tried. First:
final String REGEX = "(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)";
With program output:
MM/yyyy
MM/yyyy
yyyy-MMM
Didn't match!
Didn't match!
Second:
final String REGEX = "(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*)";
With program output:
MM/yyyy
MM/yyyy
yyyy-MMM
MM/yy - yy
Didn't match!
Now, let's see what the first regex actually matches to:
(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*) First regex =
(?:[^My]*) Any amount of non-Ms and non-ys (non-capturing)
([My]+ followed by one or more Ms and ys
[^\\w]* optionally separated by non-word characters
(implying they are also not Ms or ys)
[My]+) followed by one or more Ms and ys
(?:[^My]*) finished by any number of non-Ms and non-ys
(non-capturing)
What this means is that at least 2 M/ys are required to match the regex, although you should be careful that something like MM-dd or yy-DD will match as well, because they have two M-or-y regions 1 character long. You can avoid getting into trouble here by just keeping a sanity check on your date format string, such as:
if(formatStr.contains("y") && formatStr.contains("M") && m.matches())
{
String yMString = m.group(1);
... // other logic
}
As for the second regex, here's what it means:
(?:[^My]*)((?:[My]+[^\\w]*)+[My]+)(?:[^My]*) Second regex =
(?:[^My]*) Any amount of non-Ms and non-ys
(non-capturing)
( ) followed by
(?:[My]+ )+[My]+ at least two text segments consisting of
one or more Ms or ys, where each segment is
[^\\w]* optionally separated by non-word characters
(?:[^My]*) finished by any number of non-Ms and non-ys
(non-capturing)
This regex will match a slightly broader series of strings, but it still requires that any separators between Ms and ys be non-word characters ([^a-zA-Z_0-9]). Additionally, keep in mind that this regex will still match "yy", "MM", or similar strings like "yyy", "yyyy"..., so it would be useful to have a sanity check as described for the previous regular expression.
Additionally, here's a quick example of how one might use the above to manipulate a single date format string:
LocalDateTime date = LocalDateTime.now();
String dateFormatString = "dd/MM/yyyy H:m:s";
System.out.println("Old Format: \"" + dateFormatString + "\" = " +
date.format(DateTimeFormatter.ofPattern(dateFormatString)));
Pattern p = Pattern.compile("(?:[^My]*)([My]+[^\\w]*[My]+)(?:[^My]*)");
Matcher m = p.matcher(dateFormatString);
if(dateFormatString.contains("y") && dateFormatString.contains("M") && m.matches())
{
dateFormatString = m.group(1);
System.out.println("New Format: \"" + dateFormatString + "\" = " +
date.format(DateTimeFormatter.ofPattern(dateFormatString)));
}
else
{
throw new IllegalArgumentException("Couldn't shorten date format string!");
}
Output:
Old Format: "dd/MM/yyyy H:m:s" = 14/08/2019 16:55:45
New Format: "MM/yyyy" = 08/2019
I'll answer based on my understanding of the question: how do I remove, from a list/table/array of String, the elements that do not exactly follow the pattern 'dd/MM'?
So I'm looking for a function that looks like
public List<String> removeUnWantedDateFormat(List<String> input)
As far as I know DateFormat, there are only 4 possibilities that you would want (hoping I don't miss any): "MM/yyyy", "MMM/yyyy", "MM/yy", "MMM/yy". Since we know what we are looking for, we can write a simple function.
public List<String> removeUnWantedDateFormat(List<String> input) {
String s1 = "MM/yyyy";
String s2 = "MMM/yyyy";
String s3 = "MM/yy";
String s4 = "MMM/yy";
// removeIf avoids the ConcurrentModificationException that calling
// input.remove(...) inside a for-each loop over the same list would throw
input.removeIf(format -> !s1.equals(format) && !s2.equals(format)
&& !s3.equals(format) && !s4.equals(format));
return input;
}
It is better to avoid regex when a plain string comparison will do, since regex matching tends to be more expensive. A great improvement would be to use an enum of the date formats you accept; that way you have better control over them and can even replace them.
Hope this will help, cheers
Edit: after seeing the comment, I think it would be better to use contains instead of equals and, instead of removing, to replace each input with the expected string.
It would then look more like:
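public List<String> removeUnWantedDateFormat(List<String> input) {
// Longer patterns first, so that "MMM/yyyy" is not mistaken for "MM/yy"
List<String> comparaisons = new ArrayList<>();
comparaisons.add("MMM/yyyy");
comparaisons.add("MMM/yy");
comparaisons.add("MM/yyyy");
comparaisons.add("MM/yy");
// Assigning to the loop variable would not change the list, so collect the results
List<String> result = new ArrayList<>();
for (String format : input) {
for (String comparaison : comparaisons) {
if (format.contains(comparaison)) {
result.add(comparaison);
break;
}
}
}
return result;
}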
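The enum idea mentioned in this answer could be sketched as follows. All names here (MonthYearFormats, MonthYearFormat, find) are hypothetical, and the four patterns are simply the ones listed in this answer; treat it as one possible shape, not a definitive design.

```java
import java.util.Optional;

public class MonthYearFormats {
    // Hypothetical enum of accepted month/year patterns. Longer patterns are
    // declared first so that "MMM/yyyy" is tried before "MM/yy".
    enum MonthYearFormat {
        LONG_MONTH_LONG_YEAR("MMM/yyyy"),
        LONG_MONTH_SHORT_YEAR("MMM/yy"),
        SHORT_MONTH_LONG_YEAR("MM/yyyy"),
        SHORT_MONTH_SHORT_YEAR("MM/yy");

        final String pattern;

        MonthYearFormat(String pattern) {
            this.pattern = pattern;
        }
    }

    // Returns the first accepted pattern contained in the given format string, if any.
    static Optional<MonthYearFormat> find(String format) {
        for (MonthYearFormat candidate : MonthYearFormat.values()) {
            if (format.contains(candidate.pattern)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With this, find("dd/MM/yyyy HH:mm:ss") yields SHORT_MONTH_LONG_YEAR, whose pattern field is "MM/yyyy", while a format without any of the four patterns yields an empty Optional.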

N-th indexOf in String?

I need to extract a sub-string of a URL.
URLs
/service1/api/v1.0/foo -> foo
/service1/api/v1.0/foo/{fooId} -> foo/{fooId}
/service1/api/v1.0/foo/{fooId}/boo -> foo/{fooId}/boo
And some of those URLs may have request parameters.
Code
String str = request.getRequestURI();
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1, str.indexOf("?"));
Is there a better way to extract the sub-string than calling the indexOf method repeatedly?
There are many alternative ways:
Use the Java Stream API on the String split by the / delimiter:
String str = "/service1/api/v1.0/foo/{fooId}/boo";
String[] split = str.split("\\/");
String url = Arrays.stream(split).skip(4).collect(Collectors.joining("/"));
System.out.println(url);
With the elimination of the parameter, the Stream would be like:
String url = Arrays.stream(split)
.skip(4)
.map(i -> i.replaceAll("\\?.+", ""))
.collect(Collectors.joining("/"));
This is also where Regex takes its place! Use the classes Pattern and Matcher.
String str = "/service1/api/v1.0/foo/{fooId}/boo";
Pattern pattern = Pattern.compile("\\/.*?\\/api\\/v\\d+\\.\\d+\\/(.+)");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
If you rely on the indexOf(..) usage, you might want to use the while-loop.
String str = "/service1/api/v1.0/foo/{fooId}/boo?parameter=value";
String string = str;
while(!string.startsWith("v1.0")) {
string = string.substring(string.indexOf("/") + 1);
}
System.out.println(string.substring(string.indexOf("/") + 1, string.indexOf("?")));
Other answers point out that if the prefix is fixed, a single call of the indexOf(..) method is enough (@JB Nizet):
string.substring("/service1/api/v1.0/".length(), string.indexOf("?"));
All these solutions rely on your input and on the fact that the pattern is known, or at least on the number of preceding sections delimited by / or on the version v1.0 as a checkpoint. The best solution might not appear here, since there are unlimited combinations of the URL. You have to know all the possible combinations of the input URL to find the best way to handle it.
Path is quite useful for that :
public static void main(String[] args) {
Path root = Paths.get("/service1/api/v1.0/foo");
Path relativize = root.relativize(Paths.get("/service1/api/v1.0/foo/{fooId}/boo"));
System.out.println(relativize);
}
Output :
{fooId}/boo
How about this:
String s = "/service1/api/v1.0/foo/{fooId}/boo";
String[] sArray = s.split("/");
StringBuilder sb = new StringBuilder();
for (int i = 4; i < sArray.length; i++) {
sb.append(sArray[i]).append("/");
}
sb.deleteCharAt(sb.length() - 1);
System.out.println(sb.toString());
Output:
foo/{fooId}/boo
If the url prefix is always /service1/api/v1.0/, you just need to do s.substring("/service1/api/v1.0/".length()).
There are a few good options here.
1) If you know "foo" will always be the 4th token, then you have the right idea already. The only issue with your way is that you have the information you need to be efficient, but you aren't using it. Instead of copying the String multiple times and looping anew from the beginning of the new String, you could just continue from where you left off, 4 times, to find the starting point of what you want.
String str = "/service1/api/v1.0/foo/{fooId}/boo";
// start at the beginning
int start = 0;
// get the 4th index of '/' in the string
for (int i = 0; i != 4; i++) {
// get the next index of '/' after the index 'start'
start = str.indexOf('/',start);
// increase the pointer to the next character after this slash
start++;
}
// get the substring
str = str.substring(start);
This will be far, far more efficient than any regex pattern.
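The same loop can be wrapped into a reusable helper that also cuts off request parameters, which the question mentions. This is a sketch under assumptions: NthIndex, nthIndexOf and tail are illustrative names, and returning an empty string when there are fewer than 4 slashes is an arbitrary choice.

```java
public class NthIndex {
    /** Index of the n-th occurrence (1-based) of ch in s, or -1 if there are fewer than n. */
    static int nthIndexOf(String s, char ch, int n) {
        int idx = -1;
        for (int i = 0; i < n; i++) {
            // continue searching right after the previous hit
            idx = s.indexOf(ch, idx + 1);
            if (idx < 0) {
                return -1;
            }
        }
        return idx;
    }

    /** Everything after the 4th '/', without a trailing "?query" part. */
    static String tail(String uri) {
        int slash = nthIndexOf(uri, '/', 4);
        if (slash < 0) {
            return "";
        }
        int query = uri.indexOf('?', slash);
        return query < 0 ? uri.substring(slash + 1) : uri.substring(slash + 1, query);
    }
}
```

For "/service1/api/v1.0/foo/{fooId}/boo?parameter=value" the 4th slash is at index 18, so tail returns "foo/{fooId}/boo".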
2) Regex: (java.util.regex.*). This will work if what you want is always preceded by "service1/api/v1.0/". There may be other directories before it, e.g. "one/two/three/service1/api/v1.0/".
// the '.' in "v1.0" is escaped by hand; wrapping the path in \Q \E would escape any special chars automatically
// (.+) will capture the matched text at that position
// $ marks the end of the string (technically it also matches just before a final '\n')
Pattern pattern = Pattern.compile("/service1/api/v1\\.0/(.+)$");
// get a matcher for it
Matcher matcher = pattern.matcher(str);
// if there is a match
if (matcher.find()) {
// get the captured text
str = matcher.group(1);
}
If your path can vary some, you can use regex to account for it, e.g. "service/api/v3/foo/{bar}/baz/" (note the varying number formats and the trailing '/') could be matched as well by changing the regex to "/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$". The capture is lazy here so that the optional trailing '/' stays out of the group; a greedy (.+) would swallow it.
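A sketch of such a more tolerant pattern in use. The pattern below is written out in full so the example stands on its own: the capture is lazy and followed by an optional '/', which keeps a trailing slash out of the captured group. UriTail and tail are illustrative names, and returning null when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UriTail {
    // "service" with optional digits, "api", a version like v3 or v1.0,
    // then the part we want, then an optional trailing slash before the end.
    private static final Pattern TAIL =
            Pattern.compile("/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$");

    // Hypothetical helper: returns the captured tail, or null when the path does not match.
    static String tail(String path) {
        Matcher m = TAIL.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(tail("/service/api/v3/foo/{bar}/baz/"));             // foo/{bar}/baz
        System.out.println(tail("/one/two/service1/api/v1.0/foo/{fooId}/boo")); // foo/{fooId}/boo
    }
}
```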

Regular expression in java that encloses some url

I have this problem:
I have to write a regular expression which handles these URLs:
http://www.amazon.it/TP-LINK-TL-WR841N-Wireless-300Mbps-Ethernet/dp/B001FWYGJS?ie=UTF8&redirect=true&ref_=s9_simh_gw_p147_d0_i2
http://www.amazon.it/gp/product/B014KMQWU0/
http://www.amazon.it/gp/product/glance/B014KMQWU0/
I need a regular expression which matches the full URL up to and including the ASIN of the product (the ASIN is a 10-character code of capital letters and digits).
I have written this regex, but it does not do what I want:
String regex="http:\\/\\/(?:www\\.|)amazon\\.com\\/(?:gp\\ product|| gp\\ product\\ glance || [^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
while (urlAmazonMatcher.find()) {
System.out.println("PROVA "+urlAmazonMatcher.group(0));
}
This is my solution. Finally it works :D
String regex="(http|www\\.)amazon\\.(com|it|uk|fr|de)\\/(?:gp\\/product|gp\\/product\\/glance|[^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
String toReturn = null;
while (urlAmazonMatcher.find()) {
toReturn=urlAmazonMatcher.group(0);
}
How about
/[^/?]{10}(/$|\?)
This matches 10 characters that are neither / nor ? following a slash if those characters are followed by a final slash or a question mark.
You can get the part that precedes or follows the ASIN using one of the various Matcher functions.
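For instance, Matcher.end(int) gives the position right after a capture group, which is enough to cut the URL off after the ASIN. In this sketch a capture group has been added around the 10 characters of the pattern above; AsinPrefix and upToAsin are illustrative names, and returning null when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AsinPrefix {
    // Same idea as the pattern above, with a capture group added around the
    // 10 characters so that their end position can be queried.
    private static final Pattern ASIN = Pattern.compile("/([^/?]{10})(/$|\\?)");

    // Hypothetical helper: the URL up to and including the ASIN, or null if none is found.
    static String upToAsin(String url) {
        Matcher m = ASIN.matcher(url);
        return m.find() ? url.substring(0, m.end(1)) : null;
    }

    public static void main(String[] args) {
        // prints http://www.amazon.it/gp/product/B014KMQWU0
        System.out.println(upToAsin("http://www.amazon.it/gp/product/B014KMQWU0/"));
    }
}
```

Note that a URL ending directly in the ASIN, with neither a final slash nor a question mark after it, is not matched by this pattern.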
Here is my work from a previous project that was to extract URLs from text:
private Pattern getUriPattern() {
if(uriPattern == null) {
// taken from http://labs.apache.org/webarch/uri/rfc/rfc3986.html
//TODO implement the full URI syntax
String genDelims = "\\:\\/\\?\\#\\[\\]\\@";
String subDelims = "\\!\\$\\&\\'\\*\\+\\,\\;\\=";
String reserved = genDelims + subDelims;
String unreserved = "\\w\\-\\.\\~"; // i.e. ALPHA / DIGIT / "-" / "." / "_" / "~"
String allowed = reserved + unreserved;
// ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
uriPattern = Pattern.compile("((?:[^\\:/\\?\\#]+:)?//[" + allowed + "&&[^\\?\\#]]*(?:\\?([" + allowed + "&&[^\\#]]*))?(?:\\#[" + allowed + "]*)?).*");
}
return uriPattern;
}
You can use the above method as follows:
Matcher uriMatcher =
getUriPattern().matcher(text);
if(uriMatcher.matches()) {
String candidateUriString = uriMatcher.group(1);
try {
new URI(candidateUriString); // check once again if you matched a URL
// your code here
} catch (Exception e) {
// error handling
}
}
This will catch the whole URL, including params. You can then split it at the first occurrence of '?' (if any) and take the first part. Of course, you can rework the regex too.

If a string contains a letter, return the entire String

Weird one, but:
Let's say you have a huge HTML page, and if the page contains an email address (found by looking for a # sign) you want to return that email.
So far I know I need something like this:
String email;
if (myString.contains("#")) {
email = myString.substring("#")
}
I know how to get to the # but how do I go back in the string to find what's before it etc?
If myString is the email string you received from the HTML page, then
you can simply return the same string if it contains #. Something like below:
String email;
if (myString.contains("#")) {
email = myString;
}
What is the challenge here? Can you explain where the difficulty lies?
This method will give you a list of all the email addresses contained in a string.
static ArrayList<String> getEmailAdresses(String str) {
ArrayList<String> result = new ArrayList<>();
Matcher m = Pattern.compile("\\S+?#[^. ]+(\\.[^. ]+)*").matcher(str.replaceAll("\\s", " "));
while(m.find()) {
result.add(m.group());
}
return result;
}
String email = null;
if (myString.contains("#")) {
// Locate the #
int atLocation = myString.indexOf("#");
// Get the string before the #
String start = myString.substring(0, atLocation);
// Keep only what follows the last space (lastIndexOf returns -1 if there is none)
start = start.substring(start.lastIndexOf(" ") + 1);
// Get the string after the #
String end = myString.substring(atLocation + 1);
// Keep only what precedes the first space after the #, if there is one
int space = end.indexOf(" ");
if (space >= 0) {
end = end.substring(0, space);
}
// Stick it all together
email = start + "#" + end;
}
This may be a little off as I've been writing javascript all day. :)
Rather than exact code, I would like to give you an approach.
Checking just for the # symbol might not be appropriate, as it can occur in other contexts as well.
Search the internet for, or create your own, regex pattern which matches an email
(if you want, you can add a check for email providers as well); here is a link: http://www.mkyong.com/regular-expressions/how-to-validate-email-address-with-regular-expression/
Use the regex to get the index of the match in the string and take the substring (the email, in your case).

Using regex on XML, avoid greedy match

It looks like a simple problem, but I'd appreciate any help here:
I need to swap the password value (it can be any value) for "****".
The original string is a string received from XML.
The problem is that I get as output only this line:
<parameter><value>*****</value></parameter>
But I need the whole string as output, with only the password value replaced.
Thank you in advance
String originalString = "<parameter>" +
"<name>password</name>"+
"<value>my123pass</value>"+
"</parameter>"+
"<parameter>"+
"<name>LoginAttempt</name>"+
"<value>1</value>"+
"</parameter>";
System.out.println("originalString: "+originalString);
Pattern pat = Pattern.compile("<name>password</name><value>.*</value>");
Matcher mat = pat.matcher(originalString);
System.out.println("NewString: ");
System.out.print(mat.replaceFirst("<value>***</value>"));
mat.reset();
If I'm not mistaken, you want to change the password in the string with *'s. You can do it by using String methods directly. Just get the last index of the starting value tag and iterate until you reach a "<", replacing the value between those two with *'s. Something like this:
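String prefix = "<name>password</name><value>";
int idx = originalString.lastIndexOf(prefix);
if (idx >= 0) {
// toCharArray() returns a copy, so edit one array and build a new String from it
char[] chars = originalString.toCharArray();
// Start right after the opening <value> tag and stop at the next '<'
for (int i = idx + prefix.length(); i < chars.length && chars[i] != '<'; i++) {
chars[i] = '*';
}
originalString = new String(chars);
}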
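EDIT: There is another way that makes proper use of the String class goodies (toCharArray() returns a copy, so the filled array has to be turned back into a String):
String prefix = "<name>password</name><value>";
int from = originalString.lastIndexOf(prefix) + prefix.length();
int to = originalString.indexOf("</value>", from);
char[] chars = originalString.toCharArray();
Arrays.fill(chars, from, to, '*');
originalString = new String(chars);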
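Since the question is about the greedy match itself, here is a regex-only sketch as well. It keeps the surrounding tags by capturing them and uses the reluctant quantifier .*? so that the match ends at the first closing value tag. PasswordMask and maskPassword are illustrative names, and the fixed "****" replacement is an assumption based on the question.

```java
public class PasswordMask {
    // Replaces only the value that directly follows <name>password</name>.
    // ".*?" is the reluctant (non-greedy) form of ".*": it stops at the first
    // "</value>" instead of running on to the last one in the string.
    static String maskPassword(String xml) {
        return xml.replaceFirst("(<name>password</name><value>).*?(</value>)", "$1****$2");
    }

    public static void main(String[] args) {
        String original = "<parameter><name>password</name><value>my123pass</value></parameter>"
                + "<parameter><name>LoginAttempt</name><value>1</value></parameter>";
        System.out.println(maskPassword(original));
    }
}
```

The groups $1 and $2 put the name and value tags back, so the rest of the string, including the LoginAttempt parameter, is left as it was.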
