Masking an email in Java without regex

I want to mask an email using a format such as xxx####xx###x####.x##,
where x means the character is displayed and # means it is hidden.
For example:
input: testemail#gmail.com
output: tes##ma###g####.c##
This is the code I have written:
private static String maskEmailId(String email, String format) {
    int start = format.indexOf("#");
    int end = email.indexOf("#");
    String result = maskPattern(start, end, email);
    start = email.indexOf('#') + (format.indexOf("#", format.indexOf("#")) - format.indexOf("#") - 1);
    end = email.indexOf('.');
    result = maskPattern(start + 1, end, result);
    start = email.indexOf(".") + (format.indexOf("#", format.indexOf(".")) - format.indexOf(".") - 1);
    end = email.length();
    result = maskPattern(start + 1, end, result);
    return result;
}
private static String maskPattern(int start, int end, String email) {
    StringBuilder sb = new StringBuilder(email);
    for (int i = start; i < end; i++) {
        sb.setCharAt(i, '#');
    }
    return sb.toString();
}
I am a bit confused about the logic needed to display and then hide characters again after the first hidden run.
Can anyone help with this?
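One possible way to read the requirement is to apply the format position by position: wherever the format has a #, the email character at the same index is hidden. The sketch below assumes exactly that. The class name EmailMasker, the method name maskByFormat and the 19-character format are hypothetical; the format was chosen so that it lines up with the example input and reproduces the example output. Characters beyond the end of the format are left visible.

```java
class EmailMasker {
    // Hypothetical helper: '#' in the format hides the character at the same
    // index of the email; any other format character leaves it visible.
    static String maskByFormat(String email, String format) {
        StringBuilder sb = new StringBuilder(email);
        // Only positions covered by both strings are considered.
        int limit = Math.min(email.length(), format.length());
        for (int i = 0; i < limit; i++) {
            if (format.charAt(i) == '#') {
                sb.setCharAt(i, '#');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed 19-character format that matches the example input length.
        System.out.println(maskByFormat("testemail#gmail.com", "xxx##xx###x####.x##"));
        // prints tes##ma###g####.c##
    }
}
```

Because show and hide runs are read directly from the format, no index arithmetic on delimiters is needed. If the format is meant as a per-segment template instead, this sketch would not apply as is.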

Related

Replace part of substring with specific characters based on delimiter

String s = "abc//jason:1234567#123.123.213:212/";
I want to replace the substrings immediately before and after the first ":" delimiter with dots.
I want my final output to be:
"abc//.....:.......#123.123.213:212/"
I tried the following, but because there is a second ":" in the string the result gets messed up. Is there a better way to get this output?
String[] headersplit;
headersplit = s.split(":");

If you only need to mask the characters between "//" and "#", the algorithm is simple, provided those delimiters are always present.
public class Main {
    public static void main(String[] args) {
        String s = "abc//jason:1234567#123.123.213.212/";
        System.out.println(replaceSensitiveInfo(s));
    }
    static String replaceSensitiveInfo(String src) {
        int slashes = src.indexOf("//");
        int colon = src.indexOf(":", slashes);
        int at = src.indexOf("#", colon);
        StringBuilder sb = new StringBuilder(src);
        sb.replace(slashes + 2, colon, ".".repeat(colon - slashes - 2));
        sb.replace(colon + 1, at, ".".repeat(at - colon - 1));
        return sb.toString();
    }
}

Not the best way but it works for your example and should work for others:
String s = "abc//jason:1234567#123.123.213:212/";
String result = replaceSensitiveInfo(s);
private String replaceSensitiveInfo(String info) {
    StringBuilder sb = new StringBuilder(info);
    String substitute = ".";
    int start = sb.indexOf("//") + 2;
    int end = sb.indexOf(":");
    String firstReplace = substitute.repeat(end - start);
    sb.replace(start, end, firstReplace);
    int start2 = sb.indexOf(":") + 1;
    int end2 = sb.indexOf("#");
    String secondReplace = substitute.repeat(end2 - start2);
    sb.replace(start2, end2, secondReplace);
    return sb.toString();
}
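Both answers assume that "//", ":" and "#" are always present and appear in that order. A minimal sketch of a more defensive variant follows; it returns the input unchanged when a delimiter is missing. The class name SensitiveInfoMasker and the method name maskCredentials are made up for illustration, and the loop with setCharAt is used instead of String.repeat so the sketch does not depend on Java 11.

```java
class SensitiveInfoMasker {
    // Masks everything between "//" and the first '#', except the first ':'.
    // Returns the input unchanged if a delimiter is missing or out of order.
    static String maskCredentials(String src) {
        int slashes = src.indexOf("//");
        if (slashes < 0) return src;
        int colon = src.indexOf(':', slashes + 2);
        if (colon < 0) return src;
        int hash = src.indexOf('#', colon + 1);
        if (hash < 0) return src;

        StringBuilder sb = new StringBuilder(src);
        for (int i = slashes + 2; i < hash; i++) {
            if (i != colon) {
                sb.setCharAt(i, '.');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskCredentials("abc//jason:1234567#123.123.213:212/"));
        // prints abc//.....:.......#123.123.213:212/
    }
}
```

Searching for the colon only after "//" and for "#" only after that colon is what keeps the second ":" later in the string from interfering.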

Split String from the last iteration

This post is a follow-up to an earlier one: "get specific character in a string with regex and remove unused zero".
Originally I wanted to remove the unused zeros in the last match with a regular expression,
but I found that a regular expression is overkill for what I need.
Here is what I would like to do now.
I would like to use the split() method
to get from this:
String myString = "2020-LI50532-3329-00100";
this:
String data1 = "2020";
String data2 = "LI50532";
String data3 = "3329";
String data4 = "00100";
Then I can remove the unused zeros from the LAST part
to convert "00100" into "100",
and then concatenate all the parts to get this:
"2020-LI50532-3329-100"
I'm not familiar with the split method, so I'd appreciate it if anyone could enlighten me about this.

You can use the substring method to get rid of the leading zeros:
String myString = "2020-LI50532-3329-00100";
String[] data = myString.split("-");
// Drops the first two characters; this assumes the last part always has exactly two leading zeros.
data[3] = data[3].substring(2);
String result = String.join("-", data);
System.out.println(result);

Assuming that we want to remove the leading zeroes of ONLY the last block, maybe we can:
1. Extract the last block
2. Convert it to Integer and back to String to remove leading zeroes
3. Replace the last block with the String obtained in the step above
Something like this:
public String removeLeadingZeroesFromLastBlock(String text) {
    int indexOfLastDelimiter = text.lastIndexOf('-');
    if (indexOfLastDelimiter >= 0) {
        String lastBlock = text.substring(indexOfLastDelimiter + 1);
        String lastBlockWithoutLeadingZeroes = String.valueOf(Integer.valueOf(lastBlock)); // will throw exception if last block is not an int
        return text.substring(0, indexOfLastDelimiter + 1).concat(lastBlockWithoutLeadingZeroes);
    }
    return text;
}

Solution using regex:
public class Main {
    public static void main(String[] args) {
        // Test
        System.out.println(parse("2020-LI50532-3329-00100"));
        System.out.println(parse("2020-LI50532-3329-00001"));
        System.out.println(parse("2020-LI50532-03329-00100"));
        System.out.println(parse("2020-LI50532-03329-00001"));
    }
    static String parse(String str) {
        // The lookbehind restricts the match to zeros at the start of the last block
        return str.replaceAll("(?<=-)0+(?=[1-9]\\d*$)", "");
    }
}
Output:
2020-LI50532-3329-100
2020-LI50532-3329-1
2020-LI50532-03329-100
2020-LI50532-03329-1
Explanation of the regex:
One or more zeros that directly follow a hyphen (the lookbehind (?<=-)) and are followed by a non-zero digit, which can optionally be followed by any digit(s) until the end of the string (specified by $). Without the lookbehind, zeros inside the number would also be removed, so a last block such as 00101 would become 11 instead of 101.
Solution without using regex:
You can also do it with Integer.parseInt, which parses a string like 00100 into 100.
public class Main {
    public static void main(String[] args) {
        // Test
        System.out.println(parse("2020-LI50532-3329-00100"));
        System.out.println(parse("2020-LI50532-3329-00001"));
        System.out.println(parse("2020-LI50532-03329-00100"));
        System.out.println(parse("2020-LI50532-03329-00001"));
    }
    static String parse(String str) {
        String[] parts = str.split("-");
        try {
            parts[parts.length - 1] = String.valueOf(Integer.parseInt(parts[parts.length - 1]));
        } catch (NumberFormatException e) {
            // Last part is not an int: leave it unchanged
        }
        return String.join("-", parts);
    }
}
Output:
2020-LI50532-3329-100
2020-LI50532-3329-1
2020-LI50532-03329-100
2020-LI50532-03329-1

You can convert the last part of the string to an integer, as below, to remove the unused zeros:
String myString = "2020-LI50532-3329-00100";
String[] data = myString.split("-");
StringBuilder sb = new StringBuilder();
sb.append(data[0] + "-" + data[1] + "-" + data[2] + "-" + Integer.parseInt(data[3]));
String result = sb.toString();
System.out.println(result);

You should avoid String manipulation where possible and rely on existing types in the Java language. One such type is the Integer. It looks like your code consists of 4 parts - Year (Integer) - String - Integer - Integer.
So to properly validate it I would use the following code:
Scanner scan = new Scanner("2020-LI50532-3329-00100");
scan.useDelimiter("-");
Integer firstPart = scan.nextInt();
String secondPart = scan.next();
Integer thirdPart = scan.nextInt();
Integer fourthPart = scan.nextInt();
Or alternatively something like:
String str = "00100";
int num = Integer.parseInt(str);
System.out.println(num);
If you want to reconstruct your original value, you should probably use a NumberFormat to add the missing 0s.
The main points are:
Always try to reuse existing code and tools available in your language
Always try to use available types (LocalDate, Integer, Long)
Create your own types (classes) and use the expressiveness of the Object Oriented language
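The remark above about using a NumberFormat to restore the missing zeros can be sketched as follows. This is only an illustration: the class name ZeroPadding, the method name padZeros and the width of 5 are assumptions, not part of the answer.

```java
import java.text.NumberFormat;
import java.util.Locale;

class ZeroPadding {
    // Formats value with at least `width` integer digits, padding with zeros.
    static String padZeros(int value, int width) {
        NumberFormat nf = NumberFormat.getIntegerInstance(Locale.ROOT);
        nf.setMinimumIntegerDigits(width);
        nf.setGroupingUsed(false); // no thousands separators
        return nf.format(value);
    }

    public static void main(String[] args) {
        int num = Integer.parseInt("00100"); // 100
        System.out.println(num);
        System.out.println(padZeros(num, 5)); // prints 00100
    }
}
```

Keeping the value as an int and only formatting it on output means the width lives in one place rather than in every string that carries the number.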

public class Test {
    public static void main(String[] args) {
        System.out.println(trimLeadingZeroesFromLastPart("2020-LI50532-03329-00100"));
    }
    private static String trimLeadingZeroesFromLastPart(String input) {
        String delem = "-";
        String result = "";
        if (input != null && !input.isEmpty()) {
            String[] data = input.split(delem);
            StringBuilder tempStrBldr = new StringBuilder();
            for (int idx = 0; idx < data.length; idx++) {
                if (idx == data.length - 1) {
                    tempStrBldr.append(trimLeadingZeroes(data[idx]));
                } else {
                    tempStrBldr.append(data[idx]);
                }
                tempStrBldr.append(delem);
            }
            result = tempStrBldr.substring(0, tempStrBldr.length() - 1);
        }
        return result;
    }
    private static String trimLeadingZeroes(String input) {
        int idx;
        for (idx = 0; idx < input.length() - 1; idx++) {
            if (input.charAt(idx) != '0') {
                break;
            }
        }
        return input.substring(idx);
    }
}
Output:
2020-LI50532-03329-100

Finding Pattern from one string to apply to another string - Java

So I have a string like this: <em>1234</em>56.70
It is basically a number in which the em tags mark the part of the string to highlight.
I first need to convert the string to an actual number in the current locale's format. So I remove the em tags (replaceAll with an empty string) and then use the numberFormat Java API to get a string like: $123,456.70
The problem is that I lose the highlight (em) tags this way, so I need to put them back into the formatted string, something like this: <em>$123,4</em>56.70
highlightValue = "<em>1234</em>56.70";
highlightValue = highlightValue.replaceAll("<em>", "").replaceAll("</em>", ""); // highlightValue is now 123456.70
highlightValue = numberFormat.convertToFormat(highlightValue, currencyCode); // highlightValue is now $123,456.70
highlightValue = someFunction(highlightValue); // this function needs to return <em>$123,4</em>56.70
I am not sure what approach to use. I was trying pattern matching but didn't know how to achieve it.
All help appreciated !

I am assuming that you want to highlight the number from the start up to some number of digits. This can be done.
In the initial string, count the number of digits that precede the closing tag. The opening tag will always be placed at the beginning, so it is only the closing tag you have to worry about. In the formatted string, count digits again, skipping any other symbols. Once the required number of digits has been passed, insert the closing tag. You can either create a StringBuilder from the formatted string and insert the tag directly, or split the string into two substrings and join them with the tag in the middle.
Hope this helps.

I took an approach where I count the digits in front of the tag and inside the tag, since I think no formatting will actually change the digits (assuming you don't add leading zeroes). After formatting, I insert the opening tag based on the number of digits that were in front of it, and the closing tag based on the number of digits in front of and inside the tag.
This is the code:
import java.text.NumberFormat;
import java.util.Locale;
public static void main(String[] args) {
    String input1 = "<em>1234</em>56.70";
    String result1 = formatString(input1, "em");
    System.out.printf("input1 = %s%n", input1);
    System.out.printf("result1 = %s%n", result1);
    String input2 = "<em>8127</em>29.12";
    String result2 = formatString(input2, "em");
    System.out.printf("input2 = %s%n", input2);
    System.out.printf("result2 = %s%n", result2);
}
private static String formatString(String input, String tagName) {
    String tagOpening = String.format("<%s>", tagName);
    int tagOpeningLength = tagOpening.length();
    String tagClosing = String.format("</%s>", tagName);
    int tagClosingLength = tagClosing.length();
    int inputLength = input.length();
    int tagOpeningPos = input.indexOf(tagOpening);
    int tagClosingPos = input.indexOf(tagClosing, tagOpeningPos);
    // Text in front of the opening tag
    String beforeTag;
    if (tagOpeningPos > 0)
        beforeTag = input.substring(0, tagOpeningPos);
    else
        beforeTag = "";
    int digitsInBeforeTag = countNumbers(beforeTag);
    // Text between the opening and closing tags
    String tagValue;
    if (tagOpeningPos + tagOpeningLength < tagClosingPos)
        tagValue = input.substring(tagOpeningPos + tagOpeningLength, tagClosingPos);
    else
        tagValue = "";
    int digitsInTagValue = countNumbers(tagValue);
    // Text after the closing tag
    String afterTag;
    if ((tagClosingPos + tagClosingLength) < inputLength)
        afterTag = input.substring(tagClosingPos + tagClosingLength);
    else
        afterTag = "";
    String valueToBeFormatted = beforeTag + tagValue + afterTag;
    double value = Double.parseDouble(valueToBeFormatted);
    NumberFormat nf = NumberFormat.getInstance(Locale.ENGLISH);
    String formattedValue = nf.format(value);
    // Map the digit counts back to positions in the formatted string
    int newEmOpeningPos = findSubstringWithThisManyNumbers(formattedValue, digitsInBeforeTag);
    int newEmClosingPos = findSubstringWithThisManyNumbers(formattedValue, digitsInBeforeTag + digitsInTagValue);
    StringBuilder result = new StringBuilder();
    result.append(formattedValue.substring(0, newEmOpeningPos));
    result.append(tagOpening);
    result.append(formattedValue.substring(newEmOpeningPos, newEmClosingPos));
    result.append(tagClosing);
    result.append(formattedValue.substring(newEmClosingPos));
    return result.toString();
}
// Returns the length of the shortest prefix of input that contains digitCount digits
private static int findSubstringWithThisManyNumbers(String input, int digitCount) {
    int pos = 0;
    int counter = 0;
    for (char c : input.toCharArray()) {
        if (counter >= digitCount)
            break;
        if (Character.isDigit(c))
            counter++;
        pos++;
    }
    return pos;
}
// Counts the digit characters in str
private static int countNumbers(String str) {
    int result = 0;
    for (char c : str.toCharArray())
        if (Character.isDigit(c))
            result++;
    return result;
}
The output was:
input1 = <em>1234</em>56.70
result1 = <em>123,4</em>56.7
input2 = <em>8127</em>29.12
result2 = <em>812,7</em>29.12

I'm not sure how practical this is, but anyway:
String highlightValue = "0<em>1234</em>56.70";
int startIndex = highlightValue.indexOf("<em>");
String startString = highlightValue.substring(0, startIndex);
String endString = highlightValue.substring(highlightValue.indexOf("</em>") + "</em>".length());
highlightValue = highlightValue.replaceAll("<em>", "").replaceAll("</em>", "");
highlightValue = numberFormat.convertToFormat(highlightValue, currencyCode);
// highlightValue is now $123,456.70
int endIndex = highlightValue.indexOf(endString);
highlightValue = startString + "<em>" + highlightValue.substring(0, endIndex) + "</em>" + endString;
System.out.println(highlightValue);
// 0<em>$123,4</em>56.70

How can I parse the content of file to variables?

I have a text file which looks like this:
ABC=-1 Temp=2 Try=34 Message="some text" SYS=3
ABC=-1 Temp=5 Try=40 Message="some more and different text" SYS=6
The pattern continues; only the numeric values and the text inside the quotes change.
NOTE: the Message= value could contain additional quotes as well.
I want to store the values of ABC, Temp, Try and SYS in int variables
and Message in a String variable.
I am currently using:
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    int count = line.indexOf("ABC=");
    if (count >= 0) {
        int clear = line.charAt(count + 3);
    }
}
scanner.close();
I thought of using the Scanner class to read line by line, but I am confused about how to split each line into the different variables.

First make a class that represents the data:
public static class MyData { // please pick a better name
    final int abc;
    final int temp;
    final int tryNumber; // try is a keyword
    final String message;
    final int sys;
    public MyData(int abc, int temp, int tryNumber, String message, int sys) {
        this.abc = abc;
        this.temp = temp;
        this.tryNumber = tryNumber;
        this.message = message;
        this.sys = sys;
    }
}
Then make a method that transforms a String into this class using Regex capture groups:
// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
private static final Pattern p =
    Pattern.compile("ABC=([^ ]+) Temp=([^ ]+) Try=([^ ]+) Message=\"(.+)\" SYS=([^ ]+)");
private static MyData makeData(String input) {
    int abc = 0, temp = 0, tryNumber = 0, sys = 0;
    String message = "";
    Matcher m = p.matcher(input);
    if (!m.find()) return null; // line does not match the expected layout
    abc = Integer.parseInt(m.group(1));
    temp = Integer.parseInt(m.group(2));
    tryNumber = Integer.parseInt(m.group(3));
    message = m.group(4);
    sys = Integer.parseInt(m.group(5));
    return new MyData(abc, temp, tryNumber, message, sys);
}
Then read the file using a scanner:
public static void main(String... args) throws Exception {
    File file = new File("/path/to/your/file.txt");
    List<MyData> dataList = new ArrayList<>();
    Scanner scanner = new Scanner(file);
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        MyData data = makeData(line);
        if (data != null) dataList.add(data);
    }
    scanner.close();
}
Here's a completely working demo on ideone

You can use regex for this kind of parsing with a pattern of:
"ABC=([+-]?\\d+) Temp=([+-]?\\d+) Try=([+-]?\\d+) Message=\"(.+)\" SYS=([+-]?\\d+)"
Pattern breakdown:
ABC= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 1
Temp= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 2
Try= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 3
Message= - literal string
\"(.+)\" captures a string in between double quotes in capture group 4
SYS= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 5
If the String matches the pattern you can extract your values like this:
// requires: import java.util.Arrays; import java.util.List;
//           import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) throws Exception {
    List<String> data = Arrays.asList(
        "ABC=-1 Temp=2 Try=34 Message=\"some text\" SYS=3",
        "ABC=-1 Temp=5 Try=40 Message=\"some more \"and\" different text\" SYS=6");
    // Compile the pattern once instead of on every loop iteration
    Pattern pattern = Pattern.compile(
        "ABC=([+-]?\\d+) Temp=([+-]?\\d+) Try=([+-]?\\d+) Message=\"(.+)\" SYS=([+-]?\\d+)");
    int abc = 0;
    int temp = 0;
    int tryNum = 0;
    String message = "";
    int sys = 0;
    for (String d : data) {
        Matcher matcher = pattern.matcher(d);
        if (matcher.matches()) {
            abc = Integer.parseInt(matcher.group(1));
            temp = Integer.parseInt(matcher.group(2));
            tryNum = Integer.parseInt(matcher.group(3));
            message = matcher.group(4);
            sys = Integer.parseInt(matcher.group(5));
            System.out.printf("%d %d %d %s %d%n", abc, temp, tryNum, message, sys);
        }
    }
}
Results:
-1 2 34 some text 3
-1 5 40 some more "and" different text 6

If you are already using the indexOf approach, the following code will work
String a = "ABC=-1 Temp=2 Try=34 Message=\"some text\" SYS=3";
int abc_index = a.indexOf("ABC");
int temp_index = a.indexOf("Temp");
int try_index = a.indexOf("Try");
int message_index = a.indexOf("Message");
int sys_index = a.indexOf("SYS");
int length = a.length();
int abc = Integer.parseInt(a.substring(abc_index + 4, temp_index - 1));
int temp = Integer.parseInt(a.substring(temp_index + 5, try_index - 1));
int try_ = Integer.parseInt(a.substring(try_index + 4, message_index - 1));
String message = a.substring(message_index + 9, sys_index - 2);
int sys = Integer.parseInt(a.substring(sys_index + 4, length));
System.out.println("abc : " + abc);
System.out.println("temp : " + temp);
System.out.println("try : " + try_);
System.out.println("message : " + message);
System.out.println("sys : " + sys);
This will give you the following
abc : -1
temp : 2
try : 34
message : some text
sys : 3
This will work only if the string data you get has this exact syntax, i.e., it contains ABC, Temp, Try, Message, and SYS in this order, each separated by a single space. Hope this helps.
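Since the question notes that the Message value may itself contain quotes, here is a hedged sketch of an indexOf-based variant that locates the end of the message with lastIndexOf, so inner quotes do not cut the message short. The class name LineParser and the helper names intAfter, messageOf and sysOf are hypothetical, and the sketch still assumes every key is present and in the order shown in the question.

```java
class LineParser {
    // Parses the integer that follows the first occurrence of `key`
    // and runs up to the next space (or the end of the line).
    static int intAfter(String line, String key) {
        int start = line.indexOf(key) + key.length();
        int end = line.indexOf(' ', start);
        if (end < 0) end = line.length();
        return Integer.parseInt(line.substring(start, end));
    }

    // Takes everything between the first Message=" and the LAST " SYS=
    // so that quotes inside the message are kept.
    static String messageOf(String line) {
        String open = "Message=\"";
        int start = line.indexOf(open) + open.length();
        int end = line.lastIndexOf("\" SYS=");
        return line.substring(start, end);
    }

    // SYS is read from its last occurrence, in case the message mentions "SYS=".
    static int sysOf(String line) {
        String key = " SYS=";
        return Integer.parseInt(line.substring(line.lastIndexOf(key) + key.length()).trim());
    }

    public static void main(String[] args) {
        String line = "ABC=-1 Temp=5 Try=40 Message=\"some more \"and\" different text\" SYS=6";
        System.out.println(intAfter(line, "ABC="));  // -1
        System.out.println(intAfter(line, "Temp=")); // 5
        System.out.println(intAfter(line, "Try="));  // 40
        System.out.println(messageOf(line));         // some more "and" different text
        System.out.println(sysOf(line));             // 6
    }
}
```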

detect incomplete patterns in strings

I have a string containing nested repeating patterns, for example:
String pattern1 = "1234";
String pattern2 = "5678";
String patternscombined = "1234|1234|5678|9"; // added | for reading pleasure
String pattern = (pattern1 + pattern1 + pattern2 + "9")
    + (pattern1 + pattern1 + pattern2 + "9")
    + (pattern1 + pattern1 + pattern2 + "9");
String result = "1234|1234|5678|9|1234|1234|56";
As you can see in the above example, the result got cut off. But when you know the repeating patterns, you can predict what could come next.
Now to my question:
How can I predict the next repetitions of this pattern to get a resulting string like:
String predictedresult = "1234|1234|5678|9|1234|1234|5678|9|1234|1234|5678|9";
Patterns will be shorter than 10 characters, and the predicted result will be shorter than 1000 characters.
I only receive the cut-off result string; a pattern recognition program is already implemented and working. In the above example, I would have result, pattern1, pattern2 and patternscombined.
EDIT:
I have found a solution working for me:
import java.util.Arrays;
public class LRS {
    // return the longest common prefix of s and t
    public static String lcp(String s, String t) {
        int n = Math.min(s.length(), t.length());
        for (int i = 0; i < n; i++) {
            if (s.charAt(i) != t.charAt(i))
                return s.substring(0, i);
        }
        return s.substring(0, n);
    }
    // return the longest repeated string in s
    public static String lrs(String s) {
        // form the N suffixes
        int N = s.length();
        String[] suffixes = new String[N];
        for (int i = 0; i < N; i++) {
            suffixes[i] = s.substring(i, N);
        }
        // sort them
        Arrays.sort(suffixes);
        // find longest repeated substring by comparing adjacent sorted suffixes
        String lrs = "";
        for (int i = 0; i < N - 1; i++) {
            String x = lcp(suffixes[i], suffixes[i + 1]);
            if (x.length() > lrs.length())
                lrs = x;
        }
        return lrs;
    }
    public static int startingRepeats(final String haystack, final String needle)
    {
        String s = haystack;
        final int len = needle.length();
        if (len == 0) {
            return 0;
        }
        int count = 0;
        while (s.startsWith(needle)) {
            count++;
            s = s.substring(len);
        }
        return count;
    }
    public static String lrscutoff(String s) {
        String lrs = s;
        int length = s.length();
        for (int i = length; i > 0; i--) {
            String x = lrs(s.substring(0, i));
            if (startingRepeats(s, x) < 10 &&
                    startingRepeats(s, x) > startingRepeats(s, lrs)) {
                lrs = x;
            }
        }
        return lrs;
    }
    // break a hard-coded sample string down into its nested repeating units
    // and print each unit together with the predicted remainder
    public static void main(String[] args) {
        long time = System.nanoTime();
        long timemilis = System.currentTimeMillis();
        String s = "12341234567891234123456789123412345";
        String repeat = s;
        while (repeat.length() > 0) {
            System.out.println("-------------------------");
            String repeat2 = lrscutoff(repeat);
            System.out.println("'" + repeat + "'");
            int count = startingRepeats(repeat, repeat2);
            String rest = repeat.substring(count * repeat2.length());
            System.out.println("predicted: (rest ='" + rest + "')");
            while (count > 0) {
                System.out.print("'" + repeat2 + "' + ");
                count--;
            }
            if (repeat.equals(repeat2)) {
                System.out.println("''");
                break;
            }
            // compare string contents; != would only compare references
            if (!rest.isEmpty() && repeat2.contains(rest)) {
                System.out.println("'" + repeat2 + "'");
            } else {
                System.out.println("'" + rest + "'");
            }
            repeat = repeat2;
        }
        System.out.println("Time: (nano+millis):");
        System.out.println(System.nanoTime() - time);
        System.out.println(System.currentTimeMillis() - timemilis);
    }
}

If the string you are predicting from is always pipe-delimited (|), you can simply split it on the pipe and keep track of the counts in a HashMap. For example:
1234 = 2
1344 = 1
4411 = 5
If not, you will have to modify the Longest Repeated Substring algorithm. Since you need all repeated substrings, keep track of all of them instead of only the longest one. You will also need checks for a minimum substring length and for overlapping substrings. Searching online turns up many references for this algorithm.
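The pipe-splitting and counting idea from the first part of this answer could be sketched as below. The class name TokenCounter and the method name countTokens are illustrative; a LinkedHashMap is used in place of a plain HashMap only so that the printed order follows first appearance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TokenCounter {
    // Splits on '|' and counts how often each token occurs.
    static Map<String, Integer> countTokens(String piped) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : piped.split("\\|")) { // '|' must be escaped in a regex
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countTokens("1234|1234|5678|9|1234|1234|56"));
        // prints {1234=4, 5678=1, 9=1, 56=1}
    }
}
```

Note that the truncated token 56 is counted as its own entry, so a cut-off tail has to be handled separately from the complete tokens.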

You seem to need something like an n-gram language model, which is a statistical model that is based on counts of co-occurring events. If you are given some training data, you can derive the probabilities from counts of seen patterns. If not, you can try to specify them manually, but this can get tricky. Once you have such a language model (where the digit patterns correspond to words), you can always predict the next word by picking one with the highest probability given some previous words ("history").
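As a rough illustration of the counting idea, here is a minimal bigram sketch: it counts which token follows which in some training data and predicts the most frequent follower. The class name BigramPredictor, its methods and the pipe-delimited training string are all assumptions made for this example; a real n-gram model would normally use a longer history and proper probabilities.

```java
import java.util.HashMap;
import java.util.Map;

class BigramPredictor {
    // followers.get(a).get(b) = how often token b was seen right after token a
    private final Map<String, Map<String, Integer>> followers = new HashMap<>();

    void train(String[] tokens) {
        for (int i = 0; i + 1 < tokens.length; i++) {
            followers.computeIfAbsent(tokens[i], k -> new HashMap<>())
                     .merge(tokens[i + 1], 1, Integer::sum);
        }
    }

    // Returns the most frequently seen follower of `previous`,
    // or null if `previous` never appeared with a follower.
    String predictNext(String previous) {
        Map<String, Integer> counts = followers.get(previous);
        if (counts == null) return null;
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        BigramPredictor model = new BigramPredictor();
        model.train("1234|1234|5678|9|1234|1234|5678|9".split("\\|"));
        System.out.println(model.predictNext("5678")); // 9
        System.out.println(model.predictNext("9"));    // 1234
    }
}
```

With a history of only one token, the follower of 1234 is ambiguous in this data (1234 and 5678 are seen equally often), which is why a longer history, such as the previous two tokens, would be needed to reproduce the full pattern from the question.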
