Java String.replaceAll method to sanitize phone numbers

I have a database field called TelephoneName. This field contains telephone numbers in many different formats.
What I need now is to separate them into country code and subscriber number.
For example, I saw the telephone number +49 (0)711 / 61947-xx.
I want to remove all slashes, brackets, minus signs and spaces. The result should be +49 (country code) and 071161947** (subscriber number).
How can I do that with the replaceAll method?
Is replaceAll("//()-","") correct?
The thing is, I have a lot of unformatted telephone numbers such as:
+49 04261 85120
+32027400050
It is difficult to apply the same algorithm to every telephone number.

The replaceAll method takes a regular expression as argument. To remove everything except digits and +, you could thus do
str = str.replaceAll("[^0-9+]", "")
Here's a more complete example that also figures out the country code (based on the index of the ( symbol):
String str = "+49 (0)711 / 61947-12";
int lpar = str.indexOf('(');
String countryCode = str.substring(0, lpar).trim();
String subscriber = str.substring(lpar).trim();
subscriber = subscriber.replaceAll("[^0-9]", "");
System.out.println(countryCode); // prints +49
System.out.println(subscriber); // prints 07116194712
replaceAll("//()-","") is that correct?
No, not quite. That will remove all //- substrings. To remove those characters you need to put them in [...], like this: replaceAll("[/()-]", "") (and / does not need to be escaped).
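
For numbers that have no parenthesis to split on, such as +49 04261 85120 or +32027400050, one option is to strip the formatting first and then compare the result against a list of known country codes. The sketch below is only an illustration: the class name, the method name and the tiny KNOWN_CODES list (just +49 and +32) are assumptions, and a real application would need a complete country code table.

```java
class PhoneSplitter {
    // Hypothetical, deliberately incomplete list of country codes (assumption for this sketch).
    private static final String[] KNOWN_CODES = { "+49", "+32" };

    // Returns { countryCode, subscriberNumber }, or null if no known code matches.
    static String[] split(String raw) {
        // Keep only digits and '+', exactly like replaceAll("[^0-9+]", "") above.
        String cleaned = raw.replaceAll("[^0-9+]", "");
        for (String code : KNOWN_CODES) {
            if (cleaned.startsWith(code)) {
                return new String[] { code, cleaned.substring(code.length()) };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] samples = { "+49 (0)711 / 61947-12", "+49 04261 85120", "+32027400050" };
        for (String sample : samples) {
            String[] parts = split(sample);
            System.out.println(parts[0] + " | " + parts[1]);
        }
    }
}
```

With a fixed list like this, any number whose prefix is not listed comes back as null, so the caller has to decide what to do with unrecognized numbers.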

The first argument of replaceAll() is a regex pattern, so what you want to do is make it match all non digits (and +). You can do this using the "[^...]" (not one of...) construct :
mystring.replaceAll("[^0-9+]", "")

No, that doesn't work.
replaceAll() replaces each substring of this string that matches the given regular expression with the given replacement.
So your expression would only replace occurrences of the exact sequence //- (the () is an empty group) with an empty string.
You need to do something like
String output = "+49 (0)711 / 61947-xx".replaceAll("[/()-]","");
The square brackets make it a regex character class ('either slash or open bracket or close bracket or hyphen'), rather than a sequence of characters that has to match in that exact order.
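
To make the difference concrete, here is a small sketch that runs both patterns on the number from the question. The class and method names are made up for illustration.

```java
class CharacterClassDemo {
    // The pattern from the question: matches only the exact text "//-".
    static String stripSequence(String s) {
        return s.replaceAll("//()-", "");
    }

    // Character class: matches any single '/', '(', ')' or '-'.
    static String stripAnyOf(String s) {
        return s.replaceAll("[/()-]", "");
    }

    public static void main(String[] args) {
        String number = "+49 (0)711 / 61947-xx";
        // Unchanged, because the text contains no "//-".
        System.out.println(stripSequence(number));
        // Slashes, brackets and the hyphen are removed; spaces are kept.
        System.out.println(stripAnyOf(number));
    }
}
```

Note that the character class does not include the space, so spaces survive; add a space inside the brackets if they should go as well.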

This can be done simply with replace, which takes literal text instead of a regex:
s=s.replace("/","");
s=s.replace("(","");
s=s.replace(")","");
Then use substring to get the country code.

Related

Replace a nth character using regex in Java

I'm trying to learn regex in Java.
So far, I've been trying some little mini challenges and I'm wondering if there is a way to define a nth character.
For instance, let's say I have this string: todayiwasnotagoodday
If I want to replace the third (or fourth, or seventh) character, how can I define a regex that changes a specific "index"? In this example, the 'd' should be replaced with an empty space "".
I've been searching about it, but so far my implementations match from the first element to the third: ^[a-z]{3}
Is it possible to define this regex?
Thanks in advance.

If you want to replace the third character with a space via regex, you could try a regex replace all:
String input = "todayiwasnotagoodday";
String output = input.replaceAll("^(.{2}).(.*)$", "$1 $2");
System.out.println(output); // to ayiwasnotagoodday
Note that you could also avoid regex here, and just use substring operations:
String output = input.substring(0, 2) + " " + input.substring(3);
System.out.println(output); // to ayiwasnotagoodday
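
The same idea can be generalized to any position by building the pattern from the index. This is only a sketch: replaceNth is a hypothetical helper name, and it assumes n is 1-based and at least 1.

```java
import java.util.regex.Matcher;

class NthCharacter {
    // Replaces the n-th character (1-based, n >= 1) of input.
    // Returns input unchanged if it has fewer than n characters.
    static String replaceNth(String input, int n, String replacement) {
        // ^(.{n-1}) captures everything before position n; the final '.' is the n-th character.
        String regex = "^(.{" + (n - 1) + "}).";
        // quoteReplacement keeps '$' and '\' in the replacement from being interpreted.
        return input.replaceFirst(regex, "$1" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "todayiwasnotagoodday";
        System.out.println(replaceNth(input, 3, " "));
        System.out.println(replaceNth(input, 7, "_"));
    }
}
```

Because '.' does not match line terminators by default, this sketch is meant for single-line strings.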

How to check and replace a sequence of characters in a String?

Here is what the program is expected to output:
if originalString = "CATCATICATAMCATCATGREATCATCAT";
Output should be "I AM GREAT".
The code must find the sequence of characters (CAT in this case) and remove it. Plus, the resulting String must have spaces in between words.
String origString = remixString.replace("CAT", "");
I figured out I have to use String.replace, but what could be the logic for finding the parts that are not CAT and producing the resulting string with spaces in between the words?

First off, note that replace already replaces every occurrence of "CAT", but it only accepts literal text; to match a pattern you need the replaceAll method, which takes a regular expression. Then, you want to introduce spaces, so instead of an empty String, replace "CAT" with " " (space).
As pointed out by the comment below, there might be multiple spaces between words, so we use a regular expression to replace each run of consecutive "CAT"s with a single space. The '+' symbol means "one or more".
Finally, trim the String to get rid of leading and trailing white space.
remixString.replaceAll("(CAT)+", " ").trim()

You can use replaceAll which accepts a regular expression:
String remixString = "CATCATICATAMCATCATGREATCATCAT";
String origString = remixString.replaceAll("(CAT)+", " ").trim();
Note: the naming of replace and replaceAll is very confusing. They both replace all instances of the matching string; the difference is that replace takes a literal text as an argument, while replaceAll takes a regular expression.
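
A short sketch of that difference, using the string from the question (the class and method names are made up for illustration):

```java
class CatRemover {
    // Literal replacement: every "CAT" becomes its own space, so runs of CATs leave double spaces.
    static String literalReplace(String s) {
        return s.replace("CAT", " ");
    }

    // Regex replacement: each run of one or more "CAT"s becomes a single space, then trim the ends.
    static String regexReplace(String s) {
        return s.replaceAll("(CAT)+", " ").trim();
    }

    public static void main(String[] args) {
        String remixString = "CATCATICATAMCATCATGREATCATCAT";
        System.out.println("[" + literalReplace(remixString) + "]"); // prints [  I AM  GREAT  ]
        System.out.println("[" + regexReplace(remixString) + "]");   // prints [I AM GREAT]
    }
}
```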

Maybe this will help
String result = remixString.replaceAll("(CAT){1,}", " ");

How to check if a word ends and starts with a common symbol and replace it as many times it appears with 1

I am facing a little challenge, here's what I've been trying to do.
Assuming I have these 2 variables
String word1 ="hello! hello!! %can you hear me%? Yes I can.";
And then this one
String word2 ="*Java is awesome* Do you % agree % with us?";
I want to be able to check whether a variable contains a word (or phrase) that begins and ends with a particular symbol, like the % and * I am using, and replace it with '1' (one). Here's what I tried.
StringTokenizer st = new StringTokenizer(word1);
while(st.hasMoreTokens()){
    String block = st.nextToken();
    if( (block.startsWith("%") && block.endsWith("%")) || (block.startsWith("*") && block.endsWith("*")) ){
        word1.replace (block,"1");
    }
}
//output
"hello!hello!!%canyouhearme%?YesIcan."
//expected
"hello! hello!! 1? Yes I can.";
It just ended up stripping the spaces. I guess this is because the delimiter used is a space, and because the last token ends with %? it is read as a single block.
When I tried the same for word2
I got "1Doyou%agree%withus?"
//expected
"1 Do you 1 with us?"
And assuming I have another word like
String word3 ="%*hello*% friends";
I want to be able to produce
//output
"1friends"
//expected
"11 friends"
Since it has 4-symbols
Any help would be truly appreciated, just sharpening my java skills. Thanks.

You can use a Regular Expression (RegEx) within the String.matches() method for determining if a string contains the specific criteria, for example:
if (word1.matches(".*\\*.*\\*.*|.*\\%.*\\%.*")) {
// Replace desired test with the value of 1 here...
}
If you want the full explanation of this regular expression, go to regex101.com and enter the following expression: .*\*.*\*.*|.*\%.*\%.*.
The above if statement condition utilizes the String.matches() method to validate whether or not the string contains text (or no text) between either asterisks (*) or between percent (%) characters. If it does we simply use the String.replaceAll() method to replace those string sections (between and including *...* and %...%) with the value of 1, something like this:
String word1 = "hello! hello!! %can you hear me%? Yes I can.";
if (word1.matches(".*\\*.*\\*.*|.*\\%.*\\%.*")) {
String newWord1 = word1.replaceAll("\\*.*\\*|%.*%", "1");
System.out.println(newWord1);
}
The Console window will display:
hello! hello!! 1? Yes I can.
If you were to plug this string: "*Java is awesome* Do you % agree % with us?" into the above code, your console window will display:
1 Do you 1 with us?
Keep in mind that this will provide the same output to console if your supplied string was "** Do you %% with us?". If you don't really want this then you will need to modify the RegEx within the matches() method a wee bit to something like this:
".*\\*.+\\*.*|.*\\%.+\\%.*"
and you will need to modify the RegEx within the replaceAll() method to this:
"\\*.+\\*|%.+%"
With this change there must be text between the asterisks and/or the percent characters before validation is successful and a change is made.

The question isn't clear (not sure about how %*hello*% somehow translates to 11, and didn't understand what you mean by Since it has 4-symbols), but wouldn't regular expressions work?
Can't you simply do:
String replaced = word1.replaceAll("\\*[^\\*]+\\*", "1")
.replaceAll("\\%[^\\%]+\\%", "1");
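
As a quick check, here is a sketch that runs the two chained replaceAll calls on the three strings from the question, and compares them with a greedy .* pattern on a string that has two separate pairs. The class name, method names and the "*a* and *b*" sample are made up for illustration.

```java
class DelimitedReplace {
    // Negated character classes: a match can never run past the next closing symbol.
    static String replaceDelimited(String s) {
        return s.replaceAll("\\*[^*]+\\*", "1")
                .replaceAll("%[^%]+%", "1");
    }

    // Greedy variant: .* runs from the first symbol to the LAST matching symbol in the string.
    static String replaceGreedy(String s) {
        return s.replaceAll("\\*.*\\*|%.*%", "1");
    }

    public static void main(String[] args) {
        System.out.println(replaceDelimited("hello! hello!! %can you hear me%? Yes I can."));
        System.out.println(replaceDelimited("*Java is awesome* Do you % agree % with us?"));
        System.out.println(replaceDelimited("%*hello*% friends")); // prints 1 friends
        System.out.println(replaceDelimited("*a* and *b*"));       // prints 1 and 1
        System.out.println(replaceGreedy("*a* and *b*"));          // prints 1
    }
}
```

For %*hello*% the first pass turns *hello* into 1 and the second pass then replaces %1%, which is why the result is a single 1 rather than 11.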

I would say your assumption that the special characters will be replaced twice is wrong. replace() takes literal text, so it can only replace an exact substring that you already know. Here you want to replace the special characters together with whatever text sits between them, and that needs a pattern, so only replaceAll() will do.
In other words, when replaceAll() runs, it finds each section enclosed by the special characters and replaces it once. You don't need StringTokenizer (a class in java.util) for this at all. So no matter what you do, you will see 1 friends instead of 11 friends. Credit goes to jbx above for the regex.
You also don't need the if statement: replaceAll() already searches the String and leaves it unchanged when nothing matches, so the check only makes the code more verbose. You could shorten your code like this, bearing in mind that whatever is inside the special characters is replaced by the single number 1:
public class Object_list_stackoverflow {
    public static void main(String[] args) {
        String word1 = "hello! hello!! %can you hear me%? Yes I can.";
        String word2 = "*Java is awesome* Do you % agree % with us?";
        String word3 = "%*hello*% friends";
        String regex = "\\*[^\\*]+\\*";   // text between two asterisks
        String regex1 = "\\%[^\\%]+\\%";  // text between two percent signs
        System.out.println(word3.replaceAll(regex, "1").replaceAll(regex1, "1")); // prints 1 friends
    }
}
Also see the similar question: Find Text between special characters and replace string
You can also strip out everything that is not alphanumeric; see dhuma1981's answer to: How to replace special characters in a string?
Syntax to remove non-alphanumeric characters from a String:
replaceAll("[^a-zA-Z0-9]", "");

Java regex negative lookahead to replace non-triple characters

I'm trying to take a number, convert it into a string and replace all characters that are not a triple.
Eg. if I pass in 1222331 my replace method should return 222. I can find that this pattern exists but I need to get the value and save it into a string for additional logic. I don't want to do a for loop to iterate through this string.
I have the following code:
String first = Integer.toString(num1);
String x = first.replaceAll("^((?!([0-9])\\3{2})).*$","");
But it's replacing the triple digits also. I only need it to replace the rest of the characters. Is my approach wrong?

You can use
first = first.replaceAll("((\\d)\\2{2})|\\d", "$1");
See regex demo
The regex - ((\d)\2{2})|\d - matches either a digit that repeats thrice (and captures it into Group 1), or just matches any other digit. $1 just restores the captured text in the resulting string while removing all others.
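
Wrapped in a method, the replacement can be tried on a few inputs. This is a sketch with a made-up class and method name, and it assumes a non-negative number (a minus sign is not a digit and would be left in place).

```java
class TripleDigits {
    // Keeps every run of three identical digits and drops all other digits.
    static String keepTriples(int number) {
        String digits = Integer.toString(number);
        // Group 1 is only set when the first alternative matches; otherwise $1 inserts nothing.
        return digits.replaceAll("((\\d)\\2{2})|\\d", "$1");
    }

    public static void main(String[] args) {
        System.out.println(keepTriples(1222331)); // prints 222
        System.out.println(keepTriples(111222));  // prints 111222
        System.out.println(keepTriples(12345).isEmpty()); // prints true
    }
}
```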

Remove repeating set of characters in a string

I want to remove the sequence "-~-~-" if it repeats in a string, but only if the repetitions are directly next to each other.
I have tried to create a regex based on the removing of multiple white spaces regex:
test.replaceAll("\\s+", " ");
Unfortunately I was unsuccessful. Can someone please help me write the correct regex? thanks.
Example:
String test = "hello-~-~--~-~--~-~-";
output:
hello-~-~-
Another example:
String test = "-~-~--~-~--~-~-hello-~-~--~-~--~-~-";
output:
-~-~-hello-~-~-

The regex is:
test.replaceAll("(-~-~-){2,}", "-~-~-")
replaceAll replaces all occurrences matched by the regex (the first parameter) with the second parameter.
The () groups the expression -~-~- together; {2,} means two or more occurrences.
EDIT
Like @anubhava said, instead of using -~-~- for the replacement string, you could also use $1, which backreferences the first capturing group (i.e. the expression in the regex surrounded by ()).
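
Here is a small sketch of the $1 variant applied to both examples from the question (the class and method names are made up for illustration):

```java
class CollapseRepeats {
    // Collapses two or more adjacent copies of "-~-~-" into one.
    // $1 refers to the captured group, i.e. a single "-~-~-".
    static String collapse(String s) {
        return s.replaceAll("(-~-~-){2,}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello-~-~--~-~--~-~-"));                // prints hello-~-~-
        System.out.println(collapse("-~-~--~-~--~-~-hello-~-~--~-~--~-~-")); // prints -~-~-hello-~-~-
        System.out.println(collapse("a-~-~-b"));                             // prints a-~-~-b
    }
}
```

A single occurrence does not match {2,}, so it is simply left as it is.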

test.replaceAll("(-~-~-)+", "-~-~-");

This is the regex you need (note the open-ended {2,}, so that three or more repetitions are collapsed as well):
(-~-~-){2,}
