Replace the nth character using regex in Java

I'm trying to learn regex in Java.
So far I've been trying some small challenges, and I'm wondering whether there is a way to target the nth character.
For instance, let's say I have this string: todayiwasnotagoodday
If I want to replace the third (or fourth, or seventh) character, how can I define a regex that changes a specific index? In this example, I want to replace the 'd' with a space " ".
I've been searching for this, but so far my attempts match everything from the first character through the third: ^[a-z]{3}
Is it possible to define this regex?
Thanks in advance.

If you want to replace the third character with a space via regex, you could use replaceAll:
String input = "todayiwasnotagoodday";
String output = input.replaceAll("^(.{2}).(.*)$", "$1 $2");
System.out.println(output); // to ayiwasnotagoodday
Note that you could also avoid regex here, and just use substring operations:
String output = input.substring(0, 2) + " " + input.substring(3);
System.out.println(output); // to ayiwasnotagoodday
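To generalize the regex approach to an arbitrary position, the index can be built into the pattern. The following is a minimal sketch rather than a definitive implementation: the class name ReplaceNth and the helper replaceNth are hypothetical, n is assumed to be 1-based, out-of-range positions are assumed to leave the string unchanged, and the (?s) flag is added so that . also matches line terminators.

```java
import java.util.regex.Matcher;

public class ReplaceNth {
    // Hypothetical helper: replaces the n-th character (1-based) of input.
    static String replaceNth(String input, int n, String replacement) {
        if (n < 1 || n > input.length()) {
            return input; // assumption: out-of-range positions leave the string unchanged
        }
        // Capture the first n-1 characters in group 1, then match the n-th one.
        String regex = "(?s)^(.{" + (n - 1) + "}).";
        // quoteReplacement keeps '$' and '\' in the replacement text literal.
        return input.replaceFirst(regex, "$1" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceNth("todayiwasnotagoodday", 3, " ")); // to ayiwasnotagoodday
        System.out.println(replaceNth("todayiwasnotagoodday", 7, "_")); // todayi_asnotagoodday
    }
}
```

Because the pattern is anchored with ^, replaceFirst is enough here; only one match is possible.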

Related

How to check and replace a sequence of characters in a String?

Here is what the program is expected to output:
if originalString = "CATCATICATAMCATCATGREATCATCAT";
Output should be "I AM GREAT".
The code must find the sequence of characters (CAT in this case), and remove them. Plus, the resulting String must have spaces in between words.
String origString = remixString.replace("CAT", "");
I figured out I have to use String.replace, but what would the logic be for detecting the parts that are not CAT and producing the resulting string with spaces between the words?
First off, you want replaceAll rather than replace here. Both replace every occurrence of "CAT" within the String, but only replaceAll accepts a regular expression, which the next step needs. Then, to introduce spaces, replace "CAT" with " " (a space) instead of an empty String.
As pointed out in the comments, replacing each "CAT" separately would leave multiple spaces between words, so we use a regular expression to replace each run of consecutive "CAT"s with a single space. The '+' symbol means "one or more".
Finally, trim the String to get rid of leading and trailing whitespace.
remixString.replaceAll("(CAT)+", " ").trim()
You can use replaceAll which accepts a regular expression:
String remixString = "CATCATICATAMCATCATGREATCATCAT";
String origString = remixString.replaceAll("(CAT)+", " ").trim();
Note: the naming of replace and replaceAll is very confusing. They both replace all instances of the matching string; the difference is that replace takes a literal text as an argument, while replaceAll takes a regular expression.
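The difference between the two methods is easiest to see with a character that is special in regex. This is a small sketch; the class and method names are made up for illustration.

```java
public class ReplaceVsReplaceAll {
    // replace treats "." as literal text, so only the dots are replaced.
    static String literalDots(String s) {
        return s.replace(".", "-");
    }

    // replaceAll treats "." as a regex that matches any character.
    static String regexDots(String s) {
        return s.replaceAll(".", "-");
    }

    // Collapses each run of "CAT" into one space, then trims the ends.
    static String removeCats(String s) {
        return s.replaceAll("(CAT)+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(literalDots("a.b.c")); // a-b-c
        System.out.println(regexDots("a.b.c"));   // -----
        System.out.println(removeCats("CATCATICATAMCATCATGREATCATCAT")); // I AM GREAT
    }
}
```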
Maybe this will help
String result = remixString.replaceAll("(CAT){1,}", " ");

String.split() not working as intended

I'm trying to split a string, however, I'm not getting the expected output.
String one = "hello 0xA0xAgoodbye";
String two[] = one.split(" |0xA");
System.out.println(Arrays.toString(two));
Expected output: [hello, goodbye]
What I got: [hello, , , goodbye]
Why is this happening and how can I fix it?
Thanks in advance! ^-^
If you'd like to treat consecutive delimiters as one, you could modify your regex as follows:
"( |0xA)+"
This means "a space or the string "0xA", repeated one or more times".
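(\\s|0xA)+ matches one or more consecutive whitespace characters or 0xA sequences, so the string is split once per run of delimiters.
This result is caused by multiple consecutive matches in the string: split produces an empty string between each pair of adjacent delimiters. You may wrap the pattern with a grouping construct and apply a + quantifier to it so that consecutive delimiters are matched as one: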
String one = "hello 0xA0xAgoodbye";
String two[] = one.split("(?:\\s|0xA)+");
System.out.println(Arrays.toString(two));
The (?:\s|0xA)+ regex matches one or more whitespace characters or literal 0xA character sequences.
However, you will still get an empty value as the first item in the resulting array if the 0xA or whitespaces appear at the start of the string. Then, you will have to remove them first:
String two[] = one.replaceFirst("^(?:\\s|0xA)+", "").split("(?:\\s|0xA)+");
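The leading-delimiter case described above can be reproduced with a short sketch. The class name SplitDemo, the constant DELIMS, and the helper splitWords are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

public class SplitDemo {
    // One or more whitespace characters or literal "0xA" sequences.
    private static final String DELIMS = "(?:\\s|0xA)+";

    // Strips leading delimiters first so the result has no empty first element.
    static String[] splitWords(String s) {
        return s.replaceFirst("^" + DELIMS, "").split(DELIMS);
    }

    public static void main(String[] args) {
        String one = " 0xAhello 0xA0xAgoodbye";
        // A delimiter run at the start still yields a leading empty string.
        System.out.println(Arrays.toString(one.split(DELIMS))); // [, hello, goodbye]
        System.out.println(Arrays.toString(splitWords(one)));   // [hello, goodbye]
    }
}
```

Trailing delimiters need no special handling, because split drops trailing empty strings by default.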

Java regex contained within

How can I write a regex to find the part of a string that is enclosed between two given characters?
For example, I want to see which part of the string is enclosed in quotation marks.
Java"_virtual_"machine
If I run my regex through this, I want to get _virtual_
How can I achieve this using regexes?
Supposing the text contains only one such pair of characters and there are no escaping tricks or so, you can use this regex
.*"([^"]*)".*
I wouldn't do it with regex but with a method that finds the first and second occurrences of a character, which is more flexible:
String s = "Java\"_virtual_\"machine";
String find = "\"";
// Index just past the first quotation mark
int first = s.indexOf(find) + 1;
// Index of the second quotation mark, searching from there
int second = s.indexOf(find, first);
String result = s.substring(first, second); // _virtual_
You can reuse this code on other characters and I bet it's faster than regex.
You can use this regex. I'm not sure if this is the best regex you can use though. It will capture every string between quotes.
.*?"([^"]*)"
If you want to get only the first string that matches the regex, use:
.*?"([^"]*)".*
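To actually pull the captured text out in Java, the pattern can be applied with Pattern and Matcher. The sketch below uses only the core "([^"]*)" part of the regexes above and collects every quoted substring; the class name QuotedText and the helper findQuoted are hypothetical, and escaped quotes are assumed not to occur.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedText {
    // A quote, any run of non-quote characters (group 1), and a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the contents of every quoted section, in order of appearance.
    static List<String> findQuoted(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(s);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findQuoted("Java\"_virtual_\"machine")); // [_virtual_]
    }
}
```

Using find() in a loop avoids the need for the leading .*? and trailing .* that a whole-string match would require.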

Remove repeating set of characters in a string

I want to remove the sequence "-~-~-" if it repeats in a string, but only if the repetitions are adjacent.
I have tried to create a regex based on the removing of multiple white spaces regex:
test.replaceAll("\\s+", " ");
Unfortunately I was unsuccessful. Can someone please help me write the correct regex? thanks.
Example:
String test = "hello-~-~--~-~--~-~-";
output:
hello-~-~-
Another example
String test = "-~-~--~-~--~-~-hello-~-~--~-~--~-~-";
output:
-~-~-hello-~-~-
The regex is:
test.replaceAll("(-~-~-){2,}", "-~-~-")
replaceAll replaces all occurrences matched by the regex (the first parameter) with the second parameter.
The () groups the expression -~-~- together, and {2,} means two or more occurrences.
EDIT
As @anubhava said, instead of using -~-~- for the replacement string, you could also use $1, which backreferences the first capturing group (i.e. the expression in the regex surrounded by ()).
test.replaceAll("(-~-~-)+", "-~-~-");
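The $1 variant mentioned in the EDIT can be sketched as follows, using the two examples from the question. The class name CollapseRepeats and the helper collapse are hypothetical.

```java
public class CollapseRepeats {
    // Replaces two or more adjacent "-~-~-" sequences with a single one.
    // $1 refers to group 1, which holds one copy of the sequence.
    static String collapse(String s) {
        return s.replaceAll("(-~-~-){2,}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello-~-~--~-~--~-~-"));
        // hello-~-~-
        System.out.println(collapse("-~-~--~-~--~-~-hello-~-~--~-~--~-~-"));
        // -~-~-hello-~-~-
    }
}
```

A single, non-repeated -~-~- is left alone because {2,} requires at least two adjacent copies.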
This is the regex you need (note the comma: a plain {2} would match exactly two repetitions and leave a third one behind):
(-~-~-){2,}

Java String.replaceAll method to sanitize phone numbers

I have a database field called TelephoneName. In this field, I have telephone numbers in different formats.
What I need now is to separate them into country code and subscriber number.
For example, I saw a telephone number +49 (0)711 / 61947-xx.
I want to remove all slashes, brackets, minus signs, and spaces. The result could be +49 (country code) and 071161947** (subscriber number).
How can I do that with the replaceAll method?
replaceAll("//()-","") is that correct?
The thing is, I have a lot of unformatted telephone numbers such as:
+49 04261 85120
+32027400050
It is difficult to apply the same algorithm to every telephone number.
The replaceAll method takes a regular expression as argument. To remove everything except digits and +, you could thus do
str = str.replaceAll("[^0-9+]", "")
Here's a more complete example that also figures out the country code (based on the index of the ( symbol):
String str = "+49 (0)711 / 61947-12";
int lpar = str.indexOf('(');
String countryCode = str.substring(0, lpar).trim();
String subscriber = str.substring(lpar).trim();
subscriber = subscriber.replaceAll("[^0-9]", "");
System.out.println(countryCode); // prints +49
System.out.println(subscriber); // prints 07116194712
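The example above assumes the number contains a ( character; for inputs without one, indexOf returns -1 and substring throws. The following is a hedged sketch of one possible fallback, not a general phone number parser: the class PhoneSanitizer and its methods are hypothetical, and the rule "split at the first space when there is no parenthesis" is purely an assumption based on the sample numbers in the question. A number with no separator at all cannot be split reliably without a table of country codes, so it is returned with an empty country code.

```java
public class PhoneSanitizer {
    // Removes everything except digits and '+'.
    static String digitsAndPlus(String raw) {
        return raw.replaceAll("[^0-9+]", "");
    }

    // Hypothetical helper: returns {countryCode, subscriberNumber}.
    static String[] split(String raw) {
        String s = raw.trim();
        int cut = s.indexOf('(');
        if (cut < 0) {
            cut = s.indexOf(' '); // assumption: first space ends the country code
        }
        if (cut < 0) {
            // No separator: the country code boundary is unknown.
            return new String[] { "", digitsAndPlus(s) };
        }
        String countryCode = s.substring(0, cut).trim();
        String subscriber = s.substring(cut).replaceAll("[^0-9]", "");
        return new String[] { countryCode, subscriber };
    }

    public static void main(String[] args) {
        String[] parts = split("+49 04261 85120");
        System.out.println(parts[0]); // +49
        System.out.println(parts[1]); // 0426185120
    }
}
```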
replaceAll("//()-","") is that correct?
No, not quite. That will remove all //- substrings. To remove those characters you need to put them in [...], like this: replaceAll("[/()-]", "") (and / does not need to be escaped).
The first argument of replaceAll() is a regex pattern, so what you want to do is make it match every character that is not a digit or a +. You can do this using the "[^...]" (not one of...) construct:
mystring.replaceAll("[^0-9+]", "")
No, that doesn't work.
replaceAll() replaces each substring of this string that matches the given regular expression with the given replacement.
So your expression would only replace occurrences of the literal sequence //- (the empty group () matches nothing) with an empty string.
You need to do something like
String output = "+49 (0)711 / 61947-xx".replaceAll("[/()-]","");
The square brackets make it a regex character class ('either slash or open bracket or close bracket or hyphen'), rather than a literal sequence ('slash followed by open bracket followed by close bracket followed by hyphen').
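The difference between the literal pattern and the character class can be checked directly. In this sketch the class and method names are invented for illustration, and \\s is added to the class (an addition not in the answer above) so that spaces are removed as well, as the question asks.

```java
public class CharClassDemo {
    // The pattern from the question: matches only the literal text "//-".
    static String stripSequence(String s) {
        return s.replaceAll("//()-", "");
    }

    // Character class: matches any single hyphen, slash, bracket, or whitespace.
    static String stripClass(String s) {
        return s.replaceAll("[-/()\\s]", "");
    }

    public static void main(String[] args) {
        String number = "+49 (0)711 / 61947-12";
        System.out.println(stripSequence(number)); // +49 (0)711 / 61947-12
        System.out.println(stripClass(number));    // +4907116194712
    }
}
```

Placing the hyphen first in the class keeps it from being read as a range operator.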
This can be done simply by using:
s=s.replace("/","");
s=s.replace("(","");
s=s.replace(")","");
Then use substring to extract the country code.
