Java String replaceAll - works in a weird way for \

I have a string like this :
My word is "I am busy" message
Now when I assign this string to a POJO field, it is escaped as below:
String test = "My word is \"I am busy\" message";
I have some other data in which I want something to be replaced by above string :
Let say my base string is :
String s = "There is some __data to be replaced here";
Now when I use replaceAll:
String s1 = s.replaceAll("__data", test);
System.out.println(s1);
This returns me the output as :
There is some My word is "I am busy" message to be replaced here
Why is the "\" not appearing after the replacement? Do I need to escape it twice?
Also, when I use it like this:
String test = "My word is \\\"I am busy\\\" message";
it still gives the same output:
There is some My word is "I am busy" message to be replaced here
My expected output is :
There is some My word is \"I am busy\" message to be replaced here

Try this:
String test = "My word is \\\\\"I am busy\\\\\" message";
String s = "There is some __data to be replaced here";
System.out.println(s.replaceAll("__data", test));
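To get a single \ in your output, the source literal needs \\\\ (four backslashes): the Java compiler reduces them to two, and replaceAll reduces those two to one. The fifth backslash in \\\\\" only escapes the double quote.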
From the docs:
Note that backslashes (\) and dollar signs ($) in the replacement
string may cause the results to be different than if it were being
treated as a literal replacement string; see Matcher.replaceAll. Use
Matcher.quoteReplacement(java.lang.String) to suppress the special
meaning of these characters, if desired.
So you can use Matcher.quoteReplacement(java.lang.String) on the replacement argument. Note that test must actually contain the backslashes you want to see in the output:
String test = "My word is \\\"I am busy\\\" message";
String s = "There is some __data to be replaced here";
System.out.println(s.replaceAll("__data", Matcher.quoteReplacement(test)));
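To make the quoted documentation concrete, here is a minimal sketch comparing three ways of inserting a replacement that contains a backslash. The class name ReplacementDemo and its method names are made up for illustration; only replaceAll, replace and Matcher.quoteReplacement come from the JDK.

```java
import java.util.regex.Matcher;

public class ReplacementDemo {
    // Plain replaceAll: the replacement is parsed, so a backslash escapes the next character.
    static String raw(String base, String replacement) {
        return base.replaceAll("__data", replacement);
    }

    // quoteReplacement escapes every backslash and dollar sign before replaceAll parses it.
    static String quoted(String base, String replacement) {
        return base.replaceAll("__data", Matcher.quoteReplacement(replacement));
    }

    // String.replace treats both arguments literally; no regex is involved.
    static String literal(String base, String replacement) {
        return base.replace("__data", replacement);
    }

    public static void main(String[] args) {
        String base = "some __data here";
        String replacement = "\\\"busy\\\"";              // the characters: \"busy\"
        System.out.println(raw(base, replacement));      // some "busy" here
        System.out.println(quoted(base, replacement));   // some \"busy\" here
        System.out.println(literal(base, replacement));  // some \"busy\" here
    }
}
```

If the search text is a fixed string rather than a pattern, the literal variant is probably the simplest choice.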

In a replaceAll replacement written as a Java string literal, you need four backslashes to produce a single backslash.
String test = "My word is \\\\\"I am busy\\\\\" message";
String s = "There is some __data to be replaced here";
System.out.println(s.replaceAll("__data", test));
OR
String test = "My word is \"I am busy\" message";
String s = "There is some __data to be replaced here";
System.out.println(s.replaceAll("__data", test.replace("\"", "\\\\\"")));
Output:
There is some My word is \"I am busy\" message to be replaced here

Related

Replace multiple whitespace in string with multiple special char in Java

String str = "hello  there, what are   you doing";
System.out.println(str.replaceAll("\\s{2,}", "?"));
output: hello?there, what are?you doing
expected output: hello??there, what are???you doing

Try this.
Pattern pat = Pattern.compile("\\s{2,}");
String str = "hello  there, what are   you doing";
System.out.println(pat.matcher(str).replaceAll(m -> "?".repeat(m.group().length())));
output:
hello??there, what are???you doing

A one-line regex that replaces whitespace using a lookahead (?=\s) and a lookbehind (?<=\s) would be:
String str = "hello  there, what are   you doing      here";
System.out.println(str.replaceAll("(\\s(?=\\s)|(?<=\\s)\\s)", "?"));
=>
hello??there, what are???you doing??????here
So the above matches a whitespace character followed by whitespace, or a whitespace character preceded by whitespace, and replaces each one with "?".
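As a quick check that the two answers agree, here is a small sketch that runs both on the same input. The class and method names are hypothetical, and it assumes Java 11 or later (String.repeat; the Matcher.replaceAll overload taking a function needs Java 9).

```java
import java.util.regex.Pattern;

public class WhitespaceRuns {
    private static final Pattern RUN = Pattern.compile("\\s{2,}");

    // Replace each run of two or more whitespace characters with the same number of '?'.
    static String viaCallback(String s) {
        return RUN.matcher(s).replaceAll(m -> "?".repeat(m.group().length()));
    }

    // Replace every whitespace character that has a whitespace neighbour on either side.
    static String viaLookaround(String s) {
        return s.replaceAll("\\s(?=\\s)|(?<=\\s)\\s", "?");
    }

    public static void main(String[] args) {
        String str = "hello  there, what are   you doing";
        System.out.println(viaCallback(str));    // hello??there, what are???you doing
        System.out.println(viaLookaround(str));  // hello??there, what are???you doing
    }
}
```

Single spaces are left alone by both versions, because neither pattern matches a whitespace character without a whitespace neighbour.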

How to ignore special characters ($ ^ + () {} etc.) from string, with the help of regex expression by using replaceAll() method

I am using Java's replaceAll() method to replace part of a String with another String, and it works great, but a problem arises when my file name contains characters like $ ^ + ( ) { } [ ] etc. In that case the pattern fails to match and the original String is left unchanged.
Sample code illustrating my use case:
String messageBody = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg\" style=\"height:225px\"";
messageBody = messageBody.replaceAll("(http|https)://(?:[^\\s]*)/FileDownloader/4/outbound/31358/file+name.jpeg", "cid: 14890411127853");
System.out.println(messageBody);
The expected output is:
src="cid: 14890411127853" style="height:225px"
but it gives:
src="http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg" style="height:225px"
How can I get it working so that the characters in my file name that have a special meaning in a regex are treated literally?
Thanks in advance!

You have unescaped metacharacters in your URL pattern, including a plus and a literal dot. Escape them, using the following pattern:
(http|https)://(?:[^\\s]*)/FileDownloader/4/outbound/31358/file\\+name\\.jpeg
^^^ escape dot and plus sign
Full code:
String messageBody = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg\" style=\"height:225px\"";
messageBody = messageBody.replaceAll("(http|https)://(?:[^\\s]*)/FileDownloader/4/outbound/31358/file\\+name\\.jpeg", "cid: 14890411127853");
System.out.println(messageBody);
Output:
src="cid: 14890411127853" style="height:225px"
Update:
If you don't know in advance what the exact text will be, but you know it might contain metacharacters that would need escaping for use in a pattern, then Java provides a method for this: Pattern.quote()
To see how it works, we can split your pattern into two parts:
String part1 = "(http|https)://(?:[^\\s]*)";
String part2 = Pattern.quote("/FileDownloader/4/outbound/31358/file+name.jpeg");
messageBody = messageBody.replaceAll(part1 + part2, "cid: 14890411127853");
From the documentation for Pattern.quote():
This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.
Metacharacters or escape sequences in the input sequence will be given no special meaning.
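Here is a minimal sketch of how that could be wrapped up for file names that are only known at runtime. The class QuoteDemo and the method replaceUrl are hypothetical names, and the prefix pattern is a simplified variant of the one above; Matcher.quoteReplacement is applied to the replacement as a precaution in case it ever contains $ or \.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Replace any http/https URL that ends in the given literal path with the given text.
    static String replaceUrl(String body, String literalPath, String replacement) {
        // Only the prefix is a real regex; the path is quoted so + . $ etc. match literally.
        String regex = "https?://\\S*" + Pattern.quote(literalPath);
        return body.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String body = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg\" style=\"height:225px\"";
        String path = "/FileDownloader/4/outbound/31358/file+name.jpeg";
        // prints: src="cid: 14890411127853" style="height:225px"
        System.out.println(replaceUrl(body, path, "cid: 14890411127853"));
    }
}
```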

You just have to escape those characters using a backslash (\)
example:
String messageBody = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg\" style=\"height:225px\"";
messageBody = messageBody.replaceAll("(http|https)://(?:[^\\s]*)/FileDownloader/4/outbound/31358/file\\+name\\.jpeg", "cid: 14890411127853");
similarly
String messageBody = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file$name.jpeg\" style=\"height:225px\"";
messageBody = messageBody.replaceAll("(http|https)://(?:[^\\s]*)/FileDownloader/4/outbound/31358/file\\$name\\.jpeg", "cid: 14890411127853");

Did it this way.
final String[] metaCharacters = {"^","$","{","}","[","]","(",")",".","+","-","&"};
String filePath = "/4/outbound/31358/file+name.jpeg";
for (String c : metaCharacters) {
    if (filePath.contains(c)) {
        filePath = filePath.replace(c, "\\" + c);
    }
}
String messageBody = "src=\"http://thinconnect.interactcrm.com:36061/FileDownloader/4/outbound/31358/file+name.jpeg\" style=\"height:225px\"";
System.out.println(messageBody);
messageBody = messageBody.replaceAll("(http|https)://(?:[^\\s]*)/FileDownloader"+filePath, "cid: 14890411127853");
System.out.println(messageBody);

How to find the delimiter encountered in a string in Java

I have written a simple program in Java which manipulates a given string.
The input string has some delimiters, which are non-alphabetic characters. I have used StringTokenizer to read and manipulate the individual words in the string.
Now I need to reconstruct the manipulated string with the same set of delimiters. I would appreciate it if anyone could suggest how to identify each delimiter.
In other words, this is what input is:
Text1 Delimiter1 Text2 Delimiter2 Text3 Delimiter3 Text4 Delimiter4
This is what my code does:
NewText1 NewText2 NewText3 NewText4
I made use of string tokenizer to identify the next token in this manner:
StringTokenizer st = new StringTokenizer(str, ", 0123456789(*&^%$##!-_)");
But now I would like to identify the delimiter that was encountered so that I can build my new string.
This is what I actually want:
NewText1 Delimiter1 NewText2 Delimiter2 NewText3 Delimiter3 NewText4 Delimiter4

You can proceed according to this:
String dels = "-, 0123456789(*&^%$##!_)";
String specs = "[" + dels + "]+";
String letts = "[^" + dels + "]+";
String text = "one, two - three! four";
String[] words = text.split( specs );
String[] delim = text.split( letts );
Note that in dels the hyphen must come first. If you ever add [ or ] or ^, more care must be taken; check the Javadoc for java.util.regex.Pattern.
Recomposing the string is then straightforward: interleave the two arrays. Note that delim starts with an empty string when the text begins with a word.
The disadvantage of StringTokenizer with its third argument (returnDelims) set to true is that it returns each delimiter as a separate token of length 1.
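One way the two arrays could be stitched back together is sketched below. The class KeepDelimiters and the method rebuild are made-up names, and the word transformation is passed in as a function purely for illustration. The sketch relies on split() keeping a leading empty string and dropping trailing ones.

```java
import java.util.function.UnaryOperator;

public class KeepDelimiters {
    static final String DELS = "-, 0123456789(*&^%$##!_)";

    // Apply 'change' to every word and put the original delimiters back in between.
    static String rebuild(String text, UnaryOperator<String> change) {
        String[] words = text.split("[" + DELS + "]+");
        String[] delim = text.split("[^" + DELS + "]+");
        // If the text starts with a word, delim[0] is the empty string before that word.
        boolean startsWithWord = delim.length > 0 && delim[0].isEmpty();
        StringBuilder sb = new StringBuilder();
        int n = Math.max(words.length, delim.length);
        for (int i = 0; i < n; i++) {
            String w = (i < words.length && !words[i].isEmpty()) ? change.apply(words[i]) : "";
            String d = i < delim.length ? delim[i] : "";
            // Whichever array carries the leading empty element goes first in each pair.
            sb.append(startsWithWord ? d + w : w + d);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: ONE, TWO - THREE! FOUR
        System.out.println(rebuild("one, two - three! four", String::toUpperCase));
    }
}
```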

Replace function in Java

If I have these three different strings:
String line = "A man walking down the road";
String word = "the road";
String sub = "the street";
and basically want to return this:
"A man walking down the street"
Can this be done with contains (to check whether the string 'word' is included in the string 'line') and then replace to swap in the new text? I've been trying to work this out for a while now and haven't got anywhere.

This can be done like this:
String line = "A man walking down the road";
String word = "the road";
String sub = "the street";
System.out.println(line.replace(word,sub));
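If you still want the explicit contains check from the question, a sketch could look like this (ReplaceIfPresent and swap are hypothetical names). The check is optional, since replace already returns the text unchanged when nothing matches.

```java
public class ReplaceIfPresent {
    // Replace every occurrence of 'word' in 'line' with 'sub'; both are treated literally.
    static String swap(String line, String word, String sub) {
        if (!line.contains(word)) {
            return line; // nothing to replace
        }
        return line.replace(word, sub);
    }

    public static void main(String[] args) {
        // prints: A man walking down the street
        System.out.println(swap("A man walking down the road", "the road", "the street"));
    }
}
```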

Regex convert to convert a string to tab delimited field

I want to convert a string to get tab delimited format. In my opinion option 1 should do it. But it looks like option 2 is actually producing the desired result. Can someone explain why?
public class test {
    public static void main(String[] args) {
        String temp2 = "My name\" is something";
        System.out.println(temp2);
        System.out.println( "\"" + temp2.replaceAll("\"", "\\\"") +"\""); //option 1
        System.out.println( "\"" + temp2.replaceAll("\"", "\\\\\"") +"\""); //option 2
        if(temp2.contains("\"")) {
            System.out.println("Identified");
        }
    }
}
and the output is:
My name" is something
"My name" is something"
"My name\" is something"
Identified

If you want an Excel compatible CSV format, the escaping of the double quote is two double quotes, so called self-escaping.
String twoColumns = "\"a nice text\"\t\"with a \"\"quote\"\".\"";
String s = "Some \"quoted\" text.";
String s2 = "\"" + s.replace("\"", "\"\"") + "\"";
And ... no headache counting the backslashes.
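The same idea as a small reusable helper, sketched under the assumption that every field is quoted and fields are joined with tabs. TsvQuote, quoteField and row are hypothetical names.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class TsvQuote {
    // Wrap a field in double quotes and double any quote inside it (self-escaping).
    static String quoteField(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the quoted fields with tabs to form one row.
    static String row(String... fields) {
        return Arrays.stream(fields)
                .map(TsvQuote::quoteField)
                .collect(Collectors.joining("\t"));
    }

    public static void main(String[] args) {
        // prints: "My name"" is something"
        System.out.println(quoteField("My name\" is something"));
    }
}
```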

Use String#replace(CharSequence, CharSequence) instead of String#replaceAll(). The former is a simple string replacement, so it works as you'd expect if you haven't read any documentation or don't know about regular expressions. The latter interprets its arguments differently because it's a regex find-and-replace:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string.
With replace() in place of replaceAll() in both calls, you'll get this output:
My name" is something
"My name\" is something"
"My name\\" is something"
Identified
