java regex expression escape characters - java

Hi, I'm trying to split a string separated by vertical bars. For example:
String str = "a=1|b=2";
In Java, we should do it like this:
str.split("\\|");
If I use a single backslash:
str.split("\|");
the compiler gives an error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
Can anyone explain why this happens? Thanks!

The backslash \ is a special character in Java string literals, where it is used to escape the following character.
The pipe | is a special character in regex, where it means "OR".
To use the pipe as a separator you need to escape it so that the regex parser treats it as a literal, so the regex you need is: \|.
But because the backslash is also special in a Java string literal, you have to escape the backslash itself so that the string ends up containing the expected result: \|
To do so, you simply escape the backslash with another backslash: \\|
The first backslash escapes the second backslash (Java requirement), which in turn escapes the pipe (regex requirement).

In Java strings, a backslash needs to be escaped with another backslash. So, while the "canonical" form of the regex is indeed \|, as a Java string, this must be written "\\|".
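As a minimal sketch of the two layers described above (the class and method names are made up for illustration), the following shows that the source text "\\|" is only two characters at run time, and that Pattern.quote is an alternative to escaping by hand:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question.
class PipeSplitDemo {
    // The source text "\\|" compiles to the 2-character string \|,
    // which the regex engine reads as "a literal pipe".
    static String[] splitEscaped(String input) {
        return input.split("\\|");
    }

    // Pattern.quote builds a regex that matches its argument literally,
    // so no manual regex-level escaping is needed.
    static String[] splitQuoted(String input) {
        return input.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String str = "a=1|b=2";
        System.out.println("\\|".length());                     // 2
        System.out.println(Arrays.toString(splitEscaped(str))); // [a=1, b=2]
        System.out.println(Arrays.toString(splitQuoted(str)));  // [a=1, b=2]
    }
}
```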

Related

How Java replaceAll operation works with backslashes?

Why do I need four backslashes (\) to add one backslash into a String?
String replacedValue = neName.replaceAll(",", "\\\\,");
In the code above I want to replace every comma (,) with \, but I have to add three more backslashes (\) to make it work.
Can anybody explain this concept?
Escape once for Java, and a second time for regexp.
\ -> \\ -> \\\\
Or since you're not actually using regular expressions, take khelwood's advice and use replace(String,String) so you need to only escape once.
The documentation of String.replaceAll(regex, replacement) states:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll.
The documentation of Matcher.replaceAll(replacement) then states:
backslashes are used to escape literal characters in the replacement string
To put this more clearly: when you replace with \,, it is as if you were escaping the comma. But what you really want is the \ character itself, so you should escape it as \\,. Since \ also needs to be escaped in a Java string literal, the replacement String becomes \\\\,.
If you are having a hard time remembering all this, you can use the method Matcher.quoteReplacement(s), whose goal is to correctly escape the replacement part. Your code would become:
String replacedValue = neName.replaceAll(",", Matcher.quoteReplacement("\\,"));
\ is used for escape sequences in string literals.
For example:
to go to the next line, use \n or \r
for a tab, use \t
Likewise, to print \ itself, which is special in a string literal, you have to escape it with another \, which gives \\
Now, replaceAll should be used with a regex; since you're not using a regex, use replace as suggested in the comments.
String s = neName.replace(",", "\\,");
You have to first escape the backslash because it is inside a string literal (giving \\), and then escape it again because of the regular expression (giving \\\\).
Therefore this:
String replacedValue = neName.replaceAll(",", "\\\\,"); // you need \\\\
Alternatively, you can use replace instead of replaceAll:
String replacedValue = neName.replace(",", "\\,");
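The three variants suggested in the answers above can be compared side by side. This is only a sketch: the class name, method names and the sample value "a,b,c" are assumptions made for illustration, not taken from the question.

```java
import java.util.regex.Matcher;

// Hypothetical demo class comparing the approaches from the answers.
class CommaEscapeDemo {
    // replaceAll parses the replacement, so the regex engine must receive \\,
    // which takes four backslashes in Java source.
    static String withReplaceAll(String s) {
        return s.replaceAll(",", "\\\\,");
    }

    // quoteReplacement performs the regex-level escaping of the replacement,
    // leaving only the Java-level escaping to write by hand.
    static String withQuoteReplacement(String s) {
        return s.replaceAll(",", Matcher.quoteReplacement("\\,"));
    }

    // replace treats both arguments as literal text: Java-level escaping only.
    static String withReplace(String s) {
        return s.replace(",", "\\,");
    }

    public static void main(String[] args) {
        String neName = "a,b,c"; // made-up sample value
        System.out.println(withReplaceAll(neName));       // a\,b\,c
        System.out.println(withQuoteReplacement(neName)); // a\,b\,c
        System.out.println(withReplace(neName));          // a\,b\,c
    }
}
```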

How to escape + character in java?

How do I escape the + character when calling the split function in Java?
split declaration
String[] split(String regularExpression)
That's what I did:
services.split("+"); //says dangling metacharacter
services.split("\+"); //illegal escape character in string literal
But it allows me to do something like this:
String regExpr="+";
Since + is a regex metacharacter (denoting one or more occurrences), you have to escape it with \. The backslash in turn has to be escaped in the Java string literal, because there it is itself an escape character (used to write the tab character \t, the line terminators \r and \n, and others), so you have to do:
services.split("\\+");
Java and Regex both have special escape sequences, and both of them begin with \.
Your issue lies in writing a string literal in Java. Java's escape sequences are resolved at compile time, long before the string is passed into your Regex engine for parsing.
The sequence "\+" causes a compile error because it is not a valid Java string literal.
If you want to pass \+ into your Regex engine you have to explicitly let Java know you want to pass in a backslash character using "\\+".
The common Java escape sequences are as follows:
\t Insert a tab in the text at this point.
\b Insert a backspace in the text at this point.
\n Insert a newline in the text at this point.
\r Insert a carriage return in the text at this point.
\f Insert a formfeed in the text at this point.
\' Insert a single quote character in the text at this point.
\" Insert a double quote character in the text at this point.
\\ Insert a backslash character in the text at this point.
It should be like this:
services.split("\\+");
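To make the difference between the failing and working variants concrete, here is a small sketch. The class, the helper names and the sample string "mail+ftp+ssh" are assumptions for illustration. It checks whether a string is a valid regex before splitting, and also shows Pattern.quote as an alternative to manual escaping:

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; names are illustrative only.
class PlusSplitDemo {
    // Returns false when the regex engine rejects the pattern,
    // as it does for a bare "+" (dangling metacharacter).
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Hand-escaped: the regex engine receives \+ and matches a literal plus.
    static String[] splitOnPlus(String services) {
        return services.split("\\+");
    }

    // Pattern.quote produces a literal-matching regex without manual escaping.
    static String[] splitOnPlusQuoted(String services) {
        return services.split(Pattern.quote("+"));
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("+"));   // false
        System.out.println(isValidRegex("\\+")); // true
        String services = "mail+ftp+ssh";        // made-up sample value
        System.out.println(Arrays.toString(splitOnPlus(services)));       // [mail, ftp, ssh]
        System.out.println(Arrays.toString(splitOnPlusQuoted(services))); // [mail, ftp, ssh]
    }
}
```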

How to split a string with double quotes " as the delimiter?

I tried splitting like this:
tableData.split("\\"")
but it does not work.
It seems that you tried to escape it the same way you would escape |, which is "\\|". But the difference between | and " is that
| is a metacharacter in the regex engine (it represents the OR operator)
" is a metacharacter in Java string literals (it marks the start/end of the string)
To escape any String metacharacter (like ") you need to place before it another String metacharacter responsible for escaping, which is \ (see note 1). So to create a String that contains ", such as this is "quote", you would need to write it as
String s = "this is \"quote\"";
//                  ^^     ^^ these represent " literal, not end of string
The same idea applies if we want to create a \ literal (we need to escape it by placing another \ before it). For instance, to create a string representing c:\foo\bar we would need to write it as
String s = "c:\\foo\\bar";
//            ^^   ^^ these will represent \ literal
So as you can see, \ is used to escape metacharacters (to make them simple literals).
This character is used in the Java language for Strings, but it is also used in the regex engine to escape its metacharacters:
\, ^, $, ., |, ?, *, +, (, ), [, {.
If you would like to create a regex that matches the [ character you need to use the regex \[, but the String representing this regex in Java needs to be written as
String leftBracketRegex = "\\[";
//                         ^^ - Remember what was said earlier?
//                         To create \ literal in String we need to escape it
So to split on [ we would need to invoke split("\\[") because the regex representing [ is \[, which needs to be written as "\\[" in Java.
Since " is not a special character in regex but is special in a String literal, we need to escape it only in the string literal by writing it as
split("\"");
1) \ is also used to create other characters, such as the line separator \n and the tab \t. It can also be used to create Unicode characters like \uXXXX, where XXXX is the index of the character in the Unicode table in hexadecimal form.
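The list of regex metacharacters given in this answer can be checked mechanically. The sketch below (class and method names are hypothetical) builds the escaped regex for each listed character and verifies that it matches that character literally:

```java
// Hypothetical demo class; checks the metacharacters listed in the answer.
class MetacharEscapeDemo {
    // The twelve characters from the answer: \ ^ $ . | ? * + ( ) [ {
    static final String META = "\\^$.|?*+()[{";

    // Builds the regex "backslash + c" and tests it against the single character c.
    static boolean escapedMatchesLiteral(char c) {
        String regex = "\\" + c; // one backslash at run time, then the character
        return String.valueOf(c).matches(regex);
    }

    // True when every listed metacharacter is matched by its escaped form.
    static boolean allEscapedFormsMatch() {
        for (char c : META.toCharArray()) {
            if (!escapedMatchesLiteral(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(META.length());          // 12
        System.out.println(allEscapedFormsMatch()); // true
    }
}
```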
You have escaped the \ by putting in \ twice, try
tableData.split("\"")
Why does this happen?
A backslash escapes the following character. Since the next character is another backslash, the second backslash is escaped, and thus the double quote is not.
Your resulting escaped string is \", where it should really be just ".
Edit:
Also keep in mind that String.split() interprets its pattern parameter as a regular expression, which has several special characters that have to be escaped in the resulting string.
So if you want to split by a . (which is a special regex character), you need to specify it as String.split("\\."). The first backslash escapes the second backslash, and the resulting regex is \..
In case of regex characters you could also just use Pattern.quote() to escape your desired delimiter, but this is far outside the scope of the original question.
Try with a single backslash \
tableData.split("\"")
Try it like this, escaping " with a single backslash \:
tableData.split("\"")
You are not escaping properly. The code snippet will not even compile because of it. The correct way to do it is
tableData.split("\"");
A single backslash will do the trick.
Like this:
tableData.split("\"");
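A char literal in single quotes, '"', does not need the backslash, but String.split accepts only a String, so tableData.split('"') does not compile. Use the escaped string form instead:
tableData.split("\"");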
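A short sketch of the working form follows. The class name and the sample value are assumptions for illustration. Because " is special only in the Java string literal and not in regex, one level of escaping is enough:

```java
import java.util.Arrays;

// Hypothetical demo class; the sample data is made up.
class QuoteSplitDemo {
    // The source text "\"" is a 1-character string containing just a double quote,
    // which the regex engine treats as an ordinary character.
    static String[] splitOnQuote(String tableData) {
        return tableData.split("\"");
    }

    public static void main(String[] args) {
        String tableData = "a\"b\"c"; // the 5 characters: a, quote, b, quote, c
        System.out.println("\"".length());                            // 1
        System.out.println(Arrays.toString(splitOnQuote(tableData))); // [a, b, c]
    }
}
```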

System.setProperty: Invalid Escape Sequence

I'm trying to make a reference to a bin.
System.setProperty("mbrola.base", "C:\Users\Name\Desktop\FreeTTS\MBrola Project");
But I'm getting this error:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
You want actual backslashes, but in a string literal a backslash normally starts an escape sequence. You must escape the backslashes themselves with another backslash.
System.setProperty("mbrola.base", "C:\\Users\\Name\\Desktop\\FreeTTS\\MBrola Project");
Yes, because this isn't a valid string literal:
"C:\Users\Name\Desktop\FreeTTS\MBrola Project"
You need to escape the backslashes:
"C:\\Users\\Name\\Desktop\\FreeTTS\\MBrola Project"
The string itself will only have single backslashes though - you're just escaping them in source code.
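To see that the doubled backslashes exist only in the source code, the following sketch (the class name and the counting helper are hypothetical) stores the path from the question, reads it back, and counts the backslash characters that are actually in the string:

```java
// Hypothetical demo class; the property name and path come from the question.
class PathLiteralDemo {
    // Ten backslashes in the source text, five in the string at run time.
    static final String MBROLA_BASE = "C:\\Users\\Name\\Desktop\\FreeTTS\\MBrola Project";

    // Counts the backslash characters actually stored in the string.
    static long countBackslashes(String s) {
        return s.chars().filter(c -> c == '\\').count();
    }

    public static void main(String[] args) {
        System.setProperty("mbrola.base", MBROLA_BASE);
        System.out.println(System.getProperty("mbrola.base")); // C:\Users\Name\Desktop\FreeTTS\MBrola Project
        System.out.println(countBackslashes(MBROLA_BASE));     // 5
    }
}
```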

How to write regular expressions in eclipse without compile time errors

Eclipse keeps on indicating there is an error in my code when I write a regular expression.
For example,
String regex = "/\((.+)\)/";
This causes Eclipse to flag the line with a red error marker:
Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )
How do I change this?
You must escape the backslashes:
String regex = "/\\((.+)\\)/";
If you want to put a backslash inside a quoted string, you must use the escape sequence \\ to convey that it is part of the String literal and doesn't have any other special meaning.
You need to escape all backslashes, so special characters appear "double escaped" - once for the String, once for the regular expression.
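A minimal sketch of the "double escaped" pattern in use follows. It assumes that the leading and trailing / in the question were regex delimiters carried over from another language. Java has no such delimiters and would match them as literal slashes, so they are dropped here. The class name, method name and sample inputs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the slashes in the question were delimiters.
class ParenRegexDemo {
    // Regex \((.+)\): a literal '(', one or more captured characters, a literal ')'.
    // Each regex backslash is doubled in the Java source.
    static final Pattern PARENS = Pattern.compile("\\((.+)\\)");

    // Returns the text between the parentheses, or null when there is no match.
    static String insideParens(String input) {
        Matcher m = PARENS.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(insideParens("call(foo)")); // foo
        System.out.println(insideParens("no parens")); // null
    }
}
```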
