replaceAll regular expression Replacing $ - java

I am trying to replace all $ characters in a String expression like this
an example of a string with s$ and another with $s and here is the end.
so that the $ characters are surrounded by spaces.
I've tried string.replaceAll("$", " $ ");
This results in an IllegalArgumentException.
When I try escaping the $ character like this:
string.replaceAll("\$", " $ "); I get an invalid escape sequence error before I even build.
When I try the following:
string.replaceAll("\\$", " $ "); I get an IllegalArgumentException again.
Finally when I try this:
string.replaceAll("\\\\$", " $ ");
It has no effect on the string at all. I know this is something stupid that I'm just not getting. Can anyone help here?

You'll need two backslashes in both arguments:
string.replaceAll("\\$", " \\$ ");
The first backslash escapes the second one in the Java string literal, so a single backslash is passed to the regular expression engine. The expression is then \$, which matches a literal $ sign. And you want to replace it with the same character.
You have to escape the second parameter as well because, although it is not a regular expression, \ and $ are special cases there according to the documentation:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.
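As a minimal sketch of the escaping described above, the following compares the hand-escaped call with Pattern.quote and Matcher.quoteReplacement, which do the escaping for you. The class and method names are illustrative and not taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarSpacing {
    // Hand-escaped: the regex \$ matches a literal dollar sign and the
    // replacement \$ inserts a literal dollar sign.
    static String escapedByHand(String s) {
        return s.replaceAll("\\$", " \\$ ");
    }

    // Library-escaped: Pattern.quote makes the pattern literal and
    // Matcher.quoteReplacement makes the replacement literal.
    static String escapedByLibrary(String s) {
        return s.replaceAll(Pattern.quote("$"), Matcher.quoteReplacement(" $ "));
    }

    public static void main(String[] args) {
        String input = "an example of a string with s$ and another with $s and here is the end.";
        System.out.println(escapedByHand(input));
        System.out.println(escapedByLibrary(input));
    }
}
```

Both methods should produce the same result; the library version is easier to get right when the pattern or replacement comes from a variable.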

If you don't need the parameters to be treated as regexps, use replace() instead of replaceAll() (it also replaces all occurrences of the first parameter, but doesn't treat it as a regexp):
string.replace("$", " $ ");

Try string.replaceAll("\\$", " \\$ ").

The reason you have to escape the $ sign in the replacement string is that the String#replaceAll() method uses Matcher#replaceAll() under the hood. Straight from the latter's Javadoc:
Note that backslashes (\) and
dollar signs ($) in the
replacement string may cause the
results to be different than if it
were being treated as a literal
replacement string. Dollar signs may
be treated as references to captured
subsequences as described above, and
backslashes are used to escape literal
characters in the replacement string.
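To illustrate the two behaviours the Javadoc mentions, here is a small sketch with hypothetical method names: one replacement uses $1 and $2 as group references, the other uses an escaped \$ to insert a literal dollar sign in front of a group.

```java
public class ReplacementGroups {
    // $1 and $2 in the replacement refer to the first and second captured groups.
    static String swapPair(String s) {
        return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    // \$ in the replacement (written "\\$" in Java source) is a literal dollar
    // sign; the $1 that follows is still a group reference.
    static String prefixDollar(String s) {
        return s.replaceAll("(\\d+)", "\\$$1");
    }

    public static void main(String[] args) {
        System.out.println(swapPair("key=value"));
        System.out.println(prefixDollar("cost 15"));
    }
}
```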

Related

Java replaceAll fails with dollar sign in source string

Say I have the following code
String test = "$abc<>";
test = test.replaceAll("[^A-Za-z0-9./,#-' ]", "");
test is now "$abc".
Why does it keep the dollar sign?
Your list of characters to preserve includes #-', which is a range from Unicode U+0023 (the # symbol) to U+0027 (the ' symbol), including $ (U+0024).
If you meant #-' to be interpreted as a list of three characters, just escape it:
test = test.replaceAll("[^A-Za-z0-9./,#\\-' ]", "");
or put it at the end of the list:
test = test.replaceAll("[^A-Za-z0-9./,#' -]", "");
Because a literal - must be escaped or placed at the start or end of the character class; otherwise it forms a range.
Try
test = test.replaceAll("[^A-Za-z0-9./,#' -]", "");
It'll work :)
See also In a java regex, how can I get a character class e.g. [a-z] to match a - minus sign?
and the Javadoc for Pattern (Ctrl-F "Character classes")
Note that a different set of metacharacters are in effect inside a character class than outside a character class. For instance, the regular expression . loses its special meaning inside a character class, while the expression - becomes a range forming metacharacter.
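The difference between the two character classes can be checked with a short sketch. The class and method names are made up for illustration; the two patterns are the ones quoted in the question and answers.

```java
public class CharClassRange {
    // Here #-' is a range from U+0023 to U+0027, so $ (U+0024) is preserved
    // and a literal hyphen (U+002D) is not.
    static String stripWithRange(String s) {
        return s.replaceAll("[^A-Za-z0-9./,#-' ]", "");
    }

    // With the hyphen last in the class it is a literal character, so
    // #, ' and - are preserved individually and $ is removed.
    static String stripWithLiteralHyphen(String s) {
        return s.replaceAll("[^A-Za-z0-9./,#' -]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWithRange("$abc<>"));
        System.out.println(stripWithLiteralHyphen("$abc<>"));
    }
}
```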

Underlined backslash IntelliJ

I am using a backslash as an escape character for a serialization format I am working on. I have it as a constant but IntelliJ is underlining it and highlighting it red. On hover it gives no error messages or any information as to why it does not like it.
What is the reason for this and how do I fix it?
IntelliJ is smarter than I am and realised that I was using this character in a regular expression where 2 backslashes would be needed, however, IntelliJ also assumed that my puny mind could find the problem without giving me any information about it.
If it's being used as a regular expression, then the \ must be escaped.
If you're escaping \ as \\ like traditional regular expressions require, then in a Java string literal you also need to double each of those backslashes, for a total of \\\\.
This is because of the way Java interprets "\":
In literal Java strings the backslash is an escape character. The
literal string "\\" is a single backslash. In regular expressions, the
backslash is also an escape character. The regular expression \\
matches a single backslash. This regular expression as a Java string,
becomes "\\\\". That's right: 4 backslashes to match a single one.
The regex \w matches a word character. As a Java string, this is
written as "\\w".
The same backslash-mess occurs when providing replacement strings for
methods like String.replaceAll() as literal Java strings in your Java
code. In the replacement text, a dollar sign must be encoded as \$ and
a backslash as \\ when you want to replace the regex match with an
actual dollar sign or backslash. However, backslashes must also be
escaped in literal Java strings. So a single dollar sign in the
replacement text becomes "\\$" when written as a literal Java string.
The single backslash becomes "\\\\". Right again: 4 backslashes to
insert a single one.
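The backslash counting above can be tried out with a small sketch. The class name, method names, and the sample path are assumptions made for the example.

```java
public class BackslashDemo {
    // The Java literal "\\\\" is the regex \\, which matches one backslash.
    static String toForwardSlashes(String path) {
        return path.replaceAll("\\\\", "/");
    }

    // In the replacement, one backslash is written \\ (Java literal "\\\\"),
    // so inserting two backslashes takes eight in the source code.
    static String doubleBackslashes(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\file.txt"; // at runtime: C:\temp\file.txt
        System.out.println(toForwardSlashes(path));
        System.out.println(doubleBackslashes(path));
    }
}
```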

Java remove escaped double-quote

I have a long Java String that contains lots of escaped double-quotes:
// Prints: \"Hello my name is Sam.\" \"And I am a good boy.\"
System.out.println(bigString);
I want to remove all the escaped double-quotes (\") and replace them with normal double-quotes (") so that I get:
// Prints: "Hello my name is Sam." "And I am a good boy."
System.out.println(bigString);
I thought this was a no-brainer. My best attempt of:
bigString = bigString.replaceAll("\\", "");
Throws the following exception:
Unexpected internal error near index 1
Any ideas? Thanks in advance.
Everybody is telling you to use replaceAll, but the better answer is really to use replace.
replaceAll - requires a regular expression
replace - is just a string search and replace
So like this:
bigString = bigString.replace("\\\"", "\"");
Note that this is also faster because no regular expression is needed.
replaceAll uses regular expressions, so add another pair of backslashes:
bigString = bigString.replaceAll("\\\\\"", "\"");
Explanation why:
"\\" is interpreted by Java as a single \. However, if you use only that as the parameter, it becomes the regular expression \. A \ in a regular expression escapes the next character. Since there is none, it throws an exception.
When you write "\\\\\"" in Java, the compiler first turns it into the regular expression \\", which the regular expression implementation then treats as "a backslash followed by a double-quote".
String str = "\\\"Hello my name is Sam.\\\" \\\"And I am a good boy.\\\"";
System.out.println(str.replaceAll("\\\\\"", "\""));
Output:
"Hello my name is Sam." "And I am a good boy."
The first argument to replaceAll is a regular expression. You pass \ which is not a valid regex. Try:
bigString = bigString.replaceAll("\\\\", "");
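The replace and replaceAll approaches from the answers above can be compared side by side. This is only a sketch: the class and method names are hypothetical, and the sample string is rebuilt from the output shown in the question.

```java
public class UnescapeQuotes {
    // Literal search and replace: backslash + quote becomes quote.
    static String withReplace(String s) {
        return s.replace("\\\"", "\"");
    }

    // Regex version: the regex \\" matches a backslash followed by a quote.
    static String withReplaceAll(String s) {
        return s.replaceAll("\\\\\"", "\"");
    }

    public static void main(String[] args) {
        // At runtime this string contains backslash-quote pairs.
        String bigString = "\\\"Hello my name is Sam.\\\" \\\"And I am a good boy.\\\"";
        System.out.println(bigString);
        System.out.println(withReplace(bigString));
        System.out.println(withReplaceAll(bigString));
    }
}
```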

Illegal escape character error in Java regex

I've read the manual, and at the end there was an exercise:
Use a backreference to write an expression that will match a person's name only if that person's first name and last name are the same.
I've written the following program: http://pastebin.com/YkuUuP5M
But when I compile it, I'm getting an error:
PersonName.java:18: illegal escape character
p = Pattern.compile("([A-Z][a-zA-Z]+)\s+\1");
^
If I rewrite line 18 this way:
pattern = Pattern.compile(console.readLine("%nEnter your regex: "));
and type the pattern into the console, then the program works fine. Why can't I use the pattern as in the first version of the program, and is there some way to fix it?
You want to get this text into a string:
([A-Z][a-zA-Z]+)\s+\1
However, \ in a string literal in Java source code is the character used for escaping (e.g. "\t" for tab). Therefore you need to use "\\" in a string literal to end up with a single backslash in the resulting string. So you want:
"([A-Z][a-zA-Z]+)\\s+\\1"
Note that there's nothing regular-expression-specific to this. Any time you want to express a string containing a backslash in a Java string literal, you'll need to escape that backslash. Regular expressions and Windows filenames are just the most common cases for that.
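Here is a minimal sketch of the corrected pattern in use. The class and method names are invented for the example and are not taken from the linked program.

```java
import java.util.regex.Pattern;

public class SameNameCheck {
    // In Java source, \s and \1 must be written as \\s and \\1.
    private static final Pattern SAME_NAME =
            Pattern.compile("([A-Z][a-zA-Z]+)\\s+\\1");

    // True only if the whole input is a name repeated twice,
    // separated by whitespace.
    static boolean hasSameFirstAndLastName(String fullName) {
        return SAME_NAME.matcher(fullName).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasSameFirstAndLastName("John John"));
        System.out.println(hasSameFirstAndLastName("John Smith"));
    }
}
```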

replaceFirst() fails when replacing with "$"

I don't understand why the "$" is special.
String str = "bla aa";
String tag = "$";
str = str.replaceFirst("aa", tag);
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
If I change tag to "\\$", then it works fine. But why does it need to be escaped? Thanks in advance.
Because $ is a special regex symbol (in the replacement string it refers to capturing groups), and replaceFirst takes regex arguments. The documentation explicitly warns you:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceFirst(java.lang.String). Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.
Now a bit more about $. In the regex pattern it means "end of line".
In the replacement string, $g means "the g-th group". So for a regex a([a-z]+)([0-9]+), you have two groups, $1 and $2, and you can refer to them when replacing.
replaceFirst takes a regular expression. According to the Pattern Javadoc, $ matches "the end of a line".
$ matches the end of the line in a regex. So if you need it as a simple character, you need to escape it. You can find more in the Javadoc for Pattern.
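As a sketch of both uses of $ in a replacement string, the example below inserts a literal $ with Matcher.quoteReplacement and refers to groups with $1 and $2, using the regex a([a-z]+)([0-9]+) mentioned above. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;

public class ReplaceFirstDollar {
    // Treats the replacement as plain text, so "$" needs no manual escaping.
    static String replaceFirstLiteral(String input, String regex, String literal) {
        return input.replaceFirst(regex, Matcher.quoteReplacement(literal));
    }

    // Uses $1 and $2 to refer to the two captured groups.
    static String swapGroups(String input) {
        return input.replaceFirst("a([a-z]+)([0-9]+)", "$2-$1");
    }

    public static void main(String[] args) {
        System.out.println(replaceFirstLiteral("bla aa", "aa", "$"));
        System.out.println(swapGroups("abc123"));
    }
}
```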
