This question already has answers here:
Java doesn't work with regex \s, says: invalid escape sequence
(3 answers)
Closed 2 years ago.
I have a very long regular expression that seems to be having issues, but only when imported from a text file. I've narrowed it down to the following section (shown here as a literal String):
"(?i)(?<!\\w)\\w{2,3}(?=\\))"
As you can see, near the end, I am trying to escape a closing parenthesis for a lookahead. Now, if this is hard-coded, like:
Pattern myPattern = Pattern.compile("(?i)(?<!\\w)\\w{2,3}(?=\\))");
It works completely as expected. If, however, I read it from a text file, like:
File patternFile = new File("patterns.txt");
List<String> patternText = FileUtils.readLines(patternFile);
String ucText = patternText.get(0).trim();
Pattern myPattern = Pattern.compile(ucText);
Then I get the error message:
Exception in thread "Thread-4" java.util.regex.PatternSyntaxException: Unmatched closing ')' near index 25
(?i)(?<!\\w)\\w{2,3}(?=\\))
^
So, why is this happening? Why is escaping a closing parenthesis legal when hard-coded, but not when reading from a text file?
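In source code you are writing a Java string literal, and \) is not a legal escape sequence in a Java string literal.
There you need to write every backslash as \\ to create a string with a single backslash for the regex. A text file is not a string literal: its contents are read verbatim, so the pattern stored in the file should use single backslashes.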
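To make the difference concrete, here is a minimal sketch. It assumes the line in patterns.txt was written with doubled backslashes, which is what the error message in the question suggests. The class name, the constants, and the compiles helper are made up for illustration, and the file read is simulated with a second literal rather than real I/O.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LiteralVersusFileText {
    // What the compiler produces from the hard-coded literal: single backslashes.
    static final String FROM_LITERAL = "(?i)(?<!\\w)\\w{2,3}(?=\\))";

    // Simulates a line read verbatim from a file that contains doubled backslashes
    // (assumption: this is what patterns.txt holds).
    static final String FROM_FILE = "(?i)(?<!\\\\w)\\\\w{2,3}(?=\\\\))";

    // Returns true if the regex engine accepts the given pattern text.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(FROM_LITERAL);           // (?i)(?<!\w)\w{2,3}(?=\))
        System.out.println(compiles(FROM_LITERAL)); // true
        System.out.println(FROM_FILE);              // (?i)(?<!\\w)\\w{2,3}(?=\\))
        System.out.println(compiles(FROM_FILE));    // false
    }
}
```

In the second string the regex engine sees \\ as an escaped backslash, so the ) that follows is no longer escaped and is left unmatched.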
only when imported from a text file
Print the string you read from the file to the console.
If it prints (?i)(?<!\w)\w{2,3}(?=\)), it's OK.
If it prints with the backslashes doubled, you have to un-escape them.
A good way to un-escape the escape character is a global find/replace
(this is 90% of the parsing):
Find "(?x)\\\\ \\\\"
Replace "\\\\"
Un-escaping the remaining sequences (a backslash followed by some other character) is less clear-cut.
What to do depends on the character: substitute something for the pair,
or leave it untouched. This is mostly language-specific,
but you can roll your own. The basics are:
Find "(?xs)\\\\ (.)"
Replace: roll your own
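The first find/replace pair above can be sketched in Java as follows. The class and method names are hypothetical, and the sketch assumes every doubled backslash in the file is an accidental double escape rather than a deliberate regex for a literal backslash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapeBackslashes {
    // Collapses each doubled backslash into a single one.
    // Regex text: (?x)\\ \\  -> two literal backslashes (whitespace ignored under (?x)).
    // Replacement text: \\   -> one literal backslash.
    static String unescape(String raw) {
        return raw.replaceAll("(?x)\\\\ \\\\", "\\\\");
    }

    // Compiles the un-escaped pattern and returns the first match in input, or null.
    static String firstMatch(String rawPattern, String input) {
        Matcher m = Pattern.compile(unescape(rawPattern)).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Simulated file content with doubled backslashes.
        String raw = "(?i)(?<!\\\\w)\\\\w{2,3}(?=\\\\))";
        System.out.println(unescape(raw));            // (?i)(?<!\w)\w{2,3}(?=\))
        System.out.println(firstMatch(raw, "(abc)")); // abc
    }
}
```

If you control the file, writing the pattern with single backslashes in the first place avoids this step entirely.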
How to replace all "$$$" present in a String?
I tried
story.replaceAll("$$$","\n")
This displays a warning: Anchor $ in unexpected position and the code fails to work. The code takes the "$" symbol as an anchor for a regular expression. I just need to replace that symbol.
Is there any way to do this?
"$" is a special character for regular expressions.
Try the following:
System.out.println(story.replaceAll("\\$\\$\\$", "\n"));
We are escaping the "$" character with a '\' in the above code.
There are several ways you can do this. Which one you pick depends on what you want to do and how elegant you want the solution to be:
String replacement = "\n"; // The replacement string
// The first way:
story.replaceAll("[$]{3}", replacement);
// Second way:
story.replaceAll("\\${3}", replacement);
// Third way:
story.replaceAll("\\$\\$\\$", replacement);
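As a quick check that these variants behave the same, here is a small sketch. The class and method names are made up for illustration; the Pattern.quote and String.replace variants are additions that do not appear in the answer above.

```java
import java.util.regex.Pattern;

class ReplaceTripleDollar {
    // Inside a character class, $ is an ordinary character.
    static String viaCharClass(String s) { return s.replaceAll("[$]{3}", "\n"); }

    // Escaped dollar with a quantifier: regex text is \${3}.
    static String viaQuantifier(String s) { return s.replaceAll("\\${3}", "\n"); }

    // Pattern.quote turns the whole search text into a literal pattern.
    static String viaQuote(String s) { return s.replaceAll(Pattern.quote("$$$"), "\n"); }

    // String.replace treats both arguments as plain text, no regex involved for the caller.
    static String viaLiteralReplace(String s) { return s.replace("$$$", "\n"); }

    public static void main(String[] args) {
        String story = "one$$$two$$$three";
        // Strings are immutable: the result has to be assigned or used.
        story = viaCharClass(story);
        System.out.println(story); // prints one, two, three on separate lines
    }
}
```

Note that calling story.replaceAll(...) on its own line discards the result, so assign it back if you need the changed text.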
You can match any character that is special in a regular expression by escaping it with a backslash. Since Java string literals also use the backslash as an escape character, you need to escape the backslash itself.
story.replaceAll("\\${3}", something);
By putting {3} after the $, you say that it should be found exactly three times. It looks a bit more elegant than "\\$\\$\\$".
something is your replacement, for example "" or "\n", depending on what you want.
This should work:
story.replaceAll("\\$\\$\\$","\n")
You can do this for any special character.
So when I run the following,
String thing = "y$xx$sss$$aaa";
thing = thing.replaceAll("$", "\\$");
System.out.println(thing);
I still get "y$xx$sss$$aaa" as the output. I've also tried
String thing = "y$xx$sss$$aaa";
thing = thing.replaceAll("$", "\\\\$");
System.out.println(thing);
and
String thing = "y$xx$sss$$aaa";
thing = thing.replaceAll("$", "\\\\\\\\$");
System.out.println(thing);
per some existing answers, but I just kept getting the error Illegal group reference: group index is missing.
Basically, I'm trying to replace all $ with an escaped dollar sign \$
You're nearly there:
thing = thing.replaceAll("\\$", "\\\\\\$");
You need to escape the first $, otherwise it's a regex command character signifying end of input.
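The second argument requires a lot of escaping too. The literal "\\\\\\$" reaches the regex engine as the replacement string \\\$:
the first two backslash pairs (\\\\) become \\, which the engine emits as one actual backslash
the third pair plus the dollar sign (\\$) becomes \$, which emits a literal $ instead of starting a group reference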
Then again, easier solution without regular expressions:
thing = thing.replace("$", "\\$");
Note: the latter example does still use Patterns, but it quotes the arguments as literals internally.
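The two approaches above can be compared side by side in a short sketch. The class and method names are hypothetical. The Matcher.quoteReplacement variant is an addition not mentioned in the answer; it builds the escaped replacement string for you.

```java
import java.util.regex.Matcher;

class EscapeDollar {
    // Regex \$ matches a literal dollar; replacement \\\$ emits backslash + dollar.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\$", "\\\\\\$");
    }

    // quoteReplacement escapes the backslash and the dollar in the replacement text.
    static String viaQuoteReplacement(String s) {
        return s.replaceAll("\\$", Matcher.quoteReplacement("\\$"));
    }

    // Plain-text replacement: neither argument is interpreted as a regex.
    static String viaReplace(String s) {
        return s.replace("$", "\\$");
    }

    public static void main(String[] args) {
        String thing = "y$xx$sss$$aaa";
        System.out.println(viaReplaceAll(thing));       // y\$xx\$sss\$\$aaa
        System.out.println(viaQuoteReplacement(thing)); // y\$xx\$sss\$\$aaa
        System.out.println(viaReplace(thing));          // y\$xx\$sss\$\$aaa
    }
}
```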
I'm attempting to make a program to parse an output in Eclipse, but when I enter the regular expression like so:
Pattern signaturePattern = Pattern.compile("[A-Z0-9_]+[" "]+[A-Za-z0-9\.]+[" "]+[A-Za-z0-9\.]+[" "]+[A-Za-z0-9\.]+[" "]+[A-Za-z0-9\.]+[" "]+");
The compiler gives me an error that says "invalid escape sequence." However, when I do what many answers to this question recommend - that is, to add an extra backslash to the dots - and I enter this instead:
Pattern signaturePattern = Pattern.compile("[A-Z0-9_]+[" "]+[A-Za-z0-9\\.]+[" "]+[A-Za-z0-9\\.]+[" "]+[A-Za-z0-9\\.]+[" "]+[A-Za-z0-9\\.]+[" "]+");
The compiler instead says "Syntax error on tokens, delete these tokens." How can I get it to simply read the regular expression as-is?
You forgot to escape your double quotes, as such (one escape only): \".
Here is your escaped Pattern (both code and Pattern compile, but I'm not guaranteeing it does what you want).
Pattern signaturePattern = Pattern.compile("[A-Z0-9_]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+");
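A small sketch to confirm that the escaped pattern compiles and to show a shorter equivalent. Inside a character class a dot is already literal, so the backslash before it is optional, and the quote only needs to be listed once. The class name, the SIMPLIFIED constant, and the sample input are assumptions for illustration, since the question does not show the text being parsed.

```java
import java.util.regex.Pattern;

class SignaturePatternCheck {
    // The pattern from the answer above, with quotes and dots escaped.
    static final Pattern ESCAPED = Pattern.compile(
        "[A-Z0-9_]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+[A-Za-z0-9\\.]+[\" \"]+");

    // Equivalent form: one quote in the class, unescaped dot, repeated group.
    static final Pattern SIMPLIFIED = Pattern.compile(
        "[A-Z0-9_]+[\" ]+(?:[A-Za-z0-9.]+[\" ]+){4}");

    public static void main(String[] args) {
        String sample = "SIG_1 a.b c.d e.f g.h "; // hypothetical input line
        System.out.println(ESCAPED.matcher(sample).matches());    // true
        System.out.println(SIMPLIFIED.matcher(sample).matches()); // true
    }
}
```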
I'm trying to split a string at every '.' (period), but periods are a symbol used by java regexes. Example code,
String outstr = "Apis dubli hre. Agro duli demmos,".split(".");
I can't escape the period character, so how else do I get Java to ignore it?
Use "\\." instead. Just using . means 'any character'.
I can't escape the period character, so how else do I get Java to ignore it?
You can escape the period character, but you must first consider how the string is interpreted.
In a Java string (that is fed to Pattern.compile(s))...
"." is a regex meaning any character.
"\." is an illegally-escaped string. This won't compile. As a regex in a text editor, however, this is perfectly legitimate, and means a literal dot.
"\\." is a Java string that, once interpreted, becomes the regular expression \., which is again the escaped (literal) dot.
What you want is
String[] outstr = "Apis dubli hre. Agro duli demmos,".split("\\.");
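The difference between the bare and the escaped dot can be checked directly with Pattern.matches. This is a minimal sketch; the class and method names are made up, and Pattern.quote is shown as an additional way to get a literal dot that the answer does not mention.

```java
import java.util.regex.Pattern;

class DotMeaning {
    // Regex . : matches any single character.
    static boolean bareDotMatches(String s) { return Pattern.matches(".", s); }

    // Java literal "\\." is the regex \. : matches only a literal dot.
    static boolean escapedDotMatches(String s) { return Pattern.matches("\\.", s); }

    // Pattern.quote(".") also yields a pattern for a literal dot.
    static boolean quotedDotMatches(String s) { return Pattern.matches(Pattern.quote("."), s); }

    public static void main(String[] args) {
        System.out.println(bareDotMatches("x"));    // true
        System.out.println(escapedDotMatches("x")); // false
        System.out.println(escapedDotMatches(".")); // true
        System.out.println(quotedDotMatches("."));  // true
    }
}
```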
I'm new to Java. I want to split a String on "." (dot) and get the names one by one, but this program throws an error: "Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0"
Please help.
String input1 = "van.bus.car";
System.out.println(input1.split(".")[0]+"");
System.out.println(input1.split(".")[1]+"");
System.out.println(input1.split(".")[2]+"");
In regex, the dot (.) is a special meta-character that matches any character.
Since String.split takes a regex, you need to escape the dot with a backslash if you want to match a literal dot.
System.out.println(input1.split("\\.")[0]+"");
To learn more about regex, refer to the following sites:
http://docs.oracle.com/javase/tutorial/essential/regex/
http://www.vogella.com/articles/JavaRegularExpressions/article.html
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
The argument to split is a regex, and so the full stop/dot/. has a special meaning: match any character. To use it literally in your split, you'll need to escape it:
String[] splits = input1.split("\\.");
That should give you an array of length 3 for your input string.
For more about regex and which characters are special, see the docs for Pattern.
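To see where the ArrayIndexOutOfBoundsException comes from, here is a short sketch. The class and method names are hypothetical, and the Pattern.quote variant is an addition. With an unescaped dot every character is a delimiter, so all pieces are empty strings, and split drops trailing empty strings, leaving an empty array.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitOnDot {
    // Unescaped: every character is treated as a delimiter.
    static String[] splitUnescaped(String s) { return s.split("."); }

    // Escaped: only literal dots are delimiters.
    static String[] splitEscaped(String s) { return s.split("\\."); }

    // Pattern.quote produces a literal pattern without manual escaping.
    static String[] splitQuoted(String s) { return s.split(Pattern.quote(".")); }

    public static void main(String[] args) {
        String input1 = "van.bus.car";
        System.out.println(splitUnescaped(input1).length);         // 0
        System.out.println(Arrays.toString(splitEscaped(input1))); // [van, bus, car]
        System.out.println(Arrays.toString(splitQuoted(input1)));  // [van, bus, car]
    }
}
```

Splitting once and storing the array, rather than calling split for each index, also avoids recompiling the pattern three times.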