A function receives input containing special characters like (,>_$' and Java's replaceAll throws an error.
SAMPLE INPUT
I get an error if the input is something like this:
[ FAILED ] the appendtext variable holds the following lines, joined with System.lineSeparator():
$model_fsdfdsfdsfdsfdsfds->load('fsdfdsfdsfdsfdsfds','dsfsdfsd');
$model_fsdfdsfdsfdsfdsfds->fsdfdsfdsfdsfdsfds->index();
There is no error if the input is:
[ OKAY ] the appendtext variable holds simple input, joined with System.lineSeparator():
mysomethingmodel
blabla
EXPLANATIONS
appendtext is concatenated into a String together with other parts:
String allappend = "Something simple var" + System.lineSeparator() + "\t{" + System.lineSeparator() + appendtext;
Then it goes into replaceAll together with a regex, and that call throws an error:
str_list = rs.replaceAll(regex_newlinebracket, allappend);
regex_newlinebracket is a regex built by another function:
public String RegexPatternsFunction(String types, String function_name)
{
// escape special regex characters in the function name
String function_name_quoted = Pattern.quote(function_name);
switch (types) {
case "newlinebracket":
return function_name_quoted + "(\\s|\\t|\\n)+[{]";
}
return null;
}
ERRORS
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:808)
at java.util.regex.Matcher.replaceAll(Matcher.java:906)
at java.lang.String.replaceAll(String.java:2162)
or, more precisely, inside the appendReplacement function in Matcher.java:
// The first number is always a group
refNum = (int)nextChar - '0';
if ((refNum < 0)||(refNum > 9))
throw new IllegalArgumentException(
"Illegal group reference");
cursor++;
PROBLEM
Using special characters, as in
$model_fsdfdsfdsfdsfdsfds->load('fsdfdsfdsfdsfdsfds','dsfsdfsd');
$model_fsdfdsfdsfdsfdsfds->fsdfdsfdsfdsfdsfds->index();
throws an error when combined with replaceAll and a regex pattern.
THE PROJECT WORKS IF THERE ARE NO SPECIAL CHARACTERS.
I'm using Pattern.quote to escape special characters, yet it still does not work when the input contains something like () and replaceAll is used with a regex.
In C++ Qt this works well; in Java it does not.
Any solutions?
It's fine (and necessary) that you use Pattern.quote. But what's causing the actual problem is the replacement string, since it contains $ (the character that introduces a group reference in replacement strings). Luckily, Java provides you with another quoting function just to make replacement strings safe: Matcher.quoteReplacement()
So just try
allappend = Matcher.quoteReplacement(allappend);
str_list = rs.replaceAll(regex_newlinebracket, allappend);
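As a minimal sketch of the fix above, the following combines Pattern.quote for the pattern with Matcher.quoteReplacement for the replacement. The class name, the helper replaceHeader and the sample strings are hypothetical and only stand in for the code in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {
    // Hypothetical helper: replaces "<functionName><whitespace>{" with the
    // given text, treating both the name and the replacement literally.
    static String replaceHeader(String source, String functionName, String replacement) {
        String regex = Pattern.quote(functionName) + "\\s+[{]";
        // quoteReplacement escapes $ and \ so they are not read as group references
        return source.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String source = "index()\n{ body }";
        String replacement = "index()\n\t{\n$model->load('a','b');";
        System.out.println(replaceHeader(source, "index()", replacement));
    }
}
```

Without the quoteReplacement call, the $ followed by a letter in the replacement would be parsed as a group reference and fail with "Illegal group reference".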
I've been trying to create a regex that matches the following pattern:
÷x%
Here is my code:
String string = "÷x%2%x#3$$#";
String myregex = "all the things I've tried";
string = string.replaceAll(myregex,"÷1x#1$%");
I've tried the following regexes: (÷x%) , [÷][x][%] , [÷]{1}[x]{1}[%]{1}
I am using the NetBeans IDE, and it gives me an
Illegal group reference
exception. However, when I change the value of string to something else, a plain word for example,
NetBeans does not give me an exception.
Any thoughts? Thanks.
To replace all occurrences of a sub-string you don't need a pattern. You can use String.replace():
String input = "÷x%abc÷x%def÷x%";
String output = input.replace("÷x%", "÷1x#1$%");
System.out.println(output); // ÷1x#1$%abc÷1x#1$%def÷1x#1$%
As per method javadoc:
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
Following the comments on the question, I hope this sheds some light on how replaceAll works.
As per the JavaDoc, replaceAll takes a regular expression as its first argument. In your case, the regular expression appears to be sound, so there is no issue there.
The second argument that replaceAll accepts is the string that will be used to replace whatever the regular expression matches.
In some cases, you will need to replace the same pattern with the same (hard coded, if you will) string:
String myString = "123abc1344";
myString = myString.replaceAll("[a-z]+", "word");  // 123word1344
myString = myString.replaceAll("\\d+", "number");  // numberwordnumber
System.out.println(myString); // numberwordnumber
BUT, there are situations where you want to use chunks of what you are replacing in the replacement string itself. This is where the $ comes in:
String myString = "Age:9;Gender:Male";
Let us say that you want to change the format of the string to the following: "I am a {Gender} and I am {Age} years of age.".
In this case, your replacement string needs to extract information from the string to be replaced and inject it in the replacement itself. You do this by using the following:
String myString = "Age:9;Gender:Male";
myString = myString.replaceAll("Age:(\\d+);Gender:(\\w+)", "I am a $2 and I am $1 years of age.");
The above should yield the string that you are after. Notice that I am using $1 and $2 to access regular expression groups. In regular expression language, the 0th group is whatever is matched by the entire regular expression. Each pair of round parentheses denotes another regular expression group, which you can access through the $ keyword.
This is why it needs to be escaped.
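To make the two roles of $ concrete, here is a small sketch. The class and method names are made up for illustration, and the second method is an assumed example that is not taken from the question: it shows a literal dollar sign (escaped as \\$) next to a real group reference.

```java
class GroupReferenceDemo {
    // $1 refers to the captured age, $2 to the captured gender
    static String describe(String record) {
        return record.replaceAll("Age:(\\d+);Gender:(\\w+)",
                "I am a $2 and I am $1 years of age.");
    }

    // "\\$" in the replacement emits a literal dollar sign; "$1" is still a group reference
    static String toPrice(String text) {
        return text.replaceAll("(\\d+) dollars", "\\$$1");
    }

    public static void main(String[] args) {
        System.out.println(describe("Age:9;Gender:Male"));
        System.out.println(toPrice("5 dollars"));
    }
}
```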
In a Java regex replacement string you have to escape the $ sign.
If you write $%, Java tries to read % as a group reference, and no such group exists.
You can try:
try {
String string = "÷x%2%x#3$$#";
String myregex = "÷x%";
String replace = "÷1x#1\\$%";
String resultString = string.replaceAll(myregex, replace);
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
} catch (IllegalArgumentException ex) {
// Syntax error in the replacement text (unescaped $ signs?)
} catch (IndexOutOfBoundsException ex) {
// Non-existent backreference used the replacement text
}
I have this code snippet from a code base I am supposed to maintain.
String number = "1";
String value = "test";
String output = "";
output = value.replaceAll("\\Q{#}", number);
The value of output stays as "test" and I can only guess what this code is supposed to do: the value of number should be appended to whatever is in value. Maybe something like this: test1, or replace the value with the number entirely.
I found out that \\Q is the regex option to quote everything until \\E, but there is no \\E. Anyway, it is not doing anything at all, and I am wondering if I am overlooking something?
Your regex just matches a literal {#}. It is true that after \Q the pattern is considered to have literal symbols (all the symbols after \Q get "quoted" or "escaped"), and \E stops this escaping/quoting, and if it is missing, the whole pattern will get quoted/escaped.
If your value variable holds test{#} value, the {#} will get replaced with the number.
See this demo:
String number = "1";
String value = "test{#}";
String output = "";
output = value.replaceAll("\\Q{#}", number);
System.out.println(output); // => test1
Note that without \Q, your regex ({#}) would throw a java.util.regex.PatternSyntaxException: Illegal repetition error because Java regex engine is not "smart" enough to disambiguate the braces (PCRE, JS, .NET can easily guess that since there is no number inside, it is not a limiting/bound quantifier).
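For comparison, a short sketch of two alternatives that avoid writing \Q by hand: Pattern.quote, which wraps its argument in \Q...\E, and String.replace, which never interprets the target as a regex. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class QuotePlaceholderDemo {
    // Pattern.quote("{#}") yields the pattern \Q{#}\E, so the braces are literal
    static String fillWithQuote(String template, String number) {
        return template.replaceAll(Pattern.quote("{#}"), number);
    }

    // String.replace treats both arguments as plain text
    static String fillWithReplace(String template, String number) {
        return template.replace("{#}", number);
    }

    public static void main(String[] args) {
        System.out.println(fillWithQuote("test{#}", "1"));
        System.out.println(fillWithReplace("test{#}", "1"));
        System.out.println(fillWithQuote("test", "1")); // no placeholder, so nothing changes
    }
}
```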
I have been taking a look at the regular expressions and how to use it in Java for the problem I have to solve. I have to insert a \ before every ". This is what I have:
public class TestExpressions {
public static void main (String args[]) {
String test = "$('a:contains(\"CRUCERO\")')";
test = test.replaceAll("(\")","$1%");
System.out.println(test);
}
}
The output is:
$('a:contains("%CRUCERO"%)')
What I want is:
$('a:contains(\"CRUCERO\")')
I changed % to \\ but got a StringIndexOutOfBoundsException, and I don't know why. If someone can help me I would appreciate it. Thank you in advance.
I have to insert a \ before every "
You can use replace, which treats its target as a literal (no regex metacharacters) and gives no special meaning to any character in the replacement, so you can simply pass the literal strings you want.
So let's just replace " with the literal \". You can write it as
test = test.replace("\"", "\\\"");
If you want to insert a backslash before the quote using replaceAll, then use:
test = test.replaceAll("(\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
Or, if you want to skip quotes that are already escaped, use a negative lookbehind:
String test = "$('a:contains(\\\"CRUCERO\")')";
test = test.replaceAll("((?<!\\\\)\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
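The two approaches can be compared side by side in a small sketch. The class and method names are hypothetical; the patterns mirror the ones above, minus the capturing group, which is not needed when the quote is written directly in the replacement.

```java
class EscapeQuotesDemo {
    // Literal replacement: every " becomes \"
    static String escapeAll(String s) {
        return s.replace("\"", "\\\"");
    }

    // Regex replacement: only quotes not already preceded by a backslash.
    // In the replacement, "\\\\" emits one backslash and "\"" emits the quote.
    static String escapeUnescaped(String s) {
        return s.replaceAll("(?<!\\\\)\"", "\\\\\"");
    }

    public static void main(String[] args) {
        System.out.println(escapeAll("$('a:contains(\"CRUCERO\")')"));
        System.out.println(escapeUnescaped("$('a:contains(\\\"CRUCERO\")')"));
    }
}
```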
String result = subject.replaceAll("(?i)\"CRUCERO\"", "\\\\\"CRUCERO\\\\\"");
EXPLANATION:
Match the character string “"CRUCERO"” literally (case insensitive) «"CRUCERO"»
Insert a backslash «\\»
Insert the character string “"CRUCERO” literally «"CRUCERO»
Insert a backslash «\\»
Insert the character “"” literally «"»
If your goal is to escape text for Java strings, then instead of regular expressions, consider using
String escaped = org.apache.commons.lang.StringEscapeUtils.
escapeJava("$('a:contains(\"CRUCERO\")')");
System.out.println(escaped);
Output:
$('a:contains(\"CRUCERO\")')
JavaDoc: http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html#escapeJava(java.lang.String)
I have a string and when I try to run the replaceAll method, I am getting this strange error:
String str = "something { } , op";
str = str.replaceAll("o", "\n"); // it works fine
str = str.replaceAll("{", "\n"); // does not work
and I get a strange error:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal repetition {
How can I replace the occurrences of "{" ?
A { is a regex meta-character used for range repetitions as {min,max}. To match a literal { you need to escape it by preceding it with a \\:
str = str.replaceAll("\\{", "\n"); // does work
If you really intend to replace single characters and not regexes (which is what you seem to want to do here), you should use .replace(), not .replaceAll(). In spite of its name, .replace() will replace ALL occurrences, not just the first one.
And in case you wonder, String implements CharSequence, so .replace("{", "\n") will work.
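A brief sketch of the difference, with hypothetical class and method names: replace handles the brace literally, replaceAll needs it escaped, and an unescaped brace is rejected when the pattern is compiled.

```java
import java.util.regex.PatternSyntaxException;

class BraceReplaceDemo {
    // Literal replacement of every "{"; no regex involved
    static String viaReplace(String s) {
        return s.replace("{", "\n");
    }

    // Regex replacement; the brace must be escaped as \{
    static String viaReplaceAll(String s) {
        return s.replaceAll("\\{", "\n");
    }

    // An unescaped "{" is not a valid pattern and fails at compile time
    static boolean bareBraceIsRejected() {
        try {
            "x".replaceAll("{", "\n");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String str = "something { } , op";
        System.out.println(viaReplace(str).equals(viaReplaceAll(str)));
        System.out.println(bareBraceIsRejected());
    }
}
```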
Escape it:
str = str.replaceAll("\\{", "\n");
This is needed since the first argument to replaceAll() is a regular expression, and { has a special meaning in Java regular expressions (it's a repetition operator, hence the error message).
I need to build a regular expression that finds the word "int" only if it's not part of some string.
I want to find whether int is used in the code. (not in some string, only in regular code)
Example:
int i; // the regex should find this one.
String example = "int i"; // the regex should ignore this line.
logger.i("int"); // the regex should ignore this line.
logger.i("int") + int.toString(); // the regex should find this one (because of the second int)
thanks!
It's not going to be bullet-proof, but this works for all your test cases:
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
It does a look behind and look ahead to assert that there's either none or two preceding/following quotes "
Here's the code in java with the output:
String regex = "(?<=^([^\"]*|[^\"]*\"[^\"]*\"[^\"]*))\\bint\\b(?=([^\"]*|[^\"]*\"[^\"]*\"[^\"]*)$)";
System.out.println(regex);
String[] tests = new String[] {
"int i;",
"String example = \"int i\";",
"logger.i(\"int\");",
"logger.i(\"int\") + int.toString();" };
for (String test : tests) {
System.out.println(test.matches("^.*" + regex + ".*$") + ": " + test);
}
Output (included regex so you can read it without all those \ escapes):
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
true: int i;
false: String example = "int i";
false: logger.i("int");
true: logger.i("int") + int.toString();
Using a regex is never going to be 100% accurate - you need a language parser. Consider escaped quotes in Strings "foo\"bar", in-line comments /* foo " bar */, etc.
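Short of a full parser, one workaround is to strip string literals first and only then look for the word. The sketch below is a different technique from the lookaround regex above, not a replacement for a parser: it handles escaped quotes inside literals but still ignores comments and character literals. All names are hypothetical.

```java
import java.util.regex.Pattern;

class IntKeywordScanner {
    // A double-quoted literal: either an escaped character or any
    // character that is neither a quote nor a backslash, repeated
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:\\\\.|[^\"\\\\])*\"");
    private static final Pattern INT_WORD = Pattern.compile("\\bint\\b");

    // Returns true if "int" appears as a whole word outside string literals
    static boolean usesInt(String line) {
        String withoutStrings = STRING_LITERAL.matcher(line).replaceAll("\"\"");
        return INT_WORD.matcher(withoutStrings).find();
    }

    public static void main(String[] args) {
        String[] tests = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\");",
            "logger.i(\"int\") + int.toString();",
            "String s = \"foo\\\"int\";"
        };
        for (String test : tests) {
            System.out.println(usesInt(test) + ": " + test);
        }
    }
}
```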
I'm not exactly sure what your complete requirements are, but perhaps
^\s*\bint\b
Assuming the input is processed line by line,
^int\s[\$_a-zA-Z\;]*$
It follows basic variable naming rules :)
If you want to parse code and search for an isolated int word, this works:
(^int|[\(\ \;,]int)
You can use it to find an int that, in code, can only be preceded by a space, a comma, ";" or a left parenthesis, or be the first word of the line.
You can try it and improve on it at http://www.regextester.com/
PS: this works in all your test cases.
^[^"]*\bint\b
should work. I can't think of a situation where you can use a valid int identifier after the character '"'.
Of course, this only applies if the code is limited to one statement per line.