Regex that matches the string ÷x% [duplicate] - java

This question already has answers here:
Why does replaceAll fail with "illegal group reference"?
(8 answers)
Closed 4 years ago.
I've been trying to create a regex that matches the following pattern:
÷x%
Here is my code:
String string = "÷x%2%x#3$$#";
String myregex = "all the things I've tried";
string = string.replaceAll(myregex,"÷1x#1$%");
I've tried the following regexes: (÷x%) , [÷][x][%] , [÷]{1}[x]{1}[%]{1}
I am using the NetBeans IDE, and running the code gives me an
Illegal group reference
However, when I change the value of string to something else (a plain word, for example), NetBeans does not give me an exception.
Any thoughts? Thanks.

To replace all occurrences of a sub-string you don't need a pattern. You can use String.replace():
String input = "÷x%abc÷x%def÷x%";
String output = input.replace("÷x%", "÷1x#1$%");
System.out.println(output); // ÷1x#1$%abc÷1x#1$%def÷1x#1$%
As per the method's Javadoc:
Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.

As per the comments on the question, I am hoping this will shed some light on how replaceAll works.
As per the JavaDoc, replaceAll takes a regular expression as its first argument. In your case, the regular expression appears to be sound, so there is no issue there.
The second argument that replaceAll accepts is the string that will be used to replace whatever the regular expression matches.
In some cases, you will need to replace every match of a pattern with the same (hard-coded, if you will) string:
String myString = "123abc1344";
myString = myString.replaceAll("\\d+", "number");
System.out.println(myString); // numberabcnumber
myString = myString.replaceAll("\\w+", "word");
System.out.println(myString); // word (\w+ now matches the whole of "numberabcnumber")
BUT, there are situations where you want to use chunks of what you are replacing in the replacement string itself. This is where the $ comes in:
String myString = "Age:9;Gender:Male";
Let us say that you want to change the format of the string to the following: "I am a {Gender} and I am {Age} years of age.".
In this case, your replacement string needs to extract information from the string to be replaced and inject it in the replacement itself. You do this by using the following:
String myString = "Age:9;Gender:Male";
myString = myString.replaceAll("Age:(\\d+);Gender:(\\w+)", "I am a $2 and I am $1 years of age.");
The above should yield the string that you are after. Notice that I am using $1 and $2 to access regular expression groups. In regular expressions, group 0 is whatever the entire expression matched. Each pair of round parentheses denotes a further group, numbered from 1 in the order of its opening parenthesis, which you can access in the replacement string through the $ character.
This is why a literal $ in the replacement string needs to be escaped (written as \\$ in Java source).
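To see a group reference and an escaped $ side by side, here is a minimal sketch. The class name, the helper toDollarSign and the sample text are made up for illustration; only String.replaceAll is real API.

```java
public class GroupReferenceDemo {
    // Hypothetical helper: rewrites "<amount> USD" as "$<amount>".
    static String toDollarSign(String text) {
        // "\\$" in Java source is \$ in the replacement: a literal dollar sign.
        // "$1" refers to the first capturing group, (\d+).
        return text.replaceAll("(\\d+) USD", "\\$$1");
    }

    public static void main(String[] args) {
        System.out.println(toDollarSign("Total: 15 USD")); // Total: $15
    }
}
```

Dropping the backslashes would turn the replacement into "$$1", where the first $ is followed by another $ instead of a group number, so replaceAll would reject it.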

In a Java regex replacement string you have to escape the $ sign.
If you write $%, you refer to a group named %, which does not exist.
You can try:
try {
String string = "÷x%2%x#3$$#";
String myregex = "÷x%";
String replace = "÷1x#1\\$%";
String resultString = string.replaceAll(myregex, replace);
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
} catch (IllegalArgumentException ex) {
// Syntax error in the replacement text (unescaped $ signs?)
} catch (IndexOutOfBoundsException ex) {
// Non-existent backreference used in the replacement text
}

Related

replace regex string with parameter / token

I have this code snippet of a code base I am supposed to maintain.
String number = "1";
String value = "test";
String output = "";
output = value.replaceAll("\\Q{#}", number);
The value of output stays as "test" and I can only guess what this code is supposed to do: the value of number should be appended to whatever is in value. Maybe something like this: test1, or replace the value with the number entirely.
I found out that \\Q is the regex option to quote everything until \\E, but there is no \\E. Anyway, it is not doing anything at all, and I am wondering if I am overlooking something?
Your regex just matches a literal {#}. It is true that after \Q the pattern is considered to have literal symbols (all the symbols after \Q get "quoted" or "escaped"), and \E stops this escaping/quoting, and if it is missing, the whole pattern will get quoted/escaped.
If your value variable holds test{#} value, the {#} will get replaced with the number.
See this demo:
String number = "1";
String value = "test{#}";
String output = "";
output = value.replaceAll("\\Q{#}", number);
System.out.println(output); // => test1
Note that without \Q, your regex ({#}) would throw a java.util.regex.PatternSyntaxException: Illegal repetition error because Java regex engine is not "smart" enough to disambiguate the braces (PCRE, JS, .NET can easily guess that since there is no number inside, it is not a limiting/bound quantifier).
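As a sketch of the same idea, Pattern.quote produces the \Q...\E wrapping for you, so the token never has to be escaped by hand. The class and helper names below are hypothetical, and the code assumes the number contains no $ or \ (both are special in a replacement string).

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class QuoteTokenDemo {
    // Hypothetical helper: replaces every literal "{#}" token with the given number.
    // Assumes `number` contains no '$' or '\', which are special in replacements.
    static String fillToken(String template, String number) {
        // Pattern.quote wraps the text in \Q...\E so it is matched literally.
        return template.replaceAll(Pattern.quote("{#}"), number);
    }

    // Returns true if the given regex compiles, false if it is rejected.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fillToken("test{#}", "1")); // test1
        System.out.println(compiles("{#}"));           // false
        System.out.println(compiles("\\Q{#}"));        // true
    }
}
```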

Insert character before specific character Java

I have been taking a look at the regular expressions and how to use it in Java for the problem I have to solve. I have to insert a \ before every ". This is what I have:
public class TestExpressions {
public static void main (String args[]) {
String test = "$('a:contains(\"CRUCERO\")')";
test = test.replaceAll("(\")","$1%");
System.out.println(test);
}
}
The output is:
$('a:contains("%CRUCERO"%)')
What I want is:
$('a:contains(\"CRUCERO\")')
I have changed % to \\ but then I get a StringIndexOutOfBoundsException and I don't know why. If someone can help me I would appreciate it. Thank you in advance.
I have to insert a \ before every "
You can try replace, which treats its target as a literal (all regex metacharacters are escaped automatically) and gives no special meaning to any character in the replacement, so you can simply write the string literals you want.
So let's just replace " with the literal \". You can write it as
test = test.replace("\"", "\\\"");
If you want to insert a backslash before each quote using replaceAll, then use:
test = test.replaceAll("(\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
Or if you want to avoid already escaped quote then use negative lookbehind:
String test = "$('a:contains(\\\"CRUCERO\")')";
test = test.replaceAll("((?<!\\\\)\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
String result = subject.replaceAll("(?i)\"CRUCERO\"", "\\\\\"CRUCERO\\\\\"");
EXPLANATION:
Match the character string “"CRUCERO"” literally (case insensitive) «"CRUCERO"»
Insert a backslash «\\» (a single \ in the replacement would only escape the next character and be dropped)
Insert the character string “"CRUCERO” literally «"CRUCERO»
Insert a backslash «\\»
Insert the character “"” literally «"»
If your goal is to escape text for Java strings, then instead of regular expressions, consider using
String escaped = org.apache.commons.lang.StringEscapeUtils.
escapeJava("$('a:contains(\"CRUCERO\")')");
System.out.println(escaped);
Output:
$('a:contains(\"CRUCERO\")')
JavaDoc: http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html#escapeJava(java.lang.String)

Regex with special characters in the input and replaceAll() throws errors

A function receives input containing special characters such as (,>_$' and Java's replaceAll throws an error.
SAMPLE INPUT
I get an error with input like this:
[ FAILED ] The appendtext variable contains the following lines, joined with System.lineSeparator():
$model_fsdfdsfdsfdsfdsfds->load('fsdfdsfdsfdsfdsfds','dsfsdfsd');
$model_fsdfdsfdsfdsfdsfds->fsdfdsfdsfdsfdsfds->index();
No error with input like this:
[ OKAY ] The appendtext variable contains simple lines, joined with System.lineSeparator():
mysomethingmodel
blabla
EXPLANATIONS
appendtext is combined with other text into a String:
String allappend = "Something simple var" + System.lineSeparator() + "\t{" + System.lineSeparator() + appendtext;
Then it goes into replaceAll with a regex, which throws an error:
str_list = rs.replaceAll(regex_newlinebracket, allappend);
regex_newlinebracket is a regex returned by another function:
public String RegexPatternsFunction(String types, String function_name)
{
// quote special characters in the function name
String function_name_quoted = Pattern.quote(function_name);
switch (types) {
case "newlinebracket":
return function_name_quoted + "(\\s|\\t|\\n)+[{]";
}
return null;
}
ERRORS
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:808)
at java.util.regex.Matcher.replaceAll(Matcher.java:906)
at java.lang.String.replaceAll(String.java:2162)
or, more precisely, inside the appendReplacement function of Matcher.java:
// The first number is always a group
refNum = (int)nextChar - '0';
if ((refNum < 0)||(refNum > 9))
throw new IllegalArgumentException(
"Illegal group reference");
cursor++;
PROBLEM
Input containing special characters, such as
$model_fsdfdsfdsfdsfdsfds->load('fsdfdsfdsfdsfdsfds','dsfsdfsd');
$model_fsdfdsfdsfdsfdsfds->fsdfdsfdsfdsfdsfds->index();
throws an error when combined with replaceAll and a regex pattern.
The project works when the input contains no special characters.
I'm using Pattern.quote to escape special characters; still, it does not work when input like () goes through replaceAll with a regex.
In C++ with Qt this works fine; in Java it does not.
Any solutions?
It's fine (and necessary) that you use Pattern.quote. But what's causing the actual problem is the replacement string, since it contains $ (which is the relevant referencing-character in replacement strings). Luckily, Java provides you with another quoting function just to make replacement strings safe: Matcher.quoteReplacement()
So just try
allappend = Matcher.quoteReplacement(allappend);
str_list = rs.replaceAll(regex_newlinebracket, allappend);
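A minimal sketch of the difference, using a shortened stand-in for the input from the question. The class name, both helper methods and the sample strings are assumptions for illustration; Matcher.quoteReplacement and String.replaceAll are the real API.

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // Hypothetical helper: replaces regex matches with `literal`, taken verbatim.
    static String safeReplace(String input, String regex, String literal) {
        // quoteReplacement escapes '$' and '\' so they lose their special meaning.
        return input.replaceAll(regex, Matcher.quoteReplacement(literal));
    }

    // Returns true if using `replacement` unquoted is rejected by replaceAll.
    static boolean rawReplacementFails(String input, String regex, String replacement) {
        try {
            input.replaceAll(regex, replacement);
            return false;
        } catch (IllegalArgumentException e) {
            return true; // e.g. "Illegal group reference"
        }
    }

    public static void main(String[] args) {
        String php = "$model_x->load('a','b');";
        String source = "function foo\n{";
        System.out.println(rawReplacementFails(source, "foo\\s+[{]", php)); // true
        System.out.println(safeReplace(source, "foo\\s+[{]", php));
        // function $model_x->load('a','b');
    }
}
```

Pattern.quote and Matcher.quoteReplacement are complementary: the first protects the pattern argument, the second protects the replacement argument.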

Java String ReplaceAll method giving illegal repetition error?

I have a string and when I try to run the replaceAll method, I am getting this strange error:
String str = "something { } , op";
str = str.replaceAll("o", "\n"); // it works fine
str = str.replaceAll("{", "\n"); // does not work
and I get a strange error:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal repetition {
How can I replace the occurrences of "{" ?
A { is a regex meta-character used for range repetitions as {min,max}. To match a literal { you need to escape it by preceding it with a \\:
str = str.replaceAll("\\{", "\n"); // does work
If you really intend to replace single characters and not regexes (which is what you seem to want to do here), you should use .replace(), not .replaceAll(). In spite of its name, .replace() will replace ALL occurrences, not just the first one.
And in case you wonder, String implements CharSequence, so .replace("{", "\n") will work.
Escape it:
str = str.replaceAll("\\{", "\n");
This is needed since the first argument to replaceAll() is a regular expression, and { has a special meaning in Java regular expressions (it's a repetition operator, hence the error message).

How to remove special characters from a string?

I want to remove special characters like:
- + ^ . : ,
from an String using Java.
That depends on what you define as special characters, but try replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since you'd then either have to escape it or it would mean "any but these characters".
Another note: the - character needs to be the first or last one in the list, otherwise you'd have to escape it or it would define a range (e.g. +-. would mean "all characters in the range + to .").
So, in order to keep consistency and not depend on character positioning, you might want to escape all those characters that have a special meaning in regular expressions (the following list is not complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
If you want to get rid of all punctuation and symbols, try this regex: [\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape backslashes: "[\\p{P}\\p{S}]").
A third way could be something like this, if you can exactly define what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
This means: replace everything that is not a word character (a-z in any case, 0-9 or _) or whitespace.
Edit: please note that there are a couple of other patterns that might prove helpful. However, I can't explain them all, so have a look at the reference section of regular-expressions.info.
Here's less restrictive alternative to the "define allowed characters" approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and not a separator (whitespace, linebreak etc.). Note that you can't use [\P{L}\P{Z}] (upper case P means not having that property), since that would mean "everything that is not a letter or not whitespace", which almost matches everything, since letters are not whitespace and vice versa.
Additional information on Unicode
Some unicode characters seem to cause problems due to different possible ways to encode them (as a single code point or a combination of code points). Please refer to regular-expressions.info for more information.
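The three approaches above behave quite differently on non-ASCII text. The following sketch compares them on one sample string; the class name, method names and the sample are assumptions for illustration, and the accented letters are written as Unicode escapes so that the source file encoding does not matter.

```java
public class UnicodeCleanupDemo {
    // Removes punctuation (\p{P}) and symbols (\p{S}) only.
    static String stripPunctuationAndSymbols(String s) {
        return s.replaceAll("[\\p{P}\\p{S}]", "");
    }

    // Keeps ASCII word characters and whitespace; by default \w is ASCII-only in Java.
    static String keepAsciiWordsAndSpace(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // Keeps letters of any script (\p{L}) and separators (\p{Z}); digits are removed.
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        // The sample reads: C-cedilla "a va, Jos" e-acute "! 42"
        String sample = "\u00c7a va, Jos\u00e9! 42";
        System.out.println(stripPunctuationAndSymbols(sample)); // only ',' and '!' removed
        System.out.println(keepAsciiWordsAndSpace(sample));     // a va Jos 42
        System.out.println(keepLettersAndSeparators(sample));   // letters and spaces only
    }
}
```

Note how the \w-based version silently drops the accented letters, while the \p{L}-based version keeps them but drops the digits.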
This will replace all the characters except alphanumeric
replaceAll("[^A-Za-z0-9]","");
As described here
http://developer.android.com/reference/java/util/regex/Pattern.html
Patterns are compiled regular expressions. In many cases, convenience methods such as String.matches, String.replaceAll and String.split will be preferable, but if you need to do a lot of work with the same regular expression, it may be more efficient to compile it once and reuse it. The Pattern class and its companion, Matcher, also offer more functionality than the small amount exposed by String.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegularExpressionTest {
public static void main(String[] args) {
System.out.println("String is = "+getOnlyStrings("!&(*^*(^(+one(&(^()(*)(*&^%$##!#$%^&*()("));
System.out.println("Number is = "+getOnlyDigits("&(*^*(^(+91-&*9hi-639-0097(&(^("));
}
public static String getOnlyDigits(String s) {
Pattern pattern = Pattern.compile("[^0-9]");
Matcher matcher = pattern.matcher(s);
String number = matcher.replaceAll("");
return number;
}
public static String getOnlyStrings(String s) {
Pattern pattern = Pattern.compile("[^a-z A-Z]");
Matcher matcher = pattern.matcher(s);
String number = matcher.replaceAll("");
return number;
}
}
Result
String is = one
Number is = 9196390097
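The example above still compiles its pattern on every call. A sketch of the "compile once and reuse" advice from the quoted documentation could look like this; the class, constant and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class PrecompiledPatterns {
    // Compiled once when the class is loaded, then shared by every call.
    // Pattern instances are immutable and safe to share between threads.
    private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]");
    private static final Pattern NON_LETTERS = Pattern.compile("[^a-zA-Z ]");

    static String onlyDigits(String s) {
        return NON_DIGITS.matcher(s).replaceAll("");
    }

    static String onlyLetters(String s) {
        return NON_LETTERS.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(onlyDigits("+91-9hi-639-0097")); // 9196390097
        System.out.println(onlyLetters("!&one(*"));         // one
    }
}
```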
Try replaceAll() method of the String class.
BTW here is the method, return type and parameters.
public String replaceAll(String regex,
String replacement)
Example:
String str = "Hello +-^ my + - friends ^ ^^-- ^^^ +!";
str = str.replaceAll("[-+^]+", "");
It should remove all the {'^', '+', '-'} chars that you wanted to remove!
To Remove Special character
String t2 = "!##$%^&*()-';,./?><+abdd";
t2 = t2.replaceAll("\\W+","");
Output will be : abdd.
This works perfectly.
Use the String.replaceAll() method in Java.
replaceAll should be good enough for your problem.
You can remove single char as follows:
String str="+919595354336";
String result = str.replaceAll("\\+","");
System.out.println(result);
OUTPUT:
919595354336
If you just want to do a literal replace in java, use Pattern.quote(string) to escape any string to a literal.
myString.replaceAll(Pattern.quote(matchingStr), replacementStr)
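Pattern.quote only protects the search string. If the replacement may also contain $ or \, it can be quoted with Matcher.quoteReplacement. The sketch below wraps both in a hypothetical helper named replaceLiteral, which should then behave like String.replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplaceDemo {
    // Hypothetical helper: a fully literal replaceAll.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // The '.' in the target is matched literally, the '$' is inserted literally.
        System.out.println(replaceLiteral("price: a.b", "a.b", "$5")); // price: $5
        System.out.println(replaceLiteral("axb", "a.b", "$5"));        // axb
        // Same result as the built-in literal replace:
        System.out.println("price: a.b".replace("a.b", "$5"));         // price: $5
    }
}
```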
