I have been looking at regular expressions and how to use them in Java for a problem I have to solve. I have to insert a \ before every ". This is what I have:
public class TestExpressions {
public static void main (String args[]) {
String test = "$('a:contains(\"CRUCERO\")')";
test = test.replaceAll("(\")","$1%");
System.out.println(test);
}
}
The output is:
$('a:contains("%CRUCERO"%)')
What I want is:
$('a:contains(\"CRUCERO\")')
I have changed % to \\ but I get a StringIndexOutOfBoundsException and I don't know why. If someone can help me I would appreciate it. Thank you in advance.
I have to insert a \ before every "
You can use replace, which treats both arguments as plain text: regex metacharacters in the target are matched literally, and the replacement has no special characters, so you can simply pass the String literals you want.
So let's just replace " with the \" literal. You can write it as
test = test.replace("\"", "\\\"");
If you want to insert a backslash before each quote using a regex, then use:
test = test.replaceAll("(\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
Or, if you want to skip quotes that are already escaped, use a negative lookbehind:
String test = "$('a:contains(\\\"CRUCERO\")')";
test = test.replaceAll("((?<!\\\\)\")","\\\\$1"); // $('a:contains(\"CRUCERO\")')
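To compare the approaches above side by side, here is a minimal sketch. The class and method names (QuoteEscaper, escapeAll, escapeAllRegex, escapeUnescaped) are made up for illustration. Matcher.quoteReplacement is a standard JDK method that makes a replacement string literal, so the backslashes do not have to be counted by hand.

```java
import java.util.regex.Matcher;

public class QuoteEscaper {
    // Plain-text replacement: neither argument is treated as a regex.
    public static String escapeAll(String s) {
        return s.replace("\"", "\\\"");
    }

    // Regex replacement: quoteReplacement keeps the backslash literal
    // instead of letting replaceAll treat it as an escape character.
    public static String escapeAllRegex(String s) {
        return s.replaceAll("\"", Matcher.quoteReplacement("\\\""));
    }

    // Only escape quotes that are not already preceded by a backslash.
    public static String escapeUnescaped(String s) {
        return s.replaceAll("(?<!\\\\)\"", Matcher.quoteReplacement("\\\""));
    }

    public static void main(String[] args) {
        String test = "$('a:contains(\"CRUCERO\")')";
        System.out.println(escapeAll(test));      // $('a:contains(\"CRUCERO\")')
        System.out.println(escapeAllRegex(test)); // same result as above
    }
}
```

A replacement string that ends in a lone backslash, such as $1\, is rejected by replaceAll, which is the likely cause of the exception in the question.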
String result = subject.replaceAll("(?i)\"CRUCERO\"", "\\\\\"CRUCERO\\\\\"");
EXPLANATION:
Match the character string “"CRUCERO"” literally (case insensitive) «"CRUCERO"»
Insert a backslash «\\»
Insert the character string “"CRUCERO” literally «"CRUCERO»
Insert a backslash «\\»
Insert the character “"” literally «"»
If your goal is to escape text for Java strings, then instead of regular expressions, consider using
String escaped = org.apache.commons.lang.StringEscapeUtils.
escapeJava("$('a:contains(\"CRUCERO\")')");
System.out.println(escaped);
Output:
$('a:contains(\"CRUCERO\")')
JavaDoc: http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html#escapeJava(java.lang.String)
String s = "=?ISO-2022-JP?B?GyRCRZi?= =?ISO-2022-JP?B?QDg7OlWiE8?= =?ISO-2022-JP?B?JTkkT=?= =?ISO-2022-JP?B?kjaHaA?=";
String replRegex = "[^=]\\?= =\\?ISO-2022\\-JP\\?B\\?";
stringtoDecode= s.replaceAll(replRegex, "" );
The result I got is
=?ISO-2022-JP?B?GyRCRZQDg7OlWiEJTkkT=?= =?ISO-2022-JP?B?kjaHaA?=
but what I am expecting is
=?ISO-2022-JP?B?GyRCRZiQDg7OlWiE8JTkkT=?= =?ISO-2022-JP?B?kjaHaA?=
The characters i and 8 that come right before ?= =?ISO-2022-JP?B? are missing from the result. I want "?= =?ISO-2022-JP?B?" to be replaced with an empty string if it is not preceded by "=".
Am I doing anything wrong here? Please advise.
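The [^=] part consumes the character before the ?= (here i, and 8 in the second match), so that character is removed along with the rest of the match. You can capture this 'consumed' character and put it back in the replacement: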
String replRegex = "([^=])\\?= =\\?ISO-2022-JP\\?B\\?";
stringtoDecode= s.replaceAll(replRegex, "$1" );
Note that you don't need to escape hyphens. (You need to escape a hyphen only if you mean a literal hyphen inside a character class and the hyphen sits somewhere in the middle of that class.)
Output:
=?ISO-2022-JP?B?GyRCRZiQDg7OlWiE8JTkkT=?= =?ISO-2022-JP?B?kjaHaA?=
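An alternative that avoids putting the character back is a negative lookbehind, which checks the preceding character without consuming it. A minimal sketch, with hypothetical class and method names (EncodedWordJoiner, join):

```java
public class EncodedWordJoiner {
    // (?<!=) asserts that the match is not preceded by '=' but consumes nothing,
    // so the replacement can simply be the empty string.
    private static final String SEPARATOR = "(?<!=)\\?= =\\?ISO-2022-JP\\?B\\?";

    public static String join(String s) {
        return s.replaceAll(SEPARATOR, "");
    }

    public static void main(String[] args) {
        String s = "=?ISO-2022-JP?B?GyRCRZi?= =?ISO-2022-JP?B?QDg7OlWiE8?= "
                 + "=?ISO-2022-JP?B?JTkkT=?= =?ISO-2022-JP?B?kjaHaA?=";
        System.out.println(join(s));
    }
}
```

One difference from the capturing version: the lookbehind also allows a match at the very start of the input, where [^=] would need a character to consume.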
I have to create a regular expression to check that a string contains only digits, letters and other symbols, except \ > <. I can already do this for digits and letters. I tried [^\<>] to check that the string doesn't contain the special characters \ > <.
But it didn't work. Can someone please suggest how it can be done?
Edit:
This might be a simple question, but I am just starting with regex.
\ is a special character in regex, even inside of a character class. If you want to use it as literal character, you have to escape it, so the regex would be
[^\\<>]
If you use it in Java, you additionally have to escape each backslash at the string level, so it would appear as:
String regex = "[^\\\\<>]";
Matching against [^\<>] will succeed if the string contains even one character which is not a backslash or angle bracket. If you wish it to succeed only when no character in the string is one of the forbidden ones, use
^[^\\<>]+$
You can try the regular expression:
^[^\\<>]*$
e.g.
private static final Pattern REGEX_PATTERN =
Pattern.compile("^[^\\\\<>]*$");
public static void main(String[] args) {
System.out.println(
REGEX_PATTERN.matcher("Hello World!").matches()
); // prints "true"
System.out.println(
REGEX_PATTERN.matcher("<script ...></script>").matches()
); // prints "false"
}
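The same check can also be phrased the other way round: search for a single forbidden character and negate the result. A sketch under that assumption, with hypothetical names (InputCheck, isAllowed):

```java
import java.util.regex.Pattern;

public class InputCheck {
    // Matches one forbidden character: backslash, '<' or '>'.
    private static final Pattern FORBIDDEN = Pattern.compile("[\\\\<>]");

    public static boolean isAllowed(String s) {
        // find() succeeds as soon as a forbidden character occurs anywhere.
        return !FORBIDDEN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("Hello World!"));          // true
        System.out.println(isAllowed("<script ...></script>")); // false
        System.out.println(isAllowed("C:\\temp"));              // false
    }
}
```

Like ^[^\\<>]*$, this version accepts the empty string. Use the + variant shown earlier if at least one character is required.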
I have tried the following example, but it gives the following output:
output[]. I pass the string "1.0" to the function calculatePayout() and want to store 1 in s[0] and 0 in s[1].
import java.util.Arrays;
public class aps {
public void calculatePayout(String amount)
{
String[] s = amount.split(".");
System.out.println("output"+Arrays.toString(s));
}
public static void main(String args[])
{
new aps().calculatePayout("1.0");
}
}
The split() method accepts a regular expression. The character . in a regular expression means "any character". To split your string on a literal . you have to escape it, i.e. split("\\."). Two backslashes are needed because the regex engine must see \. to treat the dot literally, and a backslash inside a Java string literal must itself be written as \\.
Try escaping the dot:
String[] s = amount.split("\\.");
As dot is "Any character" in regex.
Try amount.split("\\.")
Split method uses a regex so you need to use the correct syntax and escape the dot.
. is a metacharacter (special character) in the regex world. String#split(regex) expects a regex as its parameter, so you either have to escape the dot with a backslash or put it in a character class in order to treat it as a normal character.
Either amount.split("\\."); or amount.split("[.]");
Try
amount.split("\\.")
split method accepts a regular expression
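The answers above can be checked with a short sketch. The class and method names (DotSplit, splitOnDot) are made up for illustration. Pattern.quote is a third option that avoids writing the escape by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class DotSplit {
    public static String[] splitOnDot(String amount) {
        // Pattern.quote(".") yields a pattern that matches a literal dot.
        return amount.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        // Unescaped: every character is a delimiter, and the resulting
        // trailing empty strings are dropped, leaving an empty array.
        System.out.println(Arrays.toString("1.0".split(".")));   // []
        System.out.println(Arrays.toString("1.0".split("\\."))); // [1, 0]
        System.out.println(Arrays.toString("1.0".split("[.]"))); // [1, 0]
        System.out.println(Arrays.toString(splitOnDot("1.0")));  // [1, 0]
    }
}
```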
Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?
This would be very handy in dynamically building a regular expression, without having to manually escape each individual character.
For example, consider a simple regex like \d+\.\d+ that matches numbers with a decimal point like 1.2, as well as the following code:
String digit = "d";
String point = ".";
String regex1 = "\\d+\\.\\d+";
String regex2 = Pattern.quote(digit + "+" + point + digit + "+");
Pattern numbers1 = Pattern.compile(regex1);
Pattern numbers2 = Pattern.compile(regex2);
System.out.println("Regex 1: " + regex1);
if (numbers1.matcher("1.2").matches()) {
System.out.println("\tMatch");
} else {
System.out.println("\tNo match");
}
System.out.println("Regex 2: " + regex2);
if (numbers2.matcher("1.2").matches()) {
System.out.println("\tMatch");
} else {
System.out.println("\tNo match");
}
Not surprisingly, the output produced by the above code is:
Regex 1: \d+\.\d+
Match
Regex 2: \Qd+.d+\E
No match
That is, regex1 matches 1.2 but regex2 (which is "dynamically" built) does not (instead, it matches the literal string d+.d+).
So, is there a method that would automatically escape each regex meta-character?
If there were, let's say, a static escape() method in java.util.regex.Pattern, the output of
Pattern.escape('.')
would be the string "\.", but
Pattern.escape(',')
should just produce ",", since it is not a meta-character. Similarly,
Pattern.escape('d')
could produce "\d", since 'd' is used to denote digits (although escaping may not make sense in this case, as 'd' could mean literal 'd', which wouldn't be misunderstood by the regex interpreter to be something else, as would be the case with '.').
Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?
If you are looking for a way to create constants that you can use in your regex patterns, then just prepending them with "\\" should work but there is no nice Pattern.escape('.') function to help with this.
So if you are trying to match "\\d" (the string \d instead of a decimal digit) then you would do:
// this will match on \d as opposed to a decimal digit
String matchBackslashD = "\\\\d";
// as opposed to
String matchDecimalDigit = "\\d";
The 4 backslashes in the Java string literal become 2 backslashes in the regex pattern, and 2 backslashes in a regex pattern match a single literal backslash. Prepending a backslash to any special character turns it into a normal character instead of a special one.
String matchPeriod = "\\.";
String matchPlus = "\\+";
String matchParens = "\\(\\)";
...
In your post you use the Pattern.quote(string) method. This method wraps your pattern between "\\Q" and "\\E" so you can match a string even if it happens to have a special regex character in it (+, ., \\d, etc.)
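One way to apply this to the example from the question is to quote only the literal piece and leave the regex pieces unquoted. A minimal sketch, with hypothetical names (DecimalPattern, numbersSeparatedBy):

```java
import java.util.regex.Pattern;

public class DecimalPattern {
    // The separator is treated as plain text; the \d+ parts stay regex syntax.
    public static Pattern numbersSeparatedBy(String separator) {
        return Pattern.compile("\\d+" + Pattern.quote(separator) + "\\d+");
    }

    public static void main(String[] args) {
        Pattern p = numbersSeparatedBy(".");
        System.out.println(p.matcher("1.2").matches()); // true
        System.out.println(p.matcher("1x2").matches()); // false
    }
}
```

Quoting the whole expression, as regex2 does in the question, also quotes the d and + that were meant as regex syntax, which is why it stops matching.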
I wrote this pattern:
Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[{}()\\[\\].+*?^$\\\\|]");
And use it in this method:
String escapeSpecialRegexChars(String str) {
return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
}
Then you can use it like this, for example:
Pattern toSafePattern(String text)
{
return Pattern.compile(".*" + escapeSpecialRegexChars(text) + ".*");
}
We needed to do that because, after escaping, we add more regex syntax around the text. If you don't need that, you can simply use \Q and \E:
Pattern toSafePattern(String text)
{
return Pattern.compile(".*\\Q" + text + "\\E.*");
}
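Concatenating \Q and \E by hand ends the quoted section early if the text itself contains \E. Pattern.quote takes care of that case, so a variant of the method above could look like the following sketch (the class name SafePattern is an assumption for illustration):

```java
import java.util.regex.Pattern;

public class SafePattern {
    // Pattern.quote wraps the text in \Q...\E and also handles
    // an embedded \E inside the text itself.
    public static Pattern toSafePattern(String text) {
        return Pattern.compile(".*" + Pattern.quote(text) + ".*");
    }

    public static void main(String[] args) {
        System.out.println(toSafePattern("1+1").matcher("so 1+1=2").matches()); // true
        System.out.println(toSafePattern("1+1").matcher("so 11=2").matches());  // false
    }
}
```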
The only way the regex matcher knows you are looking for a digit and not the letter d is to escape the letter (\d). To type the regex escape character in java, you need to escape it (so \ becomes \\). So, there's no way around typing double backslashes for special regex chars.
Pattern.quote(String s) sort of does what you want. However, it leaves a little to be desired: it doesn't actually escape the individual characters, it just wraps the string in \Q...\E.
There is not a method that does exactly what you are looking for, but the good news is that it is actually fairly simple to escape all of the special characters in a Java regular expression:
regex.replaceAll("[\\W]", "\\\\$0")
Why does this work? Well, the documentation for Pattern specifically says that it's permissible to escape non-alphabetic characters that don't necessarily have to be escaped:
It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.
For example, ; is not a special character in a regular expression. However, if you escape it, Pattern will still interpret \; as ;. Here are a few more examples:
> becomes \> which is equivalent to >
[ becomes \[ which is the escaped form of [
8 is still 8.
\) becomes \\\) which is the escaped forms of \ and ) concatenated.
Note: The key is the definition of "non-alphabetic", which in the documentation really means "non-word" characters, or characters outside the character set [a-zA-Z_0-9].
Use this utility function escapeQuotes() to escape strings placed inside groups and sets of a regular expression.
List of Regex Literals to escape <([{\^-=$!|]})?*+.>
public class RegexUtils {
static String escapeChars = "\\.?![]{}()<>*+-=^$|";
public static String escapeQuotes(String str) {
if(str != null && str.length() > 0) {
return str.replaceAll("[\\W]", "\\\\$0"); // \W designates non-word characters
}
return "";
}
}
From the Pattern class documentation: the backslash character ('\') serves to introduce escaped constructs. The string literal "\(hello\)" is illegal and leads to a compile-time error; in order to match the string (hello) the string literal "\\(hello\\)" must be used.
Example: the string to be matched is (hello) and the regex with a group is (\(hello\)). From here you only need to escape the matched string, as shown below.
public static void main(String[] args) {
String matched = "(hello)", regexExpGrup = "(" + escapeQuotes(matched) + ")";
System.out.println("Regex : "+ regexExpGrup); // (\(hello\))
}
I agree with Gray: you may need your pattern to contain both literals (\[, \]) and meta-characters ([, ]). With a utility like this you can escape all the characters first and then add the meta-characters you want to the same pattern.
Use
Pattern p = Pattern.compile("\"");
String s = p.toString() + "yourcontent" + p.toString();
This gives the result "yourcontent" (wrapped in quotes) as is.
Let's say I have a string...
String myString = "my*big*string*needs*parsing";
All I want is to split the string into "my", "big", "string", etc.
So I try
myString.split("*");
returns
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
* is a special character in regex so I try escaping....
myString.split("\\*");
Same exception. I figured someone would know a quick solution. Thanks.
split("\\*") works for me.
A single backslash will not do the trick (tested with Java 6 on Mac OS X), because in a Java string literal \ is reserved for the escapes \b \t \n \f \r \' \" and \\. What you have seems to work for me:
public static void main(String[] args) {
String myString = "my*big*string*needs*parsing";
String[] a = myString.split("\\*");
for (String b : a) {
System.out.println(b);
}
}
outputs:
my
big
string
needs
parsing
http://arunma.com/2007/08/23/javautilregexpatternsyntaxexception-dangling-meta-character-near-index-0/
Should do exactly what you need.
myString.split("\\*"); is working fine on Java 5. Which JRE do you use?
You can also use a StringTokenizer.
StringTokenizer st = new StringTokenizer("my*big*string*needs*parsing", "*"); // delimiters are plain characters, not a regex
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
This happens because the split method takes a regular expression, not a plain string.
The '*' character means match the previous character zero or more times, thus it is not valid to specify it on its own.
So it should be escaped, like this:
split("\\*")
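If the delimiter is not known in advance, escaping it by hand is error-prone. A small helper that quotes whatever delimiter it is given is one option. This is a sketch with hypothetical names (LiteralSplit, splitLiteral), not part of the JDK:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralSplit {
    // Splits on the delimiter as plain text, whatever characters it contains.
    public static String[] splitLiteral(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String myString = "my*big*string*needs*parsing";
        // prints [my, big, string, needs, parsing]
        System.out.println(Arrays.toString(splitLiteral(myString, "*")));
    }
}
```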