I have this Java code:
String cookies = TextUtils.join(";", LoginActivity.msCookieManager.getCookieStore().getCookies());
Log.d("TheCookies", cookies);
Pattern csrf_pattern = Pattern.compile("csrf_cookie=(.+)(?=;)");
Matcher csrf_matcher = csrf_pattern.matcher(cookies);
while (csrf_matcher.find()) {
json.put("csrf_key", csrf_matcher.group(1));
Log.d("CSRF KEY", csrf_matcher.group(1));
}
The String contains something like this:
SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e
I'm trying to get the csrf_cookie data by using this regular expression:
csrf_cookie=(.+)(?=;)
I expect a result like this in the code:
csrf_matcher.group(1);
e18d027da2fb95e888ebede711f1bc39
Instead I get:
3492f8670f4b09a6b3c3cbdfcc59e512;ci_session=8d823b309a361587fac5d67ad4706359b40d7bd0
What is a possible workaround for this problem?
Here is a one-liner using String#replaceAll:
String input = "SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
String cookie = input.replaceAll(".*csrf_cookie=([^;]*).*", "$1");
System.out.println(cookie);
e18d027da2fb95e888ebede711f1bc39
Note: We could have used a formal regex pattern matcher, and in fact you may want to do so if you need to run this search/replacement often in your code.
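As a rough sketch of the matcher-based version mentioned in the note, the pattern can be compiled once and reused. The class and method names (CsrfCookieExtractor, extract) are made up for illustration, and the pattern assumes cookies are separated by ';' as in the sample string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfCookieExtractor {
    // Compiled once; a Pattern is immutable and can be shared between calls.
    private static final Pattern CSRF = Pattern.compile("csrf_cookie=([^;]*)");

    // Returns the csrf_cookie value, or null when the cookie is absent.
    static String extract(String cookies) {
        Matcher m = CSRF.matcher(cookies);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;"
                + "ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
        System.out.println(extract(input)); // e18d027da2fb95e888ebede711f1bc39
    }
}
```

One difference from the replaceAll one-liner: when the input has no csrf_cookie at all, replaceAll returns the whole input unchanged, while this sketch returns null.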
You are getting more data than expected because you are using a greedy '+' (it matches as much as it can).
For example, on aaa the pattern a+ could match a, aa, or aaa, and the last of these is preferred when the pattern is greedy.
So you are matching
csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e;
as long as it ends with a ';'. The first ';' is consumed by .+ and the last ';' is found by the positive lookahead.
To make a pattern ungreedy/lazy, use +? instead of + (so a+? would match a three times on the string aaa).
So try with:
csrf_cookie=(.+?);
or just match anything that is not a ';'
csrf_cookie=([^;]*);
that way you don't need to make it lazy.
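To see the three variants side by side, here is a small sketch. The cookie string is a shortened, hypothetical one with a trailing ';' so that the greedy behaviour is visible, and firstGroup is just a helper written for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazyDemo {
    // Returns group 1 of the first match of regex in input, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical input: note the ';' after the last cookie.
        String cookies = "SessionID=sessiontest;csrf_cookie=abc123;ci_session=def456;";

        // Greedy: .+ runs to the end and backtracks only as far as the last ';'.
        System.out.println(firstGroup("csrf_cookie=(.+)(?=;)", cookies)); // abc123;ci_session=def456
        // Lazy: .+? stops at the first ';'.
        System.out.println(firstGroup("csrf_cookie=(.+?);", cookies));    // abc123
        // Negated class: [^;]* can never run past a ';'.
        System.out.println(firstGroup("csrf_cookie=([^;]*);", cookies));  // abc123
    }
}
```

If csrf_cookie might be the last cookie with no ';' after it, dropping the trailing ';' from the negated-class pattern (csrf_cookie=([^;]*)) should still capture the value.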
Does Java have a built-in way to escape arbitrary text so that it can be included in a regular expression? For example, if my users enter "$5", I'd like to match that exactly rather than a "5" after the end of input.
Since Java 1.5, yes:
Pattern.quote("$5");
The difference between Pattern.quote and Matcher.quoteReplacement was not clear to me until I saw the following example:
s.replaceFirst(Pattern.quote("text to replace"),
Matcher.quoteReplacement("replacement text"));
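A minimal runnable sketch of that idea: Pattern.quote protects the search text and Matcher.quoteReplacement protects the replacement text. The helper name replaceLiterally and the sample strings are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteBothSides {
    // Replaces the first occurrence of target with replacement, treating both as plain text.
    static String replaceLiterally(String s, String target, String replacement) {
        return s.replaceFirst(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "(x)" would be a capturing group if left unquoted,
        // and "$5" would be a group reference if left unquoted.
        System.out.println(replaceLiterally("cost: (x)", "(x)", "$5")); // cost: $5
    }
}
```

When every occurrence should be replaced and no regex features are needed, String.replace(CharSequence, CharSequence) does a literal replacement without any quoting.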
It may be too late to respond, but you can also use Pattern.LITERAL, which makes the whole pattern be treated as a literal string, so special characters lose their meaning:
Pattern.compile(textToFormat, Pattern.LITERAL);
I think what you're after is \Q$5\E. Also see Pattern.quote(s), introduced in Java 5.
See Pattern javadoc for details.
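The three approaches mentioned in these answers (Pattern.quote, Pattern.LITERAL, and a hand-written \Q...\E) can be compared with a short sketch. The class and method names are hypothetical, and the hand-written variant is only safe when the text contains no "\E" of its own.

```java
import java.util.regex.Pattern;

class LiteralMatchDemo {
    // Quotes the text, then compiles it as a normal regex.
    static boolean viaQuote(String input, String text) {
        return Pattern.compile(Pattern.quote(text)).matcher(input).find();
    }

    // Compiles the text with the LITERAL flag so no character is special.
    static boolean viaLiteralFlag(String input, String text) {
        return Pattern.compile(text, Pattern.LITERAL).matcher(input).find();
    }

    // Hand-written quoting; breaks if text itself contains "\E".
    static boolean viaManualQuoting(String input, String text) {
        return Pattern.compile("\\Q" + text + "\\E").matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Total: $5";
        // Unquoted, "$5" asks for a '5' after the end of input, which cannot match.
        System.out.println(Pattern.compile("$5").matcher(input).find()); // false
        System.out.println(viaQuote(input, "$5"));          // true
        System.out.println(viaLiteralFlag(input, "$5"));    // true
        System.out.println(viaManualQuoting(input, "$5"));  // true
    }
}
```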
First off, if you use replaceAll(), you DON'T use Matcher.quoteReplacement(), and the text to be substituted in includes a $1, it won't put a 1 at the end. It will look at the search regex for the first matching group and sub THAT in. That's what $1, $2 or $3 means in the replacement text: matching groups from the search pattern.
I frequently plug long strings of text into .properties files, then generate email subjects and bodies from those. Indeed, this appears to be the default way to do i18n in Spring Framework. I put XML tags, as placeholders, into the strings and I use replaceAll() to replace the XML tags with the values at runtime.
I ran into an issue where a user input a dollars-and-cents figure, with a dollar sign. replaceAll() choked on it, with the following showing up in a stack trace:
java.lang.IndexOutOfBoundsException: No group 3
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:748)
at java.util.regex.Matcher.replaceAll(Matcher.java:823)
at java.lang.String.replaceAll(String.java:2201)
In this case, the user had entered "$3" somewhere in their input and replaceAll() went looking in the search regex for the third matching group, didn't find one, and puked.
Given:
// "msg" is a string from a .properties file, containing "<userInput />" among other tags
// "userInput" is a String containing the user's input
replacing
msg = msg.replaceAll("<userInput \\/>", userInput);
with
msg = msg.replaceAll("<userInput \\/>", Matcher.quoteReplacement(userInput));
solved the problem. The user could put in any kind of characters, including dollar signs, without issue. It behaved exactly the way you would expect.
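The failure and the fix can be reproduced with a small sketch. The message text and the helper names fill and fillUnquoted are made up for illustration; only the <userInput /> tag comes from the description above.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Fills the <userInput /> tag with the user's text, taken literally.
    static String fill(String msg, String userInput) {
        return msg.replaceAll("<userInput />", Matcher.quoteReplacement(userInput));
    }

    // Same call without quoting: "$3" is read as a reference to group 3.
    static String fillUnquoted(String msg, String userInput) {
        return msg.replaceAll("<userInput />", userInput);
    }

    public static void main(String[] args) {
        String msg = "Amount due: <userInput />";
        try {
            fillUnquoted(msg, "$3.50");
        } catch (IndexOutOfBoundsException e) {
            // The pattern has no capturing groups, so group 3 does not exist.
            System.out.println("Unquoted replacement failed: " + e.getMessage());
        }
        System.out.println(fill(msg, "$3.50")); // Amount due: $3.50
    }
}
```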
To build a protected pattern, you can escape every character except letters and digits by prefixing it with a backslash. After that, you can substitute your own special symbols into the protected pattern, so that it works as a real pattern of your own design rather than as plain quoted text, while any special symbols from the user stay inert.
public class Test {
public static void main(String[] args) {
String str = "y z (111)";
String p1 = "x x (111)";
String p2 = ".* .* \\(111\\)";
p1 = escapeRE(p1);
p1 = p1.replace("x", ".*");
System.out.println( p1 + "-->" + str.matches(p1) );
//.*\ .*\ \(111\)-->true
System.out.println( p2 + "-->" + str.matches(p2) );
//.* .* \(111\)-->true
}
public static String escapeRE(String str) {
//Pattern escaper = Pattern.compile("([^a-zA-Z0-9])");
//return escaper.matcher(str).replaceAll("\\\\$1");
return str.replaceAll("([^a-zA-Z0-9])", "\\\\$1");
}
}
Pattern.quote("blabla") works nicely.
Pattern.quote() works nicely. It wraps the text in "\Q" and "\E", and it also takes care of any "\E" that already appears in the text.
However, if you need real regular expression escaping (or custom escaping), you can use this code:
String someText = "Some/s/wText*/,**";
System.out.println(someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
This prints: Some/s/wText\*/\,\*\*
Code for example and tests:
String someText = "Some\\E/s/wText*/,**";
System.out.println("Pattern.quote: "+ Pattern.quote(someText));
System.out.println("Full escape: "+someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
The ^ (negation) symbol, when placed first inside a character class, matches any character that is not in the class.
I have a base String "abc def", I am trying to replace my base string with "abc$ def$" using replaceFirst(), which is running into errors as $ is not escaped.
I tried doing it with Pattern and Matcher APIs, as given below,
newValue = "abc$ def$";
if(newValue.contains("$")){
Pattern specialCharacters = Pattern.compile("$");
Matcher newMatcherValue = specialCharacters.matcher(newValue) ;
newValue = newMatcherValue.replaceAll("\\\\$") ;
}
This runs into an error. Is there any elegant way of replacing my second string "abc$ def$" with "abc\\\\$ def\\\\$" so as to use the replaceFirst() API successfully?
Look at Pattern.quote() to quote a regex and Matcher.quoteReplacement() to quote a replacement string.
That said, does this do what you want it to?
System.out.println("abc def".replaceAll("([\\w]+)\\b", "$1\\$"));
This prints out abc$ def$
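You can do it in one step with replaceAll:
String newValueEscaped = newValue.replaceAll("\\$", "\\\\\\$");
$ has a special meaning in regex, so you need to escape it. In a pattern it matches the end of the input, and in a replacement string it introduces a group reference, so it has to be escaped on both sides.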
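For the replaceFirst() case from the question, a minimal sketch might look like this. The helper name escapeDollars is invented here; it uses String.replace, which works on literal text, so no regex escaping is needed to produce the escaped value.

```java
import java.util.regex.Matcher;

class DollarReplacementDemo {
    // Puts a backslash in front of every '$'; String.replace is not regex-based.
    static String escapeDollars(String s) {
        return s.replace("$", "\\$");
    }

    public static void main(String[] args) {
        String base = "abc def";
        String newValue = "abc$ def$";

        System.out.println(escapeDollars(newValue)); // abc\$ def\$
        // Escaped by hand, the value is safe to use as a replacement string.
        System.out.println(base.replaceFirst("abc def", escapeDollars(newValue))); // abc$ def$
        // Matcher.quoteReplacement gives the same result and also escapes backslashes.
        System.out.println(base.replaceFirst("abc def", Matcher.quoteReplacement(newValue))); // abc$ def$
    }
}
```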
I'm trying to create a regex for a string I write down.
My string is like :
'AUR HALAA /PART="PROJECT" /ROLE="VR_ANALYST" /TYPE="C" /CAPABILITY="S" /ADD' (SUC)
The constant part in regex is :
'AUR
/ROLE=""
The inputs are:
HALAA
VR_ANALYST
I tried the regex like this:
\'(AUR) HALAA .* /ROLE="(.)" .
but it doesn't work.
Could you please show me how to do this?
Try this:
^AUR (\\w+).*?/ROLE="(\\w+)".*$
This regex might work for you
^AUR (\\w+) .*? /ROLE="(\\w+)" .*$
You can then use the group() method of the Matcher class to get the captured groups, which gives you HALAA in group(1) and VR_ANALYST in group(2).
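Put together as runnable Java, this could look like the sketch below. It assumes the line may start with a single quote, as in the sample from the question, so the pattern allows an optional leading ' before AUR; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AurParser {
    // Optional leading quote, AUR, the first input, and later /ROLE="...".
    private static final Pattern AUR =
            Pattern.compile("^'?AUR (\\w+).*?/ROLE=\"(\\w+)\"");

    // Returns {name, role}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = AUR.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String line = "'AUR HALAA /PART=\"PROJECT\" /ROLE=\"VR_ANALYST\""
                + " /TYPE=\"C\" /CAPABILITY=\"S\" /ADD' (SUC)";
        String[] parts = parse(line);
        System.out.println(parts[0]); // HALAA
        System.out.println(parts[1]); // VR_ANALYST
    }
}
```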
I have a String say
String s = "|India| vs Aus";
In this case result should be only India.
Second case :
String s = "Aus vs |India|";
In this case result should be only India.
3rd case:
String s = "|India| vs |Aus|";
The result should contain only India and Aus; vs should not be present in the output.
In these scenarios any other word can appear in place of vs, e.g. the string can be |India| in |Aus|, or |India| and |Sri Lanka| in |Aus|. I want the words that appear between two pipes, like India, Sri Lanka, Aus.
I want to do it in Java.
Any pointer will be helpful.
You would use a regex like...
\|[^|]+\|
...or...
\|.+?\|
You must escape the pipe because an unescaped pipe means alternation (or) in a regex.
You are looking at something similar to this:
String s = "|India| vs |Aus|";
Pattern p = Pattern.compile("\\|(.*?)\\|");
Matcher m = p.matcher(s);
while(m.find()){
System.out.println(m.group(1));
}
You need to use the group to get the contents inside the parentheses in the regexp.
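If the words are needed as a collection rather than printed, the same loop can fill a list. This sketch uses the negated-class form of the pattern; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PipeWords {
    // One or more non-pipe characters between two literal pipes.
    private static final Pattern BETWEEN_PIPES = Pattern.compile("\\|([^|]+)\\|");

    // Collects every word that sits between a pair of pipes, in order.
    static List<String> extract(String s) {
        List<String> words = new ArrayList<>();
        Matcher m = BETWEEN_PIPES.matcher(s);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extract("|India| and |Sri Lanka| in |Aus|")); // [India, Sri Lanka, Aus]
        System.out.println(extract("Aus vs |India|"));                   // [India]
    }
}
```

Because each match consumes both of its pipes, the text between two pairs (such as " and " or " vs ") is never picked up as a word.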