I want a regular expression for a pattern with these rules:
1) The string must contain at least one '/' and one digit (/2/), or digits with spaces (//232 232/), or only one space (/// ////)
2) Text (letters) is not allowed
**valid inputs:**
/1 323////
///////323 3232
//4343//4343
3/
**Invalid inputs:**
/////
121
///////3434dsds344//
//dsd///232
I have used ^/*(?:\\d[\\d ]*/*)*$ but it fails for a few valid inputs, such as 232/////232
Can anyone help?
This one should work:
(?=.*\d)(?=.*\/)^[\d\/ ]+$
A simple alternation should suffice:
^(?:\d+ */+|/+ *\d+)[\d/ ]*$
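Both suggested patterns can be checked against the sample inputs from the question. The following is only a minimal test sketch, not part of either answer: the class name SlashDigitCheck and its members are made up for illustration, and the redundant \/ escape is written as a plain /.

```java
import java.util.regex.Pattern;

class SlashDigitCheck {
    // Lookahead version: at least one digit, at least one '/',
    // and nothing but digits, '/' and spaces overall.
    static final Pattern LOOKAHEAD = Pattern.compile("(?=.*\\d)(?=.*/)^[\\d/ ]+$");

    // Alternation version: digits followed by slashes, or slashes followed by digits,
    // at the very start, then any mix of digits, '/' and spaces.
    static final Pattern ALTERNATION = Pattern.compile("^(?:\\d+ */+|/+ *\\d+)[\\d/ ]*$");

    static boolean isValid(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "/1 323////", "///////323 3232", "//4343//4343", "3/",   // listed as valid
            "/////", "121", "///////3434dsds344//", "//dsd///232"    // listed as invalid
        };
        for (String s : samples) {
            System.out.println(s + " -> lookahead=" + isValid(LOOKAHEAD, s)
                    + ", alternation=" + isValid(ALTERNATION, s));
        }
    }
}
```

On the eight inputs listed in the question the two patterns agree, but they are not equivalent in general: an input such as "/ /1" passes the lookahead version and fails the alternation, because the alternation expects the first run of slashes and the first run of digits to be separated by spaces only.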
I have this Java code
String cookies = TextUtils.join(";", LoginActivity.msCookieManager.getCookieStore().getCookies());
Log.d("TheCookies", cookies);
Pattern csrf_pattern = Pattern.compile("csrf_cookie=(.+)(?=;)");
Matcher csrf_matcher = csrf_pattern.matcher(cookies);
while (csrf_matcher.find()) {
    json.put("csrf_key", csrf_matcher.group(1));
    Log.d("CSRF KEY", csrf_matcher.group(1));
}
The String contains something like this:
SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e
I'm trying to get the csrf_cookie value using this regular expression:
csrf_cookie=(.+)(?=;)
I expect a result like this in the code:
csrf_matcher.group(1);
e18d027da2fb95e888ebede711f1bc39
Instead I get:
3492f8670f4b09a6b3c3cbdfcc59e512;ci_session=8d823b309a361587fac5d67ad4706359b40d7bd0
What is a possible workaround for this problem?
Here is a one-liner using String#replaceAll:
String input = "SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
String cookie = input.replaceAll(".*csrf_cookie=([^;]*).*", "$1");
System.out.println(cookie);
e18d027da2fb95e888ebede711f1bc39
Note: We could have used a formal regex pattern matcher, and in fact you may want to do that if you need to run this search/replacement often in your code.
You are getting more data than expected because you are using a greedy '+' (it matches as much as it can).
For example, on aaa the pattern a+ could match a, aa, or aaa; the last is preferred when the pattern is greedy.
So you are matching
csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e;
as long as it ends with a ';'. The first ';' is consumed by .+ and the last ';' is found by the positive lookahead.
To make a pattern ungreedy (lazy), use +? instead of + (so a+? would match a three times on the string aaa).
So try with:
csrf_cookie=(.+?);
or just match anything that is not a ';'
csrf_cookie=([^;]*);
that way you don't need to make it lazy.
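The difference between the greedy, lazy, and negated-class variants can be seen side by side. This is a small sketch under stated assumptions: the cookie string is a hypothetical one with shortened values and an extra cookie after ci_session (the greedy problem only shows up when another ';' follows), and CsrfCookieExtractor and firstGroup are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfCookieExtractor {
    // Returns capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical input: shortened values, one extra cookie at the end.
        String cookies = "SessionID=sessiontest;csrf_cookie=abc123;ci_session=def456;lang=en";

        // Greedy: .+ runs to the end of the string, then backs off to the LAST ';'.
        System.out.println(firstGroup("csrf_cookie=(.+)(?=;)", cookies)); // abc123;ci_session=def456

        // Lazy: .+? grows one character at a time and stops at the FIRST ';'.
        System.out.println(firstGroup("csrf_cookie=(.+?);", cookies));    // abc123

        // Negated class: can never cross a ';', so no laziness is needed.
        System.out.println(firstGroup("csrf_cookie=([^;]*)", cookies));   // abc123
    }
}
```

The last pattern leaves out the trailing ';' on purpose, so it also works when csrf_cookie happens to be the final cookie in the string.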
Does Java have a built-in way to escape arbitrary text so that it can be included in a regular expression? For example, if my users enter "$5", I'd like to match that exactly rather than a "5" after the end of input.
Since Java 1.5, yes:
Pattern.quote("$5");
The difference between Pattern.quote and Matcher.quoteReplacement was not clear to me until I saw the following example:
s.replaceFirst(Pattern.quote("text to replace"),
Matcher.quoteReplacement("replacement text"));
It may be too late to respond, but you can also use Pattern.LITERAL, which makes the pattern treat every character, including special ones, as a literal when it is compiled:
Pattern.compile(textToFormat, Pattern.LITERAL);
I think what you're after is \Q$5\E. Also see Pattern.quote(s), introduced in Java 5.
See Pattern javadoc for details.
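The options mentioned so far can be compared on the "$5" example from the question. This is a minimal sketch; the class LiteralMatchDemo, its method names, and the sample sentence are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class LiteralMatchDemo {
    // Quoted: the user's text is wrapped in \Q...\E and matched verbatim.
    static boolean containsQuoted(String userText, String input) {
        return Pattern.compile(Pattern.quote(userText)).matcher(input).find();
    }

    // LITERAL flag: same effect without changing the pattern string.
    static boolean containsLiteral(String userText, String input) {
        return Pattern.compile(userText, Pattern.LITERAL).matcher(input).find();
    }

    // Raw: "$5" is read as "end of input, then the character 5".
    static boolean containsRaw(String userText, String input) {
        return Pattern.compile(userText).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Price: $5 each"; // hypothetical text to search
        System.out.println(Pattern.quote("$5"));          // \Q$5\E
        System.out.println(containsQuoted("$5", input));  // true
        System.out.println(containsLiteral("$5", input)); // true
        System.out.println(containsRaw("$5", input));     // false
    }
}
```

Pattern.quote is handy when the quoted text is only one part of a larger expression, while Pattern.LITERAL applies to the whole pattern.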
First off, if you use replaceAll(), you DON'T use Matcher.quoteReplacement(), and the text to be substituted in includes a $1, it won't put a 1 at the end. It will look at the search regex for the first matching group and sub THAT in. That's what $1, $2 or $3 means in the replacement text: matching groups from the search pattern.
I frequently plug long strings of text into .properties files, then generate email subjects and bodies from those. Indeed, this appears to be the default way to do i18n in Spring Framework. I put XML tags, as placeholders, into the strings and I use replaceAll() to replace the XML tags with the values at runtime.
I ran into an issue where a user input a dollars-and-cents figure, with a dollar sign. replaceAll() choked on it, with the following showing up in a stack trace:
java.lang.IndexOutOfBoundsException: No group 3
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:748)
at java.util.regex.Matcher.replaceAll(Matcher.java:823)
at java.lang.String.replaceAll(String.java:2201)
In this case, the user had entered "$3" somewhere in their input and replaceAll() went looking in the search regex for the third matching group, didn't find one, and puked.
Given:
// "msg" is a string from a .properties file, containing "<userInput />" among other tags
// "userInput" is a String containing the user's input
replacing
msg = msg.replaceAll("<userInput \\/>", userInput);
with
msg = msg.replaceAll("<userInput \\/>", Matcher.quoteReplacement(userInput));
solved the problem. The user could put in any kind of characters, including dollar signs, without issue. It behaved exactly the way you would expect.
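The failure and the fix can be reproduced in a few lines. This is a sketch with made-up names and data: the template text, the "$3.50" input, and the class ReplacementQuoting are all hypothetical.

```java
import java.util.regex.Matcher;

class ReplacementQuoting {
    // Unsafe: '$' and '\' inside userInput are interpreted as replacement syntax.
    static String fillUnsafe(String msg, String userInput) {
        return msg.replaceAll("<userInput />", userInput);
    }

    // Safe: quoteReplacement turns the whole user input into literal text.
    static String fillSafe(String msg, String userInput) {
        return msg.replaceAll("<userInput />", Matcher.quoteReplacement(userInput));
    }

    public static void main(String[] args) {
        // What $1 and $2 are meant for: referring to groups of the search regex.
        System.out.println("John Smith".replaceAll("(\\w+) (\\w+)", "$2, $1")); // Smith, John

        String msg = "You entered: <userInput />"; // hypothetical template
        String userInput = "$3.50";                // hypothetical user input

        System.out.println(fillSafe(msg, userInput)); // You entered: $3.50
        try {
            fillUnsafe(msg, userInput); // "$3" refers to a group that does not exist
        } catch (IndexOutOfBoundsException e) {
            System.out.println("Unquoted replacement failed: " + e.getMessage());
        }
    }
}
```

When neither the search text nor the replacement needs regex features, String.replace(CharSequence, CharSequence) is another option, since it treats both arguments as literal text.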
To build a protected pattern, you can escape every character except digits and letters by prefixing it with a backslash ("\\\\" in the replacement string). After that you can put your own special symbols into the protected pattern, so that it works as a real pattern of your own design rather than as plain quoted text, and without any special symbols coming from the user.
public class Test {
    public static void main(String[] args) {
        String str = "y z (111)";
        String p1 = "x x (111)";
        String p2 = ".* .* \\(111\\)";
        // Escape everything first, then turn the placeholder "x" into a wildcard
        p1 = escapeRE(p1);
        p1 = p1.replace("x", ".*");
        System.out.println( p1 + "-->" + str.matches(p1) );
        //.*\ .*\ \(111\)-->true
        System.out.println( p2 + "-->" + str.matches(p2) );
        //.* .* \(111\)-->true
    }

    // Prefixes every character that is not a letter or digit with a backslash
    public static String escapeRE(String str) {
        //Pattern escaper = Pattern.compile("([^a-zA-Z0-9])");
        //return escaper.matcher(str).replaceAll("\\\\$1");
        return str.replaceAll("([^a-zA-Z0-9])", "\\\\$1");
    }
}
Pattern.quote("blabla") works nicely.
Pattern.quote() works nicely. It wraps the text in "\Q" and "\E", and it also handles text that already contains "\Q" or "\E".
However, if you need real regular-expression escaping (or custom escaping), you can use this code:
String someText = "Some/s/wText*/,**";
System.out.println(someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
This returns: Some/s/wText\*/\,\*\*
Note that inside the character class \\\\ stands for a literal backslash and \\s for whitespace; writing \\\\s instead would match a backslash or the letter s.
Code for example and tests:
String someText = "Some\\E/s/wText*/,**";
System.out.println("Pattern.quote: "+ Pattern.quote(someText));
System.out.println("Full escape: "+someText.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0"));
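One way to sanity-check either kind of escaping is a round trip: the escaped form, used as a pattern, should match the original text exactly. The sketch below assumes the character class is meant to cover the regex metacharacters plus whitespace (written with \\\\ for a literal backslash and \\s for whitespace); EscapeRoundTrip and escape are illustrative names.

```java
import java.util.regex.Pattern;

class EscapeRoundTrip {
    // Hypothetical helper: backslash-escape regex metacharacters and whitespace.
    static String escape(String text) {
        return text.replaceAll("[-\\[\\]{}()*+?.,\\\\^$|#\\s]", "\\\\$0");
    }

    // True when the given pattern matches the whole of the original text.
    static boolean roundTrips(String pattern, String original) {
        return Pattern.matches(pattern, original);
    }

    public static void main(String[] args) {
        String someText = "Some\\E/s/wText*/,**"; // contains a literal \E
        System.out.println(escape(someText));
        System.out.println(roundTrips(escape(someText), someText));        // true
        System.out.println(roundTrips(Pattern.quote(someText), someText)); // true
    }
}
```

The custom escape is mainly useful when the result has to be readable or passed to another regex engine; within Java alone, Pattern.quote is usually the simpler choice.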
The ^ (negation) symbol, placed at the start of a character class, matches any character that is not in the class.
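A short illustration of a negated character class, as a sketch with made-up names (NegatedClassDemo, digitsOnly):

```java
class NegatedClassDemo {
    // [^0-9] matches any single character that is NOT a digit,
    // so replacing those matches with "" keeps only the digits.
    static String digitsOnly(String input) {
        return input.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("a1b2c3"));       // 123
        System.out.println(digitsOnly("//4343//4343")); // 43434343
    }
}
```

The ^ only means negation as the first character inside [...]; outside a character class it anchors the match to the start of the input.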
I am having trouble finding the correct regular expression. I have the XML below as a string:
<user_input>
<UserInput Question="test Q?" Answer=<value>0</value><sam#testmail.com>"
</user_input>
Now I need to remove the XML characters from the Answer attribute only.
So I need the following:
<user_input>
<UserInput Question="test Q?" Answer=value0value sam#testmail.com"
</user_input>
I tried the regex below, but it did not work:
str1.replaceAll("Answer=.*?<([^<]*)>", "$1");
It removes all the text before the match.
Can anyone help, please?
You need to put ? inside the first group to make it non-greedy; also, you don't need Answer=.*?:
str1.replaceAll("<([^<]*?)>", "$1")
Although variable-width lookbehinds are not supported in Java, you can work around that with a bounded quantifier such as .{0,1000}, which should suffice.
Please check out this approach using 2 regexes, or 1 regex and 1 replace. Choose the one that suits best (I removed the \n line break from the first input string to show the flaw with using simple replace):
String input = "<user_input><UserInput Question=\"test Q?\" Answer=<value>0</value><sam#testmail.com>\"\n</user_input>";
String st = input.replace("><", " ").replaceAll("(?<=Answer=.{0,1000})[<>/]+(?=[^\"]*\")", "");
String st1 = input.replaceAll("(?<=Answer=.{0,1000})><(?=[^\"]*\")", " ").replaceAll("(?<=Answer=.{0,1000})[<>/]+(?=[^\"]*\")", "");
System.out.println(st + "\n" + st1);
Output of a sample program:
<user_input UserInput Question="test Q?" Answer=value0value sam#testmail.com"
</user_input>
<user_input><UserInput Question="test Q?" Answer=value0value sam#testmail.com"
</user_input>
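A different way to restrict the cleanup to the Answer attribute is to isolate the attribute value with a Matcher first and then clean only that piece with ordinary string operations. This is a sketch of that swapped-in technique, not the approach above; AnswerCleaner and clean are hypothetical names, and it assumes the value runs from Answer= to the next double quote on the same line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnswerCleaner {
    // Group 1: "Answer=", group 2: the raw value, group 3: the closing quote.
    private static final Pattern ANSWER = Pattern.compile("(Answer=)([^\"\\n]*)(\")");

    static String clean(String xml) {
        Matcher m = ANSWER.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(2)
                    .replace("><", " ")       // adjacent tags become a single space
                    .replaceAll("[<>/]", ""); // drop the remaining markup characters
            // quoteReplacement keeps any '$' or '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + value + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "<user_input>\n<UserInput Question=\"test Q?\" "
                + "Answer=<value>0</value><sam#testmail.com>\"\n</user_input>";
        System.out.println(clean(input));
    }
}
```

Because the replacements only ever see the captured value, the surrounding tags are left alone without needing a lookbehind.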
First off, in your sample above there is a trailing " after the email and the >, and I do not know whether it was placed there by mistake.
However, I will keep it, since according to your expected result it still needs to be present.
This is my hack.
(Answer=)(<)(value)(>)(.+?([^<]*))(</)(value)(><)(.+?([^>]*))(>) to replace it with
$1$3$5$8 $10
The explanation...
(Answer=)(<)(value)(>) matches from Answer up to the start of the value 0
(.+?([^<]*)) matches the value itself, up to the < that starts the closing value tag
(</) is matched here, since the previous expression stops before it
(><) will later be replaced with a space
(.+?([^>]*)) matches from the start of the email and excludes the > after the .com
(>) matches the last >, which is dropped when replacing.
The trailing " is not matched, as I would rather not touch it, as requested.
I'm trying to create a regex for the string below.
My string looks like this:
'AUR HALAA /PART="PROJECT" /ROLE="VR_ANALYST" /TYPE="C" /CAPABILITY="S" /ADD' (SUC)
The constant parts in the regex are:
'AUR
/ROLE=""
The inputs are:
HALAA
VR_ANALYST
I tried the regex like this:
\'(AUR) HALAA .* /ROLE="(.)" .
but it doesn't work.
Could you please show me how to do this?
Try this:
^AUR (\\w+).*?/ROLE="(\\w+)".*$
This regex might work for you
^AUR (\\w+) .*? /ROLE="(\\w+)" .*$
You can then use the group() method of the Matcher class to get the captured groups, which gives you HALAA at group(1) and VR_ANALYST at group(2).
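Put together with a Matcher, the extraction could look like the sketch below. It assumes the leading apostrophe shown in the question is part of the input, so the pattern accepts it optionally and uses find() instead of matching the whole line; AurCommandParser and parse are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AurCommandParser {
    // Group 1: the word after AUR, group 2: the quoted /ROLE value.
    private static final Pattern AUR =
            Pattern.compile("^'?AUR (\\w+).*?/ROLE=\"(\\w+)\"");

    // Returns {word after AUR, role}, or null when the line has a different shape.
    static String[] parse(String line) {
        Matcher m = AUR.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String line = "'AUR HALAA /PART=\"PROJECT\" /ROLE=\"VR_ANALYST\" "
                + "/TYPE=\"C\" /CAPABILITY=\"S\" /ADD' (SUC)";
        String[] parts = parse(line);
        System.out.println(parts[0]); // HALAA
        System.out.println(parts[1]); // VR_ANALYST
    }
}
```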
I need a regular expression to extract the values between >xxxxx<. Can anybody help me with this?
<ChangeID type="String">C10286</ChangeID>
<ChangeID type="String">C10296</ChangeID>
Is it possible to get the two values in a comma-separated format like C10286,C10296 in a single regular expression?
Thanks and Regards
Riyas Hussain A
try this:
(?<=>)[^<]*
test it with grep -Po:
kent$ echo '<ChangeID type="String">C10286</ChangeID>
<ChangeID type="String">C10296</ChangeID>'|grep -Po '(?<=>)[^<]*'
C10286
C10296
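For the comma-separated result asked about in the question, a regex alone only finds the values; joining them takes a little code. The sketch below does this in Java under some assumptions: it uses a more specific pattern anchored on the ChangeID tag instead of the lookbehind (on multi-line input the lookbehind would also match the line break between the tags), and ChangeIdExtractor and joinChangeIds are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChangeIdExtractor {
    // Group 1 is the text between the opening and closing ChangeID tags.
    private static final Pattern CHANGE_ID =
            Pattern.compile("<ChangeID[^>]*>([^<]*)</ChangeID>");

    // Collects every ChangeID value and joins them with commas.
    static String joinChangeIds(String xml) {
        List<String> ids = new ArrayList<>();
        Matcher m = CHANGE_ID.matcher(xml);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return String.join(",", ids);
    }

    public static void main(String[] args) {
        String xml = "<ChangeID type=\"String\">C10286</ChangeID>\n"
                + "<ChangeID type=\"String\">C10296</ChangeID>";
        System.out.println(joinChangeIds(xml)); // C10286,C10296
    }
}
```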
My idea would be to look up all words and exclude the ones we don't need (in case you have more than one value inside your tag):
(?!ChangeID\b)(?!type\b)(?!String\b)\b\w+
You can try it out at http://regexpal.com/