The string is "LinksImagesListCodeHt1233ddmlImagesConsider112dd2Download" and I want to get "ImagesConsider112dd2Download", so I used the expression "Images.*?Download". But it matches "ImagesListCodeHt1233ddmlImagesConsider112dd2Download". What should the correct expression be?
For now, there is an ugly way to solve this problem, which is to reverse both the string and the pattern:
Pattern p = Pattern.compile(StringUtils.reverse("Download")+ ".*?" + StringUtils.reverse("Images") );
String s = "LinksImagesListCodeHt1233ddmlImagesConsider112dd2Download";
s = StringUtils.reverse(s);
Matcher m = p.matcher(s);
while (m.find()){
m.end();
System.out.println(StringUtils.reverse(m.group()));
}
To match the text from Images to Download that does not contain the word Images in between, you can use a negative lookahead like this:
Images((?!Images).)*Download
Explanation
Images -- Match literal string Images
(?!Images). -- Match any single character, provided the text at that position does not start with Images
((?!Images).)* -- Repeat that zero or more times
Download -- Match literal string Download
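For reference, here is a minimal sketch of that pattern in Java. The class name LastImagesToDownload and the helper extract are made up for illustration, and returning null when nothing matches is my own choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class LastImagesToDownload {
    // A character between the two literals is consumed only if the text
    // at that position does not start with "Images".
    private static final Pattern P =
            Pattern.compile("Images((?!Images).)*Download");

    // Returns the first match, or null when the input has none.
    static String extract(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "LinksImagesListCodeHt1233ddmlImagesConsider112dd2Download";
        System.out.println(extract(s)); // ImagesConsider112dd2Download
    }
}
```

The first Images cannot start a match because the lookahead stops the repetition right before the second Images, where Download is not found, so the engine moves on and matches from the second Images.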
What would be the best way to parse the following string in Java using a single regex?
String:
someprefix foo=someval baz=anotherval baz=somethingelse
I need to extract someprefix, someval, anotherval and somethingelse. The string always contains a prefix value (someprefix in the example) and can have from 0 to 4 key-value pairs (foo=someval baz=anotherval baz=somethingelse in the example)
You can use this regex for capturing your intended text,
(?<==|^)\w+
It matches a word that is either preceded by an = character or located at the start of the string (^).
Sample Java code for the same:
Pattern p = Pattern.compile("(?<==|^)\\w+");
String s = "someprefix foo=someval baz=anotherval baz=somethingelse";
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group());
}
Prints,
someprefix
someval
anotherval
somethingelse
I am trying to extract everything that comes after the path /share/attachments/docs/. All my strings start with /share/attachments/docs/
For example: /share/attachments/docs/image2.png
Number of characters after ../docs/ is not static!
I tried with
Pattern p = Pattern.compile("^(.*)/share/attachments/docs/(\\d+)$");
Matcher m = p.matcher("/share/attachments/docs/image2.png");
m.find();
String link = m.group(2);
System.out.println("Link #: "+link);
But I am getting an exception: No match found.
Strange because if I use this:
Pattern p = Pattern.compile("^(.*)ABC Results for draw no (\\d+)$");
Matcher m = p.matcher("ABC Results for draw no 2888");
then it works!!!
Also, in some very rare cases my string does not start with /share/attachments/docs/, and then I should not parse anything. That is not directly related to the issue, but it would be good to handle.
I am getting Exception that: No match found.
This is because image2.png does not match \d+. Use a more appropriate pattern such as .+, assuming that you want to extract image2.png.
Your regular expression will then be ^(.*)/share/attachments/docs/(.+)$
In the case of ABC Results for draw no 2888, the regexp ^(.*)ABC Results for draw no (\\d+)$ works because the String ends with several consecutive digits, whereas in the first case image2.png is a mix of letters, digits, and a dot, which is why no match was found.
Generally speaking, to avoid an IllegalStateException: No match found, first check the result of find(). If it returns true, the pattern was found in the input String:
if (m.find()) {
// The String matches with the pattern
String link = m.group(2);
System.out.println("Draw #: "+link);
} else {
System.out.println("Input value doesn't match with the pattern");
}
The regular expression \d+ (expressed as \\d+ inside a string literal) matches a run of one or more digits. Your example input does not have a corresponding digit run, so it is not matched. The regex metacharacter . matches any character (+/- newline, depending on regex options); it seems like that may be what you're really after.
Additionally, when you use Matcher.find() it is unnecessary for the pattern to match the whole string, so it is needless to include .* to match leading context. Furthermore, find() returns a value that tells you whether a match to the pattern was found. You generally want to use this return value, and in your particular case you can use it to reject those rare non-matching strings.
Maybe this is more what you want:
Pattern p = Pattern.compile("/share/attachments/docs/(.+)$");
Matcher m = p.matcher("/share/attachments/docs/image2.png");
String link;
if (m.find()) {
link = m.group(1);
System.out.println("Draw #: " + link);
} else {
link = null;
System.out.println("Draw #: (not found)");
}
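If the rare strings that do not start with /share/attachments/docs/ should be rejected outright, one option is to anchor the prefix with ^. This is only a sketch: the class name DocsPathExtractor, the method fileName, and the use of Optional are my own choices, not something from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class DocsPathExtractor {
    // ^ pins the prefix to the start of the input;
    // (.+) requires at least one character after the prefix.
    private static final Pattern P =
            Pattern.compile("^/share/attachments/docs/(.+)$");

    // Returns the part after the prefix, or an empty Optional
    // when the input does not start with the prefix.
    static Optional<String> fileName(String path) {
        Matcher m = P.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fileName("/share/attachments/docs/image2.png")); // Optional[image2.png]
        System.out.println(fileName("/tmp/image2.png"));                    // Optional.empty
    }
}
```

Without the ^, find() would also accept a string that merely contains the prefix somewhere in the middle.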
Java Code:
String imagesArrayResponse = xmlNode.getChildText("files");
Matcher m = Pattern.compile("path\":\"([^\"]*)").matcher(imagesArrayResponse);
while (m.find()) {
String path = m.group(0);
}
String:
[{"path":"upload\/files\/56727570aaa08922_0.png","dir":"files","name":"56727570aaa08922_0","original_name":"56727570aaa08922_0.png"}{"path":"upload\/files\/56727570aaa08922_0.png","dir":"files","name":"56727570aaa08922_0","original_name":"56727570aaa08922_0.png"}{"path":"upload\/files\/56727570aaa08922_0.png","dir":"files","name":"56727570aaa08922_0","original_name":"56727570aaa08922_0.png"}{"path":"upload\/files\/56727570aaa08922_0.png","dir":"files","name":"56727570aaa08922_0","original_name":"56727570aaa08922_0.png"}]
m.group returns
path":"upload\/files\/56727570aaa08922_0.png"
instead of captured value of path. Where I am wrong?
See the documentation of the group(int index) method.
When called with 0, it returns the entire match. Group 1 is the first capturing group.
To avoid this trap, you can use a named group with this syntax:
"path\":\"(?<mynamegroup>[^\"]*)"
javadoc:
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().
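Here is a small sketch of reading a named group with Matcher.group(String). The class name NamedGroupPaths and the shortened file names a.png and b.png are invented for the example; the pattern is the one from the question with the group named mynamegroup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class NamedGroupPaths {
    private static final Pattern P =
            Pattern.compile("path\":\"(?<mynamegroup>[^\"]*)");

    // Collects the value captured by the named group for every match.
    static List<String> paths(String response) {
        List<String> result = new ArrayList<>();
        Matcher m = P.matcher(response);
        while (m.find()) {
            result.add(m.group("mynamegroup"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same shape as the question's input.
        String response = "[{\"path\":\"upload\\/files\\/a.png\",\"dir\":\"files\"}"
                + "{\"path\":\"upload\\/files\\/b.png\",\"dir\":\"files\"}]";
        System.out.println(paths(response)); // [upload\/files\/a.png, upload\/files\/b.png]
    }
}
```

Note that the captured values still contain the \/ escapes from the input, because the regex only slices the text and does not decode it.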
m.group(1) will give you the captured value. If there is more than one capturing group (), use m.group(2), m.group(3), ...
By convention, AFAIK, in regex engines group 0 is always the whole matched string. Capturing groups start at 1.
Check out the grouping options in Matcher.
Matcher m =
Pattern.compile(
//<- (0) -> that's group(0)
// <-(1)-> that's group(1)
"path\":\"([^\"]*)").matcher(imagesArrayResponse);
Change your code to
while (m.find()) {
String path = m.group(1);
}
And you should be okay. This is also worth checking out: What is a non-capturing group? What does a question mark followed by a colon (?:) mean?
What would be the correct regular expression (that I can use in Java) if I want to extract a value from the string below?
<Name_id = bob>
I know that \<(.*?)\> will extract everything between the angle brackets but I only need to extract "bob".
The only part of the string that will change is "bob". I also want to make sure that if someone enters =bob as the Name_id, the extracted string is exactly that and the regular expression does not break.
Use capturing groups to capture the characters you want.
"<Name_id\\s+=\\s+([^>]+)>"
OR
"<Name_id\\s+=\\s+(\\w+)>"
Then read group 1. \s+ matches one or more whitespace characters and \w+ matches one or more word characters.
String i = "<Name_id = bob>";
Matcher m = Pattern.compile("<Name_id\\s+=\\s+([^>]+)>").matcher(i);
while(m.find())
{
System.out.println(m.group(1));
}
Output:
bob
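To see how the two patterns differ on the =bob case from the question, here is a quick sketch. The class name NameIdExtractor and the methods extractAny and extractWord are hypothetical, and I am assuming the input then looks like <Name_id = =bob>.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class NameIdExtractor {
    // Accepts any characters except '>' as the value.
    private static final Pattern ANY =
            Pattern.compile("<Name_id\\s+=\\s+([^>]+)>");
    // Accepts only word characters (letters, digits, underscore) as the value.
    private static final Pattern WORD =
            Pattern.compile("<Name_id\\s+=\\s+(\\w+)>");

    private static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null; // null means no match
    }

    static String extractAny(String input) {
        return firstGroup(ANY, input);
    }

    static String extractWord(String input) {
        return firstGroup(WORD, input);
    }

    public static void main(String[] args) {
        System.out.println(extractAny("<Name_id = bob>"));   // bob
        System.out.println(extractAny("<Name_id = =bob>"));  // =bob
        System.out.println(extractWord("<Name_id = =bob>")); // null
    }
}
```

So if values like =bob must be supported, the [^>]+ variant is the one to use, since = is not a word character.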
UPDATE: Thanks for all the great responses! I tried many different regex patterns but didn't understand why m.matches() was not doing what I think it should be doing. When I switched to m.find() instead, as well as adjusting the regex pattern, I was able to get somewhere.
I'd like to match a pattern in a Java string and then extract the portion matched using a regex (like Perl's $& operator).
This is my source string "s": DTSTART;TZID=America/Mexico_City:20121125T153000
I want to extract the portion "America/Mexico_City".
I thought I could use Pattern and Matcher and then extract using m.group() but it's not working as I expected. I've tried monkeying with different regex strings and the only thing that seems to hit on m.matches() is ".*TZID.*" which is pointless as it just returns the whole string. Could someone enlighten me?
Pattern p = Pattern.compile ("TZID*:"); // <- change to "TZID=([^:]*):"
Matcher m = p.matcher (s);
if (m.matches ()) // <- change to m.find()
Log.d (TAG, "looking at " + m.group ()); // <- change to m.group(1)
You use m.matches(), which tries to match the whole string. If you use m.find(), it searches for a match anywhere inside the string. I also tweaked your regexp to exclude the TZID= prefix using a zero-width lookbehind:
Pattern p = Pattern.compile("(?<=TZID=)[^:]+");
Matcher m = p.matcher ("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group());
}
This should work nicely:
Pattern p = Pattern.compile("TZID=(.*?):");
Matcher m = p.matcher(s);
if (m.find()) {
String zone = m.group(1); // group count is 1-based
. . .
}
An alternative regex is "TZID=([^:]*)". I'm not sure which is faster.
You are using the wrong pattern, try this:
Pattern p = Pattern.compile(".*?TZID=([^:]+):.*");
Matcher m = p.matcher (s);
if (m.matches ())
Log.d (TAG, "looking at " + m.group(1));
.*? matches anything at the beginning up to TZID=, then TZID= matches literally. The group then captures everything up to the :, after which : matches and .* matches the rest of the String. You can now get what you need from group(1).
You are missing a dot before the asterisk. As written, D* matches any number of uppercase Ds, so the expression never consumes the text between TZID and the colon.
Pattern p = Pattern.compile ("TZID[^:]*:");
You should also add a capturing group unless you want to capture everything, including the "TZID" and the ":"
Pattern p = Pattern.compile ("TZID=([^:]*):");
Finally, you should use the right API to search the string, rather than attempting to match the string in its entirety.
Pattern p = Pattern.compile("TZID=([^:]*):");
Matcher m = p.matcher("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group(1));
}
This prints
America/Mexico_City
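To make the difference between the two API calls concrete, here is a small sketch that runs the same pattern through matches() and find(). The class name MatchesVersusFind and its method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class MatchesVersusFind {
    private static final Pattern P = Pattern.compile("TZID=([^:]*):");

    // matches() succeeds only if the pattern covers the entire input.
    static boolean wholeStringMatches(String s) {
        return P.matcher(s).matches();
    }

    // find() looks for the pattern anywhere in the input.
    static String zone(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group(1) : null; // null means no TZID found
    }

    public static void main(String[] args) {
        String s = "DTSTART;TZID=America/Mexico_City:20121125T153000";
        System.out.println(wholeStringMatches(s)); // false
        System.out.println(zone(s));               // America/Mexico_City
    }
}
```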
Why not simply use split as:
String origStr = "DTSTART;TZID=America/Mexico_City:20121125T153000";
String str = origStr.split(":")[0].split("=")[1];
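One caveat with the split one-liner is that it throws an ArrayIndexOutOfBoundsException when the part before the first colon has no =. A guarded variant might look like the sketch below. The class name TzidSplit and the null return are my own assumptions, and like the one-liner it assumes the first = before the first colon introduces the zone.

```java
// Hypothetical helper class for illustration.
class TzidSplit {
    // Returns the text between the first '=' and the first ':',
    // or null when there is no '=' before the first ':'.
    static String zone(String s) {
        // Limit 2 keeps everything after the first separator in one piece.
        String beforeColon = s.split(":", 2)[0];
        String[] keyValue = beforeColon.split("=", 2);
        return keyValue.length == 2 ? keyValue[1] : null;
    }

    public static void main(String[] args) {
        System.out.println(zone("DTSTART;TZID=America/Mexico_City:20121125T153000")); // America/Mexico_City
        System.out.println(zone("DTSTART:20121125T153000"));                          // null
    }
}
```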