I am trying to extract the values between '[::[' and ']::]'. The problem is that the string contains multiple instances of this and my code only picks up the first one. Any help with my regex? Here is my code:
Sample input: line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
Pattern p = Pattern.compile("\\[::\\[(.*?)\\]::\\]");
Matcher m = p.matcher(line);
if (m.find()) {
System.out.println(m.group(1));
}
Your regex is OK. What you need to do is cycle through the matches; a Matcher can match several times!
while (m.find()) {
    System.out.println(m.group(1));
}
A Matcher will try again from the end of the last match (unless you use \G, but that's pretty special).
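As a minimal, self-contained sketch of the loop above: the class name TagExtractor and the helper extractTags are made up for illustration; the pattern and sample input are the ones from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExtractor {
    // Same pattern as in the question: a reluctant group between "[::[" and "]::]".
    private static final Pattern TAG = Pattern.compile("\\[::\\[(.*?)\\]::\\]");

    // Hypothetical helper: returns the contents of every [::[...]::] tag, in order.
    static List<String> extractTags(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {          // each call resumes after the previous match
            tags.add(m.group(1));   // group(1) is the text inside the delimiters
        }
        return tags;
    }

    public static void main(String[] args) {
        String line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
        System.out.println(extractTags(line)); // prints [NAME, DATE, Last]
    }
}
```

Collecting into a list rather than printing inside the loop is just one design choice; it makes the result easy to reuse or test.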
I'm trying to parse a double out of a string. I have the code:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
Matcher m = p.matcher("reciproc(2.00000000000)");
System.out.println(Double.parseDouble(m.group()));
This code throws a java.lang.IllegalStateException. I want the output to be 2.00000000000. I got the regex from Java: Regex for Parsing Positive and Negative Doubles where it seemed to work for them. I tried a few other regexes as well and they all threw the same error. Am I missing something here?
It's not a problem with your regex but in how you are using the Matcher class. You need to call find() first.
This should work:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
String text = "reciproc(2.00000000000)";
Matcher m = p.matcher(text);
if(m.find())
{
System.out.println(Double.parseDouble(text.substring(m.start(), m.end())));
}
Alternatively:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
Matcher m = p.matcher("reciproc(2.00000000000)");
if(m.find())
{
System.out.println(Double.parseDouble(m.group()));
}
For more information, see the docs.
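To illustrate the point, here is a small sketch that shows both the failure and the fix side by side. A Matcher has no match to report until find(), matches(), or lookingAt() has succeeded, so calling group() before that throws IllegalStateException. The class FirstDouble and the helper firstDouble are hypothetical names, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstDouble {
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(\\.\\d+)?");

    // Hypothetical helper: returns the first number found in text, or null if there is none.
    static Double firstDouble(String text) {
        Matcher m = NUMBER.matcher(text);
        return m.find() ? Double.valueOf(m.group()) : null;
    }

    public static void main(String[] args) {
        Matcher m = NUMBER.matcher("reciproc(2.00000000000)");
        try {
            m.group(); // no find()/matches() yet, so there is no match to report
        } catch (IllegalStateException e) {
            System.out.println("group() before find(): " + e.getMessage());
        }
        System.out.println(firstDouble("reciproc(2.00000000000)")); // prints 2.0
    }
}
```

Note that Double.parseDouble and Double.valueOf yield the value 2.0; printing it with all the original trailing zeros would need separate formatting, or simply printing m.group() as a string.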
p.matcher("2.000000000000");
Your pattern should match the regex provided in Pattern.compile()
For more information on regex and patterns:
https://docs.oracle.com/javase/tutorial/essential/regex/
https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
I am trying to use Pattern and Matcher to determine if a given string has a space between 2 digits. For example "5 1" should come back as true, "51" should come back as false. At first I was using string.replaceAll with the regex and it worked great, but after moving to Pattern I can't seem to get it to work.
String findDigit = "5 1/3";
String regex = "(\\d) +(\\d)";
findDigit = findDigit.replaceAll(regex, "$1 $2");
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
I first started with this. The replaceAll works without a hitch and removes the extra spaces, but the m.matches and the m.hitEnd both return false. Then I thought I might be doing something wrong so I simplified the case to just
String findDigit = "5";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
and matches comes back true (obviously) but when I change it to this
String findDigit = "5 3";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
both come back false. So I guess my main question is: how do I determine that there is ANY digit in my string first and then, more specifically, how do I determine if there is a digit-space-digit sequence in my string? I thought that was what hitEnd was for, but I guess I am mistaken. Thanks in advance.
If you're looking for a match with multiple spaces but would like to preserve the formatting of the output you could use groups and back-references.
For instance:
String input = "blah 5 6/7";
Pattern p = Pattern.compile("(\\d)\\s+(\\d)");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.printf("Whole match: %s\n\tFirst digit: %s\n\tSecond digit: %s\n", m.group(), m.group(1), m.group(2));
}
Output
Whole match: 5 6
First digit: 5
Second digit: 6
The answer is of course m.find(). Sorry for being stupid this morning. Thanks to all who even looked at this :)
UPDATE: Thanks for all the great responses! I tried many different regex patterns but didn't understand why m.matches() was not doing what I think it should be doing. When I switched to m.find() instead, as well as adjusting the regex pattern, I was able to get somewhere.
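For anyone hitting the same confusion, a short sketch of how the three Matcher entry points differ on the question's input: matches() requires the whole string to match, lookingAt() requires a match starting at the beginning, and find() looks for a match anywhere. The class MatchModes and the helper hasSpacedDigits are illustrative names only.

```java
import java.util.regex.Pattern;

class MatchModes {
    private static final Pattern DIGIT_SPACE_DIGIT = Pattern.compile("\\d +\\d");

    // True if a digit, one or more spaces, and another digit occur anywhere in s.
    static boolean hasSpacedDigits(String s) {
        return DIGIT_SPACE_DIGIT.matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "5 1/3";
        System.out.println(DIGIT_SPACE_DIGIT.matcher(s).matches());   // false: "/3" is left over
        System.out.println(DIGIT_SPACE_DIGIT.matcher(s).lookingAt()); // true: "5 1" matches at the start
        System.out.println(DIGIT_SPACE_DIGIT.matcher(s).find());      // true: "5 1" occurs in the string
        System.out.println(hasSpacedDigits("51"));                    // false: no space between the digits
    }
}
```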
I'd like to match a pattern in a Java string and then extract the portion matched using a regex (like Perl's $& operator).
This is my source string "s": DTSTART;TZID=America/Mexico_City:20121125T153000
I want to extract the portion "America/Mexico_City".
I thought I could use Pattern and Matcher and then extract using m.group() but it's not working as I expected. I've tried monkeying with different regex strings and the only thing that seems to hit on m.matches() is ".*TZID.*" which is pointless as it just returns the whole string. Could someone enlighten me?
Pattern p = Pattern.compile ("TZID*:"); // <- change to "TZID=([^:]*):"
Matcher m = p.matcher (s);
if (m.matches ()) // <- change to m.find()
Log.d (TAG, "looking at " + m.group ()); // <- change to m.group(1)
You are using m.matches(), which tries to match the whole string. If you use m.find() instead, it will search for a match inside the string. I also improved your regexp a bit to exclude the TZID prefix using a zero-width lookbehind:
Pattern p = Pattern.compile("(?<=TZID=)[^:]+"); // the match starts right after "TZID="
Matcher m = p.matcher ("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group());
}
This should work nicely:
Pattern p = Pattern.compile("TZID=(.*?):");
Matcher m = p.matcher(s);
if (m.find()) {
String zone = m.group(1); // capturing groups are numbered from 1; group(0) is the whole match
. . .
}
An alternative regex is "TZID=([^:]*)". I'm not sure which is faster.
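Leaving speed aside (nothing here measures it), a quick sketch can at least check that the two patterns extract the same value from the sample string. The class TimeZoneId and the helper extract are hypothetical names introduced for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TimeZoneId {
    static final Pattern RELUCTANT = Pattern.compile("TZID=(.*?):");
    static final Pattern NEGATED = Pattern.compile("TZID=([^:]*)");

    // Hypothetical helper: returns group(1) of the first match, or null if there is no match.
    static String extract(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "DTSTART;TZID=America/Mexico_City:20121125T153000";
        System.out.println(extract(RELUCTANT, s)); // prints America/Mexico_City
        System.out.println(extract(NEGATED, s));   // prints America/Mexico_City
    }
}
```

One behavioral difference worth knowing: if no colon follows the value (for example "TZID=UTC"), the reluctant pattern finds nothing because it requires the trailing colon, while the negated-class pattern still captures "UTC".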
You are using the wrong pattern, try this:
Pattern p = Pattern.compile(".*?TZID=([^:]+):.*");
Matcher m = p.matcher (s);
if (m.matches ())
Log.d (TAG, "looking at " + m.group(1));
.*? matches anything at the beginning up to TZID=. Then TZID= matches literally, and a group opens and captures everything up to the colon. The group closes there, : matches the colon, and .* matches the rest of the string. You can then read what you need from group(1).
You are missing a dot before the asterisk. Your expression will match any number of uppercase Ds.
Pattern p = Pattern.compile ("TZID[^:]*:");
You should also add a capturing group unless you want to capture everything, including the "TZID" and the ":"
Pattern p = Pattern.compile ("TZID=([^:]*):");
Finally, you should use the right API to search the string, rather than attempting to match the string in its entirety.
Pattern p = Pattern.compile("TZID=([^:]*):");
Matcher m = p.matcher("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group(1));
}
This prints
America/Mexico_City
Why not simply use split as:
String origStr = "DTSTART;TZID=America/Mexico_City:20121125T153000";
String str = origStr.split(":")[0].split("=")[1];
I want to use a regular expression that extracts a substring with the following properties in Java:
Beginning of the substring begins with 'WWW'
The end of the substring is a colon ':'
I have some experience in SQL with using the Like clause such as:
Select field1 from A where field2 like '%[A-Z]'
So if I were using SQL I would code:
like '%WWW%:'
How can I start this in Java?
Pattern p = Pattern.compile("WWW.*:");
Matcher m = p.matcher("zxdfefefefWWW837eghdehgfh:djf");
while (m.find()){
System.out.println(m.group());
}
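One thing to keep in mind with "WWW.*:" is that .* is greedy, so when the input contains more than one colon the match runs to the last one. If the substring should stop at the first colon, the reluctant form .*? is an option. The sketch below compares the two; the class GreedyVsReluctant, the helper firstMatch, and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xxWWWabc:def:ghi";
        System.out.println(firstMatch("WWW.*:", input));  // prints WWWabc:def:  (up to the last colon)
        System.out.println(firstMatch("WWW.*?:", input)); // prints WWWabc:      (up to the first colon)
    }
}
```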
Here's a different example using substring.
public static void main(String[] args) {
String example = "http://www.google.com:80";
String substring = example.substring(example.indexOf("www"), example.lastIndexOf(":"));
System.out.println(substring);
}
If you want to match only word characters and '.', you can use the regular expression "WWW[\\w.]+:"
Pattern p = Pattern.compile("WWW[\\w.]+:");
Matcher m = p.matcher("WWW.google.com:hello");
System.out.println(m.find()); //prints true
System.out.println(m.group()); // prints WWW.google.com:
If you want to match any character, you can use the regular expression "WWW[\\w\\W]+:"
Pattern p = Pattern.compile("WWW[\\w\\W]+:");
Matcher m = p.matcher("WWW.googgle_$#.com:hello");
System.out.println(m.find());
System.out.println(m.group());
Explanation: WWW and : are literals. \\w matches any word character, i.e. a-z, A-Z, 0-9 and the underscore. \\W matches any non-word character.
If I understood it right
String input = "aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:";
Pattern p = Pattern.compile("WWW[^(WWW)|^:]*:");
Matcher m = p.matcher(input);
while(m.find()) {
System.out.println(m.group());
}
Output:
WWW:
WWWa:
WWW:
WWWaaa:
WWWa:
WWWabc:
WWW:
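A caveat about the pattern above: inside square brackets, (WWW)|^: is not read as an alternation. [^(WWW)|^:] is a negated character class that excludes the single characters (, W, ), |, ^ and :. It produces the output shown for this input, but it also rejects any lone W or parenthesis between the marker and the colon. If the intent is "anything up to the colon that does not start a new WWW", a negative lookahead is one way to express it. The sketch below contrasts the two; the class MarkerMatch, the helper firstMatch, and the test inputs are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerMatch {
    // The pattern from the answer above: a negated class of single characters.
    static final String CHAR_CLASS = "WWW[^(WWW)|^:]*:";
    // Assumed alternative: each consumed character must not be ':' and must not begin "WWW".
    static final String LOOKAHEAD = "WWW(?:(?!WWW)[^:])*:";

    // Hypothetical helper: returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(CHAR_CLASS, "WWWaWb:")); // prints null: the lone W is excluded
        System.out.println(firstMatch(LOOKAHEAD, "WWWaWb:"));  // prints WWWaWb:
        System.out.println(firstMatch(CHAR_CLASS, "WWW(a):")); // prints null: '(' is excluded
        System.out.println(firstMatch(LOOKAHEAD, "WWW(a):"));  // prints WWW(a):
    }
}
```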
I'm trying to parse some text, but for some strange reason, Java regex doesn't work. For example, I've tried:
Pattern p = Pattern.compile("[A-Z][0-9]*,[0-9]*");
Matcher m = p.matcher("H3,4");
and it simply gives No match found exception, when I try to get the numbers m.group(1) and m.group(2). Am I missing something about how Java regex works?
Yes.
You must actually call matches() or find() on the matcher first.
Your regex must actually contain capturing groups.
Example:
Pattern p = Pattern.compile("[A-Z](\\d*),(\\d*)");
Matcher m = p.matcher("H3,4");
if (m.matches()) {
// use m.group(1), m.group(2) here
}
You also need the parentheses to specify what is part of each group. I changed the leading part to be anything that's not a digit, 0 or more times. What's in each group is 1 or more digits, so + instead of *.
Pattern p = Pattern.compile("[^0-9]*([0-9]+),([0-9]+)");
Matcher m = p.matcher("H3,4");
if (m.matches())
{
String g1 = m.group(1);
String g2 = m.group(2);
}
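Putting the two points together (call matches() first, and use capturing groups), a minimal sketch that turns "H3,4" into two ints might look like this. The class Coordinates and the method parse are hypothetical, and the pattern assumes exactly one uppercase letter followed by two comma-separated numbers.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Coordinates {
    // One uppercase letter, then two groups of one or more digits separated by a comma.
    private static final Pattern COORD = Pattern.compile("[A-Z](\\d+),(\\d+)");

    // Hypothetical helper: returns the two numbers, or throws if the input has another shape.
    static int[] parse(String s) {
        Matcher m = COORD.matcher(s);
        if (!m.matches()) { // must succeed before group(n) may be called
            throw new IllegalArgumentException("Expected letter followed by number,number: " + s);
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("H3,4"))); // prints [3, 4]
    }
}
```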