Java regex between strings multiple times on same line - java

I am trying to extract the values between '[::[' and ']::]'. The problem is that the same string contains multiple instances of this, and my code only picks up the first one. Can anyone help with my regex? Here is my code:
Sample input: line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
Pattern p = Pattern.compile("\\[::\\[(.*?)\\]::\\]");
Matcher m = p.matcher(line);
if (m.find()) {
System.out.println(m.group(1));
}

Your regex is OK. What you need to do is cycle through the matches; a Matcher can match several times!
while (m.find())
System.out.println(m.group(1));
Each call to find() starts searching at the end of the previous match. (The \G anchor, which forces the next match to begin exactly at that position, is a pretty special case.)
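As a minimal sketch of the loop described above, the following collects every captured value into a list. The class name BracketTokens and the helper extractAll are made up for illustration; only the pattern and the sample line come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketTokens {
    // Same pattern as in the question: lazily capture the text between [::[ and ]::]
    private static final Pattern TOKEN = Pattern.compile("\\[::\\[(.*?)\\]::\\]");

    // Hypothetical helper: returns every captured value, in order of appearance
    static List<String> extractAll(String line) {
        List<String> values = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {           // each call resumes after the previous match
            values.add(m.group(1));  // group(1) is the text inside the delimiters
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
        System.out.println(extractAll(line)); // [NAME, DATE, Last]
    }
}
```

Compiling the pattern once as a constant is a design choice for the sketch, not something the original answer requires.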

Related

How to pull double out of string with matcher

I'm trying to parse a double out of a string. I have the code:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
Matcher m = p.matcher("reciproc(2.00000000000)");
System.out.println(Double.parseDouble(m.group()));
This code throws a java.lang.IllegalStateException. I want the output to be 2.00000000000. I got the regex from "Java: Regex for Parsing Positive and Negative Doubles", where it seemed to work for them. I tried a few other regexes as well and they all threw the same error. Am I missing something here?
It's not a problem with your regex but in how you are using the Matcher class. You need to call find() first.
This should work:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
String text = "reciproc(2.00000000000)";
Matcher m = p.matcher(text);
if(m.find())
{
System.out.println(Double.parseDouble(text.substring(m.start(), m.end())));
}
Alternatively:
Pattern p = Pattern.compile("-?\\d+(\\.\\d+)?");
Matcher m = p.matcher("reciproc(2.00000000000)");
if(m.find())
{
System.out.println(Double.parseDouble(m.group()));
}
For more information, see the docs.
p.matcher("2.000000000000");
If you use matches(), the whole input passed to matcher() has to match the regex provided in Pattern.compile(); find() only needs a matching substring.
For more information on regex and patterns:
https://docs.oracle.com/javase/tutorial/essential/regex/
https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
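To make the point of the answers concrete, here is a small sketch that shows both behaviours: group() is rejected until a match has been attempted, and it works after a successful find(). The class FirstDouble and both helper methods are hypothetical names; the regex and input are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstDouble {
    // Regex from the question: optional sign, digits, optional fractional part
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(\\.\\d+)?");

    // Hypothetical helper: parses the first decimal number found in text
    static double firstDouble(String text) {
        Matcher m = NUMBER.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("no number in: " + text);
        }
        return Double.parseDouble(m.group());
    }

    // Returns true if group() throws because no match was attempted first
    static boolean groupFailsWithoutFind(String text) {
        Matcher m = NUMBER.matcher(text);
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstDouble("reciproc(2.00000000000)"));           // 2.0
        System.out.println(groupFailsWithoutFind("reciproc(2.00000000000)")); // true
    }
}
```

Note that Double.parseDouble yields the value 2.0; printing the original text 2.00000000000 would require printing m.group() itself.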

Finding a Digit using Pattern and Matcher

I am trying to use Pattern and Matcher to determine if a given string has a space between 2 digits. For example, "5 1" should come back as true and "51" should come back as false. At first I was using string.replaceAll with the regex and it worked great, but after moving to Pattern I can't seem to get it to work.
String findDigit = "5 1/3";
String regex = "(\\d) +(\\d)";
findDigit = findDigit.replaceAll(regex, "$1 $2");
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
I first started with this. The replaceAll works without a hitch and removes the extra spaces, but the m.matches and the m.hitEnd both return false. Then I thought I might be doing something wrong so I simplified the case to just
String findDigit = "5";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
and matches comes back true (obviously) but when I change it to this
String findDigit = "5 3";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
both come back false. So I guess my main question is: how do I determine that there is ANY digit in my string first, and then, more specifically, how do I determine if there is a digit, space, digit sequence in my string? I thought that was what hitEnd was for, but I guess I am mistaken. Thanks in advance.
If you're looking for a match with multiple spaces but would like to preserve the formatting of the output you could use groups and back-references.
For instance:
String input = "blah 5 6/7";
Pattern p = Pattern.compile("(\\d)\\s+(\\d)");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.printf("Whole match: %s\n\tFirst digit: %s\n\tSecond digit: %s\n", m.group(), m.group(1), m.group(2));
}
Output
Whole match: 5 6
First digit: 5
Second digit: 6
The answer is of course m.find(). Sorry for being stupid this morning. Thanks to all who even looked at this :)
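For reference, the two checks the question asks about can be sketched with find(), which looks for a matching substring, in contrast to matches(), which requires the entire input to match. The class DigitChecks and its method names are invented for this sketch.

```java
import java.util.regex.Pattern;

class DigitChecks {
    private static final Pattern ANY_DIGIT = Pattern.compile("\\d");
    private static final Pattern DIGIT_SPACE_DIGIT = Pattern.compile("\\d +\\d");

    // True if the string contains at least one digit anywhere
    static boolean hasDigit(String s) {
        return ANY_DIGIT.matcher(s).find();
    }

    // True if the string contains two digits separated by one or more spaces
    static boolean hasSpacedDigits(String s) {
        return DIGIT_SPACE_DIGIT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSpacedDigits("5 1"));   // true
        System.out.println(hasSpacedDigits("51"));    // false
        System.out.println(hasSpacedDigits("5 1/3")); // true
        System.out.println(hasDigit("5 3"));          // true
        // matches() needs the whole input to be a single digit, so this is false
        System.out.println(ANY_DIGIT.matcher("5 3").matches()); // false
    }
}
```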

Pattern/Matcher group() to obtain substring in Java?

UPDATE: Thanks for all the great responses! I tried many different regex patterns but didn't understand why m.matches() was not doing what I think it should be doing. When I switched to m.find() instead, as well as adjusting the regex pattern, I was able to get somewhere.
I'd like to match a pattern in a Java string and then extract the portion matched using a regex (like Perl's $& operator).
This is my source string "s": DTSTART;TZID=America/Mexico_City:20121125T153000
I want to extract the portion "America/Mexico_City".
I thought I could use Pattern and Matcher and then extract using m.group() but it's not working as I expected. I've tried monkeying with different regex strings and the only thing that seems to hit on m.matches() is ".*TZID.*" which is pointless as it just returns the whole string. Could someone enlighten me?
Pattern p = Pattern.compile ("TZID*:"); // <- change to "TZID=([^:]*):"
Matcher m = p.matcher (s);
if (m.matches ()) // <- change to m.find()
Log.d (TAG, "looking at " + m.group ()); // <- change to m.group(1)
You are using m.matches(), which tries to match the whole string; if you use m.find(), it searches for a match inside the string. I also improved your regexp a bit to exclude the TZID= prefix using a zero-width lookbehind:
Pattern p = Pattern.compile("(?<=TZID=)[^:]+");
Matcher m = p.matcher ("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group());
}
This should work nicely:
Pattern p = Pattern.compile("TZID=(.*?):");
Matcher m = p.matcher(s);
if (m.find()) {
String zone = m.group(1); // capturing groups are numbered from 1; group(0) is the whole match
. . .
}
An alternative regex is "TZID=([^:]*)". I'm not sure which is faster.
You are using the wrong pattern, try this:
Pattern p = Pattern.compile(".*?TZID=([^:]+):.*");
Matcher m = p.matcher (s);
if (m.matches ())
Log.d (TAG, "looking at " + m.group(1));
.*? matches anything at the beginning up to TZID=. Then TZID= matches, and a group begins and captures everything up to the colon. The group closes there, the : matches, and .* matches the rest of the String. You can then get what you need from group(1).
You are missing a dot before the asterisk. Your expression will match any number of uppercase Ds.
Pattern p = Pattern.compile ("TZID[^:]*:");
You should also add a capturing group unless you want to capture everything, including the "TZID" and the ":"
Pattern p = Pattern.compile ("TZID=([^:]*):");
Finally, you should use the right API to search the string, rather than attempting to match the string in its entirety.
Pattern p = Pattern.compile("TZID=([^:]*):");
Matcher m = p.matcher("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group(1));
}
This prints
America/Mexico_City
Why not simply use split as:
String origStr = "DTSTART;TZID=America/Mexico_City:20121125T153000";
String str = origStr.split(":")[0].split("=")[1];
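The find() plus capturing group approach from the answers above can be wrapped in a small helper that also handles input without a TZID parameter. This is only a sketch: the class TzidExtractor, the method extractTzid and the use of Optional are illustrative choices, not part of any answer. By comparison, the split one-liner throws an ArrayIndexOutOfBoundsException for a line such as "DTSTART:20121125T153000", because there is no "=" to split on.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TzidExtractor {
    // Pattern from the answers: capture everything between "TZID=" and the next colon
    private static final Pattern TZID = Pattern.compile("TZID=([^:]*):");

    // Hypothetical helper: returns the time zone id, or empty if the line has none
    static Optional<String> extractTzid(String s) {
        Matcher m = TZID.matcher(s);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String withZone = "DTSTART;TZID=America/Mexico_City:20121125T153000";
        String withoutZone = "DTSTART:20121125T153000"; // made-up line without TZID
        System.out.println(extractTzid(withZone).orElse("(none)"));    // America/Mexico_City
        System.out.println(extractTzid(withoutZone).orElse("(none)")); // (none)
    }
}
```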

Regular Expression strings in Java

I want to use a regular expression that extracts a substring with the following properties in Java:
Beginning of the substring begins with 'WWW'
The end of the substring is a colon ':'
I have some experience in SQL with using the Like clause such as:
Select field1 from A where field2 like '%[A-Z]'
So if I were using SQL I would code:
like '%WWW%:'
How can I start this in Java?
Pattern p = Pattern.compile("WWW.*:");
Matcher m = p.matcher("zxdfefefefWWW837eghdehgfh:djf");
while (m.find()){
System.out.println(m.group());
}
Here's a different example using substring.
public static void main(String[] args) {
String example = "http://www.google.com:80";
String substring = example.substring(example.indexOf("www"), example.lastIndexOf(":"));
System.out.println(substring);
}
If you want to match only word characters and ., you can use the regular expression "WWW[\\w.]+:"
Pattern p = Pattern.compile("WWW[\\w.]+:");
Matcher m = p.matcher("WWW.google.com:hello");
System.out.println(m.find()); //prints true
System.out.println(m.group()); // prints WWW.google.com:
If you want to match any character, you can use the regular expression "WWW[\\w\\W]+:"
Pattern p = Pattern.compile("WWW[\\w\\W]+:");
Matcher m = p.matcher("WWW.googgle_$#.com:hello");
System.out.println(m.find());
System.out.println(m.group());
Explanation: WWW and : are literals. \\w - any word character, i.e. a-z, A-Z, 0-9 and _. \\W - any non-word character. Together, [\\w\\W] matches any character at all.
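If I understood it right (note that inside a character class, (WWW) and | are literal characters rather than a group or an alternation, so the class below simply excludes the characters (, W, ), |, ^ and :):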
String input = "aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:";
Pattern p = Pattern.compile("WWW[^(WWW)|^:]*:");
Matcher m = p.matcher(input);
while(m.find()) {
System.out.println(m.group());
}
Output:
WWW:
WWWa:
WWW:
WWWaaa:
WWWa:
WWWabc:
WWW:
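One detail the answers above do not spell out is that "WWW.*:" is greedy, so when the input contains several colons it runs to the last one. The sketch below compares it with a reluctant quantifier and a negated character class on a made-up input; the class GreedyVsReluctant and the helper firstMatch are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns the first match of regex in input, or null if none
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xxWWWabc:def:ghi"; // made-up sample with two colons
        System.out.println(firstMatch("WWW.*:", input));    // WWWabc:def:  (greedy, last colon)
        System.out.println(firstMatch("WWW.*?:", input));   // WWWabc:      (reluctant, first colon)
        System.out.println(firstMatch("WWW[^:]*:", input)); // WWWabc:      (stops at first colon)
    }
}
```

Which of the three is right depends on whether the substring should end at the first colon or the last one, which the question does not say.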

Java regex doesn't find numbers

I'm trying to parse some text, but for some strange reason, Java regex doesn't work. For example, I've tried:
Pattern p = Pattern.compile("[A-Z][0-9]*,[0-9]*");
Matcher m = p.matcher("H3,4");
and it simply throws a "No match found" exception when I try to get the numbers with m.group(1) and m.group(2). Am I missing something about how Java regex works?
Yes, two things:
1. You must actually call matches() or find() on the matcher first.
2. Your regex must actually contain capturing groups.
Example:
Pattern p = Pattern.compile("[A-Z](\\d*),(\\d*)");
Matcher m = p.matcher("H3,4");
if (m.matches()) {
// use m.group(1), m.group(2) here
}
You also need the parentheses to specify what is part of each group. I changed the leading part to be anything that's not a digit, 0 or more times. Each group contains 1 or more digits, so I used + instead of *.
Pattern p = Pattern.compile("[^0-9]*([0-9]+),([0-9]+)");
Matcher m = p.matcher("H3,4");
if (m.matches())
{
String g1 = m.group(1);
String g2 = m.group(2);
}
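Putting both points together (attempt the match first, and use capturing groups), a complete sketch might look like this. The class CellRef and the method parse are made-up names, and returning the two numbers as an int array is just one possible choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CellRef {
    // One uppercase letter, then two groups of digits separated by a comma
    private static final Pattern CELL = Pattern.compile("[A-Z](\\d+),(\\d+)");

    // Hypothetical helper: returns the two numbers, or throws if the input has another shape
    static int[] parse(String s) {
        Matcher m = CELL.matcher(s);
        if (!m.matches()) { // group() is only valid after a successful match
            throw new IllegalArgumentException("unexpected format: " + s);
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] numbers = parse("H3,4");
        System.out.println(numbers[0] + " and " + numbers[1]); // 3 and 4
    }
}
```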
