Finding a Digit using Pattern and Matcher - java

I am trying to use Pattern and Matcher to determine if a given string has a space between 2 digits. For example "5 1" should come back as true, "51" should come back as false. At first I was using string.replaceAll with the regex and it worked great, but after moving to Pattern I can't seem to get it to work.
String findDigit = "5 1/3";
String regex = "(\\d) +(\\d)";
findDigit = findDigit.replaceAll(regex, "$1 $2");
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
I first started with this. The replaceAll works without a hitch and removes the extra spaces, but the m.matches and the m.hitEnd both return false. Then I thought I might be doing something wrong so I simplified the case to just
String findDigit = "5";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
and matches comes back true (obviously) but when I change it to this
String findDigit = "5 3";
String regex = "\\d";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(findDigit);
System.out.println(m.matches());
System.out.println(m.hitEnd());
both come back false. So I guess my main question is: how do I determine that there is ANY digit in my string first, and then, more specifically, how do I determine if there is a digit, space, digit sequence in my string? I thought that was what hitEnd was for, but I guess I am mistaken. Thanks in advance.

If you're looking for a match with multiple spaces but would like to preserve the formatting of the output you could use groups and back-references.
For instance:
String input = "blah 5 6/7";
Pattern p = Pattern.compile("(\\d)\\s+(\\d)");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.printf("Whole match: %s\n\tFirst digit: %s\n\tSecond digit: %s\n", m.group(), m.group(1), m.group(2));
}
Output
Whole match: 5 6
First digit: 5
Second digit: 6

The answer is of course m.find(). Sorry for being stupid this morning. Thanks to all who even looked at this :)
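To make the difference concrete, here is a minimal sketch (the class and method names are made up for illustration). matches() succeeds only if the whole input fits the pattern, while find() looks for the pattern anywhere in the input, which is what both checks in the question need.

```java
import java.util.regex.Pattern;

public class DigitSpaceDigit {
    // A digit, one or more spaces, then another digit.
    private static final Pattern SPACED_DIGITS = Pattern.compile("\\d +\\d");
    // Any single digit.
    private static final Pattern ANY_DIGIT = Pattern.compile("\\d");

    // find() scans the input for the pattern instead of requiring a full match.
    public static boolean hasSpacedDigits(String s) {
        return SPACED_DIGITS.matcher(s).find();
    }

    public static boolean hasAnyDigit(String s) {
        return ANY_DIGIT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSpacedDigits("5 1/3")); // true
        System.out.println(hasSpacedDigits("51"));    // false
        System.out.println(hasAnyDigit("5 3"));       // true
        // matches() must consume the entire input, so a single \d fails on "5 3".
        System.out.println(ANY_DIGIT.matcher("5 3").matches()); // false
    }
}
```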

Related

Extracting a string using Regex

I have the following code to extract the string within double quotes using Regex.
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
The output I get now is Java programming. But from the String str I want only the content in the second pair of double quotes, which is programming. Can anyone tell me how to do that using regex?
If you take your example, and change it slightly to:
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
int i = 0;
while(matcher.find()){
System.out.println("match " + ++i + ": " + matcher.group(1));
}
You should find that it prints:
match 1: Java
match 2: programming
This shows that you are able to loop over all of the matches. If you only want the last match, then you have a number of options:
Store the match in the loop, and when the loop is finished, you have the last match.
Change the regex to ignore everything until your pattern, with something like: Pattern.compile(".*\"([^\"]*)\"")
If you really want explicitly the second match, then the simplest solution is something like Pattern.compile("\"([^\"]*)\"[^\"]*\"([^\"]*)\""). This gives two matching groups.
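The first option (remember the match inside the loop) could look roughly like this. It is only a sketch; the class name LastQuoted and the method lastQuoted are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastQuoted {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the contents of the last quoted token, or null if there is none.
    public static String lastQuoted(String s) {
        Matcher m = QUOTED.matcher(s);
        String last = null;
        while (m.find()) {
            last = m.group(1); // overwritten on every match, so the final value wins
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(lastQuoted("\"Java\",\"programming\"")); // programming
    }
}
```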
If you want the last token inside double quotes, add an end-of-line anchor ($):
final Pattern pattern = Pattern.compile("\"([^\"]*)\"$");
In this case, you can replace while with if if your input is a single line.
Great answer from Paul. You can also try this pattern:
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
Java program
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
Explanation
,\": matches a comma, followed by a quotation mark "
(\\w+): matches one or more word characters (letters, digits or underscore)
\": matches the last quotation mark "
Then the group (\\w+) is captured (group 1, to be precise)
Output
programming

Java regex between strings multiple times on same line

I am trying to extract the values between '[::[' and ']::]'. The problem I am having is that there are multiple instances of this in the same string and it is only picking up the first one. Any help with my regex? Here is my code:
Sample input: line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
Pattern p = Pattern.compile("\\[::\\[(.*?)\\]::\\]");
Matcher m = p.matcher(line);
if (m.find()) {
System.out.println(m.group(1));
}
Your regex is OK. What you need to do is cycle through the matches: a Matcher can match several times!
while (m.find())
System.out.println(m.group(1));
Each call to find() resumes from the end of the previous match (the \G anchor, which matches only at that position, is a pretty special case).
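If you would rather collect the values than print them, something along these lines should work. This is a sketch only; TokenExtractor and extractTokens are names I made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenExtractor {
    // Same regex as in the question: lazily capture whatever sits between [::[ and ]::].
    private static final Pattern TOKEN = Pattern.compile("\\[::\\[(.*?)\\]::\\]");

    public static List<String> extractTokens(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {          // each find() continues after the previous match
            tokens.add(m.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "TEST [::[NAME]::] HERE IS SOMETHING [::[DATE]::] WITH SOME MORE [::[Last]::]";
        System.out.println(extractTokens(line)); // [NAME, DATE, Last]
    }
}
```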

Get number in string with a regexp

I'm trying to get numbers from a String. Numbers only separated by space.
This code works in a lot of cases, except when two numbers are separated by only one space.
Pattern pattern = Pattern.compile("(^|\\s)[0-9]+(\\s|$)");
Matcher matcher = pattern.matcher(value);
while (matcher.find()) {
numericResquest.add(Integer.parseInt(matcher.group().trim()));
}
For example:
OK: 11 rre 12
OK: 11 12 (two spaces between the numbers)
Can't find 11 test 12 11 rre (only one space between the numbers)
Thank you
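Why not just match the digits in your regex with lookarounds? A negative lookbehind (no non-space character before the number) also covers a number at the very start of the string:
Pattern pattern = Pattern.compile("(?<!\\S)[0-9]+(?=\\s|$)");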
Matcher matcher = pattern.matcher(value);
while (matcher.find()) {
numericResquest.add(Integer.parseInt(matcher.group()));
}
String[] numbers = value.split(" ");
for(String number : numbers) {
numericRequest.add(Integer.parseInt(number));
}
If you get a NumberFormatException, the input was not formatted correctly.
The problem is that, not counting cases at the beginning or end of the string, your pattern requires a space both before and after the number. So if your string is "11 12", the first match will find "11 ", with a space at the end. The matcher's index will point after the pattern, i.e. to "12". But since the pattern also requires a space at the beginning, the next attempt to match won't work, because there's no space at the beginning of "12".
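To see the effect described above, here is a small comparison sketch. The class and constant names are hypothetical, and the second pattern is just one possible zero-width alternative: its lookarounds check for the surrounding whitespace without consuming it, so neighbouring numbers can share a single space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpaceSeparatedNumbers {
    // Pattern from the question: the surrounding whitespace is part of the match.
    public static final Pattern CONSUMING = Pattern.compile("(^|\\s)[0-9]+(\\s|$)");
    // Zero-width alternative: no non-space character directly before or after the digits.
    public static final Pattern LOOKAROUND = Pattern.compile("(?<!\\S)[0-9]+(?!\\S)");

    public static List<Integer> extract(Pattern p, String value) {
        List<Integer> out = new ArrayList<>();
        Matcher m = p.matcher(value);
        while (m.find()) {
            out.add(Integer.parseInt(m.group().trim()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(extract(CONSUMING, "11 12"));              // [11]
        System.out.println(extract(LOOKAROUND, "11 12"));             // [11, 12]
        System.out.println(extract(LOOKAROUND, "11 test 12 11 rre")); // [11, 12, 11]
    }
}
```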
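One way to solve this while using the same approach: use matcher.lookingAt instead of matcher.find. lookingAt matches only if the pattern occurs at the very start of the matcher's region, and it does not advance on its own, so after each match you have to move the start of the region to the end of that match with matcher.region. Then you can fix your pattern so that there doesn't have to be a space at the beginning. Note that this loop stops at the first token that is not a number.
Pattern pattern = Pattern.compile("\\s*[0-9]+(\\s|$)");
Matcher matcher = pattern.matcher(value);
while (matcher.lookingAt()) {
numericResquest.add(Integer.parseInt(matcher.group().trim()));
// lookingAt always starts at the region start, so move the region past this match
matcher.region(matcher.end(), matcher.regionEnd());
}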
This allows, but doesn't require, any number of spaces at the current index before the number occurs.
(Note: I haven't tested this.)
Regex
(^|(?<=\s))(\d+)(?=\s|$)
Iterate over all matches and capturing groups in a string
try {
Pattern regex = Pattern.compile("(^|(?<=\\s))(\\d+)(?=\\s|$)", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
for (int i = 1; i <= regexMatcher.groupCount(); i++) {
// matched text: regexMatcher.group(i)
// match start: regexMatcher.start(i)
// match end: regexMatcher.end(i)
}
}
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
}

Java Regex for changing every ith index in every word of a string

I've written a regex \b\S\w(\S(?=.)) to find every third symbol in a word and replace it with '1'. Now I'm trying to use this expression but really don't know how to do it right.
Pattern pattern = Pattern.compile("\\b\\S\\w(\\S(?=.))");
Matcher matcher = pattern.matcher("lemon apple strawberry pumpkin");
while (matcher.find()) {
System.out.print(matcher.group(1) + " ");
}
So result is:
m p r m
And how can I use this to make a string like this
le1on ap1le st1awberry pu1pkin
You could use something like this:
"lemon apple strawberry pumpkin".replaceAll("(?<=\\b\\S{2})\\S", "1")
This produces your example output. The regex replaces any non-space character that is preceded by exactly two non-space characters which themselves start at a word boundary.
This means that "words" like 12345 would be changed into 12145 since 3 is matched by \\S (not space).
Edit:
Updated the regex to better cater to the revised question title. Change 2 to i-1 to replace the ith letter of the word.
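Generalised to any i, that could be wrapped up roughly as follows. This is a sketch under the assumption that the words are plain, space-separated tokens (punctuation inside a word adds extra word boundaries and can give surprising results); the class name NthCharReplacer and the method replaceIth are hypothetical.

```java
import java.util.regex.Matcher;

public class NthCharReplacer {
    // Replaces the i-th (1-based) non-space character of every word with the given character.
    public static String replaceIth(String text, int i, char replacement) {
        if (i < 1) {
            throw new IllegalArgumentException("i must be at least 1");
        }
        // Lookbehind: a word boundary followed by exactly i-1 non-space characters.
        String regex = "(?<=\\b\\S{" + (i - 1) + "})\\S";
        // quoteReplacement keeps characters such as '$' from being read as group references.
        return text.replaceAll(regex, Matcher.quoteReplacement(String.valueOf(replacement)));
    }

    public static void main(String[] args) {
        // Prints: le1on ap1le st1awberry pu1pkin
        System.out.println(replaceIth("lemon apple strawberry pumpkin", 3, '1'));
    }
}
```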
There is another way: use the match index reported by the matcher and write into a char array.
Like this:
Pattern pattern = Pattern.compile("\\b\\S\\w(\\S(?=.))");
String string = "lemon apple strawberry pumpkin";
char[] c = string.toCharArray();
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
c[matcher.end() - 1] = '1'; // maybe not perfect, but useful if you want the index at which the string matches the pattern
}
System.out.println(c);

Java regex doesn't find numbers

I'm trying to parse some text, but for some strange reason, Java regex doesn't work. For example, I've tried:
Pattern p = Pattern.compile("[A-Z][0-9]*,[0-9]*");
Matcher m = p.matcher("H3,4");
and it simply throws a "No match found" exception (an IllegalStateException) when I try to get the numbers with m.group(1) and m.group(2). Am I missing something about how Java regex works?
Yes.
You must actually call matches() or find() on the matcher first.
Your regex must actually contain capturing groups
Example:
Pattern p = Pattern.compile("[A-Z](\\d*),(\\d*)");
Matcher m = p.matcher("H3,4");
if (m.matches()) {
// use m.group(1), m.group(2) here
}
You also need the parentheses to specify what is part of each group. I changed the leading part to be anything that's not a digit, 0 or more times. What's in each group is 1 or more digits, so + instead of *.
Pattern p = Pattern.compile("[^0-9]*([0-9]+),([0-9]+)");
Matcher m = p.matcher("H3,4");
if (m.matches())
{
String g1 = m.group(1);
String g2 = m.group(2);
}
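Putting both points together, here is a small sketch (CoordinateParser and its method names are invented for illustration). It shows that group() throws IllegalStateException until a match has been attempted, and that the captured groups are available once matches() has returned true.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoordinateParser {
    // One uppercase letter, then two comma-separated numbers, each in its own capturing group.
    private static final Pattern COORD = Pattern.compile("[A-Z](\\d+),(\\d+)");

    // Returns the two numbers, or null when the input does not match.
    public static int[] parse(String s) {
        Matcher m = COORD.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // True if calling group() before any match attempt throws IllegalStateException.
    public static boolean groupFailsBeforeMatch(String s) {
        Matcher m = COORD.matcher(s);
        try {
            m.group(1); // no matches()/find() call yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(groupFailsBeforeMatch("H3,4")); // true
        int[] nums = parse("H3,4");
        System.out.println(nums[0] + " and " + nums[1]);   // 3 and 4
    }
}
```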
