Silly RegEx issue. What am I doing wrong? - java

String url = "hello world";
String p = "world";
Pattern pattern = Pattern.compile(p);
Matcher matcher = pattern.matcher(url);
if (matcher.matches()) {
int start = matcher.start();
int end = matcher.end();
}
What am I doing wrong? Why is the body of the if statement never reached?

The matches() method attempts to match the entire string to the pattern. You want the find() method.

Try Matcher.find(). Matcher.matches() checks whether the whole string matches the pattern.

You need to use find(), because matches() tries to match the pattern against the entire string.
It behaves as if there were a ^ at the start and a $ at the end of your pattern,
so your pattern is effectively "^world$".

Try to change your pattern to ".*world.*":
String p = ".*world.*";
That way it'll match any string that contains "world".
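To make the difference concrete, here is a minimal sketch using the strings from the question. The class MatchVsFind and its two helper methods are made-up names for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class MatchVsFind {

    // True only if the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Returns {start, end} of the first match anywhere in the input, or null if there is none.
    static int[] firstSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("world", "hello world"));     // false
        System.out.println(matchesWhole(".*world.*", "hello world")); // true
        int[] span = firstSpan("world", "hello world");
        System.out.println(span[0] + ".." + span[1]);                 // 6..11
    }
}
```

Note that start() and end() are only valid after a successful matches() or find(), which is why the helper checks the result of find() first.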

I experienced the same problem.
I don't know the reason.
If someone knows the cause, please post it here.
I solved it by calling find() repeatedly instead of matches().

Related

Pattern (string) allows characters only one time

I want to check whether my string contains only allowed characters. It works for inputs such as 7B, 77B or 7BBBB, but inputs like 7B7 or 7BB2 do not match.
Everything works fine, except that the match fails when a digit is the last character.
Could you tell me what is wrong with this code?
pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}
If you want to mix digits and letters in any order, you need something like:
Pattern pattern = Pattern.compile("[\\da-fA-F]*");
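A small sketch of why the original pattern rejects 7B7: three character classes in a row describe a fixed order (digits, then a-f, then A-F), whereas a single class accepts any mix. HexCheck and its method names are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class HexCheck {
    // Pattern from the question: digits, then a-f, then A-F, in that fixed order.
    static final Pattern SEQUENCE = Pattern.compile("[0-9]*[a-f]*[A-F]*");
    // Single character class: any mix of digits and hex letters.
    static final Pattern ANY_ORDER = Pattern.compile("[\\da-fA-F]*");

    static boolean sequenceMatches(String s) {
        return SEQUENCE.matcher(s).matches();
    }

    static boolean anyOrderMatches(String s) {
        return ANY_ORDER.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "7B", "7BBBB", "7B7", "7BB2" }) {
            System.out.println(s + " sequence=" + sequenceMatches(s)
                    + " anyOrder=" + anyOrderMatches(s));
        }
    }
}
```

Because both patterns use *, they also accept the empty string; if that is not wanted, + instead of * would be one option.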
Why not try it this way?
// Compile this pattern.
Pattern pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*[0-9]*");
// See if this String matches.
Matcher m = pattern.matcher("num123");
if (m.matches()) {
System.out.println(true);
}
Are you trying to verify that the string only has digits and letters and nothing else?
If so try using the following:
pattern = Pattern.compile("^[a-zA-Z\\d]*$");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}

Matching everything after the first comma in a string

I am using java to do a regular expression match. I am using rubular to verify the match and ideone to test my code.
I got a regex from this SO solution, and it matches the group as I want in rubular, but my Java implementation does not match. When it prints value, it prints the contents of commaSeparatedString rather than matcher.group(1). I want the captured group, and therefore the println output, to be "v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
//match everything after first comma
String myRegex = ",(.*)";
Pattern pattern = Pattern.compile(myRegex);
Matcher matcher = pattern.matcher(commaSeparatedString);
String value = "";
if (matcher.matches())
value = matcher.group(1);
else
value = commaSeparatedString;
System.out.println(value);
(edit: I left out that commaSeparatedString will not always contain 2 commas. Rather, it will always contain 0 or more commas)
If you don't have to solve it with regex, you can try this:
int size = commaSeparatedString.length();
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1,size);
That is, the code above returns the substring that starts just after the first comma.
EDIT:
Sorry, I omitted the simpler version. Thanks to one of the commenters, you can use this single line as well:
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",") + 1);
The definition of the regex is wrong. It should be:
String myRegex = "[^,]*,(.*)";
You are yet another victim of Java's misguided regex method naming.
.matches() automatically anchors the regex at the beginning and end (which is in total contradiction with the very definition of "regex matching"). The method you are looking for is .find().
However, for such a simple problem, it is better to go with @DelShekasteh's solution.
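For completeness, this is roughly what the regex version looks like once matches() is swapped for find(). It is only a sketch: the class and method names are made up, and the fallback to the unchanged input mirrors the else branch in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class AfterFirstComma {
    private static final Pattern AFTER_COMMA = Pattern.compile(",(.*)");

    // Returns everything after the first comma, or the input unchanged if there is no comma.
    static String afterFirstComma(String input) {
        Matcher m = AFTER_COMMA.matcher(input);
        return m.find() ? m.group(1) : input;
    }

    public static void main(String[] args) {
        // prints v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
        System.out.println(afterFirstComma("Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso"));
        // prints Vtest7 (no comma, so the input is returned as is)
        System.out.println(afterFirstComma("Vtest7"));
    }
}
```

find() scans from the left, so the comma in the pattern lines up with the first comma in the string, and the greedy .* then captures the rest of the line.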
I would do this like
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1));
Here is another approach with limited split
String[] spl = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".split(",", 2);
if (spl.length == 2)
System.out.println(spl[1]);
But IMHO Del's answer is best for your case.
I would use replaceFirst
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.replaceFirst(".*?,", ""));
prints
v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
or you could use the shorter but more obscure version (note that it throws ArrayIndexOutOfBoundsException if the string contains no comma)
System.out.println(commaSeparatedString.split(",", 2)[1]);
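Since the question's edit says the string may contain zero commas, here is a hedged sketch of how the two regex-free approaches could handle that case. CommaSplit, viaSubstring and viaSplit are names invented for this example.

```java
// Hypothetical demo class; the names are illustrative only.
final class CommaSplit {

    // indexOf returns -1 when there is no comma, so +1 yields 0 and the whole string comes back.
    static String viaSubstring(String s) {
        return s.substring(s.indexOf(',') + 1);
    }

    // split with limit 2 yields at most two parts; fall back to the input if there was no comma.
    static String viaSplit(String s) {
        String[] parts = s.split(",", 2);
        return parts.length == 2 ? parts[1] : s;
    }

    public static void main(String[] args) {
        String withCommas = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
        System.out.println(viaSubstring(withCommas)); // v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
        System.out.println(viaSplit(withCommas));     // v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
        System.out.println(viaSubstring("Vtest7"));   // Vtest7
        System.out.println(viaSplit("Vtest7"));       // Vtest7
    }
}
```

Both helpers return the input unchanged when no comma is present, which matches the fallback in the original code.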

Java String matches and replaceAll differ in matching parentheses

I have strings with parentheses and also escaped characters. I need to match against these characters and also delete them. In the following code, I use matches() and replaceAll() with the same regex, but the matches() returns false, while the replaceAll() seems to match just fine, because the replaceAll() executes and removes the characters. Can someone explain?
String input = "(aaaa)\\b";
boolean matchResult = input.matches("\\(|\\)|\\\\[a-z]+");
System.out.printf("matchResult=%s\n", matchResult);
String output = input.replaceAll("\\(|\\)|\\\\[a-z]+", "");
System.out.printf("INPUT: %s --> OUTPUT: %s\n", input, output);
Prints out:
matchResult=false
INPUT: (aaaa)\b --> OUTPUT: aaaa
matches matches the whole input, not part of it.
The regular expression \(|\)|\\[a-z]+ doesn't describe the whole word, but only parts of it, so in your case it fails.
Binyamin Sharet has already explained what matches() does. I want to expand on this a bit.
Java does not have a "findall" method or a "g" modifier, as some other languages do, to get all matches at once.
The Java Matcher class offers two main methods for applying a pattern to a string without replacing anything (there is also lookingAt(), which anchors only at the start):
matches(): matches the whole string against the pattern
find(): scans for the next match
If you want everything that fits your pattern, you need to call find() in a loop, something like this:
Pattern p = Pattern
.compile("\\(|\\)|\\\\[a-z]+");
Matcher m = p.matcher(text);
while(m.find()){
System.out.println(m.group(0));
}
or, if you are only interested in whether your pattern occurs in the string:
if (m.find()) {
System.out.println(m.group());
} else {
System.out.println("not found");
}
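The loop above can be wrapped into a small helper that collects every match into a list. This is a sketch, not a standard API: FindAll and findAll are hypothetical names, and the sample input is the string from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class FindAll {

    // Collects every non-overlapping match of regex in text, in order of appearance.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "(aaaa)\\b"; // the characters ( a a a a ) \ b
        // prints [(, ), \b]
        System.out.println(findAll("\\(|\\)|\\\\[a-z]+", input));
    }
}
```

On Java 9 and later, Matcher.results() offers a stream-based way to get the same information, if that fits your code base better.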

regular expression - parse classpath location

$JAR_REPO/nlb/grbox/smnt.jar
I want to get the string between $ and the first /, which will then be replaced with some other string.
What regex extracts JAR_REPO alone from the above?
Can I use a regex so that the pattern match (by any method) returns the string JAR_REPO?
Please help.
Thanks.
Wells
\$([^/]+)/.*
or, as a Java String:
"\\$([^/]+)/.*"
The JAR_REPO String will be the group(1):
Pattern pattern = Pattern.compile("\\$([^/]+)/.*");
Matcher matcher = pattern.matcher(yourstring);
if (matcher.find()) {
String jarRepo = matcher.group(1);
}
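Since the question also mentions replacing the variable with another string, here is one possible sketch. It assumes the variable always sits at the very start of the path (hence the added ^ anchor, which is not in the answer above); the class name, the method names and the /opt/jars value are all made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class ClasspathVar {
    // Assumption: the $VARIABLE is at the start of the path and ends at the first slash.
    private static final Pattern LEADING_VAR = Pattern.compile("^\\$([^/]+)");

    // Returns the variable name without the $, or null if the path does not start with one.
    static String variableName(String path) {
        Matcher m = LEADING_VAR.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    // Replaces the leading $VARIABLE with the given text; quoteReplacement keeps
    // any $ or \ in the replacement from being treated as a group reference.
    static String substitute(String path, String replacement) {
        return LEADING_VAR.matcher(path).replaceFirst(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String path = "$JAR_REPO/nlb/grbox/smnt.jar";
        System.out.println(variableName(path));            // JAR_REPO
        System.out.println(substitute(path, "/opt/jars")); // /opt/jars/nlb/grbox/smnt.jar
    }
}
```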
This kind of recursive parsing can also be handled with the Interpreter pattern combined with a parser.

extract substring in java using regex

I need to extract "URPlus1_S2_3" from the string:
"Last one: http://abc.imp/Basic2#URPlus1_S2_3,"
using regular expression in Java language.
Can someone please help me? I am using regex for the first time.
Try
Pattern p = Pattern.compile("#([^,]*)");
Matcher m = p.matcher(myString);
if (m.find()) {
doSomethingWith(m.group(1)); // The matched substring
}
String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3,";
Matcher m = Pattern.compile("(URPlus1_S2_3)").matcher(s);
if (m.find()) System.out.println(m.group(1));
You gotta learn how to specify your requirements ;)
You haven't really defined what criteria you need to use to find that string, but here is one way to approach based on '#' separator. You can adjust the regex as necessary.
expr: .*#([^,]*)
extract: \1
Go here for syntax documentation:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html
String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3,";
String result = s.replaceAll(".*#", "");
The above returns the full String in case there's no "#". There are better ways using regex, but the best solution here is using no regex. There are classes URL and URI doing the job.
Since this is the first time you are using regular expressions, I would suggest another way that is easier to understand for now (until you master regular expressions ;) and easy to modify if you ever need to:
String yourPart = yourString.split("#")[1]; // yourString holds the input text
Here's a long version:
String url = "http://abc.imp/Basic2#URPlus1_S2_3,";
String anchor = null;
String ps = "#(.+),";
Pattern p = Pattern.compile(ps);
Matcher m = p.matcher(url);
if (m.find()) { // find(), not matches(): the pattern covers only part of the URL
anchor = m.group(1);
}
The main point to understand is the use of parentheses: they create groups that can be extracted from a match. In the Matcher object, the group method returns them in order starting at index 1, while index 0 returns the full match.
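To illustrate the group numbering, a short sketch that lists every group of the first match. GroupDemo and groups are hypothetical names; the pattern and input are taken from this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class GroupDemo {

    // Returns group 0 (the full match) followed by each capturing group of the first match,
    // or an empty array if the regex does not occur in the input.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] g = groups("#(.+),", "Last one: http://abc.imp/Basic2#URPlus1_S2_3,");
        System.out.println(g[0]); // #URPlus1_S2_3,
        System.out.println(g[1]); // URPlus1_S2_3
    }
}
```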
If you just want everything after the #, use split:
String s = "Last one: http://abc.imp/Basic2#URPlus1_S2_3," ;
System.out.println(s.split("#")[1]);
Alternatively, if you want to parse the URI and get the fragment component you can do:
URI u = new URI("http://abc.imp/Basic2#URPlus1_S2_3,");
System.out.println(u.getFragment());
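One caveat with the URI approach, sketched below: the trailing comma is a legal fragment character, so it comes back as part of the fragment and has to be removed separately. The helper names are invented, and URI.create is used here instead of the constructor only to avoid the checked URISyntaxException.

```java
import java.net.URI;

// Hypothetical demo class; the names are illustrative only.
final class FragmentDemo {

    // Returns the part after '#', or null if the URI has no fragment.
    // URI.create throws IllegalArgumentException for a malformed URI.
    static String fragmentOf(String uri) {
        return URI.create(uri).getFragment();
    }

    // Hypothetical clean-up step: drop one trailing comma, if present.
    static String withoutTrailingComma(String s) {
        return s.endsWith(",") ? s.substring(0, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        String fragment = fragmentOf("http://abc.imp/Basic2#URPlus1_S2_3,");
        System.out.println(fragment);                       // URPlus1_S2_3,
        System.out.println(withoutTrailingComma(fragment)); // URPlus1_S2_3
    }
}
```

This only works on the URL itself; the full sentence "Last one: http://..." contains spaces and would not parse as a URI.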
