Regex not finding string - java

I am having issues with this code.
For some reason, the regex always fails to match:
for (int i = 0; i < pluginList.size(); i++) {
    System.out.println("");
    String findMe = "plugins".concat(FILE_SEPARATOR).concat(pluginList.get(i));
    Pattern pattern = Pattern.compile("("+name.getPath()+")(.*)");
    Matcher matcher = pattern.matcher(findMe);
    // Check if the current plugin matches the string.
    if (matcher.find()) {
        return !pluginListMode;
    }
}

All you really need is
return ("plugins"+FILE_SEPARATOR+pluginName).indexOf(name.getPath()) != -1;
But your code also makes no sense due to the fact that there's no way for that for-loop to enter a second iteration -- it returns unconditionally. So more probably you need something like this:
for (String pluginName : pluginList)
    if (("plugins"+FILE_SEPARATOR+pluginName).indexOf(name.getPath()) != -1)
        return false;
return true;

Right now we can only guess since we don't know what name.getPath() might return.
I suspect it fails because that string might contain characters that have special meaning inside regexes. Try it again with
Pattern pattern = Pattern.compile("("+Pattern.quote(name.getPath())+")(.*)");
and see what happens then.
Also the (.*) part (and even the parentheses around your name.getPath() result) don't appear to matter at all since you're not doing anything with the result of the match itself. At which point the question is why you're using a regex in the first place.
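To see the effect of regex metacharacters concretely, here is a minimal sketch. The question does not show what name.getPath() returns, so the path value below is an assumption (a Windows-style path, where the backslash in plugins\demo turns \d into the regex digit class), and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the path used in main is an assumed example value.
final class PathMatchDemo {

    // The question's approach: the path is pasted into the regex unescaped.
    static boolean unquotedFind(String path, String candidate) {
        return Pattern.compile("(" + path + ")(.*)").matcher(candidate).find();
    }

    // Pattern.quote makes every character of the path match literally.
    static boolean quotedFind(String path, String candidate) {
        return Pattern.compile(Pattern.quote(path)).matcher(candidate).find();
    }

    // No regex at all: a plain substring test.
    static boolean plainContains(String path, String candidate) {
        return candidate.contains(path);
    }

    public static void main(String[] args) {
        String path = "plugins\\demo";               // the characters plugins\demo
        String candidate = "plugins\\demo\\Foo.jar"; // the characters plugins\demo\Foo.jar
        System.out.println(unquotedFind(path, candidate));  // false: \d expects a digit
        System.out.println(quotedFind(path, candidate));    // true
        System.out.println(plainContains(path, candidate)); // true
    }
}
```

If the substring test is all that is needed, String.contains avoids the escaping problem entirely.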


Replacing Strings with a number in it without a for loop

So I currently have this code:
for (int i = 1; i <= this.max; i++) {
    in = in.replace("{place" + i + "}", this.getUser(i)); // Get the place of a user.
}
This works well, but I would like to keep it simple by using pattern matching,
so I used this code to check whether it matches:
System.out.println(StringUtil.matches("{place5}", "\\{place\\d\\}"));
StringUtil's matches:
public static boolean matches(String string, String regex) {
    if (string == null || regex == null) return false;
    Pattern compiledPattern = Pattern.compile(regex);
    return compiledPattern.matcher(string).matches();
}
That returns true. Then comes the next part I need help with: replacing the {place5} so I can parse the number. I could strip "{place" and "}", but as far as I'm aware that no longer works if the string contains several placeholders ("{place5} {username}"). If you know a simple way to do that, please let me know; if not, I can just stick with the for loop.
then comes the next part I need help with, replacing the {place5} so I can parse the number
In order to obtain the number after {place, you can use
s = s.replaceAll(".*\\{place(\\d+)}.*", "$1");
The regex matches an arbitrary number of characters before the string we are searching for, then {place, then it matches and captures 1 or more digits with (\d+), and then it matches the rest of the string with .*. Note that if the string contains newline characters, you should add (?s) at the beginning of the pattern so that . also matches line breaks. $1 in the replacement pattern "restores" the value we need.
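The replaceAll call above extracts a single number. If the goal is to substitute every {placeN} in one pass while leaving other placeholders alone, one possible sketch uses Matcher.appendReplacement. The class name, the fill method, and the userAt callback are assumptions; userAt stands in for the question's this.getUser(i).

```java
import java.util.function.IntFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; userAt stands in for the question's getUser(int).
final class PlaceholderDemo {

    private static final Pattern PLACE = Pattern.compile("\\{place(\\d+)\\}");

    static String fill(String in, IntFunction<String> userAt) {
        Matcher m = PLACE.matcher(in);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // group(1) holds the digits between "{place" and "}"
            int place = Integer.parseInt(m.group(1));
            // quoteReplacement keeps '$' and '\' in user names literal
            m.appendReplacement(out, Matcher.quoteReplacement(userAt.apply(place)));
        }
        m.appendTail(out); // copy whatever follows the last placeholder
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "user5 beats user12, {username}"
        System.out.println(fill("{place5} beats {place12}, {username}", i -> "user" + i));
    }
}
```

Unlike the for loop, this does not need to know the maximum place in advance, but Integer.parseInt will throw on digit runs too long for an int.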

How to properly use java Pattern object to match string patterns

I wrote some code that performs several string operations, including checking whether a given string matches a certain regular expression. It ran just fine with 70,000 inputs, but it started to give me an out-of-memory error when I ran it iteratively for five-fold cross-validation. It may simply be that I have to assign more memory, but I have a feeling that I have written inefficient code, so I wanted to double-check that I didn't make any obvious mistake.
static Pattern numberPattern = Pattern.compile("^[a-zA-Z]*([0-9]+).*");
public static boolean someMethod(String line) {
    String[] tokens = line.split(" ");
    for(int i=0; i<tokens.length; i++) {
        tokens[i] = tokens[i].replace(",", "");
        tokens[i] = tokens[i].replace(";", "");
        if(numberPattern.matcher(tokens[i]).find()) return true;
    }
    return false;
}
I also have many lines like the one below:
token.matches("[a-z]+[A-Z][a-z]+");
Which way is more memory efficient? Do they look efficient enough? Any advice is appreciated!
Edited:
Sorry, I posted the wrong code; I intended to fix it before posting this question but forgot at the last minute. The real problem is that I have many similar-looking operations all over the place. Aside from the fact that the example code did not make sense, I wanted to know whether the regexp comparison part was efficient.
Thanks for all of your comments, I'll look through and modify the code following the advice!
Well, first of all, take a second look at your code: it will always return true! You never read the 'match' variable, you only assign to it.
Second, String is immutable, so every split and replace creates new instances. Why not write a pattern that matches what you want while ignoring the commas and semicolons? I'm not sure, but I think it would use less memory.
Yes, this code is indeed inefficient, because you can return immediately once you've found a match (match = true); there is no point in continuing to loop.
Furthermore, are you sure you need to break the line into tokens? Why not check the regex only once?
And last, if all the comparisons fail, you should return false (last line).
Instead of altering the text and splitting it you can put it all in the regex.
// the \\b means it must be the start of the String or a word
static Pattern numberPattern = Pattern.compile("\\b[a-zA-Z,;]*[0-9,;]*[0-9]");
// return true if the string contains
// a number which might have letters in front
public static boolean someMethod(String line) {
    return numberPattern.matcher(line).find();
}
Aside from what @alfasin has mentioned in his answer, you should avoid duplicating code. Rewrite the following:
{
tokens[i] = tokens[i].replace(",", "");
tokens[i] = tokens[i].replace(";", "");
}
Into:
tokens[i] = tokens[i].replaceAll(",|;", "");
And please do this once, before calling .split(), so that the operation doesn't have to be repeated inside the loop:
String[] tokens = line.replaceAll(",|;", "").split(" ");
                      ^^^^^^^^^^^^^^^^^^^^^^
Edit: After staring at your code for a bit I think I have a better solution, using regex ;)
public static boolean someMethod(String line) {
    return Pattern.compile("\\b[a-zA-Z]*\\d")
            .matcher(line.replaceAll(",|;", "")).find();
}
\b is a word boundary.
It asserts a position at the boundary of a word (for example at the start of the line or after whitespace).
Code Demo STDOUT:
foo does not match
bar does not match
bar1 does match
foo baz bar bar1 lolz does match
password_01 does not match
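On the efficiency part of the question: String.matches and String.replaceAll compile their regex on every call, whereas a Pattern held in a static final field is compiled once. The sketch below applies that idea to the checks discussed above. The class and method names are assumptions, and whether this alone resolves the out-of-memory error cannot be told from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
final class TokenChecks {

    // Compiled once when the class is loaded, then reused on every call.
    private static final Pattern SEPARATORS = Pattern.compile("[,;]");
    private static final Pattern NUMBER_TOKEN = Pattern.compile("\\b[a-zA-Z]*\\d");
    private static final Pattern CAMEL = Pattern.compile("[a-z]+[A-Z][a-z]+");

    // True if some word in the line is optional letters followed by a digit.
    static boolean containsNumberToken(String line) {
        String cleaned = SEPARATORS.matcher(line).replaceAll("");
        return NUMBER_TOKEN.matcher(cleaned).find();
    }

    // Same result as token.matches("[a-z]+[A-Z][a-z]+"), without recompiling.
    static boolean isCamelToken(String token) {
        return CAMEL.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsNumberToken("foo baz bar bar1 lolz")); // true
        System.out.println(containsNumberToken("password_01"));           // false
        System.out.println(isCamelToken("fooBar"));                       // true
    }
}
```

Pattern instances are immutable and safe to share; only the Matcher objects created from them are per-use.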

How to check whether a string contains any character other than a number in Java?

I want to check whether a String contains any letter or special character other than digits. I wrote the following code for this:
String expression = "[^a-zA-z]";
Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(jTextFieldPurchaseOrder.getText().toString().trim());
It works fine when I take the value from the jTextField and check my condition, but it gives an error when I check a String from a DTO, as below:
list.get(0).getChalan_trans_id().toString().trim().matches("[^a-zA-z]");
Where list is an ArrayList of DTOs.
I can't see where I am going wrong.
Thanks
If you want to check if there is a non-digit character, you can use .*\\D.*:
if (list.get(0).getChalan_trans_id().toString().trim().matches(".*\\D.*")) {
    //non-digit found, handle it
}
or, maybe easier, do it the other way around:
if (list.get(0).getChalan_trans_id().toString().trim().matches("\\d*")) {
    //only digits found
}
There's probably a more efficient way than regular expressions. Regular expressions are powerful, but can be overkill for a simple task like this.
Something like this ought to work, and I would expect it to be quicker.
static boolean hasNonNumber(String s) {
    for (int i = 0; i < s.length(); ++i) {
        char c = s.charAt(i);
        if (!Character.isDigit(c)) {
            return true;
        }
    }
    return false;
}
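One detail behind the answers above is that String.matches must match the entire string, while Matcher.find looks for the pattern anywhere, which is why a single character class passed to matches() only accepts one-character inputs. A small sketch of the difference follows; the class and method names are made up for illustration. It also shows that Character.isDigit accepts non-ASCII digits, whereas \d and \D are ASCII-only by default in Java.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class DigitChecks {

    private static final Pattern NON_DIGIT = Pattern.compile("\\D");

    // find() succeeds if a non-digit occurs anywhere in the input.
    static boolean hasNonDigit(String s) {
        return NON_DIGIT.matcher(s).find();
    }

    // Loop variant limited to ASCII digits, so it agrees with the regex above.
    static boolean hasNonDigitLoop(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasNonDigit("12a45"));          // true
        System.out.println("12a45".matches("\\D"));        // false: whole string must be one non-digit
        System.out.println("12a45".matches(".*\\D.*"));    // true
        System.out.println(Character.isDigit('\u0663'));   // true: ARABIC-INDIC DIGIT THREE
        System.out.println(hasNonDigit("\u0663"));         // true: not in [0-9]
    }
}
```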

Check special arrangement of specific signs in a string in Java

I need to check whether a string consists of a specific arrangement of letters and numbers.
Valid arrangements are for example:
X
X-Y
A-H-K-L-J-Y
A-H-J-Y
123
12?
12*
12-17
Invalid are for example:
-X-Y
-XY
*12
?12
I have written this method in Java to solve this problem (but I don't have much experience with regular expressions):
public boolean checkPatternMatching(String sourceToScan, String searchPattern) {
    boolean patternFounded;
    if (sourceToScan == null) {
        patternFounded = false;
    } else {
        Pattern pattern = Pattern.compile(Pattern.quote(searchPattern),
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sourceToScan);
        patternFounded = matcher.find();
    }
    return patternFounded;
}
How can I implement this requirement with regular expressions?
By the way: is it a good solution to check whether a string has numeric content by using the isNumeric method from the Java class StringUtils?
//EDIT
The link that was added by the admins is not about specific arrangements of characters, only about matching characters with regular expressions in general!
After a good while trying to help, answering to constantly changing questions, just found out that the same was asked yesterday, and that the OP doesn't accept answers to his questions...all I have left to say is good night sir, good luck
n-th answer follows:
First pattern: [a-z](-[a-z])* : a letter, possibly followed by more letters, separated by -.
Second pattern: \d+(-\d+)*[?*]* : a number, possibly followed by more numbers, separated by -, and possibly ending with ? or *.
So join them together: ^(([a-z](-[a-z])*)|(\d+(-\d+)*[?*]*))$. ^ and $ mark the beginning and the end of the string; the outer parentheses make both anchors apply to both alternatives.
Few more comments on the code: you don't need to use Pattern.quote, and you should use matches() instead of find(), because find() returns true if any part of the string matches the pattern, and you want the whole string:
public static boolean checkPatternMatching(String sourceToScan, String searchPattern) {
    boolean patternFounded;
    if (sourceToScan == null) {
        patternFounded = false;
    } else {
        Pattern pattern = Pattern.compile(searchPattern, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sourceToScan);
        patternFounded = matcher.matches();
    }
    return patternFounded;
}
Called like this: checkPatternMatching(s, "^(([a-z](-[a-z])*)|(\\d+(-\\d+)*[?*]*))$")
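One detail worth checking whenever anchors and alternation are combined: alternation has the lowest precedence, so in a pattern of the form ^A|B$ the ^ belongs only to A and the $ only to B. With matches() this is hidden, because matches() requires the whole input to match anyway, but with find() it changes the result. A minimal sketch, with illustrative class and field names:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class AnchorPrecedenceDemo {

    // ^ binds only to the first alternative, $ only to the second.
    static final Pattern UNGROUPED = Pattern.compile(
            "^[a-z](-[a-z])*|\\d+(-\\d+)*[?*]*$", Pattern.CASE_INSENSITIVE);

    // The non-capturing group makes both anchors apply to both alternatives.
    static final Pattern GROUPED = Pattern.compile(
            "^(?:[a-z](-[a-z])*|\\d+(-\\d+)*[?*]*)$", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String invalid = "X-Y-"; // trailing dash, should be rejected
        System.out.println(UNGROUPED.matcher(invalid).find());    // true: "X-Y" matches at the start
        System.out.println(GROUPED.matcher(invalid).find());      // false
        System.out.println(UNGROUPED.matcher(invalid).matches()); // false: whole input required
    }
}
```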
About the second question, this is the current implementation of StringUtils.isNumeric:
public static boolean isNumeric(final CharSequence cs) {
    if (isEmpty(cs)) {
        return false;
    }
    final int sz = cs.length();
    for (int i = 0; i < sz; i++) {
        if (Character.isDigit(cs.charAt(i)) == false) {
            return false;
        }
    }
    return true;
}
So no, there is nothing wrong about it, that is as simple as it gets. But you need to include an external JAR in your program, which I find unnecessary if you just want to use such a simple method.
I believe that you should first remove the Pattern.quote() method because that would turn the inputting patterns into string literals; and those are not really useful in your context.
To match the valid arrangements with letters, something like this should work:
^[a-z](?:-[a-z])*$
For the numbers (if I understood the rules correctly):
^\\d+(?:[?*]|-\\d+)*$
And if you want to combine them:
^(?:[a-z](?:-[a-z])*|\\d+(?:[?*]|-\\d+)*)$
I'm not familiar with Java itself, nor the isNumeric method, sorry.
As per your comment, if you want to accept *12 or 1?2 or 12*456, you can use:
^\\*?\\d+(?:[?*]\\d*|-\\d+)*$
Then add it to the previous regex like so:
^(?:[a-z](?:-[a-z])*|\\*?\\d+(?:[?*]\\d*|-\\d+)*)$
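The examples from the question make a convenient regression test for a combined pattern. The sketch below checks the first combined regex (the one without the leading \\*? extension) against them. It assumes case-insensitive matching, as in the question's code; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
final class ArrangementCheck {

    private static final Pattern ARRANGEMENT = Pattern.compile(
            "^(?:[a-z](?:-[a-z])*|\\d+(?:[?*]|-\\d+)*)$", Pattern.CASE_INSENSITIVE);

    // matches() requires the whole input to fit the pattern.
    static boolean isValid(String s) {
        return s != null && ARRANGEMENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] valid = {"X", "X-Y", "A-H-K-L-J-Y", "A-H-J-Y", "123", "12?", "12*", "12-17"};
        String[] invalid = {"-X-Y", "-XY", "*12", "?12"};
        for (String s : valid) {
            System.out.println(s + " -> " + isValid(s));   // true for each
        }
        for (String s : invalid) {
            System.out.println(s + " -> " + isValid(s));   // false for each
        }
    }
}
```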

Regex capture group match lookup

I have a regex with multiple disjunctive capture groups
(a)|(b)|(c)|...
Is there a faster way than this one to access the index of the first successfully matching capture group?
(matcher is an instance of java.util.regex.Matcher)
int getCaptureGroup(Matcher matcher){
    for(int i = 1; i <= matcher.groupCount(); ++i){
        if(matcher.group(i) != null){
            return i;
        }
    }
    return -1; // no capture group participated in the match
}
That depends on what you mean by faster. You can make the code a little more efficient by using start(int) instead of group(int)
if(matcher.start(i) != -1){
If you don't need the actual content of the group, there's no point trying to create a new string object to hold it. I doubt you'll notice any difference in performance, but there's no reason not to do it this way.
But you still have to write the same amount of boilerplate code; there's no way around that. Java's regex flavor is severely lacking in syntactic sugar compared to most other languages.
I guess the usual pattern is this:
if (matcher.find()) {
    String wholeMatch = matcher.group(0);
    String firstCaptureGroup = matcher.group(1);
    String secondCaptureGroup = matcher.group(2);
    //etc....
}
There could be more than one match, so you could use a while loop to go through all matches.
Please take a look at the "Group number" section in the javadoc of java.util.regex.Pattern.
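Putting the two suggestions together, a sketch of the lookup could use start(int) for the group test and a while loop over find() to cover every match. The class and method names are assumptions for illustration, and -1 is an arbitrary choice for "no group matched".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
final class GroupLookupDemo {

    // Index of the first group that participated in the current match, or -1.
    // Must be called after a successful find() or matches().
    static int firstMatchedGroup(Matcher matcher) {
        for (int i = 1; i <= matcher.groupCount(); i++) {
            if (matcher.start(i) != -1) { // -1 means the group did not match
                return i;
            }
        }
        return -1;
    }

    // Collects the matching group index for every match in the input.
    static List<Integer> groupsOfAllMatches(Pattern pattern, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            result.add(firstMatchedGroup(matcher));
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(a)|(b)|(c)");
        System.out.println(groupsOfAllMatches(pattern, "xcab")); // [3, 1, 2]
    }
}
```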
