Matching everything after the first comma in a string - java

I am using Java to do a regular expression match. I am using rubular to verify the match and ideone to test my code.
I got a regex from this SO solution, and it matches the group as I want it to in rubular, but my Java implementation does not match. When it prints 'value', it prints the value of commaSeparatedString and not matcher.group(1). I want the captured group (the output of println) to be "v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
//match everything after first comma
String myRegex = ",(.*)";
Pattern pattern = Pattern.compile(myRegex);
Matcher matcher = pattern.matcher(commaSeparatedString);
String value = "";
if (matcher.matches())
value = matcher.group(1);
else
value = commaSeparatedString;
System.out.println(value);
(edit: I left out that commaSeparatedString will not always contain 2 commas. Rather, it will always contain 0 or more commas)

If you don't have to solve it with regex, you can try this:
int size = commaSeparatedString.length();
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1,size);
Namely, the code above returns the substring that starts just after the first comma. If the string contains no comma, indexOf returns -1, so the whole string is returned.
EDIT:
Sorry, I've omitted the simpler version. Thanks to one of the commenters, you can use this single line as well:
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",") + 1);

The definition of the regex is wrong. It should be:
String myRegex = "[^,]*,(.*)";

You are yet another victim of Java's misguided regex method naming.
.matches() automatically anchors the regex at the beginning and end (which is in total contradiction with the very definition of "regex matching"). The method you are looking for is .find().
However, for such a simple problem, it is better to go with @DelShekasteh's solution.
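The difference between the two methods can be sketched with the pattern from the question. This is a minimal illustration, not the asker's final code; the class name FindVersusMatches and the helper afterFirstComma are hypothetical names introduced here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVersusMatches {
    // Pattern from the question: a comma followed by everything after it.
    private static final Pattern AFTER_COMMA = Pattern.compile(",(.*)");

    // Hypothetical helper: returns the text after the first comma,
    // or the input unchanged when it contains no comma.
    static String afterFirstComma(String input) {
        Matcher matcher = AFTER_COMMA.matcher(input);
        // find() looks for the pattern anywhere in the input;
        // matches() would require the whole input to match the pattern.
        return matcher.find() ? matcher.group(1) : input;
    }

    public static void main(String[] args) {
        String s = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
        // The input starts with "V", not ",", so a whole-string match fails.
        System.out.println(AFTER_COMMA.matcher(s).matches()); // prints "false"
        System.out.println(afterFirstComma(s));
        System.out.println(afterFirstComma("Vtest7"));
    }
}
```

Falling back to the input when find() fails mirrors the else branch in the question, so strings with zero commas are handled as well.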

I would do this like
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1));
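Since the question says the string may contain zero or more commas, it may help to see how the indexOf approach behaves in each case. The sketch below wraps it in a hypothetical helper named afterFirstComma (the class name is also an assumption made for illustration).

```java
class SubstringAfterComma {
    // Hypothetical helper: returns everything after the first comma.
    static String afterFirstComma(String input) {
        // indexOf returns -1 when there is no comma, so the start index
        // becomes 0 and the whole input is returned unchanged.
        return input.substring(input.indexOf(',') + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterFirstComma("Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso"));
        System.out.println(afterFirstComma("Vtest7")); // no comma: prints "Vtest7"
    }
}
```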

Here is another approach with limited split
String[] spl = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".split(",", 2);
if (spl.length == 2)
System.out.println(spl[1]);
But IMHO Del's answer is best for your case.

I would use replaceFirst
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.replaceFirst(".*?,", ""));
prints
v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
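or you could use the shorter but more obscure version below, which throws an ArrayIndexOutOfBoundsException if the string contains no comma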
System.out.println(commaSeparatedString.split(",", 2)[1]);
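A guarded variant of the limited split might look like the following. It is only a sketch: the helper name afterFirstComma and the choice to return the input unchanged when there is no comma are assumptions taken from the fallback in the question.

```java
class LimitedSplit {
    // Hypothetical helper: splits at the first comma only (limit 2)
    // and returns the second part when there is one.
    static String afterFirstComma(String input) {
        String[] parts = input.split(",", 2);
        // With no comma, split returns a single-element array,
        // so indexing [1] directly would throw.
        return parts.length == 2 ? parts[1] : input;
    }

    public static void main(String[] args) {
        System.out.println(afterFirstComma("Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso"));
        System.out.println(afterFirstComma("Vtest7"));
    }
}
```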

Related

What is the Regex for decimal numbers in Java?

I am not sure what the correct regex for a literal period is in Java. Here are some of my attempts. Sadly, they all seem to match any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number you need this regex (note that it requires a decimal point, so a plain integer such as 19 does not match):
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";
String string = "123.43253";
if(string.matches(regex))
System.out.println("true");
else
System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this also accepts a lone dot without any digits. So, if you want the digits to be mandatory, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were thrown off because matcher.find() finds the first valid match anywhere within the string, and every one of your patterns can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
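To make the zero-length-match point concrete, here is a small sketch comparing an unanchored pattern from the question with the anchored pattern suggested above. The class and method names (DecimalCheck, looseFind, isDecimal) are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

class DecimalCheck {
    // Unanchored attempt from the question: every part is optional,
    // so it can match an empty string at any position.
    private static final Pattern LOOSE = Pattern.compile("[0-9]*\\.?[0-9]*");

    // Anchored pattern suggested in the answer: digits with an optional
    // fraction, or a leading dot followed by at least one digit.
    private static final Pattern STRICT =
            Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    static boolean looseFind(String s) {
        return LOOSE.matcher(s).find();
    }

    static boolean isDecimal(String s) {
        return STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(looseFind("5p4")); // prints "true"
        System.out.println(isDecimal("5p4")); // prints "false"
        for (String s : new String[] {"12.2", "3.7", "2.", "0.3", ".89", "19"}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```

Because the strict pattern carries its own ^ and $ anchors, find() and matches() give the same answer for it on single-line input.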
There are actually two ways to match a literal dot. One is backslash-escaping, as you do with \\., and the other is to enclose it in a character class (square brackets), as in [.]. Most special characters, including the dot, become literal inside square brackets. Using \\. shows your intention more clearly than [.] if all you want is to match a literal dot. Use [] when you need to match one of several things: for example, the regex [\\d.] matches a single digit or a literal dot.
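The two spellings can be compared side by side with String.matches, which tests the whole string. This is just an illustrative sketch; the class name LiteralDot and its helper methods are hypothetical.

```java
class LiteralDot {
    // An unescaped dot matches any single character (except line terminators).
    static boolean anyChar(String s) {
        return s.matches(".");
    }

    // Backslash-escaped dot: the Java string "\\." is the regex \.
    static boolean escapedDot(String s) {
        return s.matches("\\.");
    }

    // Dot inside a character class is also literal.
    static boolean classDot(String s) {
        return s.matches("[.]");
    }

    // Character class matching a single digit or a literal dot.
    static boolean digitOrDot(String s) {
        return s.matches("[\\d.]");
    }

    public static void main(String[] args) {
        System.out.println(anyChar("a"));    // prints "true"
        System.out.println(escapedDot("a")); // prints "false"
        System.out.println(escapedDot(".")); // prints "true"
        System.out.println(classDot("."));   // prints "true"
        System.out.println(digitOrDot("7")); // prints "true"
    }
}
```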
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}

Filter and find integers in a String with Regex

I have this long string:
String responseData = "fker.phone.bash,0,0,0"
+ "fker.phone.bash,0,0,0"
+ "fker.phone.bash,2,0,0";
What I want to do is to extract the integers in this string. I have successfully done that with this code:
String pattern = "(\\d+)";
// this pattern finds EVERY integer. I only want the integers after the comma
Pattern pr = Pattern.compile(pattern);
Matcher match = pr.matcher(responseData);
while (match.find()) {
System.out.println(match.group());
}
So far it is working, but I want to make my regex more robust because the responseData I get is dynamic. Sometimes I might get an integer in the middle of the string, but I only want the last integers, meaning those after a comma.
I know the regex for "starts with" is ^ and that I have to include the comma character, but I don't know how to piece it all together, which is why I am asking for help. Thank you.
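String pattern = "(,)(\\d+)";
Then get the second group. (The + must be inside the parentheses; with (\\d)+ the group would capture only the last digit of each number.)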
You can use positive lookbehind for that:
String pattern = "(?<=,)\\d+";
You don't need to extract any groups to use that solution, because a lookbehind is a zero-length assertion.
You can simply use the following and find by match.group(1):
String pattern = ",(\\d+)";
You can also use word boundaries to get independent numbers:
String pattern = "\\b(\\d+)\\b";
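As a rough sketch of the lookbehind variant, the following collects only the integers that directly follow a comma. The sample input with a digit in the middle of a name ("fker2") is made up for illustration, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumbersAfterComma {
    // Digits that are immediately preceded by a comma; the comma itself
    // is not part of the match because a lookbehind consumes no input.
    private static final Pattern AFTER_COMMA = Pattern.compile("(?<=,)\\d+");

    static List<String> extract(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher match = AFTER_COMMA.matcher(input);
        while (match.find()) {
            numbers.add(match.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        // The "2" inside "fker2" is skipped because no comma precedes it.
        System.out.println(extract("fker2.phone.bash,2,10,0")); // prints "[2, 10, 0]"
    }
}
```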

Why doesn't /0/g match in a string that contains zeroes?

This code always prints "false" in the end, even if the Integer contains a zero:
Integer i = (int) rand(1, 200); // random [1;200)
String regexp = "/0/g";
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(i.toString());
print(i);
print(m.matches());
What is the reason? I don't get where the mistake could be.
Needed: m.matches() = "true" if Integer contains one or more zero.
The problem is that you're giving the regular expression incorrectly. The string you give Pattern.compile is just the text of the expression, without / on either side, and without flags; flags are specified separately.
So in your case, you'd just want:
String regexp = "0";
There's no "global" flag; instead, you use the methods on the resulting Matcher as appropriate to what you're doing.
Needed: m.matches() = "true" if Integer contains one or more zero.
Then you don't want to use Matcher#matches, you want Matcher#find. Or if you need to use Matcher#matches, the expression would be:
String regexp = ".*0.*";
...e.g., any number of any character, then a 0, then any number of any character. That way, the entire string can match the expression.
Of course, if you just want to know there's a zero, it's much simpler to just use
boolean flag = String.valueOf(i).indexOf('0') != -1;
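The three options mentioned in this answer can be put next to each other as a quick sketch. The class name ContainsZero and the method names are hypothetical; all three are meant to return the same result for any int.

```java
import java.util.regex.Pattern;

class ContainsZero {
    // Plain pattern text: no surrounding slashes and no "g" flag in Java.
    private static final Pattern ZERO = Pattern.compile("0");

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean viaFind(int n) {
        return ZERO.matcher(Integer.toString(n)).find();
    }

    // matches() must cover the whole input, hence the surrounding .*
    static boolean viaMatches(int n) {
        return Integer.toString(n).matches(".*0.*");
    }

    // No regex at all: look for the literal character.
    static boolean viaIndexOf(int n) {
        return Integer.toString(n).indexOf('0') != -1;
    }

    public static void main(String[] args) {
        System.out.println(viaFind(105));    // prints "true"
        System.out.println(viaMatches(105)); // prints "true"
        System.out.println(viaIndexOf(17));  // prints "false"
    }
}
```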
In this particular case you don't need a regex at all since you are looking for a literal character, use indexOf:
if (Str.indexOf( '0' ) != -1) {
...
About your original pattern:
Regexes don't need to be enclosed between delimiters in Java, so the slashes are useless. The global modifier isn't needed either, because the global nature is determined by the method you choose (in other words, the only way to obtain several results is to call the find method in a loop).
print(m.find());
matches() tries to match from the beginning of the input. Use find() instead, since an input consisting only of "0" is not possible in your case.
Using find will enable you to locate 0 anywhere in the string.
matches tries to match the expression against the entire string and implicitly adds a ^ at the start and $ at the end of your pattern, meaning it will not look for a substring. Hence false.
Also change your regex to "0" as suggested by the other answer.
Try,
String regexp = ".*0.*";
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(i.toString());
if(m.find()){
System.out.println(i);
System.out.println(m.matches());
}

Java Regex is including new line in match

I'm trying to match a regular expression to textbook definitions that I get from a website.
The definition always has the word with a new line followed by the definition. For example:
Zither
Definition: An instrument of music used in Austria and Germany It has from thirty to forty wires strung across a shallow sounding board which lies horizontally on a table before the performer who uses both hands in playing on it Not to be confounded with the old lute shaped cittern or cithern
In my attempts to get just the word (in this case "Zither") I keep getting the newline character.
I tried both ^(\w+)\s and ^(\S+)\s without much luck. I thought that maybe ^(\S+)$ would work, but that doesn't seem to successfully match the word at all. I've been testing with rubular, http://rubular.com/r/LPEHCnS0ri; which seems to successfully match all my attempts the way I want, despite the fact that Java doesn't.
Here's my snippet
String str = ...; //Here the string is assigned a word and definition taken from the internet like given in the example above.
Pattern rgx = Pattern.compile("^(\\S+)$");
Matcher mtch = rgx.matcher(str);
if (mtch.find()) {
String result = mtch.group();
terms.add(new SearchTerm(result, System.nanoTime()));
}
This is easily solved by trimming the resulting string, but that seems like it should be unnecessary if I'm already using a regular expression.
All help is greatly appreciated. Thanks in advance!
Try using the Pattern.MULTILINE option
Pattern rgx = Pattern.compile("^(\\S+)$", Pattern.MULTILINE);
This causes the regex to recognise line delimiters in your string, otherwise ^ and $ just match the start and end of the string.
Although it makes no difference for this pattern, the Matcher.group() method returns the entire match, whereas the Matcher.group(int) method returns the match of the particular capture group (...) based on the number you specify. Your pattern specifies one capture group which is what you want captured. If you'd included \s in your Pattern as you wrote you tried, then Matcher.group() would have included that whitespace in its return value.
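Both points can be seen in a short sketch: MULTILINE lets ^ and $ anchor at line boundaries, and group() differs from group(1) once the pattern includes whitespace outside the capture group. The class and method names here are hypothetical, and the sample text is a shortened version of the definition in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstLineWord {
    // With MULTILINE, $ matches before the line break after the first line.
    private static final Pattern WORD_LINE =
            Pattern.compile("^(\\S+)$", Pattern.MULTILINE);

    // The trailing \s is part of the overall match but outside the group.
    private static final Pattern WORD_THEN_SPACE = Pattern.compile("^(\\S+)\\s");

    static String firstWord(String text) {
        Matcher m = WORD_LINE.matcher(text);
        return m.find() ? m.group(1) : "";
    }

    static String wholeMatch(String text) {
        Matcher m = WORD_THEN_SPACE.matcher(text);
        // group() is the entire match, so it includes the line break.
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String text = "Zither\nDefinition: An instrument of music";
        System.out.println(firstWord(text));           // prints "Zither"
        System.out.println(wholeMatch(text).length()); // prints "7"
    }
}
```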
With regular expressions, group 0 is always the complete match. In your case you want group 1, not group 0.
So changing mtch.group() to mtch.group(1) should do the trick:
String str = ...; //Here the string is assigned a word and definition taken from the internet like given in the example above.
Pattern rgx = Pattern.compile("^(\\w+)\\s");
Matcher mtch = rgx.matcher(str);
if (mtch.find()) {
String result = mtch.group(1);
terms.add(new SearchTerm(result, System.nanoTime()));
}
A late response, but if you are not using Pattern and Matcher, you can use this alternative of DOTALL in your regex string
(?s)[Your Expression]
Basically (?s) also tells dot to match all characters, including line breaks
Detailed information: http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
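A tiny sketch of the inline flag, using String.matches so that no Pattern or Matcher object is needed. The pattern "Zither.*" and the class and method names are made up for illustration.

```java
class InlineDotAll {
    // By default the dot does not match line terminators,
    // so .* cannot cross the line break.
    static boolean withoutFlag(String s) {
        return s.matches("Zither.*");
    }

    // (?s) switches on DOTALL for the rest of the expression.
    static boolean withFlag(String s) {
        return s.matches("(?s)Zither.*");
    }

    public static void main(String[] args) {
        String text = "Zither\nDefinition: An instrument of music";
        System.out.println(withoutFlag(text)); // prints "false"
        System.out.println(withFlag(text));    // prints "true"
    }
}
```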
Just replace:
String result = mtch.group();
By:
String result = mtch.group(1);
This will limit your output to the contents of the capturing group (e.g. (\\w+)) .
Try the following:
/* The regex pattern: ^(\w+)\r?\n(.*)$ */
private static final Pattern REGEX_PATTERN =
Pattern.compile("^(\\w+)\\r?\\n(.*)$");
public static void main(String[] args) {
String input = "Zither\nDefinition: An instrument of music";
System.out.println(
REGEX_PATTERN.matcher(input).matches()
); // prints "true"
System.out.println(
REGEX_PATTERN.matcher(input).replaceFirst("$1 = $2")
); // prints "Zither = Definition: An instrument of music"
System.out.println(
REGEX_PATTERN.matcher(input).replaceFirst("$1")
); // prints "Zither"
}

Regarding extracting a string

I have a string, Till No. S59997-RSS01. I need to extract the 01 from it, but the value is dynamic, meaning it comes from:
String TillNo =pinpadTillStore.getHwIdentifier();
From debugging I know that the value S59997-RSS01 is in TillNo, but at run time I will not know which value TillNo holds. The pattern of the value will always be the same (S59997-RSS01). Please advise how to extract the last two digits (like 01).
int size = TillNo.length();
String value = TillNo.substring(size - 2); // do this only if size >= 2.
You can use the substring method.
If the two digits always appear in the last two positions, just use the substring method. For a more flexible approach, use a regular expression instead.
String TillNo = "S59997-RSS01";
System.out.println(TillNo.substring(TillNo.length() - 2));
Pattern pattern = Pattern.compile("S[\\d]{5}-RSS([\\d]{2})");
Matcher matcher = pattern.matcher(TillNo);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
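Both approaches can be wrapped with a guard for short or unexpected input. The sketch below is an assumption-laden illustration: the class name, the method names, and the fallbacks (returning the input or an empty string) are choices made here, not part of the original answers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TillNumberSuffix {
    // Two digits at the very end of the input.
    private static final Pattern TRAILING_TWO_DIGITS = Pattern.compile("(\\d{2})$");

    // Last two characters, whatever they are; short inputs are returned as-is.
    static String lastTwoChars(String s) {
        return s.length() >= 2 ? s.substring(s.length() - 2) : s;
    }

    // Last two characters only if both are digits; otherwise an empty string.
    static String trailingTwoDigits(String s) {
        Matcher m = TRAILING_TWO_DIGITS.matcher(s);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(lastTwoChars("S59997-RSS01"));      // prints "01"
        System.out.println(trailingTwoDigits("S59997-RSS01")); // prints "01"
    }
}
```

The regex variant is the stricter of the two: it declines to return anything when the value does not end in two digits, whereas substring returns whatever the last two characters happen to be.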
Exactly, you can extract the last 2 digits of such strings using the substring method, like this:
String TillNo="S59997-RSS01";
String substring=TillNo.substring(TillNo.length()-2,TillNo.length());
