I've found something that almost fits my needs here:
Integer.parseInt(s.replaceAll("[\\D]", ""))
but I can't figure out how to modify it to handle a negative integer. A sample string is:
"some\\-2c.st"
and I need to extract "-2"
I'd do it the other way around, look for the integer instead of stripping the rest:
String str = "some\\-2c.st";
Pattern pattern = Pattern.compile("-?[0-9]+");
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
    int value = Integer.parseInt(matcher.group());
    System.out.println(value);
}
Integer.parseInt(s.replaceAll("[^\\d-]", ""))
You can either remove everything you don't want or extract what you do want.
The latter seems more appropriate here; a regex like (-?\d+) will do it.
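A minimal sketch of the extraction approach, wrapped in a helper so the no-match case is explicit. The class name SignedIntExtractor and the method firstInt are made up for illustration and do not come from the posts above.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignedIntExtractor {
    // Optional minus sign followed by one or more digits.
    private static final Pattern SIGNED_INT = Pattern.compile("-?\\d+");

    // Returns the first signed integer found in the input, or empty if there is none.
    // Integer.parseInt can still throw NumberFormatException if the digits overflow an int.
    static OptionalInt firstInt(String input) {
        Matcher matcher = SIGNED_INT.matcher(input);
        if (matcher.find()) {
            return OptionalInt.of(Integer.parseInt(matcher.group()));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstInt("some\\-2c.st"));    // the sample string from the question
        System.out.println(firstInt("no digits here"));  // nothing to extract
    }
}
```

One reason to prefer extraction: stripping with [^\\d-] keeps every hyphen, so an input such as "a-b-2" would be reduced to "--2", which Integer.parseInt rejects.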
Related
I have this long string:
String responseData = "fker.phone.bash,0,0,0"
+ "fker.phone.bash,0,0,0"
+ "fker.phone.bash,2,0,0";
What I want to do is to extract the integers in this string. I have successfully done that with this code:
String pattern = "(\\d+)";
// this pattern finds EVERY integer. I only want the integers after the comma
Pattern pr = Pattern.compile(pattern);
Matcher match = pr.matcher(responseData);
while (match.find()) {
    System.out.println(match.group());
}
So far it is working, but I want to make my regex more robust because the responseData I get is dynamic. Sometimes I might get an integer in the middle of the string, but I only want the trailing integers, meaning those after a comma.
I know that ^ is the regex anchor for "starts with" and that I have to include the comma character in the pattern, but I don't know how to piece it all together, which is why I am asking for help. Thank you.
String pattern = "(,)(\\d+)";
Then get the second group.
You can use positive lookbehind for that:
String pattern = "(?<=,)\\d+";
You don't need to extract any groups with this solution, because a lookbehind is a zero-length assertion.
You can simply use the following and find by match.group(1):
String pattern = ",(\\d+)";
You can also use word boundaries to get independent numbers:
String pattern = "\\b(\\d+)\\b";
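As a rough comparison of the lookbehind and capture-group answers above, the following sketch collects the numbers that directly follow a comma. The class name, method names and the sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterCommaNumbers {
    // Lookbehind: the comma must precede the digits but is not part of the match.
    private static final Pattern LOOKBEHIND = Pattern.compile("(?<=,)\\d+");
    // Capture group: the comma is part of the match, the digits are group 1.
    private static final Pattern GROUPED = Pattern.compile(",(\\d+)");

    static List<String> withLookbehind(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = LOOKBEHIND.matcher(input);
        while (m.find()) {
            numbers.add(m.group());   // whole match is just the digits
        }
        return numbers;
    }

    static List<String> withGroup(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = GROUPED.matcher(input);
        while (m.find()) {
            numbers.add(m.group(1));  // skip the comma, keep the digits
        }
        return numbers;
    }

    public static void main(String[] args) {
        String input = "fker7.phone.bash,0,12,3";  // made-up input with a digit mid-string
        System.out.println(withLookbehind(input));
        System.out.println(withGroup(input));
    }
}
```

Both methods skip the 7 in the middle of the made-up input, because it is not preceded by a comma.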
I want to check whether my string contains only allowed characters. It works for inputs such as 7B, 77B or 7BBBB, but an input like 7B7 or 7BB2 does not match.
In other words, it works fine unless a digit is the last character.
Could you tell me what is wrong with this code?
pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}
If you want to mix digits and letters in any order, you need something like:
Pattern pattern = Pattern.compile("[\\da-fA-F]*");
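To illustrate why the original pattern rejects 7B7, here is a small sketch that contrasts the three consecutive runs with a single character class. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class HexCharCheck {
    // Three runs in a fixed order: digits, then lowercase a-f, then uppercase A-F.
    private static final Pattern ORDERED = Pattern.compile("[0-9]*[a-f]*[A-F]*");
    // One character class: every character may be any allowed one, in any order.
    private static final Pattern ANY_ORDER = Pattern.compile("[0-9a-fA-F]*");

    static boolean matchesOrdered(String s) {
        return ORDERED.matcher(s).matches();
    }

    static boolean matchesAnyOrder(String s) {
        return ANY_ORDER.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "7B", "7BBBB", "7B7", "7BB2" }) {
            System.out.println(s + " ordered=" + matchesOrdered(s)
                    + " anyOrder=" + matchesAnyOrder(s));
        }
    }
}
```

Note that the * quantifier also accepts the empty string; replacing it with + would require at least one character.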
Why not try it this way?
// Compile this pattern.
Pattern pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*[0-9]*");
// See if this String matches.
Matcher m = pattern.matcher("7BB2");
if (m.matches()) {
    System.out.println(true);
}
Are you trying to verify that the string only has digits and letters and nothing else?
If so try using the following:
pattern = Pattern.compile("^[a-zA-Z\\d]*$");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}
First time posting.
Firstly I know how to use both Pattern Matcher & String Split.
My question is which is best for me to use in my example, and why?
Or suggestions for better alternatives.
Task:
I need to extract an unknown NOUN between two known regexps in an unknown string.
My Solution:
Get the start and end of the noun (from regexps 1 and 2) and use substring to extract the noun.
String line = "unknownXoooXNOUNXccccccXunknown";
int goal = 12 ;
String regexp1 = "Xo+X";
String regexp2 = "Xc+X";
I need to locate the index position AFTER the first regex.
I need to locate the index position BEFORE the second regex.
A) I can use pattern matcher
Pattern p = Pattern.compile(regexp1);
Matcher m = p.matcher(line);
if (m.find()) {
    int afterRegex1 = m.end();
} else {
    throw new IllegalArgumentException();
    //TODO Exception Management;
}
B) I can use String Split
String[] split = line.split(regexp1, 2);
if (split.length != 2) {
    throw new UnsupportedOperationException();
    //TODO Exception Management;
}
int afterRegex1 = line.indexOf(split[1]);
Which Approach should I use and why?
I don't know which is more efficient on time and memory.
Both are about equally readable to me.
I'd do it like this:
String line = "unknownXoooXNOUNXccccccXunknown";
String regex = "Xo+X(.*?)Xc+X";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(line);
if (m.find()) {
    String noun = m.group(1);
}
The (.*?) is used to make the inner match on the NOUN reluctant. This protects us from a case where our ending pattern appears again in the unknown portion of the string.
EDIT
This works because the (.*?) defines a capture group. There's only one such group defined in the pattern, so it gets index 1 (the parameter to m.group(1)). These groups are indexed from left to right starting at 1. If the pattern were defined like this
String regex = "(Xo+X)(.*?)(Xc+X)";
Then there would be three capture groups, such that
m.group(1); // yields "XoooX"
m.group(2); // yields "NOUN"
m.group(3); // yields "XccccccX"
There is a group 0, but that matches the whole pattern, and it's equivalent to this
m.group(); // yields "XoooXNOUNXccccccX"
For more information about what you can do with the Matcher, including ways to get the start and end positions of your pattern within the source string, see the Matcher JavaDocs
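As a sketch of the two points made in this answer, the snippet below reads the position of group 1 with m.start(1) and m.end(1), and contrasts the reluctant (.*?) with a greedy (.*). The class name, helper names and the second sample line are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NounBetweenMarkers {
    static final Pattern RELUCTANT = Pattern.compile("Xo+X(.*?)Xc+X");
    static final Pattern GREEDY = Pattern.compile("Xo+X(.*)Xc+X");

    // Returns group 1 of the first match, or null when the pattern is not found.
    static String capture(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Returns {start, end} of group 1 for the reluctant pattern, or null when not found.
    static int[] nounSpan(String line) {
        Matcher m = RELUCTANT.matcher(line);
        return m.find() ? new int[] { m.start(1), m.end(1) } : null;
    }

    public static void main(String[] args) {
        String line = "unknownXoooXNOUNXccccccXunknown";
        int[] span = nounSpan(line);
        System.out.println(capture(RELUCTANT, line) + " at " + span[0] + ".." + span[1]);

        // Made-up line where the end marker appears twice.
        String repeated = "aXoXNOUNXcXmoreXccXend";
        System.out.println(capture(RELUCTANT, repeated)); // stops at the first end marker
        System.out.println(capture(GREEDY, repeated));    // runs on to the last end marker
    }
}
```

For the question's sample line, the start of group 1 is 12, which agrees with the goal value given in the question.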
You should use String.split() for readability unless you're in a tight loop.
Per split()'s javadoc, split() does the equivalent of Pattern.compile(), which you can optimize away if you're in a tight loop.
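If the split does end up in a tight loop, one possible shape is to compile the pattern once and call Pattern.split on it. This is only a sketch; the class and method names are invented here.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PrecompiledSplit {
    // Compiled once and reused across calls.
    private static final Pattern START_MARKER = Pattern.compile("Xo+X");

    // Splits the line into at most two parts around the first start marker.
    static String[] splitAtStartMarker(String line) {
        return START_MARKER.split(line, 2);
    }

    public static void main(String[] args) {
        String[] parts = splitAtStartMarker("unknownXoooXNOUNXccccccXunknown");
        System.out.println(Arrays.toString(parts));
    }
}
```

When the marker is absent, the returned array has a single element containing the whole line, so the length check from the question still applies.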
It looks like you want to get a unique occurrence. For this, simply do
input.replaceAll(".*Xo+X(.*)Xc+X.*", "$1")
For efficiency, compile the Pattern once and use pattern.matcher(input).replaceAll(...) instead.
In case your input contains line breaks, use Pattern.DOTALL or the (?s) modifier.
If you want to use split, consider using Guava's Splitter. It behaves more sanely and also accepts a Pattern, which is good for speed.
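A minimal sketch of the replaceAll variant with a precompiled pattern and Pattern.DOTALL, assuming the markers occur at most once in the input. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class NounByReplace {
    // DOTALL lets '.' match line terminators, so the surrounding text may span lines.
    private static final Pattern WHOLE =
            Pattern.compile(".*Xo+X(.*)Xc+X.*", Pattern.DOTALL);

    // Replaces the whole input with group 1.
    // If the markers are missing, the input is returned unchanged.
    static String extract(String input) {
        return WHOLE.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(extract("unknownXoooXNOUNXccccccXunknown"));
        System.out.println(extract("first line\nXoXNOUNXcX\nlast line"));
    }
}
```

Because a failed match leaves the input untouched, a caller that needs to tell "no noun" apart from "noun equals input" would have to check for the markers separately.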
If you really need the locations you can do it like this:
String line = "unknownXoooXNOUNXccccccXunknown";
String regexp1 = "Xo+X";
String regexp2 = "Xc+X";
Matcher m = Pattern.compile(regexp1).matcher(line);
if (m.find()) {
    int start = m.end();
    if (m.usePattern(Pattern.compile(regexp2)).find()) {
        final int end = m.start();
        System.out.println("from " + start + " to " + end + " is " + line.substring(start, end));
    }
}
But if you just need the word in between, I recommend the way Ian McLaird has shown.
I am using Java to do a regular expression match. I am using Rubular to verify the match and Ideone to test my code.
I got a regex from this SO solution, and it matches the group as I want it to in Rubular, but my implementation in Java is not matching. When it prints 'value', it prints the value of commaSeparatedString and not matcher.group(1). I want the captured group, and therefore the println output, to be "v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
//match everything after first comma
String myRegex = ",(.*)";
Pattern pattern = Pattern.compile(myRegex);
Matcher matcher = pattern.matcher(commaSeparatedString);
String value = "";
if (matcher.matches())
    value = matcher.group(1);
else
    value = commaSeparatedString;
System.out.println(value);
(edit: I left out that commaSeparatedString will not always contain 2 commas. Rather, it will always contain 0 or more commas)
If you don't have to solve it with regex, you can try this:
int size = commaSeparatedString.length();
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1,size);
In other words, the code above returns the substring that starts just after the first comma.
EDIT:
Sorry, I've omitted the simpler version. Thanks to one of the commenters, you can use this single line as well:
value = commaSeparatedString.substring(commaSeparatedString.indexOf(",") + 1);
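The indexOf and substring approach can be wrapped in a small helper, sketched below under the assumption that a string without any comma should be returned unchanged (the question's edit says zero commas is possible). The class and method names are made up.

```java
class AfterFirstComma {
    // Returns everything after the first comma, or the whole string when there is no comma.
    static String afterFirstComma(String s) {
        // indexOf returns -1 when no comma is present, so adding 1 yields 0
        // and substring(0) is the whole string.
        return s.substring(s.indexOf(',') + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterFirstComma("Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso"));
        System.out.println(afterFirstComma("Vtest7"));
    }
}
```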
The definition of the regex is wrong. It should be:
String myRegex = "[^,]*,(.*)";
You are yet another victim of Java's misguided regex method naming.
.matches() automatically anchors the regex at the beginning and end (which is in total contradiction with the very definition of "regex matching"). The method you are looking for is .find().
However, for such a simple problem, it is better to go with @DelShekasteh's solution.
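To make the difference concrete, here is a short sketch that applies the question's regex with both methods. The class and method names are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVersusFind {
    private static final Pattern AFTER_COMMA = Pattern.compile(",(.*)");

    // matches() succeeds only if the pattern covers the entire input.
    static boolean wholeInputMatches(String s) {
        return AFTER_COMMA.matcher(s).matches();
    }

    // find() succeeds if the pattern occurs anywhere in the input.
    // Falls back to the input itself when there is no comma.
    static String afterFirstComma(String s) {
        Matcher m = AFTER_COMMA.matcher(s);
        return m.find() ? m.group(1) : s;
    }

    public static void main(String[] args) {
        String input = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
        System.out.println(wholeInputMatches(input)); // input does not start with a comma
        System.out.println(afterFirstComma(input));
    }
}
```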
I would do this like
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.substring(commaSeparatedString.indexOf(",")+1));
Here is another approach with limited split
String[] spl = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso".split(",", 2);
if (spl.length == 2)
    System.out.println(spl[1]);
But IMHO Del's answer is best for your case.
I would use replaceFirst
String commaSeparatedString = "Vtest7,v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso";
System.out.println(commaSeparatedString.replaceFirst(".*?,", ""));
prints
v123_gpbpvl-testpv1,v223_gpbpvl-testpv1-iso
or you could use the shorter but more obscure
System.out.println(commaSeparatedString.split(",", 2)[1]);
Suppose I have a key with the following sequence of characters: _(some number)_1. How can I return just (some number)?
For example, if the key is _6654_1, I just need the value 6654. What's really confusing me is that the number could be any length, like _9332123425234_1, in which case I would need just 9332123425234.
Here's what I've tried so far:
Pattern p = Pattern.compile("_[\\d]_1");
Matcher match = p.matcher(request.getParameter("course_id"));
but this won't cover the case where the middle number can be any length (not just four digits), will it?
You could just figure out the indexOf('_') and then use substring. No need for regular expressions.
...but since you asked for regular expressions, here you go:
import java.util.regex.*;
class Test {
    public static void main(String[] args) {
        String str = "_6654_1";
        Pattern p = Pattern.compile("_(\\d+)_1");
        Matcher m = p.matcher(str);
        if (m.matches())
            System.out.println(m.group(1)); // prints 6654
    }
}
(And here is the substring-approach for comparison:)
String str = "_6654_1";
String num = str.substring(1, str.indexOf('_', 1));
System.out.println(num); // prints 6654
And, a final solution, using a simple split("_"):
String str = "_6654_1";
System.out.println(str.split("_")[1]); // prints.... you guessed it: 6654
Do you really need regexp? You can use substring and indexOf:
String st = "_9332123425234_1";
String number = st.substring(1,st.indexOf('_',1));
Assuming you have the underscores before and after your digit sequence, you could use _(\d+)_ to create a Capturing Group.
See http://download.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
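A sketch of the capturing-group approach as a reusable helper, assuming the key always ends in _1 as in the question. The class and method names are hypothetical. The digits are kept as a String because the longer example from the question does not fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CourseKeyParser {
    // Underscore, one or more digits (captured), then the literal suffix "_1".
    private static final Pattern KEY = Pattern.compile("_(\\d+)_1");

    // Returns the digits between the underscores, or null if the key has a different shape.
    static String middleDigits(String key) {
        Matcher m = KEY.matcher(key);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String digits = middleDigits("_9332123425234_1");
        System.out.println(digits);
        // Too large for int, but it fits in a long.
        System.out.println(Long.parseLong(digits));
    }
}
```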
You might also want to consider using a Splitter:
This might be more efficient than a regex, and since it returns all the elements, you will have the elements before and after as well as the number in the middle. So, if you eventually need the number after the second "_", this might be the better way to go.