I know there is a lot out there.. But I couldn't figure out how to extract two parameters from Scanner(System.in);
commandline = scanner.readLine();
Two parameters are allowed:
The first one can be one of A, H, or G, or a digit from 4 to 9.
The second parameter is again a digit from 4 to 9, or any number.
It should handle all of these scenarios:
" A 3 " //spaces before and after
"A 3" // spaces between the params
"A 7 6" // Unwanted 3rd parameter
" 6 " // Only one param with spaces.
So how to write Regex for this to extract the above?
I tried this one. \\w\\s. But this did not work. I am poor with RegEx.
Use this on the string returned by readLine():
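String[] arguments = commandLine.trim().split("\\s+");
The \\s+ stands for at least one whitespace character as separator. The trim() call removes leading and trailing whitespace first; without it, an input such as " A 3 " would produce an empty first element.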
Then check how many elements the array has.
Finally, check the formats of the two arguments:
arguments[0].matches("\\s*[AHG4-9]");
arguments[1].matches("\\d");
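The split-then-validate steps above can be sketched as one small method. This is only an illustration: the class name CommandLineArgs and the method parse are made up here, rejecting a third parameter is an assumption about what the asker wants, and the second argument is read as "any number" (one or more digits).

```java
import java.util.Arrays;

public class CommandLineArgs {
    // Returns the one or two validated arguments, or null if the line is invalid.
    static String[] parse(String commandLine) {
        String trimmed = commandLine.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        String[] arguments = trimmed.split("\\s+");
        if (arguments.length > 2) {
            return null; // assumption: an unwanted third parameter is rejected
        }
        if (!arguments[0].matches("[AHG4-9]")) {
            return null;
        }
        if (arguments.length == 2 && !arguments[1].matches("\\d+")) {
            return null;
        }
        return arguments;
    }

    public static void main(String[] args) {
        for (String line : new String[] {" A 3 ", "A    3", "A 7 6", " 6 "}) {
            System.out.println(Arrays.toString(parse(line)));
        }
    }
}
```

If the third parameter should be ignored rather than rejected, the length check could be dropped and only the first two elements validated.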
Try:
public static ArrayList<String> parseArguments(String argument) {
    Pattern regex = Pattern.compile("^\\s*([AHG4-9])\\s*(\\d)?\\s*$",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    Matcher regexMatcher = regex.matcher(argument);
    if (regexMatcher.find()) {
        ArrayList<String> arguments = new ArrayList<String>();
        arguments.add(regexMatcher.group(1));
        if (regexMatcher.group(2) != null) {
            arguments.add(regexMatcher.group(2));
        }
        return arguments;
    }
    return null;
}
For example, with the input " A 3 ", printing the returned list gives:
[A, 3]
The regex above also enforces the argument rules: as you mention, the first parameter can be A, H, G or a digit between 4 and 9. The second argument is a single digit and is optional.
I have a set of strings I need to parse and extract values from. They look like:
/apple/1212d3fe
/cat/23224a2f4
/auto/445478eefd
/somethingelse/1234fded
It should match only apple, cat and auto. The output I expect is:
1212, d3fe
23224, a2f4
445478, eefd
null
I need to come up with a regex capturing groups to do the same. I am able to extract the second part but not the first one. The closest I came up with is:
String r2 = "^/(apple/[0-9]{4}|cat/[0-9]{5}|auto/[0-9]{6})([a-f0-9]{4})$";
System.out.println(r2);
Pattern pattern2 = Pattern.compile(r2);
Matcher matcher2 = pattern2.matcher("/apple/2323efff");
if (matcher2.find()) {
System.out.println(matcher2.group(1));
System.out.println(matcher2.group(2));
}
UPDATED QUESTION:
I have a set of strings I need to parse and extract values from. They look like:
/apple/1212d3fe
/cat/23e24a2f4
/auto/df5478eefd
/somethingelse/1234fded
It should match only apple, cat and auto. The output I expect is everything after the 2nd '/', split as follows: 4 characters if 'apple', 5 characters if 'cat' and 6 characters if 'auto', like:
1212, d3fe
23e24, a2f4
df5478, eefd
null
I need to come up with a regex capturing groups to do the same. I am able to extract the second part but not the first one. The closest I came up with is:
String r2 = "^/(apple/[0-9]{4}|cat/[0-9]{5}|auto/[0-9]{6})([a-f0-9]{4})$";
System.out.println(r2);
Pattern pattern2 = Pattern.compile(r2);
Matcher matcher2 = pattern2.matcher("/apple/2323efff");
if (matcher2.find()) {
System.out.println(matcher2.group(1));
System.out.println(matcher2.group(2));
}
I can do it without the regex OR(|) but it breaks when I include it. Any help with the right regex?
Updated Answer:
As per your updated question you can use this regex based on lookbehind assertions:
/((?<=apple/).{4}|(?<=cat/).{5}|(?<=auto/).{6})(.+)$
This regex uses 2 capture groups after matching /.
In the 1st group we have 3 lookbehind conditions in an alternation.
(?<=apple/).{4} makes sure that we match 4 characters that have apple/ on the left-hand side. Likewise, we match 5- and 6-character strings that have cat/ and auto/ on the left.
In the 2nd capture group we match the remaining characters before the end of the line.
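As a rough check of the lookbehind regex, here is a small sketch that runs it over the strings from the updated question. The class name PrefixSplit and the helper split are invented for this example; only the regex itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixSplit {
    // The regex from the answer, with the slashes left unescaped (Java does not need \/).
    private static final Pattern PREFIXED = Pattern.compile(
            "/((?<=apple/).{4}|(?<=cat/).{5}|(?<=auto/).{6})(.+)$");

    // Returns "first, second" for apple/cat/auto paths, or null otherwise.
    static String split(String input) {
        Matcher m = PREFIXED.matcher(input);
        return m.find() ? m.group(1) + ", " + m.group(2) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/apple/1212d3fe", "/cat/23e24a2f4", "/auto/df5478eefd", "/somethingelse/1234fded"
        };
        for (String input : inputs) {
            System.out.println(split(input));
        }
    }
}
```

The leading / in the pattern ends up matching the second slash of the path, because that is the only position where one of the lookbehinds succeeds.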
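You could use the regex \/(?:apple|auto|cat)\/(\d*)(.*). Note that the alternatives have to go in a group: square brackets, as in [apple|auto|cat]+, would form a character class that matches any run of those letters rather than the three words.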
If you want the last group to have exactly 4 hex characters you can use this regex:
/(apple|cat|auto)/([0-9a-f]+)([0-9a-f]{4})
Here is a working example:
List<String> strings = Arrays.asList("/apple/1212d3fe", "/cat/23224a2f4", "/auto/445478eefd");
Pattern pattern = Pattern.compile("/(apple|cat|auto)/([0-9a-f]+)([0-9a-f]{4})");
for (String string : strings) {
Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
System.out.println(matcher.group(3));
}
}
If you want four digits after apple, 5 after cat and 6 after auto, you can split your algorithm into 2 parts:
List<String> strings = Arrays.asList("/apple/1212d3fe", "/cat/23224a2f4", "/auto/445478eefd", "/some/445478eefd");
Pattern firstPattern = Pattern.compile("/(apple|cat|auto)/([0-9a-f]+)");
for (String string : strings) {
Matcher firstMatcher = firstPattern.matcher(string);
if (firstMatcher.find()) {
String first = firstMatcher.group(1);
System.out.println(first);
int length = getLength(first);
Pattern secondPattern = Pattern.compile("([0-9a-f]{" + length + "})([0-9a-f]{4})");
Matcher secondMatcher = secondPattern.matcher(string);
if (secondMatcher.find()) {
System.out.println(secondMatcher.group(1));
System.out.println(secondMatcher.group(2));
}
}
}
private static int getLength(String key) {
switch (key) {
case "apple":
return 4;
case "cat":
return 5;
case "auto":
return 6;
}
throw new IllegalArgumentException("key not allowed");
}
I want to split a string after a certain length.
Let's say we have a string of "message"
123456789
Split like this :
"12" "34" "567" "89"
I thought of splitting them into 2 first using
"(?<=\\G.{2})"
regexp, then joining the last two and splitting again into 3, but is there any way to do it in a single go using a regexp? Please help me out.
Use ^(.{2})(.{2})(.{3})(.{2}).* to group the String into the specified lengths and grab the groups as separate Strings:
String input = "123456789";
List<String> output = new ArrayList<>();
Pattern pattern = Pattern.compile("^(.{2})(.{2})(.{3})(.{2}).*");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
output.add(matcher.group(i));
}
}
System.out.println(output);
NOTE: Group capturing starts from 1 as the group 0 matches the whole String
And a Magnificent Sorcery from @YCF_L in the comments:
String pattern = "^(.{2})(.{2})(.{3})(.{2}).*";
String[] vals = "123456789".replaceAll(pattern, "$1-$2-$3-$4").split("-");
The magic here is that the replacement string passed to replaceAll() can refer to the captured groups. Use $n (where n is a digit) to refer to captured subsequences.
NOTE: here it's assumed that no input string contains - in it.
If one does, then find any other character that will not be in any of
your input strings so that it can be used as a delimiter.
test this regex in regex101 with 123456789 test string.
^(\d{2})(\d{2})(\d{3})(\d{2})$
output :
Match 1
Full match 0-9 `123456789`
Group 1. 0-2 `12`
Group 2. 2-4 `34`
Group 3. 4-7 `567`
Group 4. 7-9 `89`
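For comparison, the same fixed-width split can be done without a regex at all. This is a swapped-in technique, not something any of the answers proposed: a plain substring loop over a list of lengths. The class FixedWidthSplit and the method splitByLengths are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSplit {
    // Cuts the input into consecutive pieces of the given lengths; anything left over is ignored.
    static List<String> splitByLengths(String input, int... lengths) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int length : lengths) {
            if (start + length > input.length()) {
                throw new IllegalArgumentException("input too short for the requested lengths");
            }
            parts.add(input.substring(start, start + length));
            start += length;
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitByLengths("123456789", 2, 2, 3, 2)); // prints [12, 34, 567, 89]
    }
}
```

This avoids the delimiter caveat of the replaceAll() trick, at the cost of not validating that the characters are digits.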
With this regex :
private static String p = "^\\(([-+]?([1-8]?\\d(\\.\\d+)?|90(\\.0+)?))\\,([-+]?(180(\\.0+)?|((1[0-7]\\d)|([1-9]?\\d))(\\.\\d+)?))\\)$";//"^(\\-?\d+(\.\d+)?),\s*(\\-?\d+(\\.\d+)?)$";
It is impossible for me to get the values and i don't understand why...
With an input like that :
(50,180) //or even
(-50,-180)
Why doesn't my regex give me the number 180 when it can give me the value 50?
I mean, my Pattern object can always get the first value, after the parenthesis and before the ",", but can't get the value after the ",".
What's the problem with my regex?
My code:
private static String patternGeographicCoordinates = "^\\(([-+]?([1-8]?\\d(\\.\\d+)?|90(\\.0+)?))\\,([-+]?(180(\\.0+)?|((1[0-7]\\d)|([1-9]?\\d))(\\.\\d+)?))\\)$";
....
Pattern geographicCoordinates = Pattern.compile(patternGeographicCoordinates);
try(BufferedReader br = new BufferedReader(new FileReader(file))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
....
Matcher m1 = geographicCoordinates.matcher(line); //line is a line from a file (String)
....
if(m1.matches()){
System.out.println("IT DID WORK, LINE: "+line+", M.GROUP: "+m1.group(3));
sb.append(line);
sb.append(System.lineSeparator());
}
Why don't you just remove the parentheses and split around the comma?
import org.apache.commons.lang3.StringUtils;
...
theString = StringUtils.strip(theString, "()");
String[] tokens = theString.split(",");
Double number2 = Double.parseDouble(tokens[1]);
If you want to use regex anyway, you can do it like:
Pattern p = Pattern.compile("\\(([-]?\\d+)\\s*\\,\\s*([-]?\\d+)\\)$");
String input = "(-50,-80)";
Matcher m = p.matcher(input);
if(m.find())
{
System.out.println(m.group(1));
System.out.println(m.group(2));
}
You're looking at the wrong group indices: the second number is in group 5, not group 3. Check your regexp with this parser: https://regex101.com/
Here are the matching groups for the input (50,180):
1. [1-3] `50`
2. [1-3] `50`
5. [4-7] `180`
6. [4-7] `180`
Update
The regexp is made for more complex inputs than you supply in your example, that's why there are groups with null values. The additional groups are for decimal parts and special cases (apparently meaningful for coordinate parsing).
Look at the input (90.00,180.00). It's parsed into the following groups:
1. [1-6] `90.00`
2. [1-6] `90.00`
4. [3-6] `.00`
5. [7-13] `180.00`
6. [7-13] `180.00`
7. [10-13] `.00`
Now group 4 is matching the (\.0+)? after 90 and group 7 is matching the (\.0+)? after 180. You see that |90 is an alternative, a special case of 90.00 degrees presumably. That's why group 3 is still empty but 4 is filled.
With input (85.21,150.34) you will get more groups filled:
1. [1-6] `85.21`
2. [1-6] `85.21`
3. [3-6] `.21`
5. [7-13] `150.34`
6. [7-13] `150.34`
8. [7-10] `150`
9. [7-10] `150`
11. [10-13] `.34`
Now group 3 is filled, but not group 4, because this is the [1-8]?\d case.
Also, since you have nested groups, same values are assigned twice: to 1 and 2 for instance.
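To make the group numbering concrete, here is a small sketch that reads both numbers with the asker's original pattern, taking the first value from group 1 and the second from group 5. The class name CoordinateGroups and the method latLon are made up for the example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoordinateGroups {
    // The pattern from the question, unchanged.
    private static final Pattern COORDS = Pattern.compile(
            "^\\(([-+]?([1-8]?\\d(\\.\\d+)?|90(\\.0+)?))\\,"
            + "([-+]?(180(\\.0+)?|((1[0-7]\\d)|([1-9]?\\d))(\\.\\d+)?))\\)$");

    // Returns {first, second} as strings, or null if the line does not match.
    static String[] latLon(String line) {
        Matcher m = COORDS.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Group 1 is the whole first number, group 5 the whole second number.
        return new String[] {m.group(1), m.group(5)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(latLon("(50,180)")));       // prints [50, 180]
        System.out.println(Arrays.toString(latLon("(-50,-180)")));     // prints [-50, -180]
        System.out.println(Arrays.toString(latLon("(85.21,150.34)"))); // prints [85.21, 150.34]
    }
}
```

Turning the inner groups into non-capturing ones, (?:...), would leave just two groups and make the indices harder to get wrong.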
i have a problem to build following regex:
[1,2,3,4]
i found a work-around, but i think its ugly
String stringIds = "[1,2,3,4]";
stringIds = stringIds.replaceAll("\\[", "");
stringIds = stringIds.replaceAll("\\]", "");
String[] ids = stringIds.split("\\,");
Can someone help me please to build one regex, which i can use in the split function
Thanks for help
edit:
I want to get from the string "[1,2,3,4]" to an array with 4 entries. The entries are the 4 numbers in the string, so I need to eliminate "[", "]" and ",". The "," isn't the problem.
The first and last numbers contain [ or ], so I needed the fix with replaceAll. But I think if I pass split a regex for ",", I can also pass a regex which eliminates "[" and "]" too. I can't figure out how this regex should look.
This is almost what you're looking for:
String q = "[1,2,3,4]";
String[] x = q.split("\\[|\\]|,");
The problem is that it produces an extra element at the beginning of the array due to the leading open bracket. You may not be able to do what you want with a single regex sans shenanigans. If you know the string always begins with an open bracket, you can remove it first.
The regex itself means "(split on) any open bracket, OR any closed bracket, OR any comma."
Punctuation characters frequently have additional meanings in regular expressions. The double leading backslashes... ugh, the first backslash tells the Java String parser that the next backslash is not a special character (example: \n is a newline...) so \\ means "I want an honest to God backslash". The next backslash tells the regexp engine that the next character ([ for example) is not a special regexp character. That makes me lol.
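A short sketch of both points: the escaped split regex and the extra leading element it produces. The class BracketSplit and the method ids are hypothetical, and stripping the brackets with one anchored replaceAll before splitting is just one possible workaround.

```java
import java.util.Arrays;

public class BracketSplit {
    // Removes one leading "[" and one trailing "]", then splits on commas.
    static String[] ids(String input) {
        return input.replaceAll("^\\[|\\]$", "").split(",");
    }

    public static void main(String[] args) {
        // In Java source "\\[" is a two-character string: a backslash followed by "[".
        System.out.println("\\[".length()); // prints 2

        // Splitting on brackets and commas leaves an empty string in front of the "1".
        String[] raw = "[1,2,3,4]".split("\\[|\\]|,");
        System.out.println(raw.length); // prints 5

        System.out.println(Arrays.toString(ids("[1,2,3,4]"))); // prints [1, 2, 3, 4]
    }
}
```

The trailing empty string after the closing bracket does not show up because split drops trailing empty strings by default.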
Maybe substring [ and ] from beginning and end, then split the rest by ,
String stringIds = "[1,2,3,4]";
String[] ids = stringIds.substring(1,stringIds.length()-1).split(",");
Looks to me like you're trying to make an array (not sure where you got 'regex' from; that means something different). In this case, you want:
String[] ids = {"1","2","3","4"};
If it's specifically an array of integer numbers you want, then instead use:
int[] ids = {1,2,3,4};
Your problem is not amenable to splitting by delimiter. It is much safer and more general to split by matching the integers themselves:
static String[] nums(String in) {
    final Matcher m = Pattern.compile("\\d+").matcher(in);
    final List<String> l = new ArrayList<String>();
    while (m.find()) l.add(m.group());
    return l.toArray(new String[l.size()]);
}
public static void main(String args[]) {
    System.out.println(Arrays.toString(nums("[1, 2, 3, 4]")));
}
If the first line of your code is the following:
String stringIds = "[1,2,3,4]";
and you're trying to iterate over all the number items, then the following code fragment should work:
try {
    Pattern regex = Pattern.compile("\\b(\\d+)\\b", Pattern.MULTILINE);
    Matcher regexMatcher = regex.matcher(stringIds);
    while (regexMatcher.find()) {
        for (int i = 1; i <= regexMatcher.groupCount(); i++) {
            // matched text: regexMatcher.group(i)
            // match start: regexMatcher.start(i)
            // match end: regexMatcher.end(i)
        }
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
I am trying to use a simple split to break up the following string: 00-00000
My expression is: ^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])
And my usage is:
String s = "00-00000";
String pattern = "^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])";
String[] parts = s.split(pattern);
If I play around with the Pattern and Matcher classes I can see that my pattern does match and the matcher tells me my groupCount is 7 which is correct. But when I try and split them I have no luck.
String.split does not use capturing groups as its result. It finds whatever matches and uses that as the delimiter. So the resulting String[] are substrings in between what the regex matches. As it is the regex matches the whole string, and with the whole string as a delimiter there is nothing else left so it returns an empty array.
If you want to use regex capturing groups you will have to use Matcher.group(), String.split() will not do.
For your example, you could simply do this:
String s = "00-00000";
String pattern = "-";
String[] parts = s.split(pattern);
I can not be sure, but I think what you are trying to do is to get each matched group into an array.
Matcher matcher = Pattern.compile(pattern).matcher(s);
if (matcher.matches()) {
    String[] groups = new String[matcher.groupCount()];
    for (int i = 0; i < matcher.groupCount(); i++) {
        // group 0 is the whole match, so the capturing groups start at 1
        groups[i] = matcher.group(i + 1);
    }
}
From the documentation:
String[] split(String regex) -- Returns: the array of strings computed by splitting this string around matches of the given regular expression
Essentially the regular expression is used to define delimiters in the input string. You can use capturing groups and backreferences in your pattern (e.g. for lookarounds), but ultimately what matters is what and where the pattern matches, because that defines what goes into the returned array.
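A quick sketch of how the match positions decide what split returns, using the string from the question. The class SplitDelimiters and the helper show are invented for this illustration, and the lookbehind variant is only one example of a zero-width delimiter.

```java
import java.util.Arrays;

public class SplitDelimiters {
    // Splits the input and renders the resulting array as text.
    static String show(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // The hyphen is the delimiter and is consumed.
        System.out.println(show("00-00000", "-")); // prints [00, 00000]

        // A zero-width lookbehind splits after the hyphen without consuming it.
        System.out.println(show("00-00000", "(?<=-)")); // prints [00-, 00000]

        // The pattern from the question matches the whole string, so nothing is left over.
        System.out.println(show("00-00000",
                "^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])")); // prints []
    }
}
```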
If you want to split your original string into 7 parts using regular expression, then you can do something like this:
String s = "12-3456";
String[] parts = s.split("(?!^)");
System.out.println(parts.length); // prints "7"
for (String part : parts) {
    System.out.print("[" + part + "] ");
} // prints "[1] [2] [-] [3] [4] [5] [6] "
This splits on the zero-length matching assertion (?!^), which matches anywhere except before the first character in the string. This prevents the empty string from being the first element in the array, and the trailing empty string is already discarded because we use the default limit parameter of split.
Using a regular expression to get the individual characters of a string like this is overkill, though. If you have only a few characters, then the most concise option is to use foreach on toCharArray():
for (char ch : "12-3456".toCharArray()) {
System.out.print("[" + ch + "] ");
}
This is not the most efficient option if you have a longer string.
Splitting on -
This may also be what you're looking for:
String s = "12-3456";
String[] parts = s.split("-");
System.out.println(parts.length); // prints "2"
for (String part : parts) {
System.out.print("[" + part + "] ");
} // prints "[12] [3456] "