Single Digits From String In Java Regex - java

Given a string
1 3 2 1 9 1 bla 3 4 3
I found that
\b[1-4]\b
matches only the digits 1, 2, 3 and 4, but String[] input = args[0].split("\b[1-4]\b"); does not return
{"1","3","2","1","1","3","4","3"}

I assume that you only want digits between 1 and 4. A simple split is not going to be enough. One approach could be something like this:
String str = "1 3 2 1 9 1 bla 3 4 3";
String[] splitAndFilter = Pattern.compile("\\s+")
.splitAsStream(str)
.filter(s -> s.matches("[1-4]"))
.toArray(String[]::new);
System.out.println(Arrays.toString(splitAndFilter));

The problem with your current approach is that you are trying to split on the numbers themselves. This won't give the intended result, because whatever you split on is consumed (read: removed), leaving everything else behind. Instead, try splitting on [^1-4]+:
String input = "1 3 2 1 9 1 bla 3 4 3";
String[] parts = input.split("[^1-4]+");
System.out.println(Arrays.toString(parts));
This prints:
[1, 3, 2, 1, 1, 3, 4, 3]
This splits on runs of one or more characters other than 1-4. It works for your input string because the whitespace, the non-matching digit 9 and the word bla all act as delimiters and are removed.
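The split approach has a few edge cases worth knowing about. The sketch below (the class and method names are made up for illustration) shows that a string which starts with a delimiter yields an empty first element, and one way to filter it out. Note also that adjacent digits are not separated: an input such as 12 stays "12", and 19 yields "1".

```java
import java.util.Arrays;

class SplitCaveatDemo {
    // Hypothetical helper: split on runs of characters other than 1-4,
    // then drop the empty string that a leading delimiter produces.
    static String[] digitsOneToFour(String input) {
        return Arrays.stream(input.split("[^1-4]+"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // A plain split keeps an empty first element when the input starts with a delimiter.
        System.out.println(Arrays.toString("bla 1 2".split("[^1-4]+"))); // [, 1, 2]
        // Filtering removes it.
        System.out.println(Arrays.toString(digitsOneToFour("bla 1 2"))); // [1, 2]
    }
}
```

If the input is guaranteed to start with a wanted digit, as in the question, the plain split is enough.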

You can use just [1-4] as the regex and collect the matches (Matcher.results() requires Java 9 or later).
import java.util.Arrays;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
class Main {
public static void main(String[] args) {
String[] matches = Pattern.compile("[1-4]")
.matcher(args[0])
.results()
.map(MatchResult::group)
.toArray(String[]::new);
System.out.println(Arrays.toString(matches));
}
}
Output:
[1, 3, 2, 1, 1, 3, 4, 3]
where the command-line argument is "1 3 2 1 9 1 bla 3 4 3"
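A bare [1-4] also matches digits inside longer tokens, for example the 1 and the 2 in 12. If only standalone digits are wanted, a possible variant is to keep the word boundaries from the question and collect matches with a find() loop, which also works before Java 9. In Java source the regex \b has to be written "\\b", because "\b" in a string literal is the backspace character. The class and method names below are illustrative, not from the original answers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StandaloneDigitsDemo {
    // \b on both sides: the digit must not touch another word character.
    private static final Pattern STANDALONE = Pattern.compile("\\b[1-4]\\b");

    // Hypothetical helper: collect every standalone digit 1-4 in order of appearance.
    static List<String> standaloneDigits(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = STANDALONE.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The trailing 12 is skipped because neither digit stands alone.
        System.out.println(standaloneDigits("1 3 2 1 9 1 bla 3 4 3 12"));
        // [1, 3, 2, 1, 1, 3, 4, 3]
    }
}
```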

Related

java regex split string by multiple chars and spaces

I have a string that I need to split on letters and spaces. There can be one or more letters or spaces between the numbers.
String a="1a2bc3 4d5 6ads";
a.split(" ");
I want 1 2 3 4 5 6.
Please advise how to include the letters in the split.
Here is a streamlined regex splitting solution:
String input = "1a2bc3 4d5 6ads";
String[] nums = input.split("\\D+");
System.out.println(Arrays.toString(nums)); // [1, 2, 3, 4, 5, 6]
The idea here is to split the input on groups of one or more non-digit characters. This yields an array consisting only of the numbers, which are what remains after the split.
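If the numbers are needed as integers, or the input may start with letters, the same \D+ idea can be combined with a stream. This is only a sketch under those assumptions; the class and method names are invented for the example, and Integer.parseInt will throw for numbers that do not fit in an int.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class NumbersFromMixedDemo {
    private static final Pattern NON_DIGITS = Pattern.compile("\\D+");

    // Hypothetical helper: split on non-digit runs, skip empty pieces, parse the rest.
    static int[] numbers(String input) {
        return NON_DIGITS.splitAsStream(input)
                .filter(s -> !s.isEmpty())   // guards against a leading delimiter
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(numbers("1a2bc3 4d5 6ads"))); // [1, 2, 3, 4, 5, 6]
        System.out.println(Arrays.toString(numbers("x7y8")));            // [7, 8]
    }
}
```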

Regex to retrieve last digits from a string and sort them

I have a list of strings that I want to sort by the last number in each string. I tried the code below, but it also picks up the digits before the last number; for instance, for the string "abc\xyz 2 5" it uses 25 instead of just 5, so the sort order is wrong. What is wrong with my regex?
Note: the last two numbers will always be timestamps like 1571807700009 1571807700009.
Here's what I have tried so far.
public static void second() {
List<String> strings = Arrays.asList("abc\\xyz 2 5", "abc\\\\xyz 1 8", "abc\\\\xyz 1 9", "abc\\\\xyz 1 7", "abc\\\\xyz 1 3");
Collections.sort(strings, new Comparator<String>() {
public int compare(String o1, String o2) {
return (int) (extractInt(o1) - extractInt(o2));
}
Long extractInt(String s) {
String num = s.replaceAll("\\D", "");
return Long.parseLong(num);
}
});
System.out.println(strings);
}
Output
[abc\\xyz 1 3, abc\\xyz 1 7, abc\\xyz 1 8, abc\\xyz 1 9, abc\xyz 2 5]
Expected Output
[abc\\xyz 1 3, abc\xyz 2 5, abc\\xyz 1 7, abc\\xyz 1 8, abc\\xyz 1 9]
Using a stream, sort on the last integer only: strip everything up to and including the last run of whitespace, then parse what remains. Comparator.comparing accepts that key-extracting function directly.
List<String> strings = Arrays.asList("abc\\xyz 2 5", "abc\\\\xyz 1 8",
"abc\\\\xyz 1 9", "abc\\\\xyz 1 7", "abc\\\\xyz 1 3");
strings = strings.stream()
.sorted(Comparator.comparing(s -> Long.valueOf(s.replaceAll(".*\\s+", ""))))
.collect(Collectors.toList());
System.out.println(strings);
Change your extractInt method to this to remove everything except last number from input:
Long extractInt(String s) {
String num = s.replaceFirst("^.+\\b(\\d+)$", "$1");
return Long.parseLong(num);
}
The greedy .+ at the start consumes as much of the string as possible, so \b(\d+)$ matches only the final run of digits, which must begin at a word boundary and end at the end of the string. The whole string is then replaced by that captured group.
This will give following output:
[abc\\xyz 1 3, abc\xyz 2 5, abc\\xyz 1 7, abc\\xyz 1 8, abc\\xyz 1 9]
If your goal is to sort your strings by comparing only the last digit, you don't even need to parse that digit to an int or long. Assuming your strings always have a digit at the end:
Function<String,String> lastDigit = s -> s.substring(s.length()-1);
List<String> strings = Arrays.asList("abc\\xyz 2 5", "abc\\\\xyz 1 8", "abc\\\\xyz 1 9", "abc\\\\xyz 1 7", "abc\\\\xyz 1 3");
System.out.println("Before sorting: " + strings);
strings.sort(Comparator.comparing(lastDigit));
System.out.println("After sorting: " + strings);
EDIT
It seems you want to compare not just the last digit, as I assumed at the beginning, but the last number after the last space character. If so, use the similar approach below:
Function<String,Long> lastNum = s -> Long.valueOf(s.substring(s.lastIndexOf(" ")+1));
List<String> strings = Arrays.asList("abc\\xyz 2 5", "abc\\\\xyz 1 8", "abc\\\\xyz 1 9", "abc\\\\xyz 1 7", "abc\\\\xyz 1 3");
System.out.println("Before sorting: " + strings);
strings.sort(Comparator.comparing(lastNum));
System.out.println("After sorting: " + strings);
To compare the last number in each string you can just take the substring after the last space and parse it to a long.
For example:
strings = strings.stream().sorted(Comparator.comparing(
s -> Long.parseLong(s.substring(s.lastIndexOf(' ') + 1))
)).collect(Collectors.toList());
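Since the question mentions timestamp-sized values, one more detail may matter: a comparator written as (int) (a - b), as in the question, can give a wrong sign or report equality when the difference does not fit in an int. A minimal sketch using Comparator.comparingLong avoids that; the class and method names are made up for the example, and it assumes every string ends with a space followed by a number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LastNumberSortDemo {
    // Hypothetical helper: parse whatever follows the last space as a long.
    static long lastNumber(String s) {
        return Long.parseLong(s.substring(s.lastIndexOf(' ') + 1));
    }

    // Returns a sorted copy; comparingLong compares the keys without subtraction.
    static List<String> sortByLastNumber(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingLong(LastNumberSortDemo::lastNumber));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortByLastNumber(Arrays.asList("a 2 5", "a 1 8", "a 1 3")));
        // [a 1 3, a 2 5, a 1 8]

        // 4294967296 is 2^32: (int) (4294967296L - 0L) is 0, so a subtraction-based
        // comparator would treat these two as equal.
        System.out.println(sortByLastNumber(Arrays.asList("x 4294967296", "x 0")));
        // [x 0, x 4294967296]
    }
}
```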

Regex to split parameters

I am looking for a regular expression to split a string on commas. That sounds very simple, but there is an additional restriction: the parameters in the string may contain commas enclosed in parentheses, and those commas must not split the string.
Example:
1, 2, 3, add(4, 5, 6), 7, 8
 ^  ^  ^      !  !   ^  ^
The string should be split only at the commas marked with ^, not at those marked with !.
I found a solution for it here: A regex to match a comma that isn't surrounded by quotes
Regex:
,(?=([^\(]*\([^\)]*\))*[^\)]*$)
But my string could be more complex:
1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11
 ^  ^  ^      !  !      !  !   !   ^   ^
For this string the result is wrong, and I have no clue how to fix it, or whether it is even possible with regular expressions.
Does anyone have an idea how to resolve this problem?
Thanks for your help!
OK, I think a regular expression is not very useful for this. A small block of Java might be easier.
So this is my Java code for solving the problem:
public static void splitWithJava() {
String EXAMPLE = "1, 2, 3, add(4, 5, add(7, 8), 6), 7, 8";
List<String> list = new ArrayList<>();
int start = 0;
int pCount = 0;
for (int i = 0; i < EXAMPLE.length(); i++) {
char c = EXAMPLE.charAt(i);
switch (c) {
case ',': {
if (0 == pCount) {
list.add(EXAMPLE.substring(start, i).trim());
start = i + 1;
}
break;
}
case '(': {
pCount++;
break;
}
case ')': {
pCount--;
break;
}
}
}
list.add(EXAMPLE.substring(start).trim());
for (String str : list) {
System.out.println(str);
}
}
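The same depth-counting idea can be packaged as a reusable method that returns the parts instead of printing them. This is a sketch rather than the answer's exact code: the class and method names are invented, and it assumes the parentheses in the input are balanced.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitDemo {
    // Hypothetical helper: split on commas that are not inside parentheses.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        int depth = 0;  // current parenthesis nesting level
        int start = 0;  // start index of the part being collected
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(input.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(input.substring(start).trim()); // last part has no trailing comma
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11"));
        // [1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11]
    }
}
```

Because it tracks nesting depth with a counter, this handles any level of nesting, which a single regular expression without recursion support cannot do in general.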
You can also achieve this using this regex: ([^,(]+(?=,|$)|[\w]+\(.*\)(?=,|$))
Applied to the text 1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11, it matches each top-level item and keeps the commas inside parentheses as part of the item.
So, the output would be:
Match 1
Group 1. 0-1 `1`
Match 2
Group 1. 2-4 ` 2`
Match 3
Group 1. 5-7 ` 3`
Match 4
Group 1. 9-35 `add(4, 5, add(6, 7, 8), 9)`
Match 5
Group 1. 36-39 ` 10`
Match 6
Group 1. 40-43 ` 11`
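To try this regex from Java, a find() loop over the matches is enough. The sketch below (class and method names are illustrative) also shows a limitation: because .* is greedy, two separate calls at the top level are swallowed into a single match, so the regex only behaves as intended when there is at most one parenthesized item.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexItemsDemo {
    // The regex from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern ITEM =
            Pattern.compile("([^,(]+(?=,|$)|[\\w]+\\(.*\\)(?=,|$))");

    // Hypothetical helper: collect the trimmed text of every match.
    static List<String> items(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ITEM.matcher(input);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(items("1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11"));
        // [1, 2, 3, add(4, 5, add(6, 7, 8), 9), 10, 11]

        // Greedy .* runs from the first '(' to the last ')': one match, not three.
        System.out.println(items("f(1, 2), 3, g(4, 5)"));
        // [f(1, 2), 3, g(4, 5)]
    }
}
```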

Using Split() in arithmetic formula

I have been thinking about a problem for a day but still cannot solve it.
I have a formula input like "11+1+1+2", without spaces.
I want to split the formula at the operators.
So I wrote this:
String s = "11+1+1+2";
String splitByOp[] = s.split("[+|-|*|/|%]");
for(int c=0; c < splitByOp.length; c++){
System.out.println(splitByOp[c]);
}
The output is:
11
1
1
2
I want to put the operands (the output) and also the operators (+) into an ArrayList. But how can I keep the operators after splitting?
I tried using a second array that splits on the digits:
String operator[] = s.split("\\d");
But then 11 is treated as 1 and 1, and the length of operator[] is 5.
In other words, how can I produce this:
The output:
11
+
1
+
1
+
2
You need to split on a regex that is non-consuming. Specifically, on a "word boundary":
String[] terms = s.split("\\b");
A "word boundary" is the gap between a word char and a non-word char, and digits are classified as word chars. Importantly, the match is non-consuming, so all of the content of the input is preserved in the split terms.
Here's some test code:
String s = "11+1+1+2";
String[] terms = s.split("\\b");
for (String term : terms)
System.out.println(term);
Output:
11
+
1
+
1
+
2
public static void main(String[] args) {
String s = "11+1+1+2";
String[] terms = s.split("(?=[+])|(?<=[+])");
System.out.println(Arrays.toString(terms));
}
output
[11, +, 1, +, 1, +, 2]
You could combine lookahead/lookbehind assertions
String[] array = s.split("(?=[+])|(?<=[+])");
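The lookaround version extends naturally to the other operators from the question. A minimal sketch, assuming the formula contains only unsigned integers and the operators + - * / % (the class and method names are made up for the example):

```java
import java.util.Arrays;

class FormulaTokensDemo {
    // Split at the zero-width positions just after or just before an operator.
    // The hyphen comes first in the character class so it is taken literally.
    static String[] tokens(String formula) {
        return formula.split("(?<=[-+*/%])|(?=[-+*/%])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("11+1+1+2"))); // [11, +, 1, +, 1, +, 2]
        System.out.println(Arrays.toString(tokens("11+1-2*3"))); // [11, +, 1, -, 2, *, 3]
    }
}
```

Unlike \b, this also splits between two adjacent non-word characters, but it does no parsing: a leading minus sign comes out as its own token rather than as part of the number.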

split integer from string

I'm importing a text file with multiple lines like this
0 2 23
1 3 34
2 4 45
12 5 56
I'm using this to read the file and split the values
while (txtFile.hasNext()) {
String str = txtFile.nextLine();
String[] parts = str.split("\\s+");
With this regex, the first three lines give parts[1], parts[2] and parts[3], but the fourth line gives parts[0], parts[1] and parts[2].
Which regex should I use so that every line gives parts[0], parts[1] and parts[2]?
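Trim the leading whitespace from the input String before splitting; otherwise split produces an empty first element for lines that start with whitespace: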
String str = txtFile.nextLine().trim();
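The effect is easy to reproduce in isolation. The sketch below assumes the shorter lines in the file begin with a space (the sample lines here are made up to match the question's description), and the class and method names are illustrative.

```java
import java.util.Arrays;

class TrimSplitDemo {
    // Hypothetical helper: trim first so a leading space cannot create an empty field.
    static String[] fields(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Without trim, the leading space is a delimiter and leaves an empty parts[0].
        System.out.println(Arrays.toString(" 0 2 23".split("\\s+"))); // [, 0, 2, 23]
        // With trim, every line starts at parts[0].
        System.out.println(Arrays.toString(fields(" 0 2 23")));       // [0, 2, 23]
        System.out.println(Arrays.toString(fields("12 5 56")));       // [12, 5, 56]
    }
}
```

A blank line still yields a one-element array containing the empty string, so the caller may want to skip empty lines before indexing into the result.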
