I'm importing a text file with multiple lines like this
0 2 23
1 3 34
2 4 45
12 5 56
I'm using this to read the file and split the values
while (txtFile.hasNext()) {
    String str = txtFile.nextLine();
    String[] parts = str.split("\\s+");
}
With this regex, the first three lines, which start with whitespace, give an empty parts[0] and put the values in parts[1], parts[2] and parts[3]. The fourth line has no leading whitespace, so its values land in parts[0], parts[1] and parts[2].
My question is: which regex should I use so that the values end up in parts[0], parts[1] and parts[2] for every line?
Trim the leading whitespace from the input String before splitting it:
String str = txtFile.nextLine().trim();
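Here is a minimal sketch of the whole loop with the trim applied. It assumes the file is read through a java.util.Scanner; the class name TrimSplitDemo, the readRows helper and the sample data (with a leading space on the first three lines) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

class TrimSplitDemo {
    // Splits every non-blank line on runs of whitespace.
    // Trimming first guarantees that no line yields an empty first token.
    static List<String[]> readRows(Scanner txtFile) {
        List<String[]> rows = new ArrayList<>();
        while (txtFile.hasNextLine()) {
            String str = txtFile.nextLine().trim();
            if (str.isEmpty()) {
                continue; // skip blank lines
            }
            rows.add(str.split("\\s+"));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample data: the first three lines start with a space.
        String data = " 0 2 23\n 1 3 34\n 2 4 45\n12 5 56\n";
        for (String[] parts : readRows(new Scanner(data))) {
            System.out.println(Arrays.toString(parts));
        }
    }
}
```

With the trim in place, every row has its values in parts[0], parts[1] and parts[2], whether or not the line started with whitespace.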
I'm looking for a way to convert a string to an array and strip all whitespaces in the process. Here's what I have:
String[] splitArray = input.split(" ").trim();
But I can't figure out how to get rid of spaces in between the elements.
For example,
input = " 1 2 3 4 5 "
I want splitArray to be:
[1,2,3,4,5]
First off, this input.split(" ").trim(); won't compile since you can't call trim() on an array, but fortunately you don't need to. Your problem is that your regex, " " is treating each space as a split target, and with an input String like so:
String input = " 1 2 3 4 5 ";
You end up creating an array filled with several empty "" String items.
So this code:
String input = " 1 2 3 4 5 ";
// String[] splitArray = input.split("\\s+").trim();
String[] splitArray = input.trim().split(" ");
System.out.println(Arrays.toString(splitArray));
will result in this output:
[1, , , , , , , , 2, 3, 4, , , , , , 5]
What you need to do is to create a regex that greedily groups all the spaces or whitespace characters together, and fortunately we have this ability -- the + operator
Simply use a greedy split with the whitespace regex group
String[] splitArray = input.trim().split("\\s+");
\\s denotes any white-space character, and the trailing + will greedily aggregate one or more contiguous white-space characters together.
And actually, in your situation where the whitespace is nothing but multiples of spaces: " ", this is adequate:
String[] splitArray = input.trim().split(" +");
Appropriate tutorials for this:
short-hand character classes -- discusses \\s
repetition -- discusses the + also ? and * repetition characters
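To make the difference concrete, here is a small sketch that runs both patterns on the same string. The class name WhitespaceSplitDemo, the helper splitOnWhitespace and the exact spacing of the sample input are assumptions chosen for illustration.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Trim first, then split on one or more whitespace characters.
    static String[] splitOnWhitespace(String input) {
        return input.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String input = "  1   2 3 4    5 "; // spacing chosen arbitrarily

        // Splitting on a single space leaves an empty string
        // for every extra space between two values.
        System.out.println(Arrays.toString(input.trim().split(" ")));

        // Splitting on \\s+ treats each run of whitespace as one delimiter.
        System.out.println(Arrays.toString(splitOnWhitespace(input)));
    }
}
```

One thing to keep in mind: if the input contains nothing but whitespace, the result is an array holding a single empty string, not an empty array, so check for that case if it can occur.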
Try:
String[] result = input.split(" ");
I want to split a string of form:
" 42 2152 12 3095 2"
into a list of integers, but when I use the .split(" ") function I end up with an empty "" element at the beginning due to the whitespace at the start. Is there a way to split this without the empty element?
Use the String.trim() function before you call split on the string. This will remove any whitespace before and after your original string.
For example:
String original = " 42 2152 12 3095 2";
original = original.trim();
String[] array = original.split(" ");
To make your code neater, you could also write it as:
String original = " 42 2152 12 3095 2";
String[] array = original.trim().split(" ");
If we print the array out:
for (String s : array) {
System.out.println(s);
}
The output is:
42
2152
12
3095
2
Hope this helps.
You can use String.trim to remove leading and trailing whitespace from the original string
String withNoSpace = " 42 2152 12 3095 2".trim();
You can also use a Scanner, which reads one integer at a time from the string:
Scanner scanner = new Scanner(number);
List<Integer> list = new ArrayList<Integer>();
while (scanner.hasNextInt()) {
list.add(scanner.nextInt());
}
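For completeness, a self-contained version of the Scanner approach might look like the sketch below. The class name ScannerIntsDemo and the method parseInts are hypothetical; the variable number is taken to be the input string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerIntsDemo {
    // Reads whitespace-separated integers from the string.
    // Leading and trailing whitespace is skipped by the Scanner itself.
    static List<Integer> parseInts(String number) {
        List<Integer> list = new ArrayList<>();
        try (Scanner scanner = new Scanner(number)) {
            while (scanner.hasNextInt()) {
                list.add(scanner.nextInt());
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(parseInts(" 42 2152 12 3095 2"));
    }
}
```

Note that the loop stops at the first token that is not an integer, so anything after such a token is silently ignored.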
As per the answers above, and since you are asking about the performance difference between these methods:
There is no real performance difference; all of them run in O(n).
Strictly speaking, splitting the string first and then adding the pieces to a collection takes two O(n) passes instead of one.
I'm reading from a .csv File line by line. One line could look for example as following: String str = "10,1,,,,".
Now I would like to split according to ",": String[] splitted = str.split(","); The problem now is that this only results in 2 elements but I would like to have 5 elements, the first two elements should contain 10 and 1 and the other 3 should be just an empty String.
Another example is String str = "0,,,,," which results in only one element but I would like to have 5 elements.
The last example is String str = "9,,,1,," which gives 2 elements (9 and 1), but I would like to have 5 elements. The first element should be 9 and the fourth element should be 1 and all other should be an empty String.
How can this be done?
You need to call split with -1 as the second (limit) parameter:
String[] splitted = str.split(",", -1);
This has been discussed before, e.g.
Java: String split(): I want it to include the empty strings at the end
But split really shouldn't be the way you parse a CSV. You could run into problems when a String value contains a comma:
23,"test,test","123.88"
split would split the row into 4 parts:
[23, "test, test", "123.88"]
and I don't think you want that.
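To illustrate what "quote-aware" parsing means, here is a hand-written sketch that only splits on commas outside double quotes. It is a simplification under stated assumptions (surrounding quotes are dropped, a doubled quote inside a quoted field stands for one literal quote, fields cannot span lines); the class QuotedCsvDemo and the method parseLine are hypothetical, and for real data a dedicated CSV library is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvDemo {
    // Splits one CSV line on commas that are not inside double quotes.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote = literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are kept inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("23,\"test,test\",\"123.88\""));
        System.out.println(parseLine("9,,,1,,").size());
    }
}
```

With this sketch the quoted row yields three fields, and trailing empty fields are kept without any extra parameter.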
By default, split only drops trailing empty strings. You can turn this off with a negative limit:
String str = "9,,,1,,";
String[] parts = str.split(",", -1);
System.out.println(Arrays.toString(parts));
prints
[9, , , 1, , ]
Pass -1 (or any negative number, actually) as a second parameter to split:
System.out.println("0,,,,,".split(",", -1).length); // Prints 6.
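The effect of the limit parameter can be seen side by side in the following sketch. The class name SplitLimitDemo and the helper fieldCount are made up for this example; a limit of 0 behaves like the one-argument split.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Number of fields produced by splitting on "," with the given limit.
    static int fieldCount(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        String str = "10,1,,,,";

        // limit 0 (the default): trailing empty strings are discarded.
        System.out.println(fieldCount(str, 0));   // 2

        // negative limit: every field is kept, including trailing empties.
        System.out.println(fieldCount(str, -1));  // 6

        // positive limit n: at most n fields, the last one holds the rest.
        System.out.println(Arrays.toString(str.split(",", 3))); // [10, 1, ,,,]
    }
}
```

Five commas always separate six fields, which is why the negative limit gives 6 rather than 5 for these lines.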
I have been thinking about this problem for a day but still cannot solve it.
I have a formula input like "11+1+1+2", without spaces.
I want to split the formula on the operators.
So I wrote this:
String s = "11+1+1+2";
// Inside a character class no | is needed; the - goes first so it is taken literally.
String splitByOp[] = s.split("[-+*/%]");
for (int c = 0; c < splitByOp.length; c++) {
    System.out.println(splitByOp[c]);
}
The output is:
11
1
1
2
I want to put both the operands (the output above) and the operators (+) into an ArrayList. How can I keep the operators after splitting?
I tried using a second array, obtained by splitting on the digits:
String operator[] = s.split("\\d");
But then 11 is treated as two separate digits, and the length of operator[] is 5.
In other words, how can I get this:
The output:
11
+
1
+
1
+
2
You need to split on a regex that is non-consuming. Specifically, on a "word boundary":
String[] terms = s.split("\\b");
A "word boundary" is the gap between a word char and a non-word char, and digits are classified as word chars. Importantly, the match is non-consuming, so all of the content of the input is preserved in the split terms.
Here's some test code:
String s = "11+1+1+2";
String[] terms = s.split("\\b");
for (String term : terms)
System.out.println(term);
Output:
11
+
1
+
1
+
2
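One caveat worth seeing in code: a word boundary only exists between a word char and a non-word char, so neighbouring operators or parentheses are not separated from each other. The sketch below shows this on Java 8 or later (where a zero-width match at the start of the input does not produce a leading empty string); the class name WordBoundaryDemo and the second sample expression are assumptions added for illustration.

```java
import java.util.Arrays;

class WordBoundaryDemo {
    // Splits at every position between a word char and a non-word char.
    static String[] splitOnBoundaries(String s) {
        return s.split("\\b");
    }

    public static void main(String[] args) {
        // Operands and single operators come out as separate terms.
        System.out.println(Arrays.toString(splitOnBoundaries("11+1+1+2")));

        // "*(" stays together: there is no word boundary between '*' and '('.
        System.out.println(Arrays.toString(splitOnBoundaries("2*(3+4)")));
    }
}
```

So the word-boundary split works well for the plain "11+1+1+2" form, but an expression with parentheses or two adjacent operators needs a more specific pattern.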
public static void main(String[] args) {
String s = "11+1+1+2";
String[] terms = s.split("(?=[+])|(?<=[+])");
System.out.println(Arrays.toString(terms));
}
output
[11, +, 1, +, 1, +, 2]
You could combine lookahead/lookbehind assertions
String[] array = s.split("(?=[+])|(?<=[+])");
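The same lookaround idea extends to more than one operator by widening the character class. The following sketch assumes the operator set - + * / % plus parentheses; the class name LookaroundSplitDemo and the method splitKeepingOperators are hypothetical.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Zero-width split points: just after or just before any operator.
    // The - is placed first in the class so it is taken literally.
    private static final String REGEX = "(?<=[-+*/%()])|(?=[-+*/%()])";

    static String[] splitKeepingOperators(String s) {
        return s.split(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingOperators("11+1+1+2")));
        System.out.println(Arrays.toString(splitKeepingOperators("2*(3+4)")));
    }
}
```

Because both assertions are zero-width, nothing is consumed, and every operator or parenthesis ends up as its own element, even when two of them are adjacent.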
I'm trying to create a Java program that is able to follow the order of operations when infix expressions are entered. To simplify input I've decided it would be best to split the entire String by using a regex expression. I've been able to split everything except the 6 3 into their own String values in the splitLine array. My SSCCE attempt at this is as follows:
String line = "6 + 5 + 6 3 + 18";
String regex = "(?<=[-+*/()])|(?=[-+*/()])"; // Not splitting 6 3 correctly
String[] splitLine = line.split(regex);
for (int i=0; i<splitLine.length; i++) {
System.out.println(splitLine[i]);
}
Output:
6
+
5
+
6 3 //Error here
+
18
Expected Output:
6
+
5
+
6 //Notice
3 //these
+
18
I've tried and tried to modify my regex expression and have been unsuccessful. Could anyone tell me why my regex expression isn't splitting the 6 and the 3 into their own Strings in the splitLine array?
Edit: Just wanted to point out that I am doing this for fun, it is not for any sort of school work, etc. I just wanted to see if I could write a program to perform simple infix expressions. I do agree that there are better ways to do this and if the expression were to become more complicated I would run into some issue. But unfortunately this is how my book recommended I approach this step.
Thanks again for all of the quick comments and answers!
Try this: (?<=[-+*/()])|(?=[-+*/()]|\\s{2,})
Adding a whitespace alternative to the regex makes it also split wherever there are two or more consecutive spaces, which is what separates the 6 and the 3 in this case. You can change the minimum number of spaces with \\s{min,} in the regex.
Following regex would match your current input.
\s(\w+|[-+*/()])
The gist is to search for a whitespace followed by a word or specific character from your list.
Output
6
+
5
+
6
3
+
18
Maybe you can use something like this:
String line = "6 + 5 + 6 3 + 18";
String regex = "(?<=[-+*/() ])|(?=[-+*/() ])"; //Added space to character class
String[] splitLine = line.split(regex);
for (int i=0; i<splitLine.length; i++) {
if (splitLine[i].trim().equals("")) // Check for blank elements
continue;
System.out.println(splitLine[i].trim());
}
Or you can split your string on (?<=\\d)\\s+(?=\\d) to get 6 + 5 + 6 and 3 + 18.
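A different technique from the split-based answers above is to match the tokens themselves rather than the gaps between them, which sidesteps the whitespace handling altogether. The sketch below uses java.util.regex.Pattern and Matcher; the class name InfixTokenizerDemo, the method tokenize and the token pattern (unsigned integers and the operators - + * / ( )) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InfixTokenizerDemo {
    // A token is either a run of digits or a single operator/parenthesis.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[-+*/()]");

    // Collects every token in order; anything else (such as spaces) is skipped.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("6 + 5 + 6 3 + 18")) {
            System.out.println(token);
        }
    }
}
```

Since whitespace never matches the token pattern, the 6 and the 3 come out as separate tokens and no blank elements need to be filtered afterwards. The trade-off is that unexpected characters are silently skipped, so input validation would have to be added separately.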