Splitting String in Java with empty elements

I'm reading a .csv file line by line. A line might look like this: String str = "10,1,,,,".
Now I would like to split on ",": String[] splitted = str.split(","); The problem is that this results in only 2 elements, but I would like to have 5: the first two should contain 10 and 1, and the other 3 should be empty Strings.
Another example is String str = "0,,,,," which results in only one element, but I would like to have 5 elements.
The last example is String str = "9,,,1,," which gives 2 elements (9 and 1), but I would like to have 5 elements. The first element should be 9, the fourth should be 1, and all others should be empty Strings.
How can this be done?

You need to call split with -1 as the limit parameter:
String[] splitted = str.split(",", -1);
This has been discussed before, e.g.
Java: String split(): I want it to include the empty strings at the end
But split really isn't a robust way to parse CSV: you run into problems as soon as a String value contains a comma, e.g.
23,"test,test","123.88"
split would break this row into 4 parts instead of 3, cutting the quoted value in two:
[23, "test, test", "123.88"]
and I don't think you want that.
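To illustrate the point about quoted values, here is a minimal sketch of a line parser that treats commas inside double quotes as data. The class and method names are made up for this example, and it assumes one record per line with "" as the escape for a literal quote. For real files a dedicated CSV library is probably the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleCsvLine {
    // Hypothetical helper: splits one CSV line, keeping commas that appear inside double quotes.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // includes commas inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field boundary, empty fields are kept
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("23,\"test,test\",\"123.88\"")); // [23, test,test, 123.88]
        System.out.println(parse("10,1,,,,").size());             // 6
    }
}
```

Unlike split, this keeps trailing empty fields without any extra parameter, because every comma outside quotes closes a field.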

split drops trailing empty strings by default. You can turn this off by passing a negative limit:
String str = "9,,,1,,";
String[] parts = str.split(",", -1);
System.out.println(Arrays.toString(parts));
prints
[9, , , 1, , ]

Pass -1 (or any negative number, actually) as a second parameter to split:
System.out.println("0,,,,,".split(",", -1).length); // Prints 6.
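As a quick check, this sketch runs the three lines from the question with the default behaviour (limit 0, which is what the single-argument split uses) and with limit -1. The helper name fieldCount is only for illustration. Note that each of these lines contains five commas, so a negative limit yields six fields, not five.

```java
final class SplitLimitDemo {
    // Hypothetical helper: number of fields split() reports for a given limit.
    static int fieldCount(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        String[] lines = {"10,1,,,,", "0,,,,,", "9,,,1,,"};
        for (String line : lines) {
            // limit 0 drops trailing empty strings, a negative limit keeps them
            System.out.println(line + " -> default: " + fieldCount(line, 0)
                    + ", limit -1: " + fieldCount(line, -1));
        }
        // 10,1,,,, -> default: 2, limit -1: 6
        // 0,,,,, -> default: 1, limit -1: 6
        // 9,,,1,, -> default: 4, limit -1: 6
    }
}
```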

Related

Java String , trick in split

I have a comma-separated string. I want to read the parts up to the 4th index individually and then treat the rest of the string as one string.
For example:
String str = "abc,xyz,123,789,ijk,1232,123,123,STU,PQR,111";
I want everything after ijk in one string, and each part from abc to 789 in a separate string.
String::split can take a second parameter that caps how many parts are produced, which in your case is 5:
String[] result = str.split(",", 5);
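Here is a small sketch of what the limit does with the string from the question. The helper name headAndRest is made up. With a limit of 5 the last element starts at ijk; if the remainder should start after ijk instead, a limit of 6 seems to be what is needed.

```java
import java.util.Arrays;

final class SplitWithLimit {
    // Hypothetical helper: at most `limit` parts, the last one holds the unsplit remainder.
    static String[] headAndRest(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        String str = "abc,xyz,123,789,ijk,1232,123,123,STU,PQR,111";

        // Four separate fields, then everything from "ijk" onwards as one string.
        System.out.println(Arrays.toString(headAndRest(str, 5)));
        // [abc, xyz, 123, 789, ijk,1232,123,123,STU,PQR,111]

        // Five separate fields, then everything after "ijk" as one string.
        System.out.println(headAndRest(str, 6)[5]);
        // 1232,123,123,STU,PQR,111
    }
}
```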

Java String Split with Multiple characters

I want to split a Java string:
"[1,2,3,4,5]"
So I have an array that only has the integers
1
2
3
4
5
Without the ", [ ]"
I tried
String[] test = x.split("(, )|(\\[\\)|(\\]\\)");
Which I found in another thread but it does not work properly.
It keeps an empty string in test[0].
The easiest approach in this case is probably to remove the square bracket characters [ and ] (via a replace() or replaceAll() call) and then call split() on the comma:
// Replace the square braces and then split using a comma
String[] output = input.replace("[", "").replace("]", "").split(",");
or:
// Replace the square braces and then split using a comma
String[] output = input.replaceAll("\\[|\\]", "").split(",");
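If the goal is an array of actual integers, the same idea can be taken one step further. This is only a sketch: the class and method names are invented, it assumes the input is wrapped in exactly one pair of square brackets, and it tolerates optional spaces around the commas.

```java
import java.util.Arrays;

final class BracketListParser {
    // Hypothetical helper: turns "[1,2,3]" or "[1, 2, 3]" into an int array.
    static int[] parse(String input) {
        String body = input.trim();
        // Strip one leading '[' and one trailing ']' if both are present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        body = body.trim();
        if (body.isEmpty()) {
            return new int[0]; // "[]" has no elements; "".split(...) would yield one empty string
        }
        String[] parts = body.split("\\s*,\\s*"); // comma with optional surrounding whitespace
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]); // throws NumberFormatException on bad input
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("[1,2,3,4,5]")));     // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(parse("[1, 2, 3, 4, 5]"))); // [1, 2, 3, 4, 5]
    }
}
```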

Why does String.split(regex) in Java return an array with fewer elements than are actually present?

The number of elements returned is less than what I'd expected when I run String.split()
Example:- The actual string is "country,12345,2,1,,1,,", so 8 elements were expected in array returned, but the size of array was "6"
Code:-
String line1 = "country,12345,2,1,,1,,";
String data1[] = line1.split(",");
System.out.println("Length : "+data1.length);
Output:-
Length : 6
Why is it so?
Because the single-argument split method drops trailing empty fields. If you want to preserve them use the two-argument version, with a negative limit parameter.
String data1[] = line1.split(",", -1);
Thanks @Ian Roberts
The split() method drops trailing empty fields, so they are not counted.
For a better understanding, see the code below:
Case 1:
String line1 = "country,12345,2,1,,1, ,";
String data1[] = line1.split(",");
System.out.println("Length : "+data1.length);
Output : 7
Because the second-to-last field (between the last two commas) contains a space, it is not empty, so split drops only the final empty field.
Case 2:
String line1 = "country,12345,2,1,,1,, ";
String data1[] = line1.split(",");
System.out.println("Length : "+data1.length);
Output : 8
Because the last field (after the final comma) contains a space, it is not empty, so split keeps every field.
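Note that changing the regex does not help here: an escaped comma matches exactly the same text as a plain one, so this still yields 6 elements: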
String data1[] = line1.split("\\,");
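The cases discussed in this thread can be checked side by side with a short sketch. The helper name count is hypothetical. It suggests that the regex ("," versus "\\,") makes no difference, while the limit and any non-empty trailing field do.

```java
final class TrailingFieldsDemo {
    // Hypothetical helper: length of the array returned by split for a regex and limit.
    static int count(String line, String regex, int limit) {
        return line.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String line = "country,12345,2,1,,1,,";
        System.out.println(count(line, ",", 0));    // 6: two trailing empty fields dropped
        System.out.println(count(line, "\\,", 0));  // 6: escaping the comma changes nothing
        System.out.println(count(line, ",", -1));   // 8: negative limit keeps them

        // A space makes a field non-empty, so it stops the trailing cleanup.
        System.out.println(count("country,12345,2,1,,1, ,", ",", 0)); // 7
        System.out.println(count("country,12345,2,1,,1,, ", ",", 0)); // 8
    }
}
```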

String.split() returning a "" unexpectedly

I have a simple method splitting a string into an array. It splits it where there are non-letter characters. The line I am using right now is as follows:
String[] words = str.split("[^a-zA-Z]");
So this should split the string so that only the alphabetical parts remain. But the problem is that the split works in some places and not in others. For example:
String str = "!!day--yaz!!";
String[] words = str.split("[^a-zA-Z]");
String result = "";
for (int i = 0; i < words.length; i++) {
result += words[i] + "1 ";
}
return result;
I added the 1 in there to see where the split takes place, because I was getting errors on null values. Anyway, when I run this code I get the following output:
1 1 day1 1 yaz1
Why is it splitting between the first two !'s and after one of the -'s, but not after the last two !'s? Why is it even splitting there at all? Any help on this would be great!
It doesn't split before or after the matches; it splits ON them, so you get an empty String between the two dashes and empty Strings before and between the leading bangs.
This doesn't apply to the trailing bangs, because trailing empty Strings are omitted, as described in the javadoc:
Trailing empty strings are therefore not included in the resulting array.
This happens because every single non-letter character is used as a delimiter. It means that a lone "!" separates two empty strings, one to the left and one to the right of the exclamation mark.
Your problem can be solved in 2 steps.
1. Use "[^a-zA-Z]+" instead of "[^a-zA-Z]". The + avoids the empty string between the 2 dashes.
2. Remove leading and trailing non-letter characters before splitting. This removes the leading and trailing empty strings: str.replaceFirst("^[^a-zA-Z]+", "").replaceFirst("[^a-zA-Z]+$", "")
Finally your split will look like:
String[] words = str.replaceFirst("^[^a-zA-Z]+", "").replaceFirst("[^a-zA-Z]+$", "").split("[^a-zA-Z]+");
If you want to get rid of some of the extra splits, use split("[^a-zA-Z]+") instead of split("[^a-zA-Z]"). This will match a continuous part of the String that matches the pattern.
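As an alternative to splitting on the separators, one can match the words themselves with Pattern and Matcher, which sidesteps leading and trailing empty strings entirely. This is a different technique from the split-based answers, shown here only as a sketch; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordExtractor {
    private static final Pattern LETTERS = Pattern.compile("[a-zA-Z]+");

    // Hypothetical helper: collects every maximal run of ASCII letters.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LETTERS.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "!!day--yaz!!";
        // With "+" the split still leaves one leading empty string.
        System.out.println(Arrays.toString(str.split("[^a-zA-Z]+"))); // [, day, yaz]
        // Matching the words directly yields no empty strings.
        System.out.println(words(str)); // [day, yaz]
    }
}
```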

Cutting / splitting strings with Java

I have a string as follows:
2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576
I want to extract the number 872226816. In this case I assume reading should start after the second comma and stop at the following comma.
Example output:
872226816
String s = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576";
String number = s.split(",")[2]; // third field, index 2
Javadoc for String.split()
If the number you want always comes after the 2nd comma, you can do something like this:
String str = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576";
String[] line = str.split(",");
System.out.println(line[2]);
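A slightly more defensive variant is sketched below. The helper name field is invented. It passes a positive limit so that split stops once the wanted field is isolated, and it fails with a clear message when the line has too few fields instead of throwing ArrayIndexOutOfBoundsException.

```java
final class FieldExtractor {
    // Hypothetical helper: returns the comma-separated field at the given zero-based index.
    static String field(String line, int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must not be negative: " + index);
        }
        // A limit of index + 2 leaves everything after the wanted field unsplit.
        String[] parts = line.split(",", index + 2);
        if (index >= parts.length) {
            throw new IllegalArgumentException("line has only " + parts.length + " field(s)");
        }
        return parts[index];
    }

    public static void main(String[] args) {
        String str = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576";
        System.out.println(field(str, 2)); // 872226816
    }
}
```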
