Java string split based on new line

I have the following string:
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
I want to split it on \n so that I end up with two strings, aaaaaaaaa and bbbbbbbbbbb. I don't want the last piece, since it contains only whitespace. So if I split on the newline character using str.split(), the final array should have only two entries.
I tried the following:
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n".replaceAll("\\s+", " ");
String[] split = str.split("\n+");
But it ignores all the \n characters and gives a single string, aaaaaaaaa bbbbbbbbbbb.

Delete the call to replaceAll(), which is removing the newlines too. Just this will do:
String[] split = str.split("\n\\s*");
This will not split on just spaces - the split must start at a newline (followed by optional further whitespace).
Here's some test code based on your sample input, with an added edge case (a space inside the second line):
String str = "aaaaaaaaa\nbbbbbb bbbbb\n \n";
String[] split = str.split("\n\\s*");
System.out.println(Arrays.toString(split));
Output:
[aaaaaaaaa, bbbbbb bbbbb]

This should do the trick:
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
String[] lines = str.split("\\s*\n\\s*");
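It also strips the whitespace immediately before and after each newline, so lines come out without leading or trailing whitespace (except at the very start and very end of the string, where no newline is adjacent).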
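As a rough illustration of how the two patterns from the answers above differ, the sketch below runs both on an input with spaces around the line breaks. The class name SplitLinesDemo, the method names, and the sample input are made up for this example.

```java
import java.util.Arrays;

class SplitLinesDemo {
    // Splits at each newline and swallows any whitespace that follows it.
    static String[] splitAfterNewline(String s) {
        return s.split("\n\\s*");
    }

    // Splits at each newline and swallows whitespace on both sides of it.
    static String[] splitAroundNewline(String s) {
        return s.split("\\s*\n\\s*");
    }

    public static void main(String[] args) {
        String str = "aaa  \n\n  bbb\n \n"; // hypothetical input
        // The first pattern keeps the two spaces after "aaa".
        System.out.println(Arrays.toString(splitAfterNewline(str)));
        // The second pattern removes them as well.
        System.out.println(Arrays.toString(splitAroundNewline(str)));
    }
}
```

In both cases the whitespace-only tail produces only a trailing empty string, which split() drops by default, so the array has two entries.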

The \n characters are removed by your first statement, because \s also matches \n.
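A small check of that point, using the string from the question. The class and method names are hypothetical.

```java
class WhitespaceClassDemo {
    // Collapses every run of whitespace, newlines included, into one space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String str = "aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
        // A lone newline matches the \s character class.
        System.out.println("\n".matches("\\s"));
        String collapsed = collapseWhitespace(str);
        // No newline is left, so a later split on \n+ finds nothing to split on.
        System.out.println(collapsed.contains("\n"));
        System.out.println(collapsed.split("\n+").length);
    }
}
```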

Related

How to split a string to non empty words if it might include a separator like tab on first place

I have strings that I want to split into words. I use String[] words = line.split("\\s+"); The problem is that some of them start with a tab separator, like "\t word1 \t word2....". The split then returns an array whose 1st element is "", 2nd is "word1", 3rd is "word2", and so on. How should I modify split("\\s+") so that the result contains no empty "" words? (The 1st element of the result should be "word1".)
Trim the line first, so that there is no whitespace before the first character or after the last one, and then split.
Example:
String[] words = line.trim().split("\\s+");
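A minimal sketch of the trim-then-split approach. The helper name words and the handling of blank lines are assumptions added here, not part of the answer: after trimming, a blank line becomes "", and "".split("\\s+") returns a one-element array containing "" rather than an empty array.

```java
import java.util.Arrays;

class WordSplitDemo {
    // Trims first so a leading tab or space cannot produce an empty first token.
    static String[] words(String line) {
        String trimmed = line.trim();
        // Assumption: a blank line should yield no words at all.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("\t word1 \t word2")));
        System.out.println(words("  \t ").length);
    }
}
```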

Extracting signs/symbols from a string in Java

I have a string with signs and I want to get only the signs and put them in a string array. Here is what I've done:
String str = "155+40-5+6";
// replace the numbers with spaces, leaving the signs and spaces
String signString = str.replaceAll("[0-9]", " ");
// then get an array that contains the signs without spaces
String[] signsArray = signString.trim().split(" ");
However, the 2nd element of signsArray is blank: [+, , -, +]
Thank you for your time.
You could do this a couple of ways. Either replace multiple adjacent digits with a single space:
// replace the numbers with spaces, leaving the signs and spaces
String signString = str.replaceAll("[0-9]+", " ");
Or alternatively in the last step, split on multiple spaces:
// then get an array that contains the signs without spaces
String[] signsArray = signString.trim().split(" +");
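Both variants can be tried side by side. This is only a sketch; the class and method names are invented for the example.

```java
import java.util.Arrays;

class SignsDemo {
    // Replaces each run of digits with one space, then splits on that space.
    static String[] signsByCollapsingDigits(String expr) {
        return expr.replaceAll("[0-9]+", " ").trim().split(" ");
    }

    // Replaces each digit with a space, then splits on runs of spaces.
    static String[] signsBySplittingOnRuns(String expr) {
        return expr.replaceAll("[0-9]", " ").trim().split(" +");
    }

    public static void main(String[] args) {
        String str = "155+40-5+6";
        System.out.println(Arrays.toString(signsByCollapsingDigits(str)));
        System.out.println(Arrays.toString(signsBySplittingOnRuns(str)));
    }
}
```

For the sample string both methods return the three signs +, -, + with no blank entries.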
Just replace " " with "" in your code:
String str = "155+40-5+6";
// remove the digits, leaving only the signs
String signString = str.replaceAll("[0-9]", "");
// then split between every character to get an array of the signs
String[] signsArray = signString.split("");
This should work for you. Cheers

How to split a string at a semicolon that appears before a colon

How can I split this string power:110V;220V;Color:Pink;White;Type:1;2;Condition:New;Used;
into these 4 strings
power:110V;220V;
Color:Pink;White;
Type:1;2;
Condition:New;Used;
Split your input using the regex below.
string.split("(?<=;)(?=\\w+:)");
This regex matches every position that is immediately preceded by a semicolon and immediately followed by one or more word characters and a colon.
OR
string.split("(?<=;)(?=[^;:]*:)");
Example:
String s = "power:110V;220V;Color:Pink;White;Type:1;2;Condition:New;Used;";
String[] parts = s.split("(?<=;)(?=\\w+:)");
for (String part : parts) {
    System.out.println(part);
}
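The two patterns behave the same on the sample input but can differ when a key contains a non-word character. The sketch below shows this under that assumption; the key "Screen size" and the class and method names are made up for illustration.

```java
import java.util.Arrays;

class GroupSplitDemo {
    // Splits after a semicolon when word characters and a colon follow.
    static String[] splitOnWordKey(String s) {
        return s.split("(?<=;)(?=\\w+:)");
    }

    // Splits after a semicolon when a colon comes before the next semicolon.
    static String[] splitOnAnyKey(String s) {
        return s.split("(?<=;)(?=[^;:]*:)");
    }

    public static void main(String[] args) {
        String sample = "power:110V;220V;Color:Pink;White;Type:1;2;Condition:New;Used;";
        System.out.println(Arrays.toString(splitOnWordKey(sample)));
        System.out.println(Arrays.toString(splitOnAnyKey(sample)));

        // Hypothetical input whose second key contains a space.
        String spaced = "Color:Pink;Screen size:15;";
        System.out.println(splitOnWordKey(spaced).length); // no split point found
        System.out.println(splitOnAnyKey(spaced).length);  // splits before "Screen size"
    }
}
```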

How can I split a string except when the delimiter is protected by quotes or brackets?

I asked How to split a string with conditions. Now I know how to ignore the delimiter if it is between two characters.
How can I check multiple groups of two characters instead of one?
I found Regex for splitting a string using space when not surrounded by single or double quotes, but I don't understand where to change '' to []. Also, it works with two groups only.
Is there a regex that will split using , but ignore the delimiter if it is between "" or [] or {}?
For instance:
// Input
"text1":"text2","text3":"text,4","text,5":["text6","text,7"],"text8":"text9","text10":{"text11":"text,12","text13":"text14","text,15":["text,16","text17"],"text,18":"text19"}
// Output
"text1":"text2"
"text3":"text,4"
"text,5":["text6","text,7"]
"text8":"text9"
"text10":{"text11":"text,12","text13":"text14","text,15":["text,16","text17"],"text,18":"text19"}
You can use:
String text = "\"text1\":\"text2\",\"text3\":\"text,4\",\"text,5\":[\"text6\",\"text,7\"],\"text8\":\"text9\",\"text10\":{\"text11\":\"text,12\",\"text13\":\"text14\",\"text,15\":[\"text,16\",\"text17\"],\"text,18\":\"text19\"}";
String[] toks = text.split("(?=(?:(?:[^\"]*\"){2})*[^\"]*$)(?![^{]*})(?![^\\[]*\\]),+");
for (String tok : toks) {
    System.out.printf("%s%n", tok);
}
OUTPUT:
"text1":"text2"
"text3":"text,4"
"text,5":["text6","text,7"]
"text8":"text9"
"text10":{"text11":"text,12","text13":"text14","text,15":["text,16","text17"],"text,18":"text19"}

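For comparison with the regex answer to the quotes-and-brackets question, here is a sketch of a different technique: a hand-written scanner that tracks quote state and bracket depth. It is not the method from the answer. It assumes quotes are never escaped and that brackets are balanced, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitDemo {
    // Splits on commas that are outside "..." and outside any [...] or {...}.
    static List<String> splitTopLevel(String text) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting level of [ ] and { }
        boolean inQuotes = false; // assumption: no escaped quotes in the input
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && (c == '[' || c == '{')) {
                depth++;
            } else if (!inQuotes && (c == ']' || c == '}')) {
                depth--;
            }
            if (c == ',' && !inQuotes && depth == 0) {
                // Top-level delimiter: close the current piece.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        String text = "\"a\":\"b,c\",\"d\":[\"e,f\",\"g\"],\"h\":{\"i\":\"j,k\"}";
        for (String part : splitTopLevel(text)) {
            System.out.println(part);
        }
    }
}
```

Unlike the `,+` in the regex, this scanner does not merge consecutive commas; two adjacent top-level commas produce an empty piece.
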
How can I remove the empty string resulting from a split?

When I split the String
A.B.C.
on ., I get 4 strings, the fourth being whitespace. How can I remove it?
String tokens[] = text.split("\\.");
for (String token : tokens) {
    System.out.println("Token : " + token);
}
If whitespace at the beginning or end is the problem, trim it off:
String tokens[] = text.trim().split("\\.");
Remove all the whitespace with replaceAll() before your code. Strings are immutable, so assign the result back:
text = text.replaceAll("\\s+", "");
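Your string is A.B.C., so splitting on . produces A, B, C and an empty string after the last dot. Java's split() drops trailing empty strings by default, so a fourth token means something, such as whitespace or a newline, follows the last dot. Either trim() the string first, or remove the last . and everything after it before splitting. You will then get the expected output.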
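The sketch below shows how trailing pieces behave for this input. The class name, the helper tokens, and the version of the input with a trailing space are assumptions made for the example.

```java
import java.util.Arrays;

class TrailingEmptyDemo {
    // Trims surrounding whitespace, then splits on a literal dot.
    static String[] tokens(String text) {
        return text.trim().split("\\.");
    }

    public static void main(String[] args) {
        // Default limit: the trailing empty string is dropped.
        System.out.println("A.B.C.".split("\\.").length);
        // Negative limit: trailing empty strings are kept.
        System.out.println("A.B.C.".split("\\.", -1).length);
        // A trailing space survives as a fourth, whitespace-only token.
        System.out.println("A.B.C. ".split("\\.").length);
        // Trimming first removes it.
        System.out.println(Arrays.toString(tokens("A.B.C. ")));
    }
}
```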
