I am trying to parse a file in which each line contains pipe-delimited values.
It did not work correctly when I left the pipe unescaped in the split method, but it worked once I escaped the pipe, as below.
private ArrayList<String> parseLine(String line) {
ArrayList<String> list = new ArrayList<String>();
String[] list_str = line.split("\\|"); // note the escape "\\" here
System.out.println(list_str.length);
System.out.println(line);
for(String s:list_str) {
list.add(s);
System.out.print(s+ "|");
}
return list;
}
Can someone please explain why the pipe character needs to be escaped for the split() method?
String.split expects a regular expression argument. An unescaped | is parsed as a regex meaning "empty string or empty string," which isn't what you mean.
Because that parameter to split is a regular expression, in which | has the special meaning of OR and \| means a literal |. The Java string "\\|" therefore produces the regular expression \|, which matches exactly the character |.
You can simply do this:
String[] arrayString = yourString.split("\\|");
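To make the difference concrete, here is a minimal sketch comparing the escaped and unescaped forms. The class and method names (PipeSplitDemo, splitOnPipe, splitUnescaped) and the sample input are made up for illustration.

```java
import java.util.Arrays;

class PipeSplitDemo {
    // "\\|" is the Java literal for the regex \| which matches one literal pipe.
    static String[] splitOnPipe(String line) {
        return line.split("\\|");
    }

    // Unescaped, the regex | is an alternation of two empty patterns,
    // so it matches the empty string between characters.
    static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    public static void main(String[] args) {
        String line = "a|b|c";
        // prints [a, b, c]
        System.out.println(Arrays.toString(splitOnPipe(line)));
        // The pipes are not consumed as delimiters, so they remain in the result.
        System.out.println(Arrays.toString(splitUnescaped(line)));
    }
}
```

With the unescaped pattern the string is cut between characters rather than at the pipes, which is why the pipe characters show up as elements of the result.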
I want a regex to split a string on the literal characters \r (a backslash followed by r), not on a carriage return or a newline.
Below is the sample string I have.
MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2\rPID|1|833944|21796920320|8276975
I want this to be split into
MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2
PID|1|833944|21796920320|8276975
Currently I have something like this
StringUtils.split(testStr, "\\r");
but it is splitting into
MSH|^~
&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2
PID|1|833944|21796920320|8276975
You can just use String#split:
final String str = "MSH|^~\\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2\\rPID|1|833944|21796920320|8276975";
final String[] substrs = str.split("\\\\r");
System.out.println(Arrays.toString(substrs));
// Outputs [MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2, PID|1|833944|21796920320|8276975]
You can use
import java.util.regex.*;
//...
String[] results = text.split(Pattern.quote("\\r"));
Pattern.quote lets you pass any plain text to String.split, which otherwise expects a valid regular expression. Here, \ is a special character and needs to be escaped both for the Java string literal and for the regex engine.
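As a rough sketch of the two escaping layers, the example below splits on the literal two-character sequence backslash + r, once with Pattern.quote and once with manual escaping. The class name, the helper name, and the shortened sample string are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralBackslashRDemo {
    // Splits on the two characters backslash and r, not on a carriage return.
    static String[] splitOnLiteralBackslashR(String text) {
        // "\\r" is the two-character string backslash + r; quote() makes it a literal.
        return text.split(Pattern.quote("\\r"));
    }

    public static void main(String[] args) {
        // In Java source "\\" is a single backslash, so this text contains
        // a literal backslash followed by r between the two segments.
        String text = "MSH|^~\\&|1100|P|2.2\\rPID|1|833944";
        // prints [MSH|^~\&|1100|P|2.2, PID|1|833944]
        System.out.println(Arrays.toString(splitOnLiteralBackslashR(text)));
        // Manual escaping: four backslashes in source become the regex \\ (one literal backslash).
        System.out.println(Arrays.toString(text.split("\\\\r")));
    }
}
```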
StringUtils.split treats each character of the delimiter string as a separate delimiter; it does not match the entire sequence. Here is the line from SeparatorUtils that performs the delimiter check (str is the input string being split):
if (separatorChars.indexOf(str.charAt(i)) >= 0) {
As @enzo mentioned, java.lang.String.split() will do the job; just make sure to quote the separator. Pattern.quote() can help.
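To see why a per-character check produces the extra split, here is a small approximation built around the same indexOf test. splitOnAnyChar is a hypothetical helper written for this illustration, not the library's code, and skipping empty tokens is an assumption chosen to match the output shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

class CharSetSplitDemo {
    // Hypothetical helper: every character in separatorChars counts as a delimiter
    // on its own, and empty tokens are skipped.
    static List<String> splitOnAnyChar(String str, String separatorChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            if (separatorChars.indexOf(str.charAt(i)) >= 0) {
                // Delimiter character: close the current token if it has content.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(str.charAt(i));
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Contains a backslash before & and a literal backslash + r before PID.
        String text = "MSH|^~\\&|1100\\rPID|1";
        // prints [MSH|^~, &|1100, PID|1]
        System.out.println(splitOnAnyChar(text, "\\r"));
    }
}
```

Because the backslash alone is a delimiter here, the text is also cut at the backslash in front of &, which matches the unwanted three-part result in the question.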
I have an input string in the following format:
first|second|third|<forth>|<fifth>|$sixth
I want to split this string into an array of strings with the value [first,second,third,,,$sixth]. I am using the following code to split the string, but it is not working. Please help me.
public String[] splitString(String input){
String[] resultArray = input.split("|");
return resultArray;
}
Could you please tell me what I am doing wrong?
You need to escape | using backslash as it is a special character. This should work:
String[] resultArray = input.split("\\|");
| is a metacharacter, meaning it represents something else in regex. Because split takes a regex as its argument, it interprets the argument as one. You need to "escape" each metacharacter by placing a backslash before it, written as \\ in a Java string literal. In your case, you would do:
String[] resultArray = input.split("\\|");
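The same rule applies to other metacharacters such as . and $. The sketch below is only an illustration; the class name, method names, and the extra sample strings are invented for it.

```java
import java.util.Arrays;

class MetaCharSplitDemo {
    static String[] splitOnPipe(String s) {
        return s.split("\\|");   // regex \| : literal pipe
    }

    static String[] splitOnDot(String s) {
        return s.split("\\.");   // regex \. : literal dot
    }

    static String[] splitOnDollar(String s) {
        return s.split("\\$");   // regex \$ : literal dollar sign
    }

    public static void main(String[] args) {
        // prints [first, second, third, <forth>, <fifth>, $sixth]
        System.out.println(Arrays.toString(
                splitOnPipe("first|second|third|<forth>|<fifth>|$sixth")));
        // prints [a, b, c]
        System.out.println(Arrays.toString(splitOnDot("a.b.c")));
        // prints [cost, 100, usd]
        System.out.println(Arrays.toString(splitOnDollar("cost$100$usd")));
        // Unescaped, . matches every character, so only empty strings are left
        // and split drops them all: prints 0
        System.out.println("a.b.c".split(".").length);
    }
}
```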
In the mapper, I want to split each line of a file on the pipe character; the lines look like number|twitter|abc...
Each line is a long string, but the pipe delimiter is not recognized when I do:
String[] columnArray = line.split("|");
If I try to split it with a space like line.split(" "), it works fine so I don't think there is a problem with it recognizing characters.
Is there any other character that can look like pipe? Why doesn't split recognize the | character?
As shared in another answer
"String.split expects a regular expression argument. An unescaped | is parsed as a regex meaning "empty string or empty string," which isn't what you mean."
https://stackoverflow.com/a/9808719/2623158
Here's a test example.
public class Test
{
public static void main(String[] args)
{
String str = "test|pipe|delimeter";
String [] tmpAr = str.split("\\|");
for(String s : tmpAr)
{
System.out.println(s);
}
}
}
String.split takes a regular expression (as the javadoc states), and "|" is a special character in regular expressions. Try "[|]" instead.
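For comparison, here is a short sketch of three ways to make the pipe literal: a backslash escape, a character class, and Pattern.quote. The class and method names are placeholders chosen for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeAlternativesDemo {
    static String[] viaEscape(String s) {
        return s.split("\\|");               // backslash escape
    }

    static String[] viaCharClass(String s) {
        return s.split("[|]");               // | is literal inside a character class
    }

    static String[] viaQuote(String s) {
        return s.split(Pattern.quote("|"));  // quote() treats the text as a literal
    }

    public static void main(String[] args) {
        String line = "number|twitter|abc";
        // each call prints [number, twitter, abc]
        System.out.println(Arrays.toString(viaEscape(line)));
        System.out.println(Arrays.toString(viaCharClass(line)));
        System.out.println(Arrays.toString(viaQuote(line)));
    }
}
```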
How do I split the string "ABC\DEF" on the backslash?
I have tried:
String str = "ABC\DEF";
String[] values1 = str.split("\\");
String[] values2 = str.split("\");
But neither seems to work. Please help.
String.split() expects a regular expression. You need to escape each \ once because it is in a Java string literal (by the way, the same applies to String str = "ABC\DEF";, which should be "ABC\\DEF") and once more for the regex. You end up with this line:
String[] values = str.split("\\\\");
The literal "\\\\" is the two-character string \\, which the regex engine interprets as a single literal \.
Note that String.split splits a string by regex.
One correct way (see footnote 1) to specify \ as the delimiter in a raw regex is:
\\
Since \ is a special character in regex, you need to escape it to specify a literal \.
When you put the regex in a string literal, you need to escape again, since \ is also the escape character in string literals. Therefore, you end up with:
"\\\\"
So your code should be:
str.split("\\\\")
Note that this splits on every single instance of \ in the string.
Footnote
1 Other ways (in RAW regex) are:
\x5C
\0134
\u005C
In string literal (even worse than the quadruple escaping):
"\\x5C"
"\\0134"
"\\u005C"
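The variants listed above can be compared side by side. This is a minimal sketch; BackslashSplitDemo, splitWith, and BACKSLASH_REGEXES are names invented for the illustration.

```java
import java.util.Arrays;

class BackslashSplitDemo {
    // Four Java string literals that each denote a regex matching one backslash:
    // escaped backslash, hex escape, octal escape, and Unicode escape.
    static final String[] BACKSLASH_REGEXES = {
        "\\\\", "\\x5C", "\\0134", "\\u005C"
    };

    static String[] splitWith(String str, String regex) {
        return str.split(regex);
    }

    public static void main(String[] args) {
        String str = "ABC\\DEF";  // the text ABC, one backslash, DEF
        for (String regex : BACKSLASH_REGEXES) {
            // prints [ABC, DEF] for every variant
            System.out.println(Arrays.toString(splitWith(str, regex)));
        }
        // The four-backslash literal is only two characters long at runtime: prints 2
        System.out.println("\\\\".length());
    }
}
```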
Use this:
String str = "ABC\\DEF";
String[] values1 = str.split("\\\\");
final String HAY = "_0_";
String str = "ABC\\DEF".replace("\\", HAY);
System.out.println(Arrays.asList(str.split(HAY)));
I have a String that I want to split into parts using the delimiter }},{". I have tried using:
String delims="['}},{\"']+";
String field[]=new String[50];
field=subResult.split(delims);
But it is not working :-( Do you know what expression I should use in delims?
Thanks for your replies
A { is a regex metacharacter that marks the beginning of a repetition quantifier (as in a{2,3}); it is [ that begins a character class. To match a literal {, escape it with a backslash, written as \\ in a Java string literal:
String delims="}},\\{";
String field[] = subResult.split(delims);
You need not escape the } in your regex, because the regex engine infers that it is a literal } when it is not preceded by an opening {. That said, there is no harm in escaping it.
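As a quick check of the point about braces, the sketch below splits the same text with only { escaped, with every brace escaped, and with Pattern.quote. It assumes a standard Java runtime, and the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class BraceDelimiterDemo {
    static String[] splitEscapedOpenOnly(String s) {
        return s.split("}},\\{");               // only the opening brace is escaped
    }

    static String[] splitFullyEscaped(String s) {
        return s.split("\\}\\},\\{");           // every brace is escaped
    }

    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("}},{"));  // whole delimiter taken literally
    }

    public static void main(String[] args) {
        String text = "asdf}},{bar}},{baz";
        // each call prints [asdf, bar, baz]
        System.out.println(Arrays.toString(splitEscapedOpenOnly(text)));
        System.out.println(Arrays.toString(splitFullyEscaped(text)));
        System.out.println(Arrays.toString(splitQuoted(text)));
    }
}
```

Pattern.quote avoids having to decide which characters need escaping at all, at the cost of a slightly longer call.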
If the delimiter is simply }},{ then subResult.split("\\}\\},\\{") should work:
String fooo = "asdf}},{bar}},{baz";
System.out.println(Arrays.toString(fooo.split("\\}\\},\\{")));
You should escape the braces:
String[] field = subResult.split("\\}\\},\\{");
You could be making it more complex than you need.
String text = "{{aaa}},{\"hello\"}";
String[] field=text.split("\\}\\},\\{\"");
System.out.println(Arrays.toString(field));
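Use Pattern.quote so that the whole sequence is treated as one literal delimiter (a character class such as [}},{\"] would split on each of those characters individually):
Pattern p = Pattern.compile(Pattern.quote("}},{\""));
// Split the input with the pattern
String[] result = p.split(MyTextString);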