I have this string
String x = "2013-04-17T08:00:00.001,41.14806,-9.58972,-13.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-22.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-31.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-40.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-49.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-58.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-64.0,0.0,0.0,-20.0,4";
If I do the split like this, String vec2[] = x.split(",");, the output is:
2013-04-17T08:00:00.001
41.14806
-9.58972
-13.0
0.0
0.0
-20.0
and so on.
If I do the split like this, String vec2[] = x.split("|");, the output is this:
2
0
1
3
-
0
4
-
1
7
T
0
8
:
0
0
:
and so on.
And I would expect something similar to this:
2013-04-17T08:00:00.001,41.14806,-9.58972,-13.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-22.0,0.0,0.0,-20.0,4
and so on
Any idea what's wrong?
You need to escape the |:
String vec2[] = x.split("\\|");
That's because the argument to split() is a regex not a string.
In regexes, some characters have special meanings.
The vertical bar | represents alternation. So if you want to split on |, you need to write \\|, which is like saying: "Don't take | as a special character, take it as the symbol |".
The argument to split is a regular expression and the "|" character has special meaning. Try escaping it \\|.
String.split(String) splits on a regular expression, not on a character. As you can see in the summary of Java regular expression constructs, the | functions as an or construct.
If you want to split on the | character, you might need to escape it using \|. Note that to escape it in a Java String, you'll need to escape the backslash as well: \\|.
The problem is that the split(String regex) takes a regular expression as argument. The pipe (|) is a special character in regex and must thus be escaped:
String x = "2013-04-17T08:00:00.001,41.14806,-9.58972,-13.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-22.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-31.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-40.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-49.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-58.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-64.0,0.0,0.0,-20.0,4";
String[] arr = x.split("\\|");
for(String str : arr)
{
System.out.println(str);
}
Yields:
2013-04-17T08:00:00.001,41.14806,-9.58972,-13.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-22.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-31.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-40.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-49.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-58.0,0.0,0.0,-20.0,4
2013-04-17T08:00:00.001,41.14806,-9.58972,-64.0,0.0,0.0,-20.0,4
Try this
String vec2[] = x.split("\\|");
You need to escape the | character, since it is the regex OR (alternation) operator.
String x = "2013-04-17T08:00:00.001,41.14806,-9.58972,-13.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-22.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-31.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-40.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-49.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-58.0,0.0,0.0,-20.0,4|2013-04-17T08:00:00.001,41.14806,-9.58972,-64.0,0.0,0.0,-20.0,4";
String[] arr = x.split("\\|");
for(String s: arr){
System.out.println(s);
}
Did you try escaping the character, like this?
x.split("\\|");
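As a side note to the answers above, the escaping can also be delegated to Pattern.quote, which turns an arbitrary string into a literal pattern. This is only a minimal sketch: the class and method names (PipeSplit, splitOnPipe) are made up for illustration, and the sample string is a shortened stand-in for the one in the question.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplit {
    // Hypothetical helper: splits on a literal '|' without hand-written escaping.
    static String[] splitOnPipe(String input) {
        // Pattern.quote("|") yields a regex that matches the pipe character literally.
        return input.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the question's data (assumption).
        String x = "2013-04-17T08:00:00.001,41.14806|2013-04-17T08:00:00.001,41.14806";
        for (String record : splitOnPipe(x)) {
            System.out.println(record);
        }
        // The hand-escaped form splits the same way.
        System.out.println(Arrays.equals(splitOnPipe(x), x.split("\\|")));
    }
}
```

Whether to write "\\|" or Pattern.quote("|") is mostly a matter of taste; the quoted form tends to be easier to read when the delimiter is longer than one character.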
I'm given a string, and I want to replace all open parenthesis that occur in succession, with a single one
((5)) → (5)
((((5)))) → (5)
I tried
str = str.replaceAll("((", "(");
and got a regex pattern error.
Then I tried
str = str.replaceAll("\\((", "(");
Then I tried
str = str.replaceAll("\\\\((", "(");
I keep getting the same error!
Have you tried this?
str = str.replaceAll("\\({2,}", "(");
The '\' is the escape character, so every special character must be preceded by it. Without it, the regex engine reads ( as an opening parenthesis used for grouping and expects a closing parenthesis.
Edit: Originally, I thought he was trying to match exactly 2
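To see why all three attempts in the question fail, it may help to compile each pattern separately. The sketch below is only an illustration; ParenCollapse, isValidRegex and collapseOpen are hypothetical names, not part of any library.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ParenCollapse {
    // Returns true if the given string compiles as a Java regex.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Collapses every run of two or more '(' into a single '('.
    static String collapseOpen(String s) {
        return s.replaceAll("\\({2,}", "(");
    }

    public static void main(String[] args) {
        // Regex "((" opens two groups that are never closed.
        System.out.println(isValidRegex("(("));
        // Regex "\((" is a literal '(' followed by one unclosed group.
        System.out.println(isValidRegex("\\(("));
        // Regex "\\((" is a literal backslash followed by two unclosed groups.
        System.out.println(isValidRegex("\\\\(("));
        // Regex "\({2,}" escapes the parenthesis and quantifies it.
        System.out.println(isValidRegex("\\({2,}"));
        System.out.println(collapseOpen("((((5))))"));
    }
}
```

Note that collapseOpen only touches the opening parentheses; the closing ones would need the same treatment with "\\){2,}".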
You need to escape each parenthesis and add + to account for successive occurrences:
str = str.replaceAll("\\(\\(+","(");
Assuming the parentheses don't need to be paired, e.g. ((((5)) should become (5), then the following will do:
str = str.replaceAll("([()])\\1+", "$1");
Test
for (String str : new String[] { "(5)", "((5))", "((((5))))", "((((5))" }) {
str = str.replaceAll("([()])\\1+", "$1");
System.out.println(str);
}
Output
(5)
(5)
(5)
(5)
Explanation
( Start capture group
[()] Match a '(' or a ')'. In a character class, '(' and ')'
have no special meaning, so they don't need to be escaped
) End capture group, i.e. capture the matched '(' or ')'
\1+ Match 1 or more of the text from capture group #1. As a
Java string literal, the `\` was escaped (doubled)
$1 Replace with the text from capture group #1
See also regex101.com for demo.
I am not sure whether the number of brackets is fixed or dynamic, but assuming it may be dynamic, you could strip them all with replaceAll and then use String.format to wrap the result in a single pair.
Hope it helps
public class HelloWorld{
public static void main(String []args){
String str = "((((5))))";
String abc = str.replaceAll("\\(", "").replaceAll("\\)","");
abc = String.format("(%s)", abc);
System.out.println(abc);
}
}
Output: (5)
I have tried the above code with ((5)) and (((5))) and it produces the same output.
I have a string which is of the form
String str = "124333 is the otp of candidate number 9912111242.
Please refer txn id 12323335465645 while referring blah blah.";
I need 124333, 9912111242 and 12323335465645 in a string array. I have tried this with
while (Character.isDigit(sms.charAt(i)))
I feel that running the above said method on every character is inefficient. Is there a way I can get a string array of all the numbers?
Use a regex (see Pattern and matcher):
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(<your string here>);
while (m.find()) {
//m.group() contains the digits you want
}
You can easily build an ArrayList containing each match you find.
Or, as others suggested, you can split on non-digit characters (\D):
"blabla 123 blabla 345".split("\\D+")
Note that \ has to be escaped in Java, hence the need of \\.
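The Matcher loop above could be fleshed out roughly as follows. This is a sketch, with DigitExtractor and extractNumbers as made-up names; it collects the matches in a List rather than an array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitExtractor {
    // Compiled once and reused; \d+ matches one or more consecutive digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns every run of digits found in the text, in order of appearance.
    static List<String> extractNumbers(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String str = "124333 is the otp of candidate number 9912111242. "
                + "Please refer txn id 12323335465645 while referring blah blah.";
        System.out.println(extractNumbers(str));
    }
}
```

If a String[] is really needed, extractNumbers(str).toArray(new String[0]) converts the list.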
You can use String.split():
String[] nbs = str.split("[^0-9]+");
This will split the String on any run of non-digit characters.
And this works perfectly for your input.
String str = "124333 is the otp of candidate number 9912111242. Please refer txn id 12323335465645 while referring blah blah.";
System.out.println(Arrays.toString(str.split("\\D+")));
Output:
[124333, 9912111242, 12323335465645]
\\D+ Matches one or more non-digit characters. Splitting the input according to one or more non-digit characters will give you the desired output.
Java 8 style:
long[] numbers = Pattern.compile("\\D+")
.splitAsStream(str)
.mapToLong(Long::parseLong)
.toArray();
Ah, if you only need a String array, then you can just use String.split as the other answers suggest.
Alternatively, you can try this:
String str = "124333 is the otp of candidate number 9912111242. Please refer txn id 12323335465645 while referring blah blah.";
str = str.replaceAll("\\D+", ",");
System.out.println(Arrays.asList(str.split(",")));
\\D+ matches one or more non digits
Output
[124333, 9912111242, 12323335465645]
The first thing that came to mind was to filter and then split, but then I realized it can be done with
String[] result =str.split("\\D+");
\D matches any non-digit character, + says that one or more of these are needed, and the leading \ escapes the other \, since "\D" in a Java string literal would be parsed as the escape sequence \D, which is invalid.
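One caveat with the split-based answers: they work for the question's input because it starts with a digit. If the text starts with a non-digit, split returns an empty first element. A sketch of one way to handle that, assuming empty tokens should simply be dropped (SplitDigits and numbersViaSplit are hypothetical names):

```java
import java.util.Arrays;

class SplitDigits {
    // Splits on runs of non-digits and drops any empty token,
    // e.g. the leading one produced when the text starts with a non-digit.
    static String[] numbersViaSplit(String text) {
        return Arrays.stream(text.split("\\D+"))
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String text = "blabla 123 blabla 345";
        // Plain split keeps an empty string in front of the numbers.
        System.out.println(Arrays.toString(text.split("\\D+")));
        // The filtered variant returns only the numbers.
        System.out.println(Arrays.toString(numbersViaSplit(text)));
    }
}
```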
Problem description
I am trying to split a string into separate strings with the split() method that the String class provides. The documentation tells me that it will split around matches of the argument, which is a regular expression. The delimiter that I use is a comma, but commas can also be escaped. The escape character that I use is a forward slash / (just to make things easier by not using a backslash, because that requires additional escaping in string literals in both Java and the regular expressions).
For instance, the input might be this:
a,b/,b//,c///,//,d///,
And the output should be:
a
b,b/
c/,/
d/,
So, the string should be split at each comma, unless that comma is preceded by an odd number of slashes (1, 3, 5, 7, ..., ∞) because that would mean that the comma is escaped.
Possible solutions
My initial guess would be to split it like this:
String[] strings = longString.split("(?<![^/](//)*/),");
but that is not allowed because Java doesn't allow look-behind groups of unbounded length. I could limit the repetition to, say, 2000 by replacing the * with {0,2000}:
String[] strings = longString.split("(?<![^/](//){0,2000}/),");
but that still puts constraints on the input. So I decided to take the recurrence out of the look-behind group, and came up with this:
String[] strings = longString.split("(?<!/)(?:(//)*),");
However, its output is the following list of strings:
a
b,b (the final slash is lacking in the output)
c/, (the final slash is lacking in the output)
d/,
Why are those slashes omitted in the 2nd and 3rd string, and how can I solve it (in Java)?
You are pretty close. To overcome lookbehind error you can use this workaround:
String[] strings = longString.split("(?<![^/](//){0,99}/),");
You can achieve the split using a positive look behind for an even number of slashes preceding the comma:
String[] strings = longString.split("(?<=[^/](//){0,999999999}),");
But to display the output you want, you need a further step of removing the remaining escapes:
String longString = "a,b/,b//,c///,//,d///,";
String[] strings = longString.split("(?<=[^/](//){0,999999999}),");
for (String s : strings)
System.out.println(s.replaceAll("/(.)", "$1"));
Output:
a
b,b/
c/,/
d/,
If you don't mind another method with regex, I suggest using .matcher:
Pattern pattern = Pattern.compile("(?:[^,/]+|/.)+");
String test = "a,b/,b//,c///,//,d///,";
Matcher matcher = pattern.matcher(test);
while (matcher.find()) {
System.out.println(matcher.group().replaceAll("/(.)", "$1"));
}
Output:
a
b,b/
c/,/
d/,
ideone demo
This method will match everything except the delimiting commas (kind of the reverse). The advantage is that it doesn't rely on lookarounds.
I love regexes, but wouldn't it be easy to write the code manually here, i.e.
boolean escaped = false;
for(int i = 0, len = s.length() ; i < len ; i++){
switch(s.charAt(i)){
case '/': escaped = !escaped; break;
case ',':
if(!escaped){
//found a segment, do something with it
}
//Fallthrough!
default:
escaped = false;
}
}
// handle last segment
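The hand-written loop above could be completed along these lines. This is a sketch under a few assumptions that the thread does not settle: the escape slash is removed from the output (as in the expected output of the question), a trailing empty segment is dropped the way String.split drops it, and a dangling slash at the very end is kept literally. EscapedSplit and splitEscaped are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

class EscapedSplit {
    // Splits on unescaped commas; '/' escapes the character that follows it.
    static List<String> splitEscaped(String s) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (escaped) {
                current.append(c);      // escaped character is taken literally
                escaped = false;
            } else if (c == '/') {
                escaped = true;         // next character is escaped
            } else if (c == ',') {
                parts.add(current.toString());  // unescaped comma ends a segment
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (escaped) {
            current.append('/');        // assumption: keep a dangling escape as-is
        }
        if (current.length() > 0) {
            parts.add(current.toString());  // last segment; empty tail is dropped
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : splitEscaped("a,b/,b//,c///,//,d///,")) {
            System.out.println(part);
        }
    }
}
```

Tracking a single escaped flag avoids counting slashes altogether, which is what makes the look-behind versions awkward.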
Say I have a following string str:
GTM =0.2
Test =100
[DLM]
ABCDEF =5
(yes, it contains newline characters) That I am trying to split with [DLM] delimiter substring like this:
String[] strArr = str.split("[DLM]");
Why is it that when I do:
System.out.print(strArr[0]);
I get this output: GT
and when I do
System.out.print(strArr[1]);
I get =0.2
Does this make any sense at all?
str.split("[DLM]"); should be str.split("\\[DLM\\]");
Why?
[ and ] are special characters and String#split accepts regex.
A solution that I like more is using Pattern#quote:
str.split(Pattern.quote("[DLM]"));
quote returns a literal pattern String for the given string, so every character in it is matched as-is rather than interpreted as regex syntax.
Yes, you're giving a regex which says "split with either D, or L, or M".
You should escape those boys like this: str.split("\\[DLM\\]");
It's being split at the first M.
Escape the brackets
("\\[DLM\\]")
When you use brackets inside the " ", the regex engine treats each character inside the brackets as a delimiter. So in your case, M was a delimiter.
use
String[] strArr = str.split("\\[DLM\\]");
Instead of
String[] strArr = str.split("[DLM]");
Otherwise it will split on either D, or L, or M.
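The difference between the character class and the literal delimiter can be checked with a small sketch like the one below. DlmSplit and splitOnLiteral are hypothetical names, and the exact placement of the newlines in the sample string is an assumption based on how the question displays it.

```java
import java.util.regex.Pattern;

class DlmSplit {
    // Splits on the delimiter taken literally, whatever regex characters it contains.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Assumed layout: one newline after each line shown in the question.
        String str = "GTM =0.2\nTest =100\n[DLM]\nABCDEF =5";

        // Unescaped: [DLM] is a character class, so the first split is at the M of GTM.
        System.out.println(str.split("[DLM]")[0]);

        // Quoted: the whole "[DLM]" token is the delimiter.
        String[] parts = splitOnLiteral(str, "[DLM]");
        System.out.println(parts.length);
        System.out.println(parts[0].trim());
        System.out.println(parts[1].trim());
    }
}
```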
I want to split a string "ABC\DEF" ?
I have tried
String str = "ABC\DEF";
String[] values1 = str.split("\\");
String[] values2 = str.split("\");
But none seems to be working. Please help.
String.split() expects a regular expression. You need to escape each \ once because it is in a Java string literal (by the way, String str = "ABC\DEF"; needs the same escaping), and once more for the regex. In the end, you end up with this line:
String[] values = str.split("\\\\");
The literal "\\\\" produces the two-character string \\, which the regex engine interprets as a single literal \.
Note that String.split splits a string by regex.
One correct way (see footnote 1) to specify \ as delimiter, in RAW regex is:
\\
Since \ is special character in regex, you need to escape it to specify the literal \.
Putting the regex in string literal, you need to escape again, since \ is also escape character in string literal. Therefore, you end up with:
"\\\\"
So your code should be:
str.split("\\\\")
Note that this splits on every single instance of \ in the string.
Footnote
1 Other ways (in RAW regex) are:
\x5C
\0134
\u005C
In string literal (even worse than the quadruple escaping):
"\\x5C"
"\\0134"
"\\u005C"
Use it:
String str = "ABC\\DEF";
String[] values1 = str.split("\\\\");
final String HAY = "_0_";
String str = "ABC\\DEF".replace("\\", HAY);
System.out.println(Arrays.asList(str.split(HAY)));
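If the quadruple backslash is hard to read, Pattern.quote can do the regex-level escaping, leaving only the string-literal escaping to write by hand. A minimal sketch, with BackslashSplit and splitOnBackslash as made-up names:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class BackslashSplit {
    // "\\" is a one-character string holding a single backslash;
    // Pattern.quote turns it into a regex that matches that character literally.
    static String[] splitOnBackslash(String s) {
        return s.split(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        String str = "ABC\\DEF"; // the string ABC\DEF
        System.out.println(Arrays.toString(splitOnBackslash(str)));
        // Same result as the hand-escaped regex.
        System.out.println(Arrays.equals(splitOnBackslash(str), str.split("\\\\")));
    }
}
```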