String splitting with different character - java

I think this is a weird question. Here is my splitting:
String s = "asd#asd";
String[] raw1 = s.split("#"); // this has size two: raw1[0] = raw1[1] = "asd"
However,
String s = "asd$asd";
String[] raw2 = s.split("$"); // this has size of ONE
raw2 is not split. Does anyone know why?

Because split() takes a regexp, and $ is an anchor that matches at the end of the input rather than a literal dollar sign. If you need to split on a character that is actually a regexp metacharacter, then you'll need to escape it.
See Pattern for the regexp metacharacters.
You may find that StringTokenizer is more appropriate for your needs. It takes a list of characters to split on, and it won't interpret them as regular expression metacharacters. However, it's a little more verbose and unwieldy to use. As Nandkumar notes below, the latest docs state that it is discouraged in new code.
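For comparison, here is a minimal sketch of the StringTokenizer route. The class name TokenizerSplitDemo and the helper tokenize are made up for illustration; only StringTokenizer itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSplitDemo {
    // Hypothetical helper: splits input on any character in delims.
    // The delimiter characters are taken literally, not as a regex.
    static List<String> tokenize(String input, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("asd$asd", "$")); // prints [asd, asd]
    }
}
```

Note that StringTokenizer skips empty tokens, so it is not a drop-in replacement for split() when consecutive delimiters matter.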

Because split() takes a regex and $ matches the end of a line.
You have to escape it:
s.split("\\$");
See Pattern documentation for more information on regexes.

You have to escape it:
String s = "asd$asd";
String[] raw2 = s.split("\\$"); // this has size of TWO

You need to escape the special character; make it
s.split("\\$");
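To see the answers above side by side, here is a small sketch that runs the unescaped split, the escaped split, and a Pattern.quote variant on the string from the question. The class and method names are only for illustration.

```java
import java.util.regex.Pattern;

public class DollarSplitDemo {
    // "$" is a regex anchor here, so nothing is split.
    static String[] splitUnescaped(String s) {
        return s.split("$");
    }

    // "\\$" in Java source is the regex \$, a literal dollar sign.
    static String[] splitEscaped(String s) {
        return s.split("\\$");
    }

    // Pattern.quote produces a literal pattern without manual escaping.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("$"));
    }

    public static void main(String[] args) {
        String s = "asd$asd";
        System.out.println(splitUnescaped(s).length); // prints 1
        System.out.println(splitEscaped(s).length);   // prints 2
        System.out.println(splitQuoted(s).length);    // prints 2
    }
}
```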

Related

Split String if it has number

Hi guys, it's been a while since I asked another question.
I have this String which consists of names and numbers
Ex.
String myString = "give11arrow123test2356read809cell1245cable1257give222...";
Now what I am trying to do is to split it whenever there is a number attached to it
I have to split it so that I could have a result like this
give11, arrow123, test2356, read809, cell1245, cable1257, give222, ....
I could use this code but I can't find the right regex
String[] arrayString = myString.split("Regex");
Thanks for your help.
You can use a combination of lookarounds to split your string.
Lookarounds are zero-width assertions. They don't consume any characters on the string. The point of zero-width is the validation to see if a regex can or cannot be matched looking ahead or looking back from the current position, without adding them to the overall match.
String s = "give11arrow123test2356read809cell1245cable1257give222...";
String[] parts = s.split("(?<=\\d)(?=\\D)");
System.out.println(Arrays.toString(parts));
Output
[give11, arrow123, test2356, read809, cell1245, cable1257, give222, ...]
Use this regex for splitting
String regex = "(?<=\\d)(?=\\D)";
I am unfamiliar with using regex in Java, but this expression matches what you need on www.rubular.com:
([A-Za-z]+[0-9]+)
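That last expression describes the tokens themselves rather than the split points, so in Java it would be used with Matcher.find() instead of split(). A rough sketch, assuming a shortened sample input and made-up names (WordNumberMatchDemo, tokens):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordNumberMatchDemo {
    // One or more letters followed by one or more digits.
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z]+[0-9]+");

    // Hypothetical helper: collects every letters+digits token in order.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("give11arrow123test2356")); // prints [give11, arrow123, test2356]
    }
}
```

Unlike the lookaround split, matching silently drops anything that does not fit the letters+digits shape, such as trailing punctuation.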

Java String Split on any character (including regex special characters)

I'm sure I'm just overlooking something here...
Is there a simple way to split a String on an explicit character without applying RegEx rules?
For instance, I receive a string with a dynamic delimiter, I know the 5th character defines the delimiter.
String s = "This,is,a,sample";
For this, it's simple to do
String delimiter = String.valueOf(s.charAt(4));
String[] result = s.split(delimiter);
However, when I have a delimiter that's a special RegEx character, this doesn't work:
String s = "This*is*a*sample";
So... is there a way to split the string on an explicit character without trying to apply extra RegEx rules? I feel like I must be missing something pretty simple.
split uses a regular expression as its argument. * is a metacharacter that means zero or more repetitions of the preceding element, so on its own it is not a valid pattern. You could use Pattern#quote so that the character is treated literally:
String[] result = s.split(Pattern.quote(delimiter));
You need not worry about the character type if you use Pattern, as long as the delimiter is quoted first:
Pattern regex = Pattern.compile(Pattern.quote(String.valueOf(s.charAt(4))));
Matcher matcher = regex.matcher(yourString);
if (matcher.find()) {
    // do something
}
You can run Pattern.quote on the delimiter before feeding it to String.split. This returns a literal pattern string in which regex-specific characters lose their special meaning:
delimiter = Pattern.quote(delimiter);
String[] result = s.split(delimiter);
Alternatively, StringUtils.split(s, delimiter) takes the unquoted delimiter and treats it as plain characters, not as a regex.
StringUtils is part of the Apache Commons Lang library, which has tons of useful methods. It is worth taking a look; it could save you some time in the future.
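Simply put your delimiter between []:
String delimiter = "["+s.charAt(4)+"]";
String[] result = s.split(delimiter);
In a regex, [ ] is a character class that matches any single character listed between the brackets, and most metacharacters such as * lose their special meaning there. You can also specify a list of delimiters like [*,.+-]. Note that this still fails for characters that are special inside a character class, such as ^ or \.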
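Putting the Pattern.quote suggestion together with the "delimiter is the 5th character" setup from the question, a minimal sketch could look like this. The helper name splitOnCharAt is invented for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralSplitDemo {
    // Hypothetical helper: uses the character at the given index as a
    // literal delimiter, whatever meaning it would have in a regex.
    static String[] splitOnCharAt(String s, int index) {
        String delimiter = String.valueOf(s.charAt(index));
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnCharAt("This,is,a,sample", 4))); // prints [This, is, a, sample]
        System.out.println(Arrays.toString(splitOnCharAt("This*is*a*sample", 4))); // prints [This, is, a, sample]
        System.out.println(Arrays.toString(splitOnCharAt("This^is^a^sample", 4))); // prints [This, is, a, sample]
    }
}
```

Quoting works the same way for every delimiter, so there is no need to know in advance which characters are regex metacharacters.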

Using regex to separate individual words?

I have the following line to split a sentence into words and store it into an array based on white spaces: string[] s = Regex.Split(input, @"\s+");
The problem is at the end of the sentence, it also picks up the period. For example: C# is cool.
The code would store:
C#
is
cool.
The question is: How do I get it not to pick up the period ?
You can use a character class [] to add in the dot . or other characters that you need to split on.
string[] s = Regex.Split(input, @"[\s.]+");
You can add the dot (and other punctuation marks as needed) to the regular expression. The group is non-capturing because Regex.Split would otherwise include the captured text in the result:
string[] s = Regex.Split(input, @"(?:\s|[.;,])+");
string[] s = Regex.Split(input, @"[^\w#]+");
You may need to add more characters to the set [^\w#] so that it works for your requirements.
Use the non-word character pattern \W (note that this also splits on the # in C#):
string[] s = Regex.Split(input, @"\W+");
Consider using Regex.Matches as an alternative for your requirement:
string[] outputMessage = Regex.Matches(inputMessage, @"\w+").Cast<Match>().Select(match => match.Value).ToArray();
Good Luck!

how to ignore newlines for split function

I am splitting the string using ^ char. The String which I am reading, is coming from some external source. This string contains some \n characters.
The string may look like:
Hi hello^There\nhow are\nyou doing^9987678867abc^popup
When I split it as below, why is the array length 2 instead of 4?
String[] st = msg[0].split("^");
st.length //giving "2" instead of "4"
It looks like split is ignoring everything after \n.
How can I fix it without replacing \n with some other character?
The string parameter for split is interpreted as a regular expression.
So you have to escape the char and use:
msg[0].split("\\^")
see this answer for more details
Escape the ^ character. Use msg[0].split("\\^") instead.
String.split considers its argument as regular expression. And as ^ has a special meaning when it comes to regular expressions, you need to escape it to use its literal representation.
If you want to split by ^ only, then
String[] st = msg[0].split("\\^");
If I read your question correctly, you want to split by ^ and \n characters, so this would suffice.
String[] st = msg[0].split("\\^|\\\\n");
This assumes that \n literally exists as 2 characters (a backslash followed by n) in the string. If the string contains real newline characters instead, use "[\\^\\n]".
String.split treats "^" as a regular expression.
To avoid this, modify the code as below:
old code = msg[0].split("^")
new code = msg[0].split("\\^")
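As a quick check of the escaped version, here is a sketch that splits a message shaped like the one in the question. It assumes the \n sequences are real newline characters; the class and method names are illustrative only.

```java
public class CaretSplitDemo {
    // "\\^" in Java source is the regex \^, a literal caret.
    static String[] splitOnCaret(String msg) {
        return msg.split("\\^");
    }

    public static void main(String[] args) {
        // Assumption: the newlines in the incoming string are real line breaks.
        String msg = "Hi hello^There\nhow are\nyou doing^9987678867abc^popup";
        String[] st = splitOnCaret(msg);
        System.out.println(st.length); // prints 4
    }
}
```

The newlines stay inside the second element, so nothing has to be replaced before splitting.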

Splitting a string on space except for single space

I was splitting a string on white spaces using the following
myString.split("\\s+");
How do I make an exception for a single space, i.e. split on whitespace except for a single space?
Like this:
myString.split("\\s{2,}");
or like this,
myString.split(" \\s+"); // notice the blank at the beginning.
It depends on what you really want, which is not clear by reading the question.
You can check the quantifier syntax in the Pattern class.
You can use a pattern like
myString.split("\\s\\s+");
This only matches if a whitespace character is followed by further whitespace characters.
Please note that a whitespace character is more than a simple blank.
"Your String".split("\\s{2,}");
will do the job.
For example:
String str = "I am  a  String"; // two spaces after "am" and after "a"
String[] strArr = str.split("\\s{2,}");
This will return an array with length 3.
The following would be the output.
strArr[0] = "I am"
strArr[1] = "a"
strArr[2] = "String"
I hope this answers your question.
If you literally want to exclude a single space, as opposed to other types of whitespace, then you'll need the following:
s.split("\\s{2,}|[\\s&&[^ ]]")
This constructs a character class by subtracting the space from the \s built-in character class.
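The difference between the two readings of the question shows up once a tab is involved. A small sketch with made-up sample strings and helper names:

```java
import java.util.Arrays;

public class MultiSpaceSplitDemo {
    // Splits only on runs of two or more whitespace characters.
    static String[] splitOnRuns(String s) {
        return s.split("\\s{2,}");
    }

    // Splits on runs of whitespace, and also on any single whitespace
    // character other than a plain space (tab, newline, and so on).
    static String[] splitExceptSingleSpace(String s) {
        return s.split("\\s{2,}|[\\s&&[^ ]]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnRuns("I am  a   String")));         // prints [I am, a, String]
        System.out.println(splitOnRuns("I am\ta String").length);                     // prints 1
        System.out.println(Arrays.toString(splitExceptSingleSpace("I am\ta String"))); // prints [I am, a String]
    }
}
```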
