StringTokenizer in Java: why is it adding one more space?

I am using jdk 1.6 (it is older but ok). I have a function like this:
public static ArrayList gettokens(String input, String delim)
{
ArrayList tokenArray = new ArrayList();
StringTokenizer tokens = new StringTokenizer(input, delim);
while (tokens.hasMoreTokens())
{
tokenArray.add(tokens.nextToken());
}
return tokenArray;
}
My goal is to use the tokens to remove duplicate emails from the input string.
Let's say I have
input = ", email-1#email.com, email-2#email.com, email-3#email.com"; //yes with , at the beginning
delim = ";,";
And when I run above function the result is:
[email-1#email.com, email-2#email.com, email-3#email.com]
Which is fine, except that there is one extra space between each comma and the email that follows it.
Why is that, and how do I fix it?
Edit:
here is the function that prints the output:
List<String> tokens = StringUtility.gettokens(email, ";,");
Set<String> emailSet = new LinkedHashSet<String>(tokens);
emails = StringUtils.join(emailSet, ", ");
hehe, and now I see the answer.
Edit 2 - the root cause:
The root cause of the problem was this line of code:
emails = StringUtils.join(emailSet, ", ");
It adds an extra ", " when joining the tokens.
In the example above, one token looks like " email-1#email.com", and join puts a comma and a space in front of it. So if a token starts with a space, there end up being two spaces between the comma and the email.
Example:
", " + " email-1#email.com" = ",<space><space>email-1#email.com"

When you print an ArrayList, it prints all the elements separated by a comma and a space. Your input also has a space after each comma, so you end up with two.
You can use:
tokenArray.add(tokens.nextToken().trim());
to remove unwanted spaces from your input.
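To see the effect in isolation, here is a minimal sketch. The class and method names are made up for illustration, and String.join (Java 8+) is used as a stand-in for the StringUtils.join call from the question, on the assumption that both simply place the separator between elements.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

public class JoinSpacingDemo {
    // Tokenizes the input; when trim is true, surrounding whitespace is stripped from each token.
    static List<String> tokens(String input, String delims, boolean trim) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            out.add(trim ? token.trim() : token);
        }
        return out;
    }

    // Removes duplicates (keeping first-seen order) and joins with ", ".
    static String dedupeAndJoin(String input, String delims, boolean trim) {
        Set<String> unique = new LinkedHashSet<>(tokens(input, delims, trim));
        return String.join(", ", unique); // stand-in for StringUtils.join
    }

    public static void main(String[] args) {
        String input = ", email-1#email.com, email-2#email.com";
        // Untrimmed tokens keep their leading space, so the join shows two spaces after the comma.
        System.out.println("[" + dedupeAndJoin(input, ";,", false) + "]");
        // Trimmed tokens give a single space after the comma.
        System.out.println("[" + dedupeAndJoin(input, ";,", true) + "]");
    }
}
```

Trimming before the tokens go into the set also matters for de-duplication, because " a" and "a" would otherwise count as different entries.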

You've got spaces in your string, and ArrayList's implementation of toString adds a space before each element. The idea is that you've got a list of "x", "y" and "z", the output should be "[x, y, z]" rather than "[x,y,z]"
Your real problem probably is that you've kept the spaces in the tokens. Fix:
public static List<String> gettokens(String input, String delim)
{
ArrayList<String> tokenArray = new ArrayList<String>();
StringTokenizer tokens = new StringTokenizer(input, delim);
while (tokens.hasMoreTokens())
{
tokenArray.add(tokens.nextToken().trim());
}
return tokenArray;
}

You can change the delim to include the space (";, "), and then the space will not be contained in the tokens.
Easier would be to use the split() method, which returns a string array. Note that split() takes a regular expression, so the delimiter characters have to go in a character class such as "[;, ]+". The method would look like this:
public static List<String> gettokens(String input, String delim)
{
return Arrays.asList(input.split(delim));
}
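One thing to watch with split() is that a delimiter at the very start of the input produces an empty first element. A rough sketch of a split-based version that handles this follows; the class name, method name, and the regex are my own choices, not something from the question.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitTokens {
    // Splits on ';' or ',' together with any whitespace around them,
    // and skips empty pieces such as the one caused by a leading delimiter.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        for (String piece : input.split("\\s*[;,]\\s*")) {
            String token = piece.trim();
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = ", email-1#email.com, email-2#email.com;email-3#email.com";
        System.out.println(tokens(input));
    }
}
```

StringTokenizer skips leading and repeated delimiters on its own, so the isEmpty() check is only needed for the split() variant.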

I think it would be a better approach to use split method of String, just because it would be shorter. All you would need to do is :
String[] values = input.split(delim);
It will return an array instead of a List.
The extra space comes from the code that builds your output:
List<String> tokens = StringUtility.gettokens(email, ";,");
Set<String> emailSet = new LinkedHashSet<String>(tokens);
emails = StringUtils.join(emailSet, ", "); //adds a space after a comma
So StringTokenizer works as expected.
In your case, without modifying the code much, you could use trim() to strip the spaces before removing duplicates, and then join with the separator ", ", like this:
tokenArray.add(tokens.nextToken().trim());
And you will get the result without the double spaces.

There is no space or comma in between.
Try to print your ArrayList as:
for(Object obj: tokenArray )
System.out.println(obj);

Related

Why does System.out.println print a new line while System.out.print prints nothing?

I am having trouble with split function. I do not know how it works.
Here is my source code:
// Using println to print out the result
String str = " Welcome to Java Tutorial ";
str = str.trim();
String[] arr = str.split(" ");
for (String c : arr) {
System.out.println(c);
}
//Using print to print out the result
String str = " Welcome to Java Tutorial ";
str = str.trim();
String[] arr = str.split(" ");
for (String c : arr) {
System.out.print(c);
}
and the results are:
The first is the result when using println, the second is the result when using print.
I do not understand why blank space appears with println but not with print. Can anyone explain this?
Since you have many consecutive spaces in your string, the array returned by split looks like
[Welcome, , , , , , , to, , , , Java, , , , , Tutorial]
If you look closely, the entries between the words are empty Strings ("").
When you call println(""), it prints an empty line.
When you call print(""), nothing visible is printed at all.
If you want to separate the words regardless of how many spaces are between them, split on runs of whitespace.
Lastly, trim() won't remove the spaces inside the String. It only removes whitespace at the beginning and the end.
What you are doing is splitting on every individual space, so each extra space in your string produces a separate (empty) string. If you want to split out just the words, you can use the whitespace regex:
String[] arr = str.split("\\s+");
This will fix your problem with the consecutive whitespaces you are printing.
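A small side-by-side sketch of the two calls. The sample string and its spacing are assumed for illustration, and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class SplitWhitespaceDemo {
    // Splits on every single space; consecutive spaces yield empty strings.
    static String[] splitOnSingleSpace(String s) {
        return s.trim().split(" ");
    }

    // Splits on runs of whitespace; consecutive spaces count as one separator.
    static String[] splitOnRuns(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String str = "  Welcome   to  Java ";
        System.out.println(Arrays.toString(splitOnSingleSpace(str))); // [Welcome, , , to, , Java]
        System.out.println(Arrays.toString(splitOnRuns(str)));        // [Welcome, to, Java]
    }
}
```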
Also, when you use print instead of println, the output does not move on to the next line. So when you call println(""), you are just going to a new line.
println() ends each value with a line break, while print() keeps everything on the same line.
When you split your string on a single space (" "), every extra space produces an empty string.
In the first case, each of those empty strings still prints a line break, so all you see is blank lines. The call is effectively:
System.out.println("");
In the second case the same empty strings are there, but print() emits nothing for them and never starts a new line.
You can avoid the empty strings by splitting on the regex \\s+:
String[] arr = str.split("\\s+");

How do I escape parentheses in java 7?

I'm trying to split some input from BufferedReader.readLine()
String delimiters = " ,()";
String[] s = in.readLine().split(delimiters);
This gives me a runtime error.
Things I have tried that don't work:
String delimiters = " ,\\(\\)";
String delimiters = " ,[()]";
String[] s = in.readLine().split(Pattern.quote("() ,"));
I tried replacing the () using .replaceAll, didn't work
I tried this:
input = input.replaceAll(Pattern.quote("("), " ");
input = input.replaceAll(Pattern.quote(")"), " ");
input = input.replaceAll(Pattern.quote(","), " ");
String[] s = input.split(" ");
but s ends up with blank slots that look like this -> "". No clue why it's doing that.
Mine works for
String delimiters = "[ \\(\\)]";
Edit:
You forgot the square brackets, which mean "any of the characters inside the brackets is a delimiter". The argument is a regex.
Edit:
To remove the empty elements, the idea is to replace every delimiter character with one chosen delimiter and then collapse runs of it:
// replace each delimiter character (space, comma, parenthesis) with "("
input = input.replaceAll("[ ,()]", "(");
// this leaves runs of "(", so collapse each run into a single "("
input = input.replaceAll("[(]+", "(");
Then the input contains only single delimiters. Splitting on "[(]" now produces no blank words in the middle; a delimiter at the very start of the input still yields one empty first element.
From your comment:
but I am only input 1 line: (1,3), (6,5), (2,3), (9,1) and I need 13652391 so s[0] = 1, s[1]=3, ... but I get s[0] = "" s[1] = "" s[2] = 1
You get that because your delimiters are either " ", ",", "(" or ")" so it will split at every single delimiter, even if there is no other characters between them, in which case it will be split into an empty string.
There is an easy fix to this problem, just remove the empty elements!
List<String> list = Arrays.stream(
"(1,3), (6,5), (2,3), (9,1)".split("[(), ]")).filter(x -> !x.isEmpty())
.collect(Collectors.toList());
But then you get a List as the result instead of an array.
Another way to do this, is to replace "[(), ]" with "":
String result = "(1,3), (6,5), (2,3), (9,1)".replaceAll("[(), ]", "");
This will give you a string as a result. But from the comment I'm not sure whether you wanted a string or not. If you want an array, just call .split("") and it will be split into individual characters.
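If an array is what you need, the stream version can end in toArray instead of collect. A short sketch along those lines (Java 8+, class and method names are mine); adding + to the character class makes a run of delimiters count as one, so only the leading empty string is left to filter out:

```java
import java.util.Arrays;

public class PairDigits {
    // Splits on runs of '(', ')', ',' or ' ' and drops the empty leading piece.
    static String[] digits(String line) {
        return Arrays.stream(line.split("[(), ]+"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] s = digits("(1,3), (6,5), (2,3), (9,1)");
        System.out.println(Arrays.toString(s)); // [1, 3, 6, 5, 2, 3, 9, 1]
    }
}
```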

How to use split function when input is new line?

The question is we have to split the string and write how many words we have.
Scanner in = new Scanner(System.in);
String st = in.nextLine();
String[] tokens = st.split("[\\W]+");
When I give an empty line as input and print the number of tokens, I get one, but I want zero. What should I do? Here the delimiters are all the symbols (non-word characters).
Short answer: To get the tokens in str (determined by whitespace separators), you can do the following:
String str = ... //some string
str = str.trim() + " "; //modify the string for the reasons described below
String[] tokens = str.split("\\s+");
Longer answer:
First of all, the argument to split() is the delimiter - in this case one or more whitespace characters, which is "\\s+".
If you look carefully at the Javadoc of String#split(String, int) (which is what String#split(String) calls), you will see why it behaves like this.
If the expression does not match any part of the input then the resulting array has just one element, namely this string.
This is why "".split("\\s+") would return an array with one empty string [""], so you need to append the space to avoid this. " ".split("\\s+") returns an empty array with 0 elements, as you want.
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
This is why " a".split("\\s+") would return ["", "a"], so you need to trim() the string first to remove whitespace from the beginning.
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Since String#split(String) calls String#split(String, int) with the limit argument of zero, you can add whitespace to the end of the string without changing the number of words (because trailing empty strings will be discarded).
UPDATE:
If the delimiter is "\\W+", it's slightly different because you can't use trim() for that:
String str = ...
str = str.replaceAll("^\\W+", "") + " ";
String[] tokens = str.split("\\W+");
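For comparison, here is a sketch of a word counter that handles the empty case with an explicit check instead of appending a space. This is a variant of the approach above, not the same code, and the class and method names are hypothetical.

```java
public class WordCount {
    // Counts tokens separated by runs of non-word characters.
    static int countWords(String line) {
        // Strip leading delimiters so split() does not produce an empty first element.
        String stripped = line.replaceAll("^\\W+", "");
        if (stripped.isEmpty()) {
            return 0; // "".split(...) would return one empty string, so handle it here
        }
        // Trailing empty strings are discarded by split(), so trailing delimiters are harmless.
        return stripped.split("\\W+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords(""));               // 0
        System.out.println(countWords(" hello, world!")); // 2
    }
}
```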
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String line = null;
while (!(line = in.nextLine()).isEmpty()) {
//logic
}
System.out.print("Empty Line");
}
output
Empty Line

Java. How to remove white space on array

For example
I split the string "+name" on +. I get a white space " " and "name" in the array (this doesn't happen if my string is "name+").
t="+name";
String[] temp=t.split("\\+");
the above code produces
temp[0]=" "
temp[1]=name
I only want to get "name", without the whitespace.
Also, if t="name+" then temp[0]=name. I'm wondering what the difference is between name+ and +name. Why do I get different output?
Simply loop through the items in the array and trim the white space, as below. Note that this only trims each element; an empty first element stays in the array.
for (int i = 0; i < temp.length; i++) {
if (temp[i] != null) {
temp[i] = temp[i].trim();
}
}
The value of the first array item is not a space (" ") but an empty string (""). The following snippet demonstrates the behaviour and provides a workaround: I simply strip leading delimiters from the input. Note that this should never be used for processing CSV files, because there a leading delimiter marks an empty column value, which you usually want to keep.
for (String s : "+name".split("\\+")) {
System.out.printf("'%s'%n", s);
}
System.out.println();
for (String s : "name+".split("\\+")) {
System.out.printf("'%s'%n", s);
}
System.out.println();
for (String s : "+name".replaceAll("^\\+", "").split("\\+")) {
System.out.printf("'%s'%n", s);
}
You get the extra element in the "+name" case because there is a non-empty value, "name", after the delimiter.
The split() function only drops trailing empty strings, that is, the empty elements produced by delimiters at the end of the input. See the Java SE documentation for String.split.
Examples of .split("\\+") output:
"+++++" = { } // zero length array because all are trailing delimiters
"+name+" = { "", "name" } // trailing delimiter removed
"name+++++" = { "name" } // trailing delimiter removed
"name+" = { "name" } // trailing delimiter removed
"++name+" = { "", "", "name" } // trailing delimiter removed
I would suggest avoiding those extra delimiters at both ends in the first place, rather than cleaning up afterwards.
To remove white space, use str.replaceAll("\\s", ""). Note that "\\W" would also remove the + itself, since + is a non-word character.
String yourString = "name +";
yourString = yourString.replaceAll("\\s", "");
String[] yourArray = yourString.split("\\+");
For a one liner :
String temp[] = t.replaceAll("(^\\++)?(\\+)?(\\+*)?", "$2").split("\\+");
This will replace all multiple plus signs by one, or a plus sign at the start by empty String, and then split on plus signs.
Which will basically eliminate empty Strings in the result.
split(String regex) is equivalent to split(String regex, int limit) with limit = 0. And the documentation of the latter states :
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Which is why a '+' at the start works differently than a '+' at the end
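The effect of the limit argument can be checked directly. A minimal sketch (class and method names are made up for illustration):

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Splits on a literal '+' with the given limit and renders the array as text.
    static String show(String s, int limit) {
        return Arrays.toString(s.split("\\+", limit));
    }

    public static void main(String[] args) {
        System.out.println(show("+name", 0));  // [, name]  leading empty string is kept
        System.out.println(show("name+", 0));  // [name]    trailing empty string is discarded
        System.out.println(show("name+", -1)); // [name, ]  negative limit keeps trailing empties
    }
}
```

So with a negative limit the two inputs behave symmetrically, and each produces one empty element.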
You might want to give Guava's Splitter a try. It has a nice fluent API to deal with empty Strings, trim(), etc.
@Test
public void test() {
final String t1 = "+name";
final String t2 = "name+";
assertThat(split(t1), hasSize(1));
assertThat(split(t1).get(0), is("name"));
assertThat(split(t2), hasSize(1));
assertThat(split(t2).get(0), is("name"));
}
private List<String> split(final String sequence) {
final Splitter splitter = Splitter.on("+").omitEmptyStrings().trimResults();
return Lists.newArrayList(splitter.split(sequence));
}

How do I split text by "," and get rid of the "," in java?

I want to split a string like this, entered into a text field, and get rid of the commas:
1,2,3,4,5,6
and then display them in a different textfield like this:
123456
here is what i have tried.
String text = jTextField1.getText();
String[] tokens = text.split(",");
jTextField3.setText(tokens.toString());
Can't you simply replace the , ?
text = text.replace(",", "");
If you're going to put it back together again, you don't need to split it at all. Just replace the commas with the empty string:
jTextField3.setText(text.replace(",", ""));
Assuming this is what you really want to do (e.g. you need to use the individual elements somewhere before concatenating them) the following snippet should work:
String s1 = "1,2,3,4,5,6";
String ss[] = s1.split(",", 0);
StringBuilder sb = new StringBuilder();
for (String s : ss) {
// Use each element here...
sb.append(s);
}
String s2 = sb.toString(); // 123456
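Note that String#split(String) discards trailing empty strings by default, which can be surprising. The two-argument form lets you control this: a limit of 0, as above, behaves the same as the one-argument call, while a negative limit keeps trailing empty strings.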
I may be wrong, but I believe that call to split will get rid of the commas, and it should leave tokens as an array of just the numbers.
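The split in the question does remove the commas. The part that goes wrong is tokens.toString(): calling toString() on an array returns its type name and hash code, not its contents. A small sketch of joining the pieces back together, assuming Java 8+ for String.join (class and method names are hypothetical):

```java
public class JoinTokens {
    // Splits on commas and concatenates the pieces with no separator.
    static String withoutCommas(String text) {
        String[] tokens = text.split(",");
        return String.join("", tokens);
    }

    public static void main(String[] args) {
        System.out.println(withoutCommas("1,2,3,4,5,6")); // 123456
    }
}
```

In the question's code that would be jTextField3.setText(String.join("", tokens)), although text.replace(",", "") gets the same result without splitting at all.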
