How to remove invisible [ZWSP] from string in Java - java

I have a String(assume str) received from some DB query.
str = "+Aa​+Bk​+Bo​+Ac​+Lc​";
But when I copy the same string into IntelliJ, it shows invisible characters in str.
I have to split this String (i.e. str) into a String[] and then into a List.
The [ZWSP] characters show up in the split array and in the converted List as well.
I tried the following techniques to trim and remove them, but none of them worked.
String str = "+Aa​+Bk​+Bo​+Ac​+Lc​";
String[] strArr = str.split("\\+");
List<String> splitStrList = Arrays.stream(str.split("\\+"))
.map(String::trim)
.collect(Collectors.toList());
---Approach 2
String[] array2 = Arrays.stream(strArr).map(String::trim).toArray(String[]::new);
String[] trimmedArray = new String[array2.length];
for (int i = 0; i < array2.length; i++) {
trimmedArray[i] = array2[i].trim();
}
List<String> trimmedArrayList = Arrays.asList(trimmedArray);
I also tried a few other approaches, but when I copy the output into the IntelliJ IDE I still see those [ZWSP] special characters.
That creates issues in further processing.
How can these special characters, i.e. [ZWSP], be removed to get a List/Array like
[, Aa​, Bk​, Bo​, Ac​, Lc​]
I will appreciate all suggestions/solutions to this problem.

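That character is called a zero-width space (U+200B), as @Rogue mentions. String.trim() only strips characters up to U+0020, so it leaves U+200B in place. You can use the Unicode escape to remove it (strings are immutable, so assign the result):
str = str.replace("\u200B", "");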
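Or you could make the zero-width space part of the split delimiter. In the string above it sits right before each +, so the pattern would be:
str.split("\u200B?\\+");
The last element still ends with a zero-width space in that case, so process the array as you need.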
See:
https://www.fileformat.info/info/unicode/char/200b/index.htm
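To make the first suggestion concrete, here is a minimal sketch. The class and method names (ZwspSplitDemo, splitWithoutZwsp) are made up for illustration, and it assumes U+200B is the only invisible character involved. It removes the zero-width spaces first and then splits on +, keeping the leading empty element shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ZwspSplitDemo {

    // Hypothetical helper: strips every U+200B, then splits on a literal '+'.
    static List<String> splitWithoutZwsp(String input) {
        String cleaned = input.replace("\u200B", "");
        return Arrays.stream(cleaned.split("\\+"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String str = "+Aa\u200B+Bk\u200B+Bo\u200B+Ac\u200B+Lc\u200B";

        // trim() leaves U+200B alone, so the length stays 3 here.
        System.out.println("Aa\u200B".trim().length()); // 3

        // The leading empty element comes from the '+' at index 0.
        System.out.println(splitWithoutZwsp(str)); // [, Aa, Bk, Bo, Ac, Lc]
    }
}
```

If other invisible characters can occur as well, the replace call would have to cover them too; this sketch does not attempt that.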

Related

Java - Split and trim in one shot

I have a String like this: String attributes = " foo boo, faa baa, fii bii,"; and I want to get a result like this:
String[] result = {"foo boo", "faa baa", "fii bii"};
So my question is how to split and trim in one shot. I already split:
String[] result = attributes.split(",");
But the spaces are still in the result:
String[] result = {" foo boo", " faa baa", " fii bii"};
                    ^           ^           ^
I know that we can write a loop and trim every element, but I want to do it in one shot.
Use regular expression \s*,\s* for splitting.
String result[] = attributes.split("\\s*,\\s*");
For Initial and Trailing Whitespaces
The previous solution still leaves initial and trailing white-spaces. So if we're expecting any of them, then we can use the following solution to remove the same:
String result[] = attributes.trim().split("\\s*,\\s*");
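As a quick check of the trim-then-split approach above, here is a small self-contained sketch (the class and method names are illustrative only). It uses the input from the question and relies on split dropping trailing empty strings by default.

```java
import java.util.Arrays;

class SplitTrimDemo {

    // trim() handles the outer whitespace; the regex swallows
    // whitespace on both sides of every comma.
    static String[] splitAndTrim(String input) {
        return input.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String attributes = " foo boo, faa baa, fii bii,";
        // The trailing comma yields an empty last token, which split() discards.
        System.out.println(Arrays.toString(splitAndTrim(attributes)));
        // [foo boo, faa baa, fii bii]
    }
}
```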
Using java 8 you can do it like this in one line
String[] result = Arrays.stream(attributes.split(",")).map(String::trim).toArray(String[]::new);
If there is no text between the commas, the following expression will not create empty elements:
String result[] = attributes.trim().split("\\s*,+\\s*,*\\s*");
You can do it with Google Guava library this way :
List<String> results = Splitter.on(",").trimResults().splitToList(attributes);
which I find quite elegant as the code is very explicit in what it does when you read it.
The Apache Commons StringUtils.stripAll function can be used to trim the individual elements of an array. It leaves null elements as null if some of your array elements are null.
Here,
String[] array = StringUtils.stripAll(attributes.split(","));
create your own custom function
private static String[] split_and_trim_in_one_shot(String string){
String[] result = string.split(",");
int array_length = result.length;
for(int i =0; i < array_length ; i++){
result[i]=result[i].trim();
}
return result;
}
Overload with a consideration for custom delimiter
private static String[] split_and_trim_in_one_shot(String string, String delimiter){
String[] result = string.split(delimiter);
int array_length = result.length;
for(int i =0; i < array_length ; i++){
result[i]=result[i].trim();
}
return result;
}
with streams
public static List<String> split(String str){
return Stream.of(str.split(","))
.map(String::trim)
.map (elem -> new String(elem))//optional
.collect(Collectors.toList());
}
What about splitting on a comma followed by a whitespace character:
String result[] = attributes.split(",\\s");
// given input
String attributes = " foo boo, faa baa, fii bii,";
// desired output
String[] result = {"foo boo", "faa baa", "fii bii"};
This should work:
String[] s = attributes.trim().split("[,]");
As answered by @Raman Sahasi:
Before you split your string, you can trim the trailing and leading spaces. I've used the delimiter , as it was the only delimiter in your string.
String result[] = attributes.trim().split("\\s*,[,\\s]*");
previously posted here: https://blog.oio.de/2012/08/23/split-comma-separated-strings-in-java/
The best way (note that this snippet is JavaScript, not Java) is:
value.split(",").map(function(x) {return x.trim()});

How to convert String containing Array into an Array

This might have been asked before, but I spent some time looking, so here's what I have.
I have a string containing an array:
'["thing1","thing2"]'
I would like to convert it into an actual array:
["thing1","thing2"]
How would I do this?
You could create a loop that runs through the whole string, checking for indexes of quotes, then deleting them, along with the word. I'll provide an example:
ArrayList<String> list = new ArrayList<String>();
while(theString.indexOf("\"") != -1){
theString = theString.substring(theString.indexOf("\"")+1, theString.length());
list.add(theString.substring(0, theString.indexOf("\"")));
theString = theString.substring(theString.indexOf("\"")+1, theString.length());
}
I would be worried about an out of bounds error from looking past the last quote in the String, but since you're using a String version of an array, there should always be that "]" at the end. But this creates only an ArrayList. If you want to convert the ArrayList to a normal array, you could do this afterwards:
String[] array = new String[list.size()];
for(int c = 0; c < list.size(); c++){
array[c] = list.get(c);
}
You can do it using replace and split methods of String class, e.g.:
String s = "[\"thing1\",\"thing2\"]";
String[] split = s.replace("[", "").replace("]", "").split(",");
System.out.println(Arrays.toString(split));
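Note that the replace-and-split version keeps the double quotes inside each element. A rough sketch that also strips them could look like this. The names are hypothetical, and it assumes the elements contain no commas or escaped quotes; for real JSON input a JSON parser would be the safer choice.

```java
import java.util.Arrays;

class QuotedArrayDemo {

    // Hypothetical helper: parses a string such as ["thing1","thing2"].
    static String[] parse(String s) {
        String body = s.trim();
        // Drop the surrounding brackets if both are present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        if (body.trim().isEmpty()) {
            return new String[0];
        }
        String[] parts = body.split("\\s*,\\s*");
        for (int i = 0; i < parts.length; i++) {
            // Remove one leading and one trailing double quote.
            parts[i] = parts[i].trim().replaceAll("^\"|\"$", "");
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "[\"thing1\",\"thing2\"]";
        System.out.println(Arrays.toString(parse(s))); // [thing1, thing2]
    }
}
```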

Java Convert String[] to Byte[] making sure to skip empty strings

I need to convert String[] to Byte[] in Java. Essentially, I have a space delimited string returned from my database. I have successfully split this String into an array of string elements, and now I need to convert each element into a byte, and produce a byte[] at the end.
So far, the code below is what I have been able to put together but I need some help making this work please, as the getBytes() function returns a byte[] instead of a single byte. I only need a single byte for the string (example string is 0xd1 )
byte[] localbyte = null;
if(nbytes != null)
{
String[] arr = (nbytes.split(" "));
localbyte = new byte[arr.length];
for (int i=0; i<localbyte.length; i++) {
localbyte[i] = arr[i].getBytes();
}
}
I assume you'd like to split strings like this:
"Hello world!"
Into "Hello", "world!" instead of "Hello", " ", "world!"
If that's the case, you can simply tweak the split regex, using this instead:
String[] arr = (nbytes.split(" +"));
It helps to be familiar with regular expressions. Instead of removing empty strings after splitting, you can split the string on one or more whitespace characters.
To split a string by space or tab, you can use:
String[] arr = (nbytes.split("\\p{Blank}+"));
E.g.
"Hello \tworld!"
results in
"Hello","world!"
To split a string by any whitespace, you can use:
String[] arr = (nbytes.split("\\p{Space}+"));
E.g
"Hello \tworld!\nRegular expression"
results in
"Hello","world!","Regular","expression"
What about Byte(String string) (Java documentation).
Also you might want to look up Byte.parseByte(string) (doc)
byte[] localbyte = null;
if(nbytes != null)
{
String[] arr = (nbytes.split(" "));
localbyte = new byte[arr.length];
for (int i=0; i<localbyte.length; i++) {
localbyte[i] = new Byte(arr[i]);
}
}
Notice:
The characters in the string must all be decimal digits, except that
the first character may be an ASCII minus sign '-' ('\u002D') to
indicate a negative value.
So you might want to catch the NumberFormatException.
If this is not what you're looking for, maybe you can provide additional information about nbytes?
Also Michael's answer could turn out helpful: https://stackoverflow.com/a/2758746/1063730
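Since the example token in the question is 0xd1, which is neither a decimal string nor within the signed byte range, here is a hedged sketch of one way to handle it. The class and method names are made up, and it assumes every non-empty token is something Integer.decode accepts with a value between 0 and 255. Empty tokens are skipped.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class HexTokensDemo {

    // Hypothetical helper: converts whitespace-separated tokens such as
    // "0xd1 0x0a" into a byte[], skipping empty tokens.
    static byte[] toBytes(String nbytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (nbytes == null) {
            return out.toByteArray();
        }
        for (String token : nbytes.split("\\s+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter produces one empty token
            }
            // Integer.decode understands 0x/# prefixes; note that a plain
            // leading 0 means octal. It throws NumberFormatException otherwise.
            int value = Integer.decode(token);
            if (value < 0 || value > 0xFF) {
                throw new NumberFormatException("Not an unsigned byte: " + token);
            }
            out.write(value); // writes the low 8 bits
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 0xd1 is 209, which prints as -47 because Java bytes are signed.
        System.out.println(Arrays.toString(toBytes(" 0xd1  0x0a "))); // [-47, 10]
    }
}
```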

How to prevent java.lang.String.split() from creating a leading empty string?

passing 0 as a limit argument prevents trailing empty strings, but how does one prevent leading empty strings?
for instance
String[] test = "/Test/Stuff".split("/");
results in an array with "", "Test", "Stuff".
Yeah, I know I could roll my own Tokenizer... but the API docs for StringTokenizer say
"StringTokenizer is a legacy class that is retained for compatibility
reasons although its use is discouraged in new code. It is recommended
that anyone seeking this functionality use the split"
Your best bet is probably just to strip out any leading delimiter:
String input = "/Test/Stuff";
String[] test = input.replaceFirst("^/", "").split("/");
You can make it more generic by putting it in a method:
public String[] mySplit(final String input, final String delim)
{
return input.replaceFirst("^" + delim, "").split(delim);
}
String[] test = mySplit("/Test/Stuff", "/");
Apache Commons has a utility method for exactly this: org.apache.commons.lang.StringUtils.split
StringUtils.split()
Actually in our company we now prefer using this method for splitting in all our projects.
I don't think there is a way you could do this with the built-in split method. So you have two options:
1) Make your own split
2) Iterate through the array after calling split and remove empty elements
If you make your own split you can just combine these two options
public List<String> split(String inString)
{
List<String> outList = new ArrayList<>();
String[] test = inString.split("/");
for(String s : test)
{
if(s != null && s.length() > 0)
outList.add(s);
}
return outList;
}
Or you could just check whether the delimiter is in the first position before you call split, and skip the first character if it is:
String delimiter = "/";
String delimitedString = "/Test/Stuff";
String[] test;
if(delimitedString.startsWith(delimiter)){
//start at the 1st character not the 0th
test = delimitedString.substring(1).split(delimiter);
}
else
test = delimitedString.split(delimiter);
I think you shall have to manually remove the first empty string. A simple way to do that is this -
String string, subString;
int index;
String[] test;
string = "/Test/Stuff";
index = string.indexOf("/");
subString = string.substring(index+1);
test = subString.split("/");
This will exclude the leading empty string.
I think there is no built-in function to remove blank strings in Java. You can eliminate blanks by deleting them from the string, but that may lead to errors. To be safe, you can do this by writing a small piece of code as follows:
List<String> list = new ArrayList<String>();
for(String str : test)
{
if(str != null && str.length() > 0)
{
list.add(str);
}
}
test = list.toArray(new String[list.size()]);
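With Java 8 streams the same filtering can be written more compactly. This is only a sketch with illustrative names, and note that it drops every empty token, including any produced by consecutive delimiters in the middle of the input.

```java
import java.util.Arrays;

class NonEmptySplitDemo {

    // Splits on the given regex and keeps only the non-empty tokens.
    static String[] splitNonEmpty(String input, String delimiterRegex) {
        return Arrays.stream(input.split(delimiterRegex))
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] test = splitNonEmpty("/Test/Stuff", "/");
        System.out.println(Arrays.toString(test)); // [Test, Stuff]
    }
}
```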
When using JDK 8 and streams, just add a skip(1) after the split. The following snippet decodes a (very weird) hex-encoded string.
Arrays.asList("\\x42\\x41\\x53\\x45\\x36\\x34".split("\\\\x"))
.stream()
.skip(1) // <- ignore the first empty element
.map(c->""+(char)Integer.parseInt(c, 16))
.collect(Collectors.joining())
You can use StringTokenizer for this purpose...
String test1 = "/Test/Stuff";
StringTokenizer st = new StringTokenizer(test1,"/");
while(st.hasMoreTokens())
System.out.println(st.nextToken());
This is how I've gotten around this problem. I take the string, call .toCharArray() on it to split it into an array of chars, and then loop through that array and add it to my String list (wrapping each char with String.valueOf). I imagine there's some performance tradeoff but it seems like a readable solution. Hope this helps!
char[] stringChars = string.toCharArray();
List<String> stringList = new ArrayList<>();
for (char stringChar : stringChars) {
stringList.add(String.valueOf(stringChar));
}
You only need to add a statement like if(StringUtils.isEmpty(string)) continue; before printing the string. With my JDK version 1.8, no blank is printed.
5
this
program
gives
me
problems

Parsing comma delimited text in Java

If I have an ArrayList that has lines of data that could look like:
bob, jones, 123-333-1111
james, lee, 234-333-2222
How do I delete the extra whitespace and get the same data back? I thought you could maybe split the string by "," and then use trim(), but I didn't know what the syntax of that would be or how to implement it, assuming that is an OK way to do it, because I'd want to put each field in an array. So in this case I would have a [2][3] array, and then put it back in the ArrayList after removing the whitespace. But that seems like a funny way to do it, and not scalable if my list changed, like having an email on the end. Any thoughts? Thanks.
Edit:
Dumber question, so I'm still not sure how I can process the data, because I can't do this right:
for (String s : myList) {
String st[] = s.split(",\\s*");
}
since st[] will lose scope after the foreach loop. And if I declare String st[] beforehand, I wouldn't know how big to create my array right? Thanks.
You could just scan through the entire string and build a new string, skipping any whitespace that occurs after a comma. This would be more efficient than splitting and rejoining. Something like this should work:
String str = /* your original string from the array */;
StringBuilder sb = new StringBuilder();
boolean skip = true;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (skip && Character.isWhitespace(ch))
continue;
sb.append(ch);
if (ch == ',')
skip = true;
else
skip = false;
}
String result = sb.toString();
If you use a regex for your split, you can specify a comma followed by optional whitespace (which includes spaces and tabs, just in case).
String[] fields = mystring.split(",\\s*");
Depending on whether you want to parse each line separately or not you may first want to create an array split on a line return
String[] lines = mystring.split("\\n");
Just split() on each line with the delimiter set as ',' to get an array of Strings with the extra whitespace, and then use the trim() method on the elements of the String array, perhaps as they are being used or in advance. Remember that the trim() method gives you back a new string object (a String object is immutable).
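Regarding the scope concern in the question's edit, one possible way around it is to collect each split line into a List<String[]>, so the number of lines and fields does not need to be known up front. The following is a sketch under that assumption; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvLinesDemo {

    // Splits every line on commas (with optional surrounding whitespace)
    // and keeps the resulting fields, one String[] per input line.
    static List<String[]> parseLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(line.trim().split("\\s*,\\s*"));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList(
                "bob, jones, 123-333-1111",
                " james, lee, 234-333-2222");
        for (String[] row : parseLines(myList)) {
            // Rejoin without the extra whitespace.
            System.out.println(String.join(",", row));
        }
        // bob,jones,123-333-1111
        // james,lee,234-333-2222
    }
}
```

Because each row is its own array, a line with an extra field (such as an email) simply produces a longer row.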
If I understood your problem, here is a solution:
ArrayList<String> tmp = new ArrayList<String>();
tmp.add("bob, jones, 123-333-1111");
tmp.add(" james, lee, 234-333-2222");
ArrayList<String> fixedStrings = new ArrayList<String>();
for (String i : tmp) {
System.out.println(i);
String[] data = i.split(",");
String result = "";
for (int j = 0; j < data.length - 1; ++j) {
result += data[j].trim() + ", ";
}
result += data[data.length - 1].trim();
fixedStrings.add(result);
}
System.out.println(fixedStrings.get(0));
System.out.println(fixedStrings.get(1));
I guess it could be changed so that it does not create a second ArrayList. But it's scalable, so if you get lines in the future like: "bob, jones , bobjones@gmail.com , 123-333-1111 " it will still work.
I've had a lot of success using this library.
Could be a bit more elegant, but it works...
ArrayList<String> strings = new ArrayList<String>();
strings.add("bob, jones, 123-333-1111");
strings.add("james, lee, 234-333-2222");
for(int i = 0; i < strings.size(); i++) {
StringBuilder builder = new StringBuilder();
for(String str: strings.get(i).split(",\\s*")) {
builder.append(str).append(" ");
}
strings.set(i, builder.toString().trim());
}
System.out.println("strings = " + strings);
I would look into:
http://download.oracle.com/docs/cd/E17476_01/javase/1.4.2/docs/api/java/lang/String.html#split(java.lang.String)
or
http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/api/java/util/Scanner.html
You can use the String.split() method in Java, or you can use the split() method from the Google Guava library's Splitter class as shown below:
static final Splitter MY_SPLITTER = Splitter.on(',')
.trimResults()
.omitEmptyStrings();
