Java Byte[] to String conversion dropping end quotes / weird side-effect

I am trying to run a regex on the result of a DatagramPacket.getData() call.
It is implemented as String myString = new String(thepkt.getData());
But weirdly, Java appears to drop the closing quotation mark that it uses to enclose the value (see the linked image below).
If I click the field in the variable inspector during a debug session and then click away without changing anything, the display corrects itself. The inspector even highlights the field in yellow to signal a change.
The value is also displayed as if it were still a byte array rather than a String object.
http://i.imgur.com/8ZItsZI.png
This is throwing off my regex and I can't see anything that would cause it. It's a client-server simulation, and on the client side getData() returns the data with no problem.

I got it working by using the solution provided in:
https://stackoverflow.com/a/8557165/1700855
But I still don't understand how not passing the length of the packet to the String constructor would cause it to drop the closing double quote. Can anyone provide an explanation? I really like to understand the solutions to my issues before moving on :)

The problem is that you didn't read the spec for DatagramPacket.getData:
Returns the data buffer. The data received or the data to be sent
starts from the offset in the buffer, and runs for length long.
In other words, getData() returns the whole backing buffer, so new String(thepkt.getData()) also converts whatever bytes follow the received data (typically zeros).
So, to be correct, you should use
new String(thepkt.getData(), thepkt.getOffset(), thepkt.getLength())
Or, to not use the default charset:
new String(thepkt.getData(), thepkt.getOffset(), thepkt.getLength(), someCharset)
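The difference can be reproduced without any networking. The following is a minimal sketch, not the asker's code: the class name PacketText, the helper payload, the 16-byte buffer and the UTF-8 charset are all assumptions chosen for illustration.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

class PacketText {
    /** Decodes only the bytes that belong to the packet's payload. */
    static String payload(DatagramPacket pkt) {
        return new String(pkt.getData(), pkt.getOffset(), pkt.getLength(),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[16];   // zero-filled receive buffer (size is arbitrary)
        byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(msg, 0, buffer, 0, msg.length);

        // Stands in for a received packet: 5 payload bytes inside a 16-byte buffer.
        DatagramPacket pkt = new DatagramPacket(buffer, 0, msg.length);

        // Converting the whole buffer keeps the 11 trailing zero bytes as '\u0000' chars.
        String whole = new String(pkt.getData(), StandardCharsets.UTF_8);
        System.out.println(whole.length());         // 16
        System.out.println(payload(pkt).length());  // 5
    }
}
```

The trailing '\u0000' characters are part of the first string, which would also explain a regex anchored at the end of the input failing to match.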

Related

Java: Faster alternative to String(byte[])

I am developing a Java-based downloader for binary data. This data is transferred via a text-based protocol (UU-encoded). The netty library is used for the networking. The server splits the binary data into many thousands of small packets and sends them to the client (i.e. the Java application).
From netty I receive a ChannelBuffer object every time a new message (data) arrives. I then need to process that data; among other tasks, I need to check the header of the packet coming from the server (like the HTTP status line). To do so I call ChannelBuffer.array() to get a byte[] array. I can then convert this array into a string via new String(byte[]) and easily check (e.g. compare) its content (again, like a comparison to the "200" status message in HTTP).
The software I am writing uses multiple threads/connections, so I receive multiple packets from netty in parallel.
This usually works fine. However, while profiling the application I noticed that when the connection to the server is good and data comes in very fast, this conversion to a String object seems to be a bottleneck. CPU usage is close to 100% in such cases, and according to the profiler a large share of the time is spent in the String(byte[]) constructor.
I searched for a better way to get from the ChannelBuffer to a String, and noticed that the former also has a toString() method. However, that method is even slower than the String(byte[]) constructor.
So my question is: does anyone know a better alternative for what I am doing?

Perhaps you could skip the String conversion entirely? You could have constants holding byte arrays for your comparison values and check array-to-array instead of String-to-String.
Here's some quick code to illustrate. Currently you're doing something like this:
String http200 = "200";
// byte[] -> String conversion happens every time
String input = new String(ChannelBuffer.array());
return input.equals(http200);
Maybe this is faster:
// Ideally only convert String->byte[] once. Store these
// arrays somewhere and look them up instead of recalculating.
final byte[] http200 = "200".getBytes("UTF-8"); // Select the correct charset!
// Input doesn't have to be converted!
byte[] input = ChannelBuffer.array();
return Arrays.equals(input, http200);

1. Some of your checks might only need to look at part of the buffer. If so, you could use the alternate form of the String constructor:
new String(byteArray, startCol, length)
That might mean far fewer bytes get converted to a string.
Looking for "200" within the message would be an example.
2. You might find that you can use the length of the byte array as a clue. If some messages are long and you are looking for a short one, ignore the long ones and don't convert them to characters. Or something like that.
3. Along with what @EricGrunzke said, you could look at part of the byte buffer to filter out some messages, so that you never need to convert them from bytes to characters.
4. If your bytes are ASCII characters, the conversion to characters might be quicker if you use the charset "ASCII" instead of whatever the default is for your server:
new String(bytes, "ASCII")
might be faster in that case.
In fact, you might be able to pick the charset for the byte-to-character conversion in some organized fashion that speeds things up.
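A rough sketch of the "look at part of the buffer" idea, with no String created at all. The names BytePrefix, STATUS_200 and startsWith are hypothetical, and the sketch assumes the status code is ASCII and sits at a known offset in the array.

```java
import java.nio.charset.StandardCharsets;

class BytePrefix {
    // Converted once and reused for every incoming message.
    static final byte[] STATUS_200 = "200".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if data[offset .. offset+length) begins with the given prefix. */
    static boolean startsWith(byte[] data, int offset, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;   // too short to contain the prefix
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[offset + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] message = "200 OK, more payload follows".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWith(message, 0, message.length, STATUS_200)); // true
    }
}
```

Only the first three bytes are touched, regardless of how long the message is.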

Depending on what you are trying to do, there are a few options:
If you are just trying to get the response status, can't you just call getStatus()? This would probably be faster than getting the string out.
If you are trying to convert the buffer then, assuming you know it will be ASCII (which it sounds like you do), just leave the data as a byte[] and convert your UUDecode method to work on a byte[] instead of a String.
The biggest cost of the string conversion is most likely copying the data from the byte array into the String's internal char array. Combined with the charset conversion, that is most likely a bunch of work you don't need to do.

cordova.exec strange behaviour in parameters

Using Cordova 2.6, I am calling a plugin via cordova.exec.
The issue is that a parameter passed as an array element arrives cut off (truncated from the end by some characters) on the Java side,
but if I pass a simple string as a parameter, it is passed perfectly fine.
A 5714-character string goes through without any issues, but 450 characters passed as an array's first element are cut to about a quarter of the original length (119).
I then tried the following:
1. Converted the first array element to a string (checked the typeof in JavaScript) and passed it, but that did not help.
2. Created a substring of the original array element with static limits, i.e. substring(0,4000) etc., but no luck.
3. Made a clone of the original array and repeated steps 1 and 2, but again no luck.
Could someone tell me where the issue is?
I also increased the Eclipse heap memory and changed to Cordova 2.8.1, but got the same result. :(

I found the reason for the behaviour. I inspected many images, and in the execute method of my plugin I did the following:
int myLength = args.getString(0).length();
Log.v(TAG,Character.toString(args.getString(0).charAt(myLength-1)));
The last character on the plugin/Java side was =, so an = is present at the end of the base64 string. It seems that Cordova is inserting these delimiters into the base64 string.
P.S.: I went through phonegap.js and found that the arguments are packed with JSON.stringify, but I could not get further than that, i.e. how and where the = is inserted.
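For context, = is also the standard Base64 padding character: an encoder appends one or two = whenever the input length is not a multiple of three bytes. A trailing = may therefore simply be part of the encoded data rather than something added in transit. A minimal sketch using java.util.Base64 (the class name Base64Padding and the helper encode are made up for illustration, and the input bytes are arbitrary):

```java
import java.util.Base64;

class Base64Padding {
    /** Encodes raw bytes with the standard (padded) Base64 alphabet. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] {1, 2, 3})); // AQID  (3 bytes, no padding)
        System.out.println(encode(new byte[] {1, 2}));    // AQI=  (2 bytes, one '=')
        System.out.println(encode(new byte[] {1}));       // AQ==  (1 byte, two '=')
    }
}
```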

stringbuffer and "0&" causes truncation or escaping

Sorry for the unclear title, but I don't even know what to call this. I'll just go ahead and explain what's happening.
I'm using a StringBuffer to build a URL. It looks like this:
http://maps.googleapis.com/maps/api/geocode/json?latlng=49.0516736,8.38891840&sensor=false
I encountered this behaviour when comparing this string in a unit test to the actual result of the method.
And this is the assertion error I'm getting:
latlng=49.0516736[,8.38891840]&sensor=false> but was:<...on?latlng=49.0516736[,8.3889184]&sensor=false
The emphasis is on the character sequences 0]& and 4]& right before sensor=false.
If I remove the zero before the &, the test goes green.
The created string then looks like this:
latlng=49.0516736,8.3889184&sensor=false
so ... just as expected.
The problem is not that the 0 gets truncated and the test fails. I've shown that my code does what it's supposed to (when I remove the zero), but I want to know what is happening here.
0& must be some kind of indication for array access or some kind of escaping. I don't know.
Does anyone have an idea what's causing this behaviour?
Edit:
Here's the code I'm using
StringBuffer s = new StringBuffer( grailsApplication.config.geocodingurl.toString() )
s.append(coordinates.latitude)
s.append(",")
s.append(coordinates.longitude)
s.append("&sensor=false")
return s.toString()

There is a formatting/padding issue when converting a double to a String.
What you are doing is probably calling StringBuffer#append(double), which in the end calls Double#toString(). That method does not preserve trailing zeros, so 8.38891840 is rendered as 8.3889184.
See the javadoc of those methods to find out how double values are converted to String.
Alternatively, if you want control over the output, use NumberFormat or its subclasses.
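A short sketch of both behaviours, assuming eight decimal places are wanted in the URL. CoordinateFormat and fixed8 are illustrative names, and Locale.ROOT is chosen here so that the decimal separator is always a dot.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class CoordinateFormat {
    /** Formats a value with exactly eight decimal places and a '.' separator. */
    static String fixed8(double value) {
        DecimalFormat format = new DecimalFormat(
                "0.00000000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(value);
    }

    public static void main(String[] args) {
        double longitude = 8.38891840;
        // What append(double) ends up producing: the trailing zero is gone.
        System.out.println(Double.toString(longitude)); // 8.3889184
        // Explicit formatting keeps the zero.
        System.out.println(fixed8(longitude));          // 8.38891840
    }
}
```

A double has no notion of trailing zeros, so the literal 8.38891840 and 8.3889184 are the same value; only explicit formatting can bring the zero back.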

reading multiple lines in file upload

Can anyone tell me how to read multiple lines and store their values?
E.g. file.txt:
Probable Cause: The network operator has issued an alter attribute command for
the specified LCONF assign. The old value and the new value are show
Action Taken : The assign value is changed from the old value to the new
value. Receipt of this message does not guarantee that the new attribute
value was accepted by clients who use it. Additional messages may be.
Probable Cause: The network operator has issued an info attribute command for
the specified LCONF assign. The default value being used is displaye
Action Taken : None. Informational use only.
In the above file, Probable Cause and Action Taken are columns of a database table. The text after Probable Cause: is the value to be stored in the probable cause column, and the same goes for Action Taken.
So how can I read the multiple lines and store their values? I have to read the value for Probable Cause until a line starting with Action Taken appears. I'm using BufferedReader and its readLine() method to read one line at a time. Can anyone tell me how to read everything from Probable Cause up to Action Taken, no matter how many lines come between them?

The simplest way is probably to just keep a List<String> for each value, with loops something like this:
private static final String ACTION_TAKEN_PREFIX = "Action Taken ";
...
String line;
while ((line = reader.readLine()) != null)
{
    if (line.startsWith(ACTION_TAKEN_PREFIX))
    {
        // Strip the prefix and keep the rest of the line as the first action
        actions.add(line.substring(ACTION_TAKEN_PREFIX.length()));
        // Keep reading the rest of the actions
        break;
    }
    causes.add(line);
}
// Now handle the fact that either we've reached the end of the file, or we're
// reading the actions
Once you've got a "Probable Cause" / "Actions Taken" pair, convert the list of strings back to a single string, e.g. joining with "\n", and then insert in the database. (The Joiner class in Guava will make this easier.)
The tricky bit is dealing with anomalies:
What happens if you don't start with a Probable Cause?
What happens if one probable cause is followed by another, or one set of actions is followed by another?
What happens if you reach the end of the file after reading a probable cause but no list of actions?
I don't have the time to write out a complete solution now, but hopefully the above will help to get you going.
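One possible way to flesh this out is sketched below. It is not the answerer's solution: the class CauseActionParser, the Entry type and the two prefix constants are assumptions taken from the sample file, and the anomaly handling is just one reasonable choice (lines before the first prefix are ignored, a cause without an action yields an empty action, repeated sections are appended).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class CauseActionParser {
    static final String CAUSE_PREFIX = "Probable Cause:";
    static final String ACTION_PREFIX = "Action Taken :";

    /** One record; each field holds the lines of its section joined with "\n". */
    static final class Entry {
        final String cause;
        final String action;
        Entry(String cause, String action) { this.cause = cause; this.action = action; }
    }

    static List<Entry> parse(BufferedReader reader) throws IOException {
        List<Entry> entries = new ArrayList<>();
        List<String> causes = new ArrayList<>();
        List<String> actions = new ArrayList<>();
        List<String> current = null;   // section that continuation lines belong to
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(CAUSE_PREFIX)) {
                // A new cause closes the previous record, if any.
                if (!causes.isEmpty() || !actions.isEmpty()) {
                    entries.add(new Entry(String.join("\n", causes), String.join("\n", actions)));
                    causes = new ArrayList<>();
                    actions = new ArrayList<>();
                }
                causes.add(line.substring(CAUSE_PREFIX.length()).trim());
                current = causes;
            } else if (line.startsWith(ACTION_PREFIX)) {
                actions.add(line.substring(ACTION_PREFIX.length()).trim());
                current = actions;
            } else if (current != null) {
                current.add(line.trim());   // continuation of the current section
            }
        }
        // Flush the last record at end of file.
        if (!causes.isEmpty() || !actions.isEmpty()) {
            entries.add(new Entry(String.join("\n", causes), String.join("\n", actions)));
        }
        return entries;
    }
}
```

Each Entry then maps to one row, with cause and action going into their respective columns. String.join (Java 8 and later) is used here in place of Guava's Joiner.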

OrientDB having trouble with Unicode, Turkish, and enums

I am using a library which has an enum type with constants like these:
Type.SHORT
Type.LONG
Type.FLOAT
Type.STRING
While debugging in Eclipse, I got an error:
No enum const class Type.STRİNG
Since I am using a Turkish system, there is a problem with the i > İ conversion. But this is an enum constant, and even though I set every encoding attribute to UTF-8, nothing makes Eclipse look for STRING. It still looks for STRİNG, can't find it, and so I can't use it. What must I do about that?
Project > Properties > Resource > Text file encoding is UTF-8 now. The problem persists.
EDIT: More information, which may give some clues that I am missing:
I am working with OrientDB. This is my first attempt, so I don't know whether the problem could be in the OrientDB packages. But I am using many other libraries and have never seen such a problem. There is an OType enum in this package, and I am only trying to connect to the database.
String url = "local:database";
ODatabaseObjectTx db = new ODatabaseObjectTx(url).
Person person = new Person("John");
db.save(person);
db.close();
That is all the code I have so far. The database is created, but then I get the java.lang.IllegalArgumentException:
Caused by: java.lang.IllegalArgumentException: No enum const class com.orientechnologies.orient.core.metadata.schema.OType.STRİNG
at java.lang.Enum.valueOf(Unknown Source)
at com.orientechnologies.orient.core.metadata.schema.OType.valueOf(OType.java:41)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLCreateProperty.parse(OCommandExecutorSQLCreateProperty.java:81)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLCreateProperty.parse(OCommandExecutorSQLCreateProperty.java:35)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.parse(OCommandExecutorSQLDelegate.java:43)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.parse(OCommandExecutorSQLDelegate.java:28)
at com.orientechnologies.orient.core.storage.OStorageEmbedded.command(OStorageEmbedded.java:63)
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.addProperty(OClassImpl.java:342)
at com.orientechnologies.orient.core.metadata.schema.OClassImpl.createProperty(OClassImpl.java:258)
at com.orientechnologies.orient.core.metadata.security.OSecurityShared.create(OSecurityShared.java:177)
at com.orientechnologies.orient.core.metadata.security.OSecurityProxy.create(OSecurityProxy.java:37)
at com.orientechnologies.orient.core.metadata.OMetadata.create(OMetadata.java:70)
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.create(ODatabaseRecordAbstract.java:142)
... 4 more
Here is OType class: http://code.google.com/p/orient/source/browse/trunk/core/src/main/java/com/orientechnologies/orient/core/metadata/schema/OType.java
And the other class, OCommandExecutorSQLCreateProperty:
http://code.google.com/p/orient/source/browse/trunk/core/src/main/java/com/orientechnologies/orient/core/sql/OCommandExecutorSQLCreateProperty.java
Line 81 says: type = OType.valueOf(word.toString());

Am I correct to assume you are running this program using a Turkish locale? Then it seems the bug is in line 118 of OCommandExecutorSQLCreateProperty:
linkedType = OType.valueOf(linked.toUpperCase());
You would have to specify the Locale whose upper-casing rules should be used, probably Locale.ENGLISH, as the parameter to toUpperCase.
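The effect is easy to reproduce in isolation. In the sketch below, UpperCaseLocale and the local Type enum are stand-ins invented for illustration, not OrientDB's classes, and Locale.ROOT is used as a locale-neutral alternative to Locale.ENGLISH.

```java
import java.util.Locale;

class UpperCaseLocale {
    enum Type { SHORT, LONG, FLOAT, STRING }

    /** Looks up an enum constant from a name of any case, independent of the default locale. */
    static Type parse(String name) {
        return Type.valueOf(name.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish rules 'i' upper-cases to U+0130 (dotted capital I), not 'I'.
        String turkishUpper = "string".toUpperCase(turkish);
        System.out.println(turkishUpper.equals("STR\u0130NG")); // true
        System.out.println(turkishUpper.equals("STRING"));      // false

        // With an explicit locale the lookup works whatever the default locale is.
        System.out.println(parse("string"));                    // STRING
    }
}
```

toUpperCase() without an argument uses the JVM's default locale, which is why the same code behaves differently on a Turkish system.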

This problem is related to your database connection. Presumably, there's a string in OrientDB somewhere, and you are reading it, and then trying to use it to select a member of the enum.
I'm assuming in the code that you posted that the variable word comes from data in the database. If it comes from somewhere else, then the problem is the 'somewhere else'. If OrientDB, for some strange reason, returns 'STRİNG' as metadata to tell you the type of something, then that is indeed a defect in OrientDB.
If that string actually contains a İ, then no Eclipse setting will have any effect on the results. You will have to write code to normalize İ to I.
If you dump out the contents of 'word' as a sequence of hex values for the chars of the string, I think you'll see your İ staring right at you. You have to change what's in the DB to have a plain old I.
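A small helper for such a dump might look like the following sketch (CharDump and hex are hypothetical names). A plain I shows up as 0049, while İ shows up as 0130.

```java
class CharDump {
    /** Renders each char of the string as a four-digit hex code, separated by spaces. */
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("STRING"));      // 0053 0054 0052 0049 004e 0047
        System.out.println(hex("STR\u0130NG")); // 0053 0054 0052 0130 004e 0047
    }
}
```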

Unfortunately, this is related to the regional settings (locale) of your OS, which is Turkish.
Two workaround options:
1. Change your regional settings to English-US
2. Set the JVM's locale to English with command-line parameters:
-Duser.language=en -Duser.region=EN
I have created bug reports for xmlbeans, exist and apache cxf for the same issue. Upper-casing the enumeration name is the point where the exception occurs.
Some related links:
https://issues.apache.org/jira/browse/XMLSCHEMA-22
http://mail-archives.apache.org/mod_mbox/xmlbeans-user/201001.mbox/%3CSNT123-DS11993DD331D6CA7799C46CF6650#phx.gbl%3E
http://mail-archives.apache.org/mod_mbox/cxf-users/201203.mbox/%3CBLU0-SMTP115A668459D9A0DA11EA5FAF6460#phx.gbl%3E
https://vaadin.com/forum/-/message_boards/view_message/793105
http://comments.gmane.org/gmane.comp.apache.cxf.user/18316

One work-around is to type Type.ST and then press Ctrl-space. Eclipse should auto-complete the variable name without you having to figure out how to enter a dotless capital I on a Turkish keyboard. :)
