byte array length varies before and after transformation - java

I need to send and receive a large byte array over the internet (HTTP RESTful service).
The simplest way I can think of is to convert the byte array into a string.
I searched around and found this post: Java Byte Array to String to Byte Array
I wrote the following code to verify the accuracy of the transformation.
String message = "Die Strahlengriffelgewächse stammen...";
System.out.println("message");
System.out.println (message);
byte[] pack = Fbs.packExce(message);
System.out.println ("pack");
System.out.println (pack);
System.out.println ("packlenght:" + pack.length);
String toString = new String(pack);
System.out.println ("toString");
System.out.println (toString);
byte[] toBytes = toString.getBytes();
System.out.println ("toBytes");
System.out.println (toBytes);
System.out.println ("toByteslength:" +toBytes.length);
Fbs.packExce() is a method that takes in a large chunk of string and churns out a large byte array.
I changed the length of the message, then checked and printed the lengths of the byte arrays before converting to a string and after converting back.
I got the following results:
...
pack
[B#5680a178
packlenght:748
...
toBytes
[B#5fdef03a
toByteslength:750
----------------------
...
pack
[B#5680a178
packlenght:1016
...
toBytes
[B#5fdef03a
toByteslength:1018
I have omitted the "message" since it is too long.
8 times out of 10, I can see that the derived byte array (the new one, "toBytes") is longer by 2 bytes than the original byte array (the "pack").
I say 8 out of 10 because there were also cases where the derived and the original arrays had the same length, see below:
...
pack
[B#5680a178
packlenght:824
toString
...
toBytes
[B#5fdef03a
toByteslength:824
...
I cannot figure out the exact rules.
Does anyone have any idea?
Or are there any better ways of converting a byte array to and from a string?
cheers

the simplest way I can think of is to convert the byte array into string.
The simplest way is the wrong way. For most character encodings, converting an arbitrary byte sequence to text is likely to be lossy.
A better (i.e. more robust) way is to use Base64 encoding. Read the javadoc for the Base64 class and its nested Encoder and Decoder classes.
If you do persist in trying to convert arbitrary bytes to characters and back using new String(byte[]) and the like:
Be sure that you choose a character encoding where a bytes -> characters -> bytes conversion sequence is not lossy. (LATIN-1 will work.)
Don't rely on the current execution platform's default character encoding for the encoding / decoding charset.
In a client / server system, the client and server have to use the same encoding.
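The points above can be illustrated with a minimal sketch. It assumes the platform default charset in the question was UTF-8 (the question does not say), so the charsets are passed explicitly here; the class and method names are made up for the illustration. When a byte that is not valid UTF-8 is decoded, new String(...) replaces it with U+FFFD, and U+FFFD encodes back to three bytes, so a single invalid byte makes the array two bytes longer. That is one plausible explanation for the "2 bytes longer" results, and for why arrays without an invalid sequence keep their length.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class RoundTripDemo {
    // bytes -> String -> bytes through a character encoding
    static byte[] viaCharset(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // bytes -> Base64 text -> bytes
    static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8
        byte[] data = {0x41, (byte) 0xFF, 0x42};

        byte[] utf8 = viaCharset(data, StandardCharsets.UTF_8);
        byte[] latin1 = viaCharset(data, StandardCharsets.ISO_8859_1);
        byte[] base64 = viaBase64(data);

        System.out.println("original length:       " + data.length);
        System.out.println("after UTF-8 roundtrip: " + utf8.length);
        System.out.println("Latin-1 identical:     " + Arrays.equals(data, latin1));
        System.out.println("Base64 identical:      " + Arrays.equals(data, base64));
    }
}
```

With this input the UTF-8 round trip grows from 3 to 5 bytes, while the ISO-8859-1 and Base64 round trips return the original bytes unchanged.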

I have a need to send and receive large byte array over internet(http restful service). the simplest way I can think of is to convert the byte array into string.
If that's all about sending/receiving a byte array with JAX-RS, each JAX-RS implementation is perfectly capable of transmitting byte[]. See the specification, section 4.2.4.

As per the suggestion by Stephen C, I turned to the Base64 basic mode.
The following is my current complete verification code:
String message = "Die Strahlengriffelgewächse stammen ...";
System.out.println("message");
System.out.println (message);
byte[] pack = Fbs.packExce(message);
System.out.println ("pack");
System.out.println (pack);
System.out.println ("packlenght:" + pack.length);
String toString = Base64.getEncoder().encodeToString(pack);
System.out.println ("toString");
System.out.println (toString);
byte[] toBytes = Base64.getDecoder().decode(toString);
System.out.println ("toBytes");
System.out.println (toBytes);
System.out.println ("toByteslength:" +toBytes.length);
String toBytesExtraction = extractExce(toBytes);
System.out.println ("toBytesExtraction");
System.out.println (toBytesExtraction);
String extraction = extractExce(pack);
System.out.println ("extraction");
System.out.println (extraction);
public static byte[] packExce(String text){
    FlatBufferBuilder builder = new FlatBufferBuilder(0);
    int textOffset = builder.createString(text);
    Exce.startExce(builder);
    Exce.addText(builder, textOffset);
    int exce = Exce.endExce(builder);
    Bucket.startBucket(builder);
    Bucket.addContentType(builder, Post.Exce);
    Bucket.addContent(builder, exce);
    int buck = Bucket.endBucket(builder);
    builder.finish(buck);
    return builder.sizedByteArray();
    //ByteBuffer buf = builder.dataBuffer();
    //return buf;
    //return Base64.getMimeEncoder().encodeToString(buf.array());
}
private String extractExce(byte[] bucket ){
    String message = null;
    ByteBuffer buf = ByteBuffer.wrap(bucket);
    Bucket cont = Bucket.getRootAsBucket(buf);
    System.out.println (cont.contentType());
    if (cont.contentType() == Post.Exce){
        message = ((Exce)cont.content(new Exce())).text();
    }
    return message;
}
and it seems to work for my purpose:
...
pack
[B#5680a178
packlenght:2020
...
toBytes
[B#5fdef03a
toByteslength:2020
----------------------
...
pack
[B#5680a178
packlenght:1872
...
toBytes
[B#5fdef03a
toByteslength:1872
...
and both extractions, from "toBytes" and from "pack" respectively, faithfully restored the original "message":
String toBytesExtraction = extractExce(toBytes);
String extraction = extractExce(pack);
As a matter of fact, what I did not mention is that my original implementation used Base64 MIME. My starting point back then was a ByteBuffer (it is now byte[]).
The following are my code snippets, if you are interested.
coder
...
ByteBuffer buf = builder.dataBuffer();
return Base64.getMimeEncoder().encodeToString(buf.array());
decoder
ByteBuffer buf = ByteBuffer.wrap(Base64.getMimeDecoder().decode(bucket));
My guess is that the problem might have come from the Base64 MIME codec,
because my first troubleshooting step had been to remove the Base64 MIME codec and use the ByteBuffer directly, and that was a success...
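A different suspect than the MIME codec itself may be the call buf.array(): ByteBuffer.array() returns the entire backing array and ignores the buffer's position and limit. Assuming builder.dataBuffer() returns a buffer whose payload does not start at index 0 (an assumption about FlatBuffers, not something verified here), the encoded string would then contain extra leading bytes. The sketch below uses only java.nio and a made-up buffer layout to show the difference; remainingBytes is a hypothetical helper.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class BufferSliceDemo {
    // Copies only the bytes between position and limit; the caller's buffer is left untouched
    static byte[] remainingBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] backing = {0, 0, 0, 0, 0, 1, 2, 3};
        // Hypothetical layout: the payload occupies only the last three bytes
        ByteBuffer buf = ByteBuffer.wrap(backing, 5, 3);

        System.out.println(buf.array().length);                    // size of the whole backing array
        System.out.println(Arrays.toString(remainingBytes(buf)));  // payload only
    }
}
```

Encoding remainingBytes(buf) instead of buf.array() would keep the Base64 text limited to the payload.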
Well, I am wandering off a bit.
Back to the topic: I still have no idea about the 2-byte difference between the byte arrays before and after converting with new String(byte[]) and String.getBytes() ...
cheers

Related

In Java Is it possible to convert character set 1047 into another character set, say 500?

I have a program which reads a message from MQ. The character set is 1047. Since my Java version is very old, it doesn't support this character set.
Is it possible to change this string into character set 500 in the program after receiving but before reading?
For example:
public void fun(String str) { // str is in character set 1047. **1047 is not supported on my system**
    /* Can I convert str into character set 500 here? Convert it into a byte stream and then back to a string. Something like this: */
    byte[] b = str.getBytes();
    ByteArrayOutputStream baos = null;
    try {
        baos = new ByteArrayOutputStream();
        baos.write(b);
        // renamed from "str", which would clash with the method parameter
        String converted = baos.toString("IBM-500");
        System.out.println(converted);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
byte[] b = str.getBytes(); converts the string (which can only be Unicode inside the JVM) to bytes using file.encoding. You should check whether str contains the correct information; if so, you need not care about the 1047 encoding. Just call str.getBytes("IBM-500") and you will get the bytes encoded in character set 500. Again, a String object only uses Unicode; when you convert a string to bytes, the encoding determines the resulting byte array.
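As a rough sketch of that idea: transcoding means decoding the raw bytes with the source charset and encoding the resulting String with the target charset. The class and method names below are made up, and the demo uses ISO-8859-1 and UTF-8 because every JVM supports them; for the case in the question one would pass Charset.forName("IBM1047") and Charset.forName("IBM500"), provided the JVM actually ships those charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeDemo {
    // Decode with the source charset, then encode with the target charset
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9}; // the letter e with acute accent in ISO-8859-1
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        for (byte b : utf8) {
            System.out.printf("%02X ", b); // the same letter, now as two UTF-8 bytes
        }
        System.out.println();
    }
}
```

The key point is that the conversion has to start from the original bytes: once a String has been built with the wrong charset, the damage cannot reliably be undone.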

How to convert a String-represented ByteBuffer into a byte array in Java

I'm new to Java and I'm not sure how to do the following:
A Scala application somewhere converts a String into bytes:
ByteBuffer.wrap(str.getBytes)
I collect this byte array as a Java String, and I wish to do the inverse of what the Scala code above did, hence get the original String (object str above).
Getting the ByteBuffer as a String to begin with is the only option I have, as I'm reading it from an AWS Kinesis stream (or is it?). The Scala code shouldn't change either.
Example string:
String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";
How can this be achieved in Java?
EDIT
Okay, so I'll try to elaborate a little more about the process:
A 3rd party Scala application produces CSV rows which I need to consume
Before storing those rows in an AWS Kinesis stream, the application does the following to each row:
ByteBuffer.wrap(output.getBytes);
I read the data from the stream as a string, and the string could look like the following one:
String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";
I need to restore the contents of the string above into its original, readable, form;
I hope I've made it clearer now, sorry for puzzling you all to begin with.
If you want to go from byte[] to String, try new String(yourBytes).
Both getBytes and the String(byte[]) constructor use the default character encoding.
From Amazon Kinesis Service API Reference:
The data blob to put into the record, which is Base64-encoded when the blob is serialized.
You need to Base64-decode the string. Using Java 8 it would look like:
byte[] bytes = Base64.getDecoder().decode("AAAAAAAAAAGZ7dFR0XmV23BR........");
str = new String(bytes, "utf-8");
Other options: Base64 Encoding in Java
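A self-contained sketch of the decode step, assuming the producer encoded the text as UTF-8 (the Scala code uses getBytes without a charset, so this is an assumption). The CSV row and the names BlobDecodeDemo and decodeBlob are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BlobDecodeDemo {
    // Base64 text -> raw bytes -> String, with an explicit charset
    static String decodeBlob(String base64Text) {
        byte[] raw = Base64.getDecoder().decode(base64Text);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "id,name,value"; // made-up CSV row
        // Simulates what arrives from the stream: the row's bytes, Base64-encoded
        String asReceived = Base64.getEncoder()
                .encodeToString(row.getBytes(StandardCharsets.UTF_8));

        System.out.println(asReceived);
        System.out.println(decodeBlob(asReceived));
    }
}
```

If the decoded text is still unreadable, the payload probably contains more than plain text, and the producer's exact serialization would need to be checked.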
I'm not sure I understand the question exactly, but do you mean this?
String decoded = new String(bytes);
public static void main(String[] args) {
    String decoded = new String(bytesData);
    String actualString;
    try {
        actualString = new String(bytesData, "UTF-8");
        System.out.println("String is " + actualString);
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
}
Sorry, wrong answer.
Again, ByteBuffer is a Java class, so they may work the same way.
You need the Java version.
From kafka ApiUtils:
def writeShortString(buffer: ByteBuffer, string: String) {
  if (string == null) {
    buffer.putShort(-1)
  } else {
    val encodedString = string.getBytes("utf-8")
    if (encodedString.length > Short.MaxValue) {
      // substitute your own exception type and message here
      throw new IllegalArgumentException("String exceeds the maximum size of " + Short.MaxValue)
    } else {
      buffer.putShort(encodedString.length.asInstanceOf[Short])
      buffer.put(encodedString)
    }
  }
}
For Kinesis data blobs:
private CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
decoder.decode(record.getData()).toString();
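A small sketch of how such a decoder behaves, using only java.nio (record.getData() from the snippet above is replaced by hand-made buffers, and the class and method names are hypothetical). Unlike new String(bytes, charset), which silently substitutes a replacement character, a decoder obtained from newDecoder() reports malformed input by default, which makes corrupted payloads easier to notice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

class StrictDecodeDemo {
    // Decodes UTF-8 strictly; malformed input raises CharacterCodingException
    static String decodeStrict(ByteBuffer data) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .decode(data.asReadOnlyBuffer()) // leaves the caller's position untouched
                .toString();
    }

    public static void main(String[] args) {
        ByteBuffer ok = ByteBuffer.wrap("h\u00e9llo".getBytes(StandardCharsets.UTF_8));
        ByteBuffer bad = ByteBuffer.wrap(new byte[] {0x41, (byte) 0xFF});
        try {
            System.out.println(decodeStrict(ok));
            System.out.println(decodeStrict(bad));
        } catch (CharacterCodingException e) {
            System.out.println("rejected: " + e);
        }
    }
}
```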

JAVA: failing to get encrypted data in string using xor

I was trying to print the encrypted text as a string, but perhaps I went wrong somewhere. I am doing a simple XOR on plain text. I pass the resulting encrypted text/string to a C program, which does the same XOR again to recover the plain text.
But in between, I am not able to get a proper string of the encrypted text to pass to C.
String xorencrypt(byte[] passwd,int pass_len){
    char[] st = new char[pass_len];
    byte[] crypted = new byte[pass_len];
    for(int i = 0; i<pass_len;i++){
        crypted[i] = (byte) (passwd[i]^(i+1));
        st[i] = (char)crypted[i];
        System.out.println((char)passwd[i]+" "+passwd[i] +"= " + (char)crypted[i]+" "+crypted[i]);/* characters are printed fine but the problem is when I am converting it into a string */
    }
    return st.toString();
}
I don't know whether some kind of encoding is also needed, because if I applied one, how would I decode and decrypt from the C program?
For example, suppose passwd = bond007
then the Java program should return akkb78>
and the C program would then decrypt akkb78> to bond007 again.
Use
return new String(crypted);
in that case you don't need st[] array at all.
By the way, the encoded value for bond007 is cmm`560 and not what you posted.
EDIT
While solution above would most likely work in most java environments, to be safe about encoding,
as suggested by Alex, provide encoding parameter to String constructor.
For example if you want your string to carry 8-bit bytes :
return new String(crypted, "ISO-8859-1");
You would need the same parameter when getting bytes from your string :
byte[] bytes = myString.getBytes("ISO-8859-1");
Alternatively, use solution provided by Alex :
return new String(st);
But, convert bytes to chars properly :
st[i] = (char) (crypted[i] & 0xff);
Otherwise, all negative bytes, crypted[i] < 0 will not be converted to char properly and you get surprising results.
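Putting the pieces of this answer together, a minimal sketch might look like the following. The class and method names are made up; ISO-8859-1 is used on both sides so that every byte value maps to exactly one char and back.

```java
import java.nio.charset.StandardCharsets;

class XorDemo {
    // XORs each byte with its 1-based position, as in the question
    static byte[] xor(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (byte) (input[i] ^ (i + 1));
        }
        return out;
    }

    // String -> bytes -> XOR -> String, with the same explicit charset both ways
    static String xorToString(String text) {
        byte[] crypted = xor(text.getBytes(StandardCharsets.ISO_8859_1));
        return new String(crypted, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String encrypted = xorToString("bond007");
        System.out.println(encrypted);              // cmm`560
        System.out.println(xorToString(encrypted)); // bond007
    }
}
```

Because XOR with the same key is its own inverse, calling xorToString twice returns the original text, which is what the C program is expected to do on its side.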
Change this line:
return st.toString();
with this
return new String(st);

Convert a String to a byte array and then back to the original String

Is it possible to convert a string to a byte array and then convert it back to the original string in Java or Android?
My objective is to send some strings to a microcontroller (Arduino) and store them in EEPROM (which is only 1 KB). I tried to use an MD5 hash, but it seems it's only one-way encryption. What can I do to deal with this issue?
I would suggest using the members of string, but with an explicit encoding:
byte[] bytes = text.getBytes("UTF-8");
String text = new String(bytes, "UTF-8");
By using an explicit encoding (and one which supports all of Unicode) you avoid the problems of just calling text.getBytes() etc:
You're explicitly using a specific encoding, so you know which encoding to use later, rather than relying on the platform default.
You know it will support all of Unicode (as opposed to, say, ISO-Latin-1).
EDIT: Even though UTF-8 is the default encoding on Android, I'd definitely be explicit about this. For example, this question only says "in Java or Android" - so it's entirely possible that the code will end up being used on other platforms.
Basically given that the normal Java platform can have different default encodings, I think it's best to be absolutely explicit. I've seen way too many people using the default encoding and losing data to take that risk.
EDIT: In my haste I forgot to mention that you don't have to use the encoding's name - you can use a Charset instead. Using Guava I'd really use:
byte[] bytes = text.getBytes(Charsets.UTF_8);
String text = new String(bytes, Charsets.UTF_8);
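For reference, the same round trip can be sketched without Guava, since java.nio.charset.StandardCharsets has been part of the JDK since Java 7. The class and method names below are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTripDemo {
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "gew\u00e4chse"; // 8 characters, one of them non-ASCII
        byte[] bytes = encode(text);

        System.out.println(text.length());               // 8 chars
        System.out.println(bytes.length);                // 9 bytes: the a-umlaut takes two
        System.out.println(decode(bytes).equals(text));  // true
    }
}
```

Note that the byte count can exceed the character count, which matters when the target is a 1 KB EEPROM.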
You can do it like this.
String to byte array
String stringToConvert = "This String is 76 characters long and will be converted to an array of bytes";
byte[] theByteArray = stringToConvert.getBytes();
http://www.javadb.com/convert-string-to-byte-array
Byte array to String
byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
String value = new String(byteArray);
http://www.javadb.com/convert-byte-array-to-string
Use String.getBytes() to convert to bytes and use the String(byte[] data) constructor to convert back to a string.
byte[] pdfBytes = Base64.decode(myPdfBase64String, Base64.DEFAULT);
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;
import org.apache.commons.codec.binary.Hex; // from the commons-codec library

public class FileHashStream
{
    // reads the file at the given path through an input stream and returns its hash as a new byte array;
    // pass in the absolute path for the 'commons-codec-1.10-bin.zip'
    public static byte[] read(String path) throws Exception
    {
        // the original snippet never declares "md"; SHA-256 here is an assumption
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        // must need a byte buffer; we will use 16 kilobytes
        byte[] buf = new byte[1024 * 16];
        int len = 0;
        // we need a new input stream
        InputStream is = new FileInputStream(path);
        // use the buffer to update our "MessageDigest" instance
        while (true)
        {
            len = is.read(buf);
            if (len < 0) break;
            md.update(buf, 0, len);
        }
        // close the input stream
        is.close();
        // call the "digest" method for obtaining the final hash-result
        byte[] ret = md.digest();
        System.out.println("Length of Hash: " + ret.length);
        for (byte b : ret)
        {
            System.out.print(b + ", ");
        }
        String compare = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
        String verification = Hex.encodeHexString(ret);
        System.out.println();
        System.out.println("===");
        System.out.println(verification);
        System.out.println("Equals? " + verification.equals(compare));
        return ret;
    }
}

HTML5 Websocket Server handshake (v.76) (Java)

I'm trying to build a Java-based HTML5 websocket server (v76) and have problems with the handshake. There are a few opensource Java solutions that supposedly support v76 but none of them seem to work.
I am certain my handshake response is correct (at least calculating the two key's responses). My question: Is Java by default Big Endian? Since the concatenation of the two key answers + the response bytes is the handshake answer, I'm having to do multiple type conversions (string to int, concat two ints into a string, then convert to byte and concat with the response bytes, then MD5 encoding), is there something in particular I need to be looking for? My response always seems accurate using Wireshark (# of bytes), but since the clients have no debug information it's hard to tell why my handshakes are failing.
Any supporting answers or working code would be EXTREMELY valuable to me.
Hey, this is a working example of the handshake producer for websockets version 76. If you use the example from the spec (http://tools.ietf.org/pdf/draft-hixie-thewebsocketprotocol-76.pdf) and print the output as a String, it produces the correct answer.
// requires java.nio.ByteBuffer, java.security.MessageDigest and java.security.NoSuchAlgorithmException
public byte[] getHandshake (String firstKey, String secondKey, byte[] last8)
{
    byte[] toReturn = null;
    //Strip out numbers (parsed as long: the key numbers can exceed Integer.MAX_VALUE)
    long firstNum = Long.parseLong(firstKey.replaceAll("\\D", ""));
    long secondNum = Long.parseLong(secondKey.replaceAll("\\D", ""));
    //Count spaces
    int firstDiv = firstKey.replaceAll("\\S", "").length();
    int secondDiv = secondKey.replaceAll("\\S", "").length();
    //Do the division
    int firstShake = (int) (firstNum / firstDiv);
    int secondShake = (int) (secondNum / secondDiv);
    //Prepare 128 bit byte array
    byte[] toMD5 = new byte[16];
    byte[] firstByte = ByteBuffer.allocate(4).putInt(firstShake).array();
    byte[] secondByte = ByteBuffer.allocate(4).putInt(secondShake).array();
    //Copy the bytes of the numbers you made into your md5 byte array
    System.arraycopy(firstByte, 0, toMD5, 0, 4);
    System.arraycopy(secondByte, 0, toMD5, 4, 4);
    System.arraycopy(last8, 0, toMD5, 8, 8);
    try
    {
        //MD5 everything together
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        toReturn = md5.digest(toMD5);
    }
    catch (NoSuchAlgorithmException e)
    {
        e.printStackTrace();
    }
    return toReturn;
}
I wrote this so feel free to use it where ever.
EDIT: Some other problems I ran into - You MUST write the 'answer' to the handshake as bytes. If you try to write it back to the stream as a String it will fail (must be something to do with char conversion?). Also, make sure you're writing the rest of the response to the handshake exactly as it shows in the spec.
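On the byte-order part of the question: a ByteBuffer is big-endian by default, which is why the putInt(...).array() calls in the handshake code yield the most significant byte first without any extra conversion. A minimal sketch (class and method names are made up):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderDemo {
    // Serializes an int with ByteBuffer's default byte order
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        System.out.println(ByteBuffer.allocate(4).order() == ByteOrder.BIG_ENDIAN); // true
        System.out.println(Arrays.toString(toBytes(0x01020304)));                   // [1, 2, 3, 4]
    }
}
```

This also fits the advice to write the handshake answer as raw bytes: the MD5 digest is arbitrary binary data, and pushing it through a String would subject it to a charset conversion.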
Jetty 7 supports web sockets, and is open source. You might find inspiration (but I would suggest you just embed Jetty in your application and be done with it).
http://blogs.webtide.com/gregw/entry/jetty_websocket_server
You can try my implementation:
https://github.com/TooTallNate/Java-WebSocket
It supports draft 75 and 76 currently. Verified with current versions of Chrome and Safari. Good luck!
