How to convert String to byte without changing? - java

I need a solution to convert String to byte array without changing like this:
Input:
String s="Test";
Output:
String s="Test";
byte[] b="Test";
When I use
s.getBytes();
then the reply is
"[B@428b76b8"
but I want the reply to be
"Test"

You should always make sure serialization and deserialization use the same character set, which maps characters to byte sequences and vice versa. By default, String.getBytes() and new String(bytes) use the platform's default character set, which may be locale-specific.
Use the getBytes(Charset) overload
byte[] bytes = s.getBytes(Charset.forName("UTF-8"));
Use the new String(bytes, Charset) constructor
String andBackAgain = new String(bytes, Charset.forName("UTF-8"));
Also Java 7 added the java.nio.charset.StandardCharsets class, so you don't need to use dodgy String constants anymore
byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
String andBackAgain = new String(bytes, StandardCharsets.UTF_8);

You can revert back using
String originalString = new String(b, "UTF-8");
That should get you back your original string. You don't want the bytes printed out directly.
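To make the point about printing concrete, here is a minimal sketch (the class and method names are made up for illustration). It assumes UTF-8 on both sides and shows two ways to get readable output: Arrays.toString for the byte values, and new String(...) for the original text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrintBytesDemo {
    // Encodes the text as UTF-8 and renders each byte value as readable text.
    public static String describe(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.toString(b);
    }

    // Encodes and decodes with the same charset, which gives back the original text.
    public static String roundTrip(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(describe("Test"));  // [84, 101, 115, 116]
        System.out.println(roundTrip("Test")); // Test
    }
}
```

Printing the array variable itself only shows its type and hash code, which is where the "[B..." output in the question comes from.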

You may try the following code snippet -
String string = "Sample String";
byte[] byteArray = string.getBytes();

In general that's probably not what you want to do, unless you're serializing or transmitting the data. Also, Java strings are UTF-16 internally rather than UTF-8, which is more like what you're expecting. If you really do want/need this then this should work:
String str = "Test";
byte[] raw = str.getBytes(Charset.forName("UTF-8"));

Related

How to convert hex string to Shift JIS encoding in java?

How can I convert a word's HEX code string to Shift JIS encoding?
For example, I have a string:
"90DD92E882F08F898AFA89BB82B582DC82B782A9"
And I want to get the following output:
設定を初期化しますか
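String s = new String(new BigInteger("90DD92E882F08F898AFA89BB82B582DC82B782A9", 16).toByteArray(), "Shift_JIS");
will do it for you on earlier versions. Be aware that toByteArray() prepends a zero sign byte when the top bit of the first byte is set, as it is here (0x90), so the decoded string starts with a NUL character that you may want to strip.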
Assuming you have Java 17+, which added java.util.HexFormat, then you can use parseHex followed by a conversion from the byte array to a string:
byte[] bytes = HexFormat.of().parseHex("90DD92E882F08F898AFA89BB82B582DC82B782A9");
String str = new String(bytes, "Shift_JIS");
If you do not have Java 17+, then the related answer I linked to gives an alternative approach instead of parseHex.
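For readers without Java 17, one possible alternative to parseHex is a small hand-written loop. This is only a sketch: hexToBytes is a hypothetical helper (not a JDK method), and it assumes the input is valid hex with an even number of digits.

```java
import java.nio.charset.Charset;

public class HexToShiftJis {
    // Converts each pair of hex digits to one byte.
    // Assumes an even-length string containing only valid hex digits.
    public static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = hexToBytes("90DD92E882F08F898AFA89BB82B582DC82B782A9");
        // Shift_JIS is an extended charset; it is assumed to be available in this JDK build.
        System.out.println(new String(bytes, Charset.forName("Shift_JIS")));
    }
}
```

Unlike the BigInteger route, this produces exactly one byte per pair of hex digits, with no extra sign byte.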
I don't have the correct charset/font to show the result in my console.

Convert byte[] to String and back

I'm trying to save content of a pdf file in a json and thought of saving the pdf as String value converted from byte[].
byte[] byteArray = feature.convertPdfToByteArray(Paths.get("path.pdf"));
String byteString = new String(byteArray, StandardCharsets.UTF_8);
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
System.out.println(secondString.equals(byteString));
System.out.println(Arrays.equals(byteArray, newByteArray));
System.out.println(byteArray.length + " vs " + newByteArray.length);
The result of the above code is as follows:
true
false
421371 vs 760998
The two Strings are equal while the two byte[]s are not. Why is that and how to correctly convert/save a pdf inside a json?
You are probably using the wrong charset when reading from the PDF file.
For example, the character é (e with acute) is a single byte (0xE9) in ISO-8859-1, and that byte on its own is not valid UTF-8:
byte[] byteArray = "é".getBytes(StandardCharsets.ISO_8859_1);
String byteString = new String(byteArray, StandardCharsets.UTF_8);
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
System.out.println(secondString.equals(byteString));
System.out.println(Arrays.equals(byteArray, newByteArray));
System.out.println(byteArray.length + " vs " + newByteArray.length);
Output :
true
false
1 vs 3
Why is that
If the byteArray indeed contains a PDF, it most likely is not valid UTF-8. Thus, wherever
String byteString = new String(byteArray, StandardCharsets.UTF_8);
stumbles over a byte sequence which is not valid UTF-8, it will replace that by a Unicode replacement character. I.e. this line damages your data, most likely beyond repair. So the following
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
does not result in the original byte array but instead a damaged version of it.
The newByteArray, on the other hand, is the result of UTF-8 encoding a given string, byteString. Thus, newByteArray is valid UTF-8 and
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
does not need to replace anything outside the UTF-8 mappings, in particular byteString and secondString are equal.
how to correctly convert/save a pdf inside a json?
As @mammago explained in his comment,
JSON is not the appropriate format for binary content (like files). You should probably use something like base64 to create a string out of your PDF and store that in your JSON object.
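A minimal sketch of the base64 approach, using java.util.Base64 (available since Java 8). The class name, the "pdf" field name, and the sample bytes are made up for illustration; real code would pass the actual file bytes and use a JSON library rather than string concatenation.

```java
import java.util.Arrays;
import java.util.Base64;

public class PdfBase64Demo {
    // Turns arbitrary bytes into text that is safe to place in a JSON string.
    public static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Restores the exact original bytes from the base64 text.
    public static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for PDF content: "%PDF" followed by bytes that are not valid UTF-8.
        byte[] pdfBytes = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        String encoded = encode(pdfBytes);
        String json = "{\"pdf\":\"" + encoded + "\"}";
        System.out.println(json);
        System.out.println(Arrays.equals(pdfBytes, decode(encoded))); // true
    }
}
```

Base64 output grows by roughly a third compared with the raw bytes, but the round trip is lossless, which the new String(..., UTF_8) route is not.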

Java - String of bytes to bytes[]

I have a
String b = "[B@64964f8e";
This is the byte[] output, which I stored in a string.
Now I would like to convert it back to byte[]
byte[] c = b.getBytes();
but it gave me a different byte array, which is
[B@9615a1f
How can I get back the same as [B@64964f8e?
String b = "[B@64964f8e";
That's not a string encoding of your data. It's the array's type followed by its identity hash code. It's nothing more than a transient reference code, and if the original array was GC'd you wouldn't even have a hope of getting it back with really funky native methods romping through memory.
I suspect you are trying to do the wrong thing and this won't help you at all, because I would have thought you want the contents to be the same, not the result of the toString() method.
You shouldn't be using a text String to hold binary data, but you can use ISO-8859-1
byte[] bytes = new byte[16]; new java.util.Random().nextBytes(bytes); // random bytes
String text = new String(bytes, "ISO-8859-1");
byte[] bytes2 = text.getBytes("ISO-8859-1"); // gets back the same bytes.
But to answer your question, you can do this.
Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
theUnsafe.setAccessible(true);
Unsafe unsafe = (Unsafe) theUnsafe.get(null);
byte[] bytes = new byte[0];
unsafe.putInt(bytes, 1L, 0x64964f8e);
System.out.println(bytes);
prints
[B@64964f8e
"[B@64964f8e" is not a string encoding of your byte[]. That is the result of the default toString() implementation, which tells you the type and the object's identity hash code. Maybe you wanted to use base64-encoding instead, e.g. using javax.xml.bind.DatatypeConverter's parseBase64Binary() and printBase64Binary() (javax.xml.bind is no longer bundled with the JDK as of Java 11; java.util.Base64, available since Java 8, does the same job):
byte[] myByteArray = // something
String myString = javax.xml.bind.DatatypeConverter.printBase64Binary(myByteArray);
byte[] decoded = javax.xml.bind.DatatypeConverter.parseBase64Binary(myString);
// myByteArray and decoded have the same contents!
A simple answer is:
System.out.println(c) prints the reference's representation of the c object, not c's content. (This happens whenever Object's toString() method is not overridden, as is the case for arrays.)
String b = "[B@64964f8e";
byte[] c = b.getBytes();
System.out.println(c); //prints reference's representation of c
System.out.println(new String(c)); //prints [B@64964f8e
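The same distinction between reference and content applies when comparing arrays. The following sketch (class and method names are hypothetical) contrasts ==, equals() and Arrays.equals() on two arrays holding identical bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CompareBytesDemo {
    // Compares two arrays element by element rather than by reference.
    public static boolean sameContent(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        byte[] a = "Test".getBytes(StandardCharsets.UTF_8);
        byte[] b = "Test".getBytes(StandardCharsets.UTF_8);
        System.out.println(a == b);             // false: two distinct array objects
        System.out.println(a.equals(b));        // false: arrays inherit Object.equals
        System.out.println(sameContent(a, b));  // true: same bytes in the same order
        System.out.println(Arrays.toString(a)); // [84, 101, 115, 116]
    }
}
```

So two arrays that print different "[B@..." values can still hold exactly the same data.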

Convert a String to a byte array and then back to the original String

Is it possible to convert a string to a byte array and then convert it back to the original string in Java or Android?
My objective is to send some strings to a microcontroller (Arduino) and store them in EEPROM (which is only 1 KB). I tried to use an MD5 hash, but it seems it's only one-way encryption. What can I do to deal with this issue?
I would suggest using the members of String, but with an explicit encoding:
byte[] bytes = text.getBytes("UTF-8");
String text = new String(bytes, "UTF-8");
By using an explicit encoding (and one which supports all of Unicode) you avoid the problems of just calling text.getBytes() etc:
You're explicitly using a specific encoding, so you know which encoding to use later, rather than relying on the platform default.
You know it will support all of Unicode (as opposed to, say, ISO-Latin-1).
EDIT: Even though UTF-8 is the default encoding on Android, I'd definitely be explicit about this. For example, this question only says "in Java or Android" - so it's entirely possible that the code will end up being used on other platforms.
Basically given that the normal Java platform can have different default encodings, I think it's best to be absolutely explicit. I've seen way too many people using the default encoding and losing data to take that risk.
EDIT: In my haste I forgot to mention that you don't have to use the encoding's name - you can use a Charset instead. Using Guava I'd really use:
byte[] bytes = text.getBytes(Charsets.UTF_8);
String text = new String(bytes, Charsets.UTF_8);
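Since the question mentions a 1 KB EEPROM, it may help to check the encoded size before sending, because the number of UTF-8 bytes can exceed the number of characters. This is a sketch only: the class name, the CAPACITY constant, and the policy of throwing an exception are assumptions, and it uses the JDK's StandardCharsets (Java 7+) instead of Guava.

```java
import java.nio.charset.StandardCharsets;

public class EepromPayload {
    // Capacity taken from the question (1 KB); adjust for the actual device.
    static final int CAPACITY = 1024;

    // Encodes with an explicit charset and rejects payloads that would not fit.
    public static byte[] encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > CAPACITY) {
            throw new IllegalArgumentException("Payload is " + bytes.length + " bytes, limit is " + CAPACITY);
        }
        return bytes;
    }

    // Decodes with the same charset that was used for encoding.
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // "Grüße", written with escapes to avoid source-encoding issues
        byte[] bytes = encode(text);
        System.out.println(text.length() + " chars, " + bytes.length + " bytes"); // 5 chars, 7 bytes
        System.out.println(decode(bytes).equals(text)); // true
    }
}
```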
You can do it like this.
String to byte array
String stringToConvert = "This String is 76 characters long and will be converted to an array of bytes";
byte[] theByteArray = stringToConvert.getBytes();
http://www.javadb.com/convert-string-to-byte-array
Byte array to String
byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
String value = new String(byteArray);
http://www.javadb.com/convert-byte-array-to-string
Use String.getBytes() to convert to bytes and use the String(byte[] data) constructor to convert back to a string.
byte[] pdfBytes = Base64.decode(myPdfBase64String, Base64.DEFAULT);
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;
import org.apache.commons.codec.binary.Hex; // from Apache Commons Codec
public class FileHashStream
{
    // reads everything from an input stream and returns its hash as a new byte array
    public static byte[] read(InputStream is) throws Exception
    {
        // the original snippet never created "md"; SHA-256 is an assumption, use the algorithm you need
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        // we need a byte buffer; we will use 16 kilobytes
        byte[] buf = new byte[1024 * 16];
        int len = 0;
        // use the buffer to update our "MessageDigest" instance
        while(true)
        {
            len = is.read(buf);
            if(len < 0) break;
            md.update(buf, 0, len);
        }
        // close the input stream
        is.close();
        // call the "digest" method for obtaining the final hash-result
        byte[] ret = md.digest();
        System.out.println("Length of Hash: " + ret.length);
        for(byte b : ret)
        {
            System.out.print(b + ", ");
        }
        String compare = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
        String verification = Hex.encodeHexString(ret);
        System.out.println();
        System.out.println("===");
        System.out.println(verification);
        System.out.println("Equals? " + verification.equals(compare));
        return ret;
    }
    public static void main(String[] args) throws Exception
    {
        // pass the absolute path for the 'commons-codec-1.10-bin.zip' as the first argument
        read(new FileInputStream(args[0]));
    }
}

How best to convert a byte[] array to a string buffer

I have a number of byte[] array variables I need to convert to string buffers.
Is there a method for this type of conversion?
Thanks
Thank you all for your responses. However, I didn't make myself clear.
I'm using some byte[] arrays pre-defined as public static "under" the class declaration
for my Java program. These "fields" are reused during the "life" of the process.
As the program issues status messages (written to a file), I've defined a string buffer
(mesg_data) that is used to format a status message.
So as the program executes
I tried msg2 = String(byte_array2)
I get a compiler error:
cannot find symbol
symbol : method String(byte[])
location: class APPC_LU62.java.LU62XnsCvr
convrsID = String(conversation_ID) ;
example:
public class LU62XnsCvr extends Object
.
.
static String convrsID ;
static byte[] conversation_ID = new byte[8] ;
So I can't use a "dynamic" definition of a string variable because the same variable is used
in multiple occurrences.
I hope I made myself clear
Thanks ever so much
Guy
String s = new String(myByteArray, "UTF-8");
StringBuilder sb = new StringBuilder(s);
There is a constructor that takes a byte array and an encoding:
byte[] bytes = new byte[200];
//...
String s = new String(bytes, "UTF-8");
In order to translate bytes to characters you need to specify encoding: the scheme by which sequences (typically of length 1,2 or 3) of 0-255 values (that is: sequence of bytes) are mapped to characters. UTF-8 is probably the best bet as a default.
You can turn it into a String directly
byte[] bytearray;
// ....
String mystring = new String(bytearray);
and then to convert to a StringBuffer
StringBuffer buffer = new StringBuffer(mystring);
You may use
str = new String(bytes);
By the way, what the code above does is create a Java String (i.e. UTF-16) by decoding the bytes with the default platform character encoding.
If the byte array was created from a string encoded in the platform default character encoding this will work well.
If not, you need to specify the correct character encoding (Charset), as in
String str = new String(bytes, charset);
It depends entirely on the character encoding, but you want:
String value = new String(bytes, "US-ASCII");
This would work for US-ASCII values.
See Charset for other valid character encodings (e.g., UTF-8)
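Putting the answers together for the situation in the question: the compiler error comes from calling String(...) without new, and a static buffer can be reused by resetting its length rather than declaring a new variable. The sketch below is only illustrative. The names StatusMessage, conversationId and mesgData loosely mirror the question, and US-ASCII is an assumption about how the bytes were produced.

```java
import java.nio.charset.StandardCharsets;

public class StatusMessage {
    // Static fields reused for the life of the process, as described in the question.
    static byte[] conversationId = "CONV0001".getBytes(StandardCharsets.US_ASCII);
    static final StringBuffer mesgData = new StringBuffer();

    // Clears the shared buffer and formats a fresh status line from the given bytes.
    public static String format(byte[] id) {
        mesgData.setLength(0); // reuse the same buffer instead of creating a new one
        // note the "new": String(id) without it is parsed as a method call and fails to compile
        mesgData.append("conversation=").append(new String(id, StandardCharsets.US_ASCII));
        return mesgData.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(conversationId)); // conversation=CONV0001
        System.out.println(format(conversationId)); // same output; the buffer was reset
    }
}
```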
