Apache Commons Codec in Java: from string to hex and vice versa

I am trying to encode a string as hex and then convert it back to a string. For this purpose I'm using Apache Commons Codec. In particular, I have defined the following methods:
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;
public String toHex(byte[] byteArray) {
    return Hex.encodeHexString(byteArray);
}
public byte[] fromHex(String hexString) {
    byte[] array = null;
    try {
        array = Hex.decodeHex(hexString.toCharArray());
    } catch (DecoderException ex) {
        Logger.getLogger(SecureHash.class.getName()).log(Level.SEVERE, null, ex);
    }
    return array;
}
The strange thing is that I do not get the same initial string when converting back. Even more strangely, the byte array I get back is different from the initial byte array of the string.
The small test program that I wrote is the following:
String uno = "uno";
byte[] uno_bytes = uno.getBytes();
System.out.println(uno);
System.out.println(uno_bytes);
String hexed = toHex(uno_bytes);
System.out.println(hexed);
byte[] arr = fromHex(hexed);
System.out.println(arr.toString());
An example of output is the following:
uno #initial string
[B@1afe17b #byte array of the initial string
756e6f #string representation of the hex
[B@34d46a #byte array of the recovered string
There is another strange behaviour as well: the byte array representation ([B@1afe17b) is not fixed but changes from run to run, and I cannot understand why.

When you print a byte array, the toString() representation does not contain the contents of the array. Instead, it contains a type indicator ([B means byte array) and the hash code. The hash code will generally differ for two distinct byte arrays, even if they hold the same contents, and it can change from run to run. See Object.toString() and Object.hashCode() for further information.
Instead, you probably want to compare the arrays for equality, using:
System.out.println("Arrays equal: " + Arrays.equals(uno_bytes, arr));
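A minimal sketch of the full round trip, checked with Arrays.equals rather than toString(). To stay self-contained it uses hand-written toHex/fromHex helpers as stand-ins for Hex.encodeHexString and Hex.decodeHex; the class name HexRoundTrip and both helpers are hypothetical, not part of Commons Codec.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HexRoundTrip {
    // Stand-in for Hex.encodeHexString: two lowercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Stand-in for Hex.decodeHex: parses two hex digits per byte.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = "uno".getBytes(StandardCharsets.UTF_8);
        String hexed = toHex(original);
        byte[] recovered = fromHex(hexed);
        System.out.println(hexed);                                         // 756e6f
        System.out.println(Arrays.toString(recovered));                    // [117, 110, 111]
        System.out.println(Arrays.equals(original, recovered));            // true
        System.out.println(new String(recovered, StandardCharsets.UTF_8)); // uno
    }
}
```

The recovered array is a different object from the original, so its default toString() differs, but its contents are identical.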

Related

Java - checking encoding of string for unit test?

I have a unit test I was trying to write for a generateKey(int length) method. The method:
1. Creates a byte array with size of input parameter length
2. Uses SecureRandom().nextBytes(randomKey) method to populate the byte array with random values
3. Encodes the byte array filled with random values to a UTF-8 String object
4. Re-writes the original byte array (called randomKey) to 0's for security
5. Returns the UTF-8 encoded String
I already have a unit test checking for the user inputting a negative value (i.e. -1), in which case creating the byte array throws a NegativeArraySizeException.
Would a good positive test case be to check that a UTF-8 encoded String is successfully created? Is there a method I can call on the generated String to check that its encoding is "UTF-8"?
I can't check that the String equals a known String, since the byte array is filled with random values each time the method is called.
The source code is here:
public static String generateKey(int length) {
    byte[] randomKey = new byte[length];
    new SecureRandom().nextBytes(randomKey);
    String key = new String(randomKey, Charset.forName("UTF-8")); //Base64.getEncoder().encodeToString(randomKey);
    Arrays.fill(randomKey, (byte) 0);
    return key;
}

You can convert a UTF-8 string to a byte array and back as below:
String str = "私の"; // replace this with your generateKey result
String newString;
try {
    byte[] b = str.getBytes("UTF-8"); // encode with the same charset used to decode
    newString = new String(b, "UTF-8");
    System.out.println(newString);
    System.out.println("size is equal ? " + (str.length() == newString.length()));
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}

First, the code you posted is simply wrong: you can't take a random array of bytes and treat it as a UTF-8 string, because UTF-8 expects certain bit patterns to indicate multi-byte characters.
Unfortunately, this failure happens silently, because you're using a string constructor that "always replaces malformed-input and unmappable-character sequences with this charset's default replacement string". So you'll get something, but you wouldn't be able to translate it back to the same binary value.
The comment in the code, however, gives you the answer: you need to use Base64 to encode the binary value.
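A small sketch of the difference, assuming nothing beyond the JDK. The class and method names (LossyUtf8Demo, survivesUtf8, survivesBase64) are made up for illustration. It checks whether a byte array comes back unchanged after passing through a String, first decoded as UTF-8 and then Base64-encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class LossyUtf8Demo {
    // Decode the bytes as UTF-8 text, re-encode, and compare with the input.
    static boolean survivesUtf8(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, asText.getBytes(StandardCharsets.UTF_8));
    }

    // Base64-encode the bytes, decode again, and compare with the input.
    static boolean survivesBase64(byte[] raw) {
        String encoded = Base64.getEncoder().encodeToString(raw);
        return Arrays.equals(raw, Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in well-formed UTF-8.
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x41};
        System.out.println(survivesUtf8(raw));   // false: malformed bytes were replaced
        System.out.println(survivesBase64(raw)); // true
    }
}
```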
However, that still won't let you verify that the encoded string is equivalent to the original byte array, because the function takes great care to zero out the array immediately after use (which doesn't really do what the author thinks it does, but I'm not going to get into that argument).
If you really want to test a method like this, you need to be able to mock out core parts of it. You could, for example, separate out the generation of random bytes from encoding, and then pass in a byte generator that keeps track of the bytes that it generated.
But the real question is why? What are you (or more correctly, the person writing this code) actually trying to accomplish? And why won't a UUID accomplish it?
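One way the suggestion to separate random-byte generation from encoding could look, as a sketch only. KeyGenerator, its Consumer<byte[]> "filler" parameter, and the secure() factory are hypothetical names, and Base64 is swapped in for the UTF-8 decoding in the question's code.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import java.util.function.Consumer;

final class KeyGenerator {
    // The byte source is injected so that a test can supply known bytes.
    private final Consumer<byte[]> filler;

    KeyGenerator(Consumer<byte[]> filler) {
        this.filler = filler;
    }

    // Production wiring: fill the buffer from SecureRandom.
    static KeyGenerator secure() {
        SecureRandom random = new SecureRandom();
        return new KeyGenerator(random::nextBytes);
    }

    String generateKey(int length) {
        byte[] raw = new byte[length];     // throws NegativeArraySizeException if length < 0
        filler.accept(raw);
        String key = Base64.getEncoder().encodeToString(raw);
        Arrays.fill(raw, (byte) 0);        // mirrors the zeroing step in the original method
        return key;
    }
}
```

A test can then pass a deterministic filler, for example one that sets every byte to 1, and assert on the exact string returned, while production code calls KeyGenerator.secure().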

Store byte[] in MongoDB using Java

I insert a document into a collection with a byte[] field. When I query the inserted document and read that field back, it returns a different byte[]. How do I fix that?
byte[] inputBytes = ...
MongoCollection<Document> collection = _db.getCollection("collectionx");
Document doc = new Document("test", 1).append("val", inputBytes);
collection.insertOne(doc);
MongoCursor<Document> cursor = collection.find(eq("test", 1)).iterator();
Document retrievedDoc = cursor.next();
cursor.close();
byte[] outputBytes = ((Binary) retrievedDoc.get("val")).getData();
// inputBytes = [B@719f369d
// outputBytes = [B@7b70cec2

The problem is not your code but how you check whether the two arrays, input and output, are equal. It seems you are just comparing the results of calling toString() on both. But toString() is not overridden for array types, so it is actually Object.toString(), which just returns the type and hash code of an object:
The toString method for class Object returns a string consisting of
the name of the class of which the object is an instance, the at-sign
character `@', and the unsigned hexadecimal representation of the hash
code of the object. In other words, this method returns a string equal
to the value of:
getClass().getName() + '@' + Integer.toHexString(hashCode())
So [B@719f369d means: 'Array of bytes' ([B) with hash code 0x719f369d. It has nothing to do with the array content.
In your example, the input and output arrays are two different objects, hence they have different memory addresses and hash codes (because hashCode() is also not overridden for array types).
Solution
If you want to compare the contents of two byte arrays, call Arrays.equals(byte[], byte[]).
If you want to print the content of a byte array, call Arrays.toString(byte[]) to convert the content into a human readable String.
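A short sketch of what that default representation is made of, next to the content-based alternative. The class ArrayToStringDemo and its helper names are hypothetical. The hash code printed will vary from run to run.

```java
import java.util.Arrays;

final class ArrayToStringDemo {
    // Rebuilds the string that Object.toString() produces for an array.
    static String defaultForm(byte[] a) {
        return a.getClass().getName() + "@" + Integer.toHexString(a.hashCode());
    }

    // True when the rebuilt form matches what toString() actually returns.
    static boolean matchesDefaultToString(byte[] a) {
        return defaultForm(a).equals(a.toString());
    }

    public static void main(String[] args) {
        byte[] input = {1, 2, 3};
        byte[] output = input.clone();                     // same contents, different object
        System.out.println(input.toString());              // "[B@" followed by a hash code
        System.out.println(matchesDefaultToString(input)); // true
        System.out.println(input.equals(output));          // false: identity comparison
        System.out.println(Arrays.equals(input, output));  // true: content comparison
        System.out.println(Arrays.toString(output));       // [1, 2, 3]
    }
}
```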

MongoDB supports the org.bson.types.Binary type.
You can use:
new BasicDBObject("_id", new Binary(session.getIp().getAddress()))
and binary comparisons will work.

You can also Base64-encode the byte array, store the resulting string in the document, and decode it after retrieval to recover the original bytes. The Apache Commons Codec Base64 class provides:
static String encodeBase64String(byte[] binaryData)
Encodes binary data using the base64 algorithm but does not chunk the output.
static byte[] decodeBase64(String base64String)
Decodes a Base64 String into octets.
See https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html

Transform String value to bytearray java

I have a string which contains a byte array's String value, for example [B@42031498.
I want to retrieve the String content as a byte[]. How can I do that?
PS: converting the string with String.getBytes() method doesn't work. It converts the string but doesn't give me the value as byte array. It works like this.
If it is not possible, is there a way to get a byte[] from an Object in Java (always without converting)?

converting the string with String.getBytes() method doesn't work . It converts the string but doesn't give me the value as byte array.
Yes it does.
You have two problems:
you try and print the array directly; you should use Arrays.toString(), otherwise the .toString() method of the array itself is called;
you don't specify the encoding; you should really use .getBytes(StandardCharsets.UTF_8) to have the same result on all environments.
In the same manner, building a string from a byte array should be done using the correct encoding: new String(array, StandardCharsets.UTF_8).
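A sketch of both points, printing with Arrays.toString() and naming the charset explicitly. CharsetRoundTrip and its encode/decode helpers are illustrative names only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsetRoundTrip {
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "héllo": the é needs two bytes in UTF-8
        byte[] utf8 = encode(text);
        System.out.println(Arrays.toString(utf8));      // [104, -61, -87, 108, 108, 111]
        System.out.println(decode(utf8).equals(text));  // true
        // A different charset yields different bytes for the same text.
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 5
    }
}
```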

If [B@42031498 has already been saved into a String, there is no way to get back to the originating byte array. Look at the following example:
String str = "[B@42031498";
byte[] ba = str.getBytes();
String s = new String(ba);
System.out.println(s);
This will output [B@42031498

What you have done:
byte[] array = ....
String result = array.toString();
What you (probably) want:
String result = new String(array, "UTF-8");

Iterate over the byte array as below and you will get each byte value:
for (byte b : bytes) {
    System.out.println(b);
}
You get the output [B@42031498 because of the toString() method of the Object class:
public String toString() {
    return getClass().getName() + "@" + Integer.toHexString(hashCode());
}

Converting a string to byte[] such that the contents remain same

I have a String, say String a = "abc";. Now I want to convert it into a byte array, say byte b[];, so that when I print b it shows "abc".
How can I do that?
The getBytes() method gives a different result.
My program looks like this so far:
String a = "abc";
byte b[] = a.getBytes();
What I actually have is a class with two methods. One is
public byte[] encrypt(String a)
and the other is
public String decrypt(byte[] b)
After encrypting, I saved the data into a database. When I read it back, the byte methods do not give the correct output, but I do get the same data using the String method, and now I have to pass it into decrypt(byte[] b).
How do I do that? This is the real scenario.

Well, your first problem is that a String in Java is not an array of bytes but of chars, each of which takes 16 bits. This is to cover Unicode characters rather than only the ASCII you would get with single bytes. The getBytes method encodes those chars using a charset, so depending on the charset a single character can take more than one array position, and in general you cannot print the string one array position at a time.
What you could do is use getChars and then cast each char to a byte, with the corresponding loss of precision. This is not a good idea since it won't work outside of normal English characters! You asked, though, so here you go ;)
EDIT: as @PeterLawerey mentions, Unicode characters make it even harder, with some Unicode characters needing more than one char. There's a good discussion on Stack Overflow, and it links to a detailed article from Oracle.

byte b[]=a.getBytes();
System.out.println(new String(b));

You could use this constructor to build your string back again:
String a="abc";
byte b[]=a.getBytes("UTF-8");
System.out.println(new String(b, "UTF-8"));
Other than that, you can't do System.out.println(b) and expect to see abc.

A byte is a value between -128 and 127. When you print it, it is shown as a number by default.
If you want to print it as an ASCII char, you can cast it to char:
byte[] bytes = "abc".getBytes();
for (byte b : bytes)
    System.out.println((char) b);
prints
a
b
c

It seems like you are implementing encryption and decryption code.
String constructors are for text data. You should not use them to convert a byte array that contains encrypted data to a string value.
You should use Base64 instead, which encodes any binary data into ASCII.
This one is a good public domain implementation:
http://iharder.sourceforge.net/current/java/base64/
Apache Commons Codec also provides Base64:
http://commons.apache.org/codec/download_codec.cgi
String msg = "abc";
String encoded = Base64.encodeBytes(msg.getBytes()); // bytes -> Base64 text
byte[] data = Base64.decode(encoded); // Base64 text -> original bytes

This converts "abc" to bytes and prints each byte as its ASCII code (i.e. 97 98 99).
String s = "abc";
byte[] a = s.getBytes();
for (int i = 0; i < a.length; i++) {
    System.out.print(a[i] + " ");
}
If you add these lines, the bytes are converted back to a String (i.e. abc):
String s1 = new String(a);
System.out.print("\n" + s1);
Hope it helps.
Modified code, to send the byte array as an argument:
public static void another_method_name(byte[] b1) {
    String s1 = new String(b1);
    System.out.print("\n" + s1);
}
public static void main(String[] args) {
    String s = "abc";
    byte[] a = s.getBytes();
    for (int i = 0; i < a.length; i++) {
        System.out.print(a[i] + " ");
    }
    another_method_name(a);
}
Hope it helps again.

Java Encode SHA-1 Byte Array

I have a SHA-1 byte array that I would like to use in a GET request, so I need to encode it. URLEncoder expects a string, and if I create a string from the bytes and then encode that, it gets corrupted.
To clarify, this is kind of a follow-up to another question of mine.
(BitTorrent tracker request) I can get the value as a hex string, but that is not recognized by the tracker. On the other hand, the encoded form provided in an answer there returns 200 OK.
So I need to convert the hex representation that I got:
9a81333c1b16e4a83c10f3052c1590aadf5e2e20
into encoded form
%9A%813%3C%1B%16%E4%A8%3C%10%F3%05%2C%15%90%AA%DF%5E.%20

The question was edited while I was responding; the following is ADDITIONAL code that should work (with my hex conversion code):
//Inefficient, but functional, does not test if input is in hex charset, so somewhat unsafe
//NOT tested, but should be functional
public static String encodeURL(String hexString) throws Exception {
    if (hexString == null || hexString.isEmpty()) {
        return "";
    }
    if (hexString.length() % 2 != 0) {
        throw new Exception("String is not hex, length NOT divisible by 2: " + hexString);
    }
    int len = hexString.length();
    char[] output = new char[len + len / 2];
    int i = 0;
    int j = 0;
    while (i < len) {
        output[j++] = '%';
        output[j++] = hexString.charAt(i++);
        output[j++] = hexString.charAt(i++);
    }
    return new String(output);
}
You'll need to convert the raw bytes to hexadecimal characters or whatever URL-friendly encoding they are using. Base32 or Base64 encodings are possible, but straight hexadecimal characters is the most common. URLEncoder is not needed for this string, because it shouldn't contain any characters that would require URL Encoding to %NN format.
The below will convert bytes for a hash (SHA-1, MD5SUM, etc) to a hexadecimal string:
/** Lookup table: character for a half-byte */
static final char[] CHAR_FOR_BYTE = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
/** Encode byte data as a hex string... hex chars are UPPERCASE */
public static String encode(byte[] data) {
    if (data == null || data.length == 0) {
        return "";
    }
    char[] store = new char[data.length * 2];
    for (int i = 0; i < data.length; i++) {
        final int val = (data[i] & 0xFF);
        final int charLoc = i << 1;
        store[charLoc] = CHAR_FOR_BYTE[val >>> 4];
        store[charLoc + 1] = CHAR_FOR_BYTE[val & 0x0F];
    }
    return new String(store);
}
This code is fairly optimized and fast, and I am using it for my own SHA-1 byte encoding. Note that you may need to convert uppercase to lowercase with the String.toLowerCase() method, depending on which form the server accepts.
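For the percent-encoded form shown in the question, a hedged sketch that works on the raw bytes directly. It assumes the tracker wants letters, digits, '-', '.', '_' and '~' left as-is and every other byte written as %XX in uppercase hex; that assumption matches the example string in the question. InfoHashEncoder and its methods are hypothetical names.

```java
final class InfoHashEncoder {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Percent-encodes raw bytes, leaving "unreserved" ASCII characters literal.
    static String percentEncode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 3);
        for (byte b : data) {
            int v = b & 0xFF;
            if (isUnreserved(v)) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >>> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }

    static boolean isUnreserved(int v) {
        return (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                || (v >= '0' && v <= '9')
                || v == '-' || v == '.' || v == '_' || v == '~';
    }

    // Helper for the demo: turns a hex string back into raw bytes.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] sha1 = fromHex("9a81333c1b16e4a83c10f3052c1590aadf5e2e20");
        // prints %9A%813%3C%1B%16%E4%A8%3C%10%F3%05%2C%15%90%AA%DF%5E.%20
        System.out.println(percentEncode(sha1));
    }
}
```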

This depends on what the recipient of your request expects.
I would imagine it could be a hexadecimal representation of the bytes in your hash. A string would probably not be the best idea, because the hash array will most likely contain non-printable character values.
I'd iterate over the array and use Integer.toHexString() to convert the bytes to hex, masking each byte with 0xFF and padding single-digit results to two characters.

SHA1 is in hex format [0-9a-f], there should be no need to URLEncode it.

Use Apache Commons-Codec for all your encoding/decoding needs (except ASN.1, which is a pain in the ass)
