Getting Invalid data on decode using appache common-codec library

Getting Invalid data on decode using appache common-codec library - java

I have scenario where I am getting my pkcs12 cert content as encoded string(appache common-codec library). Now I have to decode that string and have to store it file. But while decoding it as a string I am getting an invalid cert content.
When I am trying to write bytes in a file it works fine. Please find the snippets I have tried below.
For encode:
Base64.encodeBase64String(certcontentInBytes[])
For decode:
new String(Base64.decodeBase64(certstringContent));

new String(bytes) actually does new String(bytes, defaultCharset) to convert the bytes using the charset of the bytes to a Unicode string. Non-portable and probable wrong charset.
For bytes as binary data that will not work. String should not be used for binary data. I bet the bytes were written to the file.

Maybe you have to use new String(byte[], Charset charset) with the correct charset (probably UTF-8), because otherwise it will use the platform charset, which differs between Windows and Linux/Unix.
I wonder why you can't simply use the byte array as parameter?

Related

Encode to UTF-8. Encode character eg. ö to Ã¶

I want to encode a string in Android to UTF-8. For example this string:
Grüne Ähren beißen Flöhe
to
GrÃ¼ne Ãhren beiÃen FlÃ¶he
But no matter what I do I encode ü to ü or ü to %C3%BC (online often called 'raw URL encode').
Found solutions to convert to byte[] or URI.toASCIIString(). But non of them work for me.
UPDATE
I am participating in the eBay partner network and try to concat a searchword to my partner url.
The people of eBay must use a wrong character set, as UTF-8 URL encoded string don't work.
A searchword with UTF-8 URL encoding
(Grüne Ähren beißen Flöhe
to
Gr%C3%BCne%20%C3%84hren%20bei%C3%9Fen%20Fl%C3%B6he)
comes out to this result in the eBay searchbox:
If I encode my searchword with ISO_8859_1 it works (GrÃ¼ne Ãhren beiÃen FlÃ¶he):
Thank you very much community

What you essentially want is to convert a String to it's byte representation according to UTF-8 and interpret these bytes using a different Charset, such as ISO-8859-1.
This is usually the cause of many problems. You want to intentionally do what most developers do incorrectly (or they simply ignore the problems of charsets).
Since you just need this to work, use this piece of code:
byte[] bytes = "Grüne Ähren beißen Flöhe".getBytes("UTF-8");
String result = new String(bytes, "ISO-8859-1");
see it at work here.

Unzip files that contain chinese characters

I have a zip file.It contains some files.Files contain chinese characters so I used
ZipInputStream zipStream = new ZipInputStream(
new BufferedInputStream(new FileInputStream(zipFilePath), BUFFER_SIZE),
Charset.forName("ISO-8859-1")
);
......
FileOutputStream fileOutput = new FileOutputStream(uncompressedFileName);
while (zipStream.available() > 0) {
fileOutput.write(zipStream.read());
}
Extraction runs succesfully.After that I want to use encodingDetect method to find encoding but now service is not running.It returns nomatch. If I send files directly to service,The service is running.It find charset properly like UTF-8.
I guess that Charset.forName("ISO-8859-1")extract files but format is corrupted.Do you have any idea?

The problem is the Charset of the file names in the zip. UTF-8 raises an error (the file names are evidently not in UTF-8), as UTF-8 requires as special format for the multi-byte sequences, and evidently there are wrong "multibyte" sequences.
ISO-8859-1 is a single byte enconding, accepting garbage.
What you should do is to try the small number of Chinese Charsets, so the file name strings are filled correctly. Java String contains Unicode, so can hold any Charset. The help from someone talking Chinese probably would make sense.
And then try writing files with those names. If not successful on your PC, you must use artificial file names, maybe transliteration from Chinese.
A translation table from original Chinese file name to actual file name may be created
as UTF-8 text file, maybe with a BOM, '\uFEFF` at the begin-of-file.

ISO-8859-1 charset most definitely does not support Chinese language. Use UTF-8 instead of ISO-8859-1

Java Encoding for "GB2312" CHARACTER ® replacing with question mark(?)

I'm trying to get encoded value using GB2312 characterset but I'm getting '? 'instead of '®'
Below is my sample code:
new String("Test ®".getBytes("GB2312"));
but I'm getting Test ? instead of Test ®.
Does any one faced this issue?
Java version- JDK6
Platform: Window 7
I'm not aware of Chinese character encoding so need suggestion.

For better understanding, the statement divided in two parts:
byte[] bytes = "Test ®".getBytes("GB2312"); // bytes, encoding the string to GB2312
new String(bytes); // back to string, using default encoding
Probably ® is not a valid GB2312 character, so it is converted to ?. See the result of
Charset.forName("GB2312").newEncoder().canEncode("®")
Based on documentation of getBytes:
The behavior of this method when this string cannot be encoded in the given charset is unspecified. The CharsetEncoder class should be used when more control over the encoding process is required.
which also suggest using CharsetEncoder.

Charset bug in AES Decryption on Android system

Good evening!
In my android app the smartphones load a AES encrypted String from my server and store it in a variable. After that process the variable and a key are pass to a method which decrypt the string. My mistake is that german umlauts (ä, ü, ö) aren't correct decoded. All umlauts displayed as question marks with black background...
My Code:
public static String decrypt(String input, String key) {
byte[] output = null;
String newString = "";
try {
SecretKeySpec skey = new SecretKeySpec(key.getBytes(), "AES");
Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
cipher.init(Cipher.DECRYPT_MODE, skey);
output = cipher.doFinal(Base64.decode(input, Base64.DEFAULT));
newString = new String(output);
} catch(Exception e) {}
return newString;
}
The code works perfectly - only umlauts displayed not correctly, an example is that (should be "ö-ä-ü"):
How can I set the encoding of the decrypted String? In my iOS app I use ASCII to encoding the decoded downloaded String. That works perfectly! Android and iOS get the String from the same Server on the same way - so I think the problem is the local Code above.
I hope you can help me with my problem... Thanks!

There is no text but encoded text.
It seems like you are guessing at the character set and encoding—That's no way to communicate.
To recover the text, you need to reverse the original process applied to it with the parameters associated with each step.
For explanation, assume that the server is taking text from a Java String and sending it to you securely.
String uses the Unicode character set (specifically, Unicode's UTF-16 encoding).
Get the bytes for the String, using some specific encoding, say ISO8859-1. (UTF-8 could be better because it is also an encoding for the Unicode character set, whereas ISO8859-1 has a lot fewer characters.) As #Andy points out, exceptions are your friends here.
Encrypt the bytes with a specific key. The key is a sequence of bytes, so, if you are generating this from a string, you have to use a specific encoding.
Encode the encrypted bytes with Base64, producing a Java String (again, UTF-16) with a subset of characters so reduced that it can be re-encoded in just about any character encoding and placed in just about any context such as SMTP, XML, or HTML without being misinterpreted or making it invalid.
Transmit the string using a specific encoding. An HTTP header and/or HTML charset value is usually used to communicate which encoding.
To receive the text, you have to get:
the bytes,
the encoding from step 5,
the key from step 3,
the encoding from step 3 and
the encoding from step 2.
Then you can reverse all of the steps. Per your comments, you discovered you weren't using the encoding from step 2. You also need to use the encoding from step 3.

Uploading image to server corrupts the image

I have in my application a image upload method that need to send a image and a string to my server.
The problem is that the server receives the content (image and string) but when it saves the image on the disk it is corrupted and can't be opened.
This is the relevant part of the script.
HttpPost httpPost = new HttpPost(url);
Bitmap bmp = ((BitmapDrawable) imageView.getDrawable()).getBitmap();
ByteArrayOutputStream stream = new ByteArrayOutputStream();
bmp.compress(Bitmap.CompressFormat.PNG, 100, stream);
byte[] byteArray = stream.toByteArray();
String byteStr = new String(byteArray);
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append("--"+boundary+"\r\n");
stringBuilder.append("Content-Disposition: form-data; name=\"content\"\r\n\r\n");
stringBuilder.append(message+"\r\n");
stringBuilder.append("--"+boundary+"\r\n");
stringBuilder.append("Content-Disposition: form-data; name=\"image\"; filename=\"image.jpg\"\r\n");
stringBuilder.append("Content-Type: image/jpeg\r\n\r\n");
stringBuilder.append(byteStr);
stringBuilder.append("\r\n");
stringBuilder.append("--"+boundary+"--\r\n");
StringEntity entity = new StringEntity(stringBuilder.toString());
httpPost.setEntity(entity);
I can't change the server because other clients use it and it works for them. I just need to understand why the image is being corrupted.

When you do new String(byteArray), it's converting binary into the default character set (which is typically UTF-8). Most character sets aren't a suitable encoding for binary data. In other words if you were to encode certain binary strings to UTF-8 and then decode back to binary, you would not get the same binary string.
Since you're using multipart encoding, you need to write directly to the stream of the entity. Apache HTTP Client has helpers for doing this. See this guide, or this Android guide to uploading with multipart.
If you NEED to using strings only, you can safely convert your byte array to a string with
String byteStr = android.util.Base64.encode(byteArray, android.util.Base64.DEFAULT);
But it's important to note that your server will need to Base64 decode the string back to a byte array and save it to an image. Further, the transfer size will be greater because Base64 encoding isn't as space efficient as raw binary.

Your solutions above is not working because you are using new String(byteArray). The constructor encodes the byte array using the default encoding - see What is the default encoding - and it is very likely, that you have byte sequences in your data that cannot be encoded into a character.
To be more precise, a charset defines how characters are represented as bytes.
Most charsets have more than 256 characters. That is why you need more than one byte to represent a character. UTF-8 and UTF-16 uses up to four bytes.
So you have a mapping between the number space and the character space and this mapping is not bejectiv a priori. So it is very likely that there exist a number in the number space that have no character mapped to it.
The solution #Samuel suggested is foolproof because Base64 uses A–Z, a–z, 0–9, + , / and terminates with = to represent a byte. I would prefer this solution!
If you don't want or cannot use Base64, than you can try just to throw in every byte as it is into the StringBuilder hoping that the server does not do any encoding before you get it.
for (byte b : byteArray) {
stringBuilder.append((char)b);
}
I do not recommand that solution in general, but it may help you to get your stuff done.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.