Issue with Encoding base64 in PHP and Decoding base64 in Java - java

The string "gACA" was encoded in PHP using base64. Now I'm trying to decode it in Java using base64, but I get a garbage value after decoding. I have tried this:
public class DecodeString {
    public static void main(String args[]) throws Exception {
        String strEncode = "gACA"; // gACA is the string encoded in PHP
        byte byteEncode[] = com.sun.org.apache.xerces.internal.impl.dv.util.Base64.decode(strEncode);
        System.out.println("Decoded String" + new String(byteEncode, "UTF-8"));
    }
}
Output:
??
Please help me out

Java has a built-in Base64 decoder, so no extra libraries are needed to decode it:
byte[] data = javax.xml.bind.DatatypeConverter.parseBase64Binary("gACA");
for (byte b : data)
System.out.printf("%02x ", b);
Output:
80 00 80
It's 3 bytes with hexadecimal codes: 80 00 80
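As a hedged alternative: javax.xml.bind was removed from the JDK in Java 11, while java.util.Base64 has been available since Java 8. The sketch below (the class and method names are made up for illustration) decodes the same string and also shows why the text output looks broken: the bytes 80 00 80 are not valid UTF-8, so decoding them as text yields replacement characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, not part of the original answer.
class DecodeGaca {
    // Decode standard-alphabet Base64 and render the bytes as space-separated hex.
    static String decodeToHex(String base64) {
        byte[] data = Base64.getDecoder().decode(base64);
        StringBuilder hex = new StringBuilder();
        for (byte b : data) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeToHex("gACA")); // 80 00 80

        // Interpreting the same bytes as UTF-8 text: 0x80 cannot stand alone
        // in UTF-8, so it becomes U+FFFD (65533); the middle byte is U+0000.
        String text = new String(Base64.getDecoder().decode("gACA"), StandardCharsets.UTF_8);
        for (char c : text.toCharArray()) {
            System.out.println((int) c); // 65533, 0, 65533
        }
    }
}
```

So the decoder is doing its job; the payload is simply binary data rather than text.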

public static void main(String args[]) {
String strEncode = "gACA"; //gACA is encoded string in PHP
byte byteEncode[] = Base64.decode(strEncode);
String result = new String(byteEncode, "UTF-8");
char[] resultChar = result.toCharArray();
for(int i =0; i < resultChar.length; i++)
{
System.out.println((int)resultChar[i]);
}
System.out.println("Decoded String: " + result);
}
I suspect it's an encoding problem. The post "Issue about 65533 � in C# text file reading" suggests that the first and last characters are \“. In the middle there is a char 0. Your result is probably "" or "0", but with the wrong encoding.

Try this, it worked fine for me (However I was decoding files):
Base64.decodeBase64(IOUtils.toByteArray(strEncode));
So it would look like this:
public class DecodeString {
    public static void main(String args[]) throws Exception {
        String strEncode = "gACA"; // gACA is the string encoded in PHP
        byte[] byteEncode = Base64.decodeBase64(IOUtils.toByteArray(strEncode));
        System.out.println("Decoded String" + new String(byteEncode, "UTF-8"));
    }
}
Note that you will need extra libraries:
Commons Codec
Commons FileUpload
Commons IO

First things first, the code you use should not compile, it's missing a closing quote after "UTF-8.
And yeah, "gACA" is a valid base64 string as the format goes, but it doesn't decode to any meaningful UTF-8 text. I suppose you're using the wrong encoding, or messed up the string somehow...

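RFC 4648 defines two alphabets.
PHP's base64_encode uses "Base 64 Encoding".
Java code may use "Base 64 Encoding with URL and Filename Safe Alphabet" (for example via java.util.Base64.getUrlEncoder(); java.util.Base64.getEncoder() uses the standard alphabet).
They are very close but not exactly the same. To convert between them in PHP: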
const REPLACE_PAIRS = [
'-' => '+',
'_' => '/'
];
public static function base64FromUrlSafeToPHP($base64_url_encoded) {
return strtr($base64_url_encoded, self::REPLACE_PAIRS);
}
public static function base64FromPHPToUrlSafe($base64_encoded) {
return strtr($base64_encoded, array_flip(self::REPLACE_PAIRS));
}
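The same translation can be sketched on the Java side. This is only an illustration under the assumption that java.util.Base64 (Java 8+) is available; the class and method names are invented here. It swaps the two characters that differ between the alphabets and shows that the JDK's two encoders produce exactly those variants.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class mirroring the PHP strtr() functions above.
class Base64Alphabets {
    // URL-safe alphabet ('-', '_') to standard alphabet ('+', '/').
    static String urlSafeToStandard(String s) {
        return s.replace('-', '+').replace('_', '/');
    }

    // Standard alphabet ('+', '/') to URL-safe alphabet ('-', '_').
    static String standardToUrlSafe(String s) {
        return s.replace('+', '-').replace('/', '_');
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xfb, (byte) 0xff};

        String standard = Base64.getEncoder().encodeToString(raw);   // +/8=
        String urlSafe = Base64.getUrlEncoder().encodeToString(raw); // -_8=
        System.out.println(standard + " " + urlSafe);

        // After translation, the standard decoder accepts the URL-safe text.
        byte[] decoded = Base64.getDecoder().decode(urlSafeToStandard(urlSafe));
        System.out.println(Arrays.equals(raw, decoded)); // true
    }
}
```

Alternatively, Base64.getUrlDecoder() reads the URL-safe form directly, so the character swap is only needed when the decoder cannot be chosen.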

Related

Convert a byte array from one encoding to another java

Hi, I need to convert this C# code to Java. Could you give me a hand?
private static String ConvertStringToHexStringByteArray(String input) {
Encoding ebcdic = Encoding.GetEncoding("IBM037");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(input);
byte[] isoBytes = Encoding.Convert(utf8, ebcdic, utfBytes);
StringBuilder hex = new StringBuilder(isoBytes.Length * 2);
foreach( byte b in isoBytes)
hex.AppendFormat("{0:x2}", b);
return hex.ToString();
}
I tried to convert it to java like this. But the result is different:
private static String ConvertStringToHexStringByteArray(String input) throws UnsupportedEncodingException {
byte[] isoBytes = input.getBytes("IBM037");
StringBuilder hex = new StringBuilder(isoBytes.length * 2);
for (byte b : isoBytes) {
hex.append(String.format("%02x", b));
}
return hex.toString();
}
input = "X1GRUPPO 00000000726272772"
expected = "e7f1c7d9e4d7d7d64040404040f0f0f0f0f0f0f0f0f1f6f7f3f5f3f5f5f2"
result = "e7f1c7d9e4d7d7d640f0f0f0f0f0f0f0f0f7f2f6f2f7f2f7f7f2"
What am I doing wrong?
Your code works, but you are comparing the output for two different input strings.
When you write expected and result one above the other:
e7f1c7d9e4d7d7d64040404040f0f0f0f0f0f0f0f0f1f6f7f3f5f3f5f5f2
e7f1c7d9e4d7d7d640f0f0f0f0f0f0f0f0f7f2f6f2f7f2f7f7f2
you will notice that both start with the same sequence (e7f1c7d9e4d7d7d6), which comes from the common prefix X1GRUPPO.
But then the two outputs differ:
4040404040f0f0f0f0f0f0f0f0f1f6f7f3f5f3f5f5f2
40f0f0f0f0f0f0f0f0f7f2f6f2f7f2f7f7f2
Reasoning from the expected output, the remainder of the first input string is 5 spaces (40 five times) followed by "00000000167353552".
This means the complete input for the C# code was "X1GRUPPO     00000000167353552", which is not the input that you gave to the Java code, so the outputs cannot match.
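To check this, the Java conversion can be run on both inputs. The following is a small sketch; the class name is invented, and it assumes the JVM ships the IBM037 charset (it is part of the extended charsets in standard JDK distributions, but a trimmed runtime may lack it).

```java
import java.nio.charset.Charset;

// Hypothetical helper class wrapping the conversion from the question.
class EbcdicHex {
    // Encode the text as IBM037 (EBCDIC) and return lowercase hex.
    static String toEbcdicHex(String input) {
        byte[] bytes = input.getBytes(Charset.forName("IBM037"));
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Input reconstructed from the expected output (5 spaces, different digits).
        System.out.println(toEbcdicHex("X1GRUPPO     00000000167353552"));
        // Input actually given to the Java code in the question.
        System.out.println(toEbcdicHex("X1GRUPPO 00000000726272772"));
    }
}
```

The first line should reproduce the "expected" value and the second the "result" value, which confirms that the two programs were fed different strings.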

base64 encoding issue, java

I'm using the Apache library for Base64 encoding, but this time I've hit an odd problem. I have a Base64-encoded string:
"MIIHSjCCBjKgAwIBAgIQQuw1emUfNRlPD/euDuzBjDANBgkqhkiG9w0BAQUFADCB" +
"5TELMAkGA1UEBhMCRVMxIDAeBgkqhkiG9w0BCQEWEWFjQGFjYWJvZ2FjaWEub3Jn"
It's part of a certificate (.CER) file. I am just decoding it and encoding it again, but the result is slightly different. The resulting string is:
"MIIHSjCCBjKgAwIBAgIQQuw1emUfNRlPD/euDuzBjDANBgkqhkiG9w0BAQUFADA" +
"/5TELMAkGA1UEBhMCRVMxIDAeBgkqhkiG9w0BCQEWEWFjQGFjYWJvZ2FjaWEub3Jn"
The difference is at the end of the first line and the start of the second line: CB is replaced by A/.
This change invalidates my certificate. Where could the problem be?
The problem is in your intermediate string conversion. If you work only with byte arrays, everything is fine.
public static void main(String args[]) {
String partOfCer = "MIIHSjCCBjKgAwIBAgIQQuw1emUfNRlPD/euDuzBjDANBgkqhkiG9w0BAQUFADCB" + "5TELMAkGA1UEBhMCRVMxIDAeBgkqhkiG9w0BCQEWEWFjQGFjYWJvZ2FjaWEub3Jn";
byte[] dec1_byte = Base64.decodeBase64(partOfCer.getBytes());
// String dec1 = new String(dec1_byte);
byte[] newBytes = Base64.encodeBase64(dec1_byte);
String newStr = new String(newBytes);
System.out.println(partOfCer);
System.out.println(newStr);
System.out.println(partOfCer.equals(newStr));
}
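To make the difference visible, here is a hedged sketch that re-encodes the first line of the certificate fragment twice: once staying in bytes and once through an intermediate String. It uses java.util.Base64 and an explicit UTF-8 charset instead of Commons Codec and the platform default, so it is an assumption-laden illustration of the effect rather than a reproduction of the asker's exact output.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; names are illustrative only.
class Base64RoundTrip {
    // Decode and re-encode without ever leaving byte[].
    static String viaBytes(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode, squeeze the binary data through a String, then re-encode.
    // Bytes that are not valid UTF-8 are replaced during new String(...),
    // so the original data cannot be recovered.
    static String viaString(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        String text = new String(raw, StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String partOfCer = "MIIHSjCCBjKgAwIBAgIQQuw1emUfNRlPD/euDuzBjDANBgkqhkiG9w0BAQUFADCB";
        System.out.println(partOfCer.equals(viaBytes(partOfCer)));  // true
        System.out.println(partOfCer.equals(viaString(partOfCer))); // false
    }
}
```

Certificate data is binary (DER), so it should never be turned into a String between decoding and encoding.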

Check if a String contains encoded characters

Hello, I am looking for a way to detect whether a string has been encoded.
For example:
String name = "Hellä world";
String encoded = new String(name.getBytes("utf-8"), "iso8859-1");
The output of this encoded variable is:
HellÃ¤ world
As you can see, there is an Ã (A with tilde) and another symbol. Is there a way to check whether the output contains encoded characters?
Sounds like you want to check if a string that was decoded from bytes in latin1 could have been decoded in UTF-8, too. That's easy because illegal byte sequences are replaced by the character \ufffd:
String recoded = new String(encoded.getBytes("iso-8859-1"), "UTF-8");
return recoded.indexOf('\uFFFD') == -1; // No replacement character found
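Wrapped into a complete method, the check might look like the sketch below. The class and method names are invented for illustration. Note two limitations: plain ASCII text also passes the check, and characters outside ISO-8859-1 are turned into '?' by getBytes, so this is a heuristic and not proof that a string was mis-decoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class built around the two lines above.
class MojibakeCheck {
    // True if the string's Latin-1 bytes also form valid UTF-8, i.e. the
    // string could be UTF-8 data that was wrongly decoded as ISO-8859-1.
    static boolean couldBeUtf8ReadAsLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        String recoded = new String(bytes, StandardCharsets.UTF_8);
        return recoded.indexOf('\uFFFD') == -1; // no replacement character found
    }

    public static void main(String[] args) {
        // "HellÃ¤ world": the UTF-8 bytes of 'ä' (C3 A4) read as Latin-1.
        System.out.println(couldBeUtf8ReadAsLatin1("Hell\u00c3\u00a4 world")); // true
        // "Hellä world": a lone E4 byte is not valid UTF-8.
        System.out.println(couldBeUtf8ReadAsLatin1("Hell\u00e4 world"));       // false
    }
}
```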
Your question doesn't make sense. A java String is a list of characters. They don't have an encoding until you convert them into bytes, at which point you need to specify one (although you will see a lot of code that uses the platform default, which is what e.g. String.getBytes() with no argument does).
I suggest you read this http://kunststube.net/encoding/.
String name = "Hellä world";
String encoded = new String(name.getBytes("utf-8"), "iso8859-1");
This code is just a character corruption bug. You take a UTF-16 string, transcode it to UTF-8, pretend it is ISO-8859-1 and transcode it back to UTF-16, resulting in incorrectly encoded characters.
If I understood your question correctly, this code may help you. The function isEncoded checks whether its parameter can be encoded as ASCII or contains non-ASCII characters.
public boolean isEncoded(String text){
Charset charset = Charset.forName("US-ASCII");
String checked=new String(text.getBytes(charset),charset);
return !checked.equals(text);
}
@Test
public void testAscii() throws Exception{
Assert.assertFalse(isEncoded("Hello world"));
}
@Test
public void testNonAscii() throws Exception{
Assert.assertTrue(isEncoded("Hellä world"));
}
You can also check against other charsets by changing the charset variable or turning it into a parameter.
I'm not really sure what you are trying to do or what your problem is.
This line doesn't make any sense:
String encoded = new String(name.getBytes("utf-8"), "iso8859-1");
You are encoding your name into "UTF-8" and then trying to decode as "iso8859-1".
If you want to encode your name as "iso8859-1", just do name.getBytes("iso8859-1").
Please tell us what problem you encountered so that we can help more.
You can check whether your string is encoded with this code:
public boolean isEncoded(String input) {
char[] charArray = input.toCharArray();
for (int i = 0, charArrayLength = charArray.length; i < charArrayLength; i++) {
Character c = charArray[i];
if (Character.getType(c) == Character.OTHER_LETTER) {
return true;
}
}
return false;
}

Extract hexadecimal values from a percent encoded URL

Let's say, for example, I have a URL containing the following percent-encoded character: %80
It is obviously not an ASCII character.
How can I convert this value to the corresponding hex string in Java?
I tried the following with no luck. The result should be 80.
public static void main(String[] args) throws Exception {
System.out.print(byteArrayToHexString(URLDecoder.decode("%80","UTF-8").getBytes()));
}
public static String byteArrayToHexString(byte[] bytes)
{
StringBuffer buffer = new StringBuffer();
for(int i=0; i<bytes.length; i++)
{
if(((int)bytes[i] & 0xff) < 0x10)
buffer.append("0");
buffer.append(Long.toString((int) bytes[i] & 0xff, 16));
}
return buffer.toString();
}
The best way to deal with this is to parse the url using either java.net.URL or java.net.URI, and then use the relevant getters to extract the components that you require. These will take care of decoding any %-encoded portions in the appropriate fashion.
The problem with your current idea is that %80 does not represent "80", or 80. Rather, it represents a byte that needs to be interpreted in the context of the character encoding of the URL. If the encoding is UTF-8, then %80 is a continuation byte: it must be preceded by a %-encoded lead byte (and possibly other continuation bytes); otherwise this is a malformed UTF-8 character representation.
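If the goal really is the raw byte values behind the percent escapes, one possible sketch is to decode with ISO-8859-1, which maps every byte 0x00 to 0xFF to exactly one character, and encode back with the same charset. The class and method names below are made up for illustration. Keep in mind that URLDecoder also turns '+' into a space, so this only suits form-style encoded input.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from the answers.
class PercentToHex {
    // Percent-decode via ISO-8859-1 so no byte is lost, then print hex.
    static String percentDecodedHex(String encoded) {
        try {
            String decoded = URLDecoder.decode(encoded, "ISO-8859-1");
            byte[] bytes = decoded.getBytes(StandardCharsets.ISO_8859_1);
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(percentDecodedHex("%80"));    // 80
        System.out.println(percentDecodedHex("%C3%A4")); // c3a4
    }
}
```

With "UTF-8" as in the question, the lone %80 is malformed and gets replaced during decoding, which is why the original attempt does not print 80.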
I don't really see what you are trying to do. However, I'll give it a try.
If you have the String "%80" and you want to get the string "80", you can use this:
String str = "%80";
String hex = str.substring(1); // Cut off the '%'
If you are trying to extract the value 0x80 (which is 128 in decimal) out of it:
String str = "%80";
String hex = str.substring(1); // Cut off the '%'
int value = Integer.parseInt(hex, 16);
If you are trying to convert an int to its hexadecimal representation use this:
String hexRepresentation = Integer.toString(value, 16);

Convert byte array to understandable String

I have a program that handles byte arrays in Java, and now I would like to write one into an XML file. However, I am unsure how to convert the following byte array into a sensible String to write to a file. Assuming that it held Unicode characters, I attempted the following code:
String temp = new String(encodedBytes, "UTF-8");
Only to have the debugger show that temp contains "\ufffd\ufffd ^\ufffd\ufffd-m\ufffd\ufffd\ufffd \ufffd\ufffdIA\ufffd\ufffd". The String should contain a hash in alphanumeric format.
How would I turn the above into a sensible String for output?
The byte array doesn't look like UTF-8. Note that \ufffd (named REPLACEMENT CHARACTER) is "used to replace an incoming character whose value is unknown or unrepresentable in Unicode."
Addendum: Here's a simple example of how this can happen. When cast to a byte, the code point for ñ is not valid UTF-8 or US-ASCII, but it is valid ISO-8859-1. In effect, you have to know what the bytes represent before you can decode them into a String.
public class Hello {
public static void main(String[] args)
throws java.io.UnsupportedEncodingException {
String s = "Hola, señor!";
System.out.println(s);
byte[] b = new byte[s.length()];
for (int i = 0; i < b.length; i++) {
int cp = s.codePointAt(i);
b[i] = (byte) cp;
System.out.print((byte) cp + " ");
}
System.out.println();
System.out.println(new String(b, "UTF-8"));
System.out.println(new String(b, "US-ASCII"));
System.out.println(new String(b, "ISO-8859-1"));
}
}
Output:
Hola, señor!
72 111 108 97 44 32 115 101 -15 111 114 33
Hola, se�or!
Hola, se�or!
Hola, señor!
If your string is the output of a password hashing scheme (which it looks like it might be) then I think you will need to Base64 encode in order to put it into plain text.
Standard procedure, if you have raw bytes you want to output to a text file, is to use Base 64 encoding. The Commons Codec library provides a Base64 encoder / decoder for you to use.
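As a rough sketch of that approach, the snippet below uses java.util.Base64 from the JDK (Java 8+) in place of Commons Codec, and adds a hex variant in case an alphanumeric form is preferred. The class and method names and the sample bytes are invented for illustration.

```java
import java.util.Base64;

// Hypothetical helper class for turning raw bytes into XML-safe text.
class BytesToText {
    // Base64 text: compact, uses A-Z, a-z, 0-9, '+', '/' and '=' padding.
    static String toBase64(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Hex text: twice as long as the input, but strictly alphanumeric.
    static String toHex(byte[] raw) {
        StringBuilder hex = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] hash = {(byte) 0xff, (byte) 0xfe, 0x5e}; // made-up sample bytes
        System.out.println(toBase64(hash)); // //5e
        System.out.println(toHex(hash));    // fffe5e
    }
}
```

Either form can be written into the XML file and later converted back to the original bytes without loss.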
Hope this helps.
