I am trying to generate some random passwords on Linux CentOS and store them in a database as Base64. The password is 'KQ3h3dEN', and when I convert it with 'echo KQ3h3dEN | base64' the result I get is 'S1EzaDNkRU4K'.
I have a function in Java:
public static String encode64Base(String stringToEncode)
{
    byte[] encodedBytes = Base64.getEncoder().encode(stringToEncode.getBytes(StandardCharsets.UTF_8));
    String encodedString = new String(encodedBytes, StandardCharsets.UTF_8);
    return encodedString;
}
And the result of encode64Base("KQ3h3dEN") is 'S1EzaDNkRU4='.
So the Linux output ends with "K" where the Java output ends with "=". How can I ensure that I always get the same result from base64 on Linux and from Base64 encoding in Java?
UPDATE: I have updated the question, as I hadn't noticed the "K" at the end of the Linux-encoded string. Also, here are a few more examples:
'echo KQ3h3dENa | base64' => result='S1EzaDNkRU5hCg==', but it should be 'S1EzaDNkRU5h'
'echo KQ3h3dENaa | base64' => result='S1EzaDNkRU5hYQo=', but it should be 'S1EzaDNkRU5hYQ=='
I found the solution after a few hours of experimenting. It turns out a newline was being appended to the string I wanted to encode. The solution is:
echo -n KQ3h3dEN | base64
The result will then be the same as with the Java Base64 encoding.
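To double-check this on the Java side, here is a small sketch (the class name is just for illustration) that encodes the password with and without the trailing newline that a plain echo appends:

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class NewlineCheck {
    public static void main(String[] args) {
        // matches the Java result from the question
        System.out.println(Base64.getEncoder()
                .encodeToString("KQ3h3dEN".getBytes(StandardCharsets.UTF_8)));   // S1EzaDNkRU4=
        // matches the Linux result of 'echo KQ3h3dEN | base64' (newline included)
        System.out.println(Base64.getEncoder()
                .encodeToString("KQ3h3dEN\n".getBytes(StandardCharsets.UTF_8))); // S1EzaDNkRU4K
    }
}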
Padding
The '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes.
In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits. In some implementations, the padding character is mandatory, while for others it is not used.
So it depends on the tools and libraries you use. If Base64 with padding is treated the same as Base64 without padding by them, there is no problem. As insurance, you can use a Linux tool that generates Base64 with padding.
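As a quick illustration of that padding rule, encoding the strings from the question in Java gives (outputs as reported in the question):

Base64.getEncoder().encodeToString("KQ3h3dEN".getBytes());   // "S1EzaDNkRU4="     8 bytes: last group has 2 bytes -> one '='
Base64.getEncoder().encodeToString("KQ3h3dENa".getBytes());  // "S1EzaDNkRU5h"     9 bytes: a multiple of 3 -> no padding
Base64.getEncoder().encodeToString("KQ3h3dENaa".getBytes()); // "S1EzaDNkRU5hYQ==" 10 bytes: last group has 1 byte -> '=='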
Use withoutPadding() of the Base64.Encoder class to get a Base64.Encoder instance that encodes without adding any padding character at the end.
Check the link:
https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html#withoutPadding
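For example, a minimal sketch using one of the strings from the question:

String padded   = Base64.getEncoder().encodeToString("KQ3h3dENaa".getBytes());                  // "S1EzaDNkRU5hYQ=="
String unpadded = Base64.getEncoder().withoutPadding().encodeToString("KQ3h3dENaa".getBytes()); // "S1EzaDNkRU5hYQ"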
Related
I have a piece of Java code:
byte[] random1 = Base64.getDecoder().decode(arr.getString(2));
byte[] test1 = "/bCN99cbY13kwEf+wnRErg".getBytes(StandardCharsets.ISO_8859_1);
System.out.println(test1.length); // prints 22
System.out.println(Base64.getDecoder().decode(test1).length); // prints 16
I am trying to do the same in Python 3, and I get an error.
import base64
text = bytes("/bCN99cbY13kwEf+wnRErg", encoding='iso-8859-1')
print(len(base64.b64decode(text)))
# Traceback (most recent call last):
# File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\base64.py", line
# 546, in decodebytes
# return binascii.a2b_base64(s)
# binascii.Error: Incorrect padding
How can I use Python 3 to do what the Java code does and get a decoded length of 16?
The problem is that your Base64 encoded array is missing the padding, which is not always required.
It is no problem in Java as the Decoder does not require it:
The Base64 padding character '=' is accepted and interpreted as the end of the encoded byte data, but is not required.
In contrast, the Python b64decode(s) method requires the padding and throws an error if it is missing.
A binascii.Error exception is raised if s is incorrectly padded.
I found a simple solution in this answer: always add the maximum padding at the end of the byte array (two equal signs, ==). The b64decode(s) method simply ignores padding that is too long, so it always works.
You only have to change your code slightly for it to work:
text = bytes("/bCN99cbY13kwEf+wnRErg", encoding='iso-8859-1')
padded_text = text + b'=='
result = base64.b64decode(padded_text)
print(len(result))
The output is 16, identical to the Java output.
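For comparison, the equivalent check with java.util.Base64 (a small sketch) accepts the value with or without the padding and returns the same 16 bytes:

byte[] unpadded = Base64.getDecoder().decode("/bCN99cbY13kwEf+wnRErg");   // padding omitted
byte[] padded   = Base64.getDecoder().decode("/bCN99cbY13kwEf+wnRErg=="); // maximum padding appended
System.out.println(unpadded.length); // 16
System.out.println(padded.length);   // 16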
I'm trying to get an encoded value using the GB2312 character set, but I'm getting '?' instead of '®'.
Below is my sample code:
new String("Test ®".getBytes("GB2312"));
but I'm getting Test ? instead of Test ®.
Has anyone faced this issue?
Java version: JDK 6
Platform: Windows 7
I'm not familiar with Chinese character encodings, so I need a suggestion.
For better understanding, the statement can be divided into two parts:
byte[] bytes = "Test ®".getBytes("GB2312"); // bytes, encoding the string to GB2312
new String(bytes); // back to string, using default encoding
Probably ® is not a valid GB2312 character, so it is converted to ?. See the result of
Charset.forName("GB2312").newEncoder().canEncode("®")
Based on documentation of getBytes:
The behavior of this method when this string cannot be encoded in the given charset is unspecified. The CharsetEncoder class should be used when more control over the encoding process is required.
which also suggests using CharsetEncoder.
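A minimal sketch of that approach (the method name is just illustrative): configure the encoder to report unmappable characters, so the problem surfaces as an exception instead of a silent '?' substitution.

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

static byte[] encodeStrictly(String s, String charsetName) throws CharacterCodingException {
    ByteBuffer buffer = Charset.forName(charsetName).newEncoder()
            .onUnmappableCharacter(CodingErrorAction.REPORT) // throw instead of substituting, unlike String.getBytes
            .encode(CharBuffer.wrap(s));
    byte[] bytes = new byte[buffer.remaining()];
    buffer.get(bytes);
    return bytes;
}

// encodeStrictly("Test ®", "GB2312") will throw an UnmappableCharacterException if '®' has no GB2312 mapping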
I am trying to convert a protobuf stream to a JSON object using the com.google.protobuf.util.JsonFormat class, as below:
String jsonFormat = JsonFormat.printer().print(data);
As per the documentation https://developers.google.com/protocol-buffers/docs/proto3#json, bytes fields come back as Base64 strings (for example "hashedStaEthMac": "QDOMIxG+tTIRi7wlMA9yGtOoJ1g=",), but I would like to get them as readable strings (for example "locAlgorithm": "ALGORITHM_ESTIMATION",). Below is a sample output. Is there a way to get the JSON object as plain text, or any workaround to get the actual values?
{
  "seq": "71811887",
  "timestamp": 1488640438,
  "op": "OP_UPDATE",
  "topicSeq": "9023777",
  "sourceId": "xxxxxxxx",
  "location": {
    "staEthMac": {
      "addr": "xxxxxx"
    },
    "staLocationX": 1148.1763,
    "staLocationY": 980.3377,
    "errorLevel": 588,
    "associated": false,
    "campusId": "n5THo6IINuOSVZ/cTidNVA==",
    "buildingId": "7hY/jVh9NRqqxF6gbqT7Jw==",
    "floorId": "LV/ZiQRQMS2wwKiKTvYNBQ==",
    "hashedStaEthMac": "xxxxxxxxxxx",
    "locAlgorithm": "ALGORITHM_ESTIMATION",
    "unit": "FEET"
  }
}
Expected format is as below.
seq: 85264233
timestamp: 1488655098
op: OP_UPDATE
topic_seq: 10955622
source_id: 00505698749E
location {
  sta_eth_mac {
    addr: xx:xx:xx:xx:xx:xx
  }
  sta_location_x: 916.003
  sta_location_y: 580.115
  error_level: 854
  associated: false
  campus_id: 9F94C7A3A20836E392559FDC4E274D54
  building_id: EE163F8D587D351AAAC45EA06EA4FB27
  floor_id: 83144E609EEE3A64BBD22C536A76FF5A
  hashed_sta_eth_mac:
  loc_algorithm: ALGORITHM_ESTIMATION
  unit: FEET
}
Not easily, because the actual values are binary, which is why they're Base64-encoded in the first place.
Try to decode one of these values:
$ echo -n 'n5THo6IINuOSVZ/cTidNVA==' | base64 -D
??ǣ6?U??N'MT
In order to get more readable values, you have to understand what the binary data actually is, and then decide what format you want to use to display it.
The field called staEthMac.addr is 6 bytes and is probably an Ethernet MAC address. It's usually displayed as xx:xx:xx:xx:xx:xx where xx are the hexadecimal values of each byte. So you could decode the Base64 strings into a byte[] and then call a function to convert each byte to hex and delimit them with ':'.
The fields campusId, buildingId, and floorId are 16 bytes (128 bits) and are probably UUIDs. UUIDs are usually displayed as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx where each x is a hex digit (4 bits). So you could (again) convert the Base64 string to byte[] and then print the hex digits, optionally adding the dashes.
Not sure about sourceId and hashedStaEthMac, but you could just follow the pattern of converting to byte[] and printing as hex. Essentially you're just doing a conversion from base 64 to base 16. You'll wind up with something like this:
$ echo -n 'n5THo6IINuOSVZ/cTidNVA==' | base64 -D | xxd -p
9f94c7a3a20836e392559fdc4e274d54
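In Java, that Base64-to-hex conversion could look roughly like this (a sketch; the helper name and the delimiter handling are just one way to do it):

import java.util.Base64;

static String base64ToHex(String base64, String delimiter) {
    byte[] bytes = Base64.getDecoder().decode(base64);
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < bytes.length; i++) {
        if (i > 0) sb.append(delimiter);
        sb.append(String.format("%02x", bytes[i] & 0xff)); // each byte as two hex digits
    }
    return sb.toString();
}

// base64ToHex("n5THo6IINuOSVZ/cTidNVA==", "") -> "9f94c7a3a20836e392559fdc4e274d54" (add dashes for UUID style)
// use ":" as the delimiter for the 6-byte staEthMac.addr field to get the xx:xx:xx:xx:xx:xx form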
A point that I'm not sure you are getting is that it's binary data. There is no "readable" version that makes sense like "ALGORITHM_ESTIMATION" does; the best you can do is encode the binary data using letters and numbers so you can at least pronounce it.
Base64 (which encodes binary using 64 different characters) is pronounceable "N five T H lowercase-O six ..." but it's not real friendly because letter case is significant and because it uses letters like O and I that look like numbers. Hex (which encodes binary using just 16 characters) is a little easier to read.
Good evening!
In my Android app the smartphone loads an AES-encrypted String from my server and stores it in a variable. After that, the variable and a key are passed to a method which decrypts the string. My problem is that German umlauts (ä, ü, ö) aren't decoded correctly. All umlauts are displayed as question marks with a black background...
My Code:
public static String decrypt(String input, String key) {
    byte[] output = null;
    String newString = "";
    try {
        SecretKeySpec skey = new SecretKeySpec(key.getBytes(), "AES");
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, skey);
        output = cipher.doFinal(Base64.decode(input, Base64.DEFAULT));
        newString = new String(output);
    } catch (Exception e) {}
    return newString;
}
The code works perfectly, except that umlauts are not displayed correctly; for example, output that should read "ö-ä-ü" comes out garbled.
How can I set the encoding of the decrypted String? In my iOS app I use ASCII encoding for the decoded downloaded String, and that works perfectly. Android and iOS get the String from the same server in the same way, so I think the problem is in the local code above.
I hope you can help me with my problem... Thanks!
There is no text but encoded text.
It seems like you are guessing at the character set and encoding; that's no way to communicate.
To recover the text, you need to reverse the original process applied to it with the parameters associated with each step.
For explanation, assume that the server is taking text from a Java String and sending it to you securely.
1. String uses the Unicode character set (specifically, Unicode's UTF-16 encoding).
2. Get the bytes for the String, using some specific encoding, say ISO8859-1. (UTF-8 could be better because it is also an encoding for the Unicode character set, whereas ISO8859-1 has a lot fewer characters.) As #Andy points out, exceptions are your friends here.
3. Encrypt the bytes with a specific key. The key is a sequence of bytes, so, if you are generating this from a string, you have to use a specific encoding.
4. Encode the encrypted bytes with Base64, producing a Java String (again, UTF-16) with a subset of characters so reduced that it can be re-encoded in just about any character encoding and placed in just about any context such as SMTP, XML, or HTML without being misinterpreted or making it invalid.
5. Transmit the string using a specific encoding. An HTTP header and/or HTML charset value is usually used to communicate which encoding.
To receive the text, you have to get:
the bytes,
the encoding from step 5,
the key from step 3,
the encoding from step 3 and
the encoding from step 2.
Then you can reverse all of the steps. Per your comments, you discovered you weren't using the encoding from step 2. You also need to use the encoding from step 3.
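Applied to the decrypt method above, that means naming the charsets explicitly. A sketch, assuming the server used UTF-8 for both the plaintext and the key (substitute ISO8859-1 or whatever the server actually used):

SecretKeySpec skey = new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "AES"); // encoding from step 3
Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
cipher.init(Cipher.DECRYPT_MODE, skey);
byte[] output = cipher.doFinal(Base64.decode(input, Base64.DEFAULT));
String newString = new String(output, StandardCharsets.UTF_8); // encoding from step 2, not the platform default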
I am using protocol buffers in an iOS application. The app consumes a web service written in Java, which spits back a base64 encoded string.
The base64 string is the same on both ends.
In the app, however, whenever I try to convert the string to NSData, the number of bytes may or may not be the same on both ends. The result can be an invalid protocol buffer exception ("invalid end tag").
For example:
Source(bytes) | NSData | Diff
93 93 0
6739 6735 -4
5745 5739 -6
The bytes are equal in the trivial case of an empty protocol buffer.
Here is the Java source:
import org.apache.commons.codec.binary.Base64;
....
public static String bytesToBase64(byte[] bytes) {
return Base64.encodeBase64String(bytes);
}
On the iOS side, I have tried various algorithms from similar questions which all agree in byte size and content.
What could be causing this?
On closer inspection, the issue was my assumption that Base64 is Base64. I was using the URL-safe variant in the web service while the app's decoder was expecting the standard version.
I noticed underscores in the Base64, which I thought odd.
The Base64 Wikipedia page (http://en.wikipedia.org/wiki/Base64) shows no underscores in its value/character map, but later in the article it goes over variants, which do use underscores.
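With java.util.Base64 the two alphabets are explicit, so the mismatch is easy to see (a small sketch; the byte values are arbitrary):

byte[] data = { (byte) 0xfb, (byte) 0xef, (byte) 0xff };
System.out.println(Base64.getEncoder().encodeToString(data));    // "++//"  standard alphabet uses '+' and '/'
System.out.println(Base64.getUrlEncoder().encodeToString(data)); // "--__"  URL-safe alphabet uses '-' and '_'

On the server side shown above, Commons Codec's encodeBase64String produces the standard alphabet, while encodeBase64URLSafeString produces the '-'/'_' variant (and omits padding), so both ends have to agree on which one is in use.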