I realise this is probably more of a general Java question, but since it's running in a Notes/Domino environment, I thought I'd check with that community first.
Summary:
I don't seem to be able to decode the string: dABlAHMAdAA= using lotus.domino.axis.encoding.Base64 or sun.misc.BASE64Decoder
I know the original text is: test
I confirmed this by decoding it at http://www5.rptea.com/base64/, where it appears to be UTF-16.
As a simple test, using either of the below:
String s_base64 = "dABlAHMAdAA=";
byte[] byte_base64 = null;
String s_decoded = "";
byte_base64 = new sun.misc.BASE64Decoder().decodeBuffer(s_base64);
s_decoded = new String(byte_base64, "UTF-16");
System.out.println("Test1: " + s_decoded);
byte_base64 = lotus.domino.axis.encoding.Base64.decode(s_base64);
s_decoded = new String(byte_base64, "UTF-16");
System.out.println("Test2: " + s_decoded);
System.out.println("========= FINISH.");
I get the output:
Test1: ????
Test2: ????
If I create String as UTF-8
s_decoded = new String(byte_base64, "UTF-8");
it outputs:
t
No error is thrown, but the code doesn't complete: it never reaches "FINISH".
Detail
I'm accessing an asmx web service, and in the SOAP response some nodes contain base64-encoded data. At this point in time, there is no way to get the service changed, so I have to XPath and decode the data myself. The encoded data is either text or HTML. If I pass the encoded data through http://www5.rptea.com/base64/ and select UTF-16, it decodes correctly, so I must be doing something incorrectly.
As a side note, I encoded "test":
s_base64 = lotus.domino.axis.encoding.Base64.encode(s_text.getBytes());
System.out.println("test1 encodes to: " + s_base64);
s_base64 = new sun.misc.BASE64Encoder().encode(s_text.getBytes());
System.out.println("test2 encodes to: " + s_base64);
they both encode to:
dGVzdA==
...which, if you then feed it into the two decoders above, decodes correctly, as expected.
If I go to site above, and encode "test" as UTF-16, I get: dABlAHMAdAA= so that confirms that data is in UTF-16.
It's like the data is genuine base64 data, but the decoder doesn't recognise it as such. I'm slightly stumped at the moment.
Any pointers or comments would be gratefully received.
The string has been encoded in UTF-16LE (little-endian), where the least significant byte is stored first. Java's "UTF-16" charset assumes big-endian when the data has no byte-order mark. You need to use:
s_decoded = new String(byte_base64, "UTF-16LE");
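As a minimal sketch of the fix above, the following decodes the sample string as UTF-16LE and also shows that encoding "test" as UTF-16LE reproduces it. It assumes Java 8 or later so that java.util.Base64 is available; on an older JVM the byte[] returned by either decoder in the question can be passed to the same String constructor. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Utf16LeBase64Demo {
    // Decode Base64 text whose payload is UTF-16LE-encoded characters.
    static String decodeUtf16Le(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    // Encode a string as UTF-16LE bytes, then as Base64.
    static String encodeUtf16Le(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_16LE);
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        System.out.println(decodeUtf16Le("dABlAHMAdAA=")); // prints "test"
        System.out.println(encodeUtf16Le("test"));         // prints "dABlAHMAdAA="
    }
}
```

Passing the charset explicitly, rather than relying on "UTF-16" or the platform default, keeps the byte order unambiguous.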
I ran your sample "dABlAHMAdAA=" through my online Base64 decoding tool, and it seems you are missing the Apache Base64 jar files.
Click the link below.
http://www.hosting4free.info/Base64Decode/Base64-Decode.jsp
The code behind the website is
import java.io.UnsupportedEncodingException;
import org.apache.commons.codec.binary.Base64;
public class base64decode
{
    public static void main(String[] args) throws UnsupportedEncodingException
    {
        // Decode the Base64 text and print it using the platform default charset
        byte[] decoded = Base64.decodeBase64("YWJjZGVmZw==".getBytes());
        System.out.println(new String(decoded) + "\n");
    }
}
Related
To reset my password I want to send the user a link to site/account/{hash} where {hash} is a hash of the user's password and a timestamp.
I have the following code to hash only the email and have a readable link:
String check = info.mail;
MessageDigest md = MessageDigest.getInstance("SHA-1");
String checkHash = Base64.encodeBase64String(md.digest(check.getBytes()));
if(checkHash.equals(hash)){
return ResponseEntity.ok("Password reset to: " + info.password);
}else{
return ResponseEntity.ok("Hash didn't equal to: " + checkHash);
}
The problem is that when I convert this to Base64 it may include / characters, which will mess up my links and the checking of the hash.
I can simply replace any unwanted characters with something else after hashing, but is there some other way to restrict the hash to a certain set of characters?
Also, I know the responses are still sent insecurely, but this is just for testing and debugging.
RFC 3548 specifies a variant often called "base64url", designed specifically for that purpose. In this variant, + and / are replaced by - and _.
Java 8 has built-in support through the new java.util.Base64 class. If you're stuck with an older version, the Base64 class of Apache Commons Codec can be configured to be URL-safe by using the new Base64(true) constructor.
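A minimal sketch of the Java 8 route, assuming the hash is a SHA-1 digest as in the question. The class name, the method name and the sample e-mail address are hypothetical; withoutPadding() is an optional extra that also drops the trailing = characters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class UrlSafeHashDemo {
    // Hash the input with SHA-1 and encode the digest with the URL-safe alphabet.
    static String urlSafeSha1(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The result contains only letters, digits, '-' and '_'.
        System.out.println(urlSafeSha1("user@example.com"));
    }
}
```

The matching decoder is Base64.getUrlDecoder(), should the raw digest bytes be needed again.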
Other options might be:
Don't use Base64, but transfer the bytes as a hexadecimal representation (which will not contain any special characters):
String checkHash = toHex(md.digest(check.getBytes()));
with
private static String toHex(byte[] bytes) {
StringBuilder sb = new StringBuilder();
for (byte b : bytes) {
sb.append(String.format("%02X", b));
}
return sb.toString();
}
Use URL encoding/decoding on the generated hash (that's what you already know)
Question
Are the Java 8 java.util.Base64 MIME Encoder and Decoder a drop-in replacement for the unsupported, internal Java API sun.misc.BASE64Encoder and sun.misc.BASE64Decoder?
EDIT (Clarification): By drop-in replacement I mean that I can switch legacy code using sun.misc.BASE64Encoder and sun.misc.BASE64Decoder to the Java 8 MIME Base64 Encoder/Decoder transparently for any other existing client code.
What I think so far and why
Based on my investigation and quick tests (see code below) it should be a drop-in replacement because
sun.misc.BASE64Encoder based on its JavaDoc is a BASE64 Character encoder as specified in RFC1521. This RFC is part of the MIME specification...
java.util.Base64 based on its JavaDoc Uses the "The Base64 Alphabet" as specified in Table 1 of RFC 2045 for encoding and decoding operation... under MIME
Assuming no significant changes in the RFC 1521 and 2045 (I could not find any) and based on my quick test using the Java 8 Base64 MIME Encoder/Decoder should be fine.
What I am looking for
an authoritative source confirming or disproving the "drop-in replacement" point OR
a counterexample which shows a case where java.util.Base64 has different behaviour than the sun.misc.BASE64Encoder OpenJDK Java 8 implementation (8u40-b25) (BASE64Decoder) OR
whatever you think answers above question definitely
For reference
My test code
public class Base64EncodingDecodingRoundTripTest {
public static void main(String[] args) throws IOException {
String test1 = " ~!##$%^& *()_+=`| }{[]\\;: \"?><,./ ";
String test2 = test1 + test1;
encodeDecode(test1);
encodeDecode(test2);
}
static void encodeDecode(final String testInputString) throws IOException {
sun.misc.BASE64Encoder unsupportedEncoder = new sun.misc.BASE64Encoder();
sun.misc.BASE64Decoder unsupportedDecoder = new sun.misc.BASE64Decoder();
Base64.Encoder mimeEncoder = java.util.Base64.getMimeEncoder();
Base64.Decoder mimeDecoder = java.util.Base64.getMimeDecoder();
String sunEncoded = unsupportedEncoder.encode(testInputString.getBytes());
System.out.println("sun.misc encoded: " + sunEncoded);
String mimeEncoded = mimeEncoder.encodeToString(testInputString.getBytes());
System.out.println("Java 8 Base64 MIME encoded: " + mimeEncoded);
byte[] mimeDecoded = mimeDecoder.decode(sunEncoded);
String mimeDecodedString = new String(mimeDecoded, Charset.forName("UTF-8"));
byte[] sunDecoded = unsupportedDecoder.decodeBuffer(mimeEncoded); // throws IOException
String sunDecodedString = new String(sunDecoded, Charset.forName("UTF-8"));
System.out.println(String.format("sun.misc decoded: %s | Java 8 Base64 decoded: %s", sunDecodedString, mimeDecodedString));
System.out.println("Decoded results are both equal: " + Objects.equals(sunDecodedString, mimeDecodedString));
System.out.println("Mime decoded result is equal to test input string: " + Objects.equals(testInputString, mimeDecodedString));
System.out.println("\n");
}
}
Here's a small test program that illustrates a difference in the encoded strings:
byte[] bytes = new byte[57];
String enc1 = new sun.misc.BASE64Encoder().encode(bytes);
String enc2 = new String(java.util.Base64.getMimeEncoder().encode(bytes),
StandardCharsets.UTF_8);
System.out.println("enc1 = <" + enc1 + ">");
System.out.println("enc2 = <" + enc2 + ">");
System.out.println(enc1.equals(enc2));
Its output is:
enc1 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>
enc2 = <AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
false
Note that the encoded output of sun.misc.BASE64Encoder has a newline at the end. It doesn't always append a newline, but it happens to do so if the encoded string has exactly 76 characters on its last line. (The author of java.util.Base64 considered this to be a small bug in the sun.misc.BASE64Encoder implementation – see the review thread).
This might seem like a triviality, but if you had a program that relied on this specific behavior, switching encoders might result in malformed output. Therefore, I conclude that java.util.Base64 is not a drop-in replacement for sun.misc.BASE64Encoder.
Of course, the intent of java.util.Base64 is that it's a functionally equivalent, RFC-conformant, high-performance, fully supported and specified replacement that's intended to support migration of code away from sun.misc.BASE64Encoder. You need to be aware of some edge cases like this when migrating, though.
I had the same issue when I moved from sun.misc to java.util.Base64, but org.apache.commons.codec.binary.Base64 solved my problem.
There are no changes to the base64 specification between rfc1521 and rfc2045.
All base64 implementations could be considered drop-in replacements for one another. The only differences between base64 implementations are:
the alphabet used.
the APIs provided (e.g. some might only act on a full input buffer, while others might be finite state machines that let you keep pushing chunks of input through them until you are done).
The MIME base64 alphabet has remained constant between RFC versions (it has to, or older software would break) and is: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
As Wikipedia notes, only the last 2 characters may change between base64 implementations.
As an example of a base64 implementation that does change the last 2 characters, the IMAP MUTF-7 specification uses the following base64 alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,
The reason for the change is that the / character is often used as a path delimiter and since the MUTF-7 encoding is used to flatten non-ASCII directory paths into ASCII, the / character needed to be avoided in encoded segments.
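To illustrate the point that only the last two alphabet characters vary, here is a small sketch comparing the JDK's standard and URL-safe encoders on the same two bytes. It uses the URL-safe variant because the JDK ships it; it does not implement the MUTF-7 alphabet. The class name and sample bytes are chosen for illustration only.

```java
import java.util.Base64;

class Base64AlphabetDemo {
    // Two bytes whose encoding uses both of the "variable" alphabet positions (62 and 63).
    static final byte[] SAMPLE = {(byte) 0xFB, (byte) 0xFF};

    static String standard() {
        return Base64.getEncoder().encodeToString(SAMPLE);
    }

    static String urlSafe() {
        return Base64.getUrlEncoder().encodeToString(SAMPLE);
    }

    public static void main(String[] args) {
        System.out.println(standard()); // prints "+/8="
        System.out.println(urlSafe());  // prints "-_8="
    }
}
```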
Assuming both encoders are bug-free, the RFC requires distinct encodings for every 0-byte, 1-byte, 2-byte and 3-byte sequence. Longer sequences are broken down into as many 3-byte sequences as needed, followed by a final sequence. Hence if the two implementations handle all 16,843,009 (1+256+65536+16777216) possible sequences correctly, then the two implementations are also identical.
These tests only take a few minutes to run. By slightly changing your test code, I have done that, and my Java 8 installation passed all the tests. Hence the public implementation can be used to safely replace the sun.misc implementation.
Here is my test code:
import java.util.Base64;
import java.util.Arrays;
import java.io.IOException;
public class Base64EncodingDecodingRoundTripTest {
public static void main(String[] args) throws IOException {
System.out.println("Testing zero byte encoding");
encodeDecode(new byte[0]);
System.out.println("Testing single byte encodings");
byte[] test = new byte[1];
for(int i=0;i<256;i++) {
test[0] = (byte) i;
encodeDecode(test);
}
System.out.println("Testing double byte encodings");
test = new byte[2];
for(int i=0;i<65536;i++) {
test[0] = (byte) i;
test[1] = (byte) (i >>> 8);
encodeDecode(test);
}
System.out.println("Testing triple byte encodings");
test = new byte[3];
for(int i=0;i<16777216;i++) {
test[0] = (byte) i;
test[1] = (byte) (i >>> 8);
test[2] = (byte) (i >>> 16);
encodeDecode(test);
}
System.out.println("All tests passed");
}
static void encodeDecode(final byte[] testInput) throws IOException {
sun.misc.BASE64Encoder unsupportedEncoder = new sun.misc.BASE64Encoder();
sun.misc.BASE64Decoder unsupportedDecoder = new sun.misc.BASE64Decoder();
Base64.Encoder mimeEncoder = java.util.Base64.getMimeEncoder();
Base64.Decoder mimeDecoder = java.util.Base64.getMimeDecoder();
String sunEncoded = unsupportedEncoder.encode(testInput);
String mimeEncoded = mimeEncoder.encodeToString(testInput);
// check encodings equal
if( ! sunEncoded.equals(mimeEncoded) ) {
throw new IOException("Input "+Arrays.toString(testInput)+" produced different encodings (sun=\""+sunEncoded+"\", mime=\""+mimeEncoded+"\")");
}
// Check cross decodes are equal. Note encoded forms are identical
byte[] mimeDecoded = mimeDecoder.decode(sunEncoded);
byte[] sunDecoded = unsupportedDecoder.decodeBuffer(mimeEncoded); // throws IOException
if(! Arrays.equals(mimeDecoded,sunDecoded) ) {
throw new IOException("Input "+Arrays.toString(testInput)+" was encoded as \""+sunEncoded+"\", but decoded as sun="+Arrays.toString(sunDecoded)+" and mime="+Arrays.toString(mimeDecoded));
}
}
}
Stuart Marks' answer is almost correct. The getMimeEncoder in his example above should be configured like this to emulate sun.misc:
String enc2 = new String(java.util.Base64.getMimeEncoder(76, new byte[]{0xa}).encode(bytes),
StandardCharsets.UTF_8);
At this point, it will be a drop-in as requested in the original post.
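For reference, here is a sketch of what the custom line separator changes on the java.util.Base64 side. It does not run sun.misc itself (that class is not available on recent JDKs), so whether the output matches sun.misc in every case, including the trailing newline described in the earlier answer, is not verified here. The class and method names are made up for illustration.

```java
import java.util.Base64;

class MimeLineSeparatorDemo {
    // Default MIME encoder: 76-character lines separated by "\r\n".
    static String encodeDefault(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // MIME encoder configured with a bare line feed as the separator.
    static String encodeLf(byte[] data) {
        return Base64.getMimeEncoder(76, new byte[] {'\n'}).encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = new byte[58]; // one byte more than fits on a single 76-character line
        // Make the separators visible in the printed output.
        System.out.println(encodeDefault(data).replace("\r", "\\r").replace("\n", "\\n"));
        System.out.println(encodeLf(data).replace("\n", "\\n"));
        // With exactly 57 bytes neither encoder emits any separator.
        System.out.println(encodeLf(new byte[57]).length()); // prints 76
    }
}
```

With 58 input bytes the default encoder inserts \r\n after the first 76 characters, while the configured one inserts only \n; neither appends a separator after the last line.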
My input hex is
C30A010000003602000F73B32F9ECA00E9F2F2E9
I need to convert it to the following base 64 encoded String:
wwoBAAAANgIAD3OzL57KAOny8uk=
I can simulate this transformation on this site: http://www.asciitohex.com/ but I can't seem to get it working in Java using the various base64 encoder utilities that are suggested on this site and elsewhere on the web. For example:
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Base64;
public class Test {
public static void main(final String args[]) throws DecoderException {
String hexString = "C30A010000003602000F73B32F9ECA00E9F2F2E9";
String output = new String(Base64.encodeBase64String(hexString.getBytes()));
System.out.println(output);
}
}
However the output for this is something different:
QzMwQTAxMDAwMDAwMzYwMjAwMEY3M0IzMkY5RUNBMDBFOUYyRjJFOQ==
Can anyone suggest how to get this transformation working successfully?
Thanks
Basically, hexString.getBytes() doesn't do what you expect it to. It's just encoding the string as a byte sequence in your platform default encoding - it's got nothing to do with hex.
You need to decode from hex to byte[] to start with. Additionally, you don't need to call the String constructor with another string. As you're already using Apache Commons Codec, it makes sense to use the Hex class for the decoding. I would also separate out the steps for clarity:
String hexString = "C30A010000003602000F73B32F9ECA00E9F2F2E9";
byte[] rawData = Hex.decodeHex(hexString.toCharArray());
String output = Base64.encodeBase64String(rawData);
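If Apache Commons Codec is not on the classpath, the same two steps can be sketched with the JDK alone, assuming Java 8 or later. The decodeHex helper below is a hand-written stand-in for Hex.decodeHex, not part of any library, and the class name is hypothetical.

```java
import java.util.Base64;

class HexToBase64Demo {
    // Convert a hex string (two characters per byte) into the bytes it denotes.
    static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Decode from hex first, then Base64-encode the raw bytes.
    static String hexToBase64(String hex) {
        return Base64.getEncoder().encodeToString(decodeHex(hex));
    }

    public static void main(String[] args) {
        // prints "wwoBAAAANgIAD3OzL57KAOny8uk="
        System.out.println(hexToBase64("C30A010000003602000F73B32F9ECA00E9F2F2E9"));
    }
}
```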
I'm new to Java and I'm not sure how to do the following:
A Scala application somewhere converts a String into bytes:
ByteBuffer.wrap(str.getBytes)
I collect this byte array as a Java String, and I wish to do the inverse of what the Scala code above did, hence get the original String (object str above).
Getting the ByteBuffer as a String to begin with is the only option I have, as I'm reading it from an AWS Kinesis stream (or is it?). The Scala code shouldn't change either.
Example string:
String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";
How can this be achieved in Java?
EDIT
Okay, so I'll try to elaborate a little more about the process:
A 3rd party Scala application produces CSV rows which I need to consume
Before storing those rows in an AWS Kinesis stream, the application does the following to each row:
ByteBuffer.wrap(output.getBytes);
I read the data from the stream as a string, and the string could look like the following one:
String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";
I need to restore the contents of the string above into its original, readable, form;
I hope I've made it clearer now, sorry for puzzling you all to begin with.
If you want to go from byte[] to String, try new String(yourBytes).
Both getBytes and the String(byte[]) constructor use the default character encoding.
From Amazon Kinesis Service API Reference:
The data blob to put into the record, which is Base64-encoded when the blob is serialized.
You need to base64 decode the string. Using Java 8 it would look like:
byte[] bytes = Base64.getDecoder().decode("AAAAAAAAAAGZ7dFR0XmV23BR........");
str = new String(bytes, "UTF-8");
Other options: Base64 Encoding in Java
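A self-contained sketch of that decoding step follows. Because the sample string in the question is truncated, the example builds its own Base64 blob from a made-up CSV row; it also assumes the producer's default charset was UTF-8, which str.getBytes without an argument does not guarantee. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class KinesisBlobDecodeDemo {
    // Base64-decode the blob and interpret the bytes as UTF-8 text (assumed charset).
    static String decodeBlob(String base64Blob) {
        byte[] bytes = Base64.getDecoder().decode(base64Blob);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "id,name,42"; // hypothetical CSV row standing in for the producer's output
        String blob = Base64.getEncoder().encodeToString(row.getBytes(StandardCharsets.UTF_8));
        System.out.println(blob);             // the Base64 form, as it would appear when serialized
        System.out.println(decodeBlob(blob)); // the original row again
    }
}
```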
I'm not sure I understand the question exactly, but do you mean this?
String decoded = new String(bytes);
public static void main(String[] args){
    // Decoding with the platform default charset
    String decoded = new String(bytesData);
    String actualString;
    try{
        // Decoding with an explicit charset
        actualString = new String(bytesData,"UTF-8");
        System.out.println("String is " + actualString);
    }catch(UnsupportedEncodingException e){
        e.printStackTrace();
    }
}
Sorry, wrong answer.
Again, ByteBuffer is a Java class, so they may work the same way.
You need the Java version.
From kafka ApiUtils:
def writeShortString(buffer: ByteBuffer, string: String) {
  if (string == null) {
    buffer.putShort(-1)
  } else {
    val encodedString = string.getBytes("utf-8")
    if (encodedString.length > Short.MaxValue) {
      // Placeholder from the original answer: substitute your own exception and message
      throw new YourException("Your message")
    } else {
      // Length prefix first, then the encoded bytes
      buffer.putShort(encodedString.length.asInstanceOf[Short])
      buffer.put(encodedString)
    }
  }
}
For Kinesis data blobs:
private CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
decoder.decode(record.getData()).toString();
Running the following (example) code
import java.io.*;
public class test {
public static void main(String[] args) throws Exception {
byte[] buf = {-27};
InputStream is = new ByteArrayInputStream(buf);
BufferedReader r = new BufferedReader(
new InputStreamReader(is, "ISO-8859-1"));
String s = r.readLine();
System.out.println("test.java:9 [byte] (char)" + (char)s.getBytes()[0] +
" (int)" + (int)s.getBytes()[0]);
System.out.println("test.java:10 [char] (char)" + (char)s.charAt(0) +
" (int)" + (int)s.charAt(0));
System.out.println("test.java:11 string below");
System.out.println(s);
System.out.println("test.java:13 string above");
}
}
gives me this output
test.java:9 [byte] (char)? (int)63
test.java:10 [char] (char)? (int)229
test.java:11 string below
?
test.java:13 string above
How do I retain the correct byte value (-27) in the line-9 printout, and consequently get the expected output (å) from the System.out.println(s) command?
If you want to retain byte values, don't use a Reader at all, ideally. To represent arbitrary binary data in text and convert it back to binary data later, you should use base16 or base64 encoding.
However, to explain what's going on, when you call s.getBytes() that's using the default character encoding, which apparently doesn't include Unicode character U+00E5.
If you call s.getBytes("ISO-8859-1") everywhere instead of s.getBytes() I suspect you'll get back the right byte value... but relying on ISO-8859-1 for this is kinda dirty IMO.
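A small sketch of that behaviour, passing the charset explicitly on both sides. US-ASCII stands in here for a default encoding that cannot represent U+00E5; that choice is an assumption for illustration, since the asker's actual default encoding is not stated. The class and method names are made up.

```java
import java.nio.charset.StandardCharsets;

class CharsetRoundTripDemo {
    // Decode bytes as ISO-8859-1, where every byte maps to exactly one character.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = decodeLatin1(new byte[] {-27});

        // The character itself is U+00E5 (229).
        System.out.println((int) s.charAt(0)); // prints 229

        // Encoding with the same charset gives the original byte back.
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1)[0]); // prints -27

        // A charset without that character substitutes '?' (63).
        System.out.println(s.getBytes(StandardCharsets.US_ASCII)[0]); // prints 63
    }
}
```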
As noted, getBytes() (no-arguments) uses the Java platform default encoding, which may not be ISO-8859-1. Simply printing it should work, provided your terminal and the default encoding match and support the character. For instance, on my system, the terminal and default Java encoding are both UTF-8. The fact that you're seeing a '?' indicates that yours don't match or å is not supported.
If you want to manually encode to UTF-8 on your system, do:
String s = r.readLine();
byte[] utf8Bytes = s.getBytes("UTF-8");
It should give a byte array with {-61, -91}.