CharBuffer and ByteBuffer - charset encoding - java

Java stores characters in UTF-16 format (each char is a two-byte code unit; for the characters used here that is the same as UCS-2).
byte[] bytes = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2c,
                0x60, (byte) 0xA8, 0x59, 0x7D, 0x00, 0x21};
// Print the UCS-2 bytes as hex codes
System.out.printf("%10s", "UCS-2");
for (int i = 0; i < bytes.length; i++) {
    System.out.printf("%02x", bytes[i]);
}
1)
In the code below,
Charset charset = Charset.forName("UTF-8");
// Encode from UCS-2 to UTF-8
// Create a ByteBuffer by wrapping a byte array
ByteBuffer bb = ByteBuffer.wrap(bytes);
What byte order is used to store the bytes in bb on wrap(): big-endian or little-endian?
2)
In the code below,
// Create a CharBuffer from a view of this ByteBuffer
CharBuffer cb = bb.asCharBuffer();
ByteBuffer bbOut = charset.encode(cb);
What encoding is used to interpret the bytes of bb as the characters of cb on asCharBuffer()?
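For a quick self-check, here is a minimal sketch (my own example, not from the question; class and variable names are mine) that prints the relevant defaults:
import java.nio.ByteBuffer;

public class WrapOrderCheck {
    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.wrap(new byte[] {0x00, 0x48});
        // wrap() does not reorder anything; the new buffer simply starts
        // out with the default byte order, which is big-endian
        System.out.println(bb.order());               // BIG_ENDIAN
        // asCharBuffer() pairs bytes using that order and treats each
        // pair as one UTF-16 code unit
        System.out.println(bb.asCharBuffer().get());  // H (0x0048)
    }
}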

Related

LengthFieldBasedFrameDecoder not parsing correctly when buffer size is less than frame size

I am unit testing a netty pipeline that uses the frame-based decoder. It looks like the framing is incorrect if I use a buffer size smaller than the largest frame. I am testing with a file that contains two messages. The length field is the second word and includes the length of the entire message, including the length field and the word before it.
new LengthFieldBasedFrameDecoder(65536, 4, 4, -8, 0)
I am reading the file with various block sizes. The size of the first message is 348 bytes, the second is 456 bytes. If a block size of 512, 3456, or larger is used, both messages are read and correctly framed to the next handler, which for diagnostic purposes prints out the contents of the buffer it received as a hexadecimal string. If a smaller block size is used, framing errors occur. The code used to read and write the file is shown below.
public class NCCTBinAToCSV {
    private static String inputFileName = "/tmp/combined.bin";
    private static final int BLOCKSIZE = 456;

    public static void main(String[] args) throws Exception {
        byte[] bytes = new byte[BLOCKSIZE];
        EmbeddedChannel channel = new EmbeddedChannel(
                new LengthFieldBasedFrameDecoder(65536, 4, 4, -8, 0),
                new NCCTMessageDecoder(),
                new StringOutputHandler());
        FileInputStream fis = new FileInputStream(new File(inputFileName));
        int bytesRead = 0;
        while ((bytesRead = fis.read(bytes)) != -1) {
            ByteBuf buf = Unpooled.wrappedBuffer(bytes, 0, bytesRead);
            channel.writeInbound(buf);
        }
        channel.flush();
    }
}
Output from a successful run with a block size of 356 bytes is shown below (with the body of the messages truncated for brevity):
LOG:DEBUG 2017-04-24 04:19:24,675[main](netty.NCCTMessageDecoder) - com.ticomgeo.mtr.ncct.netty.NCCTMessageDecoder.decode(NCCTMessageDecoder.java:21) ]received 348 bytes
Frame Start========================================
(byte) 0xbb, (byte) 0x55, (byte) 0x05, (byte) 0x16,
(byte) 0x00, (byte) 0x00, (byte) 0x01, (byte) 0x5c,
(byte) 0x01, (byte) 0x01, (byte) 0x02, (byte) 0x02,
(byte) 0x05, (byte) 0x00, (byte) 0x00, (byte) 0x00,
(byte) 0x50, (byte) 0x3a, (byte) 0xc9, (byte) 0x17,
....
Frame End========================================
Frame Start========================================
(byte) 0xbb, (byte) 0x55, (byte) 0x05, (byte) 0x1c,
(byte) 0x00, (byte) 0x00, (byte) 0x01, (byte) 0xc8,
(byte) 0x01, (byte) 0x01, (byte) 0x02, (byte) 0x02,
(byte) 0x05, (byte) 0x00, (byte) 0x00, (byte) 0x00,
(byte) 0x04, (byte) 0x02, (byte) 0x00, (byte) 0x01,
If I change the block size to 256, the wrong bytes seem to be read as the length field.
Exception in thread "main" io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 65536: 4294967040 - discarded
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:499)
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:477)
at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:403)
TL;DR: Your problem is caused by netty reusing the passed-in ByteBuf while you overwrite its contents.
LengthFieldBasedFrameDecoder is designed to hold on to the passed-in ByteBuf rather than copy it: when the reference count is 1, there is no point letting the object die to garbage collection when it can be reused. The problem is that you then change the internals of that ByteBuf on the next read, and therefore change the frame on the fly. Instead of wrappedBuffer(), which uses your array as its backing storage, you should use copiedBuffer(), which makes a proper copy, so the internals of LengthFieldBasedFrameDecoder can freely do things with it.
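A minimal sketch of that fix in the question's read loop (same variable names as above):
while ((bytesRead = fis.read(bytes)) != -1) {
    // copiedBuffer() copies the slice, so refilling 'bytes' on the next
    // read no longer rewrites a frame the decoder may still be holding
    ByteBuf buf = Unpooled.copiedBuffer(bytes, 0, bytesRead);
    channel.writeInbound(buf);
}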

int to unsigned char array in Java

I'm trying to connect Android (Java) with Linux (Qt C++) using a socket, and then transfer the length of a message in bytes. To convert an int to an unsigned char array on the C++ side I use:
QByteArray IntToArray(qint32 source)
{
    QByteArray tmp;
    QDataStream data(&tmp, QIODevice::ReadWrite);
    data << source;
    return tmp;
}
But I don't know how to do the same conversion on the Java side, because Java doesn't have unsigned types. I tried some examples but always got different results. So I need a Java method which returns this for source = 17:
0x00, 0x00, 0x00, 0x11
I understand that it's a very simple question, but I'm new to Java, so it's not clear to me.
UPD:
Java:
PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
out.print(ByteBuffer.allocate(4).putInt(17).array());
Qt C++:
QByteArray* buffer = new QByteArray();
buffer->append(socket->readAll());
Output:
buffer = 0x5b, 0x42, 0x40, 0x61, 0x39, 0x65, 0x31,
0x62, 0x66, 0x35.
UPD2:
Java:
out.print(toBytes(17));
...
byte[] toBytes(int i)
{
    byte[] result = new byte[4];
    result[0] = (byte) (i >> 24);
    result[1] = (byte) (i >> 16);
    result[2] = (byte) (i >> 8);
    result[3] = (byte) (i /*>> 0*/);
    return result;
}
Qt C++: same
Output:
buffer = 0x5b, 0x42, 0x40, 0x63, 0x38, 0x61, 0x39,
0x33, 0x38, 0x33.
UPD3:
Qt C++:
QByteArray buffer = socket->readAll();
for (int i = 0; i < buffer.length(); ++i) {
    std::cout << buffer[i];
}
std::cout << std::endl;
Output:
[B@938a15c
First of all, don't use PrintWriter. There is no print(byte[]) overload, so out.print(toBytes(17)) picks print(Object) and sends the array's toString() value over the wire; the bytes you captured, 0x5b 0x42 0x40 ..., are just the ASCII text "[B@...", the array's object reference.
Here's something to remember about Java I/O:
Streams are for bytes, Readers/Writers are for characters.
In Java, a character is not a byte. Characters have an encoding associated with them, like UTF-8. Bytes don't.
When you wrap a Stream in a Reader or a Writer, you are taking a byte stream and imposing a character encoding on that byte stream. You don't want that here.
Just try this:
OutputStream out = socket.getOutputStream();
out.write(toBytes(17));
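As an aside, DataOutputStream already writes ints in big-endian order, so a sketch like the following (same socket as above) would put the same four bytes 0x00 0x00 0x00 0x11 on the wire:
DataOutputStream dout = new DataOutputStream(socket.getOutputStream());
dout.writeInt(17);  // writes 0x00 0x00 0x00 0x11, big-endian
dout.flush();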

How to use Java Card crypto sample?

I'm trying to get an example from the IBM website running.
I wrote this method:
public static byte[] cipher(byte[] inputData) {
    Cipher cipher = Cipher.getInstance(Cipher.ALG_DES_CBC_NOPAD, true);
    DESKey desKey = (DESKey) KeyBuilder.buildKey(
            KeyBuilder.TYPE_DES,
            KeyBuilder.LENGTH_DES,
            false);
    byte[] keyBytes = {(byte) 0x01, (byte) 0x02, (byte) 0x03, (byte) 0x04};
    desKey.setKey(keyBytes, (short) 0);
    cipher.init(desKey, Cipher.MODE_ENCRYPT);
    byte[] outputData = new byte[8];
    cipher.doFinal(inputData, (short) 0, (short) inputData.length, outputData, (short) 0);
    return outputData;
}
I call this method as cipher("test".getBytes());. When I call the servlet, the server gives me an Internal Server Error with a javacard.security.CryptoException.
I tried ALG_DES_CBC_ISO9797_M1, ALG_DES_CBC_ISO9797_M2 (and others) and got the same exception.
How can I get a simple cipher example running on Java Card Connected?
UPDATE
As #vojta said, the key must be 8 bytes long. So it must be something like this:
byte[] keyBytes = {(byte) 0x01, (byte) 0x02, (byte) 0x03, (byte) 0x04, (byte) 0x01, (byte) 0x02, (byte) 0x03, (byte) 0x04};
I don't know why, but it only works if I replace
Cipher cipher = Cipher.getInstance(Cipher.ALG_DES_CBC_NOPAD, true);
with
Cipher cipher = Cipher.getInstance(Cipher.ALG_DES_CBC_ISO9797_M2, false);
I could not find anything about this in the documentation.
These lines seem to be wrong:
byte[] keyBytes = {(byte) 0x01, (byte) 0x02, (byte) 0x03, (byte) 0x04};
desKey.setKey(keyBytes, (short) 0);
A DES key should be longer than 4 bytes, right? A standard DES key is 8 bytes long (with an effective strength of 56 bits).
In addition to #vojta's answer, the input data should be block aligned.
Your input data "test".getBytes() has length 4, which is not valid for Cipher.ALG_DES_CBC_NOPAD (but is valid for Cipher.ALG_DES_CBC_ISO9797_M2).
The strange part is that this should raise CryptoException with reason ILLEGAL_USE (which is 5, as opposed to the 3 you are getting)...
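To illustrate the block-alignment point, here is a plain-Java sketch of aligning the input to the 8-byte DES block size before calling the cipher() method above (zero padding is used purely for illustration, since the NOPAD variant demands pre-aligned input; on-card code would use javacard.framework.Util.arrayCopy rather than System.arraycopy):
byte[] input = "test".getBytes();                           // length 4
short alignedLen = (short) (((input.length + 7) / 8) * 8);  // round up to a multiple of 8
byte[] aligned = new byte[alignedLen];                      // trailing bytes stay 0x00
System.arraycopy(input, 0, aligned, 0, input.length);
byte[] encrypted = cipher(aligned);                         // now valid for ALG_DES_CBC_NOPAD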

Query on reading bytes from "UTF-8" world to Java "char"

Given the code snippet below, from this link:
byte[] bytes = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2C,
                0x60, (byte) 0xA8, 0x59, 0x7D, 0x00, 0x21}; // "Hi,您好!"
Charset charset = Charset.forName("UTF-8");
// Encode from UCS-2 to UTF-8
// Create a ByteBuffer by wrapping a byte array
ByteBuffer bb = ByteBuffer.wrap(bytes);
// Create a CharBuffer from a view of this ByteBuffer
CharBuffer cb = bb.asCharBuffer();
Per the wrap() documentation, "The new buffer will be backed by the given byte array." So there is no encoding from bytes to any other format here; the byte array is simply placed in a buffer.
Can you please help me understand what exactly we are doing when we say bb.asCharBuffer() in the above code? cb is similar to an array of characters. Because char is UTF-16 in Java, does asCharBuffer() treat every 2 bytes in bb as one char? Is this the right approach? If not, please show me the right approach.
Edit:
I tried this program as recommended by Meisch below,
byte[] bytes = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2C,
                0x60, (byte) 0xA8, 0x59, 0x7D, 0x00, 0x21}; // "Hi,您好!"
Charset charset = Charset.forName("UTF-8");
CharsetDecoder decoder = charset.newDecoder();
ByteBuffer bb = ByteBuffer.wrap(bytes);
CharBuffer cb = decoder.decode(bb);
which throws this exception:
Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(Unknown Source)
at java.nio.charset.CharsetDecoder.decode(Unknown Source)
at TestCharSet.main(TestCharSet.java:16)
Please help me, I am stuck here!
Note: I am using Java 1.6.
You ask: “Because char is UTF-16 in Java, using asCharBuffer() method, are we considering every 2 bytes in bb as char?”
The answer to that question is yes. Your understanding is correct.
Your next question is: “Is this the right approach?”
If you are just trying to demonstrate how the ByteBuffer, CharBuffer and Charset classes work, it's acceptable.
However, when you are coding an application, you will never write code like that. To begin with, there is no need for a byte array; you can represent the characters as a literal String:
String s = "Hi,\u60a8\u597d!";
If you want to convert the string to UTF-8 bytes, you can simply do this:
byte[] encodedBytes = s.getBytes(StandardCharsets.UTF_8);
If you're still using Java 6, you would do this instead:
byte[] encodedBytes = s.getBytes("UTF-8");
Update: Your byte array represents chars in the UTF-16BE (big-endian) encoding. Specifically, your array has exactly two bytes per character. That is not a valid UTF-8 encoded byte sequence, which is why you're getting the MalformedInputException.
When characters are encoded as UTF-8 bytes, each character is represented by 1 to 4 bytes. For your second code fragment to work, the array must be:
byte[] bytes = {
    0x48, 0x69, 0x2c,                       // ASCII chars are 1 byte each
    (byte) 0xe6, (byte) 0x82, (byte) 0xa8,  // U+60A8
    (byte) 0xe5, (byte) 0xa5, (byte) 0xbd,  // U+597D
    0x21
};
When converting from bytes to chars, my earlier statement still applies: You don't need ByteBuffer or CharBuffer or Charset or CharsetDecoder. You can use those classes, but usually it's more succinct to just create a String:
String s = new String(bytes, "UTF-8");
If you want a CharBuffer, just wrap the String:
CharBuffer cb = CharBuffer.wrap(s);
You may be wondering when it is appropriate to use a CharsetDecoder directly. You would do that if the bytes are coming from a source which is not under your control, and you have good reason to believe it may not contain properly UTF-8 encoded bytes. Using an explicit CharsetDecoder allows you to customize how invalid bytes will be handled.
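For instance, a minimal sketch of that kind of customization (classes from java.nio.charset), substituting U+FFFD for bad input instead of throwing:
CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
        .onMalformedInput(CodingErrorAction.REPLACE)
        .onUnmappableCharacter(CodingErrorAction.REPLACE);
// decode() still declares CharacterCodingException, but with REPLACE
// configured, malformed bytes become replacement characters instead
CharBuffer cb = decoder.decode(ByteBuffer.wrap(bytes));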
I just had a look at the sources; it boils down to two bytes from the byte buffer being combined into one character. The order in which the two bytes are used depends on the buffer's endianness; the default is big-endian.
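A small demonstration of that endianness dependence (my own example values):
ByteBuffer bb = ByteBuffer.wrap(new byte[] {0x00, 0x48});
System.out.println(bb.asCharBuffer().get());  // 'H' (0x0048), big-endian default
bb.order(ByteOrder.LITTLE_ENDIAN);
System.out.println(bb.asCharBuffer().get());  // U+4800, same bytes read low byte first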
Another approach using the NIO classes, beyond what I wrote in the comments, is the CharsetDecoder.decode() method.
Charset charset = Charset.forName("UTF-8");
CharsetDecoder decoder = charset.newDecoder();
ByteBuffer bb = ByteBuffer.wrap(bytes);
CharBuffer cb = decoder.decode(bb);

AES256 on Java vs PHP

A quick one that has thus far been evading me (long night). I'm comparing AES-256 in PHP vs Java and noticing discrepancies. For simplicity, please ignore the ASCII key and the null IV; those will be replaced in production. But I need to get past this first and can't figure out where I'm erring:
PHP:
echo base64_encode(
    mcrypt_encrypt(
        MCRYPT_RIJNDAEL_128,
        "1234567890ABCDEF1234567890ABCDEF",
        "This is a test",
        MCRYPT_MODE_CBC,
        "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
    )
);
Java
byte[] key = "1234567890ABCDEF1234567890ABCDEF".getBytes("UTF-8");
byte[] iv = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
AlgorithmParameterSpec ivSpec = new IvParameterSpec(iv);
SecretKeySpec newKey = new SecretKeySpec(key, "AES");
Cipher cipher = Cipher.getInstance("AES");
cipher.init(Cipher.ENCRYPT_MODE, newKey, ivSpec);
byte[] results = cipher.doFinal("This is a test".getBytes("UTF-8"));
return Base64.encodeToString(results,Base64.DEFAULT);
PHP output: 0KwK+eubMErzDaPU1+mwTQ==
Java output: DEKGJDo3JPtk48tPgCVN3Q==
Not quite what I was expecting o_O !
I've also tried MCRYPT_MODE_CBC, MCRYPT_MODE_CFB, MCRYPT_MODE_ECB, MCRYPT_MODE_NOFB, etc.; none of them produced the Java string.
PHP pads the input bytes with \0 to make the length a multiple of the block size. The equivalent in Java would be this (assuming the string you want to encrypt is in data):
Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
int blockSize = cipher.getBlockSize();
byte[] inputBytes = data.getBytes();
int byteLength = inputBytes.length;
if (byteLength % blockSize != 0) {
    byteLength = byteLength + (blockSize - (byteLength % blockSize));
}
byte[] paddedBytes = new byte[byteLength];
System.arraycopy(inputBytes, 0, paddedBytes, 0, inputBytes.length);
cipher.init(Cipher.ENCRYPT_MODE, newKey, ivSpec);
byte[] results = cipher.doFinal(paddedBytes);
A warning on this: zero padding is not desirable. There's no way to tell the difference between \0 characters at the end of your string and the padding itself. It's better to use PKCS5Padding instead, but then you will get different results in PHP. Ask yourself whether you really NEED the encryption to be cross-platform like this.
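For reference, a minimal sketch of the PKCS5Padding variant on the Java side (the PHP side would then need matching PKCS#7 padding rather than mcrypt's zero padding, e.g. via openssl_encrypt):
Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
cipher.init(Cipher.ENCRYPT_MODE, newKey, ivSpec);
// PKCS#5/#7 appends 1..16 bytes, each equal to the pad length, so the
// padding can always be stripped unambiguously after decryption
byte[] results = cipher.doFinal("This is a test".getBytes("UTF-8"));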
