I am trying to use ByteBuffer properly with big-endian byte order.
I have a couple of fields which I am trying to put together into a single ByteBuffer before storing it in a Cassandra database.
The byte array that I will be writing into Cassandra is made up of three parts, as described below:
short employeeId = 32767;
long lastModifiedDate = 1379811105109L;
byte[] attributeValue = os.toByteArray();
Now, I will write employeeId, lastModifiedDate and attributeValue together into a single byte array and write the resulting byte array into Cassandra. My C++ program will then retrieve that byte array from Cassandra and deserialize it to extract employeeId, lastModifiedDate and attributeValue.
To do this, I am using ByteBuffer with big-endian byte order.
I have put this code together:
public static void main(String[] args) throws Exception {
String text = "Byte Buffer Test";
byte[] attributeValue = text.getBytes();
long lastModifiedDate = 1289811105109L;
short employeeId = 32767;
int size = 2 + 8 + 4 + attributeValue.length; // short is 2 bytes, long 8 and int 4
ByteBuffer bbuf = ByteBuffer.allocate(size);
bbuf.order(ByteOrder.BIG_ENDIAN);
bbuf.putShort(employeeId);
bbuf.putLong(lastModifiedDate);
bbuf.putInt(attributeValue.length);
bbuf.put(attributeValue);
bbuf.rewind();
// best approach is copy the internal buffer
byte[] bytesToStore = new byte[size];
bbuf.get(bytesToStore);
// write bytesToStore in Cassandra...
// Now retrieve the Byte Array data from Cassandra and deserialize it...
byte[] allWrittenBytesTest = bytesToStore;//magicFunctionToRetrieveDataFromCassandra();
ByteBuffer bb = ByteBuffer.wrap(allWrittenBytesTest);
bb.order(ByteOrder.BIG_ENDIAN);
bb.rewind();
short extractEmployeeId = bb.getShort();
long extractLastModifiedDate = bb.getLong();
int extractAttributeValueLength = bb.getInt();
byte[] extractAttributeValue = new byte[extractAttributeValueLength];
bb.get(extractAttributeValue); // read attributeValue from the remaining buffer
System.out.println(extractEmployeeId);
System.out.println(extractLastModifiedDate);
System.out.println(new String(extractAttributeValue));
}
Is there a better way of doing this than what I have now, or are there some minor improvements I could make?
This is the first time I am using ByteBuffer, so I am having a bit of trouble.
Can anyone take a look and let me know whether this is the right way to use ByteBuffer?
The initial order of a ByteBuffer is always BIG_ENDIAN, so you don't need to set it. Also, a buffer returned by wrap() already has its position at zero, so the rewind() is unnecessary.
Instead of copying the underlying array, I would use it directly.
Replace
bbuf.rewind();
// best approach is copy the internal buffer
byte[] bytesToStore = new byte[size];
bbuf.get(bytesToStore);
with
byte[] bytesToStore = bbuf.array();
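The suggestion above can be sketched as a small encode/decode pair. This is only an illustration: the class and method names (EmployeeRecordCodec, encode, decodeToString) are made up for the example, and it assumes the attribute value is UTF-8 text. Using array() is safe here only because the buffer was allocated at exactly the record size and filled completely.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question.
final class EmployeeRecordCodec {

    // Layout: 2-byte employeeId, 8-byte lastModifiedDate, 4-byte length, then the value bytes.
    static byte[] encode(short employeeId, long lastModifiedDate, byte[] attributeValue) {
        int size = Short.BYTES + Long.BYTES + Integer.BYTES + attributeValue.length;
        ByteBuffer buffer = ByteBuffer.allocate(size); // big-endian by default
        buffer.putShort(employeeId);
        buffer.putLong(lastModifiedDate);
        buffer.putInt(attributeValue.length);
        buffer.put(attributeValue);
        // The buffer was allocated at exactly the record size and is now full,
        // so the backing array is the whole record and no copy is needed.
        return buffer.array();
    }

    // Reads the fields back in the same order; assumes the value is UTF-8 text.
    static String decodeToString(byte[] stored) {
        ByteBuffer buffer = ByteBuffer.wrap(stored); // position 0, big-endian
        short employeeId = buffer.getShort();
        long lastModifiedDate = buffer.getLong();
        byte[] attributeValue = new byte[buffer.getInt()];
        buffer.get(attributeValue);
        return employeeId + "|" + lastModifiedDate + "|"
                + new String(attributeValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] value = "Byte Buffer Test".getBytes(StandardCharsets.UTF_8);
        byte[] stored = encode((short) 32767, 1289811105109L, value);
        System.out.println(stored.length);
        System.out.println(decodeToString(stored));
    }
}
```

If the buffer were larger than the data written, or were a slice of another buffer, array() would return more than the record, and copying the remaining bytes would be the safer choice.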
Related
I am using Zstd compression in Java for compressing a large JSON payload. I am using methods from the zstd-jni library for Java. I create a byte array out of the JSON string and use this method.
public static byte[] compress(byte[] var0, int var1)
I read that ZSTD will give more optimal results when a dictionary is passed during compression and decompression. How do I create a ZstdDictCompress object? What byte array and integer should I pass to the constructor?
public static long compress(byte[] var0, byte[] var1, ZstdDictCompress var2)
This example is for https://github.com/luben/zstd-jni.
First of all, you need to collect many samples of your JSON documents. Just one or a couple of samples is not enough. After that you can train your dictionary:
List<String> jsons = ...; // List of your jsons samples
ZstdDictTrainer trainer = new ZstdDictTrainer(1024 * 1024, 16 * 1024); // 16 KB dictionary
for(String json : jsons) {
trainer.addSample(json.getBytes(StandardCharsets.UTF_8));
}
byte[] dictionary = trainer.trainSamples();
Now you have your dictionary as a byte array.
The next step is to use the SAME dictionary to compress and decompress.
// Compress
byte[] json = jsonString.getBytes(StandardCharsets.UTF_8);
ZstdDictCompress zstdDictCompress = new ZstdDictCompress(dictionary, Zstd.defaultCompressionLevel());
byte[] compressed = Zstd.compress(json, zstdDictCompress);
// The tricky part: you have to pass the original (uncompressed) JSON length to the decompress method
int jsonFullLength = json.length;
// Decompress
ZstdDictDecompress zstdDictDecompress = new ZstdDictDecompress(dictionary);
byte[] decompressed = Zstd.decompress(compressed, zstdDictDecompress, jsonFullLength);
String jsonString2 = new String(decompressed, StandardCharsets.UTF_8);
That's all!
I am using memcached to communicate between Java and C#.
The C# side puts data into memcached as a byte[], and the Java application tries to read that byte array, but in Java I am getting a String object.
Sample C# code:
MemcachedClient _mc = new MemcachedClient();
_mc.Serverlist = { "127.0.0.1:11211" };
byte[] stestValue = GetBytes("india");
_mc.set("key1",stestValue);
private byte[] GetBytes(string str)
{
byte[] bytes = new byte[str.Length * sizeof(char)];
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
Java code, to fetch the data that we set in memcached under the key key1:
MemcachedClient mcc = new MemcachedClient(new InetSocketAddress("127.0.0.1", 11211));
Object value = mcc.get("key1");
Here we get a String object in value rather than a byte[].
Yes, I have tried to get a byte[] from this String, but the length of that byte array is different from the C# byte array. I need to parse the byte array by its length, so the requirement is:
length of byte[] in C# = length of byte[] in Java
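One possible source of the length mismatch is the text encoding. The C# GetBytes method above copies the raw char array, which holds UTF-16 code units at 2 bytes each (little-endian on typical hardware), so "india" becomes 10 bytes. A minimal sketch of matching that on the Java side follows. The class and method names are invented for the example, and it does not address how either memcached client transcodes the stored value, which may also play a role.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf16Interop {

    // Produces the same bytes as the C# GetBytes above, assuming a little-endian platform.
    static byte[] toCSharpCharBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE); // 2 bytes per char, no byte-order mark
    }

    // Rebuilds the string from bytes laid out as raw little-endian UTF-16 code units.
    static String fromCSharpCharBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] bytes = toCSharpCharBytes("india");
        System.out.println(bytes.length);
        System.out.println(fromCSharpCharBytes(bytes));
    }
}
```

Calling getBytes() with no argument, or with UTF-8, would give 5 bytes for the same text, which would explain a length difference against the 10-byte C# array.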
I have the layout below in which I need to represent my data, and finally I need to make one byte array out of it.
// below is my data layout -
// data key type which is 1 byte
// data key len which is 1 byte
// data key (variable size which is a key_len)
// timestamp (sizeof uint64_t)
// data size (sizeof uint16_t)
// data (variable size = data size)
So I started like this, but I got confused and am now stuck:
// data layout
byte dataKeyType = 101;
byte dataKeyLength = 3;
// not sure how to represent key here
long timestamp = System.currentTimeMillis(); // which is 64 bit
short dataSize = 320; // what does this mean? it means size of data is 320 bytes?
// I am also confused about how to represent the data here; it can be any string data that can be converted to bytes
// and then make the final byte array out of that
How do I represent this in one byte array using ByteBuffer? A simple example would help me understand better.
byte keyType = 101;
byte keyLength = 3;
byte[] key = {27, // or whatever your key is
55,
111};
long timestamp = System.currentTimeMillis();
// If your data is just a string, then you could do the following.
// However, you will likely want to provide the getBytes() method
// with an argument that specifies which text encoding you are using.
// The default is just the current platform's default charset.
byte[] data = "your string data".getBytes();
short dataSize = (short) data.length;
int totalSize = (1 + 1 + keyLength + 8 + 2 + dataSize);
ByteBuffer bytes = ByteBuffer.allocate(totalSize);
bytes.put(keyType);
bytes.put(keyLength);
bytes.put(key);
bytes.putLong(timestamp);
bytes.putShort(dataSize);
bytes.put(data);
// If you want everything as a single byte array:
byte[] byteArray = bytes.array();
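To check that the layout round-trips, the encoding above can be paired with a decoder. The sketch below is an assumption-laden illustration: the class name KeyedRecord and its methods are invented, and it assumes big-endian order on both sides (the ByteBuffer default). Because the layout uses unsigned fields (a 1-byte key length, a uint16_t data size), the values read back are masked so that lengths above 127 or 32767 are not misread as negative numbers.

```java
import java.nio.ByteBuffer;

// Hypothetical value class for the layout described in the question.
final class KeyedRecord {
    final int keyType;
    final byte[] key;
    final long timestamp;
    final byte[] data;

    KeyedRecord(int keyType, byte[] key, long timestamp, byte[] data) {
        this.keyType = keyType;
        this.key = key;
        this.timestamp = timestamp;
        this.data = data;
    }

    // Layout: key type (1), key length (1), key, timestamp (8), data size (2), data.
    static byte[] encode(int keyType, byte[] key, long timestamp, byte[] data) {
        if (key.length > 0xFF || data.length > 0xFFFF) {
            throw new IllegalArgumentException("key or data too long for the length fields");
        }
        ByteBuffer out = ByteBuffer.allocate(1 + 1 + key.length + 8 + 2 + data.length);
        out.put((byte) keyType);
        out.put((byte) key.length);
        out.put(key);
        out.putLong(timestamp);
        out.putShort((short) data.length);
        out.put(data);
        return out.array(); // buffer is exactly full, so this is the whole record
    }

    static KeyedRecord decode(byte[] encoded) {
        ByteBuffer in = ByteBuffer.wrap(encoded);
        int keyType = in.get() & 0xFF;            // treat the byte as unsigned
        byte[] key = new byte[in.get() & 0xFF];   // unsigned 1-byte length
        in.get(key);
        long timestamp = in.getLong();
        byte[] data = new byte[in.getShort() & 0xFFFF]; // unsigned 2-byte length
        in.get(data);
        return new KeyedRecord(keyType, key, timestamp, data);
    }
}
```

If the C++ reader interprets the integers in little-endian order instead, both sides of this sketch would need bytes.order(ByteOrder.LITTLE_ENDIAN) on their buffers.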
You can use Java's DataOutputStream class to dynamically generate the byte array. For example:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
DataOutputStream dos = new DataOutputStream(baos);
dos.writeByte(keyType);
dos.writeByte(keyLength);
dos.write(new byte[] { 1, 2, 3, ..., key_len-1 }, 0, key_len);
dos.writeLong(System.currentTimeMillis());
dos.writeShort(320);
dos.write(new byte[] { 1, 2, 3, ..., 319 }, 0, 320);
You should replace the two new byte[] {} parts by the array that contains the key bytes and the array that contains the data, respectively.
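A complete version of the DataOutputStream approach might look like the sketch below. The class name StreamRecordWriter and its method are made up for the example; the key and data arrays are passed in rather than written inline. DataOutputStream always writes multi-byte values in big-endian order, so the result has the same byte layout as a default ByteBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
final class StreamRecordWriter {

    // Writes: key type (1), key length (1), key, timestamp (8), data size (2), data.
    static byte[] write(byte keyType, byte[] key, long timestamp, byte[] data) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeByte(keyType);
            dos.writeByte(key.length);   // only the low 8 bits are written
            dos.write(key);
            dos.writeLong(timestamp);
            dos.writeShort(data.length); // only the low 16 bits are written
            dos.write(data);
        } catch (IOException e) {
            // Not expected for an in-memory stream, but the API declares it.
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] record = write((byte) 101, new byte[] {27, 55, 111},
                System.currentTimeMillis(), new byte[320]);
        System.out.println(record.length);
    }
}
```

Compared with ByteBuffer, this does not require computing the total size up front, at the cost of one extra copy in toByteArray().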
I have a byte array avroBinaryValue, a schema name as the String schemaName, and a last-modified date as the long lastModifiedDate.
byte[] avroBinaryValue = os.toByteArray();
String schemaName = "DEMOGRAPHIC";
long lastModifiedDate = 1379811105109L;
Now I am planning to convert schemaName into a byte array as well. Let's name it byteSchemaName.
After that, I will convert lastModifiedDate to a byte array as well. Let's name that byteLMD.
Now, what is the best way to concatenate these three byte arrays together?
avroBinaryValue + byteSchemaName + byteLMD
Secondly, after concatenating these three byte arrays, I want to be able to split the result so that I get all three original byte arrays back correctly.
Is it possible to do that? Any help will be appreciated.
NOTE:
The values of all three byte arrays will differ from case to case. I am looking for the most efficient way to store the resulting byte array so that it doesn't take much space on disk. I don't want to serialize it again, since the avroBinaryValue I am getting already comes from Avro data serialization. So I want to convert the other two fields to byte arrays as well, so that I can merge all three into a single byte array.
You need to define a format. You have the following
byte[] avroBinaryValue = os.toByteArray();
String schemaName = "DEMOGRAPHIC";
long lastModifiedDate = 1379811105109L;
I guess avroBinaryValue can be variable length and so can schemaName. For all intents and purposes, lastModifiedDate fits in a long, i.e. 8 bytes.
If you want to serialize this (other than using Serializable), you'll have to use a specific format that will tell you what you are reading and when to stop reading it. For example
Offset     Length (in bytes)   Purpose
0          4                   length of avroBinaryValue array
4          X                   avroBinaryValue array
4+X        4                   length of schemaName byte array
4+X+4      Y                   schemaName byte array
4+X+4+Y    8                   value of lastModifiedDate
Also decide if you want big-endian or little-endian byte order.
So you write your three fields as described in the format and you read it the same way.
Here's an example done in memory where os is a String (for simplicity)
public static void main(String[] args) throws Exception {
String os = "whatever os is";
byte[] avroBinaryValue = os.getBytes();
String schemaName = "DEMOGRAPHIC";
long lastModifiedDate = 1379811105109L;
byte[] schemaNameBytes = schemaName.getBytes();
ByteArrayOutputStream byteOs = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(byteOs);
out.writeInt(avroBinaryValue.length);
out.write(avroBinaryValue);
out.writeInt(schemaNameBytes.length);
out.write(schemaNameBytes);
out.writeLong(lastModifiedDate);
// write done
byte[] allWrittenBytes = byteOs.toByteArray();
DataInputStream in = new DataInputStream(new ByteArrayInputStream(allWrittenBytes));
int sizeAvro = in.readInt();
avroBinaryValue = new byte[sizeAvro];
in.readFully(avroBinaryValue, 0, sizeAvro); // read() may return fewer bytes than requested
int sizeSchema = in.readInt();
schemaNameBytes = new byte[sizeSchema];
in.readFully(schemaNameBytes, 0, sizeSchema); // read() may return fewer bytes than requested
lastModifiedDate = in.readLong();
// read done
System.out.println(new String(avroBinaryValue));
System.out.println(new String(schemaNameBytes));
System.out.println(lastModifiedDate);
}
It prints
whatever os is
DEMOGRAPHIC
1379811105109
I understand you are trying to save space, but it might just be better to write each field to its own column or use a standard format like XML or JSON to serialize your fields.
Is this the recommended way to get the bytes from the ByteBuffer?
ByteBuffer bb =..
byte[] b = new byte[bb.remaining()];
bb.get(b, 0, b.length);
Depends what you want to do.
If what you want is to retrieve the bytes that are remaining (between position and limit), then what you have will work. You could also just do:
ByteBuffer bb =..
byte[] b = new byte[bb.remaining()];
bb.get(b);
which is equivalent as per the ByteBuffer javadocs.
Note that bb.array() doesn't honor the byte buffer's position, and the result can be even more misleading if the buffer you are working on is a slice of some other buffer.
For example:
byte[] test = "Hello World".getBytes("Latin1");
ByteBuffer b1 = ByteBuffer.wrap(test);
byte[] hello = new byte[6];
b1.get(hello); // "Hello "
ByteBuffer b2 = b1.slice(); // position = 0, string = "World"
byte[] tooLong = b2.array(); // Will NOT be "World", but will be "Hello World".
byte[] world = new byte[5];
b2.get(world); // world = "World"
Which might not be what you intend to do.
If you really do not want to copy the byte array, a workaround could be to use the backing array together with the byte buffer's arrayOffset() + position() as the start index and remaining() as the length, but this only works if the code consuming the bytes accepts an array with an offset and a length.
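As a sketch of that workaround, the remaining bytes of an array-backed buffer can be handed to any API that accepts an array, a start index and a length. The String constructor is used here as one such API; the class and method names are invented for the example. The start index is arrayOffset() plus position(), which matters when the buffer is a slice or has already been partly read.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class BackingArrayView {

    // Decodes the remaining bytes straight from the backing array,
    // without first copying them into an intermediate byte[].
    static String remainingAsLatin1(ByteBuffer buffer) {
        if (!buffer.hasArray()) {
            // Direct and read-only buffers do not expose a backing array.
            throw new IllegalArgumentException("buffer has no accessible backing array");
        }
        int start = buffer.arrayOffset() + buffer.position();
        return new String(buffer.array(), start, buffer.remaining(),
                StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        ByteBuffer b1 = ByteBuffer.wrap("Hello World".getBytes(StandardCharsets.ISO_8859_1));
        b1.get(new byte[6]);            // consume "Hello "
        ByteBuffer b2 = b1.slice();     // view of "World" over the same array
        System.out.println(remainingAsLatin1(b2));
    }
}
```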
As simple as that
private static byte[] getByteArrayFromByteBuffer(ByteBuffer byteBuffer) {
byte[] bytesArray = new byte[byteBuffer.remaining()];
byteBuffer.get(bytesArray, 0, bytesArray.length);
return bytesArray;
}
final ByteBuffer buffer;
if (buffer.hasArray()) {
final byte[] array = buffer.array();
final int arrayOffset = buffer.arrayOffset();
return Arrays.copyOfRange(array, arrayOffset + buffer.position(),
arrayOffset + buffer.limit());
}
// do something else
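One way to fill in the "do something else" branch is to read through a duplicate, which works for direct and read-only buffers and leaves the caller's position untouched. The following is a sketch under that assumption; the class name BufferBytes and the method remainingBytes are made up for the example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class BufferBytes {

    // Returns a copy of the bytes between position and limit.
    // The buffer's own position and limit are left unchanged.
    static byte[] remainingBytes(ByteBuffer buffer) {
        if (buffer.hasArray()) {
            int start = buffer.arrayOffset() + buffer.position();
            return Arrays.copyOfRange(buffer.array(), start, start + buffer.remaining());
        }
        // Direct or read-only buffer: a duplicate shares the content but has
        // its own position, so reading from it does not disturb the caller.
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.put(new byte[] {1, 2, 3, 4});
        direct.flip();
        System.out.println(Arrays.toString(remainingBytes(direct)));
    }
}
```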
If one does not know anything about the internal state of the given (Direct)ByteBuffer and wants to retrieve the whole content of the buffer, this can be used:
ByteBuffer byteBuffer = ...;
byte[] data = new byte[byteBuffer.capacity()];
((ByteBuffer) byteBuffer.duplicate().clear()).get(data);
This is a simple way to get a byte[], but part of the point of using a ByteBuffer is avoiding having to create a byte[]. Perhaps you can get whatever you wanted to get from the byte[] directly from the ByteBuffer.
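As a small illustration of reading from the ByteBuffer directly, the absolute getters take an index and do not move the position, so individual fields can be inspected without extracting a byte[] at all. The layout here (a 2-byte id at index 0 followed by an 8-byte timestamp) and the class name HeaderPeek are assumptions made for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class; the record layout is assumed for illustration.
final class HeaderPeek {

    // Reads the 2-byte id stored at index 0 without changing the position.
    static short id(ByteBuffer record) {
        return record.getShort(0);
    }

    // Reads the 8-byte timestamp stored right after the id.
    static long timestamp(ByteBuffer record) {
        return record.getLong(Short.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer record = ByteBuffer.allocate(Short.BYTES + Long.BYTES);
        record.putShort((short) 7);
        record.putLong(99L);
        System.out.println(id(record) + " " + timestamp(record));
    }
}
```

Absolute indices are relative to the start of the buffer itself, so the same calls work on a slice or a direct buffer.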