Is ByteBuf.arrayOffset useless? - java

I'm learning Netty in Action.
In chapter 5.2.2, ByteBuf usage patterns, there is a piece of code that confused me. It is shown below.
ByteBuf heapBuf = ...
if (heapBuf.hasArray()) {
    byte[] array = heapBuf.array();
    int offset = heapBuf.arrayOffset() + heapBuf.readerIndex();
    int length = heapBuf.readableBytes();
    handleArray(array, offset, length);
}
I wondered what the use case of the ByteBuf.arrayOffset() method is. The documentation for that method reads:
Returns the offset of the first byte within the backing byte array of this buffer.
Then, I looked up the arrayOffset() method in UnpooledHeapByteBuf.java which implements ByteBuf. The implementation for the method always just returns 0, as seen below.
@Override
public int arrayOffset() {
    return 0;
}
So, is ByteBuf#arrayOffset useless?

There may be other implementations of ByteBuf, and it is possible that they have a more useful or even complex implementation.
So for UnpooledHeapByteBuf, returning 0 works, but that does not mean there aren't other implementations of ByteBuf that need something different.
The method should do what the documentation states, and you can imagine other implementations having an offset different from 0, for example if they use something like a circular array as the backing byte array.
In that case the method needs to return the index where the current start pointer is located, not 0.
Picture such a circular array: the current start pointer sits at index 2 rather than at 0, and it moves around the array as the buffer is used.
And on the user side, if you want to use your ByteBuf safely, you should also use this method. You could get away without it when operating on an UnpooledHeapByteBuf, but even then you shouldn't, because the internal behaviour could change in a future version.
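To see why callers should not assume an offset of 0, here is a minimal, hypothetical sketch of a heap-buffer view that shares one larger array and starts part-way into it (the class and field names are made up for illustration; this is not a real Netty class):

class OffsetHeapView {
    private final byte[] shared;  // large array shared with other views
    private final int start;      // index where this view's data begins
    private final int length;

    OffsetHeapView(byte[] shared, int start, int length) {
        this.shared = shared;
        this.start = start;
        this.length = length;
    }

    byte[] array()      { return shared; } // the whole backing array
    int arrayOffset()   { return start;  } // first byte that belongs to this view
    int readableBytes() { return length; }
}

A caller that ignored arrayOffset() and started reading at index 0 would see bytes that belong to a different view.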

No, it's not useless at all, as it allows one huge byte array to back multiple ByteBuf instances. This is in fact done in PooledHeapByteBuf.
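As a hedged illustration (assuming Netty 4.x and its Unpooled helper; the exact slice type returned depends on the version), wrapping only a region of a larger array can give you a heap buffer whose arrayOffset() may not be 0, which is exactly why the book's pattern adds it to readerIndex():

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

public class ArrayOffsetDemo {
    public static void main(String[] args) {
        byte[] backing = new byte[32]; // one larger backing array
        for (int i = 0; i < backing.length; i++) {
            backing[i] = (byte) i;
        }
        // Wrap only a region of the array; the result is a view that may
        // report a non-zero arrayOffset().
        ByteBuf slice = Unpooled.wrappedBuffer(backing, 8, 16);
        if (slice.hasArray()) {
            int offset = slice.arrayOffset() + slice.readerIndex();
            int length = slice.readableBytes();
            // Copying must start at 'offset', not at 0, or we would read the wrong bytes.
            System.out.println("offset=" + offset + ", length=" + length);
        }
    }
}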

Related

How to return value from the function without exiting from the function?

I am adding byte arrays taken from an ArrayList of ArrayLists of byte arrays, so basically I am working with nested byte arrays. I am able to add the first element of each byte array, but I am unable to return it immediately. The function returns the whole byte array only once all indexes have been added, but I want to return the sum at each index separately.
public static byte[] final_stream(ArrayList<ArrayList<byte[]>> outerstream) {
    ArrayList<byte[]> streams = new ArrayList<byte[]>();
    int x = 0;
    while (x < outerstream.size()) {
        streams = new ArrayList<byte[]>();
        for (ArrayList<byte[]> bytes : outerstream) {
            streams.add(bytes.remove(0));
        }
        x++;
        return stream_Addr(streams); // Here I want to return the value
    }
} // Here it gives error to return byte[]
Your code is wrong on many levels; a quick (probably incomplete) list:
You violate Java naming conventions: method names go camelCase(), and so do variable names (unless they are constants); "_" is only used for a SOME_CONSTANT.
The term "stream" has a very special meaning in Java. A list is not a stream (but you can create a true Java stream from a list via yourList.stream()).
And yes, what you intend to do in that while loop is beyond my creativity to interpret. Honestly: throw it away and start from scratch.
Regarding your real question: every "exit" path of a non-void method needs to either throw an exception or return something.
Finally: what you intend to do isn't possible like that in Java. A caller calls a method, and that method returns one value and then ends.
What you can do, is something like:
thread A creates someList and passes that to some thread B somehow
thread B manipulates that list object, and by using appropriate synchronization the other thread can access that data (while B continues to make updates)
And the real answer is: you can't learn a new language by assuming it supports a concept you know from other languages (like Python's generators), inventing your own syntax or construct in the new language, and then being surprised that it doesn't work. It goes the other way round: you research whether your target language has such a concept; if not, you research what else is offered. Then you read a tutorial about that and follow it.
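As a hedged sketch of what the "every exit path returns" rule looks like here, one way to restructure the loop is to collect the per-round results and return them all at once. The helper streamAddr and the element-wise sum are assumptions standing in for the asker's stream_Addr, not the original code:

import java.util.ArrayList;
import java.util.List;

public class ByteArraySums {
    // Collect each round's result instead of trying to return from inside the loop.
    public static List<byte[]> finalStream(List<List<byte[]>> outer) {
        List<byte[]> results = new ArrayList<>();
        // Assumes 'outer' is non-empty and every inner list holds the same number of arrays.
        while (!outer.get(0).isEmpty()) {
            List<byte[]> round = new ArrayList<>();
            for (List<byte[]> bytes : outer) {
                round.add(bytes.remove(0)); // take the next array from each inner list
            }
            results.add(streamAddr(round)); // hypothetical stand-in for stream_Addr
        }
        return results; // single, well-defined exit path
    }

    // Hypothetical helper: element-wise sum of equally long byte arrays.
    private static byte[] streamAddr(List<byte[]> arrays) {
        byte[] sum = new byte[arrays.get(0).length];
        for (byte[] a : arrays) {
            for (int i = 0; i < sum.length; i++) {
                sum[i] += a[i];
            }
        }
        return sum;
    }
}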

Compact Java Externalization

I am trying to figure out a way to serialize simple Java objects (i.e. all the fields are primitive types) compactly, without the big header that normally gets added when you use writeExternal. It does not need to be super general, backwards compatible across versions, or anything like that; I just want it to work with ObjectOutputStreams (or something similar) and not add ~100 bytes to the size of each object I serialize.
More concretely, I have a class that has 3 members: a boolean flag and two longs. I should be able to represent this object in 17 bytes. Here is a simplified version of the code:
class Record implements Externalizable {
    boolean b;
    long id;
    long uid;

    public void writeExternal(ObjectOutput out) throws IOException {
        int size = 1 + 8 + 8; // I know, I know, but there's no sizeof
        ByteBuffer buff = ByteBuffer.allocate(size);
        if (b) {
            buff.put((byte) 1);
        } else {
            buff.put((byte) 0);
        }
        buff.putLong(id);
        buff.putLong(uid);
        out.write(buff.array(), 0, size);
    }
}
Elsewhere, these are stored by being passed into a method like the following:
public void store(Object value) throws IOException {
    ObjectOutputStream out = getStream();
    out.writeObject(value);
    out.close();
}
After I store just one of these objects in a file this way, the file has a size of 128 bytes (and 256 for two of them, so it's not amortized). Looking at the file, it is clear that it is writing a header similar to the one used in default serialization (which, for the record, uses about 376 bytes to store one of these). I can see that my writeExternal method is getting invoked (I put in some logging), so that isn't the problem. Is this just a fundamental limitation of the way ObjectOutputStream serializes things? Do I need to work on raw DataOutputStreams to get the kind of compactness I want?
[EDIT: In case anyone is wondering, I ended up using DataOutputStreams directly, which turned out to be easier than I'd feared]
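For reference, here is a minimal sketch of that DataOutputStream approach, writing exactly 1 + 8 + 8 = 17 bytes per record with no serialization header (the file name and method name are just examples):

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CompactRecordWriter {
    // Appends one record as 17 bytes: 1 for the flag, 8 for each long.
    public static void store(String file, boolean b, long id, long uid) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file, true))) {
            out.writeBoolean(b); // 1 byte
            out.writeLong(id);   // 8 bytes
            out.writeLong(uid);  // 8 bytes
        }
    }
}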

AccessViolationException on SByte[].Length

First of all, I took a look at other AccessViolationException problems here on SO, but mostly I didn't understand much because of terms like "marshalling", unsafe code, etc.
Context: I am trying to port some of the Java Netty code to C#. Maybe I have mixed up the two languages and don't see it anymore.
I just have two methods, one forwarding parameters to the other.
public override ByteBuf WriteBytes(sbyte[] src)
{
    WriteBytes(src, 0, src.Length);
    return this;
}

public override ByteBuf WriteBytes(sbyte[] src, int srcIndex, int length)
{
    SetBytes(WriterIndex, src, srcIndex, length); // AccessViolationException is thrown here
    return this;
}
Now, I'm testing the first method in my unit test, like this:
var byteArray = new sbyte[256];
for (int i = 0, b = sbyte.MinValue; b <= sbyte.MaxValue; i++, b++) // -128 ... 127
{
    byteArray[i] = (sbyte) b;
}
buffer.WriteBytes(byteArray); // this is the invocation
What I have found so far is that the problem seems to arise from the length parameter. I don't know, but maybe I'm not allowed to use src.Length in the first method and pass it to the second.
Also, please note that I am using sbyte, not byte. (I hope this doesn't matter.)
Does this have to do with pass-by-reference or pass-by-value of arrays?
Edit:
I found that the exception must be thrown somewhere in the depths of SetBytes. However, I believed that SetBytes was never called, because I set a breakpoint on the method's entry and the debugger never stopped there. I suspect the debugger isn't working properly, as it sometimes doesn't stop at the breakpoints I set. After I managed to step through the whole depths of SetBytes, the AccessViolationException was never thrown. I then RAN the test 10x and the exception didn't appear. Then I DEBUGGED the whole thing again and the exception appeared again.
Why is that??

Optimizing Java Array Copy

So for my research group I am attempting to convert some old C++ code to Java and am running into an issue where in the C++ code it does the following:
method(array+i, other parameters)
Now I know that Java does not support pointer arithmetic, so I got around this by copying the subarray from array+i to the end of array into a new array, but this causes the code to run horribly slowly (i.e. 100x slower than the C++ version). Is there a way to get around this? I saw someone mention a built-in method on here, but is that any faster?
Not only does your code become slower, it also changes the semantics of what is happening: when you make the call in C++, no array copying is done, so any change the method makes to the array happens in the original, not in a throw-away copy.
To achieve the same effect in Java change the signature of your function as follows:
void method(array, offset, other parameters)
Now the caller has to pass the position in the array that the method should consider the "virtual zero" of the array. In other words, instead of writing something like
for (int i = 0 ; i != N ; i++)
...
you would have to write
for (int i = offset ; i != offset+N ; i++)
...
This would preserve the C++ semantics of passing an array to a member function.
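As a minimal sketch of that signature change (the method, its body, and the variable names are placeholders, not the asker's actual code):

public class OffsetDemo {
    // Equivalent of the C++ call method(array + i, N): pass the array plus a start offset.
    static long sum(int[] array, int offset, int n) {
        long total = 0;
        for (int i = offset; i != offset + n; i++) { // indices are shifted by 'offset'
            total += array[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6};
        // No copy is made; the method simply starts reading at index 2.
        System.out.println(sum(data, 2, 3)); // prints 12 (3 + 4 + 5)
    }
}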
The C++ function probably relied on processing from the beginning of the array. In Java it should be configured to run from an offset into the array so the array doesn't need to be copied. Copying the array, even with System.arraycopy, would take a significant amount of time.
It could be defined as a Java method with something like this:
void method(<somearraytype> array, int offset, other parameters)
Then the method would start at the offset into the array, and it would be called something like this:
method(array, i, other parameters);
If you wish to pass a sub-array to a method, an alternative to copying the sub-array into a new array would be to pass the entire array with an additional offset parameter that indicates the first relevant index of the array. This would require changes in the implementation of method, but if performance is an issue, that's probably the most efficient way.
The right way to handle this is to refactor the method, to take signature
method(int[] array, int i, other parameters)
so that you pass the whole array (by reference), and then tell the method where to start its processing from. Then you don't need to do any copying.

Hit-Count (reads) of an array in Java

For evaluating an algorithm I have to count how often the items of a byte-array are read/accessed. The byte-array is filled with the contents of a file and my algorithm can skip over many of the bytes in the array (like for example the Boyer–Moore string search algorithm). I have to find out how often an item is actually read. This byte-array is passed around to multiple methods and classes.
My ideas so far:
1) Increment a counter at each spot where the byte array is read. This seems error-prone, since there are many such spots. Additionally, I would have to remove this code afterwards so that it does not influence the runtime of my algorithm.
2) Use an ArrayList instead of a byte array and override its "get" method. Again, there are a lot of methods that would have to be modified, and I suspect there would be a performance loss.
3) Can I somehow use the Eclipse debug mode? I see that I can specify a hit count for watchpoints, but it does not seem to be possible to output the hit count?!
4) Can the Reflection API maybe help me somehow?
5) Somewhat like 2), but in order to reduce the effort: can I make a Java method accept an ArrayList where it expects an array, such that it transparently calls the "get" method whenever an item is read?
There might be an out-of-the-box solution but I'd probably just wrap the byte array in a simple class.
public class ByteArrayWrapper {
    private byte[] bytes;
    private long readCount = 0;

    public ByteArrayWrapper(byte[] bytes) {
        this.bytes = bytes;
    }

    public int getSize() { return bytes.length; }

    public byte getByte(int index) { readCount++; return bytes[index]; }

    public long getReadCount() { return readCount; }
}
Something along these lines. Of course this does influence the running time, but not very much. You could try it and time the difference; if you find it is significant, we'll have to find another way.
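A hedged usage sketch of the wrapper above (the algorithm shown is just a stand-in for whatever search you are evaluating):

public class HitCountDemo {
    public static void main(String[] args) {
        byte[] data = "the quick brown fox".getBytes();
        ByteArrayWrapper wrapped = new ByteArrayWrapper(data);

        // Stand-in algorithm: count occurrences of a single byte.
        int hits = 0;
        for (int i = 0; i < wrapped.getSize(); i++) {
            if (wrapped.getByte(i) == 'o') { // every access goes through getByte()
                hits++;
            }
        }
        System.out.println("matches = " + hits);
        System.out.println("array reads = " + wrapped.getReadCount());
    }
}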
The most efficient way to do this is to add some code injection. However, this is likely to be much more complicated to get right than writing a wrapper for your byte[] and passing that around (tedious, but at least the compiler will help you). If you use a wrapper which does basically nothing (no counting), it will be almost as efficient as not using a wrapper, and when you want counting you can use an implementation which does that.
You could use EHCache without too much overhead: implement an in-memory cache, keyed by array index. EHCache provides an API which will let you query hit rates "out of the box".
There's no way to do this automatically with a real byte[]. Using JVM TI might help here, but I suspect it's overkill.
Personally I'd write a simple wrapper around the byte[] with methods to read() and write() specific fields. Those methods can then track all accesses (either individually for each byte, or as a total or both).
Of course this would require the actual access to be modified, but if you're testing some algorithms that might not be such a big drawback. The same goes for performance: it will definitely suffer a bit, but the effect might be small enough not to worry about it.
