Write from Java generic T[] to DataOutput stream? - java

Given a generic array T[], where T extends java.lang.Number, I would like to write the array to a byte[], using ByteArrayOutputStream. java.io.DataOutput (and an implementation such as java.io.DataOutputStream) appears close to what I need, but there is no generic way to write the elements of the T[] array. I want to do something like
ByteArrayOutputStream out = new ByteArrayOutputStream();
DataOutputStream dataOut = new DataOutputStream(out);
for (T v : getData()) {
dataOut.write(v); // <== uh, oh
}
but there is no generic <T> void write(T v) method on DataOutput.
Is there any way to avoid having to write a whole bunch of instanceof spaghetti?
Clarification
The byte[] is being sent to a non-Java client, so object serialization isn't an option. I need, for example, the byte[] generated from a Float[] to be a valid float[] in C.

No, there isn't. The instanceof "spaghetti" would have to exist somewhere anyway. Make a generic method that does that:
public <T> void write(DataOutputStream stream, T object) {
// instanceofs and writes here
}
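As a rough sketch of what that method could look like (the class name NumberWriter and the helper toBytes are made up for illustration, and only the common boxed types are handled), the dispatch might be written like this. It assumes the in-memory stream never fails, so IOException is rethrown unchecked:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper; the names are assumptions, not part of any library.
final class NumberWriter {
    // Dispatches on the runtime type and calls the matching DataOutputStream method.
    static void write(DataOutputStream out, Number n) throws IOException {
        if (n instanceof Integer) {
            out.writeInt(n.intValue());
        } else if (n instanceof Long) {
            out.writeLong(n.longValue());
        } else if (n instanceof Float) {
            out.writeFloat(n.floatValue());
        } else if (n instanceof Double) {
            out.writeDouble(n.doubleValue());
        } else if (n instanceof Short) {
            out.writeShort(n.shortValue());
        } else if (n instanceof Byte) {
            out.writeByte(n.byteValue());
        } else {
            throw new IllegalArgumentException("Unsupported type: " + n.getClass());
        }
    }

    // Writes every element of the array and returns the resulting bytes.
    static <T extends Number> byte[] toBytes(T[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (T v : data) {
                write(out, v);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }
}
```

Keep in mind that DataOutputStream always writes multi-byte values in big-endian order, so a C client on a little-endian machine would have to swap bytes before using them.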

You can just use an ObjectOutputStream instead of a DataOutputStream, since all Numbers are guaranteed to be serializable.

Regarding the last edit, I would try this approach (ugly or not):
1) Use instanceof to check which type you have.
2) Store the value in a primitive and extract the bytes you need, like this (for the two low-order bytes of an int i):
byte[] bytes = new byte[2];
bytes[0]=(byte)(i>>8);
bytes[1]=(byte)i;
3) Send it via the byte[] array.
4) Get stuck, because different C implementations use different numbers of bytes for an integer, so nobody can guarantee that the results will equal your initial numbers. For example, how do you want to handle Java's 4-byte int on a platform where C's int is 2 bytes? How do you handle Long?
So... I don't see a way to do it, but I'm not an expert in this area.
Please correct me if I'm wrong. ;-)
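One way around the width problem in point 4 is to agree on fixed widths and a byte order with the client (on the C side, fixed-width types such as int32_t from <stdint.h>). For the Float[] case from the question, a minimal sketch could use ByteBuffer with an explicit ByteOrder. The class name FloatPacker is invented here, and the sketch assumes the C client's float is a 4-byte IEEE 754 value, which is common but not guaranteed by the C standard:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only.
final class FloatPacker {
    // Packs each Float as a 4-byte IEEE 754 value in the requested byte order.
    static byte[] pack(Float[] data, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(data.length * Float.BYTES).order(order);
        for (Float f : data) {
            buf.putFloat(f); // auto-unboxing; a null element would throw NullPointerException
        }
        return buf.array();
    }
}
```

For a typical little-endian client you would call FloatPacker.pack(values, ByteOrder.LITTLE_ENDIAN).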

Related

Java - ByteBuffer or ArrayList<Byte>?

Recently I created a wrapper to read and write data into a byte array. To do it, I've been using an ArrayList<Byte>, but I was wondering if this is the most efficient way to do it, because:
addAll() doesn't work with byte arrays (even using Arrays.asList(), which gives me a List<byte[]>). To work around it I'm just looping and adding one byte per iteration, but I suppose this means a lot of function calls and so has a performance cost.
The same happens when getting a byte[] from the ArrayList. I can't cast from Byte[] to byte[], so I have to use a loop for it.
I suppose storing Byte instead of byte uses more memory.
I know ByteArrayInputStream and ByteArrayOutputStream could be used for this, but they have some drawbacks:
I wanted to implement methods for reading different data types in different byte orders (for example, readInt, readLEInt, readUInt, etc.), while those classes can only read / write a byte or a byte array. This isn't really a problem because I could fix that in the wrapper. But here comes the second problem.
I wanted to be able to write and read at the same time because I'm using this to decompress some files. So to create a wrapper for it I would need to include both a ByteArrayInputStream and a ByteArrayOutputStream. I don't know if those could be synchronized in some way or if I'd have to write the entire data of one to the other each time I wrote to the wrapper.
And so, here comes my question: would using a ByteBuffer be more efficient? I know you can take integers, floats, etc. from it, and can even change the byte order. What I was wondering is whether there is a real performance difference between using a ByteBuffer and an ArrayList<Byte>.
Definitely ByteBuffer or ByteArrayOutputStream. In your case ByteBuffer seems fine. Have a look at the Javadoc, as it has nice methods. For putInt/getInt and such, you might want to set the byte order (of those 4 bytes):
byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
With files you could use getChannel() or variants and then use a MappedByteBuffer.
A ByteBuffer may wrap a byte array, or allocate.
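To illustrate, the readInt / readLEInt / readUInt methods mentioned in the question could be sketched on top of a wrapped ByteBuffer roughly as follows. The class name ByteReader is an assumption made for this example; the method names are taken from the question:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical wrapper; only the ByteBuffer and ByteOrder calls are real API.
final class ByteReader {
    private final ByteBuffer buf;

    ByteReader(byte[] data) {
        buf = ByteBuffer.wrap(data); // no copy: the buffer shares the array
    }

    // Reads a signed 32-bit big-endian integer and advances the position.
    int readInt() {
        return buf.order(ByteOrder.BIG_ENDIAN).getInt();
    }

    // Reads a signed 32-bit little-endian integer and advances the position.
    int readLEInt() {
        return buf.order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    // Reads an unsigned 32-bit big-endian integer, widened to long.
    long readUInt() {
        return readInt() & 0xFFFFFFFFL;
    }
}
```

Since getInt() throws BufferUnderflowException when fewer than four bytes remain, a real wrapper would probably want to check buf.remaining() first.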
Keep in mind that every object has overhead associated with it including a bit of memory per object and garbage collection once it goes out of scope.
Using List<Byte> would mean creating / garbage collecting an object per byte which is very wasteful.
ByteBuffer is a wrapper class around a byte array. It doesn't have a dynamic size like ArrayList, but it consumes less memory per byte and is faster.
If you know the size you need, use ByteBuffer. If you don't, you could use ByteArrayOutputStream (maybe wrapped in a DataOutputStream or an ObjectOutputStream, both of which have methods to write different kinds of data). To read the data you have written to a ByteArrayOutputStream, you can extend ByteArrayOutputStream and access the fields buf[] and count. Those fields are protected, so they are accessible from a subclass. It looks like this:
public class ByteArrayOutputStream extends OutputStream {
/**
* The buffer where data is stored.
*/
protected byte buf[];
/**
* The number of valid bytes in the buffer.
*/
protected int count;
...
}
public class ReadableBAOS extends ByteArrayOutputStream {
public byte readByte(int index) {
// only indices 0 .. count-1 hold valid data
if (index < 0 || index >= count) {
throw new IndexOutOfBoundsException();
}
return buf[index];
}
}
This way you can add methods to your subclass that read bytes from the underlying buffer without copying its content each time, as the toByteArray() method does.

Compact Java Externalization

I am trying to figure out a way to serialize simple Java objects (ie all the fields are primitive types) compactly, without the big header that normally gets added on when you use writeExternal. It does not need to be super general, backwards compatible across versions, or anything like that, I just want it to work with ObjectOutputStreams (or something similar) and not add ~100 bytes to the size of each object I serialize.
More concretely, I have a class that has 3 members: a boolean flag and two longs. I should be able to represent this object in 17 bytes. Here is a simplified version of the code:
class Record implements Externalizable {
boolean b;
long id;
long uid;
public void writeExternal(ObjectOutput out) throws IOException {
int size = 1 + 8 + 8; //I know, I know, but there's no sizeof
ByteBuffer buff = ByteBuffer.allocate(size);
if (b) {
buff.put((byte) 1);
} else {
buff.put((byte) 0);
}
buff.putLong(id);
buff.putLong(uid);
out.write(buff.array(), 0, size);
}
}
Elsewhere, these are stored by being passed into a method like the following:
public void store(Object value) throws IOException {
ObjectOutputStream out = getStream();
out.writeObject(value);
out.close();
}
After I store just one of these objects in a file this way, the file has a size of 128 bytes (and 256 for two of them, so it's not amortized). Looking at the file, it is clear that it is writing a header similar to the one used in default serialization (which, for the record, uses about 376 bytes to store one of these). I can see that my writeExternal method is getting invoked (I put in some logging), so that isn't the problem. Is this just a fundamental limitation of the way ObjectOutputStream serializes things? Do I need to work on raw DataOutputStreams to get the kind of compactness I want?
[EDIT: In case anyone is wondering, I ended up using DataOutputStreams directly, which turned out to be easier than I'd feared]
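The asker's final code isn't shown, but a DataOutputStream-based version of the three-field record could look something like the sketch below. The class name CompactRecord and its methods are invented for illustration; the point is only that writeBoolean plus two writeLong calls produce 1 + 8 + 8 = 17 bytes with no header:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the Record class in the question.
final class CompactRecord {
    final boolean b;
    final long id;
    final long uid;

    CompactRecord(boolean b, long id, long uid) {
        this.b = b;
        this.id = id;
        this.uid = uid;
    }

    // Writes exactly 17 bytes: no class descriptor, no stream header.
    void writeTo(DataOutput out) throws IOException {
        out.writeBoolean(b); // 1 byte
        out.writeLong(id);   // 8 bytes
        out.writeLong(uid);  // 8 bytes
    }

    // Reads the fields back in the same order they were written.
    static CompactRecord readFrom(DataInput in) throws IOException {
        boolean b = in.readBoolean();
        long id = in.readLong();
        long uid = in.readLong();
        return new CompactRecord(b, id, uid);
    }

    // Convenience wrappers around in-memory streams.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream(17);
            DataOutputStream out = new DataOutputStream(bytes);
            writeTo(out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected in memory
        }
    }

    static CompactRecord fromBytes(byte[] raw) {
        try {
            return readFrom(new DataInputStream(new ByteArrayInputStream(raw)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that the reader has to know the record layout in advance, since nothing in the stream describes it.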

Convert OutputStream to ByteArrayOutputStream

I am trying to convert an OutputStream to a ByteArrayOutputStream. I was unable to find any clear, simple answers on how to do this. This question was asked in the title of another question on StackOverflow, but the body of that question asked how to change a ByteArrayStream to an OutputStream. I have an OutputStream that is already created, and the example given in the answer there will not compile!
That Question is Here
I have an OutputStream called waveHeader that is already constructed and has a length of 44 bytes. I want to convert it to a ByteArrayOutputStream because I want to be able to turn it into a byte[] with waveHeader.toByteArray(), for simplicity in later processing.
Is there a simple type of casting or something that will allow this?
If not then:
Is there a way to construct a pointer to the data in the original OutputStream if it is not possible to convert it?
How would someone go about accessing the data that is contained in the OutputStream?
I am new to Java. This is just a hobby for me. Streams in Visual Basic .NET were much easier!
There are multiple possible scenarios:
a) You have a ByteArrayOutputStream, but it was declared as OutputStream. Then you can do a cast like this:
void doSomething(OutputStream os)
{
// fails with ClassCastException if it is not a BOS
ByteArrayOutputStream bos = (ByteArrayOutputStream)os;
...
b) If you have any other type of output stream, it does not really make sense to convert it to a BOS. (You typically want to cast it because you want to access the result array.) So in this case you simply set up a new stream and use it.
void doSomething(OutputStream os)
{
ByteArrayOutputStream bos = new ByteArrayOutputStream();
bos.write(something);
bos.close();
byte[] arr = bos.toByteArray();
// what do you want to do?
os.write(arr); // or: bos.writeTo(os);
...
c) If you have written something to any kind of OutputStream (which you do not know what it is, for example because you get it from a servlet), there is no way to get that information back. You must not write something you need later. A solution is the answer b) where you write it in your own stream, and then you can use the array for your own purpose as well as writing it to the actual output stream.
Keep in mind that a ByteArrayOutputStream keeps all of its data in memory.
You could use the writeTo method of ByteArrayOutputStream.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] bytes = new byte[8];
bos.write(bytes);
bos.writeTo(oos); // oos is the OutputStream you already have
You can create an instance of ByteArrayOutputStream and write the data to it. Then call its writeTo method, which accepts an OutputStream, to have the ByteArrayOutputStream write its contents to the OutputStream instance you pass as the argument.
Hope it works!
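Put together as a self-contained sketch (the class and method names are invented for this example), the idea is to build the bytes in memory first, copy them to the real stream with writeTo, and keep the array for later use:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class BufferedHeaderWriter {
    // Writes payload to target and also returns the bytes that were written.
    static byte[] writeAndKeep(byte[] payload, OutputStream target) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            bos.write(payload);         // collect the data in memory first
            bos.writeTo(target);        // then copy it to the real stream
            return bos.toByteArray();   // and keep a copy for later processing
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the 44-byte wave header in the question, this would mean writing the header into the ByteArrayOutputStream in the first place, rather than trying to read it back out of an arbitrary OutputStream afterwards.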
You can call the toByteArray() method on the output stream you have, provided it is a ByteArrayOutputStream. That is, if your stream is called buffer, you can call buffer.toByteArray().
For more, you can look at the answers to "Convert InputStream to byte array in Java".

Why does serializing Integer take so many (81) bytes?

I wrote a small test program to show how many bytes we need to serialize Integer object:
ByteArrayOutputStream data = new ByteArrayOutputStream();
try {
ObjectOutputStream output = new ObjectOutputStream(data);
output.writeObject(1);
output.flush();
System.out.println(data.toByteArray().length);
} catch (IOException e) {
e.printStackTrace();
}
However, the result is surprising: it takes 81 bytes. If I serialize the String "1", it only takes 8 bytes instead. I know Java has an optimization for String serialization, but why not do the same thing for Integer? I think it shouldn't be very difficult.
Or does anyone have a workaround? I need a method which can serialize everything, including objects and basic types. Thanks for your answers!
It's a balancing act between making the serialization protocol more complicated by having direct support for lots of types, and keeping it simple.
In my experience, Integer values are relatively rare compared with int values - and the latter does have built-in support, along with all the other primitive types. It's also worth noting that although serializing a single Integer object is expensive, the incremental cost is much smaller, because there's already a reference to the class in the stream. So after the first Integer has been written, a new Integer only takes 10 bytes - and a reference to an Integer which has already been written to the stream (common if you're boxing small values) is only 5 bytes.
Personally I would try to avoid native Java binary serialization anyway - it's platform specific and very brittle, as well as not being terribly compact. I like Protocol Buffers but there are lots of other alternatives available too.
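If you want to check the incremental cost on your own JVM, a small measurement sketch along these lines may help. The class name SerializedSize is made up, and no particular byte counts are asserted here beyond the fact that DataOutputStream.writeInt writes 4 bytes per value:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical measurement helper for illustration only.
final class SerializedSize {
    // Size of one ObjectOutputStream containing all the given objects.
    static int objectStreamSize(Object... values) {
        try {
            ByteArrayOutputStream data = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(data);
            for (Object v : values) {
                out.writeObject(v);
            }
            out.flush();
            return data.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Size of the same values written as raw ints.
    static int dataStreamSize(int... values) {
        try {
            ByteArrayOutputStream data = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(data);
            for (int v : values) {
                out.writeInt(v);
            }
            out.flush();
            return data.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int one = objectStreamSize(1);
        int two = objectStreamSize(1, 1000);
        System.out.println("first Integer: " + one + " bytes");
        System.out.println("second Integer adds: " + (two - one) + " bytes");
        System.out.println("raw int: " + dataStreamSize(1) + " bytes");
    }
}
```

The second Integer should add far fewer bytes than the first, because the class description is written only once per stream.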

Is it possible to use struct-like constructs in Java?

I'm considering using Java for a large project but I haven't been able to find anything that remotely represented structures in Java. I need to be able to convert network packets to structures/classes that can be used in the application.
I know that it is possible to use RandomAccessFile but this way is NOT acceptable. So I'm curious if it is possible to "cast" a set of bytes to a structure like I could do in C. If this is not possible then I cannot use Java.
So the question I'm asking is if it is possible to cast aligned data to a class without any extra effort beyond specifying the alignment and data types?
No. You cannot cast an array of bytes to a class object.
That being said, you can use a java.nio.ByteBuffer and easily extract the fields you need into an object like this:
class Packet {
private final int type;
private final float data1;
private final short data2;
public Packet(byte[] bytes) {
ByteBuffer bb = ByteBuffer.wrap(bytes);
bb.order(ByteOrder.BIG_ENDIAN); // or LITTLE_ENDIAN
type = bb.getInt();
data1 = bb.getFloat();
data2 = bb.getShort();
}
}
You're basically asking whether you can use a C-specific solution to a problem in another language. The answer is, predictably, 'no'.
However, it is perfectly possible to construct a class that takes a set of bytes in its constructor and constructs an appropriate instance.
class Foo {
int someField;
String anotherField;
public Foo(byte[] bytes) {
someField = someFieldFromBytes(bytes);
anotherField = anotherFieldFromBytes(bytes);
// etc.
}
}
You can ensure there is a one-to-one mapping of class instances to byte arrays. Add a toBytes() method to serialize an instance into bytes.
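As a minimal sketch of that round trip, here is a made-up Header class with an int and a short field; the class, its fields and the choice of big-endian order are all assumptions for the sake of the example:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical struct-like class; the layout is 4 bytes of type then 2 bytes of length.
final class Header {
    static final int SIZE = Integer.BYTES + Short.BYTES;

    final int type;
    final short length;

    Header(int type, short length) {
        this.type = type;
        this.length = length;
    }

    // Decodes the fields in the same order toBytes() writes them.
    Header(byte[] bytes) {
        ByteBuffer bb = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        type = bb.getInt();
        length = bb.getShort();
    }

    // Encodes the instance into exactly SIZE bytes.
    byte[] toBytes() {
        return ByteBuffer.allocate(SIZE)
                .order(ByteOrder.BIG_ENDIAN)
                .putInt(type)
                .putShort(length)
                .array();
    }
}
```

With this, new Header(h.toBytes()) gives back an instance with the same field values as h.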
No, you cannot do that. Java simply doesn't have the same concepts as C.
You can create a class that behaves much like a struct:
public class Structure {
public int field1;
public String field2;
}
and you can have a constructor that takes an array of bytes or a DataInput to read the bytes:
public class Structure {
...
public Structure(byte[] data) throws IOException {
this(new DataInputStream(new ByteArrayInputStream(data)));
}
// readInt() and readUTF() can throw IOException, so the constructors declare it
public Structure(DataInput in) throws IOException {
field1 = in.readInt();
field2 = in.readUTF();
}
}
then read bytes off the wire and pump them into Structures:
byte[] bytes = network.read();
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(bytes));
Structure structure1 = new Structure(stream);
Structure structure2 = new Structure(stream);
...
It's not as concise as C, but it's pretty close. Note that DataInput always reads multi-byte values in big-endian (network) byte order, regardless of the host platform, so you don't have to muck around with the machine's endianness yourself; that's definitely a benefit over C.
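If the peer sends little-endian data instead, one option is to read the value with DataInput and reverse its bytes. A small sketch, with the class name EndianDemo invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; only the JDK calls inside are real API.
final class EndianDemo {
    // DataInputStream.readInt() interprets the four bytes as big-endian.
    static int readBigEndianInt(byte[] data) {
        try {
            return new DataInputStream(new ByteArrayInputStream(data)).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // For little-endian input, read as big-endian and swap the byte order.
    static int readLittleEndianInt(byte[] data) {
        return Integer.reverseBytes(readBigEndianInt(data));
    }
}
```

Long.reverseBytes and Short.reverseBytes do the same for the other integer widths.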
As Joshua says, serialization is the typical way to do these kinds of things. However, there are other binary formats such as MessagePack, Protocol Buffers, and Avro.
If you want to play with the bytecode structures, look at ASM and CGLIB; these are very common in Java applications.
There is nothing which matches your description.
The closest thing to a struct in Java is a simple class which holds values that are accessible either through its fields or through set/get methods.
The typical means to convert between Java class instances and on-the-wire representations is Java serialization which can be heavily customized as need be. It is what is used by Java's Remote Method Invocation API and works extremely well.
ByteBuffer.wrap(new byte[] {}).getDouble(); // the array must hold at least 8 bytes, otherwise this throws BufferUnderflowException
No, this is not possible. You're trying to use Java like C, which is bound to cause complications. Either learn to do things the Java way, or go back to C.
In this case, the Java way would probably involve DataInputStream and/or DataOutputStream.
You cannot cast an array of bytes to an instance of a class.
But you can do much, much more with Java.
Java has a strong and very flexible built-in serialization mechanism. This is what you need. You can read and write objects to and from a stream.
If both sides are written in Java, there is no problem at all. If one of the sides is not Java, you can customize your serialization. Start by reading the Javadoc of java.io.Serializable.
