Mimic C++ in java structure and writing to byte[]

I am porting some C++ code over to Java, and in my particular instance, I am writing data to a byte[] to be written to a file. The first portion, as defined in C++, is a structure consisting of a uint and 3 ushorts. The second portion is the main part of the data, which I will just append to the end of the byte[] before I send it to the OutputStream.
My question is this: what is the simplest way to write the header values to the byte[]? I know I can put one value in there, then offset by the specific number of bytes, and repeat as necessary, but is this the best way to do it?
Also, how do I manage byte alignment? The C++ code appears to use the default values (4-byte?) for alignment.
Thanks,
Jason

You might find it easier to use ByteBuffer, which is probably the nicest way in Java to organize byte-by-byte output.
ByteBuffer doesn't directly care about alignment, though, and I don't know how C++ is aligning its output -- but in a pinch, you can just advance it manually.
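A minimal sketch of the ByteBuffer approach for the header described in the question. The class name, method name, and field names are hypothetical, and two things are assumed rather than known: that the C++ program ran on a little-endian machine (ByteBuffer defaults to big-endian), and that the header occupies exactly 10 bytes. A typical compiler would likely pad such a struct to 12 bytes, so if the C++ code wrote sizeof(struct) bytes, the two trailing padding bytes would have to be skipped as shown in the comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class HeaderWriter {
    // Assumed layout: one uint followed by three ushorts, 10 bytes, no padding.
    static final int HEADER_SIZE = 4 + 3 * 2;

    // id is passed as a long and a, b, c as ints so the full unsigned ranges fit.
    static byte[] buildMessage(long id, int a, int b, int c, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + payload.length);
        // Assumption: the C++ side used little-endian byte order.
        buf.order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt((int) id);      // the low 32 bits carry the unsigned value
        buf.putShort((short) a);   // the low 16 bits carry the unsigned value
        buf.putShort((short) b);
        buf.putShort((short) c);
        // If the C++ struct is padded, allocate the extra bytes and skip them here:
        // buf.position(buf.position() + 2);
        buf.put(payload);
        return buf.array();
    }
}
```

The returned array can be handed to the output stream in a single write call.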

Use a DataOutputStream that wraps a ByteArrayOutputStream:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
DataOutputStream stream = new DataOutputStream(byteArrayOutputStream);
try {
stream.writeInt(i);
stream.writeShort(s0);
stream.writeShort(s1);
stream.writeShort(s2);
stream.flush();
} catch (IOException e) {
// this can't happen, but you still require a try catch block
}
byte[] array = byteArrayOutputStream.toByteArray();
If your code might be parsing a similar byte array with unsigned integers, then you're going to have a bit of a headache. That is, you will have to check for negative numbers and deal with them appropriately.
For example:
int unsignedIntAsSignedInt = inStream.readInt();
long realData;
if (unsignedIntAsSignedInt < 0) {
realData = ((long) unsignedIntAsSignedInt) - (((long)Integer.MIN_VALUE) * 2);
} else {
realData = unsignedIntAsSignedInt;
}
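The same conversion can be written more compactly with a bit mask, which avoids the branch. The sketch below uses a hypothetical helper class; on Java 8 and later, Integer.toUnsignedLong and Short.toUnsignedInt do the same job.

```java
class UnsignedReads {
    // Widen a 32-bit value read with readInt() to its unsigned interpretation.
    static long toUnsignedInt(int raw) {
        return raw & 0xFFFFFFFFL;
    }

    // Widen a 16-bit value read with readShort() to its unsigned interpretation.
    static int toUnsignedShort(short raw) {
        return raw & 0xFFFF;
    }
}
```

For a stream, DataInputStream.readUnsignedShort() already returns the ushort as an int, so only the uint needs manual widening.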

Related

How to clone input stream but still re-use original

I am trying to copy the InputStream from a URLConnection, which returns a stream of type HttpInputStream (an inner class of HttpURLConnection).
In other cases, I can copy the original stream to a ByteArrayOutputStream and then use mark/reset on the original, but HttpInputStream does not support mark/reset.
Is there a way I can still copy the stream and reset the original or keep it from being consumed? The original stream inside URLConnection has to be readable because it is passed into another library. I only need to copy the stream so I can read the first two lines of data. Here is what I have for streams that support mark/reset:
InputStream input = null;
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try {
input = connection.getInputStream();
byte[] buffer = new byte[200];
input.mark(200);
int len = input.read(buffer);
input.reset();
baos.write(buffer, 0, len);
baos.flush();
String content = baos.toString("UTF-8");
//I set flags based on the value of content, but omitting here for the sake of simplicity.
} catch (IOException ex) {
//I do stuff here, but omitting for sake of simplicity in this
}

InputStreams are not generally cloneable, and not all streams support mark/reset. There are some possible workarounds within the standard JRE.
Wrap the InputStream in a BufferedInputStream. That one supports mark/reset within the limits of its buffer size, which lets you read a limited amount of data from the beginning and then reset the stream.
Another alternative is PushbackInputStream, which allows you to "unread" data previously read. You need to buffer the data to be pushed back yourself though, so it may be a bit inconvenient to handle.
If the whole stream isn't terribly big, you could also read the entire stream first, then construct as many ByteArrayInputStreams as needed from the pre-read data. This is only feasible if the data fits in the heap (and in a single byte array, i.e. less than approximately 2GB).
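A sketch of the BufferedInputStream option, assuming a 200-byte peek as in the question. StreamPeek and peek are hypothetical names. The wrapped stream, not the original one, is what must be handed to the other library afterwards.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class StreamPeek {
    // Reads up to limit bytes from the front of the stream, then rewinds it
    // so that the next reader sees the data from the beginning again.
    static String peek(BufferedInputStream in, int limit) throws IOException {
        in.mark(limit);
        byte[] head = new byte[limit];
        int total = 0;
        while (total < limit) {
            int n = in.read(head, total, limit - total);
            if (n == -1) {
                break; // the stream ended before limit bytes were available
            }
            total += n;
        }
        in.reset();
        return new String(head, 0, total, StandardCharsets.UTF_8);
    }
}
```

Typical use would be to wrap once, for example new BufferedInputStream(connection.getInputStream()), call peek on the wrapper, and then pass the same wrapper on.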

The Apache Commons IO library has a really nice TeeInputStream.
https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/input/TeeInputStream.html

What Java class allows to write to a file, both in Binary and ASCII?

I need to write files, with Headers in ASCII and values in Binary.
For now, I'm using this:
File file = new File("~/myfile");
FileOutputStream out = new FileOutputStream(file);
// Write in ASCII
out.write(("This is a header\n").getBytes());
// Write a byte[] is quite easy
byte[] buffer = new byte[4];
out.write(buffer, 0, 4);
// Write an int in binary gets complicated
out.write(ByteBuffer.allocate(4).putInt(6).array());
//Write a float in binary gets even more complicated
out.write(ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN)
.putFloat(4.5f).array());
The problem is that it's very slow (in terms of performance) to write that way, way slower than writing the values in ASCII actually. But it should be faster, since I'm writing less data.
I've looked at other Java classes, and it seems to me that they are either only for ASCII writing or only for binary writing.
Would you have any other suggestion for this problem?

You can use FileOutputStream to write binary. To include text you have to convert it to a byte[] before writing to the stream.
"The problem is that it's very long to write that way, way longer than writing the values in ASCII actually. But it should be shorter since I'm writing less data."
Mixing text and data is complex and error prone. The size of the data doesn't matter; rather, the complexity of the data is what is important. I suggest considering DataOutputStream if you want to keep things simple.
To perform your example you can do
DataOutputStream out = new DataOutputStream(
new BufferedOutputStream(
new FileOutputStream("~/myfile")));
// Write in ASCII
out.write("This is a header\n".getBytes());
// Write a 32-bit int
out.writeInt(6);
//Write a float in binary
out.writeFloat(4.5f);
out.flush(); // flush the buffer.

Writing Bits to a file using BitSet & FileOutputStream

I've run into a bit of a problem when it comes to writing specific bits to a file. I apologise if this is a duplicate of anything but I could not find a reasonable answer with the searches I ran.
I have a number of difficulties with the following:
Writing a header (Long) bit by bit (converted to a byte array so the FileOutputStream can utilise it) to the file.
Writing single bits to the file. For example, at one stage I am required to write a single bit set to 0 to the file so my initial thought would be to use a BitSet but Java seems to treat this as a null?
BitSet initialPadding = new BitSet();
initialPadding.set(0, false);
fileOutputStream.write(initialPadding.toByteArray());
1)
I create a FileOutputStream as shown below with the necessary file name:
FileOutputStream fileOutputStream = new FileOutputStream(file.getAbsolutePath());
I am attempting to create an ".amr" file so the first step before I perform any bit manipulation is to write a header to the beginning of the file. This has the following value:
Long defaultHeader = 0x2321414d520aL;
I've tried writing this to the file using the following method but I am pretty sure it does not write the correct result:
fileOutputStream.write(defaultHeader.byteValue());
Am I using the correct streams? Are my conversions completely wrong?
2)
I have a public BitSet fileBitSet; which has bits read in from a ".raw" file as the input. I need to be able to extract certain bits from the BitSet in order to write them to the file later. I do this using the following method:
public int getOctetPayloadHeader(int startPoint) {
int readLength = 0;
octetCMR = fileBitSet.get(0, 3);
octetRES = fileBitSet.get(4, 7);
if (octetRES.get(0, 3).isEmpty()) {
/* Keep constructing the payload header. */
octetFBit = fileBitSet.get(8, 8);
octetMode = fileBitSet.get(9, 12);
octetQuality = fileBitSet.get(13, 13);
octetPadding = fileBitSet.get(14, 15);
... }
What would be the best way to go for writing these bits to a file bearing in mind that I may be required to sometimes write a single bit or 81 bits at a particular offset in the fileBitSet ?

There is only one thing you can write to an OutputStream: bytes. You have to do the composing of your bits into bytes yourself; only you know the rules how the bits are to be put together into bytes.
As for stuff like:
Long defaultHeader = 0x2321414d520aL;
fileOutputStream.write(defaultHeader.byteValue());
You should take a close look at the javadocs for the methods you are using. byteValue() returns a single byte, so of course it's not doing what you expect. Working with streams is well explained in Oracle's tutorials: http://docs.oracle.com/javase/tutorial/essential/io/streams.html
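One way to get all six header bytes out of the long is to shift them out one at a time, most significant first. This is only a sketch; AmrHeader and its methods are hypothetical names. The value 0x2321414d520a happens to be the ASCII text "#!AMR" followed by a newline, so the second method produces the same bytes.

```java
import java.nio.charset.StandardCharsets;

class AmrHeader {
    // Returns the lowest count bytes of value, most significant byte first.
    static byte[] lowBytesBigEndian(long value, int count) {
        byte[] out = new byte[count];
        for (int i = 0; i < count; i++) {
            out[i] = (byte) (value >>> (8 * (count - 1 - i)));
        }
        return out;
    }

    // The same six bytes, spelled as the text they encode.
    static byte[] magic() {
        return "#!AMR\n".getBytes(StandardCharsets.US_ASCII);
    }
}
```

Either array can then be passed to fileOutputStream.write(byte[]), which writes every byte rather than just one.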
For writing single bits or groups of bits, you will need a custom OutputStream that handles grouping the bits into bytes to be written. That's commonly called a BitStream (there is no such class in the JDK); you have to either write it yourself (which I highly recommend, it's a very good exercise to teach you about bits and bytes) or find one on the web.
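A minimal sketch of such a bit writer follows. It is not a JDK class, and it makes two assumptions that the target format may not share: bits are packed most significant bit first, and an unfinished final byte is padded with zero bits on close.

```java
import java.io.IOException;
import java.io.OutputStream;

class BitOutputStream implements AutoCloseable {
    private final OutputStream out;
    private int current; // bits collected so far; the oldest bit is the highest
    private int count;   // number of bits held in current (0..7)

    BitOutputStream(OutputStream out) {
        this.out = out;
    }

    // Appends one bit; a full byte is written as soon as eight bits are collected.
    void writeBit(boolean bit) throws IOException {
        current = (current << 1) | (bit ? 1 : 0);
        if (++count == 8) {
            out.write(current);
            current = 0;
            count = 0;
        }
    }

    // Appends the lowest n bits of value, most significant of those bits first.
    void writeBits(long value, int n) throws IOException {
        for (int i = n - 1; i >= 0; i--) {
            writeBit(((value >>> i) & 1L) == 1L);
        }
    }

    // Pads the last partial byte with zero bits (an assumption), then closes.
    @Override
    public void close() throws IOException {
        while (count != 0) {
            writeBit(false);
        }
        out.close();
    }
}
```

With this, a single zero bit is writeBit(false), and a run of 81 bits taken from a BitSet is a loop calling writeBit(fileBitSet.get(i)) for each index.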

how to convert php unpack() in a similar method in Java

I've no coding experience in PHP at all. But while looking for a solution for my Java project, I found an example of the problem in PHP, which is alien to me.
Can anyone please explain the working and the result of the unpack('N*',"string") function of PHP and how to implement it in Java?
An example would help me a lot!
Thanks!

In PHP (and in Perl, where PHP copied it from), unpack("N*", ...) takes a string (actually representing a sequence of bytes) and parses each 4-byte segment of it as a 32-bit big-endian ("network byte order") integer, returning them in an array. The N format is defined as unsigned; a Java int holds the same 32 bits but interprets them as signed.
There are several ways to do the same in Java, but one way would be to wrap the input byte array in a java.nio.ByteBuffer, convert it to an IntBuffer and then read the integers from that:
public static int[] unpackNStar ( byte[] bytes ) {
// first, wrap the input array in a ByteBuffer:
ByteBuffer byteBuf = ByteBuffer.wrap( bytes );
// then turn it into an IntBuffer, using big-endian ("Network") byte order:
byteBuf.order( ByteOrder.BIG_ENDIAN );
IntBuffer intBuf = byteBuf.asIntBuffer();
// finally, dump the contents of the IntBuffer into an array
int[] integers = new int[ intBuf.remaining() ];
intBuf.get( integers );
return integers;
}
Of course, if you just want to iterate over the integers, you don't really need the IntBuffer or the array:
ByteBuffer buf = ByteBuffer.wrap( bytes );
buf.order( ByteOrder.BIG_ENDIAN );
while ( buf.remaining() >= 4 ) { // getInt() needs four bytes; a shorter tail is ignored
int num = buf.getInt();
// do something with num...
}
In fact, iterating over a ByteBuffer like this is a convenient way to emulate the behavior of even more complicated examples of unpack() in Perl or PHP.
(Disclaimer: I have not tested this code. I believe it should work, but it's always possible that I may have mistyped or misunderstood something. Please test before using.)
P.S. If you're reading the bytes from an input stream, you could also wrap it in a DataInputStream and use its readInt() method. Of course, it's also possible to use a ByteArrayInputStream to read the input from a byte array, achieving the same results as the ByteBuffer examples above.
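The stream-based variant might look like the sketch below (class and method names are hypothetical). DataInputStream always reads big-endian, which matches the N format. The loop relies on available(), which is exact for a ByteArrayInputStream but should not be trusted as an end-of-data test for network or file streams.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class UnpackWithStream {
    // Reads consecutive big-endian 32-bit integers; trailing bytes that do not
    // form a complete group of four are ignored.
    static List<Integer> readInts(byte[] bytes) throws IOException {
        List<Integer> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            while (in.available() >= 4) {
                result.add(in.readInt());
            }
        }
        return result;
    }
}
```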

How to initialize a ByteBuffer if you don't know how many bytes to allocate beforehand?

Is this:
ByteBuffer buf = ByteBuffer.allocate(1000);
...the only way to initialize a ByteBuffer?
What if I have no idea how many bytes I need to allocate..?
Edit: More details:
I'm converting one image file format to a TIFF file. The problem is the starting file format can be any size, but I need to write the data in the TIFF to little endian. So I'm reading the stuff I'm eventually going to print to the TIFF file into the ByteBuffer first so I can put everything in Little Endian, then I'm going to write it to the outfile. I guess since I know how long IFDs are, headers are, and I can probably figure out how many bytes in each image plane, I can just use multiple ByteBuffers during this whole process.

The types of places that you would use a ByteBuffer are generally the types of places that you would otherwise use a byte array (which also has a fixed size). With synchronous I/O you often use byte arrays; with asynchronous I/O, ByteBuffers are used instead.
If you need to read an unknown amount of data using a ByteBuffer, consider using a loop with your buffer and append the data to a ByteArrayOutputStream as you read it. When you are finished, call toByteArray() to get the final byte array.
Any time when you aren't absolutely sure of the size (or maximum size) of a given input, reading in a loop (possibly using a ByteArrayOutputStream, but otherwise just processing the data as a stream, as it is read) is the only way to handle it. Without some sort of loop, any remaining data will of course be lost.
For example:
final byte[] buf = new byte[4096];
int numRead;
// Use try-with-resources to auto-close streams.
try(
final FileInputStream fis = new FileInputStream(...);
final ByteArrayOutputStream baos = new ByteArrayOutputStream()
) {
while ((numRead = fis.read(buf)) > 0) {
baos.write(buf, 0, numRead);
}
final byte[] allBytes = baos.toByteArray();
// Do something with the data.
}
catch( final Exception e ) {
// Do something on failure...
}
If you instead wanted to write Java ints, or other things that aren't raw bytes, you can wrap your ByteArrayOutputStream in a DataOutputStream:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
DataOutputStream dos = new DataOutputStream(baos);
while (thereAreMoreIntsFromSomewhere()) {
int someInt = getIntFromSomewhere();
dos.writeInt(someInt);
}
byte[] allBytes = baos.toByteArray();
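DataOutputStream always writes big-endian, while the question needs little-endian output for the TIFF file. One possible way to keep the growable stream and still get little-endian bytes is to reverse each value before writing it. This is a sketch with hypothetical names, not the only option.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class LittleEndianWriter {
    // Encodes each int as four bytes, least significant byte first.
    static byte[] encode(int[] values) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);
        for (int v : values) {
            // writeInt emits big-endian, so swapping the bytes first
            // yields little-endian output.
            dos.writeInt(Integer.reverseBytes(v));
        }
        dos.flush();
        return baos.toByteArray();
    }
}
```

Short.reverseBytes and Long.reverseBytes cover the 16-bit and 64-bit cases in the same way.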

Depends.
Library
Converting file formats tends to be a solved problem for most problem domains. For example:
Batik can transcode between various image formats (including TIFF).
Apache POI can convert between office spreadsheet formats.
Flexmark can generate HTML from Markdown.
The list is long. The first question should be, "What library can accomplish this task?" If performance is a consideration, your time is likely better spent optimising an existing package to meet your needs than writing yet another tool. (As a bonus, other people get to benefit from the centralised work.)
Known Quantities
Reading a file? Allocate file.size() bytes.
Copying a string? Allocate string.length() bytes.
Copying a TCP packet? Allocate 1500 bytes, for example.
Unknown Quantities
When the number of bytes is truly unknown, you can do a few things:
Make a guess.
Analyze example data sets to buffer; use the average length.
Example
Java's StringBuffer, unless otherwise instructed, uses an initial buffer size to hold 16 characters. Once the 16 characters are filled, a new, longer array is allocated, and then the original 16 characters copied. If the StringBuffer had an initial size of 1024 characters, then the reallocation would not happen as early or as often.
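The capacities described above can be observed directly with capacity(). The class below is a hypothetical demo; how much the buffer grows once the 16 characters are exceeded is an implementation detail, so the sketch only reports it.

```java
class CapacityDemo {
    // Capacity of a StringBuffer created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuffer().capacity();
    }

    // Capacity of a StringBuffer created with an explicit initial size.
    static int presizedCapacity(int initialSize) {
        return new StringBuffer(initialSize).capacity();
    }

    // Capacity after appending the given number of characters to a default buffer.
    static int capacityAfterAppending(int chars) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < chars; i++) {
            sb.append('x');
        }
        return sb.capacity();
    }
}
```

Calling capacityAfterAppending(17) shows the reallocation: the result is larger than the default capacity.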
Optimization
Either way, this is probably a premature optimization. Typically you would allocate a set number of bytes when you want to reduce the number of internal memory reallocations that get executed.
It is unlikely that this will be the application's bottleneck.

The idea is that it's only a buffer - not the whole of the data. It's a temporary resting spot for data as you read a chunk, process it (possibly writing it somewhere else). So, allocate yourself a big enough "chunk" and it normally won't be a problem.
What problem are you anticipating?
