FileOutputStream: Something That I Am Missing Out? - java

I have a program that reads 2 KB of data at a time from a binary file, adds a header to it, and then writes it to a new file.
The code is:
try {
FileInputStream fis = new FileInputStream(bin);
FileOutputStream fos = new FileOutputStream(bin.getName().replace(".bin", ".xyz"));
DataOutputStream dos=new DataOutputStream(fos);
fos.write(big, 0, big.length);
for (int n = 1; n <= pcount; n++) {
fis.read(file, mark, 2048);
mark = mark + 2048;
prbar.setValue(n);
prbar.setString("Converted packets:" + String.valueOf(n));
metas = "2048";
meta = metas.getBytes();
pc = String.valueOf(file.length).getBytes();
nval = String.valueOf(n).getBytes();
System.arraycopy(pc, 0, bmeta, 0, pc.length);
System.arraycopy(meta, 0, bmeta, 4, meta.length);
System.arraycopy(nval, 0, bmeta, 8, nval.length);
fos.write(bmeta, 0, bmeta.length);
fos.flush();
fos.write(file, 0, 2048);
fos.flush();
}
}catch (Exception ex) {
erlabel.setText(ex.getMessage());
}
First it should write the header and then the file data. But the output file is full of data that does not belong to the file; it is writing garbage. What might be the problem?

It's not quite clear with some of the declarations missing, but it looks like your problem is with the fis.read() call: the second argument is an offset into the byte array, not into the file (a common mistake).
You probably want to use relative reads. You also need to check the return value from .read() to see how many bytes were actually read, before writing the buffer out.
The common idiom is:
InputStream is = ...
OutputStream os = ...
byte[] buf = new byte[2048];
int len;
while((len = is.read(buf)) != -1)
os.write(buf, 0, len);
is.close();
os.close();
Edit
That's a pretty weird way of writing out your metadata. I assume that's what the (unused) DataOutputStream is for?
You don't need to keep flushing the output stream, just close it when you're done.

In addition to what @Dmitri has pointed out, there is something seriously wrong with the way you are writing the metadata.
You are writing the metadata every time around the loop, which cannot be right.
You are essentially allocating 4 bytes for it, via "2048".getBytes(), then copying many more than 4 bytes into it, then writing the 4 bytes. This cannot be right either; in fact it should really be throwing ArrayIndexOutOfBoundsExceptions at you.
It looks as though the metadata is supposed to contain three binary integers. However you are putting String data into it. I suspect you should be using DataOutputStream.writeInt() directly for these fields, without all the String.valueOf()/getBytes() and System.arraycopy() nonsense.
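A minimal sketch of the writeInt() suggestion. It assumes the header really is three 32-bit integers (total size, payload size, packet number), which the question does not confirm. PacketWriter, writePacket and HEADER_SIZE are made-up names for illustration only.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class PacketWriter {
    static final int HEADER_SIZE = 12; // three 4-byte ints (assumed layout)

    // Writes a 12-byte big-endian header followed by 'len' bytes of payload.
    static void writePacket(OutputStream out, int totalSize, int packetNo,
                            byte[] payload, int len) throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(totalSize); // assumed field 1: size of the whole input
        dos.writeInt(len);       // assumed field 2: payload bytes in this packet
        dos.writeInt(packetNo);  // assumed field 3: 1-based packet number
        dos.write(payload, 0, len);
        dos.flush();             // flushes the underlying stream, does not close it
    }
}
```

The len argument would be the value returned by the preceding read() call, so a short final packet is written with its real size rather than a fixed 2048.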

I would suggest using a community-supported library such as Apache Commons IO for I/O features.
It has useful classes and methods:
org.apache.commons.io.DirectoryWalker;
org.apache.commons.io.FileUtils;
org.apache.commons.io.IOCase;
FileUtils.copyDirectory(from, to);
FileUtils.writeByteArrayToFile(file, data);
FileUtils.writeStringToFile(file, data);
FileUtils.deleteDirectory(dir);
FileUtils.forceDelete(dir);

Related

Java: is it possible to store raw data in source file?

OK, I know this is a bit of a weird question:
I'm writing this piece of java code and need to load raw data (approx 130000 floating points):
This data never changes, and since I don't want to write different loading methods for PC and Android, I was thinking of embedding it into the source file as a float[].
Too bad, there seems to be a limit of 65535 entries; is there an efficient way to do it?
Store that data in a file in the classpath; then read that data as a ByteBuffer which you then "convert" to a FloatBuffer. Note that the below code assumes big endian:
final InputStream in = getClass().getResourceAsStream("/path/to/data");
final ByteArrayOutputStream out = new ByteArrayOutputStream();
final byte[] buf = new byte[8192];
int count;
try {
while ((count = in.read(buf)) != -1)
out.write(buf, 0, count);
} finally {
out.close();
in.close();
}
final FloatBuffer floats = ByteBuffer.wrap(out.toByteArray()).asFloatBuffer(); // renamed: buf is already the byte[] above
You can then .get() from the FloatBuffer.
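To make the byte-order assumption concrete, here is a small round-trip sketch. FloatResource, encode and decode are hypothetical names; the point is only that DataOutputStream.writeFloat produces big-endian bytes, which matches the default order of a wrapped ByteBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

final class FloatResource {
    // Produces the bytes a resource file would need to contain:
    // 4 bytes per float, big-endian.
    static byte[] encode(float[] values) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(values.length * 4);
        try (DataOutputStream dos = new DataOutputStream(out)) {
            for (float v : values) {
                dos.writeFloat(v);
            }
        }
        return out.toByteArray();
    }

    // Turns the raw bytes back into a float[] through a FloatBuffer view.
    static float[] decode(byte[] raw) {
        FloatBuffer fb = ByteBuffer.wrap(raw).asFloatBuffer();
        float[] result = new float[fb.remaining()];
        fb.get(result);
        return result;
    }
}
```

If the data file were written little-endian instead, calling order(ByteOrder.LITTLE_ENDIAN) on the ByteBuffer before asFloatBuffer() would be needed.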
You could use 2 or 3 arrays to get around the limit, if that was your only problem with that approach.

Extract tar.gz file in memory in Java

I'm using the Apache Compress library to read a .tar.gz file, something like this:
final TarArchiveInputStream tarIn = initializeTarArchiveStream(this.archiveFile);
try {
TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
while (tarEntry != null) {
byte[] btoRead = new byte[1024];
BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(destPath)); //<- I don't want this!
int len = 0;
while ((len = tarIn.read(btoRead)) != -1) {
bout.write(btoRead, 0, len);
}
bout.close();
tarEntry = tarIn.getNextTarEntry();
}
tarIn.close();
}
catch (IOException e) {
e.printStackTrace();
}
Is it possible not to extract this into a separate file, and read it in memory somehow? Maybe into a giant String or something?
You could replace the file stream with a ByteArrayOutputStream.
i.e. replace this:
BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(destPath)); //<- I don't want this!
with this:
ByteArrayOutputStream bout = new ByteArrayOutputStream();
and then after closing bout, use bout.toByteArray() to get the bytes.
Is it possible not to extract this into a separate file, and read it in memory somehow? Maybe into a giant String or something?
Yea sure.
Just replace the code in the inner loop that is opening files and writing to them with code that writes to a ByteArrayOutputStream ... or a series of such streams.
The natural representation of the data that you read from the TAR (like that) will be bytes / byte arrays. If the bytes are properly encoded characters, and you know the correct encoding, then you can convert them to strings. Otherwise, it is better to leave the data as bytes. (If you attempt to convert non-text data to strings, or if you convert using the wrong charset/encoding you are liable to mangle it ... irreversibly.)
Obviously, you are going to need to think through some of these issues yourself, but basic idea should work ... provided you have enough heap space.
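A rough sketch of that idea using only the standard library. InMemoryEntry, readAll and asUtf8 are hypothetical names, and UTF-8 is an assumption about the entry contents, not something the question states. The input stream is deliberately left open, since a tar stream has to stay open to reach the next entry.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class InMemoryEntry {
    // Drains the stream (for example a tar stream positioned on one entry)
    // into a byte array without touching the file system.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int len;
        while ((len = in.read(buf)) != -1) {
            bout.write(buf, 0, len); // write only what this read returned
        }
        return bout.toByteArray();
    }

    // Only sensible when the entry is known to be UTF-8 text (assumed here).
    static String asUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Decoding once, after all bytes are collected, avoids mangling a multi-byte character that happens to straddle two reads.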
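Copy the bytes read into btoRead into a String, for example
String s = new String(btoRead, 0, len, StandardCharsets.UTF_8);
and go on appending to the string until the end of the entry is reached. This only works if the entry is text in a known encoding (UTF-8 is assumed here), and a multi-byte character split across two reads will be mangled.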

difference between input.read and input.read(array, offset, length)

I'm trying to understand how inputstreams work. The following block of code is one of the many ways to read data from a text file:-
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream (new FileInputStream(file));
int data = 0;
while (data != -1) // -1 means we reached the end of the file
{
data = input.read(); // if a byte was read, we get its integer value, so 'a' is 97 and 'b' is 98
System.out.println(data + " " + (char) data); // prints the number, then a space, then the character
}
}
input.close();
Now, to use input.read(byte[], offset, length), I have this code. I got it from here
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream (new FileInputStream(file));
int totalBytesRead = 0, bytesRemaining, bytesRead;
byte[] result = new byte[ ( int ) file.length()];
while ( totalBytesRead < result.length )
{
bytesRemaining = result.length - totalBytesRead;
bytesRead = input.read ( result, totalBytesRead, bytesRemaining );
if ( bytesRead > 0 )
totalBytesRead = totalBytesRead + bytesRead;
//printing integer version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print(result[i] + " ");
System.out.println();
//printing character version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print((char)result[i]);
}
input.close();
I'm assuming that based on the name BYTESREAD, this read method is returning the number of bytes read. In the documentation, it says that the function will try to read as many as possible. So there might be a reason why it wouldn't.
My first question is: What are these reasons?
I could replace that entire while loop with one line of code: input.read(result, 0, result.length)
I'm sure the creator of the article thought about this. It's not about the output because I get the same output in both cases. So there has to be a reason. At least one. What is it?
The documentation of read(byte[], int, int) says that it:
Reads up to len bytes of data.
An attempt is made to read as many as len bytes
A smaller number may be read.
Since we are working with files that are right there on our hard disk, it seems reasonable to expect that the attempt will read the whole file, but input.read(result, 0, result.length) is not guaranteed to read the whole file (the documentation does not promise it anywhere). Relying on undocumented behavior is a source of bugs when that behavior changes.
For instance, the file stream may be implemented differently in other JVMs, some OS may impose a limit on the number of bytes that you may read at once, the file may be located in the network, or you may later use that piece of code with another implementation of stream, which doesn't behave in that way.
Alternatively, if you are reading the whole file into an array, perhaps you could use DataInputStream.readFully.
About the loop with read(), it reads a single byte each time. That reduces performance if you are reading a big chunk of data, since each call to read() will perform several tests (has the stream ended? etc) and may ask the OS for one byte. Since you already know that you want file.length() bytes, there is no reason for not using the other more efficient forms.
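As an illustration of the readFully alternative, a minimal sketch (ReadFullyDemo and readExactly are made-up names). readFully loops internally until the array is full and throws EOFException if the stream ends first, so the hand-written while loop is not needed.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullyDemo {
    // Reads exactly 'size' bytes, however many short reads that takes,
    // or throws java.io.EOFException if the stream ends early.
    static byte[] readExactly(InputStream in, int size) throws IOException {
        byte[] result = new byte[size];
        new DataInputStream(in).readFully(result);
        return result;
    }
}
```

For the file in the question, size would be (int) file.length(), which carries the same assumption as the original code that the file fits in a single array.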
Imagine you are reading from a network socket, not from a file. In this case you don't have any information about the total amount of bytes in the stream. You would allocate a buffer of fixed size and read from the stream in a loop. During one iteration of the loop you can't expect there are BUFFERSIZE bytes available in the stream. So you would fill the buffer as much as possible and iterate again, until the buffer is full. This can be useful, if you have data blocks of fixed size, for example serialized object.
ArrayList<MyObject> list = new ArrayList<MyObject>();
try {
InputStream input = socket.getInputStream();
byte[] buffer = new byte[1024];
int bytesRead;
int off = 0;
int len = 1024;
while(true) {
bytesRead = input.read(buffer, off, len);
if(bytesRead == len) {
list.add(createMyObject(buffer));
// reset variables
off = 0;
len = 1024;
continue;
}
if(bytesRead == -1) break;
// buffer is not full, adjust size
off += bytesRead;
len -= bytesRead;
}
} catch(IOException io) {
// stream was closed
}
P.S. The code is not tested; it is only meant to show how this method can be useful.
You specify the amount of bytes to read because you might not want to read the entire file at once or maybe you couldn't or might not want to create a buffer as large as the file.

Buffered Input Stream does not load file correctly

I have the following code to download a List of files. After downloading I compare the md5 of the online File with the downloaded.
They match when the download is smaller than 1024 bytes. For everything over 1024 bytes, the md5 sum is different.
I don't know the reason. Does it depend on the array size of 1024 bytes? Maybe it writes the full 1024 bytes to the file every time, but then why does it work with files smaller than 1 KB?
String fileUrl= url_str;
URL url = new URL(fileUrl);
BufferedInputStream bufferedInputStream = new BufferedInputStream(url.openStream());
FileOutputStream fileOutputStream =new FileOutputStream(target);
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, 1024);
byte data[] = new byte[1024];
while(bufferedInputStream.read(data, 0, 1024) >0 )
{
bufferedOutputStream.write(data);
}
bufferedOutputStream.close();
bufferedInputStream.close();
This is broken:
while(bufferedInputStream.read(data, 0, 1024) >0 )
{
bufferedOutputStream.write(data);
}
You're assuming that every read call fills up the entire buffer. You should use the return value of read:
int bytesRead;
while((bytesRead = bufferedInputStream.read(data, 0, 1024)) >0 )
{
bufferedOutputStream.write(data, 0, bytesRead);
}
(Additionally, you should be closing all your streams in finally blocks, but that's another matter.)
After the first read, data[] already contains bytes. So after the last read the array holds the last n bytes plus some leftover bytes from the previous read. You should check the return value of read: it tells you how many bytes were read into the array, and you should write out just that many.
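Putting both points together (use the return value, and always close the streams), a sketch might look like the following. StreamCopy and copyAndClose are hypothetical names, and try-with-resources is swapped in for explicit finally blocks; it needs Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    // Copies everything from in to out, writing only the bytes each read returned.
    // Both streams are closed when the method exits, even if an exception is thrown.
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        try (InputStream src = in; OutputStream dst = out) {
            byte[] data = new byte[1024];
            long total = 0;
            int bytesRead;
            while ((bytesRead = src.read(data)) != -1) {
                dst.write(data, 0, bytesRead);
                total += bytesRead;
            }
            return total;
        }
    }
}
```

With this loop the output has exactly as many bytes as the input, including when the size is not a multiple of 1024.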

java servlet serving a file over HTTP connection

I have the following code(Server is Tomcat/Linux).
// Send the local file over the current HTTP connection
FileInputStream fin = new FileInputStream(sendFile);
int readBlockSize;
int totalBytes=0;
while ((readBlockSize=fin.available())>0) {
byte[] buffer = new byte[readBlockSize];
fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, readBlockSize);
totalBytes+=readBlockSize;
}
With some files of type 3gp, when I attach the debugger, at the line:
outStream.write(buffer, 0, readBlockSize);
it breaks out of the while loop with the following error:
ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) line:299
And the file is not served.
Any clues?
Thanks
A.K.
You can't guarantee that InputStream.read(byte[], int, int) will actually read the desired number of bytes: it may read less. Even your call to available() will not provide that guarantee. You should use the return value from fin.read to find out how many bytes were actually read and only write that many to the output.
I would guess that the problem you see could be related to this. If the block read is less than the available size then your buffer will be partially filled and that will cause problems when you write too many bytes to the output.
Also, don't allocate a new array every time through the loop! That will result in a huge number of needless memory allocations that will slow your code down, and will potentially cause an OutOfMemoryError if available() returns a large number.
Try this:
int size;
int totalBytes = 0;
byte[] buffer = new byte[BUFFER_SIZE];
while ((size = fin.read(buffer, 0, BUFFER_SIZE)) != -1) {
outStream.write(buffer, 0, size);
totalBytes += size;
}
Avoiding these types of problems is why I start with Commons IO. If that's an option, your code would be as follows.
FileInputStream fin = new FileInputStream(sendFile);
int totalBytes = IOUtils.copy(fin, outStream);
No need to reinvent the wheel.
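If adding a library is not an option, a similar one-liner exists in the standard library on Java 9 and later. This is a sketch under that version assumption; ServeFile and send are made-up names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ServeFile {
    // Streams the whole input to the output and returns the number of bytes sent.
    // InputStream.transferTo requires Java 9 or later.
    static long send(InputStream fin, OutputStream outStream) throws IOException {
        try (InputStream in = fin) {
            return in.transferTo(outStream); // leaves outStream open for the caller
        }
    }
}
```

The input is closed here, while the output is left alone, on the assumption that the servlet container manages the response stream.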
It is possible that the .read() call returns fewer bytes than you requested. This means you need to use the return value of .read() as the argument to the .write() call:
int bytesRead = fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, bytesRead);
Apart from this, it is better to pre-allocate a buffer and reuse it (your code could try to use a 2 GB buffer if your file is large :-)):
byte[] buffer = new byte[4096]; // define a constant for this max length
while ((readBlockSize=fin.available())>0) {
if (4096 < readBlockSize) {
readBlockSize = 4096; // never ask for more than the buffer holds
}
int bytesRead = fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, bytesRead);
}
