Java downloading files sometimes results in CRC errors - java

I've written code to automatically download a batch of files using an InputStream and a FileOutputStream.
The code is very straightforward:
    is = urlConn.getInputStream();
    fos = new FileOutputStream(outputFile);
    eventBus.fireEvent(this, new DownloadStartedEvent(item));
    int read;
    byte[] buffer = new byte[2048];
    while ((read = is.read(buffer)) != -1) {
        fos.write(buffer, 0, read);
    }
    eventBus.fireEvent(this, new DownloadCompletedEvent(item));
At first sight this works very well, files get downloaded without any problems, however,
occasionally while trying to extract a batch of downloaded rar files, extraction fails with one of the rar parts having a CRC error.
As this happened a few times already, although not consistently, I started to suspect that something in this code is not correct/optimal.
It will be helpful to know that there are 4 downloads executing concurrently using the JDK FixedThreadPool mechanism:
    execService.execute(new Runnable() {
        @Override
        public void run() {
            if (item.getState().equals(DownloadCandidateState.WAITING)) {
                Downloader downloader = new Downloader(eventBus);
                downloader.download(item, item.getName());
            }
        }
    });
But because every download thread uses a new instance of the Downloader class, I believe this problem is not a side effect of concurrency?
Any ideas if this occasional CRC error has to do with the code or if it has to do with something else?
UPDATE
I can verify that the file size of a problematic file is correct.
I also did a diff (on linux) on the automatically downloaded file and the manually downloaded file.
The filesize is the exact same for both files, however, diff says that the binary content differs between the 2 files:
Binary files file.rar and file(2).rar differ
UPDATE 2
I used a visual binary diff tool and could see that a sequence of 128 bytes was different, somewhere in the middle of the file. I don't understand how that could happen, as the file being downloaded doesn't change and it is being read byte per byte using an input stream. Any ideas??

You can also use Apache's HttpClient if you don't want to handle that entity streaming yourself. It's a well written and documented library. There are several usable entity / entity wrapper classes available.
Here you can have a look at entity retrieval: http://hc.apache.org/httpcomponents-client-4.0.1/tutorial/html/fundamentals.html#d4e152

You should run a diff (unix tool) comparing the original with the result to find out what has actually changed. You may see a pattern right away.

I would start by flushing (or closing) the FileOutputStream.

Your code is correct provided everything is closed and no exceptions are thrown. The problem lies elsewhere, probably in the original files.
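To make the "provided everything is closed" condition concrete, here is a minimal sketch, assuming Java 7 or later for try-with-resources. The class and method names (ChecksummedCopy, copyWithCrc) are hypothetical and not part of the question's Downloader. It closes both streams even when a read or write throws, and returns the CRC-32 of the bytes written so the result can be compared against a known-good copy.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Hypothetical helper, not part of the question's Downloader class.
final class ChecksummedCopy {

    // Copies everything from 'in' to 'target' and returns the CRC-32 of the bytes written.
    // Both streams are closed by try-with-resources, even if read() or write() throws.
    static long copyWithCrc(InputStream in, File target) throws IOException {
        CRC32 crc = new CRC32();
        try (InputStream is = in;
             OutputStream fos = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = is.read(buffer)) != -1) {
                fos.write(buffer, 0, read);
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }
}
```

If the server publishes a checksum for each file, comparing it with the returned value would show whether the bytes were already wrong when they reached the Java code, which would point away from the copy loop itself.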

Problem seemed to have been the Linux atheros driver for my NIC.

Related

JAI create seems to leave file descriptors open

I have some old code that was working until recently, but seems to barf now that it runs on a new server using OpenJDK 6 rather than Java SE 6.
The problem seems to revolve around JAI.create. I have jpeg files which I scale and convert to png files. This code used to work with no leaks, but now that the move has been made to a box running OpenJDK, the file descriptors seem to never close, and I see more and more tmp files accumulate in the tmp directory on the server. These are not files I create, so I assume it is JAI that does it.
Another reason might be the larger heap size on the new server. If JAI cleans up on finalize, but GC happens less frequently, then maybe the files pile up because of that. Reducing the heap size is not an option, and we seem to be having unrelated issues with increasing ulimit.
Here's an example of a file that leaks when I run this:
/tmp/imageio7201901174018490724.tmp
Some code:
// Processor is an internal class that aggregates operations
// performed on the image, like resizing
    private byte[] processImage(Processor processor, InputStream stream) {
        byte[] bytes = null;
        SeekableStream s = null;
        try {
            // Read the file from the stream
            s = SeekableStream.wrapInputStream(stream, true);
            RenderedImage image = JAI.create("stream", s);
            BufferedImage img = PlanarImage.wrapRenderedImage(image).getAsBufferedImage();
            // Process image
            if (processor != null) {
                image = processor.process(img);
            }
            // Convert to bytes
            bytes = convertToPngBytes(image);
        } catch (Exception e){
            // error handling
        } finally {
            // Clean up streams
            IOUtils.closeQuietly(stream);
            IOUtils.closeQuietly(s);
        }
        return bytes;
    }
    private static byte[] convertToPngBytes(RenderedImage image) throws IOException {
        ByteArrayOutputStream out = null;
        byte[] bytes = null;
        try {
            out = new ByteArrayOutputStream();
            ImageIO.write(image, "png", out);
            bytes = out.toByteArray();
        } finally {
            IOUtils.closeQuietly(out);
        }
        return bytes;
    }
My questions are:
Has anyone run into this and solved it? Since the tmp files created are not mine, I don't know what their names are and thus can't really do anything about them.
What're some of the libraries of choice for resizing and reformatting images? I heard of Scalr - anything else I should look into?
I would rather not rewrite the old code at this time, but if there is no other choice...
Thanks!
Just a comment on the temp files/finalizer issue, now that you seem to have solved the root of the problem (too long for a comment, so I'll post it as an answer... :-P):
The temp files are created by ImageIO's FileCacheImageInputStream. These instances are created whenever you call ImageIO.createImageInputStream(stream) and the useCache flag is true (the default). You can set it to false to disable the disk caching, at the expense of in-memory caching. This might make sense as you have a large heap, but probably not if you are processing very large images.
I also think you are (almost) correct about the finalizer issue. You'll find the following finalize method on FileCacheImageInputStream (Sun JDK 6/1.6.0_26):
    protected void finalize() throws Throwable {
        // Empty finalizer: for performance reasons we instead use the
        // Disposer mechanism for ensuring that the underlying
        // RandomAccessFile is closed/deleted prior to garbage collection
    }
There's some quite "interesting" code in the class' constructor that sets up automatic stream closing and disposing when the instance is finalized (should client code forget to do so). This might be different in the OpenJDK implementation; at least it seems kind of hacky. It's also unclear to me at the moment exactly what "performance reasons" we are talking about...
In any case, it seems calling close on the ImageInputStream instance, as you now do, will properly close the file descriptor and delete the temp file.
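As an illustration of both points (the useCache flag and closing the wrapper explicitly), here is a minimal sketch, assuming Java 7 or later. The class and method names are hypothetical. ImageIO.setUseCache and ImageIO.createImageInputStream are the standard javax.imageio calls; note that setUseCache changes a shared setting rather than a per-stream one, so treat this as a sketch and not a drop-in fix.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper for illustration only.
final class ImageStreamCleanup {

    // Wraps 'source' in an ImageInputStream, then closes the wrapper and the source.
    // Returns the simple class name of the wrapper so the caching mode can be inspected.
    static String wrapAndClose(InputStream source, boolean useDiskCache) throws IOException {
        // false = cache in memory instead of creating imageio*.tmp files on disk
        ImageIO.setUseCache(useDiskCache);
        // Resources are closed in reverse order: the wrapper first, then the source stream.
        try (InputStream in = source;
             ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            if (iis == null) {
                throw new IOException("no ImageInputStream provider for this input");
            }
            return iis.getClass().getSimpleName();
        }
    }
}
```

Closing only the underlying InputStream, as the original code did, leaves the wrapper (and any temp file it owns) to whatever cleanup the JDK implementation performs on its own.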
Found it!
So a stream gets wrapped by another stream in a different area in the code:
iis = ImageIO.createImageInputStream(stream);
And further down, stream is closed.
This doesn't seem to leak any resources when running with Sun Java, but does seem to cause a leak when running with Open JDK.
I'm not sure why that is (I have not looked at source code to verify, though I have my guesses), but that's what seems to be happening. Once I explicitly closed the wrapping stream, all was well.

Java 7 filechannel not closing properly after calling a map method

I'm working on a sc2replay parsing tool. I build it on top of MPQLIB http://code.google.com/p/mpqlib/
Unfortunately the tool uses filechannels to read through the bzip files,
and uses map(MapMode.READ_ONLY, hashtablePosition, hashTableSize);
After calling that function closing the file channel does not release the file in the process.
To be specific I cannot rename/move the file.
The problem occurs in Java 7 and it works fine on Java 6.
Here is a simple code snippet to replicate it:
FileInputStream f = new FileInputStream("test.SC2Replay");
FileChannel fc = f.getChannel();
fc.map(MapMode.READ_ONLY, 0,1);
fc.close();
new File("test.SC2Replay").renameTo(new File("test1.SC2Replay"));
Commenting out the fc.map call will allow you to rename the file.
P.S. This is from the question "Should I close the FileChannel?".
It states that you do not need to close both the FileChannel and the file stream because closing one will close the other. I also tried closing either or both and it still did not work.
Is there a workaround on renaming the file after reading the data using FileChannel.map on Java 7, because every one seems to have Java 7 nowadays?
Good day,
It seems that FileChannel.map causes the problem on Java 7. If you use FileChannel.map, you can no longer close the file.
A quick workaround: instead of using FileChannel.map(MapMode.READ_ONLY, position, length),
you can use
ByteBuffer b = ByteBuffer.allocate(length);
fc.read(b,position);
b.rewind();
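A slightly more defensive version of the same workaround is sketched below, assuming Java 7 or later. The class and method names are hypothetical. It loops because a single positional read is allowed to return fewer bytes than requested, and it uses flip() so the buffer's limit matches what was actually read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only.
final class RegionReader {

    // Reads 'length' bytes starting at 'position' into a heap buffer, without mapping the file.
    static ByteBuffer readRegion(Path file, long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single read() may return fewer bytes than requested, so loop until full or EOF.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position + buffer.position());
                if (n < 0) {
                    throw new EOFException("file ended before " + length + " bytes were read");
                }
            }
        }
        buffer.flip(); // switch from filling the buffer to reading from it
        return buffer;
    }
}
```

Because nothing is mapped, closing the channel is enough to release the file, so a later rename or move should not be blocked by this code.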
It's a documented bug. The bug report refers to Java 1.4, and they consider it a documentation bug. Closing the filechannel does not close the underlying stream.
If you're using Sun JRE, you can cheat by casting to their implementation and telling it to release itself. I'd only recommend doing this if you're not reliant on the file being closed or never plan to use another JRE.
At some point, I hope that something like this will make it into the proper public API.
    try (FileInputStream stream = new FileInputStream("test.SC2Replay");
         FileChannel channel = stream.getChannel()) {
        MappedByteBuffer mappedBuffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, 1);
        try {
            // do stuff with it
        } finally {
            if (mappedBuffer instanceof DirectBuffer) {
                ((DirectBuffer) mappedBuffer).cleaner().clean();
            }
        }
    }

How to open a file without saving it to disk

My Question: How do I open a file (in the system default [external] program for the file) without saving the file to disk?
My Situation: I have files in my resources and I want to display those without saving them to disk first. For example, I have an xml file and I want to open it on the user's machine in the default program for reading xml file without saving it to the disk first.
What I have been doing: So far I have just saved the file to a temporary location, but I have no way of knowing when they no longer need the file so I don't know when/if to delete it. Here's my SSCCE code for that (well, it's mostly sscce, except for the resource... You'll have to create that on your own):
    package main;
    import java.io.*;
    public class SOQuestion {
        public static void main(String[] args) throws IOException {
            new SOQuestion().showTemplate();
        }
        /** Opens the temporary file */
        private void showTemplate() throws IOException {
            String tempDir = System.getProperty("java.io.tmpdir") + "\\BONotifier\\";
            File parentFile = new File(tempDir);
            if (!parentFile.exists()) {
                parentFile.mkdirs();
            }
            File outputFile = new File(parentFile, "template.xml");
            InputStream inputStream = getClass().getResourceAsStream("/resources/template.xml");
            int size = 4096;
            try (OutputStream out = new FileOutputStream(outputFile)) {
                byte[] buffer = new byte[size];
                int length;
                while ((length = inputStream.read(buffer)) > 0) {
                    out.write(buffer, 0, length);
                }
                inputStream.close();
            }
            java.awt.Desktop.getDesktop().open(outputFile);
        }
    }
Because of this line:
String tempDir = System.getProperty("java.io.tmpdir") + "\\BONotifier\\";
I deduce that you're working on Windows. You can easily make this code multiplatform, you know.
The answer to your question is: no. The Desktop class needs to know where the file is in order to invoke the correct program with a parameter. Note that there is no method in that class accepting an InputStream, which could be a solution.
Anyway, I don't see where the problem is: you create a temporary file, then open it in an editor or whatever. That's fine. In Linux, when the application is exited (normally) all its temporary files are deleted. In Windows, the user will need to trigger the temporary files deletion. However, provided you don't have security constraints, I can't understand where the problem is. After all, temporary files are the operating system's concern.
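For what it's worth, the temp-file approach can be written without the hard-coded Windows separators. This is a minimal sketch, assuming Java 7 or later; the class and method names are hypothetical. It lets the JDK pick the temp directory and asks the JVM to delete the file on normal exit.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only.
final class TempResource {

    // Copies 'resource' into a new file in the platform's temp directory and returns it.
    // The file is registered for deletion when the JVM exits normally.
    static File copyToTempFile(InputStream resource, String prefix, String suffix) throws IOException {
        Path temp = Files.createTempFile(prefix, suffix);
        temp.toFile().deleteOnExit();
        try (InputStream in = resource) {
            // createTempFile already created an empty file, so the copy must replace it.
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp.toFile();
    }
}
```

With the question's resource this would be used as java.awt.Desktop.getDesktop().open(TempResource.copyToTempFile(getClass().getResourceAsStream("/resources/template.xml"), "template", ".xml")). The deletion at exit is best effort: it may not succeed if the external program still has the file open at that point.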
Depending on how portable your application needs to be, there might be no "one fits all" solution to your problem. However, you can help yourself a bit:
At least under Linux, you can use a pipe (|) to direct the output of one program to the input of another. A simple example for that (using the gedit text editor) might be:
echo "hello world" | gedit
This will (for gedit) open up a new editor window and show the contents "hello world" in a new, unsaved document.
The problem with the above is, that this might not be a platform-independent solution. It will work for Linux and probably OS X, but I don't have a Windows installation here to test it.
Also, you'd need to find out the default editor by yourself. This older question and its linked article give some ideas on how this might work.
I don't understand your question very well. I can see only two possibilities to your question.
Open an existing file, and you wish to operate on its stream but do not want to save any modifications.
Create a file, so that you could use file i/o to operate on the file stream, but you don't wish to save the stream to file.
In either case, your main motivation is to make use of the file I/O facilities that are already available to you, am I correct?
I have a feeling that the question is not that simple and that my answer is probably not the one you seek. However, if my understanding of the question does match what you are asking ...
If you wish to use Stream io, instead of using FileOutputStream or FileInputStream which are consequent to your opening a File object, why not use non-File InputStream or OutputStream? Your file i/o utilities will finally boil down to manipulating i/o streams anyway.
http://docs.oracle.com/javase/7/docs/api/java/io/OutputStream.html
http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html
No need to involve temp files.

join multiple files of database using java (android) results in corrupted database

I'm trying to join several files of a database on Android. The file was split in the first place because of the 1 MB size limit on any file within the APK for Android versions prior to 2.3. The file I'm trying to join back together is a database, which is complicated in its own way because it has to be placed in the right folder for SQLite to read it. I'm using the following method:
    private void copyDB() throws IOException {
        // list the files the db is split into
        int[] cc = { R.raw.cc1, R.raw.cc2, R.raw.cc3 };
        OutputStream out = new FileOutputStream(DB_PATH+DB_NAME);
        for (int k : cc) {
            InputStream in = mContext.getResources().openRawResource(k);
            int count;
            byte[] filebytes = new byte[1024];
            while((count = in.read(filebytes)) != -1)
                out.write(filebytes, 0, count);
            in.close();
        }
        // clean up
        out.flush();
        out.close();
    }
As you can see, I'm using plain streams to join the files back together, and I'm almost positive that it works because it performs as expected with android 2.3. However, when I test on android 2.2 I'm getting a corrupt database (I don't have with me the exact message from the LogCat but the SQL error was 'Error: database disk image is malformed')
Is there anything that I can do to prevent the database from corrupting when I place it in the /data/data/package.name/database/ folder?
Thank you for your time.
This is not strictly the answer to your question, but may resolve your problem. When I encountered this problem, I believe I renamed the file to force Android to treat it as an uncompressable binary (gave it an extension of .mp3).
I had the same problem, but I realized that the cause was the way I was splitting my db. Originally I split it this way:
split largeFile.db -b 1M largeFile
I was forgetting the .db at the end of the second largeFile, hence
split largeFile.db -b 1M largeFile.db
solved my problem. I then moved all my largeFile.dbaa, largeFile.dbab, ... into the assets folder (I suppose raw works as well in your case) and renamed them to largeFileA, largeFileB,...
Maybe this helps
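One way to tell whether the join itself or the packaged parts are at fault is to count the bytes written and compare the total with the size of the original database file. Below is a minimal sketch in plain Java; the class and method names are hypothetical, and on Android the part streams would come from openRawResource as in the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical helper for illustration only.
final class PartJoiner {

    // Appends each part to 'out' in list order and returns the total number of bytes written.
    // Each part is closed after it has been copied; 'out' is flushed but left open for the caller.
    static long join(List<? extends InputStream> parts, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        for (InputStream part : parts) {
            try (InputStream in = part) {
                int count;
                while ((count = in.read(buffer)) != -1) {
                    out.write(buffer, 0, count);
                    total += count;
                }
            }
        }
        out.flush();
        return total;
    }
}
```

If the returned total differs from the original file's size, the parts were altered before the join ran (for example by the packaging step), and the copy loop is not the place to look.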

Java - open existing file or create one if doesn't exist using IO streams

I was following instructions from a Java website (http://java.sun.com/docs/books/tutorial/essential/io/file.html#createStream) on creating or writing a file using an IO stream. However, the code it provides seems to be broken in multiple places:
    import static java.nio.file.StandardOpenOption.*;
    Path logfile = ...;
    //Convert the string to a byte array.
    String s = ...;
    byte data[] = s.getBytes();
    OutputStream out = null;
    try {
        out = new BufferedOutputStream(logfile.newOutputStream(CREATE, APPEND));
        ...
        out.write(data, 0, data.length);
    } catch (IOException x) {
        System.err.println(x);
    } finally {
        if (out != null) {
            out.flush();
            out.close();
        }
    }
For example, Eclipse crashes on the import, and on using the Path class, for starters. However, this tutorial seemed to provide exactly what I want to do - I want to write to a file if it exists (overwrite) or create a file if it doesn't exist, and ultimately I will be writing with an output stream (which gets created here using the .newOutputStream() method). So creating/writing with an output stream seemed like a likely candidate. Does anyone know how to either fix the above so that it works, or a better way to do what I want to do?
That example seems to be using APIs that are not part of Sun Java 6.
Class Path and the package java.nio.file are part of an API that is going to be added in Sun JDK 7. Note that the link to the documentation of Path points to the API documentation of OpenJDK, Sun's open source development version of Java.
So, you cannot use those APIs if you are using regular Sun Java 6.
Read the warning on the start page of the tutorial:
File I/O (Featuring NIO.2)
This section is being updated to reflect features and conventions of the upcoming release, JDK7. You can download the current JDK7 Snapshot from java.net. We've published this preliminary version so you can get the most current information now, and so you can tell us about errors, omissions, or improvements we can make to this tutorial.
In Sun Java 6 you can just use FileOutputStream. It will automatically create a new file if the file doesn't exist or overwrite an existing file if it exists:
    FileOutputStream out = new FileOutputStream("filename.xyz");
    out.write(data, 0, data.length);
    out.close(); // release the file handle when done
Note: For writing text files (which is what you seem to want to do), use a Writer (for example FileWriter) instead of using an OutputStream directly. The Writer will take care of converting the text using a character encoding.
See the Java SE 6 API Documentation (especially the docs of the packages java.io) for information about what's available in Java SE 6.
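For readers on Java 7 or later: in the released NIO.2 API the stream is opened through java.nio.file.Files.newOutputStream rather than through a method on Path as in the preliminary tutorial. Here is a minimal sketch of the "overwrite if it exists, create if it doesn't" behavior the question asks for; the class and method names are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.TRUNCATE_EXISTING;
import static java.nio.file.StandardOpenOption.WRITE;

// Hypothetical helper for illustration only.
final class OverwriteOrCreate {

    // Creates 'file' if it is missing, otherwise truncates it, then writes 'text' as UTF-8.
    static void write(Path file, String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        // try-with-resources flushes and closes the stream, even if write() throws.
        try (OutputStream out = new BufferedOutputStream(
                Files.newOutputStream(file, CREATE, TRUNCATE_EXISTING, WRITE))) {
            out.write(data, 0, data.length);
        }
    }
}
```

Swapping TRUNCATE_EXISTING for APPEND gives the log-file behavior of the tutorial snippet instead of overwriting.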
