JAI create seems to leave file descriptors open - java

I have some old code that was working until recently, but seems to barf now that it runs on a new server using OpenJDK 6 rather than Java SE 6.
The problem seems to revolve around JAI.create. I have jpeg files which I scale and convert to png files. This code used to work with no leaks, but now that the move has been made to a box running OpenJDK, the file descriptors seem to never close, and I see more and more tmp files accumulate in the tmp directory on the server. These are not files I create, so I assume it is JAI that does it.
Another reason might be the larger heap size on the new server. If JAI cleans up on finalize, but GC happens less frequently, then maybe the files pile up because of that. Reducing the heap size is not an option, and we seem to be having unrelated issues with increasing ulimit.
Here's an example of a file that leaks when I run this:
/tmp/imageio7201901174018490724.tmp
Some code:
// Processor is an internal class that aggregates operations
// performed on the image, like resizing
private byte[] processImage(Processor processor, InputStream stream) {
byte[] bytes = null;
SeekableStream s = null;
try {
// Read the file from the stream
s = SeekableStream.wrapInputStream(stream, true);
RenderedImage image = JAI.create("stream", s);
BufferedImage img = PlanarImage.wrapRenderedImage(image).getAsBufferedImage();
// Process image
if (processor != null) {
image = processor.process(img);
}
// Convert to bytes
bytes = convertToPngBytes(image);
} catch (Exception e){
// error handling
} finally {
// Clean up streams
IOUtils.closeQuietly(stream);
IOUtils.closeQuietly(s);
}
return bytes;
}
private static byte[] convertToPngBytes(RenderedImage image) throws IOException {
ByteArrayOutputStream out = null;
byte[] bytes = null;
try {
out = new ByteArrayOutputStream();
ImageIO.write(image, "png", out);
bytes = out.toByteArray();
} finally {
IOUtils.closeQuietly(out);
}
return bytes;
}
My questions are:
Has anyone run into this and solved it? Since the tmp files created are not mine, I don't know what their names are and thus can't really do anything about them.
What're some of the libraries of choice for resizing and reformatting images? I heard of Scalr - anything else I should look into?
I would rather not rewrite the old code at this time, but if there is no other choice...
Thanks!

Just a comment on the temp files/finalizer issue, now that you seem to have solved the root of the problem (too long for a comment, so I'll post it as an answer... :-P):
The temp files are created by ImageIO's FileCacheImageInputStream. These instances are created whenever you call ImageIO.createImageInputStream(stream) and the useCache flag is true (the default). You can set it to false to disable the disk caching, at the expense of in-memory caching. This might make sense as you have a large heap, but probably not if you are processing very large images.
I also think you are (almost) correct about the finalizer issue. You'll find the following finalize method on FileCacheImageInputStream (Sun JDK 6/1.6.0_26):
protected void finalize() throws Throwable {
// Empty finalizer: for performance reasons we instead use the
// Disposer mechanism for ensuring that the underlying
// RandomAccessFile is closed/deleted prior to garbage collection
}
There's some quite "interesting" code in the class' constructor that sets up automatic stream closing and disposing when the instance is finalized (should client code forget to do so). This might be different in the OpenJDK implementation; at the least, it seems kind of hacky. It's also unclear to me at the moment exactly what "performance reasons" we are talking about...
In any case, it seems calling close on the ImageInputStream instance, as you now do, will properly close the file descriptor and delete the temp file.

Found it!
So a stream gets wrapped by another stream in a different area in the code:
iis = ImageIO.createImageInputStream(stream);
And further down, stream is closed.
This doesn't seem to leak any resources when running with Sun Java, but does seem to cause a leak when running with Open JDK.
I'm not sure why that is (I have not looked at source code to verify, though I have my guesses), but that's what seems to be happening. Once I explicitly closed the wrapping stream, all was well.
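For anyone hitting the same thing, here is a minimal sketch of the fix described above: the wrapper returned by ImageIO.createImageInputStream is closed explicitly instead of relying on closing the underlying stream. The class and method names (ImageStreamSketch, readAndClose) are made up for illustration and are not from the original code base.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of the original code.
final class ImageStreamSketch {

    // Decodes the first image from the stream and always closes the
    // ImageInputStream wrapper, which is what owns any imageio*.tmp cache file.
    static BufferedImage readAndClose(InputStream in) throws IOException {
        ImageInputStream iis = ImageIO.createImageInputStream(in);
        if (iis == null) {
            throw new IOException("No ImageInputStream available for this input");
        }
        try {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IOException("No ImageReader found for this image format");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(iis);
                return reader.read(0);
            } finally {
                reader.dispose(); // release reader resources
            }
        } finally {
            iis.close(); // close the wrapper itself, not only the wrapped stream
        }
    }
}
```

The caller still closes its own InputStream as before. If disk caching is not wanted at all, ImageIO.setUseCache(false) switches ImageIO to in-memory caching, as mentioned in the other answer.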

Related

Does java.nio.file.Files.copy call sync() on the file system?

I'm developing an application that has to reboot the system after a file has been uploaded and verified. The file system is on an SD card, so it must be synced to be sure the uploaded file has actually been saved on the device.
I was wondering whether java.nio.file.Files.copy does the sync or not.
My code runs like this:
public int save(MultipartFile multipart) throws IOException {
Files.copy(multipart.getInputStream(), file, StandardCopyOption.REPLACE_EXISTING);
if (validate(file)) {
sync(file); // <-- is it useless?
reboot();
return 0;
} else {
Files.delete(file);
return -1;
}
}
I tried to find a way to call sync on the fs in the nio package, but the only solution that I've found is:
public void sync(Path file) {
final FileOutputStream fos = new FileOutputStream(file.toFile());
final FileDescriptor fd = fos.getFD();
fd.sync();
}
which relies on the old java.io.File.
If you look at the source code for Files.copy(...), you will see that it doesn't perform a sync(). In the end, it will perform a copy of an input stream into an output stream corresponding to the first 2 arguments passed to Files.copy(...).
Furthermore, the FileDescriptor is tied to the stream from which it is obtained. If you don't perform any I/O operation with this stream, other than creating a file with new FileOutputStream(...), there will be nothing to sync() with the file system, as is the case with the code you shared.
Thus, the only way I see to accomplish your goal is to "revert" to the old-fashioned java.io API and implement a stream-to-stream copy yourself. This will allow you to sync() on the file descriptor obtained from the same FileOutputStream that is used for the copy operation.
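A minimal sketch of that stream-to-stream copy, under the assumption that the target is a plain file on the default file system. The names SyncedCopySketch and copyAndSync are hypothetical; the point is only that sync() is called on the descriptor of the same FileOutputStream that wrote the data.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not from the question's code.
final class SyncedCopySketch {

    // Copies everything from 'in' to 'target' and asks the OS to push the
    // written bytes to the device before the stream is closed.
    static long copyAndSync(InputStream in, File target) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                fos.write(buffer, 0, read);
                total += read;
            }
            fos.flush();
            // Same descriptor that performed the writes, so there is data to sync.
            fos.getFD().sync();
            return total;
        }
    }
}
```

In the question's save method this would replace the Files.copy call, for example copyAndSync(multipart.getInputStream(), file.toFile()), and the separate sync(file) call would no longer be needed.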
The copy operation depends on your JRE's OS-specific code, so if you want to be sure the file has been written at the OS level, continue to call the sync() method explicitly.
This is because SYNC and DSYNC were annoyingly omitted from the StandardCopyOption enum, yet were provided in the StandardOpenOption enum for file targets, so you need to use FileChannel and SeekableByteChannel, if supported by the FileSystemProvider, like:
Set<? extends OpenOption> TARGET_OPEN_OPTIONS = EnumSet.of(StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
FileChannel out = target.getFileSystem().provider().newFileChannel(target, TARGET_OPEN_OPTIONS);
SeekableByteChannel in = Files.newByteChannel(source, StandardOpenOption.READ);
out.transferFrom(in, 0, in.size()); // may transfer fewer bytes than requested; check the return value
out.force(true); // false == DSYNC (content only), true == SYNC (content and metadata)
Using java.io.FileOutputStream.getFD().sync() is an obsolete "solution", because you lose all support for NIO2 FileSystems, like the often bundled ZipFileSystem, and it can still fail if not supported by the native class implementations or OS!
Using DSYNC or SYNC when opening an OutputStream via a FileSystemProvider is another option, but may cause premature flushing of a FileSystem cache.
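A more complete sketch of the FileChannel approach, assuming both paths live on a file system whose provider supports file channels (FileChannel.open delegates to the path's FileSystemProvider). The class and method names are hypothetical, and TRUNCATE_EXISTING is used here to mimic REPLACE_EXISTING rather than the CREATE_NEW option shown above.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating copy followed by force().
final class ForcedCopySketch {

    static void copyAndForce(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING,
                     StandardOpenOption.WRITE)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    throw new IOException("Source ended early at byte " + position);
                }
                position += moved;
            }
            // true: force content and metadata; false: content only.
            out.force(true);
        }
    }
}
```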

Merge big file parts faster in java

I'm writing a Java REST service to support parallel upload of parts of a large file. I am writing these parts to separate files and merging them using a file channel. I have a sample implemented in Golang that does the same, but when it merges the parts, it takes no time. When I use a file channel, or read from one stream and write to the final file, it takes a long time. The difference, I think, is that Golang has the ability to keep the data on the disk as it is and merge the parts without actually moving the data. Is there any way I can do the same in Java?
Here is my code that merges parts, I loop through this method for all parts:
private void mergeFileUsingChannel(String destinationPath, String sourcePath, long partSize, long offset) throws Exception{
FileChannel outputChannel = null;
FileChannel inputChannel = null;
try{
outputChannel = new FileOutputStream(new File(destinationPath)).getChannel();
outputChannel.position(offset);
inputChannel = new FileInputStream(new File(sourcePath)).getChannel();
inputChannel.transferTo(0, partSize, outputChannel);
}catch(Exception e){
e.printStackTrace();
}
finally{
if(inputChannel != null)
inputChannel.close();
if(outputChannel != null){
outputChannel.close();
}
}
}
The documentation of FileChannel transferTo states:
"Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them."
So the code you have written is correct, and the inefficiency you are seeing is probably related to the underlying file-system type.
One small optimization I could suggest would be to open the file in append mode.
"Whether the advancement of the position and the writing of the data are done in a single atomic operation is system-dependent"
Beyond that, you may have to think of a way to work around the problem. For example, by creating a large enough contiguous file as a first step.
EDIT: I also noticed that you are not explicitly closing your FileOutputStream. It would be best to hang on to that and close it, so that all the File Descriptors are closed.
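As a rough sketch of a merge that opens the destination only once and closes everything it opens: note that new FileOutputStream(File) without the append flag truncates an existing file, so this version uses FileChannel.open with CREATE and WRITE instead. It merges the parts sequentially in list order, which is an assumption; the names PartMergerSketch and mergeParts are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper, not the asker's implementation.
final class PartMergerSketch {

    // Writes the parts back to back into 'destination' and returns the total size.
    static long mergeParts(Path destination, List<Path> parts) throws IOException {
        try (FileChannel out = FileChannel.open(destination,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long offset = 0;
            out.position(offset);
            for (Path part : parts) {
                try (FileChannel in = FileChannel.open(part, StandardOpenOption.READ)) {
                    long size = in.size();
                    long done = 0;
                    // transferTo may move fewer bytes than requested, so loop.
                    while (done < size) {
                        long moved = in.transferTo(done, size - done, out);
                        if (moved <= 0) {
                            throw new IOException("Part ended early: " + part);
                        }
                        done += moved;
                    }
                    offset += size;
                }
            }
            out.truncate(offset); // drop stale bytes if the destination was longer
            return offset;
        }
    }
}
```

Whether the transfer avoids copying through user space still depends on the operating system and file system, as the documentation quoted above says.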

Java 7 filechannel not closing properly after calling a map method

I'm working on a sc2replay parsing tool. I build it on top of MPQLIB http://code.google.com/p/mpqlib/
Unfortunately the tool uses filechannels to read through the bzip files,
and uses map(MapMode.READ_ONLY, hashtablePosition, hashTableSize);
After calling that function closing the file channel does not release the file in the process.
To be specific I cannot rename/move the file.
The problem occurs in Java 7 and it works fine on Java 6.
Here is a simple code snippet to replicate it:
FileInputStream f = new FileInputStream("test.SC2Replay");
FileChannel fc = f.getChannel();
fc.map(MapMode.READ_ONLY, 0,1);
fc.close();
new File("test.SC2Replay").renameTo(new File("test1.SC2Replay"));
commenting out the fc.map will allow you to rename the file.
P.S. from here Should I close the FileChannel?
It states that you do not need to close both the FileChannel and the file stream, because closing one will close the other. I also tried closing either or both, and it still did not work.
Is there a workaround on renaming the file after reading the data using FileChannel.map on Java 7, because every one seems to have Java 7 nowadays?
Good day,
It seems that FileChannel.map causes the problem on Java 7. If you use FileChannel.map, you can no longer close the file.
A quick workaround is, instead of using FileChannel.map(MapMode.READ_ONLY, position, length),
you can use
ByteBuffer b = ByteBuffer.allocate(length);
fc.read(b,position);
b.rewind();
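A slightly fuller sketch of this workaround, since a single read call is not guaranteed to fill the buffer. The class and method names are hypothetical; it reads the requested region into a heap buffer and closes the channel, so no mapping is left holding the file.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper replacing FileChannel.map for read-only access.
final class RegionReaderSketch {

    static ByteBuffer readRegion(Path file, long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Keep reading until the buffer is full; each call may return fewer bytes.
            while (buffer.hasRemaining()) {
                int read = channel.read(buffer, position + buffer.position());
                if (read < 0) {
                    throw new EOFException("File ended before " + length + " bytes were read");
                }
            }
        }
        buffer.flip(); // position 0, limit = length, ready for reading
        return buffer;
    }
}
```

The trade-off is that the region is copied into the heap, which is fine for small tables such as an MPQ hash table but costs memory for large regions.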
It's a documented bug. The bug report refers to Java 1.4, and they consider it a documentation bug. Closing the filechannel does not close the underlying stream.
If you're using Sun JRE, you can cheat by casting to their implementation and telling it to release itself. I'd only recommend doing this if you're not reliant on the file being closed or never plan to use another JRE.
At some point, I hope that something like this will make it into the proper public API.
// Needs: import sun.nio.ch.DirectBuffer; (internal Sun/Oracle JRE class, not public API)
try (FileInputStream stream = new FileInputStream("test.SC2Replay");
FileChannel channel = stream.getChannel()) {
MappedByteBuffer mappedBuffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, 1);
try {
// do stuff with it
} finally {
if (mappedBuffer instanceof DirectBuffer) {
((DirectBuffer) mappedBuffer).cleaner().clean();
}
}
}

How to open a file without saving it to disk

My Question: How do I open a file (in the system default [external] program for the file) without saving the file to disk?
My Situation: I have files in my resources and I want to display those without saving them to disk first. For example, I have an xml file and I want to open it on the user's machine in the default program for reading xml file without saving it to the disk first.
What I have been doing: So far I have just saved the file to a temporary location, but I have no way of knowing when they no longer need the file so I don't know when/if to delete it. Here's my SSCCE code for that (well, it's mostly sscce, except for the resource... You'll have to create that on your own):
package main;
import java.io.*;
public class SOQuestion {
public static void main(String[] args) throws IOException {
new SOQuestion().showTemplate();
}
/** Opens the temporary file */
private void showTemplate() throws IOException {
String tempDir = System.getProperty("java.io.tmpdir") + "\\BONotifier\\";
File parentFile = new File(tempDir);
if (!parentFile.exists()) {
parentFile.mkdirs();
}
File outputFile = new File(parentFile, "template.xml");
InputStream inputStream = getClass().getResourceAsStream("/resources/template.xml");
int size = 4096;
try (OutputStream out = new FileOutputStream(outputFile)) {
byte[] buffer = new byte[size];
int length;
while ((length = inputStream.read(buffer)) > 0) {
out.write(buffer, 0, length);
}
inputStream.close();
}
java.awt.Desktop.getDesktop().open(outputFile);
}
}
Because of this line:
String tempDir = System.getProperty("java.io.tmpdir") + "\\BONotifier\\";
I deduce that you're working on Windows. You can easily make this code multiplatform, you know.
The answer to your question is: no. The Desktop class needs to know where the file is in order to invoke the correct program with a parameter. Note that there is no method in that class accepting an InputStream, which could be a solution.
Anyway, I don't see where the problem is: you create a temporary file, then open it in an editor or whatever. That's fine. In Linux, when the application is exited (normally) all its temporary files are deleted. In Windows, the user will need to trigger the temporary files deletion. However, provided you don't have security constraints, I can't understand where the problem is. After all, temporary files are the operating system's concern.
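For what it's worth, a platform-independent way to create the temporary file could look roughly like the sketch below. It assumes Java 7+ (java.nio.file) and the names TempResourceSketch and extractToTempFile are hypothetical. deleteOnExit() only asks the JVM to delete the file when it terminates normally, and the delete may fail if an external program still has the file open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not from the question's code.
final class TempResourceSketch {

    // Copies the resource stream into a fresh file in the system temp directory.
    static Path extractToTempFile(InputStream resource, String prefix, String suffix)
            throws IOException {
        Path temp = Files.createTempFile(prefix, suffix);
        temp.toFile().deleteOnExit(); // best-effort cleanup on normal JVM exit
        try (InputStream in = resource) {
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp;
    }
}
```

With the question's resource this would be called as extractToTempFile(getClass().getResourceAsStream("/resources/template.xml"), "template", ".xml"), and the result passed to java.awt.Desktop.getDesktop().open(path.toFile()).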
Depending on how portable your application needs to be, there might be no "one fits all" solution to your problem. However, you can help yourself a bit:
At least under Linux, you can use a pipe (|) to direct the output of one program to the input of another. A simple example for that (using the gedit text editor) might be:
echo "hello world" | gedit
This will (for gedit) open up a new editor window and show the contents "hello world" in a new, unsaved document.
The problem with the above is, that this might not be a platform-independent solution. It will work for Linux and probably OS X, but I don't have a Windows installation here to test it.
Also, you'd need to find out the default editor by yourself. This older question and its linked article give some ideas on how this might work.
I don't understand your question very well. I can see only two possibilities to your question.
Open an existing file, and you wish to operate on its stream but do not want to save any modifications.
Create a file, so that you could use file i/o to operate on the file stream, but you don't wish to save the stream to file.
In either case, your main motivation is to use the file I/O facilities that are already available to you, am I correct?
I have a feeling that the question is not that simple and that my answer is probably not the answer you seek. However, if my understanding does coincide with your question ...
If you wish to use stream I/O, instead of using FileOutputStream or FileInputStream, which require you to open a File object, why not use a non-file InputStream or OutputStream? Your file I/O utilities will finally boil down to manipulating I/O streams anyway.
http://docs.oracle.com/javase/7/docs/api/java/io/OutputStream.html
http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html
No need to involve temp files.

Java downloading files sometimes result in CRC

I've written code to automatically download a batch of files using an InputStream and a FileOutputStream.
The code is very straightforward:
is = urlConn.getInputStream();
fos = new FileOutputStream(outputFile);
eventBus.fireEvent(this, new DownloadStartedEvent(item));
int read;
byte[] buffer = new byte[2048];
while ((read = is.read(buffer)) != -1) {
fos.write(buffer, 0, read);
}
eventBus.fireEvent(this, new DownloadCompletedEvent(item));
At first sight this works very well, files get downloaded without any problems, however,
occasionally while trying to extract a batch of downloaded rar files, extraction fails with one of the rar parts having a CRC error.
As this happened a few times already, although not consistently, I started to suspect that something in this code is not correct/optimal.
It will be helpful to know that there are 4 downloads executing concurrently using the JDK FixedThreadPool mechanism:
execService.execute(new Runnable() {
@Override
public void run() {
if (item.getState().equals(DownloadCandidateState.WAITING)) {
Downloader downloader = new Downloader(eventBus);
downloader.download(item, item.getName());
}
}
});
But because every download thread uses a new instance of the Downloader class, I believe this problem is not a side effect of concurrency?
Any ideas if this occasional CRC error has to do with the code or if it has to do with something else?
UPDATE
I can verify that the file size of a problematic file is correct.
I also did a diff (on linux) on the automatically downloaded file and the manually downloaded file.
The filesize is the exact same for both files, however, diff says that the binary content differs between the 2 files:
Binary files file.rar and file(2).rar differ
UPDATE 2
I used a visual binary diff tool and could see that a sequence of 128 bytes was different, somewhere in the middle of the file. I don't understand how that could happen, as the file being downloaded doesn't change and it is being read byte per byte using an input stream. Any ideas??
You can also use Apache's HttpClient if you don't want to handle that entity streaming yourself. It's a well written and documented library. There are several usable entity / entity wrapper classes available.
Here you can have a look at entity retrieval: http://hc.apache.org/httpcomponents-client-4.0.1/tutorial/html/fundamentals.html#d4e152
You should run a diff (unix tool) comparing the original with the result to find out what has actually changed. You may see a pattern right away.
I would start by flushing (or closing) the FileOutputStream
Your code is correct provided everything is closed and no exceptions are thrown. The problem lies elsewhere, probably in the original files.
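To rule out the code, something along these lines could be used: a sketch that closes both streams via try-with-resources and computes a CRC32 of the bytes as they are received. The names ChecksummedDownloadSketch and saveWithCrc are hypothetical; comparing the returned value against a CRC32 of the file re-read from disk, or of a second download, may help narrow down where the bytes change.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper, not the asker's Downloader class.
final class ChecksummedDownloadSketch {

    // Saves the stream to 'target' and returns the CRC32 of the bytes received.
    static long saveWithCrc(InputStream source, Path target) throws IOException {
        CRC32 crc = new CRC32();
        try (InputStream in = source;
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                crc.update(buffer, 0, read);
            }
        } // both streams are closed here, even if an exception was thrown
        return crc.getValue();
    }
}
```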
Problem seemed to have been the Linux atheros driver for my NIC.
