Java SSH Recursive Download causes memory leaks

I am using JSch to provide a utility that backs up an entire server's data for my company.
The application is developed using Java 8 & JavaFX 2.
My problem: I believe my recursive download is at fault, because the program's RAM usage grows by the second and never seems to be freed.
This is the order of the operations I perform:
Connecting to the remote server: OK;
Opening the SFTP channel -> session.openChannel("sftp"): OK
Changing to the main remote directory -> sftpChannel.cd(MAIN_DIRECTORY): OK
Listing the directory contents -> final Vector<ChannelSftp.LsEntry> entries = sftpChannel.ls(".");
Then I call a recursive method that, for each entry:
if (entry.getAttrs().isDir()) -> calls itself recursively;
else -> it's a file, there are no more subfolders to go into, so it processes the download (see the sketch below).
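In pseudo-form, the traversal looks something like this (downloadDirectory and downloadFile are placeholder names; this illustrates the recursion described above rather than my exact code):
private void downloadDirectory(ChannelSftp sftpChannel, String remoteDir, File localDir) throws SftpException {
    final Vector<ChannelSftp.LsEntry> entries = sftpChannel.ls(remoteDir);
    for (ChannelSftp.LsEntry entry : entries) {
        final String name = entry.getFilename();
        if (".".equals(name) || "..".equals(name)) {
            continue; // ls() also returns the dot entries
        }
        if (entry.getAttrs().isDir()) {
            downloadDirectory(sftpChannel, remoteDir + "/" + name, new File(localDir, name)); // recurse into the subfolder
        } else {
            downloadFile(sftpChannel, remoteDir + "/" + name, new File(localDir, name)); // a file: process the download
        }
    }
}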
Now, here is where I think the memory leak occurs, in the download part.
I start the download and retrieve the InputStream:
final InputStream is = sftpChannel.get(remoteFilePath, new SftpProgressMonitor() { /* ... */ });
Here SftpProgressMonitor is a JSch interface for progress monitoring, which I implement to update the UI (a progress bar). To be clear: the implementation never references the InputStream internally. It is, however, a non-static anonymous class, so it does hold a reference to the enclosing download method's scope.
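The anonymous monitor looks roughly like this (progressBar and the JavaFX plumbing are placeholders for my UI code, not the exact implementation):
final SftpProgressMonitor progressMonitor = new SftpProgressMonitor() {
    private long transferred;
    private long total;

    @Override
    public void init(int op, String src, String dest, long max) {
        total = max;
    }

    @Override
    public boolean count(long count) {
        transferred += count;
        final double fraction = total > 0 ? (double) transferred / total : -1; // -1 = indeterminate
        Platform.runLater(() -> progressBar.setProgress(fraction)); // update the UI on the FX thread
        return true; // keep transferring
    }

    @Override
    public void end() {
    }
};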
While the download runs, I create the file to save to and open an OutputStream to write the downloaded content into it:
final BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(fileToSave));
This is where I write to the file as the remote file gets downloaded:
int readCount;
final byte[] buffer = new byte[8 * 1024];
while ((readCount = is.read(buffer)) > 0) {
    bos.write(buffer, 0, readCount);
    bos.flush();
}
And of course, once this is completed, I don't forget to close both streams:
is.close();  // the InputStream from sftpChannel.get()
bos.close(); // the BufferedOutputStream wrapping the FileOutputStream
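For reference, the same copy can be written with try-with-resources (Java 7+), which guarantees both streams are closed even if read() or write() throws; progressMonitor stands for the anonymous monitor above:
try (InputStream in = sftpChannel.get(remoteFilePath, progressMonitor);
     BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(fileToSave))) {
    final byte[] buffer = new byte[8 * 1024];
    int readCount;
    while ((readCount = in.read(buffer)) > 0) {
        out.write(buffer, 0, readCount);
    }
}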
So, as you can understand, I apply these operations recursively:
List the current directory's contents;
Check the first entry:
if it's a directory, go inside and repeat from step 1;
if it's a file, download it;
Check the second entry;
etc.
Multiple tests show exactly the same behaviour (the content to download remains identical across tests): memory usage keeps growing, and at the same pace.
[UPDATE 1]
I tried a solution where I let JSch write to the FileOutputStream itself:
final BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(fileToSave));
sftpChannel.get(remoteFilePath, bos, new SftpProgressMonitor() { /* ... */ });
In SftpProgressMonitor.end() I then close the stream -> bos.close().
Nothing changed at all.
I also tried listing all the files (still recursively), only adding each file's byte length to a private long totalBytesToDownload, and the program's memory remained very stable: only 20 MB taken during the whole process (even though totalBytesToDownload kept increasing), which confirms that my download method really is at fault.
If I do close my streams, why won't the GC collect them?

Related

FileChannel works even after removing backing file

I noticed this weird thing: an opened FileChannel object keeps working even after the backing file is deleted while the channel is in use. I created a 15 GB test file, and the following program reads 100 MB of the file content per second.
Path path = Paths.get("/home/elbek/tmp/file.txt");
FileChannel fileChannel = FileChannel.open(path, StandardOpenOption.READ);
ByteBuffer byteBuffer = ByteBuffer.allocate(1024 * 1024);
while (true) {
    int read = fileChannel.read(byteBuffer);
    if (read < 0) {
        break;
    }
    Thread.sleep(10);
    byteBuffer.clear();
    System.out.println(fileChannel.position());
}
fileChannel.close();
After the program has run for about 5 seconds (it has read 0.5 GB), I delete the file from the file system and expect an error to be thrown after a few reads, but the program goes on and reads the file to the end. I initially thought it might be served from the file cache, which is why I made the file huge: 15 GB should be big enough not to fit in the cache.
Anyway, how is the OS serving read requests while the file itself is not there anymore? The OS I am testing on is Fedora.
Thanks.

memory problems with large file download from WCF SOAP service to JAVA client

We currently have an existing WCF SOAP service that runs fine with a large range of clients. We present a StreamBody as a way to download larger filesets. I have tried virtually every way I can think of to download large files without the client loading the file completely into memory, and I have failed in every attempt. Essentially, on the following call, the Java client wants to load the complete file into memory. I am looking for suggestions. Below is my latest attempt:
OrderServiceStub stub = getOrderServiceStub();
OrderServiceStub.GetStreamedOrderOutputRequestMessage getStreamedOrderOutputRequestMessage = new OrderServiceStub.GetStreamedOrderOutputRequestMessage();
OrderServiceStub.GetStreamedOrderOutputRequest getStreamedOrderOutputRequest = new OrderServiceStub.GetStreamedOrderOutputRequest();
for (OrderServiceStub.OrderOutput o : orderoutput.getOrderOutput()) {
    OrderServiceStub.Guid guidOutput = o.getOrderOutputTicket();
    String fileName = o.getOrderOutputName();
    getStreamedOrderOutputRequest.setOrderOutputTicket(guidOutput);
    getStreamedOrderOutputRequestMessage.setGetStreamedOrderOutputRequest(getStreamedOrderOutputRequest);
    int bufferSize = 1024;
    InputStream is = new BufferedInputStream(stub.getStreamedOrderOutput(getStreamedOrderOutputRequestMessage).getFileData().getStreamBody().getInputStream());
    OutputStream os = new FileOutputStream(new File("C:\\temp\\" + fileName));
    org.apache.commons.io.IOUtils.copyLarge(is, os);
    org.apache.commons.io.IOUtils.closeQuietly(is);
    org.apache.commons.io.IOUtils.closeQuietly(os);
}
Instead of copying the whole thing via the Apache helper, you might try putting it into a loop: read bufferSize bytes, write them, flush the output stream, then continue doing the same. Hopefully that will help.
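A rough sketch of that suggestion, reusing is, os and bufferSize from the code above:
byte[] chunk = new byte[bufferSize];
int read;
while ((read = is.read(chunk)) != -1) {
    os.write(chunk, 0, read);
    os.flush(); // push each chunk out instead of letting it accumulate
}
org.apache.commons.io.IOUtils.closeQuietly(is);
org.apache.commons.io.IOUtils.closeQuietly(os);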

IOException insufficient disk space when accessing Citrix mounted drive

I'm having a really strange problem. I'm trying to download a file and store it. My code is relatively simple and straightforward (see below) and works fine on my local machine.
But it is intended to run on a Windows Terminal Server accessed through Citrix and a VPN. The file is to be saved to a mounted network drive. This mount is the local C:\ drive mounted through the Citrix VPN, so there might be some lag involved. Unfortunately I have no inside knowledge of how exactly the whole infrastructure is set up...
Now my problem is that the code below throws an IOException telling me there is no space left on the disk, when attempting to execute the write() call. The directory structure is created alright and a zero byte file is created, but content is never written.
There is more than a gigabyte of space available on the drive, the Citrix client has been given "Full Access" permissions, and copying/writing files on that mapped drive with Windows Explorer or Notepad works just fine. Only Java is giving me trouble here.
I also tried downloading to a temporary file first and then copying it to the destination, but since copying is basically the same stream operation as in my original code, there was no change in behavior: it still fails with an out-of-disk-space exception.
I have no idea what else to try. Can you give any suggestions?
public boolean downloadToFile(URL url, File file) {
    boolean ok = false;
    try {
        file.getParentFile().mkdirs();
        BufferedInputStream bis = new BufferedInputStream(url.openStream());
        byte[] buffer = new byte[2048];
        FileOutputStream fos = new FileOutputStream(file);
        BufferedOutputStream bos = new BufferedOutputStream(fos, buffer.length);
        int size;
        while ((size = bis.read(buffer, 0, buffer.length)) != -1) {
            bos.write(buffer, 0, size);
        }
        bos.flush();
        bos.close();
        bis.close();
        ok = true;
    } catch (Exception e) {
        e.printStackTrace();
    }
    return ok;
}
Have a try with commons-io, especially the util classes FileUtils and IOUtils.
After changing our code to use commons-io, all file operations went much more smoothly, even with mapped network drives.
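A minimal sketch of that suggestion (FileUtils.copyURLToFile performs the stream copy and closes everything internally, and creates missing parent directories):
public boolean downloadToFile(URL url, File file) {
    try {
        org.apache.commons.io.FileUtils.copyURLToFile(url, file); // copies the URL content straight to the file
        return true;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    }
}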

Java downloading files sometimes results in CRC errors

I've written code to automatically download a batch of files using an InputStream and a FileOutputStream.
The code is very straightforward:
is = urlConn.getInputStream();
fos = new FileOutputStream(outputFile);
eventBus.fireEvent(this, new DownloadStartedEvent(item));
int read;
byte[] buffer = new byte[2048];
while ((read = is.read(buffer)) != -1) {
    fos.write(buffer, 0, read);
}
eventBus.fireEvent(this, new DownloadCompletedEvent(item));
eventBus.fireEvent(this, new DownloadCompletedEvent(item));
At first sight this works very well and files get downloaded without any problems. However, occasionally, extracting a batch of downloaded rar files fails with a CRC error in one of the rar parts.
As this has happened a few times already, although not consistently, I started to suspect that something in this code is not correct/optimal.
It will be helpful to know that there are 4 downloads executing concurrently, using the JDK fixed thread pool mechanism:
execService.execute(new Runnable() {
    @Override
    public void run() {
        if (item.getState().equals(DownloadCandidateState.WAITING)) {
            Downloader downloader = new Downloader(eventBus);
            downloader.download(item, item.getName());
        }
    }
});
But since every download thread uses a new instance of the Downloader class, I believe this problem is not a side effect of concurrency?
Any idea whether this occasional CRC error has to do with the code, or with something else?
UPDATE
I can verify that the file size of a problematic file is correct.
I also did a diff (on Linux) of the automatically downloaded file against the manually downloaded file.
The file size is exactly the same for both files; however, diff says the binary content differs between the two:
Binary files file.rar and file(2).rar differ
UPDATE 2
I used a visual binary diff tool and could see that a sequence of 128 bytes was different, somewhere in the middle of the file. I don't understand how that could happen, as the file being downloaded doesn't change and it is read byte by byte through an input stream. Any ideas?
You can also use Apache's HttpClient if you don't want to handle the entity streaming yourself. It's a well-written and documented library, and there are several usable entity / entity-wrapper classes available.
Here you can have a look at entity retrieval: http://hc.apache.org/httpcomponents-client-4.0.1/tutorial/html/fundamentals.html#d4e152
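For illustration, entity retrieval with HttpClient 4.x looks roughly like this (fileUrl is a placeholder; the exact setup depends on your configuration):
HttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet(fileUrl);
HttpResponse response = client.execute(get);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream in = entity.getContent();
    try {
        // stream to a file here, as in the question's loop
    } finally {
        in.close(); // also releases the underlying connection
    }
}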
You should run a diff (the Unix tool) comparing the original with the result, to find out what has actually changed. You may see a pattern right away.
I would start by flushing (or closing) the FileOutputStream
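Something along these lines (a sketch reusing the question's fields), so the stream is closed, and thereby flushed, even if the download fails midway:
fos = new FileOutputStream(outputFile);
try {
    int read;
    byte[] buffer = new byte[2048];
    while ((read = is.read(buffer)) != -1) {
        fos.write(buffer, 0, read);
    }
} finally {
    fos.close(); // runs even on failure; close() flushes pending bytes
}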
Your code is correct provided everything is closed and no exceptions are thrown. The problem lies elsewhere, probably in the original files.
The problem seems to have been the Linux Atheros driver for my NIC.

Resources.openRawResource() issue Android

I have a database file in the res/raw/ folder. I call Resources.openRawResource() with the resource ID R.raw.FileName and get an InputStream. There is another database file on the device, and to copy the contents of the raw resource database into the device database I use:
BufferedInputStream bi = new BufferedInputStream(is);
and a FileOutputStream, but I get an exception saying the database file is corrupted. How can I proceed?
I also tried reading the file with File and FileInputStream and the path /res/raw/fileName, but that doesn't work either.
Yes, you should be able to use openRawResource to copy a binary across from your raw resource folder to the device.
Based on the example code in the API demos (content/ReadAsset), you should be able to use a variation of the following code snippet to read the db file data.
InputStream ins = getResources().openRawResource(R.raw.my_db_file);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int size = 0;
// Read the entire resource into a local byte buffer.
byte[] buffer = new byte[1024];
while ((size = ins.read(buffer, 0, 1024)) >= 0) {
    outputStream.write(buffer, 0, size);
}
ins.close();
buffer = outputStream.toByteArray();
A copy of your file should now exist in buffer, so you can use a FileOutputStream to save the buffer to a new file.
FileOutputStream fos = new FileOutputStream("mycopy.db");
fos.write(buffer);
fos.close();
InputStream.available has severe limitations and should never be used to determine the length of the content available for streaming.
http://developer.android.com/reference/java/io/FileInputStream.html#available():
"[...]Returns an estimated number of bytes that can be read or skipped without blocking for more input. [...]Note that this method provides such a weak guarantee that it is not very useful in practice."
You have 3 solutions:
Go through the content twice: first just to compute the content length, then again to actually read the data;
Since Android resources are prepared by you, the developer, hardcode the expected length;
Put the file in the assets/ directory and read it through AssetManager, which gives you access to an AssetFileDescriptor and its content-length methods (see the sketch below). This may, however, give you the UNKNOWN value for the length, which isn't that useful.
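A small sketch of option 3 ("my_db_file.db" is a placeholder asset name, and this assumes you are inside a Context such as an Activity; note that openFd() throws for compressed assets):
AssetFileDescriptor afd = getAssets().openFd("my_db_file.db");
long length = afd.getLength(); // may be AssetFileDescriptor.UNKNOWN_LENGTH
InputStream ins = afd.createInputStream(); // read the content as usual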
