How to handle imperfections in transferring thousands of files using JSch?

I have written and tested Java code that uses JSch to transfer files. The transfer from the source computer, HERE, to a destination computer, THERE, works flawlessly. Using the same unmodified code, the transfer from HERE to another computer, PROBLEM, works about half the time. The problem is that the code will, on a random file, hang on the write or close, and no exception is thrown, even after an extremely long timeout. (The next use of the channel after the extremely long timeout does cause an exception.) Using the Linux command "scp" (openssh-clients) works flawlessly copying the same set of files, both from HERE to THERE and from HERE to PROBLEM.
I assume there is an imperfection in the transmission or reception of the files that openssh::scp has been designed to detect and work around. Any suggestions as to how to proceed?
Detail(s):
methods used for write/close
OutputStream fos = put(String rp3);
fos.close();
Is there a means similar to Unix alarm/SIGALRM to interrupt the write/close so that the attempt can be retried?
Is there a session setConfig parameter that instructs Jsch to be more fault tolerant? Where are these documented?
Should I switch to another Java implementation of scp?
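One JDK-only way to approximate alarm/SIGALRM is to run the blocking call on a worker thread and wait for it with a deadline. The sketch below is a hedged illustration, not a JSch feature: the class and method names (TimedCall, callWithTimeout) are made up here, and whether the interrupt actually unblocks a stuck socket write depends on the underlying I/O.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: names are illustrative and not part of JSch.
final class TimedCall {
    private TimedCall() {}

    /**
     * Runs the task on a worker thread and waits at most timeoutMillis for it.
     * On timeout the worker is interrupted and TimeoutException is rethrown.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "timed-call");
            worker.setDaemon(true); // a stuck worker must not keep the JVM alive
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Delivers an interrupt; blocking socket I/O may not react to it.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The write and close for one file could be wrapped in a Callable and passed to callWithTimeout. Since the question reports that the channel throws on its next use after a hang, a retry would probably need to disconnect and open a fresh session and channel first. JSch's Session also has setTimeout and setServerAliveInterval, which may be worth trying before adding a wrapper like this.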

Related

How to read from a file that is in use

There's a file I wanted to get into, but whenever I try to open it I get the message "The process cannot access the file because it is being used by another process".
Well, I want in! So, how can I do it?
I've been brainstorming a few ways to try, I'm hoping to get some input on other ways, or if my ideas wouldn't work for some reason that is not apparent to me.
Idea 1 The folder knows where the file is, it just won't open it. What if I create a program to read from the memory address of the file, copy it, then rebuild it somewhere else? I'm not sure if this has hope, because it relies on the file being the issue.
Idea 2 How does my process know that another process is using the file? If it's checking against all the other processes, maybe I can also figure out which process is using that file and pause it or end it.
Either of these ideas will probably take me weeks. Is anyone more creative and can think of another way; or more knowledgeable and eliminate an impractical idea?

In Windows, applications are allowed to obtain exclusive locks on files. When the process opens the file, one thing you specify is who else can access it while your process does (those are the .NET methods, but equivalents exist in other languages). Excel, for example, is notorious for getting an exclusive lock when you open a file. The way around it is usually to find the offending process and kill it to break the lock. Unlocker is the app that I'm most familiar with to accomplish this. If the process is a System process, however, you may not be able to kill it. You'd have to reboot to reset the lock.
Reading directly from another process's memory is unlikely to be reliable. The application may not have an in-memory copy, may not have a complete in-memory copy, may not have a consistent in-memory copy, and may not have an in-memory copy that matches what's on disk (if they're editing the document, for example).
Your process knows that the file is locked because when it tries to open the file, it does so by asking the operating system for access to the file. The operating system responds saying, "Request denied. Another process has this file open and locked." The OS doesn't tell your process what process has the file open because trying to open a file doesn't include asking for who already has it open. Your process must ask the right question to get the answer you're looking for.

Windows makes you specify a sharing mode when opening a file. The sharing mode may prevent the file from being read, written, or deleted while you have it open. If you want to allow simultaneous read access you should include FILE_SHARE_READ in the dwShareMode parameter when you call CreateFile (http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx).
In other words, if you want to enable concurrent access to an open file you must modify the way the file is opened in the first place.
The portable standard libraries in C and Java don't offer a way to set the sharing mode when opening a file, but their usual implementations on Windows set the sharing mode to READ+WRITE.

Creating an inputstream for use with JSCH API. (Java)

Why am I trying to do this?
Right now I'm trying to make a multi tabbed SSH client for use with a few servers. I have 8 at the moment, soon to be 9. As you can imagine, there are a few redundant tasks one has to do when working with Linux. Connecting to each server to make changes one at a time is a terribly tedious process. That's why I'm trying to make an SSH client that can connect to multiple servers at the same time so I can send a command ONCE to have it affect all servers I own.
How far am I right now?
I have a nice UI set up that can connect, log-in, and receive data from the servers. For input, the API requires I specify an inputstream. If I specify System.in as my inputstream, I can then run the program, and have whatever I type into console be broadcast out to the different servers, via the API.
The problem is that no end user will ever want to work with a separate console to use this program. It will look dinky. So I need some way to take input from the text field and send it through a specified inputstream. That means I'll need an inputstream that never closes unless the program closes. Like System.in. Also, I can't easily redefine the stream once I set it. I searched for an answer yesterday for around 10 hours. Couldn't find anything. If anyone can help, please do. Thank you.
I need an inputstream that works exactly like an outputstream. It stays open even when nothing is being sent through it, but as soon as it gets data, the data is sent automatically to anything that is using it. This API is very strange, but this last inputstream part is the only thing keeping me from finishing up my program. Thank you for your time.

JSCH sudo su command "tty" error
I was using the API incorrectly. Stupid, yes. I don't want anyone else making the same mistake though. I guess I was following a bad example found somewhere else on the internet.
Essentially, you don't even need to set the input stream. You just need to use the output stream that already exists. Write directly to the output stream. Pretty sure I was trying to do this at 3am last night. It was right in front of me the whole time.
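The fix above relies on the channel's own output stream. For an API that really does insist on an InputStream, the JDK's piped streams are one way to get "an inputstream that works like an outputstream". The sketch below is only an illustration under that assumption; the class and method names (TextFieldPipe, send, inputStream) are hypothetical and not part of JSch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: bytes written with send() become readable from inputStream().
final class TextFieldPipe {
    private final PipedOutputStream sink = new PipedOutputStream();
    private final PipedInputStream source;

    TextFieldPipe() throws IOException {
        // Connect the two ends; the input side stays open until the pipe is closed.
        source = new PipedInputStream(sink);
    }

    /** The stream to hand to whatever API expects an InputStream. */
    InputStream inputStream() {
        return source;
    }

    /** Pushes one line of text, terminated by a newline, into the pipe. */
    void send(String line) throws IOException {
        sink.write((line + "\n").getBytes(StandardCharsets.UTF_8));
        sink.flush();
    }
}
```

A text field's action listener could call send() with the field's contents. Piped streams are sensitive to the writing thread terminating, so the writes are best made from a long-lived thread such as the UI event thread.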

Tomcat and open file handlers

I'm writing a web service using Axis and Apache Tomcat 7.
The service uses a third party library to do some conversions to a file and ends up creating a folder which contains more files (subfolders and regular files). When the conversion is completed the service creates a zip archive and returns it.
When it receives a new request, it first removes the files created during the last request and then starts handling the new one.
The service itself works fine, at least the first request is satisfied.
The problem is that when a second request is received, the service cannot delete all the files generated during the last request.
I'm using Windows XP, and with Process Explorer I see that Tomcat is keeping some files (but not all of them) open, and that's why I can't delete them.
Is it possible that the library I'm using keeps the files open even when the service operation ends?
In the code that I use to create the zip archive, it seems that I close all the streams that I open. By the way, even if I forget to close them, can they still stay open after the service operation returns its results to the client?
And if so, why does the Tomcat process keep only some of the files open?
It seems that after some time some files are "released", but other files are always kept open...
I hope someone can give me some advice on how to handle this situation :)

Repost of my comment which seems to be useful.
If a file handle is not released, it will never be released until the servlet container is shut down. Some implementations may also delay releasing file handles until the object is garbage collected. There is nothing you can do except make sure that you close all handles. If it's the third-party library, then you have to report a bug or fix it yourself.
My best practice to prevent this sort of problem is to make sure that a file handle is closed in the same method in which it is opened. If it was not opened in that method, never close it there.
public void method() {
    // open file handle
    // do something
    // close file handle; make sure it is closed even if there is an exception
}
And never make a file handle a field.
public class A {
    private FileInputStream fin = null; // never do this; you will have a hard time keeping track of when to release it
}
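In Java 7 and later, try-with-resources is one way to follow the "open and close in the same method" rule described above. This is a minimal sketch; the class and method names (ScopedHandle, countBytes) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example: the stream never escapes the method that opens it.
final class ScopedHandle {
    private ScopedHandle() {}

    /** Counts the bytes in a file; the handle is opened and closed right here. */
    static long countBytes(Path file) throws IOException {
        // try-with-resources closes the stream on normal return and on exception.
        try (InputStream in = Files.newInputStream(file)) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            return total;
        }
    }
}
```

Because the handle is released before the method returns, the file can be deleted immediately afterwards, which is the situation the question needs.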

Well, as I suspected, the library I'm using doesn't close all the file streams.
I found a workaround that can be used in similar situations. I'm posting it in case someone finds it useful or can suggest a better way to solve the problem :)
Luckily, the jar library can be executed from the command line with the java command, so I just executed it with Runtime's exec method in this way:
try {
    Process p = Runtime.getRuntime().exec("java -jar library.jar parameters");
    p.waitFor();
} catch (InterruptedException | IOException e) {
    e.printStackTrace();
}
In this way a new process is created by the JVM, and, when it dies, all the pending handles are released.
If you can't execute the library with the java command, the library can simply be wrapped in a custom jar built around a simple class that uses it and takes the needed parameters. The wrapper jar can then be executed with exec().

Java monitor file system when java is not running

I recently implemented Java 7's WatchService and it works perfectly. Now I wonder if there is a way to get all the events which occurred since the last run of my program. For example:
I run my program, create some files, edit some files and I get all the corresponding Events
I close my program
I create a file named foo.txt
I start my program, and the first event I get is an ENTRY_CREATE for foo.txt
I thought about saving the lastModifiedDate and searching for files and directories newer than the last execution of my program. Is there another (and better) way to do this?

There is no better way to do this if your program is meant to scan for all file changes (apart from storing files in a content / source control repository, but that would be external to your program).
Java 7's WatchService is only a more performant way than continuously looping and comparing file dates / folder contents, hence you need to implement your own logic to solve your own problem.
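The lastModifiedDate idea from the question could be sketched roughly as follows. It assumes the program saves the time of its last run somewhere (not shown); the class and method names (ChangedSince, modifiedAfter) are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: finds regular files changed after a saved timestamp.
final class ChangedSince {
    private ChangedSince() {}

    /** Returns regular files under root whose last-modified time is after lastRun. */
    static List<Path> modifiedAfter(Path root, FileTime lastRun) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> {
                    try {
                        return Files.getLastModifiedTime(p).compareTo(lastRun) > 0;
                    } catch (IOException e) {
                        return false; // file vanished or is unreadable; skip it
                    }
                })
                .collect(Collectors.toList());
        }
    }
}
```

A scan like this only sees files that were created or modified. Deletions and renames that happened while the program was not running leave no timestamp to find, so detecting them would need a stored listing of the previous directory contents to compare against.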

There is no way to do this in Java, or in any other programming language.
The operating system doesn't (and can't) buffer file system events on the off-chance that someone might start a program to process them. The event monitor / delivery system captures the events for a running application that is listening for them. When nothing is listening, the events are not captured.

You could write a small daemon (system service on Windows) which runs continuously and listens for file system changes. It could write these to a file. When your application runs, rather than listening for changes itself, it could just read the file. As events happen while it runs, the daemon will continue to receive them and send them through the file to the application.
You would need to ensure that the file was organised in such a way that it could be written to and read from safely at the same time, and that it did not grow indefinitely.

blocking (synchronous) ftp download in java?

I'm currently using commons-net library for FTP client in my app. I have to download from remote server some files, by some criteria based on the file name. This is a very simplified and reduced version of my actual code (because I do some checks and catch all possible exceptions), but the essence is there:
// ftp is an FTPClient object
// ...
FTPFile[] files = ftp.listFiles();
for (FTPFile ftpFile : files) {
    String name = ftpFile.getName();
    if (conformsCriteria(name)) {
        String path = outDirectory + File.separatorChar + name;
        // try-with-resources closes the local file even if the transfer fails
        try (OutputStream os = new FileOutputStream(path)) {
            ftp.retrieveFile(name, os);
        }
    }
}
Now, what I noticed is that when I run this code, wait a few seconds, and then unplug the network cable, the output directory contains some "empty" files plus the files actually downloaded, which leads me to believe that this method is working somewhat asynchronously... But then again, some files are downloaded (size > 0KB), and there are these empty files (size = 0KB), which leads me to believe that the download is still serialized... Also, the function retrieveFile() returns, quoting the documentation:
True if successfully completed, false if not
What I need is serialized download, because I need to log every unsuccessful download.
What I saw browsing through the commons-net source is that, if I'm not wrong, new Socket is created for each retrieveFile() call.
I'm pretty confused about this, so if someone could explain what is actually happening, and offer a solution with this library, or recommend some other Java FTP library that supports blocking download per file, that would be nice.
Thanks.

You could just use the java.net.URLConnection class that has been present forever. It should know how to handle FTP URLs just fine. Here is a simple example that should give the blocking behavior that you are looking for.
The caveat is that you have to manage the input/output streams yourself, but this should be pretty simple.
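A minimal sketch of that approach, assuming the file is reachable through an ftp:// URL. The class and method names (BlockingDownload, download) and the timeout values are illustrative choices, not anything prescribed by the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies whatever the URL serves into a local file.
final class BlockingDownload {
    private BlockingDownload() {}

    /** Blocks until the whole resource is copied; returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        URLConnection connection = source.openConnection();
        connection.setConnectTimeout(10_000); // assumed values; tune for the network
        connection.setReadTimeout(30_000);
        // The stream is managed here and closed even if the copy fails.
        try (InputStream in = connection.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The call returns only after the copy finishes or throws an IOException, so each failed download can be logged in the catch block of the loop that calls it. The source URL would take the usual form ftp://user:password@host/path, with the real host and credentials filled in.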

Ok, to briefly answer this in order not to confuse people who might see this question.
Yes, commons-net for FTP is working as I thought it would, that is, retrieveFile() method blocks until it's finished with the download.
It was (of course) my own "mistake" in the code that let me think otherwise.
