I have a batch process that converts WAV to MP3 sequentially. The problem is that after a few thousand there are too many files left open, and it runs up against the file limit.
The culprit is this code in SystemCommandTasklet:
FutureTask<Integer> systemCommandTask = new FutureTask<Integer>(new Callable<Integer>() {
    public Integer call() throws Exception {
        Process process = Runtime.getRuntime().exec(command, environmentParams, workingDirectory);
        return process.waitFor();
    }
});
This has the nasty side effect of making me rely on the JVM to clean up the processes, leaving files open and such.
I've rewritten it to be so:
FutureTask<Integer> systemCommandTask = new FutureTask<Integer>(new Callable<Integer>() {
    public Integer call() throws Exception {
        Process process = Runtime.getRuntime().exec(command, environmentParams, workingDirectory);
        int status = process.waitFor();
        process.getErrorStream().close();
        process.getInputStream().close();
        process.getOutputStream().flush();
        process.getOutputStream().close();
        process.destroy();
        return status;
    }
});
I'm 95% certain that this works on my Mac (thanks to lsof), but how do I write a proper, portable test that PROVES that what I am trying to do actually works?
You should not use Runtime#exec() for this, because the resulting process is not attached to the JVM's process. Take a look at java.lang.ProcessBuilder, which returns a Process controlled by the JVM's process, so destroying it forces the system to free/close its resources.
You should also schedule and limit the number of concurrent processes, using java.util.concurrent.Executors.
You can read the limit with "ulimit -Sn" ("ulimit -Hn" should not be preferred, for the sake of system health ;).
Also check whether the tool that converts your media keeps resources reserved after completion (leaks, waiting for caller signals, etc.).
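As a sketch of the ProcessBuilder approach (the class name and the idea of draining output into nothing are my own; a real converter would pass its actual command line):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

public class ProcessRunner {

    // Runs the command, drains its combined output so the pipe buffers
    // can't fill up, and releases the process's streams when done.
    public static int runAndDrain(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        try (InputStream out = process.getInputStream()) {
            byte[] buffer = new byte[8192];
            while (out.read(buffer) != -1) {
                // discard; a real converter might log this instead
            }
        }
        int status = process.waitFor();
        process.destroy(); // belt and braces: make sure all descriptors are released
        return status;
    }
}
```

For the WAV-to-MP3 case this would be called with something like runAndDrain(Arrays.asList("lame", wavPath, mp3Path)).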
A proof will be difficult. But ...
Create a dummy command that doesn't do much but keeps a lock on its files, just like the real thing. This makes sure your test doesn't depend on the actual command used.
Create a test that starts SystemCommandTask using the old version, but with the dummy command. Make it start the task repeatedly until you get the expected exception. Let's call the number of tasks needed N.
Change the test to start 100xN tasks.
Switch the task to the new version. If the test goes green, you can be reasonably sure your code works.
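As one building block for such a test, on Linux the open-descriptor count can be read from /proc/self/fd instead of shelling out to lsof (the path is Linux-specific; the class name is mine):

```java
import java.io.File;

public class FdCounter {

    // Number of file descriptors currently open in this JVM,
    // or -1 if /proc/self/fd is not available (non-Linux systems).
    public static int openFdCount() {
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }
}
```

The test would then assert that openFdCount() stays flat while the 100xN dummy tasks run through the new SystemCommandTask.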
You could try to code around it: why not proactively limit the number of tasks running at any one time, say to 100? You could use a pooling mechanism such as a thread pool to execute your work.
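A minimal sketch of that idea, with the real conversion command replaced by a placeholder:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedConverter {

    // Submits one task per input file but never runs more than
    // maxConcurrent conversions at the same time.
    public static List<String> convertAll(List<String> inputs, int maxConcurrent) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                // placeholder for the real WAV -> MP3 system command
                futures.add(pool.submit(() -> input + ".mp3"));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // also propagates any conversion failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool of 100 threads, at most 100 child processes (and their descriptors) exist at once, regardless of how many files are queued.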
Related
I built an interactive EXE which means that you can continuously send new commands to it and it will process them.
An automation of this can be implemented in Java according to this answer. However, when a command is sent, the code does not wait until the command has finished; it returns control to the caller right away, which can lead to race conditions: if the sent command was supposed to write a file, the file may not exist yet when it is accessed. How can I send a command, read the output, and have sendCommand() return only once the process is expecting input again?
public synchronized void sendCommand(String command) throws IOException
{
    byte[] commandBytes = (command + "\n").getBytes(UTF_8.name());
    outputStream.write(commandBytes);
    outputStream.flush();
}
Preferably it should also return the process output in the meantime. That would be the default behavior of a non-interactive shell command, which terminates once it finishes executing. read() blocks indefinitely until the process terminates, and I do not want to hardcode the length of the expected process output or use similar hacks to work around this shortcoming.
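For reference, one way to make sendCommand() block until the tool is ready again is to read the output line by line until a known prompt marker appears. The prompt string is an assumption here; it supposes the interactive binary prints a recognizable line whenever it waits for input:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PromptReader {

    // Collects output lines until the prompt marker shows up,
    // signalling that the process is ready for the next command.
    public static List<String> readUntilPrompt(BufferedReader reader, String prompt) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.equals(prompt)) {
                break; // process is waiting for input again
            }
            lines.add(line);
        }
        return lines;
    }
}
```

sendCommand() would write the command, then call readUntilPrompt() on the process's stdout before returning the captured output to the caller.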
I decided to rewrite my binary to be non-interactive again. It turns out the expected performance gain was negligible so there was no more reason to keep it interactive and go through an increased implementation hassle.
I'm having issues with a simple Spark job of mine, which looks like this after simplification.
JavaRDD<ObjectNode> rdd = pullAndProcessData();
ManifestFilesystem fs = getOutputFS();
List<WriteObjectResult> writeObjectResults = rdd.mapPartitions(fs::write).collect();
fs.writeManifest(Manifest.makeManifest(writeObjectResults));
My expectation with this code is that whatever happens, writeManifest is going to be called if and only if all the tasks are finished and have successfully written their partition to S3. The problem is that apparently, some tasks are writing to S3 after the manifest, which should never happen.
In ManifestFilesystem.write, I delete the existing manifest (if there is one) to invalidate it because the normal workflow should be:
write all the partitions to S3
write the manifest to S3
I suspect it could happen because of speculative tasks, in the following scenario:
some tasks are marked speculatable and re-sent to other slaves
every speculated task finishes on at least one of the slaves it was sent to, but some copies keep running on slower slaves
Spark does not interrupt those copies, and returns the result of collect to the driver before they finish
the speculative copies that are still running then execute ManifestTimeslice.write and delete the manifest before writing their partition
Is that something that can happen ? Does anybody have another hypothesis for such behaviour ?
Note: using built-in data publishing methods is not an option
Note 2: I actually found this which tends to confirm my intuition, but it would still be great to have a confirmation because I'm not using standard HDFS or S3 read/write methods for reasons outside of the scope of this question.
Spark does not proactively kill speculative tasks; it just waits until the task finishes and ignores the result. I think it's entirely possible that your speculative tasks continue writing after the collect call.
I'll answer my own question after realizing that there was no way around it from the perspective of Spark: how would you make sure you kill all the speculative tasks before they have time to complete? It's actually better to let them run to completion; otherwise they might be killed while writing a file, which would then be truncated.
There are different possible approaches :
a few messages in this thread suggest that one common practice is to write to a temporary attempt file and then perform an atomic rename (cheap on most filesystems, since it's a mere pointer switch). If a speculative task tries to rename its temporary file to an existing name, which cannot happen concurrently if the operation is atomic, the rename request is ignored and the temp file is deleted.
to my knowledge, S3 does not provide atomic rename. Also, although the process described above is fairly easy to implement, we are currently trying to keep homebrew solutions to a minimum and keep the system simple. Therefore, my final solution is to use a jobId (for example, the timestamp at which the job started), pass it around to the slaves, and write it into the manifest. When writing a file to the FS, the following logic is applied:
public WriteObjectResult write(File localTempFile, long jobId) {
    // cheap operation to check if the manifest is already there
    if (manifestsExists()) {
        long manifestJobId = Long.parseLong(getManifestMetadata().get("jobId"));
        if (manifestJobId == jobId) {
            log.warn("Job " + jobId + " has already completed successfully and published a manifest. Ignoring write request.");
            return null;
        }
        log.info("A manifest has already been published by job " + manifestJobId + " for this dataset. Invalidating manifest.");
        deleteExistingManifest();
    }
    return publish(localTempFile);
}
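For completeness, the temp-file-plus-atomic-rename approach described above can be sketched as follows. Names are illustrative, and note one hedge: the existence check and the move below are not a single atomic unit, whereas a real implementation would rely on a rename that atomically fails when the target exists (as HDFS provides):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicPublish {

    // Each task attempt writes to its own temp file, then tries to
    // rename it into place. A losing speculative attempt finds the
    // target already published and simply discards its own copy.
    public static boolean publish(Path attemptFile, Path target) throws IOException {
        if (Files.exists(target)) {
            // another attempt already published this partition
            Files.deleteIfExists(attemptFile);
            return false;
        }
        Files.move(attemptFile, target, StandardCopyOption.ATOMIC_MOVE);
        return true;
    }
}
```

The first attempt to publish wins; later speculative attempts leave the published file untouched.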
I'm trying to set up my java code such that it can be potentially stopped after any point of code execution.
I was thinking of putting all of my code inside a thread and calling Thread.interrupt() on it when I want it to stop. Note that this method only causes the code to throw an InterruptedException if a thread-blocking method (like sleep, join, wait, etc.) is running. Otherwise it just sets the interrupted flag, and we have to check it via isInterrupted() after every line.
So now all I need to do is insert the following code...
if (myThread.isInterrupted()) {
    System.exit(0);
}
after every line of code. If my java code was stored in a String, how can I insert this line of code after every point of execution of my code?
I was thinking of using the split method on semicolons and inserting the thread code between every element of the resulting array, but that doesn't work because, for example, for loops contain semicolons that don't mark the end of a statement. I think I would also have to split on closing curly braces, because they also mark the end of a statement.
EDIT: solution attempt:
final String javaCode = "if (myString.contains(\"foo\")) { return true; } int i = 0;";
final String threadDelimiter = "if (thisThread.isInterrupted()) { System.exit(0); }";
final StringBuffer sb = new StringBuffer();
for (int i = 0; i < javaCode.length(); i++) {
    final char currChar = javaCode.charAt(i);
    sb.append(currChar);
    if ("{};".contains(currChar + "")) {
        sb.append(threadDelimiter);
    }
}
System.out.println(sb);
This code is almost correct, but it would not work for loops that use semicolons in their headers, and it also wouldn't work for for loops that don't have braces.
First off, if your goal is to trigger System.exit() there's no need to inject such calls into another thread and then interrupt that thread; just call System.exit() where you would have called otherThread.interrupt(); and the process will exit just as quickly.
Second, your plan will not accomplish your goal. Suppose you added your interrupt-then-exit code around this statement:
new Scanner(System.in).next();
If nothing is being written to the program's stdin, this statement will block indefinitely and will not respect thread interruption. There are countless other examples of similar code snippets that will cause a thread to block. Your interrupted check may never be reached, no matter how granularly you inject it.
Third, manipulating source code as a string is the road to madness. There are standard tools that parse Java syntax into structured data types you can safely manipulate. If you must do source-code manipulation, use the right equipment.
Fourth, there's no particular reason you need to interact with source code to simply inject additional commands; compile the source and use bytecode injection to inject your commands directly. Bytecode has a much more limited syntax than Java source, therefore it's easier to reason about and mutate safely.
All of that aside, however, you simply don't need to do any of this.
It sounds like your goal is to execute some unknown snippet of Java code but be able to cause the process to stop at any time. This is exactly what your operating system is designed to do - manage process executions. To that end POSIX provides several standard signals you can send to a process to gracefully (or not-so-gracefully) cause that process to terminate. Sending a Java process a SIGINT (Ctrl-C), SIGTERM, or SIGHUP will cause the JVM to initiate its shutdown sequence, effectively triggering a System.exit().
There are two key steps that occur when the JVM shuts down:
In the first phase all registered shutdown hooks, if any, are started in some unspecified order and allowed to run concurrently until they finish. In the second phase all uninvoked finalizers are run if finalization-on-exit has been enabled. Once this is done the virtual machine halts.
In other words, as long as you prevent the code you're running from registering malicious shutdown hooks or calling runFinalizersOnExit() sending these signals will cause the JVM to terminate promptly.
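For illustration, registering such a cleanup hook is a one-liner; removeShutdownHook() shows that a hook can also be deregistered again before shutdown begins (the class name and hook body here are placeholders):

```java
public class HookDemo {

    // Registers a cleanup hook and then removes it again.
    // A real hook body would close files, sockets, etc.
    public static boolean registerAndRemove() {
        Thread hook = new Thread(() -> System.out.println("releasing resources"));
        Runtime.getRuntime().addShutdownHook(hook);
        // returns true because the hook was still registered
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        System.out.println("removed=" + registerAndRemove());
    }
}
```

If the hook stays registered, it runs when the JVM receives SIGINT/SIGTERM or calls System.exit(); it does not run on SIGKILL or Runtime.halt().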
If even that isn't acceptable for your use case send a SIGKILL instead; the OS will immediately cause the process to terminate, without giving it a chance to run shutdown hooks or finalizers. You can also call Runtime.halt() from within the JVM to accomplish roughly the same thing.
I have a scenario in which I am running unreliable code in Java (the scenario is not unlike this). I am providing the framework classes, and the intent is for a third party to override a base class method called doWork(). However, if the client's doWork() enters a funked state (such as an infinite loop), I need to be able to terminate the worker.
Most of the solutions (I've found this example and this example) revolve around a loop check for a volatile boolean:
while (keepRunning) {
    //some code
}
or checking the interrupted status:
while (!isInterrupted()) {
    //some code
}
However, neither of these solutions deals with the following in the '//some code' section:
for (int i = 0; i < 10; i++) {
    i = i - 1;
}
I understand the reasons Thread.stop() was deprecated, and obviously running faulty code isn't desirable, but my situation forces me to run code I can't verify myself. Still, I find it hard to believe Java doesn't have some mechanism for handling threads that get into an unacceptable state. So, I have two questions:
Is it possible to launch a Thread or Runnable in Java which can be reliably killed? Or does Java require cooperative multithreading to the point where a thread can effectively hose the system?
If not, what steps can be taken to hand live objects (such as active network connections) over to the code so that it can be run in a Process instead, where I can actually kill it?
If you really don't want to (or probably cannot, given the requirement of passing network connections) spawn new processes, you can try to instrument the code of this 'plugin' when you load its class. I mean changing its bytecode so that it includes static calls to some utility method (e.g. ClientMentalHealthChecker.isInterrupted()). It's actually not that hard to do. Here you can find some tools that might help: https://java-source.net/open-source/bytecode-libraries. It won't be bulletproof, because there are other ways of blocking execution. Also keep in mind that client code can catch InterruptedExceptions.
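A hypothetical sketch of that utility class (named check() here; the instrumented bytecode would call it at every loop back-edge). Throwing an Error subclass rather than an exception makes it harder, though not impossible, for client code to swallow the kill signal with a broad catch (Exception e):

```java
public final class ClientMentalHealthChecker {

    /** Error, not Exception, so a client's catch (Exception e) won't trap it. */
    public static final class KilledError extends Error {
        private static final long serialVersionUID = 1L;
    }

    private ClientMentalHealthChecker() {}

    // Injected call site: aborts the client thread if the framework
    // has requested termination via Thread.interrupt().
    public static void check() {
        if (Thread.currentThread().isInterrupted()) {
            throw new KilledError();
        }
    }
}
```

The framework then kills a runaway worker by interrupting its thread; the next injected check() call unwinds the client code.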
I have a Java process that runs in the background. How can I quickly signal the process to show its window? I want a really light-weight script that can do this and can be launched from the Start Menu.
I think maybe a BAT file that checks whether the lock file has been touched in the last few seconds; if so, it signals the running process, otherwise it starts a new one. The signal could be creating another file that the process would be listening for.
That works, but it seems inefficient, and some hesitation sounds unavoidable.
I could use Java instead of a BAT file, still that leaves the question of how to signal the background process. This only has to work in Windows, but I am good with Java so that is what I am using.
Any ideas?
One option would be to have that process listen on a port (for example 8888); then you could send a message to that port (or do something like telnet localhost 8888). The running process could have a separate thread listening on that port.
Another option would be to use JMX communication with the JVM - see http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
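The first option can be sketched as follows; the class name and the "show window" action are placeholders:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class SignalListener {

    // Starts a daemon thread that runs onSignal for every connection
    // accepted on the given server socket, until the socket is closed.
    public static Thread listen(ServerSocket server, Runnable onSignal) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    try (Socket ignored = server.accept()) {
                        onSignal.run(); // e.g. bring the window to the front
                    }
                }
            } catch (IOException e) {
                // server socket closed: stop listening
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

The launcher script then only has to open a TCP connection to localhost on the agreed port, which even a BAT file can do via telnet.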
Is there anything preventing you from checking the lock file from your Java process? You could use the Observer pattern to alert the main thread (or which ever thread) to changes in the file.
For example:
public class FileWatcher extends Observable {

    private long lastModified;
    private final File file;
    private volatile boolean stopThread = false;

    public FileWatcher(File f) {
        this.file = f;
        this.lastModified = file.lastModified();
        Thread t = new Thread() {
            public void run() {
                while (!stopThread) {
                    if (lastModified < file.lastModified()) {
                        lastModified = file.lastModified();
                        setChanged();
                        notifyObservers();
                    }
                    try {
                        Thread.sleep(5);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        };
        t.start();
    }
}
DISCLAIMER: not tested or verified at all, but I'm sure you get the idea.
EDIT: oops, forgot the loop.
EDIT: new idea.
I have another idea. (I know you already accepted an answer, but I wanted to throw this out there.) Would it be possible to use the select function? In my very limited skim of the MSDN docs, it is only mentioned in the context of sockets, but I know the Linux equivalent applies to any file descriptor.
Instead of simply polling the file in the thread I mentioned above, let the OS do it! Pass the file into the writefds set of select, and it will return when the file is modified. This means your process isn't spending valuable CPU time polling for changes to the file.
I haven't verified whether or not Java exposes this call in the JDK, so it might require writing a JNI interface to get it to work. My knowledge in this area is a little fuzzy, sorry.
EDIT again again:
Found it! Java's Selector class looks like it implements select. Unfortunately, FileChannel isn't selectable, which is probably required in this case. :(