I/O completion ports and stdout processing - java

I'm using I/O completion ports for a process management library (yes, there's a reason for this). You can find the source for what I'm talking about here: https://github.com/jcommon/process/blob/master/src/main/java/jcommon/process/platform/win32/Win32ProcessLauncher.java (take a look at lines 559 and 1137 -- yes, that class needs to be refactored and cleaned up).
I'm launching a child process and using named pipes (not anonymous pipes, because I need asynchronous, overlapped ReadFile()/WriteFile()) in order to process the child process's stdout and stderr. This is mostly working. In a test, I launch 1,000 concurrent processes and monitor their output, ensuring they emit the proper information. Typically all 1,000 work fine, but occasionally only 998 or so do, leaving a couple of processes with problems.
For those few processes, not all of their messages are received. I know the messages are being output, but the thread processing GetQueuedCompletionStatus() for that process returns from the read with ERROR_BROKEN_PIPE.
The expected behavior is that the OS (or the C runtime) would flush any remaining bytes in the stdout buffer upon process exit. I would then expect those bytes to be queued to my IOCP before I get a broken pipe error. Instead, those bytes seem to disappear and the read completes with ERROR_BROKEN_PIPE -- which in my code initiates the teardown of the child process.
I wrote a simple application to test and figure out the behavior (https://github.com/jcommon/process/blob/master/src/test/c/stdout-1.c). This application disables buffering on stdout, so all writes should effectively be flushed immediately. Using that program in my tests yields the same issues as launching "cmd.exe /c echo hi". And at any rate, shouldn't the application (or the OS?) flush any remaining bytes on stdout when the process exits?
The source is in Java, using direct-mapped JNA, but it should be fairly easy for C/C++ engineers to follow.
Thanks for any help you can provide!

Are you sure that the broken pipe error isn't occurring with a non-zero ioSize? If ioSize is not zero, then you should process the data that was read as well as note that the file is now closed.
My C++ code which does this basically ignores ERROR_BROKEN_PIPE and ERROR_HANDLE_EOF and simply waits for either the next read attempt to fail with one of those errors or the current read to complete with zero bytes read. The code in question works with files and pipes, and I've never seen the problem you describe when running the kind of tests you describe.
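For what it's worth, here is a minimal sketch of that dequeue shape in Java, assuming JNA's bundled platform Kernel32 bindings (the code in the question uses its own direct-mapped bindings, so names will differ); processStdout and beginTeardown are hypothetical stand-ins for your own handlers:

import com.sun.jna.platform.win32.BaseTSD;
import com.sun.jna.platform.win32.Kernel32;
import com.sun.jna.platform.win32.WinBase;
import com.sun.jna.platform.win32.WinError;
import com.sun.jna.platform.win32.WinNT;
import com.sun.jna.ptr.IntByReference;
import com.sun.jna.ptr.PointerByReference;

void pumpOnce(WinNT.HANDLE iocp) {
    IntByReference ioSize = new IntByReference();
    BaseTSD.ULONG_PTRByReference key = new BaseTSD.ULONG_PTRByReference();
    PointerByReference pOverlapped = new PointerByReference();

    boolean ok = Kernel32.INSTANCE.GetQueuedCompletionStatus(
            iocp, ioSize, key, pOverlapped, WinBase.INFINITE);
    int err = ok ? 0 : Kernel32.INSTANCE.GetLastError();

    // Even a failed dequeue can report bytes that were read before the
    // pipe broke -- consume them before reacting to the error.
    if (ioSize.getValue() > 0) {
        processStdout(pOverlapped.getValue(), ioSize.getValue()); // hypothetical handler
    }

    // Treat the pipe as closed only on ERROR_BROKEN_PIPE / ERROR_HANDLE_EOF
    // (or a completed read of zero bytes), and only then start teardown.
    if (!ok && (err == WinError.ERROR_BROKEN_PIPE || err == WinError.ERROR_HANDLE_EOF)) {
        beginTeardown(key.getValue()); // hypothetical handler
    }
}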

Related

Program hangs when trying to kill process, until eventually it is killed

I am working on fixing a bug that makes our CI/CD pipeline fail. During an integration test, we spin up a local database instance. In order to do this, we use some MariaDB wrappers to launch it from a Java codebase.
This process can (potentially) take a long time to finish, which causes our tests to time out. To handle this case, we added functionality to kill the process if it cannot finish installing within 20 seconds and then try again.
This part seems to be working.
The strange bit comes when trying to destroy the process. It seems to randomly take ~2-3 MINUTES to unblock. This is problematic for the same reason the original problem was.
Upon investigating the underlying libraries, it seems we are using ExecuteWatchdog to manage the process. The bit of code that is blocking is:
watchDog.destroyProcess(); // this part usually returns nearly instantly
try {
    // this part can take minutes...
    resultHandler.waitFor();
} catch (InterruptedException e) {
    throw handleInterruptedException(e);
}
In addition to this, there is different behavior on Mac/Linux. If I do something like resultHandler.waitFor(1000) (wait with a 1000 ms timeout before just exiting), it works fine on a MacBook, but on Linux I see an error like: java.io.FileNotFoundException: {{executable}} (Text file busy)
Any ideas on this?
I have done some research, and it seems watchDog.destroyProcess is sending a SIGTERM instead of a SIGKILL. But I do not have any hook to get at the Process object in order to send it a SIGKILL instead.
Thanks.
A common cause for blocking when working with processes is that the process is blocked on output, either to stdout or (the more likely to be overlooked) stderr.
In this context (setting up tests on a CI server), you might try setting both the standard output and the error output to INHERIT.

Note that this means you won't be able to read the subprocess's output or error stream in your Java code. My assumption is that you aren't reading them anyway, and that's why the process hangs. Instead, that output will be redirected to the output of the Java process, and I expect your CI server will log it as part of the build.
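With plain ProcessBuilder this is a one-liner via inheritIO() (the commons-exec wrapper in your question would need the equivalent through its stream handler). A minimal sketch, with a placeholder command in place of your MariaDB launcher:

import java.io.IOException;

public class LaunchInherited {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("mysqld", "--verbose"); // placeholder command
        pb.inheritIO(); // child stdin/stdout/stderr go straight to the parent's console/log
        Process p = pb.start();
        // No reader threads needed: the child can never block on a full pipe.
        System.out.println("exited with " + p.waitFor());
    }
}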

Why does process hang if the parent does not consume stdout/stderr in Java?

I know that if you use ProcessBuilder.start in Java to start an external process you have to consume its stdout/stderr (e.g. see here). Otherwise the external process hangs on start.
My question is why it works this way. My guess is that JVM redirects stdout/stderr of the executed process to pipes and if the pipes have no room the writes to the pipes block. Does it make sense?
Now I wonder why Java works this way. What is the rationale behind this design?
Java doesn't do anything in this area. It just uses OS services to create the pipes.
All Unix-like OSs and Windows behave the same in this regard: a pipe with a 4K buffer is created between parent and child. When that pipe is full (because one side isn't reading), the writing process blocks.

This has been the standard behavior since the inception of pipes. There is not much Java can do about it.

What you can argue is that the process API in Java is clumsy and doesn't have good defaults, like simply connecting the child's streams to the parent's stdin/stdout unless the developer overrides them with something specific.
I think there are two reasons for the current API. First, the Java developers (i.e. the folks at Sun/Oracle) know exactly how the process API works and what you need to do. They know it so well that it didn't occur to them that the API could be confusing.
The second reason is that there is no good default that will work for the majority. You can't really connect stdin of the parent and the child; if you type something on the console, to which process should the input go?
Similarly, if you connect stdout, the output has to go somewhere. If you have a web app, there might be no console, or the output might go somewhere no one would expect.

You can't even throw an exception when the pipe is full, since that can happen during normal operation as well.
It is explained in the javadoc of Process:
By default, the created subprocess does not have its own terminal or console. All its standard I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent process, where they can be accessed via the streams obtained using the methods getOutputStream(), getInputStream(), and getErrorStream(). The parent process uses these streams to feed input to and get output from the subprocess. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, or even deadlock.
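In practice, the usual workaround looks something like this minimal sketch: merge stderr into stdout and keep one reader draining the pipe so its buffer never fills (the command is a placeholder):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class Drain {
    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder("some-command"); // placeholder
        pb.redirectErrorStream(true); // merge stderr into stdout: one pipe to drain
        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line); // keep reading so the pipe buffer never fills
            }
        }
        System.out.println("exit: " + p.waitFor());
    }
}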

Python - communicate with subprocess

I'm running Python 2.7 on a Win32 OS, but I'm hoping to write platform-independent code. I'm trying to use Python to interact in real time with a Java program I wrote, and figured pipes would be the best way to do this. My Python script calls Java as a subprocess. Essentially, Java is the GUI and Python is the back end. (I don't want to use Jython or wxPython because I only want to depend on the standard libraries each language provides.) Trying to set up communication between the two has been terrible. I can send a message from the (parent) Python script to the (child) Java class using
process.stdin.write(msg)
process.stdin.flush()
but reading Java's output has not worked. I use
process.stdout.read()
but apparently this blocks forever if there's nothing to read. And process.communicate() is off limits because it doesn't do anything until the subprocess terminates. According to my research, a common method people use to get around this problem is to "use threads" (although someone suggested appending a newline when writing -- didn't work), but being new to Python and threading in general, I have no idea what that would look like. I've tried looking over the standard library's subprocess.py source, but that hasn't helped. Is there a way to see if stdout is empty, at least? If not, how do I accomplish this?
process.stdout.read()
but apparently this blocks forever if there's nothing to read.
Well, not exactly. It will basically block while it is reading/waiting, until it hits EOF, which is set when the file closes. One way to circumvent this is to state how many bytes you want to read: process.stdout.read(1) will read one byte and return; if there is no byte available yet, it will again wait until there is at least one byte or EOF.
You may also use Python's select module, which takes an optional timeout: select waits at most that long and then simply returns with empty values (http://docs.python.org/library/select.html), though it may not be fully supported on Windows.
(although someone suggested appending a newline when writing -- didn't work)
I've actually done this from/to Python, coupled with process.stdout.readline().rstrip(), so the data arrives as a set of line(s), though you still have to strip them. Do note that you may have to flush in order for both processes to register the data.
I did find this: "java: how to both read and write to & from process thru pipe (stdin/stdout)", which may help you.

Good luck.
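On the Java (child) side, a minimal sketch of the line-oriented counterpart, with an explicit flush as suggested above (the echo protocol is just an illustration):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class EchoChild {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println("got: " + line); // one complete line per message
            System.out.flush(); // explicit flush; harmless if stdout already auto-flushes
        }
    }
}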

Launch a process without consuming its output

I used this line to execute a Python script from a Java application:
Process process = Runtime.getRuntime().exec("python foo.py", null, directory);
The script runs a TCP server that communicates with my java app and other clients.
When I was debugging the script, I had a few console prints here and there and everything was fine. But as soon as the script was launched from Java code, after a fixed amount of time my TCP server stopped responding. After some debugging and frustration, I removed my prints from the script and everything worked as expected.
It seems there is some fixed amount of memory allocated for the process's standard output and error streams, and if you don't consume it, the process gets stuck trying to write to a full buffer.
How can I launch the process so that I don't have to consume the standard output stream? I'd like to keep the prints for debugging, but I don't want to start a thread to read a stream I don't need.
You have to consume the child process's output, or eventually it will block because the output buffer is full (don't forget about stderr, too). If you don't want to modify your Java program to consume it, perhaps you could add a flag to your script to turn off debugging altogether or at least direct it to a file (which could be /dev/null).
Java exposes ProcessBuilder, which you can use to execute what you want. It does a fork/exec and also lets you handle the output and error streams.
The fact that your process hangs is not because of Java. It's a standard problem: the process is blocked because the stream buffer is full and no one is consuming it.
Try using the ProcessBuilder with a thread that reads the streams.
I did something similar in the past: I redirected the output of the process to a log file associated with that process. Later you can see what is happening, and you can keep a trace of your Python script.
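A minimal sketch of that approach with ProcessBuilder redirection (Java 7+); the log file name is a placeholder:

import java.io.File;
import java.io.IOException;

public class LaunchLogged {
    public static void main(String[] args) throws IOException {
        ProcessBuilder pb = new ProcessBuilder("python", "foo.py");
        File log = new File("foo-py.log"); // placeholder log file
        pb.redirectErrorStream(true); // send stderr to the same place
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(log)); // no pipe to fill, prints kept
        pb.start();
    }
}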

Child process stops when Thread.sleep() is called (in Java under Windows)

I have a Java application that launches an external process (Internet Explorer) using ProcessBuilder. Strangely enough, this child process freezes when the parent Java thread calls Thread.sleep(). It does not happen with all processes (for instance, Firefox is fine), but with IE it happens all the time.
Any ideas ?
P.S. I tried Robot.delay() with the same effect.
How are you consuming the child process's stdout and stderr? It may be worth posting your code.
You need to consume the output streams concurrently, otherwise either your stdout or stderr buffer will fill up, and your child process will block. See here for more details.
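A minimal sketch of that pattern: one daemon thread per stream, so neither pipe can fill while the parent thread sleeps (the iexplore.exe path is a placeholder):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

public class Gobble {
    // Start a daemon thread that drains one stream so its pipe never fills.
    static Thread drain(final InputStream in, final String tag) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                    String line;
                    while ((line = r.readLine()) != null) {
                        System.out.println(tag + ": " + line);
                    }
                } catch (Exception ignored) {}
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        Process p = new ProcessBuilder(
            "C:\\Program Files\\Internet Explorer\\iexplore.exe").start(); // placeholder path
        drain(p.getInputStream(), "out");
        drain(p.getErrorStream(), "err");
        Thread.sleep(5000); // the child keeps running: its output is being consumed
        System.out.println("child alive: " + p.isAlive());
    }
}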
