Calling a shell script from Java hangs

So I'm trying to execute, from a Java program, a shell script which produces a lot of output (hundreds of MBs).
This hangs the process, and it never completes.
However, if within the shell script I redirect its output to some log file or /dev/null, the Java program executes and completes in a jiffy.
Is it because of the amount of data that the Java program never completes?
If so, is there any documentation of this, or any documented limit on the amount of data?
Here's how you can simulate this scenario.
The Java file will look like:
import java.io.InputStream;

public class LotOfOutput {
    public static void main(String[] args) {
        String cmd = "sh a-script-which-outputs-huuggee-data.sh";
        try {
            ProcessBuilder pb = new ProcessBuilder("bash", "-c", cmd);
            pb.redirectErrorStream(true);
            Process shell = pb.start();
            InputStream shellIn = shell.getInputStream();
            int shellExitStatus = shell.waitFor();
            System.out.println(shellExitStatus);
            shellIn.close();
        } catch (Exception ignoreMe) {
        }
    }
}
The script 'a-script-which-outputs-huuggee-data.sh' may look like:
#!/bin/sh
# Toggle the line below
exec 3>&1 > /dev/null 2>&1
count=1
while [ $count -le 1000 ]
do
    cat some-big-file
    ((count++))
done
echo
echo Yes I m done
Free beer for the right answer. :)

It's because you're not reading from the Process' output.
As per the class' Javadocs, if you don't do this then you may end up with a deadlock: the process fills its IO buffer and waits for the "shell" (or listening process) to read from it and empty it. Meanwhile your process, which should be doing this, blocks waiting for the child to exit.
You'll want to call getInputStream() and read from that reliably (perhaps from another thread) to stop the process blocking.
Also take a look at Five Java Process Pitfalls and When Runtime.exec() Won't - both informative articles about common problems with Process.
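A minimal sketch of the fix, reusing the asker's command (the script name is theirs): drain the merged output on a separate thread before calling waitFor(), so the pipe buffer can never fill up.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class LotOfOutputFixed {
    public static void main(String[] args) throws Exception {
        String cmd = "sh a-script-which-outputs-huuggee-data.sh";
        ProcessBuilder pb = new ProcessBuilder("bash", "-c", cmd);
        pb.redirectErrorStream(true);
        Process shell = pb.start();
        // Consume stdout (merged with stderr above) on its own thread,
        // so the child never blocks on a full pipe buffer.
        Thread drainer = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(shell.getInputStream()))) {
                while (r.readLine() != null) {
                    // discard (or process) each line
                }
            } catch (IOException ignored) {
            }
        });
        drainer.start();
        int shellExitStatus = shell.waitFor();
        drainer.join();
        System.out.println(shellExitStatus);
    }
}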

You're never reading the input stream, so it's probably blocking because the input buffer is full.

The input/output buffers have a limited size (depending on the operating system). If I remember correctly, it wasn't big on Windows XP, at least. Try creating a thread that reads the InputStream as fast as possible.
Something along these lines:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class StdInWorker implements Runnable {
    private final BufferedReader br;
    // Collects the child's output lines for later use.
    private final List<String> buffer = new ArrayList<>();
    private int linesRead = 0;

    StdInWorker(Process prcs) {
        this.br = new BufferedReader(
                new InputStreamReader(prcs.getInputStream()));
    }

    public void run() {
        String in;
        try {
            // readLine() blocks until a line arrives and returns null at
            // end-of-stream, so no polling loop or sleep is needed.
            while ((in = this.br.readLine()) != null) {
                this.buffer.add(in);
                linesRead++;
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
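A hedged usage sketch for the worker above: start it on its own thread before blocking in waitFor(), and join it afterwards.

Process process = new ProcessBuilder("bash", "-c", cmd).start();
Thread t = new Thread(new StdInWorker(process));
t.start();
int status = process.waitFor();
t.join();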

Related

Execute spark-submit programmatically from java

I am trying to execute it via:
Process process = Runtime.getRuntime().exec(spark_cmd);
with no luck. The command run via a shell starts my application, which succeeds. Running it via exec starts a process which dies shortly after and does nothing.
When I try
process.waitFor();
it hangs and waits forever. The real magic begins when I try to read something from the process:
InputStreamReader isr = new InputStreamReader(process.getErrorStream());
BufferedReader br = new BufferedReader(isr);
To do so I start a thread that reads from the stream in a while loop:
import java.io.BufferedReader;
import java.io.IOException;

class ReadingThread extends Thread {
    BufferedReader reader;

    ReadingThread(BufferedReader reader) {
        this.reader = reader;
    }

    @Override
    public void run() {
        String line;
        try {
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
The application starts, does some stuff, and hangs. When I abort my application, the spark application wakes up (??????????) and completes the remaining work. Does anyone have a reasonable explanation of what is happening?
Thanks!
You can submit a Spark job from Java code with the help of SparkLauncher, so you can go through the link below and check it out:
https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/launcher/SparkLauncher.html
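A minimal sketch of that approach (the jar path, main class, and master URL below are placeholders, not from the question):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchSparkJob {
    public static void main(String[] args) throws Exception {
        // Placeholders: substitute your real jar, main class and master.
        Process spark = new SparkLauncher()
                .setAppResource("/path/to/your-app.jar")
                .setMainClass("com.example.YourSparkApp")
                .setMaster("local[*]")
                .launch();
        // launch() returns a plain Process, so both of its streams still
        // have to be drained, exactly as in the answers above.
        drain(spark.getInputStream());
        drain(spark.getErrorStream());
        System.out.println("Exit code: " + spark.waitFor());
    }

    private static void drain(InputStream in) {
        new Thread(() -> {
            try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = r.readLine()) != null) {
                    System.out.println(line);
                }
            } catch (Exception ignored) {
            }
        }).start();
    }
}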
One way is the Spark launcher, as told by @Sandeep Purohit.
I'd offer a shell script approach, with the nohup command to submit the job, like this...
This worked for me in the case of mapreduce executions; you can try the same for spark background jobs as well.
Have a look at https://en.wikipedia.org/wiki/Nohup
"nohup spark-submit <parameters> 2>&1 < /dev/null &"
Whenever you get messages, you can poll that event and call this shell script. Below is the code snippet to do this...
/**
 * This method is to spark-submit.
 * <pre>You can call spark-submit or a mapreduce job on the fly like this, by calling a shell script...</pre>
 * @param commandToExecute String
 */
public static Boolean executeCommand(final String commandToExecute) {
    try {
        final Runtime rt = Runtime.getRuntime();
        // LOG.info("process command -- " + commandToExecute);
        final String[] arr = { "/bin/sh", "-c", commandToExecute };
        final Process proc = rt.exec(arr);
        // LOG.info("process started ");
        final int exitVal = proc.waitFor();
        LOG.trace(" commandToExecute exited with code: " + exitVal);
        proc.destroy();
    } catch (final Exception e) {
        LOG.error("Exception occurred while Launching process : " + e.getMessage());
        return Boolean.FALSE;
    }
    return Boolean.TRUE;
}
Moreover, to debug:
ps -aef | grep "your pid or process name"
The command below will list the open files opened by the process:
lsof -p <your process id>
Also, have a look at process.waitFor() never returns

Read stdout stream from subprocess as it becomes available

In my Java application, I need to execute some scripts as subprocesses and monitor the output on stdout from Java so that I can react when necessary to some output.
I am using apache commons-exec to spawn the subprocess and redirect stdout of the executed script to an input stream.
The problem I am having is that reading from the stream blocks the Java process until the subprocess has finished executing. I cannot wait until the end of the subprocess to react to its output; I need to read it asynchronously as it becomes available.
Below is my Java code:
import java.io.IOException;
import org.apache.commons.exec.CommandLine;
import org.apache.commons.exec.DefaultExecutor;
import org.apache.commons.exec.ExecuteStreamHandler;
import org.apache.commons.exec.LogOutputStream;
import org.apache.commons.exec.PumpStreamHandler;

public class SubProcessReact {
    public static class LogOutputStreamImpl extends LogOutputStream {
        @Override
        protected void processLine(String line, int logLevel) {
            System.out.println("R: " + line);
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        CommandLine cl = CommandLine.parse("python printNumbers.py");
        DefaultExecutor e = new DefaultExecutor();
        ExecuteStreamHandler sh = new PumpStreamHandler(new LogOutputStreamImpl());
        e.setStreamHandler(sh);
        Thread th = new Thread(() -> {
            try {
                e.execute(cl);
            } catch (IOException e1) {
                e1.printStackTrace();
            }
        });
        th.start();
    }
}
For this example, the subprocess is a Python script which counts upwards with a one-second delay between outputs, so that I can verify that the Java code is responding as data comes in.
Python Code:
import time
for x in range(0,10):
    print x
    time.sleep(1)
I would expect LogOutputStreamImpl to print each line as it comes, but what actually happens is that reading the stream blocks until the subprocess has completed, and then all of the output is printed at once.
Is there something I could do to make this work as I intend?
Why use a third-party library to do something Java SE already does well? Personally, I prefer to depend on as few external libraries as possible, in order to make my programs easily portable and to reduce the points of failure:
ProcessBuilder builder = new ProcessBuilder("python", "printNumbers.py");
builder.inheritIO().redirectOutput(ProcessBuilder.Redirect.PIPE);
Process process = builder.start();
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(line -> System.out.println("R: " + line));
}
process.waitFor();
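One caveat, not from the original answer: if the lines still arrive only when the subprocess exits, the buffering may be on the Python side, since Python block-buffers stdout when it is a pipe rather than a terminal. Forcing unbuffered output is a one-flag change:

ProcessBuilder builder = new ProcessBuilder("python", "-u", "printNumbers.py");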

Reading InputStream from a batch file's process skips to next line

I am trying to run a batch file with Runtime.exec() and output its InputStream into a JTextArea. What I have works, but only partially: the batch file runs, but if it executes a command other than something like "echo", that command immediately terminates and the next line executes. For example, let's say I try to run a simple batch file like this:
#echo off
echo hello. waiting 5 seconds.
timeout /t 5 /nobreak > NUL
echo finished. goodbye.
The batch file executes, and the JTextArea says
hello. waiting 5 seconds.
finished. goodbye.
but it doesn't wait for 5 seconds in the middle.
I can't figure out why it's doing this. Here's what I use to run the batch file and read its InputStream.
private class ScriptRunner implements Runnable {
    private final GUI.InfoGUI gui; // the name of my GUI class
    private final String script;

    public ScriptRunner(final GUI.InfoGUI gui, final File script) {
        this.gui = gui;
        this.script = script.getAbsolutePath();
    }

    @Override
    public void run() {
        try {
            final Process p = Runtime.getRuntime().exec(script);
            StreamReader output = new StreamReader(p.getInputStream(), gui);
            Thread t = new Thread(output);
            t.start();
            int exit = p.waitFor();
            output.setComplete(true);
            while (t.isAlive()) {
                sleep(500);
            }
            System.out.println("Processed finished with exit code " + exit);
        } catch (final Exception e) {
            e.printStackTrace();
        }
    }
}

private class StreamReader implements Runnable {
    private final InputStream is;
    private final GUI.InfoGUI gui;
    private boolean complete = false;

    public StreamReader(InputStream is, GUI.InfoGUI gui) {
        this.is = is;
        this.gui = gui;
    }

    @Override
    public void run() {
        BufferedReader in = new BufferedReader(new InputStreamReader(is));
        try {
            while (!complete || in.ready()) {
                while (in.ready()) {
                    gui.setTextAreaText(in.readLine() + "\n");
                }
                sleep(250);
            }
        } catch (final Exception e) {
            e.printStackTrace();
        } finally {
            try {
                in.close();
            } catch (final Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void setComplete(final boolean complete) {
        this.complete = complete;
    }
}

public void sleep(final long ms) {
    try {
        Thread.sleep(ms);
    } catch (final InterruptedException ie) {
    }
}
I know my code is pretty messy, and I'm sure it contains grammatical errors.
Thanks for anything you can do to help!
You're creating a Process but you're not reading from its standard error stream. The process might be writing messages to its standard error to tell you that there's a problem, but if you're not reading its standard error, you won't be able to read these messages.
You have two options here:
Since you already have a class that reads from a stream (StreamReader), wire up another one of these to the process's standard error stream (p.getErrorStream()) and run it in another Thread (see the sketch below). You'll also need to call setComplete on the error StreamReader when the call to p.waitFor() returns, and wait for the Thread running it to die.
Replace your use of Runtime.getRuntime().exec() with a ProcessBuilder. This class is new in Java 5 and provides an alternative way to run external processes. In my opinion its most significant improvement over Runtime.getRuntime().exec() is the ability to redirect the process's standard error into its standard output, so you only have one stream to read from.
I would strongly recommend going for the second option and choosing to redirect the process's standard error into its standard output.
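For completeness, a minimal sketch of the first option, reusing the asker's StreamReader class:

// Wire a second StreamReader to the process's standard error.
StreamReader error = new StreamReader(p.getErrorStream(), gui);
Thread errThread = new Thread(error);
errThread.start();
// ... after p.waitFor() returns:
error.setComplete(true);
while (errThread.isAlive()) {
    sleep(500);
}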
I took your code and replaced the line
final Process p = Runtime.getRuntime().exec(script);
with
final ProcessBuilder pb = new ProcessBuilder(script);
pb.redirectErrorStream(true);
final Process p = pb.start();
Also, I don't have your GUI code to hand, so I wrote the output of the process to System.out instead.
When I ran your code, I got the following output:
hello. waiting 5 seconds.
ERROR: Input redirection is not supported, exiting the process immediately.
finished. goodbye.
Processed finished with exit code 0
Had you seen that error message, you might have twigged that something was up with the timeout command.
Incidentally, I noticed in one of your comments that none of the commands suggested by ughzan worked. I replaced the timeout line with ping -n 5 127.0.0.1 > NUL and the script ran as expected. I couldn't reproduce a problem with this.
The problem is definitely in timeout.exe. If you add echo %errorlevel% after the line with timeout, you will see that it returns 1 when run from Java and 0 when run in the usual way. Probably it requires some specific console functionality (e.g. cursor positioning) that is suppressed when running from a Java process.
Is there anything I can do to get this to work while running from Java?
If you don't need the ability to run arbitrary batch files, consider replacing timeout with ping. Otherwise... I've tried to run the batch file with JNA through Kernel32.CreateProcess, and timeout runs fine. But then you would need to implement reading of the process output through native calls as well.
I hope someone will suggest a better way.
The ready method only tells you whether the stream can guarantee that something can be read immediately, without blocking. You can't really trust it, because always returning false is a valid implementation. Streams with buffers may return true only when they have something buffered. So I suspect your problem is here:
while (!complete || in.ready()) {
    while (in.ready()) {
        gui.setTextAreaText(in.readLine() + "\n");
    }
    sleep(250);
}
It should rather read something like this (readLine() blocks until a full line is available and returns null at end-of-stream, so no completion flag is needed):
String line;
while ((line = in.readLine()) != null) {
    gui.setTextAreaText(line + "\n");
}
It's probably because your "timeout ..." command returned with an error.
Three ways to test it:
1. Check if the "timeout ..." command works in the Windows command prompt.
2. Replace "timeout ..." in the script with "ping -n 5 127.0.0.1 > NUL" (it essentially does the same thing).
3. Remove everything but "timeout /t 5 /nobreak > NUL" from your script. If the timeout fails, the process should return with an error (1), because it is the last command executed.

ReadLine on TCPDump-Buffer sometimes blocks until kill tcpdump

I have a problem using tcpdump from my Android application.
It is supposed to read the output from tcpdump line by line and process it within my application. The problem is: sometimes the code works fine and reads the captured packets immediately. But sometimes readLine blocks until I kill the tcpdump process from the Linux console (killall tcpdump). After doing that, my loop is processed for each line (sometimes 10, sometimes 1 or 2), which means the readLine should have worked, but didn't.
I read about similar problems, but did not find any solution for this problem... THANKS!!
public class ListenActivity extends Activity {

    static ArrayList<Packet> packetBuffer = new ArrayList<Packet>();
    static Process tcpDumpProcess = null;
    static ListenThread thread = null;
    public static final String TCPDUMP_COMMAND = "tcpdump -A -s0 | grep -i -e 'Cookie'\n";
    private InputStream inputStream = null;
    private OutputStream outputStream = null;

    @Override
    protected void onStart() {
        super.onStart();
        try {
            tcpDumpProcess = new ProcessBuilder().command("su").redirectErrorStream(true).start();
            inputStream = tcpDumpProcess.getInputStream();
            outputStream = tcpDumpProcess.getOutputStream();
            outputStream.write(TCPDUMP_COMMAND.getBytes("ASCII"));
        } catch (Exception e) {
            Log.e("FSE", "", e);
        }
        thread = new ListenThread(new BufferedReader(new InputStreamReader(inputStream)));
        thread.start();
    }

    private class ListenThread extends Thread {

        private BufferedReader reader = null;

        public ListenThread(BufferedReader reader) {
            this.reader = reader;
        }

        @Override
        public void run() {
            while (true) {
                try {
                    String received = reader.readLine();
                    Log.d("FS", received);
                    Packet pReceived = Packet.analyze(received);
                    if (pReceived != null) {
                        packetBuffer.add(pReceived);
                    }
                } catch (Exception e) {
                    Log.e("FSE", "", e);
                }
            }
        }
    }
}
Because output sent to pipes is usually block-buffered, both the tcpdump process and the grep process will be waiting until they've received enough data to bother sending it on to your program. You're very lucky, though: both programs you have chosen to use are prepared to modify their buffering behavior (using the setvbuf(3) function internally, in case you're curious about the details):
For tcpdump(8):
-l Make stdout line buffered. Useful if you want to see
the data while capturing it. E.g.,
``tcpdump -l | tee dat'' or ``tcpdump -l >
dat & tail -f dat''.
For grep(1):
--line-buffered
Use line buffering on output. This can cause a
performance penalty.
Try this:
"tcpdump -l -A -s0 | grep --line-buffered -i -e 'Cookie'\n";
I don't understand why, but even with the -l option the buffer is too large if you read the standard output of the process in which you run tcpdump.
I solved this problem by redirecting tcpdump's output to a file and reading that file from another thread. The tcpdump command should be something like:
tcpdump -l -A -s0 > /data/local/output.txt
The run method inside your thread has to be changed to read from the output file:
File dumpedFile = new File("/data/local/output.txt");
// open a reader on the tcpdump output file
BufferedReader reader = new BufferedReader(new FileReader(dumpedFile));
String temp;
// the while loop is broken if the thread is interrupted
while (!Thread.interrupted()) {
    temp = reader.readLine();
    if (temp != null) {
        Log.e("READER", temp);
    }
}
I don't know exactly what you want to do with grep, but I think it's possible to achieve the same thing with a regexp inside the Java code.
You should also be aware that the tcpdump process will never end on its own, so you have to kill it when your activity is paused or destroyed.
You can have a look at my blog post, where I explain my whole code to start/stop tcpdump.

Why does executing an interactive process with redirected input/output streams cause the application to be stopped?

I have a console Java program that executes sh -i in a separate process and copies the data between the process's input/output streams and the corresponding System streams:
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopier implements Runnable {
    private InputStream in;
    private OutputStream out;

    public StreamCopier(InputStream in, OutputStream out) {
        this.in = in;
        this.out = out;
    }

    public void run() {
        try {
            int n = 0;
            byte[] buffer = new byte[4096];
            while (-1 != (n = in.read(buffer))) {
                out.write(buffer, 0, n);
                out.flush();
            }
        } catch (IOException e) {
            System.out.println(e);
        }
    }
}

public class Test {
    public static void main(String[] args)
            throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec("sh -i");
        new Thread(new StreamCopier(
                process.getInputStream(), System.out)).start();
        new Thread(new StreamCopier(
                process.getErrorStream(), System.err)).start();
        new Thread(new StreamCopier(
                System.in, process.getOutputStream())).start();
        process.waitFor();
    }
}
Running it under Linux results in the following:
$
[1]+ Stopped java -cp . Test
Could anyone clarify why the application is stopped and how to avoid it?
This is related to my question on copying streams, but I think this particular issue deserves separate attention.
You can turn off job control by invoking the shell as sh -i +m, which will stop it taking over the tty. This means that the fg and bg commands will not work, and a Ctrl+Z will suspend your Java application, the shell, and all programs started from it.
If you still want job control, you should use a pseudo terminal to communicate with the shell, which creates a new tty for the shell to use, but I don't think Java supports that.
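A minimal tweak to the example above, assuming the +m suggestion (only the command string changes):

Process process = Runtime.getRuntime().exec("sh -i +m");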
You are being stopped by SIGTTIN or SIGTTOU. These signals are sent to a background process when they attempt to do IO to a TTY. In this case "background" means "not the controlling process group of the terminal". I suspect the subshell you're forking off is creating a new pgrp and taking over your tty. Then the parent program (java) does IO (in your case probably reading from the TTY) and gets SIGTTIN.
An easy way to confirm this theory would be to replace sh with something simpler (not a shell) which will not try to take over the tty.
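For instance (a hypothetical test, not part of the original answer), swap the shell for cat, which simply echoes stdin to stdout and never creates a new process group; if the program no longer stops, the tty-takeover theory is confirmed:

Process process = Runtime.getRuntime().exec("cat");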
