Capturing large amounts of output from Apache Commons-Exec - java

I am writing a video application in Java that runs ffmpeg and captures what it writes to standard output. I decided to use Apache Commons-Exec instead of Java's Runtime, because it seems better. However, I am having a difficult time capturing all of the output.
I thought using pipes would be the way to go, because it is a standard way of inter-process communication. However, my setup using PipedInputStream and PipedOutputStream is wrong. It seems to work, but only for the first 1024 bytes of the stream, which curiously happens to be the value of PipedInputStream.PIPE_SIZE.
I have no love affair with pipes, but I want to avoid disk I/O if possible, because of the speed and volume of data (a 1m 20s video at 512x384 resolution produces 690M of piped data).
Thoughts on the best solution to handle large amounts of data coming from a pipe? The code for my two classes is below. (Yes, sleep is bad. Thoughts on that? wait() and notifyAll()?)
WriteFrames.java
public class WriteFrames {
public static void main(String[] args) {
String commandName = "ffmpeg";
CommandLine commandLine = new CommandLine(commandName);
File filename = new File(args[0]);
String[] options = new String[] {
"-i",
filename.getAbsolutePath(),
"-an",
"-f",
"yuv4mpegpipe",
"-"};
for (String s : options) {
commandLine.addArgument(s);
}
PipedOutputStream output = new PipedOutputStream();
PumpStreamHandler streamHandler = new PumpStreamHandler(output, System.err);
DefaultExecutor executor = new DefaultExecutor();
try {
DataInputStream is = new DataInputStream(new PipedInputStream(output));
YUV4MPEGPipeParser p = new YUV4MPEGPipeParser(is);
p.start();
executor.setStreamHandler(streamHandler);
executor.execute(commandLine);
} catch (IOException e) {
e.printStackTrace();
}
}
}
YUV4MPEGPipeParser.java
public class YUV4MPEGPipeParser extends Thread {
private InputStream is;
int width, height;
public YUV4MPEGPipeParser(InputStream is) {
this.is = is;
}
public void run() {
try {
while (is.available() == 0) {
Thread.sleep(100);
}
while (is.available() != 0) {
// do stuff.... like write out YUV frames
}
} catch (IOException e) {
e.printStackTrace();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}

The problem is in the run method of the YUV4MPEGPipeParser class. There are two successive loops. The second loop terminates as soon as no data is currently available on the stream. That happens whenever the parser has processed all input so far and ffmpeg or the stream pump has not yet delivered more: available() returns 0, the loop ends, and the parser thread exits while ffmpeg is still producing output.
Get rid of both loops and the sleep, and perform a simple blocking read() instead of checking whether data is available. There is probably no need for wait()/notify() or sleep() either, because the parser already runs on its own thread.
You can rewrite the code of run() method like this:
public class YUV4MPEGPipeParser extends Thread {
    ...
    // Read buffer size. 1024 matches the default pipe size; PipedInputStream.PIPE_SIZE
    // itself is protected, so it cannot be referenced from this class.
    private static final int BUFSIZE = 1024;

    public void run() {
        try {
            byte[] buffer = new byte[BUFSIZE];
            int len;
            // read() blocks until data arrives and returns -1 at end of stream
            while ((len = is.read(buffer, 0, BUFSIZE)) != -1) {
                // we have valid data available
                // in first 'len' bytes of 'buffer' array.
                // do stuff.... like write out YUV frames
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
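To see the blocking-read approach move more data than the pipe can buffer, here is a minimal, self-contained sketch. It does not involve ffmpeg or Commons-Exec: a producer thread stands in for the stream pump, and the class and method names (BlockingPipeReadDemo, drain, pushThroughPipe) are made up for this illustration. The 64 KB pipe size is an arbitrary choice, not a recommendation from the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class BlockingPipeReadDemo {

    /** Reads until end of stream with blocking read() calls; never polls available(). */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buffer, 0, buffer.length)) != -1) {
            total += len; // a real parser would consume buffer[0..len) here
        }
        return total;
    }

    /** Pushes byteCount bytes through a pipe from a producer thread; returns how many the reader saw. */
    static long pushThroughPipe(int byteCount) throws IOException, InterruptedException {
        PipedOutputStream out = new PipedOutputStream();
        // Hypothetical sizing: a larger pipe than the 1024-byte default reduces thread hand-offs.
        PipedInputStream in = new PipedInputStream(out, 64 * 1024);

        Thread producer = new Thread(() -> {
            // Closing the output stream is what makes the reader's read() return -1.
            try (PipedOutputStream o = out) {
                byte[] chunk = new byte[1000];
                int remaining = byteCount;
                while (remaining > 0) {
                    int n = Math.min(chunk.length, remaining);
                    o.write(chunk, 0, n);
                    remaining -= n;
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        producer.start();
        long seen = drain(in);
        producer.join();
        return seen;
    }
}
```

Calling BlockingPipeReadDemo.pushThroughPipe(1_000_000) should report every byte the producer wrote, because the reader simply waits inside read() whenever the pipe is momentarily empty.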

Related

Java exec method, how to handle streams correctly

What is the proper way to produce and consume the streams (I/O) of an external process from Java? As far as I know, the Java-side input streams (the process's output) should be consumed in threads that run in parallel with the code producing the process's input, because the buffers may be of limited size.
But I'm not sure whether I eventually need to synchronize with those consumer threads, or whether waiting for the process to exit with the waitFor method is enough to be certain that all of the process output has actually been consumed. That is, even if the process exits (closes its output stream), can there still be unread data in the Java end of the stream? How does waitFor actually know when the process is done? For the process in question, EOF (closing the Java end of its input stream) signals it to exit.
My current solution to handle the streams is following
public class Application {
private static final StringBuffer output = new StringBuffer();
private static final StringBuffer errOutput = new StringBuffer();
private static final CountDownLatch latch = new CountDownLatch(2);
public static void main(String[] args) throws IOException, InterruptedException {
Process exec = Runtime.getRuntime().exec("/bin/cat");
OutputStream procIn = exec.getOutputStream();
InputStream procOut = exec.getInputStream();
InputStream procErrOut = exec.getErrorStream();
new Thread(new StreamConsumer(procOut, output)).start();
new Thread(new StreamConsumer(procErrOut, errOutput)).start();
PrintWriter printWriter = new PrintWriter(procIn);
printWriter.print("hello world");
printWriter.flush();
printWriter.close();
int ret = exec.waitFor();
latch.await();
System.out.println(output.toString());
System.out.println(errOutput.toString());
}
public static class StreamConsumer implements Runnable {
private InputStream input;
private StringBuffer output;
public StreamConsumer(InputStream input, StringBuffer output) {
this.input = input;
this.output = output;
}
@Override
public void run() {
BufferedReader reader = new BufferedReader(new InputStreamReader(input));
String line;
try {
while ((line = reader.readLine()) != null) {
output.append(line + System.lineSeparator());
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
reader.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
latch.countDown();
}
}
}
}
}
Is it necessary to use the latch here, or does waitFor imply that all the output has already been consumed? Also, if the output doesn't end with or contain a newline, will readLine miss the output, or still read all that is left? Does reading null mean the process has closed its end of the stream, or is there any other scenario where null could be read?
What is the correct way to handle the streams? Could I do something better than in my example?
waitFor signals that the process ended, but you cannot be sure the threads which collect strings from its stdout and stderr finished also, so using a latch is a step in the right direction, but not an optimal one.
Instead of waiting for the latch, you can wait for the threads directly:
Thread stdoutThread = new Thread(new StreamConsumer(procOut, output));
stdoutThread.start(); // start() returns void, so it cannot be chained into the assignment
Thread stderrThread = ...
...
int ret = exec.waitFor();
stdoutThread.join();
stderrThread.join();
BTW, storing lines in StringBuffers is useless work. Use ArrayList<String> instead, put lines there without any conversion, and finally retrieve them in a loop.
Your approach is right, but it would be better to drop the CountDownLatch and submit the consumers to a thread pool instead of creating threads directly. The pool gives you two futures, which you can then wait on for completion.
You asked: "But I'm not sure if I eventually need to synchronize with those consumer threads, or is it enough just to wait for process to exit with waitFor method, to be certain that all the process output is actually consumed? I.E is it possible, even if the process exits (closes its output stream), there is still unread data in the java end of the stream?"
Yes, this situation can occur. Process termination and the reading of its I/O streams are independent of each other.
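The thread-pool suggestion can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: StreamCollector, collect and demo are hypothetical names, and a ByteArrayInputStream stands in for a process stream so the example runs without launching anything. With a real process you would pass exec.getInputStream() and exec.getErrorStream(), call waitFor(), and then call get() on both futures.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class StreamCollector {

    /** Submits a task that reads every line of the stream; the Future completes at end of stream. */
    static Future<List<String>> collect(ExecutorService pool, InputStream in) {
        return pool.submit(() -> {
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // readLine() returns null only at end of stream
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return lines;
        });
    }

    /** Runs the collector against an in-memory stand-in for process output. */
    static List<String> demo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            InputStream fakeStdout = new ByteArrayInputStream(
                    "first\nlast without newline".getBytes(StandardCharsets.UTF_8));
            Future<List<String>> stdout = collect(pool, fakeStdout);
            return stdout.get(); // blocks until the consumer has seen end of stream
        } finally {
            pool.shutdown();
        }
    }
}
```

The demo input deliberately ends without a line terminator: readLine() still returns that final partial line before it returns null, so trailing output is not lost.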

How to use java.lang.Process class to Provide inputs to another process

Just assume there is a program which takes inputs from the standard input.
For example:
cin>>id;
What I want to figure out is how to execute the process and give some input to its standard input. Getting the output of the process is not an issue for me. It works properly. The question is how to feed inputs for such processes using java.lang.Process class.
If there are any other third party libraries like Apache commons please mention them also.
Thanks in advance!
Use Process.getOutputStream() and write() to it. It's a bit tricky since you use the output stream to input data to the process, but the name reflects the interface that is returned (from your app's point of view it's output because you are writing to it).
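A minimal sketch of that idea, assuming the child program reads whitespace-separated or line-separated values the way cin >> id does. StdinFeeder and feed are hypothetical names; the method takes a plain OutputStream so it can be tried without launching a process, and in real use you would pass process.getOutputStream().

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class StdinFeeder {

    /** Writes each value on its own line, then closes the stream so the child sees end of input. */
    static void feed(OutputStream processStdin, String... values) {
        try (PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(processStdin, StandardCharsets.UTF_8))) {
            for (String value : values) {
                writer.print(value);
                writer.print('\n');
            }
            // PrintWriter swallows IOExceptions; checkError() flushes and reports them.
            if (writer.checkError()) {
                throw new IllegalStateException("could not write to the process input");
            }
        }
    }
}
```

For example, StdinFeeder.feed(process.getOutputStream(), "42") would send the line "42" and then close the child's standard input.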
You need to start a separate thread which reads from the output of one process and writes it as input to the other process.
Something like this should do:
class DataForwarder extends Thread {
OutputStream out;
InputStream in;
public DataForwarder(InputStream in, OutputStream out) {
this.out = out;
this.in = in;
}
@Override
public void run() {
byte[] buf = new byte[1024];
System.out.println("Hej");
try {
int n;
while (-1 != (n = in.read(buf)))
out.write(buf, 0, n);
out.close();
} catch (IOException e) {
// Handle in some suitable way.
}
}
}
Which would be used for prod >> cons as follows:
class Test {
public static void main(String[] args) throws IOException {
Process prod = new ProcessBuilder("ls").start();
Process cons = new ProcessBuilder("cat").start();
// Start feeding cons with output from prod.
new DataForwarder(prod.getInputStream(), cons.getOutputStream()).start();
}
}
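On Java 9 or later, the copy loop in DataForwarder can be replaced by InputStream.transferTo. The following is a sketch of the same forwarding idea, not a drop-in replacement from the answer; StreamForwarder and forward are names invented here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamForwarder {

    /** Starts a thread that copies everything from in to out, then closes both streams. */
    static Thread forward(InputStream in, OutputStream out) {
        Thread t = new Thread(() -> {
            // Closing the sink tells the consuming process that its input has ended.
            try (InputStream source = in; OutputStream sink = out) {
                source.transferTo(sink); // Java 9+; blocks until end of stream
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "stream-forwarder");
        t.start();
        return t;
    }
}
```

Usage would mirror the Test class above: StreamForwarder.forward(prod.getInputStream(), cons.getOutputStream()). Returning the Thread lets the caller join() it before reading the consumer's final output.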

How to write a byte array to OutputStream of process builder (Java)

byte[] bytes = value.getBytes();
Process q = new ProcessBuilder("process","arg1", "arg2").start();
q.getOutputStream().write(bytes);
q.getOutputStream().flush();
System.out.println(q.getInputStream().available());
I'm trying to stream file contents to an executable and capture the output, but the output (InputStream) is always empty. I can capture the output if I specify the file location, but not with streamed input.
How might I overcome this?
Try wrapping your streams with BufferedInputStream() and BufferedOutputStream():
http://download.oracle.com/javase/6/docs/api/java/lang/Process.html#getOutputStream%28%29
Implementation note: It is a good idea for the output stream to be buffered.
Implementation note: It is a good idea for the input stream to be buffered.
Even with buffered streams, the buffer can still fill up if you're dealing with large amounts of data. You can deal with this by starting a separate thread to read from q.getInputStream(), so that you keep reading from the process while you are writing to it.
Perhaps the program you execute only starts its work when it detects the end of its input data. This is normally done by waiting for an EOF (end-of-file) symbol. You can send this by closing the output stream to the process:
q.getOutputStream().write(bytes);
q.getOutputStream().close();
Try this together with waiting for the process.
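Putting the two suggestions together (read on a separate thread, and close the output stream to signal end of input) might look like the sketch below. ProcessExchange, exchange and echoThroughPipe are hypothetical names. To keep the example runnable without an external program, a connected pair of piped streams plays the role of a child that echoes its input; with a real process you would call exchange(q.getOutputStream(), q.getInputStream(), bytes) and then q.waitFor().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;

class ProcessExchange {

    /** Sends input to the child's stdin, closes it, and returns everything read from its stdout. */
    static byte[] exchange(OutputStream childStdin, InputStream childStdout, byte[] input)
            throws IOException {
        // Start reading first so the child never blocks on a full output buffer.
        CompletableFuture<byte[]> output = CompletableFuture.supplyAsync(() -> {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            try {
                int n;
                while ((n = childStdout.read(buffer)) != -1) {
                    collected.write(buffer, 0, n);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return collected.toByteArray();
        });
        try (OutputStream stdin = childStdin) {
            stdin.write(input);
        } // close() is what signals end of input to the child
        return output.join();
    }

    /** Stand-in for a real process: a pipe that hands stdin straight back as stdout, like cat. */
    static byte[] echoThroughPipe(byte[] input) throws IOException {
        PipedOutputStream stdin = new PipedOutputStream();
        PipedInputStream stdout = new PipedInputStream(stdin);
        return exchange(stdin, stdout, input);
    }
}
```

The stand-in pipe holds only 1024 bytes by default, so sending a larger array through echoThroughPipe completes only because the reader is draining concurrently. That is the same reason a single-threaded write-then-read can stall against a real process.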
I don't know if something else may also be wrong here, but the other process ("process") does not even have time to respond: you are not waiting for it, and available() does not block. To try this out, first insert a sleep(2000) after the flush(). If that works, switch to querying q.getInputStream().available() multiple times with short pauses in between.
I think you have to wait until the process has finished.
I implemented something like this:
public class ProcessReader {
private static final int PROCESS_LOOP_SLEEP_MILLIS = 100;
private String result;
public ProcessReader(Process process) {
BufferedReader resultReader = new BufferedReader(new InputStreamReader(process.getInputStream()));
StringBuilder resultOutput = new StringBuilder();
try {
while (!checkProcessTerminated(process, resultReader, resultOutput)) {
}
} catch (Exception ex1) {
throw new RuntimeException(ex1);
}
result = resultOutput.toString();
}
public String getResult(){
return result;
}
private boolean checkProcessTerminated(Process process, BufferedReader resultReader, StringBuilder resultOutput) throws Exception {
try {
int exit = process.exitValue();
return true;
} catch (IllegalThreadStateException ex) {
Thread.sleep(PROCESS_LOOP_SLEEP_MILLIS);
} finally {
while (resultReader.ready()) {
String out = resultReader.readLine();
resultOutput.append(out).append("\n");
}
}
return false;
}
}
I removed some specific code that you don't need, but it should work. Try it.
Regards

design for a wrapper around command-line utilities

I'm trying to come up with a design for a wrapper to use when invoking command-line utilities in Java. The trouble with Runtime.exec() is that you need to keep reading from the process's out and err streams, or it hangs when its buffers fill up. This has led me to the following design:
public class CommandLineInterface {
private final Thread stdOutThread;
private final Thread stdErrThread;
private final OutputStreamWriter stdin;
private final History history;
public CommandLineInterface(String command) throws IOException {
this.history = new History();
this.history.addEntry(new HistoryEntry(EntryTypeEnum.INPUT, command));
Process process = Runtime.getRuntime().exec(command);
stdin = new OutputStreamWriter(process.getOutputStream());
stdOutThread = new Thread(new Leech(process.getInputStream(), history, EntryTypeEnum.OUTPUT));
stdOutThread.setDaemon(true);
stdOutThread.start();
stdErrThread = new Thread(new Leech(process.getErrorStream(), history, EntryTypeEnum.ERROR));
stdErrThread.setDaemon(true);
stdErrThread.start();
}
public void write(String input) throws IOException {
this.history.addEntry(new HistoryEntry(EntryTypeEnum.INPUT, input));
stdin.write(input);
stdin.write("\n");
stdin.flush();
}
}
And
public class Leech implements Runnable{
private final InputStream stream;
private final History history;
private final EntryTypeEnum type;
private volatile boolean alive = true;
public Leech(InputStream stream, History history, EntryTypeEnum type) {
this.stream = stream;
this.history = history;
this.type = type;
}
public void run() {
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
String line;
try {
while(alive) {
line = reader.readLine();
if (line==null) break;
history.addEntry(new HistoryEntry(type, line));
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
My issue is with the Leech class (used to "leech" the process's out and err streams and feed them into history, which acts like a log file). On the one hand, reading whole lines is nice and easy (and what I'm currently doing), but it means I miss the last line (usually the prompt line). I only see the prompt line when executing the next command, because there's no line break until that point.
On the other hand, if I read characters myself, how can I tell when the process is "done" (either complete or waiting for input)?
Has anyone tried something like waiting 100 millis since the last output from the process and declaring it "done"?
Any better ideas on how I can implement a nice wrapper around things like Runtime.exec("cmd.exe")?
Use PlexusUtils; it is used by Apache Maven 2 to execute all external processes.
I was looking for the same thing myself, and I found a Java port of Expect, called ExpectJ. I haven't tried it yet, but it looks promising.
I would read the input in with the stream and then write it into a ByteArrayOutputStream. The byte array will continue to grow until there are no longer any available bytes to read. At this point you will flush the data to history by converting the byte array into a String and splitting it on the platform line.separator. You can then iterate over the lines to add history entries. The ByteArrayOutputStream is then reset and the while loop blocks until there is more data or the end of stream is reached (probably because the process is done).
public void run() {
ByteArrayOutputStream out = new ByteArrayOutputStream();
int bite;
try {
while((bite = stream.read()) != -1) {
out.write(bite);
if (stream.available() == 0) {
String string = new String(out.toByteArray());
for (String line : string.split(
System.getProperty("line.separator"))) {
history.addEntry(new HistoryEntry(type, line));
}
out.reset();
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
This will make sure you pick up that last line of input and it solves your problem of knowing when the stream is ended.

How do I use Java to read from a file that is actively being written to?

I have an application that writes information to file. This information is used post-execution to determine pass/failure/correctness of the application. I'd like to be able to read the file as it is being written so that I can do these pass/failure/correctness checks in real time.
I assume it is possible to do this, but what are the gotchas involved when using Java? If the reading catches up to the writing, will it just wait for more writes up until the file is closed, or will the read throw an exception at this point? If the latter, what do I do then?
My intuition is currently pushing me towards BufferedStreams. Is this the way to go?
Could not get the example to work using FileChannel.read(ByteBuffer) because it isn't a blocking read. Did however get the code below to work:
boolean running = true;
BufferedInputStream reader = new BufferedInputStream(new FileInputStream( "out.txt" ) );
public void run() {
while( running ) {
if( reader.available() > 0 ) {
System.out.print( (char)reader.read() );
}
else {
try {
sleep( 500 );
}
catch( InterruptedException ex ) {
running = false;
}
}
}
}
Of course the same thing would work as a timer instead of a thread, but I leave that up to the programmer. I'm still looking for a better way, but this works for me for now.
Oh, and I'll caveat this with: I'm using 1.4.2. Yes I know I'm in the stone ages still.
If you want to read a file while it is being written and only read the new content, the following will help you achieve that.
To run this program, launch it from a command prompt/terminal window and pass the name of the file to read. It will keep reading the file until you kill the program.
java FileReader c:\myfile.txt
As you type a line of text and save it from Notepad, you will see the text printed in the console.
public class FileReader {
public static void main(String args[]) throws Exception {
if(args.length>0){
File file = new File(args[0]);
System.out.println(file.getAbsolutePath());
if(file.exists() && file.canRead()){
long fileLength = file.length();
readFile(file,0L);
while(true){
if(fileLength<file.length()){
readFile(file,fileLength);
fileLength=file.length();
}
Thread.sleep(500); // pause between checks instead of spinning at full CPU
}
}
}else{
System.out.println("no file to read");
}
}
public static void readFile(File file,Long fileLength) throws IOException {
String line = null;
BufferedReader in = new BufferedReader(new java.io.FileReader(file));
in.skip(fileLength);
while((line = in.readLine()) != null)
{
System.out.println(line);
}
in.close();
}
}
You might also take a look at java channel for locking a part of a file.
http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html
This function of the FileChannel might be a start
lock(long position, long size, boolean shared)
An invocation of this method will block until the region can be locked
I totally agree with Joshua's response: Tailer (from Apache Commons IO) is fit for the job in this situation. Here is an example:
It writes a line to a file every 150 ms, while reading the same file every 2500 ms.
public class TailerTest
{
public static void main(String[] args)
{
File f = new File("/tmp/test.txt");
MyListener listener = new MyListener();
Tailer.create(f, listener, 2500);
try
{
FileOutputStream fos = new FileOutputStream(f);
int i = 0;
while (i < 200)
{
fos.write(("test" + ++i + "\n").getBytes());
Thread.sleep(150);
}
fos.close();
}
catch (Exception e)
{
e.printStackTrace();
}
}
private static class MyListener extends TailerListenerAdapter
{
@Override
public void handle(String line)
{
System.out.println(line);
}
}
}
The answer seems to be "no" ... and "yes". There seems to be no real way to know if a file is open for writing by another application. So, reading from such a file will just progress until content is exhausted. I took Mike's advice and wrote some test code:
Writer.java writes a string to file and then waits for the user to hit enter before writing another line to file. The idea being that it could be started up, then a reader can be started to see how it copes with the "partial" file. The reader I wrote is in Reader.java.
Writer.java
public class Writer extends Object
{
Writer () {
}
public static String[] strings =
{
"Hello World",
"Goodbye World"
};
public static void main(String[] args)
throws java.io.IOException {
java.io.PrintWriter pw =
new java.io.PrintWriter(new java.io.FileOutputStream("out.txt"), true);
for(String s : strings) {
pw.println(s);
System.in.read();
}
pw.close();
}
}
Reader.java
public class Reader extends Object
{
Reader () {
}
public static void main(String[] args)
throws Exception {
java.io.FileInputStream in = new java.io.FileInputStream("out.txt");
java.nio.channels.FileChannel fc = in.getChannel();
java.nio.ByteBuffer bb = java.nio.ByteBuffer.allocate(10);
while(fc.read(bb) >= 0) {
bb.flip();
while(bb.hasRemaining()) {
System.out.println((char)bb.get());
}
bb.clear();
}
System.exit(0);
}
}
No guarantees that this code is best practice.
This leaves the option suggested by Mike of periodically checking if there is new data to be read from the file. This then requires user intervention to close the file reader when it is determined that the reading is completed. Alternatively, the reader needs to be made aware of the content of the file so that it can detect an end-of-write condition. If the content were XML, the end of the document could be used to signal this.
There is an open-source Java graphical tail that does this.
https://stackoverflow.com/a/559146/1255493
public void run() {
try {
while (_running) {
Thread.sleep(_updateInterval);
long len = _file.length();
if (len < _filePointer) {
// Log must have been jibbled or deleted.
this.appendMessage("Log file was reset. Restarting logging from start of file.");
_filePointer = len;
}
else if (len > _filePointer) {
// File must have had something added to it!
RandomAccessFile raf = new RandomAccessFile(_file, "r");
raf.seek(_filePointer);
String line = null;
while ((line = raf.readLine()) != null) {
this.appendLine(line);
}
_filePointer = raf.getFilePointer();
raf.close();
}
}
}
catch (Exception e) {
this.appendMessage("Fatal error reading log file, log tailing has stopped.");
}
// dispose();
}
You can't read a file which is opened from another process using FileInputStream, FileReader or RandomAccessFile.
But using FileChannel directly will work:
private static byte[] readSharedFile(File file) throws IOException {
byte buffer[] = new byte[(int) file.length()];
final FileChannel fc = FileChannel.open(file.toPath(), EnumSet.of(StandardOpenOption.READ));
final ByteBuffer dst = ByteBuffer.wrap(buffer);
fc.read(dst);
fc.close();
return buffer;
}
Not Java per-se, but you may run into issues where you have written something to a file, but it hasn't been actually written yet - it might be in a cache somewhere, and reading from the same file may not actually give you the new information.
Short version - use flush() or whatever the relevant system call is to ensure that your data is actually written to the file.
Note I am not talking about the OS level disk cache - if your data gets into here, it should appear in a read() after this point. It may be that the language itself caches writes, waiting until a buffer fills up or file is flushed/closed.
I've never tried it, but you should write a test case to see if reading from a stream after you have hit the end will work, regardless of if there is more data written to the file.
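The point about language-level buffering can be checked with a small experiment. This is only a sketch: FlushVisibilityDemo and sizesAroundFlush are made-up names, and it relies on the short string fitting inside the writer's in-memory buffer, which holds for the default buffer size of java.io.BufferedWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FlushVisibilityDemo {

    /** Returns the file size observed before and after flush() for a small buffered write. */
    static long[] sizesAroundFlush() throws IOException {
        Path file = Files.createTempFile("flush-demo", ".txt");
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write("hello");
            long before = Files.size(file); // text is still in the writer's in-memory buffer
            writer.flush();
            long after = Files.size(file);  // text has now been handed to the operating system
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A reader polling this file between the write and the flush would see an empty file, which is why a writer that wants its output to be followed in real time should flush after each logical record.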
Is there a reason you can't use a piped input/output stream? Is the data being written and read from the same application (if so, you have the data, why do you need to read from the file)?
Otherwise, maybe read till end of file, then monitor for changes and seek to where you left off and continue... though watch out for race conditions.
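The "seek to where you left off" idea can be sketched as a small class that remembers its offset between polls. IncrementalReader and readNew are hypothetical names, and the sketch makes simplifying assumptions: the file is UTF-8 text, each appended chunk fits in memory, and a shrinking file means it was truncated or replaced.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

class IncrementalReader {
    private final Path file;
    private long offset; // position just past the last byte already returned

    IncrementalReader(Path file) {
        this.file = file;
    }

    /** Returns whatever was appended since the previous call, or "" if nothing is new. */
    String readNew() throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < offset) {
                offset = 0; // file was truncated or replaced; start over
            }
            if (length == offset) {
                return "";
            }
            // Read only up to the length sampled above, so a concurrent append
            // is simply picked up by the next call.
            byte[] chunk = new byte[(int) (length - offset)];
            raf.seek(offset);
            raf.readFully(chunk);
            offset = length;
            return new String(chunk, StandardCharsets.UTF_8);
        }
    }
}
```

A caller would invoke readNew() from a timer or polling loop. Note that a multi-byte character split across two polls would decode incorrectly in this sketch; a production version would need to buffer incomplete trailing bytes or lines.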
