How to read Java InputStream without delay?

I have a C++ program (feeder.exe) that prints some data:
printf("%s\n", line);
On average it produces 20-30 lines per second, but not uniformly.
I want to catch this data in a Java program by running the exe from Java code:
package temp_read;

import java.io.*;

public class Main {
    public static void main(String[] args) throws Throwable {
        Process p = Runtime.getRuntime().exec("d:/feeder.exe");
        InputStream is = p.getInputStream();
        BufferedReader in = new BufferedReader(new InputStreamReader(is));
        String line = null;
        while ((line = in.readLine()) != null) {
            System.out.println(System.currentTimeMillis() + "," + line);
        }
    }
}
But when I look at the output, I see that it receives a batch of lines once every 3-5 seconds.
Question: how can I receive the data from feeder.exe immediately, without any delay, when it prints to stdout?
PS (unrelated question): how do I stop feeder.exe if I stop the Java program with Ctrl+C?

If redirected, stdout is probably buffered, meaning that the problem is in the C++ code and not on the Java side. The C++ process buffers its output and flushes several printf calls at once, as soon as the buffer is full.
If you are able to modify the C++ software, try an fflush(stdout); after each printf to force the output buffer to be flushed.

The most likely cause is that feeder.exe is not flushing its output stream regularly, so the data sits in its output buffer until the buffer fills and is finally written out.
If that is the cause, there is nothing you can do on the Java side to avoid it. The problem can only be fixed by modifying the feeder application.
Note that if the data were already in the pipe that connects the two processes, reading on the Java side would get it: assuming the end-of-line had been written to the pipe, the readLine() call would deliver the line without blocking.
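To rule out delays added on the Java side, here is a minimal sketch (assuming the same d:/feeder.exe path as in the question, and the platform default charset) that bypasses BufferedReader and reads the raw stream byte by byte:

import java.io.IOException;
import java.io.InputStream;

public class UnbufferedMain {
    public static void main(String[] args) throws IOException {
        Process p = new ProcessBuilder("d:/feeder.exe").start();
        InputStream is = p.getInputStream();
        StringBuilder line = new StringBuilder();
        int b;
        // read() returns as soon as a single byte arrives in the pipe,
        // so any remaining burstiness comes from the writing side
        while ((b = is.read()) != -1) {
            if (b == '\n') {
                System.out.println(System.currentTimeMillis() + "," + line);
                line.setLength(0);
            } else if (b != '\r') {
                line.append((char) b);
            }
        }
    }
}

If the timestamps still arrive in 3-5 second bursts with this loop, the data really is being held back in feeder.exe's stdio buffer, not by Java.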

Related

TextArea uses a lot of RAM when reading a big output

My program executes system commands and returns the output line by line. However, a couple of commands produce lots of lines, and in those cases RAM usage rises to ~700 MB, whereas the usual RAM usage for other commands is 50-60 MB.
This is the method that reads the output using BufferedReader. It is called by another method that handles creating the process for the command, and it passes the output line by line to the showOutputLine() method, which prints it to the console or to a TextArea.
protected void formatStream(InputStream inputStream, boolean isError) {
    bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
    String tempLine = null;
    // Read the output line by line
    try {
        while ((tempLine = bufferedReader.readLine()) != null) {
            showOutputLine(tempLine, isError);
        }
    } catch (IOException e) {
        // just stop reading on error
    }
}
One example of a command that causes the issue:
adb logcat
EDIT: it appears BufferedReader is innocent; however, the problem persists. It is caused by JTextArea.
BufferedReader always uses about 16 KB (8K chars * 2 bytes each) in a fixed-size array. If you are using more than this, it is a side effect of generating so many Strings (especially if you have really long lines of text), not of the BufferedReader itself.
A TextArea can retain much more memory, depending on how much text it holds.
In any case, the memory usage that really matters is the size of the heap after a full GC; the rest is overhead of various kinds.
BTW, Mb = megabit, MB = megabyte.
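If the JTextArea is what retains the memory, one common mitigation is to cap how much text it holds by trimming the oldest output as new lines arrive. A sketch (appendLine and MAX_CHARS are illustrative names, not from the original code; calls must happen on the event dispatch thread):

import javax.swing.JTextArea;
import javax.swing.text.BadLocationException;

public class BoundedOutputArea {
    private static final int MAX_CHARS = 200_000;  // assumed cap, tune to taste

    // Append a line, then trim from the start so the document stays bounded.
    static void appendLine(JTextArea textArea, String line) {
        textArea.append(line + "\n");
        int excess = textArea.getDocument().getLength() - MAX_CHARS;
        if (excess > 0) {
            try {
                textArea.getDocument().remove(0, excess);  // drop the oldest text
            } catch (BadLocationException e) {
                // offsets come from the document itself, so this should not happen
            }
        }
    }
}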

Why is BufferedReader readLine reading past EOF

I have a very large file (~6 GB) containing fixed-width text separated by \r\n, so I'm using a BufferedReader to read it line by line. The process can be interrupted or stopped, and if it is, it uses a checkpoint "lastProcessedLineNbr" to fast-forward to the correct place to resume reading. This is how the reader is initialized:
private void initializeBufferedReader(Integer lastProcessedLineNbr) throws IOException {
    reader = new BufferedReader(new InputStreamReader(getInputStream(), "UTF-8"));
    if (lastProcessedLineNbr == null) { lastProcessedLineNbr = 0; }
    // Skip the lines that were already processed before the interruption
    for (int i = 0; i < lastProcessedLineNbr; i++) {
        reader.readLine();
    }
    currentLineNumber = lastProcessedLineNbr;
}
This seems to work fine, and I read and process the data in this method:
public Object readItem() throws Exception {
    if ((currentLine = reader.readLine()) == null) {
        return null;
    }
    currentLineNumber++;
    return parse(currentLine);
}
And again, everything works fine until I reach the last line in the document, where readLine() in the latter method throws an error:
17:06:49,980 ERROR [org.jberet] (Batch Thread - 1) JBERET000007: Failed to run job ProdFileRead, parse, org.jberet.job.model.Chunk#3965dcc8: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:569)
at java.lang.StringBuffer.append(StringBuffer.java:369)
at java.io.BufferedReader.readLine(BufferedReader.java:370)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.rational.batch.reader.TextLineReader.readItem(TextLineReader.java:55)
Curiously, it seems to be reading past the end of the file and allocating so much space that it runs out of memory. I tried looking at the contents of the file using Cygwin: "tail file.txt" printed the expected 10 lines to the console, but "tail file.txt > output.txt" produced an output.txt of about 1.8 GB, far more than the 10 lines I expected. So Cygwin seems to be hitting the same thing. As far as I can tell there is no special EOF character; the data just ends abruptly after the last byte.
Does anyone have an idea how I can get this working? I'm thinking I could resort to counting the bytes read until I reach the full size of the file, but I was hoping there was a better way.
But when I did tail file.txt > output.txt output.txt ended up being like 1.8GB, much larger than the 10 lines I expected
What this indicates to me is that the file is padded with 1.8 GB of binary zeroes, which Cygwin's tail command ignored when writing to the terminal but which Java is not ignoring. This would also explain your OutOfMemoryError: BufferedReader kept reading data, looking for the next \r\n, and never found one before overflowing memory.
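One defensive option (a sketch, not from the answer above) is to bound the line length yourself, so a zero-padded tail fails fast instead of exhausting the heap:

import java.io.IOException;
import java.io.Reader;

public class BoundedLineReader {
    // Reads one \r\n- or \n-terminated line of at most maxLen chars.
    // Returns null at EOF; throws if a "line" grows past maxLen
    // (as the NUL-padded tail of this file would).
    static String readBoundedLine(Reader in, int maxLen) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                int last = sb.length() - 1;
                if (last >= 0 && sb.charAt(last) == '\r') {
                    sb.setLength(last);  // drop the \r of \r\n
                }
                return sb.toString();
            }
            if (sb.length() >= maxLen) {
                throw new IOException("Line exceeds " + maxLen + " chars; file tail may be padded");
            }
            sb.append((char) c);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}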

BufferedReader read() and readLine() both hang trying to get a response from a POP3 server

I'm developing a small terminal app that handles interactions with a POP3 server. However, I'm having a problem where both read() and readLine() from BufferedReader block. My initial attempts used readLine(), but after reading on SO and other sites, I figured the server might not be returning the appropriate characters to mark the end of the line, so I tried read() instead. For some reason, that blocks as well.
Socket s = new Socket(InetAddress.getByName(this.HOST), 110);
BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
PrintWriter out = new PrintWriter(s.getOutputStream(), true);

String res = in.readLine(); // This works fine
System.out.println(res);
res = "";

char[] charRes = new char[1024];
out.println("USER " + this.username);
// res = in.readLine();
in.read(charRes); // Does not work
res = charRes.toString(); // NB: char[].toString() prints the array reference, not its contents
System.out.println(res);
The problem is not with the server because I tested it using Telnet and it works fine. I'm not sure what I'm doing wrong and I would appreciate any help.
My client software is running on a Linux system and I am connecting to a Windows server.
According to the documentation, the read(char[]) method will block until at least one character has been read or the end of the stream is reached.
To check whether data is available before calling read(char[]), you can call the ready() method on the Reader (the equivalent for a raw InputStream is available()).
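For example, a polling loop on the question's BufferedReader might look like this (a sketch; busy-waiting is usually a last resort, and Thread.sleep() must be handled or declared):

while (!in.ready()) {
    Thread.sleep(50);                       // poll until the reader has buffered data
}
int n = in.read(charRes);                   // guaranteed not to block now
String res = new String(charRes, 0, n);     // note: charRes.toString() would not convert the chars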
Based on your original question and the subsequent comments, the main problem seems to be the line endings. You have indicated that the client is running on Linux, where the standard line ending is LF (\n), but the POP3 RFC specifically requires each command to be terminated by CRLF (\r\n). Instead of using out.println() (which automatically appends your system line ending of \n), try using PrintWriter.write(String).
You appear to be using the PrintWriter constructor with autoFlush set to true. This means your stream is flushed automatically when you use println(); HOWEVER, it will not flush automatically when you use write(), so you will also need to call flush(). Again, it is advisable to refer to the documentation. The updated code would look something like this:
out.write("USER " + this.username + "\r\n");
out.flush();
You may also want to check your reads from the input stream. BufferedReader.readLine() treats \r\n as a single line terminator and discards both characters, but if you read into a buffer with read() you will see the trailing \r and may need to strip it yourself.
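Putting it together, the corrected exchange might look like this (a sketch reusing the question's HOST and username fields, with the same java.io / java.net classes as above):

Socket s = new Socket(InetAddress.getByName(this.HOST), 110);
BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
PrintWriter out = new PrintWriter(s.getOutputStream());

System.out.println(in.readLine());            // server greeting, e.g. "+OK ..."

out.write("USER " + this.username + "\r\n");  // POP3 commands must end in CRLF
out.flush();                                  // write() does not auto-flush

System.out.println(in.readLine());            // "+OK" or "-ERR" response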

Reading other process' **unbuffered** output stream

I'm programming a little GUI for a file converter in Java. The file converter writes its current progress to stdout, like this:
Flow_1.wav: 28% complete, ratio=0,447
I wanted to show this in a progress bar, so I'm reading the process's stdout like this:
ProcessBuilder builder = new ProcessBuilder("...");
builder.redirectErrorStream(true);
Process proc = builder.start();
InputStream stream = proc.getInputStream();

byte[] b = new byte[32];
int length;
while (true) {
    length = stream.read(b);
    if (length < 0) break;
    // processing data
}
Now the problem is that regardless of which byte array size I choose, the stream is read in 4 KB chunks. My code runs until length = stream.read(b); and then blocks for quite a while. Once the process has generated 4 KB of output, my program gets the chunk and works through it in 32-byte slices, and then waits again for the next 4 KB.
I tried to force Java to use smaller buffers like this:
BufferedInputStream stream = new BufferedInputStream(proc.getInputStream(), 32);
Or this:
BufferedReader reader = new BufferedReader(new InputStreamReader(proc.getInputStream()), 32);
But neither changed anything.
Then I found this: Process source (around line 87)
It seems that the Process class is implemented in such a way that it pipes the process's stdout to a file. So what proc.getInputStream() actually does is return a stream to that file, and this file seems to be written with a 4 KB buffer.
Does anyone know some kind of workaround for this situation? I just want to get the process' output instantly.
EDIT: As suggested by Ian Roberts, I also tried piping the converter's output into the stderr stream, since that stream doesn't seem to be wrapped in a BufferedInputStream. Still 4 KB chunks.
Another interesting thing: I don't actually get exactly 4096 bytes, but about 5 more. I'm afraid the FileInputStream itself is buffered natively.
Looking at the code you linked to, the process's standard output stream gets wrapped in a BufferedInputStream, but its standard error remains unbuffered. So one possibility might be to execute not the converter directly, but a shell script (or the Windows equivalent, if you're on Windows) that sends the converter's stdout to stderr:
ProcessBuilder builder = new ProcessBuilder("/bin/sh", "-c",
"exec /path/to/converter args 1>&2");
Don't call redirectErrorStream(true), and read from proc.getErrorStream() instead of proc.getInputStream().
It may be that your converter already uses stderr for its progress reporting, in which case you don't need the script bit; just turn off redirectErrorStream(). If the converter writes to both stdout and stderr, you'll need to spawn a second thread to consume stdout as well (the script approach gets around this by sending everything to stderr).
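A sketch of the resulting Java side (a Unix-like system is assumed; /path/to/converter and its args are placeholders), with a second thread draining stdout so the child can never block on a full pipe:

import java.io.*;

public class UnbufferedProgress {
    public static void main(String[] args) throws Exception {
        ProcessBuilder builder = new ProcessBuilder("/bin/sh", "-c",
                "exec /path/to/converter args 1>&2");  // send stdout to stderr
        Process proc = builder.start();                // no redirectErrorStream here

        // Drain the (now unused) stdout pipe on its own thread.
        InputStream stdout = proc.getInputStream();
        new Thread(() -> {
            try {
                byte[] buf = new byte[1024];
                while (stdout.read(buf) != -1) { /* discard */ }
            } catch (IOException ignored) { }
        }).start();

        // Progress lines arrive on the unbuffered stderr pipe.
        BufferedReader err = new BufferedReader(new InputStreamReader(proc.getErrorStream()));
        String line;
        while ((line = err.readLine()) != null) {
            System.out.println(line);  // update the progress bar here
        }
    }
}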

Java socket communication problem

I am using Java for socket communication. The server is reading bytes from the client like this:
InputStream inputStream;
final int BUFFER_SIZE = 65536;
byte[] buffer = new byte[BUFFER_SIZE];
int read;
String msg = "";
// Keep reading until the message contains a \0 or the stream ends
while (msg.indexOf(0) == -1 && (read = inputStream.read(buffer)) != -1) {
    msg += new String(buffer, 0, read);
}
handleMessage(msg);
There is a problem when a client sends multiple messages at once: the server mixes the messages up, e.g.
MSG1: <MyMessage><Hello/>nul
MSG2: </MyMessage><MyMessage><Hello again /></MyMessage>nul
So the tail of message 1 becomes part of message 2.
The nul above represents the \0 character.
Why does the InputStream mix the messages?
Thanks in advance!
You are doing the wrong comparison. You check whether there is a \0 anywhere in the String and then treat everything read so far as a single message. That is wrong; in your example, the \0 actually appears twice in the received data.
You should do it differently: read from the stream char by char (using a wrapping BufferedInputStream, otherwise performance will be awful) and cut the message off when a \0 is reached. At that point the message is complete and you can handle it:
InputStream bin = new BufferedInputStream(inputStream);
InputStreamReader reader = new InputStreamReader(bin);
StringBuilder msgBuilder = new StringBuilder();
int c;
while ((c = reader.read()) != -1) {
    if (c == 0) {
        // \0 marks the end of one message: handle it and start the next
        handleMessage(msgBuilder.toString());
        msgBuilder.setLength(0);
    } else {
        msgBuilder.append((char) c);
    }
}
Even better would be to use the newline character as the message separator; then you could just use the readLine() functionality of BufferedReader.
Sockets and InputStreams carry only a stream of bytes, not messages.
If you want to break up the stream based on a character like \0, you need to do this yourself.
However, in your case it appears you have a bug on the sending side, since the \0 isn't in the right places; it is highly unlikely to be a bug in the InputStream itself.
BTW: building the message with String += is very inefficient; use a StringBuilder instead.
The data you read from an InputStream arrives as it becomes available from the OS, and there is no guarantee about how it will be split up. If you want to split on newlines, you might consider something like this:
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
Then use reader.readLine() to get each line as a String; that is your message.
I think your problem is in your approach:
There is a problem when a client is sending multiple messages at once
Sockets in general just deliver chunks of bytes. The trick is in how you send them: how you mark the start and end of a message, and how you check it for errors (a quick hash can do a lot of good). So the behaviour is fine in my eyes; you just need to work on your message framing if you really need to send multiple messages at once, as sketched below.
Sockets ensure that your bytes arrive intact and in order, but what is IN each message is your concern.
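If you control both ends, one simple framing scheme is to length-prefix every message; a sketch using DataInputStream/DataOutputStream (the method names are illustrative, not from the original code):

import java.io.*;

public class Framing {
    // Sender: write the byte length first, then the UTF-8 payload.
    static void sendMessage(DataOutputStream out, String msg) throws IOException {
        byte[] bytes = msg.getBytes("UTF-8");
        out.writeInt(bytes.length);
        out.write(bytes);
        out.flush();
    }

    // Receiver: read exactly the announced number of bytes, no more, no less.
    static String receiveMessage(DataInputStream in) throws IOException {
        int len = in.readInt();
        byte[] bytes = new byte[len];
        in.readFully(bytes);  // blocks until the whole message has arrived
        return new String(bytes, "UTF-8");
    }
}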

Categories

Resources