InputStream or Reader wrapper for progress reporting - java

So, I'm feeding file data to an API that takes a Reader, and I'd like a way to report progress.
It seems like it should be straightforward to write a FilterInputStream implementation that wraps the FileInputStream, keeps track of the number of bytes read vs. the total file size, and fires some event (or, calls some update() method) to report fractional progress.
(Alternatively, it could report absolute bytes read, and somebody else could do the math -- maybe more generally useful in the case of other streaming situations.)
I know I've seen this before and I may even have done it before, but I can't find the code and I'm lazy. Has anyone got it lying around? Or can someone suggest a better approach?
One year (and a bit) later...
I implemented a solution based on Adamski's answer below, and it worked, but after some months of usage I wouldn't recommend it. When you have a lot of updates, firing/handling unnecessary progress events becomes a huge cost. The basic counting mechanism is fine, but much better to have whoever cares about the progress poll for it, rather than pushing it to them.
(If you know the total size, you can try only firing an event every > 1% change or whatever, but it's not really worth the trouble. And often, you don't.)
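For the record, a poll-style counter is tiny. The sketch below is my own (the class name CountingInputStream is not from any library): it just counts bytes, and whoever cares about progress calls getBytesRead() on its own schedule (say, from a Swing Timer) instead of receiving events:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Counts bytes read; callers poll getBytesRead() rather than subscribing to events.
class CountingInputStream extends FilterInputStream {
    private volatile long bytesRead;

    CountingInputStream(InputStream in) {
        super(in);
    }

    public long getBytesRead() {
        return bytesRead;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) bytesRead++;  // don't count EOF
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) bytesRead += n;
        return n;
    }
    // FilterInputStream.read(byte[]) delegates to the three-arg read above,
    // so it is deliberately NOT overridden (see the double-counting bug below).
}
```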

Here's a fairly basic implementation that fires PropertyChangeEvents when additional bytes are read. Some caveats:
The class does not support mark or reset operations, although these would be easy to add.
The class does not check whether the total number of bytes read ever exceeds the maximum number of bytes anticipated, although this could always be dealt with in client code when displaying progress.
I haven't tested the code.
Code:
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ProgressInputStream extends FilterInputStream {
    private final PropertyChangeSupport propertyChangeSupport;
    private final long maxNumBytes;
    private volatile long totalNumBytesRead;

    public ProgressInputStream(InputStream in, long maxNumBytes) {
        super(in);
        this.propertyChangeSupport = new PropertyChangeSupport(this);
        this.maxNumBytes = maxNumBytes;
    }

    public long getMaxNumBytes() {
        return maxNumBytes;
    }

    public long getTotalNumBytesRead() {
        return totalNumBytesRead;
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
        propertyChangeSupport.addPropertyChangeListener(l);
    }

    public void removePropertyChangeListener(PropertyChangeListener l) {
        propertyChangeSupport.removePropertyChangeListener(l);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {  // don't count EOF as a byte read
            updateProgress(1);
        }
        return b;
    }

    @Override
    public int read(byte[] b) throws IOException {
        return (int) updateProgress(super.read(b));
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return (int) updateProgress(super.read(b, off, len));
    }

    @Override
    public long skip(long n) throws IOException {
        return updateProgress(super.skip(n));
    }

    @Override
    public void mark(int readlimit) {
        throw new UnsupportedOperationException();
    }

    @Override
    public void reset() throws IOException {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean markSupported() {
        return false;
    }

    private long updateProgress(long numBytesRead) {
        if (numBytesRead > 0) {
            long oldTotalNumBytesRead = this.totalNumBytesRead;
            this.totalNumBytesRead += numBytesRead;
            propertyChangeSupport.firePropertyChange("totalNumBytesRead",
                    oldTotalNumBytesRead, this.totalNumBytesRead);
        }
        return numBytesRead;
    }
}

Guava's com.google.common.io package can help you a little. The following is uncompiled and untested but should put you on the right track.
final long total = file1.length();
// AtomicLong (java.util.concurrent.atomic): an anonymous class cannot assign to a captured local
final AtomicLong progress = new AtomicLong();
final OutputStream out = new FileOutputStream(file2);
boolean success = false;
try {
    ByteStreams.readBytes(Files.newInputStreamSupplier(file1),
        new ByteProcessor<Void>() {
            public boolean processBytes(byte[] buffer, int offset, int length)
                    throws IOException {
                out.write(buffer, offset, length);
                updateProgressBar((double) progress.addAndGet(length) / total);
                // or only update it periodically, if you prefer
                return true; // keep processing
            }

            public Void getResult() {
                return null;
            }
        });
    success = true;
} finally {
    Closeables.close(out, !success);
}
This may look like a lot of code, but I believe it's the least you'll get away with. (Note that other answers to this question don't give complete code examples, so it's hard to compare them that way.)

The answer by Adamski works, but there is a small bug: the overridden read(byte[] b) method calls the read(byte[] b, int off, int len) method through the superclass.
So updateProgress(long numBytesRead) is called twice for every read, and you end up with a totalNumBytesRead that is twice the size of the file once the whole file has been read.
Not overriding the read(byte[] b) method solves the problem.

If you’re building a GUI application there’s always ProgressMonitorInputStream. If there’s no GUI involved wrapping an InputStream in the way you describe is a no-brainer and takes less time than posting a question here.
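For completeness, the Swing route looks roughly like this sketch ("data.bin" is a placeholder path; run the loop off the event dispatch thread, and the monitor pops up its dialog automatically if reading takes long enough):

```java
InputStream in = new ProgressMonitorInputStream(
        null, "Reading data.bin", new FileInputStream("data.bin"));
try {
    byte[] buf = new byte[8192];
    while (in.read(buf) != -1) { /* process */ }
} finally {
    in.close(); // also dismisses the progress dialog
}
```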

To complete the answer given by @Kevin Bourillion, it can be applied to network content as well using this technique (which prevents reading the stream twice: once for size and once for content):
final HttpURLConnection httpURLConnection = (HttpURLConnection) new URL(url).openConnection();
InputSupplier<InputStream> supplier = new InputSupplier<InputStream>() {
    public InputStream getInput() throws IOException {
        return httpURLConnection.getInputStream();
    }
};
long total = httpURLConnection.getContentLength();
final ByteArrayOutputStream bos = new ByteArrayOutputStream();
ByteStreams.readBytes(supplier, new ProgressByteProcessor(bos, total));
Where ProgressByteProcessor is an inner class:
public class ProgressByteProcessor implements ByteProcessor<Void> {
    private final OutputStream bos;
    private long progress;
    private final long total;

    public ProgressByteProcessor(OutputStream bos, long total) {
        this.bos = bos;
        this.total = total;
    }

    public boolean processBytes(byte[] buffer, int offset, int length) throws IOException {
        bos.write(buffer, offset, length);
        progress += length; // length is already the byte count; subtracting offset would undercount
        publishProgress((float) progress / total);
        return true;
    }

    public Void getResult() {
        return null;
    }
}

Related

Obtain the number of written bytes from BufferedOutputStream

I am wondering if BufferedOutputStream offers any way to provide the count of bytes it has written. I am porting code from C# to Java. The code uses Stream.Position to obtain the count of written bytes.
Could anyone shed some light on this? This is not a huge deal because I can easily add a few lines of code to track the count. It would be nice if BufferedOutputStream already has the function.
For text there is LineNumberReader, but nothing that counts the progress of an OutputStream. You can add that with a wrapper class, a FilterOutputStream:
public class CountingOutputStream extends FilterOutputStream {
    private long count;
    private int bufferCount;

    public CountingOutputStream(OutputStream out) {
        super(out);
    }

    public long written() {
        return count;
    }

    public long willBeWritten() {
        return count + bufferCount;
    }

    @Override
    public void flush() throws IOException {
        count += bufferCount;
        bufferCount = 0;
        super.flush();
    }

    @Override
    public void write(int b) throws IOException {
        ++bufferCount;
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        bufferCount += len;
        // write through to the underlying stream directly:
        // FilterOutputStream.write(byte[], int, int) would call write(int)
        // byte by byte and double-count
        out.write(b, off, len);
    }
}
One could also consider a MappedByteBuffer (a memory-mapped file), created from a RandomAccessFile or FileChannel, for better speed/memory behavior.
If all this is too involved, use the Files class, which has many utilities such as copy. It uses Path, a generalisation of the (disk I/O) File that also covers files from the internet, files inside zip files, class-path resources and so on.
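A quick sketch of the Files route (temp files keep the example self-contained):

```java
Path src = Files.createTempFile("demo", ".bin");
Files.write(src, new byte[]{1, 2, 3});
Path dst = src.resolveSibling(src.getFileName() + ".copy");
Files.copy(src, dst);                 // no manual stream loop required
System.out.println(Files.size(dst));  // 3
Files.delete(src);
Files.delete(dst);
```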

Invalidate Stream without Closing

This is a followup to anonymous file streams reusing descriptors
As per my previous question, I can't depend on code like this (happens to work in JDK8, for now):
RandomAccessFile r = new RandomAccessFile(...);
FileInputStream f_1 = new FileInputStream(r.getFD());
// some io, not shown
f_1 = null;
FileInputStream f_2 = new FileInputStream(r.getFD());
// some io, not shown
f_2 = null;
FileInputStream f_3 = new FileInputStream(r.getFD());
// some io, not shown
f_3 = null;
However, to prevent accidental errors and as a form of self-documentation, I would like to invalidate each file stream after I'm done using it - without closing the underlying file descriptor.
Each FileInputStream is meant to be independent, with positioning controlled by the RandomAccessFile. I share the same FileDescriptor to prevent any race conditions arising from opening the same path multiple times. When I'm done with one FileInputStream, I want to invalidate it so as to make it impossible to accidentally read from it while using the second FileInputStream (which would cause the second FileInputStream to skip data).
How can I do this?
notes:
the libraries I use require compatibility with java.io.*
if you suggest a library (I prefer built-in Java semantics if at all possible), it must be commonly available (packaged) for Linux (the main target) and usable on Windows (experimental target)
but, Windows support isn't absolutely required
Edit: in response to a comment, here is my workflow:
RandomAccessFile r = new RandomAccessFile(String path, "r");
int header_read;
int header_remaining = 4; // header length, initially
byte[] ba = new byte[header_remaining];
ByteBuffer bb = ByteBuffer.allocate(header_remaining);
while ((header_read = r.read(ba, 0, header_remaining)) > 0) {
header_remaining -= header_read;
bb.put(ba, 0, header_read);
}
byte[] header = bb.array();
// process header, not shown
// the RandomAccessFile above reads only a small amount, so buffering isn't required
r.seek(0);
FileInputStream f_1 = new FileInputStream(r.getFD());
Library1Result result1 = library1.Main.entry_point(f_1);
// process result1, not shown
// Library1 reads the InputStream in large chunks, so buffering isn't required
// invalidate f_1 (this question)
r.seek(0);
int read;
byte[] buffer = new byte[4096];
while ((read = r.read(buffer)) > 0 && library1.continue()) {
    library2.process(buffer, read);
}
}
// the RandomAccessFile above is read in large chunks, so buffering isn't required
// in a previous edit the RandomAccessFile was used to create a FileInputStream. Obviously that's not required, so ignore
r.seek(0);
Reader r_1 = new BufferedReader(new InputStreamReader(new FileInputStream(r.getFD())));
Library3Result result3 = library3.Main.entry_point(r_1);
// process result3, not shown
// I'm not sure how Library3 uses the reader, so I'm providing buffering
// invalidate r_1 (this question) - bonus: frees the buffer
r.seek(0);
FileInputStream f_2 = new FileInputStream(r.getFD());
Library1Result result1 = library1.Main.entry_point(f_2);
// process result1 (reassigned), not shown
// Yes, I actually have to call 'library1.Main.entry_point' *again* - same comments apply as from before
// invalidate f_2 (this question)
//
// I've been told to be careful when opening multiple streams from the same
// descriptor if one is buffered. This is very vague. I assume because I only
// ever use any stream once and exclusively, this code is safe.
//
A pure Java solution might be to create a forwarding decorator that checks on each method call whether the stream is validated or not. For InputStream this decorator may look like this:
public final class CheckedInputStream extends InputStream {
    final InputStream delegate;
    boolean validated;

    public CheckedInputStream(InputStream stream) {
        delegate = stream;
        validated = true;
    }

    public void invalidate() {
        validated = false;
    }

    void checkValidated() {
        if (!validated) {
            throw new IllegalStateException("Stream is invalidated.");
        }
    }

    @Override
    public int read() throws IOException {
        checkValidated();
        return delegate.read();
    }

    @Override
    public int read(byte b[]) throws IOException {
        checkValidated();
        return delegate.read(b);
    }

    @Override
    public int read(byte b[], int off, int len) throws IOException {
        checkValidated();
        return delegate.read(b, off, len);
    }

    @Override
    public long skip(long n) throws IOException {
        checkValidated();
        return delegate.skip(n);
    }

    @Override
    public int available() throws IOException {
        checkValidated();
        return delegate.available();
    }

    @Override
    public void close() throws IOException {
        checkValidated();
        delegate.close();
    }

    @Override
    public synchronized void mark(int readlimit) {
        checkValidated();
        delegate.mark(readlimit);
    }

    @Override
    public synchronized void reset() throws IOException {
        checkValidated();
        delegate.reset();
    }

    @Override
    public boolean markSupported() {
        checkValidated();
        return delegate.markSupported();
    }
}
You can use it like:
CheckedInputStream f_1 = new CheckedInputStream(new FileInputStream(r.getFD()));
// some io, not shown
f_1.invalidate();
f_1.read(); // throws IllegalStateException
Under Unix you could generally avoid such problems by dup'ing the file descriptor.
Since Java does not offer such a feature, one option would be a native library that exposes it; jnr-posix does, for example. On the other hand, jnr depends on far more JDK implementation details than anything in your original question.

Turn off date comment in properties file [duplicate]

Is it possible to force Properties not to add the date comment in front? I mean something like the first line here:
#Thu May 26 09:43:52 CEST 2011
main=pkg.ClientMain
args=myargs
I would like to get rid of it altogether. I need my config files to be diff-identical unless there is a meaningful change.
Guess not. The timestamp is printed in a private method of Properties, and there is no property or flag to control that behaviour.
The only idea that comes to mind: subclass Properties, override store, and copy/paste the content of the store0 method so that the date comment is not printed.
Or provide a custom BufferedWriter that drops its first line (which will fail if you add real comments, because custom comments are printed before the timestamp).
Given the source code of Properties: no, it's not possible. By the way, since Properties is in fact a hash table and its keys are thus not sorted, you can't rely on the properties always being written in the same order anyway.
I would use a custom algorithm to store the properties if I had this requirement. Use the source code of Properties as a starter.
Based on https://stackoverflow.com/a/6184414/242042 here is the implementation I have written that strips out the first line and sorts the keys.
public class CleanProperties extends Properties {
    private static class StripFirstLineStream extends FilterOutputStream {
        private boolean firstlineseen = false;

        public StripFirstLineStream(final OutputStream out) {
            super(out);
        }

        @Override
        public void write(final int b) throws IOException {
            if (firstlineseen) {
                super.write(b);
            } else if (b == '\n') {
                firstlineseen = true;
            }
        }
    }

    private static final long serialVersionUID = 7567765340218227372L;

    @Override
    public synchronized Enumeration<Object> keys() {
        return Collections.enumeration(new TreeSet<>(super.keySet()));
    }

    @Override
    public void store(final OutputStream out, final String comments) throws IOException {
        super.store(new StripFirstLineStream(out), null);
    }
}
Cleaning looks like this
final Properties props = new CleanProperties();
try (final Reader inStream = Files.newBufferedReader(file, Charset.forName("ISO-8859-1"))) {
    props.load(inStream);
} catch (final MalformedInputException mie) {
    throw new IOException("Malformed on " + file, mie);
}
if (props.isEmpty()) {
    Files.delete(file);
    return;
}
try (final OutputStream os = Files.newOutputStream(file)) {
    props.store(os, "");
}
This is useful if you need to rewrite a given xxx.conf file in place. The write method below skips the first line (e.g. #Thu May 26 09:43:52 CEST 2011) that store produces; once the end of that first line is reached, everything afterwards is written through normally.
public class CleanProperties extends Properties {
    private static class StripFirstLineStream extends FilterOutputStream {
        private boolean firstlineseen = false;

        public StripFirstLineStream(final OutputStream out) {
            super(out);
        }

        @Override
        public void write(final int b) throws IOException {
            if (firstlineseen) {
                super.write(b);
            } else if (b == '\n') {
                // keep the line break so subsequent output starts on its own line
                super.write('\n');
                firstlineseen = true;
            }
        }
    }

    private static final long serialVersionUID = 7567765340218227372L;

    @Override
    public synchronized Enumeration<java.lang.Object> keys() {
        return Collections.enumeration(new TreeSet<>(super.keySet()));
    }

    @Override
    public void store(final OutputStream out, final String comments)
            throws IOException {
        super.store(new StripFirstLineStream(out), null);
    }
}
Can you not just flag up in your application somewhere when a meaningful configuration change takes place and only write the file if that is set?
You might want to look into Commons Configuration which has a bit more flexibility when it comes to writing and reading things like properties files. In particular, it has methods which attempt to write the exact same properties file (including spacing, comments etc) as the existing properties file.
You can handle this question by following this Stack Overflow post to retain order:
Write in a standard order:
How can I write Java properties in a defined order?
Then write the properties to a string and remove the comments as needed. Finally write to a file.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
properties.store(baos, null);
String propertiesData = baos.toString(StandardCharsets.UTF_8.name());
// (?m) makes ^ match at every line start, so all comment lines are removed, not just the first
propertiesData = propertiesData.replaceAll("(?m)^#.*(\r?\n)+", "");
FileUtils.writeStringToFile(fileTarget, propertiesData, StandardCharsets.UTF_8);
// you may want to validate the file is readable by reloading it and checking
// that the expected number of keys matches
InputStream is = new FileInputStream(fileTarget);
Properties testResult = new Properties();
testResult.load(is);

Java Sonatype Async HTTP Client Upload Progress

I am trying to implement async file upload with progress with sonatype async http client - https://github.com/sonatype/async-http-client.
I tried the method suggested in the docs. Using transfer listener.
http://sonatype.github.com/async-http-client/transfer-listener.html
I implemented onBytesSent of TransferListener interface (just as test):
public void onBytesSent(ByteBuffer byteBuffer) {
System.out.println("Total bytes sent - ");
System.out.println(byteBuffer.capacity());
}
Then in another thread(because I don't want to block the app) I tried to do the following:
TransferCompletionHandler tl = new TransferCompletionHandler();
tl.addTransferListener(listener);
asyncHttpClient.preparePut(getFullUrl(fileWithPath))
.setBody(new BodyGenerator() {
public Body createBody() throws IOException {
return new FileBodyWithOffset(file, offset);
}
})
.addHeader(CONTENT_RANGE, new ContentRange(offset, localSize).toString())
.execute(handler).get();
Everything is fine. The file is uploaded correctly and very fast. But the issue is that I get messages from onBytesSent in TransferListener only AFTER the upload is finished. For example, the upload completes in 10 minutes; during those 10 minutes I get nothing, and only afterwards is everything printed to the console.
I can't figure out what is wrong with this code. I just tried to follow the docs.
I tried to execute the above code in the main thread and it didn't work either.
Maybe it is a wrong way to implement upload progress listener using this client?
I will answer it myself. I did not manage to resolve the issue with TransferListener. So I tried the other way.
I put the progress logic inside the Body interface implementation (inside the read method):
public class FileBodyWithOffset implements Body {
    private final ReadableByteChannel channel;
    private long actualOffset;
    private final long contentLength;

    public FileBodyWithOffset(final File file, final long offset) throws IOException {
        final InputStream stream = new FileInputStream(file);
        this.actualOffset = stream.skip(offset);
        this.contentLength = file.length() - offset;
        this.channel = Channels.newChannel(stream);
    }

    public long getContentLength() {
        return this.contentLength;
    }

    public long read(ByteBuffer byteBuffer) throws IOException {
        System.out.println(new Date());
        // count the bytes actually read, not the buffer's capacity
        final long read = channel.read(byteBuffer);
        if (read > 0) {
            actualOffset += read;
        }
        return read;
    }

    public void close() throws IOException {
        channel.close();
    }

    public long getActualOffset() {
        return actualOffset;
    }
}
Maybe it is a dirty trick, but at least it works.

How to put data from an OutputStream into a ByteBuffer?

In Java I need to put content from an OutputStream (I fill data to that stream myself) into a ByteBuffer. How to do it in a simple way?
You can create a ByteArrayOutputStream and write to it, and extract the contents as a byte[] using toByteArray(). Then ByteBuffer.wrap(byte []) will create a ByteBuffer with the contents of the output byte array.
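The two steps described above, spelled out (write to the stream, then wrap the extracted array):

```java
ByteArrayOutputStream out = new ByteArrayOutputStream();
out.write(42);
out.write(7);
ByteBuffer buf = ByteBuffer.wrap(out.toByteArray()); // toByteArray() copies once
System.out.println(buf.remaining()); // 2
System.out.println(buf.get());       // 42
```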
There is a more efficient variant of @DJClayworth's answer.
As @seh correctly noticed, ByteArrayOutputStream.toByteArray() returns a copy of the backing byte[], which may be inefficient. However, the backing byte[] and the byte count are both protected members of the ByteArrayOutputStream class. Hence you can create your own variant of ByteArrayOutputStream exposing them directly:
public class MyByteArrayOutputStream extends ByteArrayOutputStream {
    public MyByteArrayOutputStream() {
    }

    public MyByteArrayOutputStream(int size) {
        super(size);
    }

    public int getCount() {
        return count;
    }

    public byte[] getBuf() {
        return buf;
    }
}
Using this class is easy:
MyByteArrayOutputStream out = new MyByteArrayOutputStream();
fillTheOutputStream(out);
return new ByteArrayInputStream(out.getBuf(), 0, out.getCount());
As a result, once all the output is written the same buffer is used as the basis of an input stream.
Though the above-mentioned answers solve your problem, none of them is as efficient as you would expect from NIO. ByteArrayOutputStream or MyByteArrayOutputStream first writes the data into Java heap memory and then copies it to a ByteBuffer, which greatly affects performance.
An efficient implementation would be to write a ByteBufferOutputStream class yourself. Actually, it's quite easy to do: you just have to provide a write() method. See this link for ByteBufferInputStream.
// access the protected members buf & count from the subclass
class ByteArrayOutputStream2ByteBuffer extends ByteArrayOutputStream {
    public ByteBuffer toByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }
}
Try using PipedOutputStream instead of OutputStream. You can then connect a PipedInputStream to read the data back out of the PipedOutputStream.
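A sketch of the piped approach: the producer writes into a PipedOutputStream (normally from another thread; a single thread works here only because the data fits in the pipe's buffer), and the consumer drains the connected PipedInputStream into a ByteBuffer:

```java
PipedOutputStream out = new PipedOutputStream();
PipedInputStream in = new PipedInputStream(out); // connected pair
out.write(new byte[]{1, 2, 3});
out.close(); // signals end-of-stream to the reader

ByteBuffer buf = ByteBuffer.allocate(16);
int b;
while ((b = in.read()) != -1) {
    buf.put((byte) b);
}
buf.flip();
System.out.println(buf.remaining()); // 3
```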
You say you're writing to this stream yourself? If so, maybe you could implement your own ByteBufferOutputStream and plug n' play.
The base class would look like so:
public class ByteBufferOutputStream extends OutputStream {
    //protected WritableByteChannel wbc; //if you need to write directly to a channel
    protected static int bs = 2 * 1024 * 1024; //2MB buffer size, change as needed
    protected ByteBuffer bb = ByteBuffer.allocate(bs);

    public ByteBufferOutputStream(...) {
        //wbc = ... //again for writing to a channel
    }

    @Override
    public void write(int i) throws IOException {
        if (!bb.hasRemaining()) flush();
        bb.put((byte) i);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (bb.remaining() < len) flush();
        bb.put(b, off, len);
    }

    /* do something with the buffer when it's full (perhaps write to a channel?)
    @Override
    public void flush() throws IOException {
        bb.flip();
        wbc.write(bb);
        bb.clear();
    }

    @Override
    public void close() throws IOException {
        flush();
        wbc.close();
    }
    */
}
See this efficient implementation of a ByteBuffer-backed OutputStream with dynamic re-allocation:
/**
 * Wraps a {@link ByteBuffer} so it can be used like an {@link OutputStream}. This is similar to a
 * {@link java.io.ByteArrayOutputStream}, just that this uses a {@code ByteBuffer} instead of a
 * {@code byte[]} as internal storage.
 */
public class ByteBufferOutputStream extends OutputStream {
private ByteBuffer wrappedBuffer;
private final boolean autoEnlarge;
https://gist.github.com/hoijui/7fe8a6d31b20ae7af945
