What are intended use cases for the BitmapFactory.Options.inTempStorage option?
Documentation is pretty terse on this:
Temp storage to use for decoding. Suggest 16K or so.
If I'm not mistaken, it means that if you don't provide the buffer explicitly, one will be created and used internally.
So the only benefit I see is reusing the same 16K buffer for multiple decodings, which seems to have a rather questionable impact on performance/memory usage.
So why do the SDK authors give us control over the temp storage for decoding? Would providing a much larger buffer improve decoding performance?
Can someone expand on this?
It seems that your assumption is the correct one - this option is mainly for recycling the buffer itself.
From the Android Source Code:
// pass some temp storage down to the native code. 1024 is made up,
// but should be large enough to avoid too many small calls back
// into is.read(...) This number is not related to the value passed
// to mark(...) above.
byte [] tempStorage = null;
if (opts != null) tempStorage = opts.inTempStorage;
if (tempStorage == null) tempStorage = new byte[16 * 1024];
This means that if you do not pass in this buffer, it will be allocated for you. Although it does not look like an optimization for most cases, if you load many small images the allocation of a 16K buffer per image might be costly.
Regarding the buffer size, as you can see from the comments in the code, there is no magic number. The native code that decodes the image uses the managed InputStream to fetch the actual raw bytes (from disk/network/etc.), and the allocated buffer is used to pass those bytes across on each read call. So it really depends on the InputStream. For example, a disk InputStream might read from the disk in 4K chunks, in which case 16K is more than enough; passing in a bigger buffer will not improve performance, since the buffer will never fill up with more than 4K on each read call.
In any case, this kind of optimization should only be considered for really specific cases. If you have such a case, you can provide a bigger buffer and see whether it has any effect on performance.
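A minimal sketch of that reuse, assuming you are decoding many small images from disk (the ThumbnailDecoder class is a hypothetical illustration, not an API from the question):

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;

public class ThumbnailDecoder {
    // One 16K scratch buffer shared by every decode instead of a fresh
    // allocation per image.
    private final BitmapFactory.Options options = new BitmapFactory.Options();

    public ThumbnailDecoder() {
        options.inTempStorage = new byte[16 * 1024];
    }

    public Bitmap decode(String path) {
        // Reuses the same temp storage on every call.
        return BitmapFactory.decodeFile(path, options);
    }
}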
Related
I have an Android application that reads a lot of chunks of bytes one by one over the network, then combines them into a large buffer. For example:
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int i = 0;
while (i < 10) {
    // read is an API in a lib that returns byte[].
    byte[] bytes = API.read();
    outputStream.write(bytes);
    i++;
}
...
The question is about the memory for bytes. Is there a way to force Java to use the same chunk of bytes for all the reads, so it does not have to free and allocate memory so much? Will the Java runtime optimize this case? Thanks.
The byte[] will be garbage collected. It is not appropriate to use an NIO ByteBuffer in this case as you are getting byte[] anyway, though it could come in handy later.
With each loop iteration, a new byte[] is created and filled with data, written into the stream, and then no longer referenced. Once memory runs low (or earlier, depending on how your JVM operates) the array will be garbage collected and the memory made available again.
You need not worry about such things most of the time (unless you are concatenating tons of strings, which is extremely inefficient for this reason).
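If the library can expose its data as an InputStream (an assumption; the API.read() in the question hands back a fresh byte[] each time, which you cannot avoid from the caller's side), the standard pattern is to reuse one caller-owned buffer for every read:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamCopier {
    // Reads an entire stream while reusing a single 8K buffer for every read call.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8 * 1024];     // allocated once, reused each iteration
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);            // copies only the bytes actually read
        }
        return out.toByteArray();
    }
}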
I have made an application in Android that lets the user compress and decompress files, and I used the package java.util.zip. Everything works: the speed is fine, and files are compressed and decompressed correctly, together with their directories. The only problem is that the application is not able to compress/decompress large files (greater than 1 GB).
I believe the problem is the size of my buffer. In other code that I've seen, the buffer size is 1024, 2048 or 8192, but my buffer size is based on the size of the chosen file (just to make it flexible). But once the user chooses a large file (with a size of more than 8 digits), that's where the error occurs. I searched over the net and also here on this site but I can't find an answer. My problem is similar to this:
To Compress a big file in a ZIP with Java
Thanks for the future help! :)
EDIT:
Thanks for the comments and answers. They really helped a lot. I thought the buffer in compressing/decompressing in Java meant the size of the file, so in my program I made the buffer size flexible (buffer size = file size). Will someone please explain how the buffer works, so I can understand why it is okay for the buffer to have a fixed value? That would also help me figure out why other people say it is much better if the buffer size is 8K or so. Thanks a lot! :)
If you size the buffer to the size of the file, then it means that you will get an OutOfMemoryError whenever the file size is too big for the available memory.
Use a normal buffer size and let it do its work: buffering the data in a streaming fashion, one chunk at a time, rather than all in one go.
For explanation, see for example the documentation of BufferedOutputStream:
The class implements a buffered output stream. By setting up such an
output stream, an application can write bytes to the underlying output
stream without necessarily causing a call to the underlying system for
each byte written.
So using a buffer is more efficient than non-buffered writing.
And from the write method:
Ordinarily this method stores bytes from the given array into this
stream's buffer, flushing the buffer to the underlying output stream
as needed. If the requested length is at least as large as this
stream's buffer, however, then this method will flush the buffer and
write the bytes directly to the underlying output stream.
Each write causes the in-memory buffer to fill up, until the buffer is full. When the buffer is full, it is flushed and cleared. If you use a very large buffer, you will cause a large amount of data to be stored in memory before flushing. If your buffer is the same size as the input file, then you are saying you need to read the whole content into memory before flushing it. Using the default buffer size is usually just fine. There will be more physical writes (flushes); you avoid exploding memory.
By allowing you to specify a specific buffer size, the API is letting you choose the right balance between memory consumption and i/o to suit your application. If you tune your application for performance, you might end up tweaking buffer size. But the default size will be reasonable for many situations.
It sounds like it would help to simply set a maximum size for the buffer, something like:
// After calculating the buffer size bufSize:
bufSize = Math.min(bufSize, MAXSIZE);
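For the zip use case itself, a minimal sketch of streaming compression with a fixed buffer follows; the class name, 8K buffer size, and single-entry layout are illustrative choices, but the pattern keeps memory use constant no matter how large the input file is:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCopy {
    // Compress one file using a fixed 8K buffer, one chunk at a time.
    public static void zip(File input, File zipFile) throws IOException {
        byte[] buffer = new byte[8 * 1024];                // fixed size, not file size
        try (FileInputStream in = new FileInputStream(input);
             ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipFile))) {
            out.putNextEntry(new ZipEntry(input.getName()));
            int count;
            while ((count = in.read(buffer)) != -1) {      // read a chunk
                out.write(buffer, 0, count);               // compress and write it
            }
            out.closeEntry();
        }
    }
}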
I need a byte buffer class in Java for single-threaded use. The buffer should resize when it's full, rather than throw an exception or something. A very important issue for me is performance.
What would you recommend?
ADDED:
At the moment I use ByteBuffer, but it cannot resize. I need one that can resize.
Any reason not to use the boring normal ByteArrayOutputStream?
As mentioned by miku above, Evan Jones gives a review of different types and shows that it is very application dependent. So without knowing further details it is hard to speculate.
I would start with ByteArrayOutputStream, and only if profiling shows it is your performance bottleneck move to something else. Often when you believe the buffer code is the bottleneck, it will actually be network or other IO - wait until profiling shows you need an optimisation before wasting time finding a replacement.
If you are moving to something else, then other factors you will need to think about:
You have said this is single-threaded use, so the synchronization ByteArrayOutputStream performs on every call is overhead you don't need.
What is the buffer being filled by and fed into? If either end is already wired to use Java NIO, then using a direct ByteBuffer is very efficient.
If you need a circular buffer rather than a plain linear buffer, the Ostermiller Utils are pretty efficient, and GPL'd.
You can use a direct ByteBuffer. Direct memory uses virtual memory to start with and is only backed by physical memory as it is used, i.e. the amount of main memory it consumes resizes automagically.
Create a direct ByteBuffer larger than you need and it will only consume what you use.
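A minimal sketch of that idea; note that whether untouched pages of a direct buffer actually consume physical memory is OS- and JVM-dependent, so treat it as something to verify on your platform:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectBufferSketch {
    public static void main(String[] args) {
        // Reserve a generous ceiling up front; fill it incrementally as data arrives.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024);   // 64 MB ceiling
        byte[] chunk = "some incoming data".getBytes(StandardCharsets.UTF_8);
        buf.put(chunk);          // only the pages actually written are touched
        buf.flip();              // switch from writing to reading
        System.out.println("bytes buffered: " + buf.remaining());
    }
}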
You can also write manual code that checks how full the buffer is; when it is full, allocate a new, larger buffer and copy all the data into it.
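A minimal sketch of that manual approach (essentially what ByteArrayOutputStream already does internally; the class name and initial capacity are arbitrary):

import java.util.Arrays;

public class GrowableByteBuffer {
    private byte[] data = new byte[16 * 1024];
    private int size = 0;

    // Appends bytes, doubling the backing array whenever the write would not fit.
    public void write(byte[] src, int off, int len) {
        if (size + len > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, size + len));
        }
        System.arraycopy(src, off, data, size, len);
        size += len;
    }

    // Returns a copy trimmed to the bytes actually written.
    public byte[] toByteArray() {
        return Arrays.copyOf(data, size);
    }
}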
In Java, I have a method
public int getNextFrame( byte[] buff )
that reads from a file into the buffer and returns the number of bytes read. I am reading from an MJPEG file that has a 5-byte length value, say "07939", followed by that many bytes for the JPEG.
The problem is that the JPEG byte size could overflow the buffer. I cannot seem to find a neat solution for the allocation. My goal is to not create a new buffer for every image. I tried a direct ByteBuffer so I could use its array() method to get direct access to the underlying buffer. The ByteBuffer does not expand dynamically.
Should I be returning a reference to the parameter? Like:
public ByteBuffer getNextFrame( ByteBuffer ref )
How do I find the bytes read? Thanks.
java.io.ByteArrayOutputStream is a wrapper around a byte-array and enlarges it as needed. Perhaps this is something you could use.
Edit:
To reuse it, just call reset() and start over...
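A minimal sketch of that reuse for the MJPEG case in the question, assuming the 5-byte ASCII length prefix described there; the FrameReader class, the 8K chunk size, and the DataInputStream wrapper are illustrative choices:

import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class FrameReader {
    private final ByteArrayOutputStream frame = new ByteArrayOutputStream();
    private final byte[] chunk = new byte[8 * 1024];

    byte[] nextFrame(DataInputStream in) throws IOException {
        frame.reset();                                   // reuse the internal array
        byte[] lenField = new byte[5];
        in.readFully(lenField);                          // e.g. "07939"
        int remaining = Integer.parseInt(new String(lenField, StandardCharsets.US_ASCII));
        while (remaining > 0) {
            int n = in.read(chunk, 0, Math.min(chunk.length, remaining));
            if (n == -1) throw new IOException("truncated frame");
            frame.write(chunk, 0, n);
            remaining -= n;
        }
        return frame.toByteArray();                      // note: this still copies
    }
}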
Just read the required number of bytes. Do not use read(buffer), but use read(buffer, 0, size). If there are more bytes, just discard them; the JPEG is broken anyway.
EDIT:
Allocating a byte[] is so much faster than reading from a file or a
socket that I would be surprised if it made much difference, unless you
have a system where micro-seconds cost money.
The time it takes to read a file of 64 KB is about 10 ms (unless the
file is in memory)
The time it takes to allocate a 64 KB byte[] is about 0.001 ms,
possibly faster.
You can use Apache IO's IOBuffer; however, this expands very expensively.
You can also use a ByteBuffer; the position() will tell you how much data was read.
If you don't know how big the buffer will be and you have a 64-bit JVM, you can create a large direct buffer. This will only allocate memory (by page) when it is used. The upshot is that you can allocate 1 GB but might only ever use 4 KB if that is all you need. A direct buffer doesn't support array(), however; you would have to read from the ByteBuffer using its other methods.
Another solution is to use an AtomicReference<byte[]>: the called method can increase the size as required, but if the buffer is already large enough it will reuse the previous one.
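A minimal sketch of that last idea (the ReusableBuffer class, its initial 16K capacity, and the doubling policy are assumptions for illustration):

import java.util.concurrent.atomic.AtomicReference;

public class ReusableBuffer {
    // Start with a modest buffer; grow it only when a caller needs more.
    private final AtomicReference<byte[]> ref = new AtomicReference<>(new byte[16 * 1024]);

    // Returns a buffer at least minSize bytes long, reusing the old one when possible.
    public byte[] acquire(int minSize) {
        byte[] buf = ref.get();
        if (buf.length < minSize) {
            buf = new byte[Math.max(minSize, buf.length * 2)];
            ref.set(buf);
        }
        return buf;
    }
}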
The usual way of accomplishing this in a high-level API is to either let the user provide an OutputStream and fill it with your data (which can be a ByteArrayOutputStream or something completely different), or to return an InputStream that the user can read to get the data (which will dynamically load the correct parts from the file and stop when finished).
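A hypothetical interface illustrating those two API shapes (neither method name comes from the original code):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public interface FrameSource {
    // Caller supplies the destination; the implementation keeps its buffering internal.
    void writeNextFrameTo(OutputStream sink) throws IOException;

    // Or: the caller reads the frame lazily from the returned stream.
    InputStream openNextFrame() throws IOException;
}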
I have a servlet that acts as a proxy for fetching images by reading the images as bytes off an HttpURLConnection input stream and then writing the bytes to the response output stream. Here's the relevant code snippet:
HttpURLConnection connection = (HttpURLConnection)url.openConnection();
connection.setConnectTimeout(CONNECT_TIMEOUT);
connection.setReadTimeout(READ_TIMEOUT);
InputStream in = connection.getInputStream();
OutputStream out = resp.getOutputStream();
byte[] buf = new byte[1024];
int count = 0;
while ((count = in.read(buf)) >= 0) {
    out.write(buf, 0, count);
}
I would like to start caching the images in the proxy servlet. I'm considering wrapping the byte array and storing it in a Map, but I suspect there is a better way. I've noticed the javax.imageio package, but I have no experience with it and I'm not sure whether it's relevant here. Specifically, I am looking for thoughts on how to store the images, not so much the mechanics of caching.
If you are only caching the images, I would recommend keeping each image as a byte array, not as an Image. Using ImageIO to read the images would decompress them, and they would take up much more memory.
The class WeakHashMap is probably the easiest way to cache things, but you have little control over the way entries are evicted from it.
In some limited cases, a hash map could work. But you need to think about:
(1) How you're going to purge cached images from memory when the cache gets "full" (however you define that -- probably some maximum amount of memory that you want to devote to caching).
(2) How you're going to deal with concurrency.
(3) Relatedly, how you're going to deal with the case where client A requests an image, and then client B requests the same image while it is still being loaded into the cache for client A.
A very simple solution to (1) could be to always store SoftReferences to the image data and let the JVM take care of deciding when to purge them (bearing in mind it could arbitrarily purge them at times beyond your control). Otherwise, you need to develop some kind of eviction policy (first in first out, least recently accessed, smallest/largest image, the image that would take longest to decode if it has to be loaded again, etc.). Only you know your data and usage, so you have to find the right policy.
For (2), ConcurrentHashMap will generally help you out; you may decide to use explicit locks and other concurrency utilities in fancier cases.
For (3), a fairly elegant solution proposed by Goetz et al is to hijack the Future class. In your map, you store a Future to the cached object (or to your "cache entry" object). If a requester finds that a Future has already been added to the map, then it can call get() and wait for the other thread to finish caching the data. (You could achieve a similar effect with an explicit lock and condition, but Future takes some of the work out for you.)
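A minimal sketch of that Future-based memoizer, adapted to the servlet's image fetch; the ImageCache class is hypothetical, the download mirrors the loop in the question, and error handling (such as evicting a Future whose load failed) is omitted:

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ImageCache {
    private final ConcurrentMap<String, Future<byte[]>> cache = new ConcurrentHashMap<>();

    // Returns the cached bytes for a URL, loading them at most once even if
    // several clients request the same image concurrently.
    public byte[] get(final String url) throws InterruptedException, ExecutionException {
        Future<byte[]> f = cache.get(url);
        if (f == null) {
            FutureTask<byte[]> task = new FutureTask<>(new Callable<byte[]>() {
                public byte[] call() throws Exception {
                    return fetch(url);
                }
            });
            f = cache.putIfAbsent(url, task);   // only one thread wins the race
            if (f == null) {
                f = task;
                task.run();                     // the winner performs the load
            }
        }
        return f.get();                         // everyone else waits for the result
    }

    // Plain byte-for-byte download, mirroring the servlet code in the question.
    private byte[] fetch(String url) throws Exception {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = connection.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[1024];
            int count;
            while ((count = in.read(buf)) >= 0) {
                out.write(buf, 0, count);
            }
            return out.toByteArray();
        }
    }
}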
P.S. I agree with the poster who said you probably want to store the images in their original coded form. But from your code I'm assuming that was probably what you were intending all along.