Generating multiple thumbnails efficiently from a single image stream - Java

I have an image upload servlet which receives high-resolution images (5 MB to 75 MB) via HTTP POST. The image data is read from the request input stream and saved to a local disk. I am looking for an efficient mechanism to generate thumbnails of varied sizes (4-5 different sizes, of which the largest is the web image at 1024x768) in parallel (or partly sequentially if not fully parallel), while also saving the stream to disk as the original uploaded file.
What I have come up with so far:
Save the original stream to disk as an image file.
Generate the web image (1024x768), which is the largest of the lot of thumbnails.
Then use this to generate the subsequent smaller images, as that would be faster.
Could someone please suggest a more efficient way? The most desirable approach would be to do this synchronously, but async is also fine if it's very efficient.
Any help in this regard will be much appreciated, preferably in Java.

This is quite an interesting question, as it has a number of points of optimisation.
Your idea of generating a smaller image and then generating the thumbnails from that is probably a good one, but the first thing I would say is that if you have a 75 MB image, it is clearly far bigger than 1024x768 - most likely some multiple of it - in which case you want to make certain you scale the image using Image.SCALE_FAST. What you want is for the scaling to chop the image down to a smaller size by discarding pixels, rather than trying to do anything nicer-looking (and much more expensive) like area averaging. You may even be able to make it go faster by grabbing the int[] for the image and sampling every Nth element to create a new int[] for a new image, scaled down by some factor.
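A rough sketch of that "sample every Nth pixel" idea, assuming the source is already a BufferedImage (the factor n and type are illustrative):

    import java.awt.image.BufferedImage;

    public class FastSample {
        static BufferedImage sampleEveryNth(BufferedImage src, int n) {
            int w = src.getWidth() / n, h = src.getHeight() / n;
            // Grab all the pixels as an int[] in one call.
            int[] pixels = src.getRGB(0, 0, src.getWidth(), src.getHeight(),
                                      null, 0, src.getWidth());
            BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    // Keep one pixel in every n, in both dimensions; the rest
                    // are simply discarded (fast, but no averaging).
                    dst.setRGB(x, y, pixels[(y * n) * src.getWidth() + (x * n)]);
                }
            }
            return dst;
        }
    }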
At that point you will have a smaller image, say roughly 2000 by 2000. You can then take that image and scale it down to the actual thumbnail sizes using something nicer-looking like Image.SCALE_SMOOTH.
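For the final thumbnails, something along these lines (a sketch: getScaledInstance with SCALE_SMOOTH, drawn back into a BufferedImage):

    import java.awt.Graphics2D;
    import java.awt.Image;
    import java.awt.image.BufferedImage;

    public class SmoothScale {
        static BufferedImage scaleSmooth(BufferedImage src, int w, int h) {
            // SCALE_SMOOTH trades speed for quality - fine here, because the
            // intermediate image is already small.
            Image scaled = src.getScaledInstance(w, h, Image.SCALE_SMOOTH);
            BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = out.createGraphics();
            g.drawImage(scaled, 0, 0, null);
            g.dispose();
            return out;
        }
    }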
I would say that you should not write to disk if at all possible (during processing, anyway). If you can do the operation in memory it will be far faster, and that is doubly important where there is parallelism. Unless your server is running an SSD, having two disk-heavy operations running at the same time (like two of these images being rescaled at once, or one image being rescaled to two different sizes) will force your disk to thrash, since the spindle can only read one stream at a time. You will then be at the mercy of your seek time, and you'll quickly find that serialising all the operations is far faster than doing several at once.
I would say rescale them in memory, then write them (synchronized) to an ArrayList, and have another thread read those images sequentially and store them. If you're not sure what I mean, have a look at my answer to another question here:
Producer Consumer solution in Java
This way you parallelise where it's useful (CPU operations) and you do the file writes sequentially (avoiding thrashing).
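A minimal sketch of that arrangement, using a BlockingQueue as the shared (synchronized) collection instead of a hand-synchronized ArrayList; the file naming is illustrative:

    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import javax.imageio.ImageIO;

    public class SequentialWriter {
        // Rescaling threads put finished thumbnails here (the producer side).
        static final BlockingQueue<BufferedImage> done = new LinkedBlockingQueue<>();

        // A single consumer thread drains the queue, so only one file is being
        // written at any moment and the disk never serves two writers at once.
        static void startWriter() {
            Thread writer = new Thread(() -> {
                int n = 0;
                try {
                    while (true) {
                        BufferedImage img = done.take(); // blocks until work arrives
                        ImageIO.write(img, "jpg", new File("thumb-" + (n++) + ".jpg"));
                    }
                } catch (InterruptedException | java.io.IOException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writer.setDaemon(true);
            writer.start();
        }
    }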
Having said that, you need to ask yourself whether parallelising is going to benefit you at all. Does your server have multiple CPUs/cores? If not, this is all moot and you should not bother threading anything, because it will only lose you time.
Further to this, if you expect a lot of these images to be uploaded at once, you may not need to parallelise the processing of each image: most of the time you will have multiple web server threads each processing one image, and that will give you good CPU utilisation across more than one core anyway. For example, if you expect that at any one time there will be four images being uploaded constantly, this would utilise four cores just fine without further parallelisation.
One last note: when you are rescaling the images, once you have the intermediate image you can set the reference to the original image to null to facilitate garbage collection. That way, when you generate the thumbnails, only the intermediate image is in memory, not the original full-size one.

Let me see if I got this right,
You have one large image and want to perform different operations on it at the same time. Some operations involve disk IO.
Option 1:
Start one thread to save the original hi-res image to disk. This will take a long time compared to the other operations, because disk writing is slow.
Start other threads to create thumbnails of the desired sizes. You need to resize the original image; I believe this can be done by cloning the bytes of the original image (in Java, I assume a BufferedImage - a deep-copy sketch follows). You can then resize the clones to the sizes you want. The resizing operation is fast compared to writing to disk.
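Cloning a BufferedImage so each thread gets its own independent copy might look like this (a common idiom, sketched):

    import java.awt.image.BufferedImage;
    import java.awt.image.ColorModel;
    import java.awt.image.WritableRaster;

    public class ImageClone {
        // Deep-copies the pixel data so each thread can resize independently.
        static BufferedImage deepCopy(BufferedImage src) {
            ColorModel cm = src.getColorModel();
            WritableRaster raster = src.copyData(null);
            return new BufferedImage(cm, raster, cm.isAlphaPremultiplied(), null);
        }
    }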
If you have one thread per thumbnail, you can use those threads to save their thumbnails to disk. The problem is that you will finish making the thumbnails quickly, and all these threads will then be writing to disk almost at once. They may be writing to different disk locations rather than to the same physical area of the disk (a locality issue). The result is that the disk writes will be even slower than not doing this in parallel, because the disk has to seek to a new location, write a bit of data, then the CPU context-switches to another thread which writes to another part of the disk (another seek), and so on. So this idea is slow.
Note: use an ExecutorService, which has a thread pool, instead of individual threads. In my example I used one thread per thumbnail because it makes things easier to explain.
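With an ExecutorService, the same idea might look like this (a sketch; resize(...) is a placeholder for whatever scaling code you use):

    import java.awt.image.BufferedImage;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ThumbnailTasks {
        static void submitAll(BufferedImage original, List<int[]> sizes) {
            // Pool sized to the machine rather than one thread per thumbnail.
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            for (int[] wh : sizes) {
                pool.submit(() -> resize(original, wh[0], wh[1]));
            }
            pool.shutdown(); // accept no new tasks; queued ones still run
        }

        static BufferedImage resize(BufferedImage src, int w, int h) {
            return src; // placeholder for real scaling code
        }
    }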
Option 2:
Another way to do it is to designate one thread for disk writing and a few other worker threads for resizing. Cache all thumbnails into a list, from which the disk-writing thread takes them one by one and writes them out.
Option 3:
Lastly, if you have multiple disks, you can give each thread its own disk to write to; then all writes will be (more or less) in parallel.
If you have RAID, the writes will be faster, but not as fast as the multi-disk case just mentioned, because the files are written in series, not in parallel. RAID parallelizes the writing of parts of the same file (to different disks at once).

Efficient Use of Java Heap Memory

The exact question I wanted to ask has already been answered here. Still, I just want to explore a few more possibilities (if there are any).
Scenario: My application is a thread-based, data-centric web app, and the amount of data is decided at run time by the user. A user can request a data operation, which triggers multiple threads, each transporting its own data. Sometimes the data selection crashes the application with an OutOfMemoryError, i.e. insufficient space to allocate a new object in the Java heap. When there are multiple users using the application concurrently, and most of them request big data operations, this situation (OutOfMemoryError) is more likely to occur.
Question: Is there a way I can prevent the whole application from crashing? I can limit the amount of data being pulled into memory, but is there a better way than this? Even after limiting the amount of data per user, multiple concurrent users can generate an OutOfMemoryError. I would rather have one user put on hold or exited than have the whole application go down.
Consistently, I have found the following points helpful:
Stream large data out, combined with a GZIPOutputStream when the client sends Accept-Encoding: gzip (maybe as a servlet filter). This can be done for the file system and the database. There also exist (URL-based) XML pipelines where parts of the XML can be streamed.
This lowers memory costs, and a stream may be throttled (artificially slowed down); a streaming sketch follows after these points.
PDF generation: optimize the PDF - store repeated images only once, use fonts sensibly (ideally the standard PDF fonts, otherwise embedded fonts).
Office documents: use the OpenOffice formats or Microsoft's xlsx/docx variants.
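As a sketch, streaming a result out through GZIP when the client advertises support (the servlet and GZIP calls are standard API; the helper and its parameters are illustrative):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class StreamOut {
        static void copyTo(HttpServletRequest req, HttpServletResponse resp,
                           InputStream data) throws IOException {
            OutputStream out = resp.getOutputStream();
            String enc = req.getHeader("Accept-Encoding");
            if (enc != null && enc.contains("gzip")) {
                resp.setHeader("Content-Encoding", "gzip");
                out = new GZIPOutputStream(out);
            }
            // Copy in small chunks: the whole result never sits in the heap.
            byte[] buf = new byte[8192];
            for (int n; (n = data.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            out.close(); // also flushes the GZIP trailer
        }
    }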
In your case:
Make every process combinable and have it stream its result to one output stream: a branching pipeline of tasks. If such a task might be called with the same parameters, yielding the same data, you could use parametrized URLs and cache the results.
I am aware this answer might not fit.

Should I fill a stream into a string/container for manipulation?

I want to ask this as a general question. If we have a program which reads data from outside the program, should we first put the data in a container and then manipulate it, or should we work directly on the stream, if the stream API of the language is powerful enough?
For example: I am writing a program which reads from a text file. Should I first put the data in a string and then manipulate that, instead of working directly on the stream? I am using Java, and let's say its stream classes are powerful enough for my needs.
Stream processing is generally preferable to accumulating data in memory.
Why? One obvious reason is that the file you are reading might not even fit into memory. You might not even know the size of the data before you've read it completely (imagine that you are reading from a socket or a pipe rather than a file).
It is also more efficient, especially when the size isn't known ahead of time: allocating large chunks of memory and moving data around between them can be taxing. Things like processing and concatenating large strings aren't free either.
If the I/O is slow (ever tried reading from a tape?), or if the data is being produced in real time by a peer process (socket/pipe), your processing of the data can, at least in part, happen in parallel with the reading, which speeds things up.
Stream processing is also inherently easier to scale and parallelize if necessary, because your logic is forced to depend only on the element currently being processed; you are free from state. If the amount of data becomes too large to process sequentially, you can trivially scale your app by adding more readers and splitting the stream between them.
You might argue that none of this matters because the file you are reading is only 300 bytes. Indeed, for small amounts of data this is not crucial (you may also bubble-sort it while you are at it), but adopting good patterns and practices makes you a better programmer and will help when it does matter. There is no disadvantage to it. No, it does not make your code more complicated. It might seem so at first, but that's simply because you are not used to stream processing. Once you get into the right mindset and it becomes natural to you, you'll see that, if anything, code that deals with one small piece of data at a time, without caring about indexes, pointers and positions, is simpler than the alternative.
All of the above applies to sequential processing, though: you read the stream once, process the data immediately as it comes in, and discard it (or perhaps write it out to the next stream in the pipeline).
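In Java, the streaming version might look like this (a sketch; the file name and the transformation are just examples):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class StreamLines {
        public static void main(String[] args) throws IOException {
            // Each line is read, transformed and discarded in turn, so memory
            // use stays flat no matter how large the file is.
            try (Stream<String> lines = Files.lines(Paths.get("input.txt"))) {
                lines.map(String::toUpperCase)
                     .forEach(System.out::println);
            }
        }
    }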
You mentioned RandomAccessFile ... that's a completely different beast. If you need random access and the data fits in memory, put it in memory. Seeking back and forth in the file is conceptually the same thing, only much slower. There is no benefit to it other than saving memory.
You should certainly process it as you receive it. The other way adds latency and doesn't scale.

How to write a Java thread pool program to read the content of a file?

I want to define a thread pool with 10 threads and read the content of a file. But different threads must not read the same content (i.e. divide the content into 10 pieces and have each piece read by one thread).
Well, what you would do is roughly this (a code sketch follows the list):
1. Get the length of the file.
2. Divide by N.
3. Create N threads.
4. Have each one seek to (file_size / N) * thread_no and read (file_size / N) bytes into a buffer.
5. Wait for all threads to complete.
6. Stitch the buffers together.
(If you were slightly clever about it, you could avoid the last step ...)
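A sketch of those steps, assuming the file size divides evenly by N (the remainder is ignored here for brevity; the file name and thread count are illustrative):

    import java.io.RandomAccessFile;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ChunkedRead {
        public static void main(String[] args) throws Exception {
            final int N = 10;
            final String path = "data.bin";

            long size;
            try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
                size = f.length();                      // step 1
            }
            final long chunk = size / N;                // step 2

            byte[][] parts = new byte[N][];
            ExecutorService pool = Executors.newFixedThreadPool(N); // step 3
            for (int i = 0; i < N; i++) {
                final int id = i;
                pool.submit(() -> {                     // step 4
                    try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
                        f.seek(chunk * id);             // this thread's slice
                        parts[id] = new byte[(int) chunk];
                        f.readFully(parts[id]);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES); // step 5
            // step 6: parts[0..N-1], concatenated in order, reproduce the file.
        }
    }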
HOWEVER, it is doubtful that you would get much speed-up by doing this. Indeed, I wouldn't be surprised if you got a slow-down in many cases. With a typical OS, I would expect you would get as good, if not better, performance by reading the file with one big read(...) call from one thread.
The OS can fetch the data from the disc faster if you read it sequentially. Indeed, a lot of OSes optimize for this use-case, using read-ahead and in-memory buffering (OS-level buffers) to achieve high effective file read rates.
Reading a file with multiple threads means that each thread will typically be reading from a different position in the file. Naively, that would entail the OS seeking the disk heads backwards and forwards between the different positions, which slows down I/O considerably. In practice, the OS will do various things to mitigate that, but even so, simultaneously reading data from different positions on a disk is still bad for I/O throughput.

Channel for sharing data between threads

I have a requirement where I need to read a text file, transform it, and write it to another file. I wish to do this in a parallel fashion: one thread for reading, one for transforming, and another for writing.
Now, to share data between the threads I need some channel. I was thinking of using a BlockingQueue for this, but I would like to explore other (better) alternatives, if available.
Guava has an EventBus, but I am not sure whether it is a good fit for this requirement. What other alternatives are available, and which is best from a performance point of view?
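For concreteness, the BlockingQueue pipeline the question describes might be wired up like this (a sketch; the file names and the transform are illustrative, and a poison-pill string marks end of input):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.PrintWriter;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class PipelineDemo {
        static final String EOF = "\u0000EOF"; // poison pill

        public static void main(String[] args) {
            BlockingQueue<String> readToTransform = new ArrayBlockingQueue<>(1024);
            BlockingQueue<String> transformToWrite = new ArrayBlockingQueue<>(1024);

            new Thread(() -> {                  // reader
                try (BufferedReader in = new BufferedReader(new FileReader("in.txt"))) {
                    String line;
                    while ((line = in.readLine()) != null) readToTransform.put(line);
                    readToTransform.put(EOF);
                } catch (Exception e) { e.printStackTrace(); }
            }).start();

            new Thread(() -> {                  // transformer
                try {
                    String line;
                    while (!(line = readToTransform.take()).equals(EOF)) {
                        transformToWrite.put(line.toUpperCase()); // example transform
                    }
                    transformToWrite.put(EOF);
                } catch (Exception e) { e.printStackTrace(); }
            }).start();

            new Thread(() -> {                  // writer
                try (PrintWriter out = new PrintWriter(new FileWriter("out.txt"))) {
                    String line;
                    while (!(line = transformToWrite.take()).equals(EOF)) out.println(line);
                } catch (Exception e) { e.printStackTrace(); }
            }).start();
        }
    }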
Unless your transform step is really intensive, this is probably a waste of time.
Think of it this way. What are you asking for?
You're asking for something that:
1. takes an incoming stream of data,
2. copies it to another thread, and
3. presents it to that thread as an incoming stream of data.
What data structure best represents an incoming stream of data for step 3? (Hint: it's the InputStream you started with!)
What value do the first two steps add? The "transform" thread can read from disk just as fast as it could read from disk through another thread. Adding a thread in between does not speed up the disk read.
You would start to consider adding another thread when:
1. your problem can be usefully divided into independent pieces of work (say, each thread works on a chunk of text);
2. the cost of splitting the problem into those pieces of work is significantly smaller than the overhead of adding an additional thread and coordinating between them (which is small, but not free!); and
3. the problem requires more resources than a single CPU can provide (a thread gives you access to more CPU resources, but doesn't provide much value in terms of I/O throughput).

Java NIO: ByteBuffer allocation to fit the largest dataset?

I'm working on an online game and I've hit a little snag while working on the server side of things.
When using nonblocking sockets in Java, what is the best course of action to handle complete packet data sets that cannot be processed until all the data is available? For example, sending a large 2D tiled map over a socket.
I can think of two ways to handle it:
Allocate the ByteBuffer large enough to hold the complete data set (in my example, a large 2D tiled map). Keep adding read data to the buffer until it has all been received, and process from there.
If the ByteBuffer is smaller (perhaps 1500 bytes), do subsequent reads and write the data out to a file until it can be processed completely from the file. This avoids having large ByteBuffers, but degrades performance because of the disk I/O.
I'm using a dedicated ByteBuffer for every SocketChannel so that I can keep reading data until it is complete enough to process. The problem: if my 2D tiled map amounts to 2 MB in size, is it really wise to use 1000 2 MB ByteBuffers (assuming 1000 is the client connection limit and they are all in use)? There must be a better way that I'm not thinking of.
I'd prefer to keep things simple, but I'm open to any suggestions and appreciate the help. Thanks!
Probably the best solution for now is to use the full 2 MB ByteBuffer and let the OS take care of paging to disk (virtual memory) if that's necessary. You probably won't have 1000 concurrent users right away, and when you do, you can optimize. You may be surprised what your real performance issues turn out to be.
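A sketch of the per-channel buffer approach, assuming messages carry a 4-byte length prefix so the reader can tell when a complete data set has arrived (the framing format is an assumption, not part of the question):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public class PacketReader {
        // One buffer per SocketChannel, sized for the largest expected message.
        private final ByteBuffer buf = ByteBuffer.allocate(2 * 1024 * 1024);

        /** Returns a complete message, or null if more reads are needed. */
        byte[] readMessage(SocketChannel ch) throws IOException {
            if (ch.read(buf) == -1) throw new IOException("channel closed");
            buf.flip();
            if (buf.remaining() >= 4) {
                int len = buf.getInt(0);          // 4-byte length prefix (assumed)
                if (buf.remaining() >= 4 + len) {
                    byte[] msg = new byte[len];
                    buf.position(4);
                    buf.get(msg);
                    buf.compact();                // keep any trailing bytes
                    return msg;
                }
            }
            buf.compact(); // not complete yet: back to write mode, data kept
            return null;
        }
    }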
I decided the best course of action was simply to reduce the size of my massive dataset and send tile updates instead of entire map updates. That way I can send a list of the tiles that have changed on a map instead of sending the whole map again. This removes the need for such a large buffer, and I'm back on track. Thanks.
