Executor Service and Huge IO - java

I have a service which calls a database and performs a callback on each result.
ExecutorService service = Executors.newFixedThreadPool(10);
service.execute(runnable(segmentID, callback)); // database is segmented
Runnable is:
call database - collect all the rows for the segment keep in memory
perform callback(segment);
Now the issue is that the database returns a huge number of rows, and my understanding is that the executor service will schedule more work whenever threads are idle on I/O. So I run out of memory.
Is there any way to restrict execution to only 10 threads running at a time, with no extra executor service scheduling?
For some reason I have to keep all the rows of a segment in memory.
How can I avoid going OOM while doing this? Is ExecutorService with newFixedThreadPool the solution?
Please let me know if I missed anything.
Thanks

You must use a fixed thread pool. There's a rule of thumb that you should only spawn N threads, where N is on the same order of magnitude as the number of cores in the CPU. There's some debate about the size of N, and you can read more about it here. For a normal CPU we could be talking 4, 8, or 16 threads.
But even if you were running your program in a cluster, which I think you are not, you can't just fetch 20k rows from a DB and expect to spawn 20k threads. If you do, your app's performance is going to degrade badly, because most of the CPU cycles would be consumed by context switching.
Even with a fixed thread pool, though, you may run into OOM exceptions anyway if all the fetched data is kept in memory at the same time. I think the only solution to this is to fetch smaller chunks of data, or to write the data to a file as it is downloaded.
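One way to bound both the threads and the queued work is a `ThreadPoolExecutor` with a small bounded queue and `CallerRunsPolicy`, so the submitting thread throttles itself instead of queueing unbounded segments (and their rows) in memory. A minimal sketch, where the printed line stands in for the asker's DB fetch and callback:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedSubmit {
    public static void main(String[] args) throws InterruptedException {
        // 10 worker threads, a queue of at most 10 waiting segments, and
        // CallerRunsPolicy: when the queue is full, the submitting thread
        // runs the task itself, which throttles submission. At most ~21
        // segments (10 running + 10 queued + 1 in the caller) are ever
        // in flight at once.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                10, 10, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(10),
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int segmentID = 0; segmentID < 100; segmentID++) {
            final int id = segmentID;
            // Stand-in for: fetch the segment's rows, then run the callback.
            pool.execute(() -> System.out.println("processed segment " + id));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

The same throttling effect can also be had by submitting through a `Semaphore` acquired before `execute` and released at the end of each task.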

Related

how can i utilize the power of CLUSTER ENVIRONMENT for my thread pool that is dealing with I/O bound jobs?

I have developed a Java-based server with a thread pool that grows dynamically with respect to the client request rate. This strategy is known as FBOS (Frequency Based Optimization Strategy) for thread pool systems.
For example, if the request rate is 5 requests per second then my thread pool will have 5 threads to service client requests. The client requests are I/O-bound jobs of 1 second each, i.e. each request is a runnable Java object that calls sleep() to simulate an I/O operation.
If the client request rate is 10 requests per second then my thread pool will have 10 threads in it to process clients. Each thread has an internal timer object that is activated when its thread is idle; when the idle time reaches 5 seconds, the timer deletes the thread from the pool, dynamically shrinking it.
My strategy works well for short I/O intensities. My server works nicely for small request rates, but for large request rates my thread pool holds a large number of threads. For example, if the request rate is 100 requests per second then my thread pool will have 100 threads in it.
Now I have 3 questions in my mind
(1) Can I face memory leaks using this strategy at a large request rate?
(2) Can the OS or JVM face excessive thread-management overhead at a large request rate that will slow down the system?
(3) Last and very important: I am very curious to implement my thread pool in a clustered environment (I am a DUMMY in clustering).
I just want advice from all of you on how a clustered environment can give me more benefit in the scenario of a frequency-based thread pool for I/O-bound jobs only. That is, can a clustered environment give me the benefit of using the memory of other systems (nodes)?
The simplest solution is a cached thread pool (see Executors); I suggest you try this first. It creates as many threads as are needed at once. For I/O-bound requests, a single machine can easily expand to thousands of threads without needing an additional server.
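A sketch of what the cached pool gives you over the hand-rolled timer logic: it grows to match concurrent requests, reuses idle threads, and retires threads idle for 60 seconds (its default keep-alive, playing the role of the 5-second timer). The 100 ms sleep below is a stand-in for the I/O-bound job:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger served = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(100); // simulated I/O-bound request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                served.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("served " + served.get() + " requests");
    }
}
```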
Can I face memory leaks using this strategy at a large request rate?
No, 100 per second is not particularly high. If you are talking over 10,000 per second, you might have a problem (or need another server).
Can the OS or JVM face excessive thread-management overhead at a large request rate that will slow down the system?
Yes; my rule of thumb is that 10,000 threads waste about one CPU in overhead.
Last and very important: I am very curious to implement my thread pool in a clustered environment (I am a DUMMY in clustering).
Given that you appear to be using up to 1% of one machine, I wouldn't worry about using multiple machines to do the I/O. Most likely you want to process the results, but without more information one can't say whether more machines would help or not.
Can a clustered environment give me the benefit of using the memory of other systems (nodes)?
It can help if you need it or it can add complexity you don't need if you don't need it.
I suggest you start with a real problem and look for a solution to solve it, rather than start with a cool solution and try to find a problem for it to solve.

Define Parallel Processing Thread Pool Count and Sleep time

I need to update 550,000 records in a table while the JBoss server is starting up. I need to make this update a background process with multiple threads and parallel processing. The application is Spring, so I can use an initializing bean for this.
To perform the parallel processing I am planning to use the Java executor framework.
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(50);
How do I decide the thread pool count?
I think this depends on the hardware. Mine is 16 GB RAM and a Core i3 processor.
Is it good practice to call Thread.sleep(20) while processing this big update in the background?
I don't know much about Spring processing specifically, but your questions seem general enough that I can still provide a possibly inadequate answer.
Generally there's a lot of factors that go into how many threads you want. You definitely don't want multiple threads on a core, as that'll slow things down as threads start contending for CPU time instead of working, so probably your core count would be your ceiling, or maybe core count - 1 to allow one core for all other tasks to run on (so in your case maybe 3 or 4 cores, tops, if I remember core counts for i3 processors right). However, in this case I'd guess you're more likely to run into I/O and/or memory/cache bottlenecks, since when those are involved, those are more likely to slow down your program than insufficient parallelization. In addition, the tasks that your threads are doing would affect the number of threads you can use; if you have one thread to pull data in and one thread to dump data back out after processing, it might be possible for those threads to share a core.
I'm not sure why this would be a good idea... What use do you see for Thread.sleep() while processing? I'd guess it'd actually slow down your processing, because all you're doing is putting threads to sleep when they could be working.
In any case, I'd be wary of parallelizing what is likely to be an I/O bound task. You'll definitely need to profile to see where your bottlenecks are, even before you start parallelizing, to make sure that multiple cores will actually help you.
If it is the CPU that is adding extra time to complete your task, then you can start parallelizing. Even then, be careful about cache issues; try to make sure each thread works on a totally separate chunk of data (e.g. through ThreadLocal) so cache/memory issues don't limit any performance increases. One way this could work is by having a reader thread dump data into a Queue which the worker threads can then read into a ThreadLocal structure, process, etc.
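A rough sketch of that reader-plus-workers shape: one thread feeds records into a bounded queue, each worker drains it until an end-of-stream marker, and workers never share mutable data. The record strings and the counter are stand-ins for the real rows and update logic:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderWorkerPipeline {
    static final String EOF = "<<eof>>"; // end-of-stream marker ("poison pill")

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);
        AtomicInteger processed = new AtomicInteger();
        int workers = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    // Each worker takes records until it sees the marker.
                    for (String rec; !(rec = queue.take()).equals(EOF); ) {
                        processed.incrementAndGet(); // stand-in for the real update
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The "reader": in the real app this would stream rows from the DB.
        for (int i = 0; i < 1000; i++) queue.put("record-" + i);
        for (int i = 0; i < workers; i++) queue.put(EOF); // one marker per worker

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("processed " + processed.get() + " records");
    }
}
```

The bounded queue also caps how many records are in memory at once, which matters at the 550,000-record scale in the question.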
I hope this helped. I'll keep updating as the mistakes I certainly made are pointed out.

Clarification on Thread performance processing 1000's of log files

I am extracting lines matching a pattern from log files, so I allotted each log file to a Runnable object which writes the matching lines to a result file (with well-synchronised writer methods).
Important snippet under discussion :
ExecutorService executor = Executors.newFixedThreadPool(NUM_THREAD);
for (File eachLogFile : hundredsOfLogFilesArrayObject) {
executor.execute(new RunnableSlavePatternMatcher(eachLogFile));
}
Important Criteria :
The number of log files could be very few, like 20, or for some users could exceed 1000. I recorded a series of tests in an Excel sheet and I am really concerned about the RED-marked results. 1. I assume that if the number of threads created equals the number of files to be processed, then the processing time would be less than when the number of threads is smaller than the number of files, which didn't happen. (Please advise me if my understanding is wrong.)
Result :
I would like to identify a value for the NUM_THREAD which is efficient for less number of files as well as 1000's of files
Please suggest answers for Questions 1 & 2.
Thanks !
Chandru
You just found that your program is not CPU bound but (likely) I/O bound.
This means that beyond about 10 threads the OS can't keep up with the requested reads of all the threads that want their data, and additional threads just end up waiting for their next block of data.
Also, because writing the output is synchronized across all threads, that may even be the biggest bottleneck in your program (a producer-consumer solution may be the answer here, to minimize the time threads spend waiting to write output).
The optimal number of threads depends on how fast you can read the files (the faster you can read, the more threads are useful).
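The producer-consumer idea mentioned above can be sketched as follows: matcher threads enqueue matched lines and a single dedicated writer drains them, so no worker ever blocks on a synchronized write method. Class and variable names here, and the fake one-match-per-file output, are illustrative only:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class SingleWriterDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> matches = new LinkedBlockingQueue<>();
        String POISON = "<<done>>"; // tells the writer to stop

        // One writer thread owns the output; matchers only enqueue.
        Thread writer = new Thread(() -> {
            try {
                for (String line; !(line = matches.take()).equals(POISON); ) {
                    System.out.println(line); // stand-in for the result-file write
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService matchers = Executors.newFixedThreadPool(4);
        for (int f = 0; f < 8; f++) {
            final int file = f;
            // Real code would scan the file for the pattern here.
            matchers.execute(() -> matches.add("match from file " + file));
        }
        matchers.shutdown();
        matchers.awaitTermination(30, TimeUnit.SECONDS);
        matches.add(POISON);
        writer.join();
    }
}
```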
It appears that 2 threads is enough to use all your processing power. Most likely you have two cores and hyper threading.
Mine is an Intel i5, 2.4 GHz, 4 CPUs, 8 GB RAM. Is this detail helpful?
Depending on the model, this has 2 cores and hyper-threading.
I assume that if the number of threads created is equal to the number of files to be processed then the processing time would be less,
This will maximise the overhead, but won't give you more cores than you already have.
When parallelizing, using many more threads than you have available CPU cores will usually increase the overall time. Your system will spend overhead time switching from thread to thread on one CPU core instead of executing the tasks one after another.
If you have 8 cpu cores on your computer, you might observe some improvement using 8/9/10 threads instead of using only 1 while using 20+ threads will actually be less efficient.
One problem is that I/O doesn't parallelize well, especially if you have a non-SSD, since sequential reads (what happens when one thread reads a file) are much faster than random reads (when the read head has to jump around between different files read by several threads). I would guess you could speed up the program by reading the files from the thread sending the jobs to the executor:
for (File file : hundredsOfLogFilesArrayObject) {
    byte[] fileContents = readContentsOfFile(file);
    executor.execute(new RunnableSlavePatternMatcher(fileContents));
}
As for the optimal thread count, that depends.
If your app is I/O bound (which is quite possible if you're not doing extremely heavy processing of the contents), a single worker thread which can process the file contents while the original thread reads the next file will probably suffice.
If you're CPU bound, you probably don't want many more threads than you've got cores:
ExecutorService executor = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());
Although, if your threads get suspended a lot (waiting for synchronization locks, or something), you may get better results with more threads. Or if you've got other CPU-munching activities going on, you may want fewer threads.
You can try using a cached thread pool.
public static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available.
You can read more here

How to make a Java Application faster?

I have a billing daemon that must process hundreds of thousands of records very quickly. I implemented ExecutorService for parallel processing. It did increase the speed, but not by much: it takes approximately 2.5-3 hours to process 100,000 records. How can I make it faster, say processing that data within half an hour?
I have written the following for execution setting:
-Xms2048M -Xmx2048M -XX:MaxPermSize=256m
I tried to implement a Producer Consumer model with 1 producer and 4 consumers. Each list can contain 10,000 records.
ArrayBlockingQueue<BillableList> list = new ArrayBlockingQueue<BillableList>(10);
ExecutorService threadPool = Executors.newFixedThreadPool(5);
threadPool.execute(new Consumer("pool1", list));
threadPool.execute(new Consumer("pool2", list));
threadPool.execute(new Consumer("pool3", list));
threadPool.execute(new Consumer("pool4", list));
Future producerStatus = threadPool.submit(new Producer("Producer", list));
producerStatus.get();
threadPool.shutdown();
I also get a lot of "database lock wait timeout exceeded" exceptions while updating records in the database. Is this because different consumers try to update the same user at the same time? How can I make different consumers take different data from the ArrayBlockingQueue?
The only possible answer to this is: use a profiler and find out why it's slow. You can't do anything about a problem when you don't know where the problem is. What are you going to do, pick a random function and micro-optimize it? Get profiler data, or nothing will ever happen.
How can I make it faster, say processing that data within half an hour?
If adding threads did not help, then chances are you are limited not by CPU but by some other factor, most likely disk or network I/O. As mentioned, profiling your code should show you the culprit.
I also get a lot of "database lock wait timeout exceeded" exceptions while updating records to the database.
And there's your big clue. Regardless of how many threads are working on the job, if they are all waiting for the database, then adding threads won't make it faster.
Here are some ideas:
Increase the physical speed of your database box. SSDs can provide wondrous improvements for IO intensive operations. Increasing the memory can also give a big win because of disk cache.
Consider sharding your data and writing to multiple database instances. This may not be possible given your schema.
Consider turning off auto-commit and manually committing after every ~100 or so operations.
Watch out for indexes. If you are doing some sort of bulk load, your inserts will often run faster if you drop the indexes first. Re-adding the indexes at the end takes a while, but it is still a net win.
Also, if you are doing queries, make sure you have good indexes where needed. Check your database logs to see which queries are taking too long to see if you are missing some indexes in key places.
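The commit-every-~100-operations idea from the list above can be sketched without a live database by buffering updates and invoking a commit callback per batch. `BatchCommitter` and its names are made up for illustration; in real JDBC code the callback would wrap `PreparedStatement.executeBatch()` followed by `Connection.commit()` with auto-commit off:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchCommitter {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> commitFn; // e.g. executeBatch() + commit()
    int commits = 0;

    BatchCommitter(int batchSize, Consumer<List<String>> commitFn) {
        this.batchSize = batchSize;
        this.commitFn = commitFn;
    }

    // Buffer one update; commit automatically when the batch fills up.
    void update(String record) {
        pending.add(record);
        if (pending.size() >= batchSize) flush();
    }

    // Commit whatever is pending (call once at the end for the partial batch).
    void flush() {
        if (pending.isEmpty()) return;
        commitFn.accept(new ArrayList<>(pending));
        pending.clear();
        commits++;
    }
}
```

For 550 updates with a batch size of 100, this performs 6 commits instead of 550, which is the point: each commit is a synchronous round trip the database must fsync.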

how many threads to run in java?

I had this brilliant idea to speed up the time needed to generate 36 files: use 36 threads!! Unfortunately, if I start one connection (one j2ssh connection object) with 36 threads/sessions, everything lags far more than if I execute one thread at a time.
Now, if I instead create 36 new connections (36 j2ssh connection objects) so that each thread has a separate connection to the server, I get an out-of-memory exception (somehow the program still runs and successfully ends its work, but slower than when I execute one thread after another).
So what should I do? How do I find the optimal number of threads to use?
Also, why is Thread.activeCount() 3 before I start my 36 threads?! I'm using a Lenovo laptop with an Intel Core i5.
You could narrow it down to a more reasonable number of threads with an ExecutorService. You probably want something near the number of available processor cores, e.g.:
int threads = Runtime.getRuntime().availableProcessors();
ExecutorService service = Executors.newFixedThreadPool(threads);
for (int i = 0; i < 36; i++) {
    service.execute(new Runnable() {
        public void run() {
            // do what you need per file here
        }
    });
}
service.shutdown();
A good practice is to spawn as many threads as there are cores in your processor. I normally use an Executors.newFixedThreadPool(numOfCores) executor service and keep feeding it jobs from my job queue. Simple. :-)
Your Intel i5 has two cores; hyperthreading makes them look like four. So you only get four cores' worth of parallelization; the rest of your threads are time sliced.
Assume 1 MB of RAM per thread just for thread creation, then add the memory each thread requires to process its file. That will give you an idea of why you're getting out-of-memory errors. How big are the files you're dealing with? You can see that you'll have a problem if they're too large to hold in memory at the same time.
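As a back-of-envelope version of that estimate (the 50 MB per-file figure is an invented assumption, and the actual per-thread stack depends on -Xss):

```java
public class MemoryBudget {
    // threads * (stack per thread + working set per thread), in bytes
    static long neededBytes(int threads, long perThreadStack, long perFileData) {
        return threads * (perThreadStack + perFileData);
    }

    public static void main(String[] args) {
        long needed = neededBytes(36, 1L << 20, 50L << 20); // 36 * (1 MB + 50 MB)
        long heapMax = Runtime.getRuntime().maxMemory();
        System.out.printf("need ~%d MB, heap max ~%d MB%n",
                needed >> 20, heapMax >> 20);
    }
}
```

With those numbers the 36-thread run needs roughly 1.8 GB, which readily explains an OutOfMemoryError on a default-sized heap.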
I'll assume that the server receiving the files can accept multiple connections, so there's value in trying this.
I'd benchmark with 1 thread and then increase them until I found that the performance curve was flattening out.
Brute force: profile incrementally. Increase the number of threads gradually and check the performance. As the number of connections is just 36, it should be easy.
You need to understand that if you create 36 threads you still have only one or two processors, and the system would be switching between threads most of the time. I would say you increase the thread count a little, say to 6, observe the behavior, and go from there.
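That incremental approach can be automated: time the same 36 jobs at increasing pool sizes and stop scaling up when the wall time stops improving. The 50 ms sleep below is a stand-in for the real SSH transfer:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolSizeBenchmark {
    public static void main(String[] args) throws InterruptedException {
        for (int threads = 1; threads <= 16; threads *= 2) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            for (int i = 0; i < 36; i++) {
                pool.execute(() -> {
                    try {
                        Thread.sleep(50); // simulated I/O-bound file transfer
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            System.out.printf("%2d threads: %d ms%n", threads,
                    (System.nanoTime() - start) / 1_000_000);
        }
    }
}
```

For a purely I/O-bound workload like this simulation, the time keeps dropping well past the core count; with real SSH traffic the curve flattens where the link or the server saturates, and that knee is the number to pick.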
One way to tune the number of threads to the size of the machine is to use
int processors = Runtime.getRuntime().availableProcessors();
int threads = processors * N; // N could be 1, 2 or more depending on what you are doing.
ExecutorService es = Executors.newFixedThreadPool(threads);
First you have to find out where the bottleneck is.
If it is the SSH connection, it usually does not help to open multiple connections in parallel. Better to use multiple channels on one connection, if needed.
If it is disk I/O, creating multiple threads writing (or reading) only helps if they access different disks (which is seldom the case). But you could have another thread doing CPU-bound work while one thread waits on its disk I/O.
If it is the CPU, and you have enough idle cores, more threads can help; even more so if they don't need to access common data. But still, more threads than cores (plus some threads doing I/O) does not help. (Also keep in mind that there are usually other processes on your server, too.)
Using more threads than the number of cores on your machine will only slow down the whole process; it speeds things up until you reach that number.
Make sure you don't create more threads than you have processing units, or you are likely to create more overhead from context switching than you gain in concurrency. Also remember that you have only 1 HDD and 1 HDD controller; as a result, I doubt multithreading is going to help you here at all.
