I searched Google for a solution, but I'm still a bit confused about how many threads I should use in my particular case.
I use threads in two places. First, I have a folder with 10 files that I want to parse in parallel (independently of each other). Second, I have a shared data object on which 100 tasks run. Each task reads the data object and writes to a shared structure (a HashMap).
Should I use only as many threads as there are CPU cores? Or should I use a ThreadPoolExecutor with a minimum of 2 threads and a maximum of 999 (so that 100 threads are created)?
Consider using Executors.newCachedThreadPool(). It creates a thread pool that starts as many threads as needed and reuses idle ones.
I can't tell you how many threads will be created for your 100 tasks. If each task takes a long time, 100 threads will be created so that all tasks start in parallel immediately. If the tasks are very short, or if you don't submit them all at the same moment, the first threads will be reused to execute several tasks each.
That said, creating a thread has a cost (CPU and memory), and having many more threads than cores brings no benefit for CPU-bound work. In that case you can cap the number of threads with Executors.newFixedThreadPool(int nThreads).
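A minimal sketch of the fixed-pool variant for the "100 tasks writing to a shared map" case. The class name, the method name runTasks and the id * id computation are hypothetical stand-ins, not anything from the question. Note one assumption that matters: a plain HashMap is not safe for concurrent writes, so the sketch swaps in a ConcurrentHashMap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedMapTasks {

    // Runs taskCount tasks on a fixed pool; each task writes one entry to a shared map.
    static Map<Integer, Integer> runTasks(int taskCount, int nThreads) {
        // ConcurrentHashMap instead of HashMap: several threads write at the same time.
        Map<Integer, Integer> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                // Hypothetical stand-in for "read the data object, compute, store".
                futures.add(pool.submit(() -> { results.put(id, id * id); }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface any task exception
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        Map<Integer, Integer> results = runTasks(100, cores);
        System.out.println(results.size() + " results using a pool of " + cores + " threads");
    }
}
```

The same shape works for the 10 files: submit one task per file to the pool instead of one task per id.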
A widespread practice is to use number of cores x 2 as the thread count.
ThreadPoolExecutor is only a higher-level way to apply multithreading. The substance doesn't change, but it can make thread management easier.
There are no hard rules; it all depends on the type of processing (I/O, sync/async tasks) involved.
For batch processing, I normally start with a thread count equal to the number of CPUs and then find out by trial whether increasing it helps. Depending on the type of tasks, a slightly higher number of threads than cores can be beneficial to performance.
For example, you can start with 1.5 * CPUs threads and compare the performance against 1 * CPUs and 2 * CPUs.
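The trial described above could be sketched roughly like this. The work method is a made-up CPU-bound placeholder and the task count of 200 is arbitrary; substitute the real task before drawing conclusions. Timings from such a crude harness are only indicative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizeTrial {

    // Hypothetical CPU-bound work unit; replace with the real task.
    static long work(int seed) {
        long acc = seed;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Runs taskCount work units on a pool of nThreads and returns elapsed nanoseconds.
    static long timeRun(int nThreads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int seed = i;
                tasks.add(() -> work(seed));
            }
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(tasks)) {
                f.get(); // invokeAll has already waited; get() surfaces task failures
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int[] sizes = { cpus, Math.max(1, (int) Math.round(cpus * 1.5)), cpus * 2 };
        for (int n : sizes) {
            timeRun(n, 200); // warm-up so JIT compilation does not skew the measurement
            System.out.printf("%d threads: %d ms%n", n, timeRun(n, 200) / 1_000_000);
        }
    }
}
```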
Bye
Using Executors is recommended: the pool controls how many threads are created and reuses them, whereas creating a separate thread for each task can result in far too many threads.
I'm using a thread pool to execute tasks that are mostly CPU-bound with a bit of I/O.
The pool size is one larger than the number of CPUs:
Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() + 1)
Assuming a simple program that submits all its tasks to this executor and does little else, I assume that a larger thread pool would slow things down, because the OS would have to timeslice its CPUs more often to give each thread in the pool a chance to run.
Is that correct? And if so, is it a real problem or mostly theoretical, i.e. if I increased the thread pool size to 1000, would I notice a massive difference?
If you have CPU-bound tasks, then as you increase the number of threads you get increasing overhead and slower performance. Note: having more threads than waiting tasks is just a waste of resources, but may not slow the tasks down much.
I would use a multiple (e.g. 1 or 2) of the number of CPUs rather than adding just one, as having one thread too many can have a surprising amount of overhead.
For reference, check this description.
http://codeidol.com/java/java-concurrency/Applying-Thread-Pools/Sizing-Thread-Pools/
In short, what you have (No. CPU + 1) is optimal on average.
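The title in that link matches the "Sizing Thread Pools" section of Java Concurrency in Practice, which, as I recall it, gives the heuristic N_threads = N_cpu * U_cpu * (1 + W/C), where U_cpu is the target CPU utilization (0 to 1) and W/C is the ratio of wait time to compute time. A small sketch of that arithmetic follows; the class and method names are mine, and the wait/compute figures are made-up inputs you would have to measure for your own tasks.

```java
class PoolSizeFormula {

    // N_threads = N_cpu * U_cpu * (1 + W/C), rounded and never below 1.
    // waitTime and computeTime must use the same unit; computeTime must be > 0.
    static int poolSize(int cpus, double targetUtilization, double waitTime, double computeTime) {
        double size = cpus * targetUtilization * (1 + waitTime / computeTime);
        return Math.max(1, (int) Math.round(size));
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Purely CPU-bound (no waiting): the formula gives the number of CPUs.
        System.out.println("CPU-bound: " + poolSize(cpus, 1.0, 0, 10));
        // Hypothetical task that waits as long as it computes: twice the number of CPUs.
        System.out.println("Half waiting: " + poolSize(cpus, 1.0, 10, 10));
    }
}
```

For tasks that never wait, this lands on the number of CPUs, which is close to the CPU + 1 figure above; the extra thread only covers the occasional stall.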
In my Java system, I have X persons, and each person has Y strings, where Y >> X. I need to execute some complex calculations on each string. To speed things up, I process the strings in separate threads (number of threads = CPU cores * 2). My question is: should I also process each person in a separate thread, or is it enough to run only the string processing in separate threads?
Should I process persons in separate threads in addition to the thread-based string processing? Or, because I'm already using the optimal maximum number of threads for the number of CPU cores for the strings, will I gain nothing by putting the persons in separate threads?
All persons are independent of each other.
All of a person's strings are independent of each other.
I think creating additional threads can slow down the processing because of the overhead of creating new threads. But to be sure, run an experiment: try different numbers of threads, then choose the best one.
P.S. Like other people in this thread, I would recommend using a thread pool for this task.
P.P.S. Consider the java.util.concurrent pools created by Executors.newFixedThreadPool (launches up to n threads; if there are more tasks, they wait for a free thread) or Executors.newCachedThreadPool (creates a new thread if no idle thread is available, otherwise reuses an existing idle thread).
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int)
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool()
I am first assuming that the threads are native threads (not green threads), for performance reasons.
There isn't really a performance consideration with passing object references into a thread, other than that the GC must keep skipping those objects while they are still referenced, and that is more efficient than serializing/deserializing the objects into the thread.
Long story short, avoid creating threads beyond what the hardware can run at once if you know that the running threads have a high utilization rate (i.e. they very rarely block on I/O, network, database, etc.). Otherwise you force the CPU to perform thread context switches, which are expensive.
I would likely create a thread pool with a configurable size that processes a queue of person objects.
This allows a thread to access, update, and process an entire person's data without worrying about conflicts with other threads.
If there is I/O within the processing, you might be able to increase the thread pool size, or decrease it if the CPU is over-utilised.
Hope that helps
If processing each string takes on the order of 1µs or more, you should be fine putting each string's processing in its own Runnable and passing that job to a thread pool with as many worker threads as you have logical CPUs. If the strings are processed faster than that, you should batch them so there is less overhead in handling the job queue.
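A rough sketch of the batching idea, assuming the strings sit in a List that is not modified while the pool runs. The compute method (string length) is a hypothetical placeholder for the real calculation, and the batch size is something to tune by measurement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchedStrings {

    // Hypothetical stand-in for the "complex calculation" on one string.
    static int compute(String s) {
        return s.length();
    }

    // Submits one task per batch of batchSize strings (batchSize must be > 0)
    // instead of one task per string, and returns the sum of all results.
    static long processInBatches(List<String> strings, int batchSize, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Callable<Long>> batches = new ArrayList<>();
            for (int from = 0; from < strings.size(); from += batchSize) {
                int to = Math.min(from + batchSize, strings.size());
                // subList is a view; safe here because the list is only read.
                final List<String> batch = strings.subList(from, to);
                batches.add(() -> {
                    long sum = 0;
                    for (String s : batch) {
                        sum += compute(s);
                    }
                    return sum;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(batches)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> strings = Collections.nCopies(10_000, "example");
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(processInBatches(strings, 500, cpus));
    }
}
```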
If the number of threads is increased from nThread to nThread + 1, the speed drops by half.
ExecutorService executor = Executors.newFixedThreadPool(nThread);
If I just set nThread to 1, it doesn't use all my cores. What's going on?
My task doesn't involve reading files or the network. It creates objects and computes. However, it reads data from a Vector.
Can multiple threads reading data from the same Vector decrease performance? If so, how can I fix it?
Vector is an old List implementation that relies on a lock to provide thread safety. If multiple threads access that Vector at the same time, they will suffer from lock contention, and that is probably what you are experiencing now.
If the Vector is only read from, I would replace it with an ArrayList (or an array). No locking is done there, and for a read-only data structure none is needed.
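A sketch of that replacement, assuming the data really is read-only once the workers start. The Vector is copied once into an unmodifiable ArrayList, and each task then reads its own index range without any locking. The class name and the summing workload are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ReadOnlySnapshot {

    // Sums the elements in parallel, reading from a lock-free snapshot of the Vector.
    static long parallelSum(Vector<Integer> source, int nThreads) {
        // Copy once, before any worker starts; the copy is never modified afterwards.
        final List<Integer> data = Collections.unmodifiableList(new ArrayList<>(source));
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            int chunk = (data.size() + nThreads - 1) / nThreads; // ceiling division
            List<Callable<Long>> parts = new ArrayList<>();
            for (int from = 0; from < data.size(); from += chunk) {
                final int lo = from;
                final int hi = Math.min(from + chunk, data.size());
                parts.add(() -> {
                    long sum = 0;
                    for (int i = lo; i < hi; i++) {
                        sum += data.get(i); // ArrayList.get takes no lock
                    }
                    return sum;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(parts)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Vector<Integer> v = new Vector<>();
        for (int i = 1; i <= 1_000_000; i++) {
            v.add(i);
        }
        System.out.println(parallelSum(v, Runtime.getRuntime().availableProcessors()));
    }
}
```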
If there are too many threads, then irrespective of the number of tasks, context switching will slow things down, because the threads in a thread pool executor have the same priority and the CPU has to be shared among them. Also, the more threads there are, the higher the chance that threads end up waiting for a monitor.
Even without synchronization, a large number of threads can heavily affect performance.
In one application I worked on, an XML parsing task that took 100 ms went up to 5 seconds when the number of threads was increased from 10 to 50.
Configuring a thread pool is a matter of learning and experimenting. It does depend on the number of CPU cores: more cores allow more parallel processing.
I'm trying to use a thread pool in Java, but the number of threads is unknown, so I'm trying to find a solution. Two questions came up:
1. I've been looking for a way to grow the size of a thread pool for some time, but I haven't come up with anything yet. Any suggestions? Some say Executors.newCachedThreadPool() should work, but the method's documentation says it's for short-lived threads.
2. What if I set the size of the thread pool to a big number like 50 or 100? Would that work fine?
You can also use Executors.newCachedThreadPool for longer-lived tasks, but if you have long-running tasks that are added constantly and more frequently than existing tasks complete, the number of threads will spin out of control. In such a case it might be a better idea to use a (larger) fixed-size thread pool and let further tasks wait in the queue for a free thread.
This will only mean you'll (probably) have lots of live threads that are sleeping (idle) most of the time. Basically, the things to consider are:
How many threads your system can handle (i.e. how many threads can be created in total; on Windows machines this can be less than 1000, while on Linux you can get tens of thousands of threads, and even more with some tweaking of the system configuration).
Each thread consumes at least its stack size in memory (on Linux this can be something like 1-8 MB per thread by default; again, it can be tweaked with ulimit and the JVM's -Xss parameter).
At least with NPTL, there should be minimal or almost zero context-switching penalty for sleeping threads, so excess threads aren't "heavy" in terms of CPU usage.
That being said, it'd probably be best to use ThreadPoolExecutor's constructors directly to get the kind of pooling you want.
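A sketch of what using the constructor directly might look like. The sizes (2 core threads, 8 max, queue of 16) and the helper names are arbitrary assumptions for illustration. One documented behaviour worth knowing: ThreadPoolExecutor only adds threads beyond the core size when the queue is full, so with an unbounded queue the pool never grows past its core size. A bounded queue plus a rejection policy is one way to get a pool that actually grows and still has an upper limit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPool {

    // Builds a pool that grows from core to max threads only once the bounded queue is full.
    static ThreadPoolExecutor newBoundedPool(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(
                core, max,
                60L, TimeUnit.SECONDS, // idle threads above core are retired after 60 s
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                // When pool and queue are both full, the submitting thread runs the task itself.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits taskCount trivial tasks, shuts the pool down, waits, and returns how many ran.
    static int runCounted(ThreadPoolExecutor pool, int taskCount) {
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBoundedPool(2, 8, 16);
        int ran = runCounted(pool, 100);
        System.out.println(ran + " tasks ran, largest pool size " + pool.getLargestPoolSize());
    }
}
```

CallerRunsPolicy is only one choice; it slows the submitter down instead of dropping or rejecting work, which may or may not suit your case.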
Executors.newCachedThreadPool() lets you create threads on demand. I think you can start with it. I cannot see where it's stated that it's for short-lived threads, but I bet the reason is that, since available threads are reused, short tasks keep the number of simultaneously active threads quite low.
Unless you've got too many threads running (you can check with JVisualVM or JConsole), I would suggest sticking with that solution, especially because the expected number of threads is undefined. Then analyze the VM and tune your pool accordingly.
For question 2: were you referring to something like Executors.newFixedThreadPool(int)? If so, remember that submitting more tasks than the number of threads you defined when creating the pool will make the extra tasks wait, whereas newCachedThreadPool creates new threads dynamically.