Is a single Java thread better than multiple threading in my scenario? - java

Our company is running a Java application (on a single CPU Windows server) to read data from a TCP/IP socket and check for specific criteria (using regular expressions) and if a match is found, then store the data in a MySQL database. The data is huge and is read at a rate of 800 records/second and about 70% of the records will be matching records, so there is a lot of database writes involved. The program is using a LinkedBlockingQueue to handle the data. The producer class just reads the record and puts it into the queue, and a consumer class removes from the queue and does the processing.
So the question is: will it help if I use multiple consumer threads instead of a single thread? Is threading really helpful in the above scenario (since I am using single CPU)? I am looking for suggestions on how to speed up (without changing hardware).
Any suggestions would be really appreciated. Thanks

Simple: Try it and see.
This is one of those questions where you can argue several points on either side. But it sounds like you already have most of the infrastructure set up. Just create another consumer thread and see if that helps.
But the first question you need to ask yourself:
What is better?
How do you measure better?
Answer those two questions then try it.

Can the single thread keep up with the incoming data? Can the database keep up with the outgoing data?
In other words, where is the bottleneck? If you need to go multithreaded then look into the Executor concept in the concurrent utilities (There are plenty to choose from in the Executors helper class), as this will handle all the tedious details with threading that you are not particularly interested in doing yourself.
My personal gut feeling is that the bottleneck is the database. There, indexing and RAM help a lot, but that is a different question.

It is very likely multi-threading will help, but it is easy to test. Make it a configurable parameter. Find out how many you can do per second with 1 thread, 2 threads, 4 threads, 8 threads, etc.
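A rough sketch of such a measurement, for what it is worth. The class name, the `processRecord` stand-in (a 1 ms sleep that simulates waiting on the database) and the record count are all made up for illustration; plug in the real consumer logic before trusting any numbers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConsumerCountBenchmark {

    // Hypothetical stand-in for "check the record and write it to the database":
    // a short sleep simulates time spent waiting on I/O.
    static void processRecord(int record) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Processes `records` records on `threads` worker threads; returns elapsed milliseconds.
    static long runOnce(int threads, int records) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(records);
        long start = System.nanoTime();
        for (int i = 0; i < records; i++) {
            final int record = i;
            pool.execute(() -> {
                try {
                    processRecord(record);
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(); // wait until every record has been processed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        int records = 400;
        for (int threads : new int[] {1, 2, 4, 8}) {
            long ms = runOnce(threads, records);
            System.out.printf("%d thread(s): %d records in %d ms%n", threads, records, ms);
        }
    }
}
```

Run it a few times and compare records per second for each thread count; the absolute numbers only mean something once the stand-in is replaced by the real work.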

First of all:
It is wise to build your application on the Java 5 concurrency API (java.util.concurrent).
If your application is created around the ExecutorService it is fairly easy to change the number of threads used. For example: you could create a threadpool where the number of threads is specified by configuration. So if ever you want to change the number of threads, you only have to change some properties.
About your question:
- About reading from your socket: as far as I know, it is not useful (if possible at all) to have two threads read data from one socket. Just use one thread that reads the socket, and keep the work done in that thread to a minimum (for example: read socket, put data in queue, read socket, and so on).
- About the consuming of the queue: It is wise to construct this part as pointed out above, that way it is easy to change number of consuming threads.
- Note: you cannot really predict what is better; there might be another part that is the bottleneck, and so on. Only monitoring and profiling give you a real view of your situation. But if your application is constructed as above, it is really easy to test with different numbers of threads.
So in short:
- Producer part: one thread that only reads from socket and puts in queue
- Consumer part: created around the ExecutorService so it is easy to adapt the number of consuming threads
Then use profiling to identify the bottlenecks, and use A/B testing to determine the optimal number of consumer threads for your system.
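The layout summarized above could look roughly like this. It is only a sketch: the class name, the `consumer.threads` system property and the end-of-stream marker are invented for the example, and the calling thread plays the role of the socket reader.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConfigurableConsumers {
    // Unique end-of-stream marker, compared by identity (hypothetical convention).
    private static final String POISON = new String("POISON");

    // Feeds `records` through a queue to `consumerThreads` workers; returns how many were handled.
    static int run(int consumerThreads, Iterable<String> records) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger handled = new AtomicInteger();
        ExecutorService consumers = Executors.newFixedThreadPool(consumerThreads);
        for (int i = 0; i < consumerThreads; i++) {
            consumers.execute(() -> {
                try {
                    for (String r = queue.take(); r != POISON; r = queue.take()) {
                        handled.incrementAndGet(); // real code: regex check + database write
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Producer part: here the calling thread stands in for the socket reader.
        for (String r : records) {
            queue.add(r);
        }
        for (int i = 0; i < consumerThreads; i++) {
            queue.add(POISON); // one marker per consumer so each one stops
        }
        consumers.shutdown();
        try {
            consumers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        // Thread count comes from configuration, e.g. -Dconsumer.threads=4 (default 2).
        int threads = Integer.getInteger("consumer.threads", 2);
        System.out.println(run(threads, Arrays.asList("rec1", "rec2", "rec3")));
    }
}
```

Changing the number of consumers is then a matter of changing one property, which makes the A/B comparison cheap.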

As an update on my earlier question:
We did run some comparison tests between a single consumer thread and multiple threads (adding 5, 10, 15 and so on) while monitoring the queue size of yet-to-be-processed records. The difference was minimal, and what's more, the queue size got slightly bigger once the number of threads crossed 25 (as compared to running 5 threads). This leads me to the conclusion that the overhead of maintaining the threads was greater than the processing benefit gained. Maybe this is particular to our scenario, but I am just mentioning my observations.
And of course (as pointed out by others) the bottleneck is the database. That was handled by using the multiple-insert statement in MySQL instead of single inserts. If we did not have that to start with, we could not have handled this load.
End result: I am still not convinced on how multi-threading will give benefit on processing time. Maybe it has other benefits... but I am looking only from a processing-time factor. If any of you have experience to the contrary, do let us hear about it.
And again thanks for all your input.
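For readers wondering what the multiple-insert approach from the update might look like: the sketch below assumes it refers to MySQL's multi-row syntax, INSERT ... VALUES (...), (...). The table name `records` and column `payload` are hypothetical, and `insertAll` needs a real JDBC connection to do anything.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class MultiRowInsert {

    // Builds e.g. "INSERT INTO records (payload) VALUES (?), (?), (?)" for rowCount = 3.
    static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO records (payload) VALUES ");
        for (int i = 0; i < rowCount; i++) {
            sql.append(i == 0 ? "(?)" : ", (?)");
        }
        return sql.toString();
    }

    // Writes all payloads with a single statement instead of one INSERT per record.
    static int insertAll(Connection conn, List<String> payloads) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildSql(payloads.size()))) {
            for (int i = 0; i < payloads.size(); i++) {
                ps.setString(i + 1, payloads.get(i)); // JDBC parameters are 1-based
            }
            return ps.executeUpdate();
        }
    }
}
```

The consumer would collect matching records into a list and call `insertAll` once per batch, so the per-statement round trip is paid once per batch rather than once per record.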

In your scenario, where a) the processing is minimal, b) there is only one CPU, and c) data goes straight into the database, it is not very likely that adding more threads will help. In other words, the front-end and back-end threads are I/O bound, with minimal processing in the middle. That's why you don't see much improvement.
What you can do is try three stages: the 1st is a single thread pulling data from the socket, the 2nd is a thread pool that does the processing, and the 3rd is a single thread that serves the DB output. This may produce better CPU utilization if the input rate varies, at the expense of temporary growth of the output queue. If not, the throughput will be limited by how fast you can write to the database, no matter how many threads you have, and then you can get away with just a single read-process-write thread.
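A minimal sketch of the three stages, under some assumptions: the calling thread stands in for the socket reader, the match rule `ERROR` is invented, and the writer only collects batches where real code would issue one database write per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

class ThreeStagePipeline {
    private static final String END = new String("END"); // unique marker, compared by identity
    private static final Pattern CRITERIA = Pattern.compile("ERROR"); // hypothetical match rule

    // Returns the batches the writer stage would have sent to the database.
    static List<List<String>> run(List<String> input, int workers, int maxBatch) {
        BlockingQueue<String> in = new LinkedBlockingQueue<>();
        BlockingQueue<String> out = new LinkedBlockingQueue<>();
        List<List<String>> written = new ArrayList<>();

        // Stage 2: a pool of workers that filter records.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    for (String r = in.take(); r != END; r = in.take()) {
                        if (CRITERIA.matcher(r).find()) out.put(r);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Stage 3: a single writer that groups whatever is queued into batches.
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    List<String> batch = new ArrayList<>();
                    batch.add(out.take());
                    out.drainTo(batch, maxBatch - 1);
                    boolean last = batch.get(batch.size() - 1) == END;
                    if (last) batch.remove(batch.size() - 1);
                    if (!batch.isEmpty()) written.add(batch); // real code: one DB write per batch
                    if (last) return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        try {
            // Stage 1: the calling thread stands in for the socket reader.
            for (String record : input) in.put(record);
            for (int i = 0; i < workers; i++) in.put(END);
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            out.put(END); // all workers are done, so this is the last item
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written;
    }
}
```

When the input bursts, the output queue grows and the writer naturally picks up larger batches; when the database is the limit, adding workers in stage 2 changes nothing, which matches the observation in the update.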

Related

Define Parallel Processing Thread Pool Count and Sleep time

I need to update 550,000 records in a table while the JBoss server is starting up. I need to run this update as a background process with multiple threads and parallel processing. The application uses Spring, so I can use an initializing bean for this.
To perform the parallel processing I am planning to use the Java executor framework.
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(50);
How do I decide the thread pool count?
I think this depends on the hardware. My hardware has 16 GB of RAM and a Core i3 processor.
Is it good practice to call Thread.sleep(20); while processing this big update in the background?
I don't know much about Spring processing specifically, but your questions seem general enough that I can still provide a possibly inadequate answer.
Generally there's a lot of factors that go into how many threads you want. You definitely don't want multiple threads on a core, as that'll slow things down as threads start contending for CPU time instead of working, so probably your core count would be your ceiling, or maybe core count - 1 to allow one core for all other tasks to run on (so in your case maybe 3 or 4 cores, tops, if I remember core counts for i3 processors right). However, in this case I'd guess you're more likely to run into I/O and/or memory/cache bottlenecks, since when those are involved, those are more likely to slow down your program than insufficient parallelization. In addition, the tasks that your threads are doing would affect the number of threads you can use; if you have one thread to pull data in and one thread to dump data back out after processing, it might be possible for those threads to share a core.
I'm not sure why this would be a good idea... What use do you see for Thread.sleep() while processing? I'd guess it'd actually slow down your processing, because all you're doing is putting threads to sleep when they could be working.
In any case, I'd be wary of parallelizing what is likely to be an I/O bound task. You'll definitely need to profile to see where your bottlenecks are, even before you start parallelizing, to make sure that multiple cores will actually help you.
If it is the CPU that is adding extra time to complete your task, then you can start parallelizing. Even then, be careful about cache issues; try to make sure each thread works on a totally separate chunk of data (e.g. through ThreadLocal) so cache/memory issues don't limit any performance increases. One way this could work is by having a reader thread dump data into a Queue which the worker threads can then read into a ThreadLocal structure, process, etc.
I hope this helped. I'll keep updating as the mistakes I certainly made are pointed out.
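To make the sizing advice a bit more concrete, here is a sketch that starts from the core count and splits the work into id ranges. Everything specific in it is an assumption: the chunk size of 1,000, the contiguous id range, and `updateRange`, which is only a placeholder for one batched UPDATE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedUpdate {

    // Splits [0, total) into consecutive [from, to) ranges of at most chunkSize ids.
    static List<int[]> chunks(int total, int chunkSize) {
        List<int[]> ranges = new ArrayList<>();
        for (int from = 0; from < total; from += chunkSize) {
            ranges.add(new int[] {from, Math.min(from + chunkSize, total)});
        }
        return ranges;
    }

    // Hypothetical placeholder for one batched UPDATE over ids in [from, to).
    static int updateRange(int from, int to) {
        return to - from; // pretend every row in the range was updated
    }

    // Runs all chunks on a fixed pool and returns the total number of rows touched.
    static int updateAll(int total, int chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int[] r : chunks(total, chunkSize)) {
                Callable<Integer> task = () -> updateRange(r[0], r[1]);
                results.add(pool.submit(task));
            }
            int updated = 0;
            for (Future<Integer> f : results) {
                updated += f.get();
            }
            return updated;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Starting point only: profile, then adjust up or down.
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        System.out.println(updateAll(550_000, 1_000, threads) + " rows updated");
    }
}
```

If profiling shows the database is the limit, lowering the thread count is usually a better throttle than sprinkling Thread.sleep() through the workers.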

Simple Multi-Threading in Java

Currently, I'm running on a thread-less model that isn't working simply because I'm running out of memory before I can process the data I'm being handed. I've made all the changes that I can to optimize the code, and it's still just not quite quick enough.
Clearly I should move on to a threaded model. I'm wondering what the simplest, easiest way to do the following is:
The main thread passes some info to the worker
That worker performs some work that I'll refactor out of the main method
The workers will disappear and new ones will be instantiated when needed
I've never worked with java threading and from what I've read up on it seems pretty complicated, even if what I'm looking for seems pretty simple.
If you have multiple independent units of work of equal priority, the best solution is generally some sort of work queue, where a limited number of threads (the number chosen to optimize performance) sit in a while(true) loop dequeuing work units from the queue and executing them.
Generally the optimum number of threads is going to be the number of processors +/- 1, though in some cases a larger number will be optimal if the threads tend to get stalled by disk I/O requests or some such.
But keep in mind that tuning the entire system may be required. Eg, you may need more disk arms, and certainly more RAM may be required.
I'd start by having a read through Java Concurrency as refresher ;)
In particular, I would spend some time getting to know the Executors API, as it will do most of what you've described without a lot of the overhead of dealing with too many locks ;)
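As a small illustration of the Executors API for the hand-off described in the question (main thread passes info to workers, workers do the work, results come back), something along these lines should do. The class name and the "work" (taking a string's length) are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkerHandoff {

    // Hands each input to a pooled worker and collects the results in input order.
    static List<Integer> lengths(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String s : inputs) {
                tasks.add(() -> s.length()); // placeholder for the refactored work
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the task order.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bb", "ccc"), 2));
    }
}
```

The pool reuses a fixed set of threads, so there is no need to create and discard a worker per unit of work.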
Distributing the memory consumption to multiple threads will not change overall memory consumption. From what I read out of your question, I would like to step forward and tell you: Increase the heap of the Java engine, this will help. Looks like you have to optimize the Java startup parameters and not your code. If I am wrong, then you will have to buffer the data. To Disk! Not to a thread in the same memory model.

Thread Pool vs Many Individual Threads

I'm in the middle of a problem where I am unable decide which solution to take.
The problem is a bit unique. Let's put it this way: I am receiving data from the network continuously (2 to 4 times per second). Each piece of data belongs to a different, let's say, group.
Now, let's call these groups group1, group2 and so on.
Each group has a dedicated job queue where data from the network is filtered and added to its corresponding group for processing.
At first I created a dedicated thread per group, which would take data from the job queue, process it and then go into a blocking state (using a LinkedBlockingQueue).
But my senior suggested that I should use thread pools, because this way threads won't get blocked and will be usable by other groups for processing.
But here is the thing: the data I'm getting arrives fast enough, and the time a thread takes to process it is long enough, that the thread possibly never goes into blocking mode. A dedicated thread also guarantees that data gets processed sequentially (job 1 gets done before job 2), which, with pooling, there is a small chance might not happen.
My senior is also bent on the fact that pooling will save us lots of memory because threads are POOLED (I'm thinking he really went for the word ;) ). I don't agree with this because, as I see it, pooled or not, each thread gets its own stack memory. Unless there is something in thread pools which I am not aware of.
One last thing: I always thought that pooling helps where jobs appear in large numbers for a short time. This makes sense because thread spawning would be a performance killer when the time taken to initialize a thread is a lot more than the time spent doing the job. So pooling helps a lot there.
But in my case group1, group2, ..., groupN always remain alive. So whether there is data or not, they will still be there. So thread spawning is not the issue here.
My senior is not convinced and wants me to go with the pooling solution because its memory footprint is great.
So, which path to take?
Thank you.
Good question.
Pooling indeed saves you initialization time, as you said. But it has another aspect: resource management. And here I am asking you this: just how many groups (read: dedicated threads) do you have?
Do they grow dynamically during the execution span of the application?
For example, consider a situation where the answer to this question is yes: new group types are added dynamically. In this case, you might not want to dedicate a thread to each one, since there is technically no restriction on the number of groups that will be created; you will create a lot of threads and the system will be context switching instead of doing real work.
Thread pooling to the rescue: a thread pool allows you to specify a limit on the maximal number of threads that can possibly be created, regardless of load. So the application may deny service to certain requests, but the ones that get through are handled properly, without critically depleting system resources.
Considering the above, it is very possible that in your case it is perfectly OK to have a dedicated thread for each group!
The same goes for your senior's conviction that it will save memory. Indeed, a thread takes up memory on the heap, but is it really so much if it is a predefined number, say 5? Even 10 is probably OK. Anyway, you should not use pooling unless you are a priori and absolutely convinced that you actually have a problem!
Pooling is a design decision, not an architectural one. You can skip pooling at the beginning and proceed with optimizations if you find pooling to be beneficial after you have encountered a performance issue.
As for the serialization of requests (in-order execution), it does not matter whether you are using a thread pool or a dedicated thread. Sequential execution is a property of the queue coupled with a single handler thread.
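To illustrate the last point, here is one way to get "a queue coupled with a single handler thread" per group without managing threads by hand: a single-thread executor per group. The class and method names are made up for the sketch, and it still uses one thread per group, so it only makes sense when the number of groups is bounded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerGroupOrdering {
    // One single-thread executor (its own queue plus one handler thread) per group.
    private final Map<String, ExecutorService> lanes = new ConcurrentHashMap<>();

    // Jobs submitted for the same group run one at a time, in submission order.
    Future<?> submit(String group, Runnable job) {
        return lanes.computeIfAbsent(group, g -> Executors.newSingleThreadExecutor()).submit(job);
    }

    void shutdown() {
        for (ExecutorService lane : lanes.values()) {
            lane.shutdown();
        }
    }

    // Submits `jobs` numbered jobs to one group and returns the order in which they ran.
    static List<Integer> demo(int jobs) {
        PerGroupOrdering groups = new PerGroupOrdering();
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < jobs; i++) {
            final int job = i;
            pending.add(groups.submit("group1", () -> seen.add(job)));
        }
        try {
            for (Future<?> f : pending) {
                f.get(); // wait for completion
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            groups.shutdown();
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(demo(5));
    }
}
```

Swapping this for a shared pool later only changes what `submit` does internally, which keeps the decision reversible.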
Creating a thread will consume resources, including the default stack per thread (IIRC 512 KB, but configurable). So the advantage to pooling is that you incur a limited resource hit. Of course you need to size your pool according to the work that you have to perform.
For your particular problem, I think the key is to actually measure performance, thread usage etc. in each scenario. Unless you're running into constraints, I perhaps wouldn't worry either way, other than to make sure that you can swap one implementation for another without a major impact on your application. Remember that premature optimisation is the root of all evil. Note that:
"Premature optimization" is a phrase used to describe a situation
where a programmer lets performance considerations affect the design
of a piece of code. This can result in a design that is not as clean
as it could have been or code that is incorrect, because the code is
complicated by the optimization and the programmer is distracted by
optimizing.

Is concurrent programming more grided or clustered?

I'm trying to wrap my brain around parallel/concurrent programming (in Java) and am getting hung up on some fundamentals that don't seem to be covered in any of the tutorials I've been reading.
When we talk about "multi-threading", or "parallel/concurrent programming", does that mean we're taking a big problem and spreading it over many threads, or are we first explicitly decomposing it into smaller sub-problems, and passing each sub-problem to its own thread?
For example, let's say we have EndWorldHungerTask implements Runnable, and task accomplishes some enormous problem. In order to complete its objective, it has to do some really heavy lifting, say, a hundred million times:
public class EndWorldHungerTask implements Runnable {
    public void run() {
        for (int i = 0; i < 100000000; i++)
            someReallyExpensiveOperation();
    }
}
In order to make this "concurrent" or "multi-threaded", would we pass this EndWorldHungerTask to, say, 100 worker threads (where each of the 100 workers are told by the JVM when to be active and work on the next iteration/someReallyExpensiveOperation() call), or would we refactor it manually/explicitly so that each of the 100 workers is iterating over different parts of the loop/work-to-be-done? In both cases, each of the 100 workers is only iterating a million times.
But, under the first paradigm, Java is telling each Thread when to execute. Under the second, the developer needs to manually (in the code) partition the problem ahead of time, and assign each sub-problem to a new Thread.
I guess I'm asking how its "normally done" in Java land. And, not just for this problem, but in general.
I guess I'm asking how its "normally done" in Java land. And, not just for this problem, but in general.
This is highly dependent on the task at hand.
The standard paradigm in Java is that you have to split the work into chunks yourself. Distributing those chunks across multiple threads/cores is a separate problem, and there exist a variety of patterns for that (queues, thread pools, etc).
It is interesting to note that there exist frameworks that can automatically make use of multiple cores to execute things like for loops in parallel (for example, OpenMP). However, I am not aware of any such frameworks for Java.
Finally, it could be the case that the low-level library that does the bulk of the work can make use of multiple cores. In such a case the higher-level code may be able to remain single-threaded and still benefit from multicore hardware. One example might be numerical code using MKL under the covers.
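A sketch of "split the work into chunks yourself" for the loop in the question. The `expensive` function is a trivial stand-in for someReallyExpensiveOperation(), and the results are summed only so the two variants can be compared. For what it is worth, Java 8 and later also ship parallel streams in java.util.stream, which do the partitioning for you in simple cases; the second method shows that alternative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.LongStream;

class SplitLoop {

    // Hypothetical stand-in for someReallyExpensiveOperation().
    static long expensive(long i) {
        return i % 7;
    }

    // Manual partitioning: each worker iterates over its own slice of [0, n).
    static long manual(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long chunk = (n + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                long from = w * chunk;
                long to = Math.min(n, from + chunk);
                Callable<Long> part = () -> {
                    long sum = 0;
                    for (long i = from; i < to; i++) {
                        sum += expensive(i);
                    }
                    return sum;
                };
                parts.add(pool.submit(part));
            }
            long total = 0;
            for (Future<Long> f : parts) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Same computation, letting a parallel stream do the partitioning (Java 8+).
    static long withParallelStream(long n) {
        return LongStream.range(0, n).parallel().map(SplitLoop::expensive).sum();
    }

    public static void main(String[] args) {
        System.out.println(manual(1_000_000, 4) == withParallelStream(1_000_000));
    }
}
```

Either way, the speedup depends on the work being CPU-bound and on having more than one core to run it on.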
When we talk about "multi-threading", or "parallel/concurrent programming", does that mean we're taking a big problem and spreading it over many threads, or are we first explicitly decomposing it into smaller sub-problems, and passing each sub-problem to its own thread?
I think this depends highly on the problem. There are times where you have the same task that you call thousands or millions of times using the same code. This is the ExecutorService.submit() type of pattern. You have millions of lines from a file and you are running some processing methods on each line. I guess this is your "spreading it over many threads" type of problem. This works for simple thread models.
But there are other cases where the problem space is made up of a large number of non-homogenous tasks. Sometimes you might spawn a single thread to handle some background keep-alive, and other times a thread pool here and there to process some queue of work. Typically the larger the scope of the problem, the more complicated the concurrency model and the more different types of pools and threads are used. I guess this is your "decomposing it into smaller sub-problems" type.
In order to make this "concurrent" or "multi-threaded", would we pass this EndWorldHungerTask to, say, 100 worker threads (where each of the 100 workers are told by the JVM when to be active and work on the next iteration/someReallyExpensiveOperation() call), or would we refactor it manually/explicitly so that each of the 100 workers is iterating over different parts of the loop/work-to-be-done? In both cases, each of the 100 workers is only iterating a million times.
In your case, I don't see how you can solve world hunger (to use your analogy) with one set of thread code. I think that you have to "decompose it into smaller sub-problems" which corresponds to the latter case that I explain above: a whole series of threads running different code. Some of the sub-solutions can be done in thread-pools and some will be done with individual threads, each running separate code.
I guess I'm asking how its "normally done" in Java land. And, not just for this problem, but in general.
"Normally" depends highly on the problem and its complexity. In my experience, I normally use the ExecutorService constructs as much as possible. But with any decent sized problem you will find yourself with a number of different thread-pools, Spring timer threads, custom one-off thread tasks, producer/consumer models, etc., etc..
Normally you would want each thread to execute one task from start to finish; you would gain nothing from leaving the task half done, then halting execution on that thread and "calling" another thread to finish the job. Java of course offers tools for this kind of thread synchronization, but they are really used when a task depends on another task completing, not so that another thread may complete the task.
Most of the time you will have a big problem that consists of several tasks. If these tasks can be executed concurrently, then it makes sense to spawn threads to execute them. There is an overhead associated with creating threads, so if all the tasks are sequential and each must wait for the previous one to finish, then it would not be beneficial at all to spawn multiple threads; use just one thread so you don't block the main thread.
"multi-threading" <> "parallel/concurrent programming".
Multithreaded apps are often written to take advantage of the high I/O performance of a preemptive multitasker. An example might be a web crawler/downloader. A multithreaded crawler would typically outperform a single-threaded version by a huge factor, even when running on a box with only one CPU core. The actions of a DNS query to get a site address, connecting to the site, downloading a page, writing it to a disk file are all operations that require little CPU but a lot of IO waiting. So, a lot of these unavoidable waits can be performed in parallel by many threads. When a DNS query comes in, an HTTP client connects or a disk operation is complete, the thread that requested it is made ready/running and can move on to the next operation.
The vast majority of apps are, primarily, written as multithreaded for this reason. That's why the box I'm writing this on has 98 processes, (of which 94 have more than one thread), 1360 threads and 3% CPU use - it's got little to do with splitting CPU work up across cores - it's mostly about IO performance.
Parallel/concurrent programming can actually take place with multiple CPU cores. For those apps that have CPU-intensive work that can be decomposed into largish packages for distribution across cores, a speedup factor approaching the number of cores is possible with care.
Naturally there is some bleedover: the I/O-bound web crawler will tend to perform better on a box with more cores, if only because the interrupt/driver overhead has a smaller impact on overall performance, but it won't be better by much.
It doesn't matter how many workers you have available for the EndWorldHunger Task if they are all waiting for the crops to grow.

Java threading objects

I've created an array of 1000 objects, each of which runs in its own thread, so that means 1000 threads are added. Each object holds a socket and 9 more global variables. The whole object consists of 1000 lines of code.
I'm looking for ways to make the program efficient because it lags. CPU use is at 100% every time I start the program.
I understand that I'm going to have to change the way the program works, but I can't find a good way. Can anyone explain how to achieve this?
It depends on what your threads actually do - are the tasks primarily using CPU or other resources? For CPU intensive tasks, the best strategy is to run as many threads as you have cores, or a few more. For threads which are blocking a lot on e.g. reading files, waiting for the net etc. you can have many more threads than CPUs.
It also depends on how many cores the system has. Obviously the answer is very different for a single processor machine than for a 128-way multiprocessor. The above rules of thumb can give you some estimates, but it is best to make experiments yourself based on these, to figure out the ideal number of threads for your specific setup.
Moreover, since Java5, it is always advisable to use e.g. a ThreadPoolExecutor instead of creating your threads manually. This makes your app both more robust and more flexible.
1/ use thread pool
2/ use futures
You should consider refactoring your usage of threads.
1000 Threads normally makes no sense on a normal machine/server although your problem seems to be I/O-heavy. You should consider the number of cpu-threads that are available.
A possible solution would be to use a dispatcher that passes the handling (and possible responding) to a request on the socket into a queue of a ThreadPoolExecutor.
From my experience, 1000 threads are just too many (at least on 8-core/8 GB RAM machines). A common symptom is context-switch thrashing, where your OS is just busy jumping from thread to thread while doing little useful work (and a lot of memory is wasted etc.).
If you have to maintain 1000 sockets, you probably have to go for NIO. An easier way out would be closing/opening sockets every time (whether you can do this depends on the characteristics of your work).
The way you solve this many-thread problem is to use a thread pool, as others note. Instead of extending Thread, code a Runnable. This is easier said than done, though, because you have to maintain state if you need a conversation. This commonly involves a ConcurrentMap. I personally tend to put a Handler (which implements Runnable) on this map that should run when the counterparty returns a response (the response contains a key every time). In this case you'd be closing the socket every time. If you use NIO, it's more like coding with Threads in the sense that you don't need to identify the counterparty like this, but it has its own complexity.
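A bare-bones sketch of the ConcurrentMap idea, with the socket handling left out entirely. The class and method names (`ResponseDispatcher`, `expect`, `onResponse`) are made up, and the response key is assumed to be a plain String.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ResponseDispatcher {
    // Handlers waiting for a response, keyed by the key the counterparty echoes back.
    private final ConcurrentMap<String, Runnable> pending = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    ResponseDispatcher(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Registers what should run when the response for `key` arrives.
    void expect(String key, Runnable handler) {
        pending.put(key, handler);
    }

    // Called by whatever reads the sockets; returns true if a handler was waiting.
    boolean onResponse(String key) {
        Runnable handler = pending.remove(key); // remove so each handler runs at most once
        if (handler == null) {
            return false;
        }
        pool.execute(handler);
        return true;
    }

    // Stops accepting work and waits briefly for running handlers to finish.
    boolean shutdownAndWait() {
        pool.shutdown();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ResponseDispatcher dispatcher = new ResponseDispatcher(4);
        dispatcher.expect("req-1", () -> System.out.println("got reply for req-1"));
        dispatcher.onResponse("req-1");
        dispatcher.shutdownAndWait();
    }
}
```

With this shape a handful of pooled threads can serve all 1000 conversations, since a thread is only occupied while a handler is actually running.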
