SimpleMessageListener vs DirectMessageListener

I'm trying to understand the difference between DirectMessageListener and SimpleMessageListener. I have this drawing and want to ask whether it is correct.
Let me describe how I understand it, and maybe you can tell me if it is correct.
In front of spring-rabbit there is the rabbit-client Java library, which connects to the RabbitMQ server and delivers messages to the spring-rabbit library. This client has a ThreadPoolExecutor (which in this case has, I think, 16 threads). So it does not matter how many queues there are in RabbitMQ: if there is a single connection, I get 16 threads. These same threads are reused if I use DirectMessageListener, and the handler method listen is executed on all of these 16 threads when messages arrive. So if I do something complex in the handler, rabbit-client must wait for a thread to become free before it can deliver the next message on that thread. Also, if I increase setConsumersPerQueue to, let's say, 20, it will create 20 consumers per queue, but not 20 threads. Will these 20*5 consumers in my case all reuse the 16 threads offered by the ThreadPoolExecutor?
SimpleMessageListener, on the other hand, would have its own threads. If concurrent consumers == 1 (I guess that is the default, as in my case), it has only one thread. Whenever there is a message on any of the secondUseCase* queues, the rabbit-client Java library will use one of its 16 threads to forward the message to the single internal thread that I have in SimpleMessageListener. As soon as the message is forwarded, the rabbit-client thread is freed and can go back to fetching more messages from the RabbitMQ server.

Your understanding is correct.
The main difference is that, with the DMLC (DirectMessageListenerContainer), all listeners in all listener containers are called on the shared thread pool in the amqp-client (you can increase the 16 if needed). You need to ensure the pool is large enough to handle your expected concurrency across all containers, otherwise you will get starvation.
It's more efficient because threads are shared.
With the SMLC (SimpleMessageListenerContainer), you don't have to worry about that, but at the expense of one dedicated thread per concurrent consumer. In that case, a small pool in the amqp-client will generally be sufficient.

Related

how to tune up wildfly managed-executor-service thread pool parameter

I have a question about performance tuning.
I'm using a 64-bit Linux server and Java 1.8 with WildFly 10.0.0.Final. I developed a web service which uses a thread factory and a managed executor service defined in the WildFly configuration.
The purpose of my web service is to receive a request containing a large amount of data, save the data, create a new thread to process it, and then return a response. This way the web service can respond quickly without waiting for the data processing to finish.
The configured managed-executor-service holds a thread pool config specifically for this purpose.
As I understand the configuration, core-thread defines how many threads are kept alive in the thread pool. When all core threads are busy, new requests are put in the queue; when the queue is full, new threads are created, but these newly created threads are terminated after some time.
I'm trying to figure out the best combination of settings for the thread pool. These are my concerns:
If core-thread is set too small (like 5), the response time may be long, because only 5 active threads are processing data and the rest of the work waits in the queue until the queue is full. The response time won't look good under heavy load.
If I set core-thread to be big (like 100, maybe), then even when the system is not busy there will still be 100 live threads in the pool. I don't see any configuration that allows these threads to be terminated. I'm concerned that this is too many idle threads.
Does anyone have suggestions on how to set the parameters to handle both heavy and light load without leaving too many idle threads in the pool? I'm not familiar with this area: how many idle threads is too many, and how do I measure it?
The following is the configuration for the thread factory and the managed-executor-service.
<managed-thread-factory name="UploadThreadFactory" jndi-name="java:jboss/ee/concurrency/factory/uploadThreadFactory"/>
<managed-executor-service name="UploadManagedExecutor" jndi-name="java:jboss/ee/concurrency/executor/uploadManagedExecutor" context-service="default" thread-factory="UploadThreadFactory" hung-task-threshold="60000" core-thread="5" max-thread="100" keep-alive-time="5000" queue-length="500"/>
Thanks a lot for your help,
Helen
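
For reference, the core / queue / max rule described in the question is the same one that java.util.concurrent.ThreadPoolExecutor follows. Below is a minimal sketch of that rule with tiny numbers. It assumes the managed executor behaves like a plain ThreadPoolExecutor, which the question implies but which is not verified here; the class name PoolGrowthDemo and the sizes (core 1, max 2, queue 1) are made up for illustration and are not WildFly settings.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows the order "core threads, then queue, then extra threads, then reject".
class PoolGrowthDemo {

    /** Returns {pool size after task 1, after task 2, after task 3, queued tasks, 1 if task 4 was rejected}. */
    static int[] observe() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        // core=1, max=2, queue capacity=1: small stand-ins for core-thread, max-thread and queue-length.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 5, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        // Every task blocks until released, so the pool state stays stable while we inspect it.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] seen = new int[5];
        try {
            pool.execute(blocker);              // starts the single core thread
            seen[0] = pool.getPoolSize();
            pool.execute(blocker);              // core thread busy: task waits in the queue
            seen[1] = pool.getPoolSize();
            pool.execute(blocker);              // queue full: an extra (non-core) thread is created
            seen[2] = pool.getPoolSize();
            seen[3] = pool.getQueue().size();
            try {
                pool.execute(blocker);          // queue full and max reached: default policy rejects
            } catch (RejectedExecutionException e) {
                seen[4] = 1;
            }
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(observe())); // prints [1, 1, 2, 1, 1]
    }
}
```

Note that the pool only grows beyond the core size once the queue is full, so with a queue of 500 the extra threads would appear late under load.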

Multiple Request Handling in java using concurrency api

I have a server which can handle 1000 threads simultaneously.
To handle requests, I implemented the producer-consumer pattern in my code, similar to a servlet container.
At any one time we can have more than 3000 requests. To handle this scenario, what should the queue size be, and why?
Let's assume we have a queue size of 2000. What should we do if we have 4000 requests? How can we handle this scenario (the easiest way is to discard the extra requests, but we need to handle each and every request)?
I want to generate 20 parallel threads, just like JMeter does. How can I do that using the Java concurrency API?
In the above scenario, what type of thread pool should we use, such as CachedThreadPool or another, and why?

There are two dimensions of bounds to think about: one dimension is how many threads you can use (which is ultimately usually bounded by having enough memory for the corresponding thread stacks -- on a modern JVM the default is a megabyte of stack per thread so 1000 threads is a gigabyte of memory for thread stacks alone). The other dimension is the bounds on your work queue.
For request serving, you probably want a fixed-size thread pool (sized according to Little's Law for your workload) and a queue that can grow as needed. For a request-based server, an unbounded queue probably degrades most gracefully, although you might want to experiment with a bounded queue and a load-shedding RejectedExecutionHandler.
All of these options can be configured in ThreadPoolExecutor which is probably the implementation you want to use.
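
As a rough sketch of the above: Little's Law says the mean number of requests in flight equals the arrival rate times the mean service time, which gives a starting point for the thread count. The pool below uses a bounded queue with CallerRunsPolicy, which is back-pressure rather than load shedding (the submitting thread runs the overflow task itself), chosen here only because the question requires that no request be dropped. The class name BoundedRequestPool, the helper names and all the numbers are assumptions for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: fixed-size pool, bounded queue, no request discarded.
class BoundedRequestPool {

    /** Little's Law: mean requests in flight = arrival rate x mean service time. */
    static int threadsFor(double requestsPerSecond, double meanServiceSeconds) {
        return (int) Math.ceil(requestsPerSecond * meanServiceSeconds);
    }

    /** Fixed-size pool; when the queue is full the caller runs the task instead of dropping it. */
    static ThreadPoolExecutor newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits the given number of trivial requests and returns how many were actually handled. */
    static int handleAll(int requests, int threads, int queueCapacity) throws InterruptedException {
        AtomicInteger handled = new AtomicInteger();
        ThreadPoolExecutor pool = newPool(threads, queueCapacity);
        for (int i = 0; i < requests; i++) {
            pool.execute(handled::incrementAndGet);
        }
        pool.shutdown();                              // already accepted tasks still run
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Example numbers only: 40 requests/s at 0.25 s each suggests about 10 threads.
        System.out.println(threadsFor(40, 0.25));
        System.out.println(handleAll(1000, 2, 5));
    }
}
```

Swapping CallerRunsPolicy for ThreadPoolExecutor.DiscardPolicy or AbortPolicy would turn the same pool into a load-shedding one.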

Based on your requirement to "process each and every request", you should consider using a queue (ActiveMQ is one suggestion, but it can be any queue implementation).
Put every incoming request into the queue and have a thread pool of consumers consume data from it. This way you can guarantee that every request gets processed, and your application can scale without code changes.

Thread pool when serving multiple clients with blocking methods

I am developing a web server in Java that will provide WebSocket communication to its clients. It has been suggested that I use a thread pool when dealing with many clients, because it is a lot more time-efficient than using one thread per client.
My question is simply: will Java's ExecutorService, created with newFixedThreadPool, be able to handle a queue of runnable tasks that call thread-blocking methods?
In other words, I guess I am wondering whether this thread pool is asynchronous.
The reason I am asking is that I have tried using a newFixedThreadPool with, let's say, 2 threads. When I then connect 3 clients to the server, I can only receive commands from the first two. But I could be doing something wrong, which is why I am asking.
The runnable tasks also run in an infinite while loop (which only ends when the client disconnects).

Well, it depends on your implementation. The easiest case is having clients keep their thread active until they disconnect (or get kicked out because of a timeout). In this case, your thread pool isn't very efficient: it'll only re-use disconnected users' threads instead of creating new ones (which is good, but not really relevant).
The second case would be activating the threads only when needed (let's say when a client sends or receives a message). In this case, you need to remember the client on the server side (keeping an id, for example), in order to be able to sever the thread's connection when it isn't needed and re-establish it when it is. To do that, you must keep the sockets somewhere, but unbound to any specific thread.
I haven't actually coded that myself, but I don't see why it wouldn't work, as this is the mechanism used for websites (i.e. the HTTP protocol).
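
To make the first case concrete, here is a small sketch of what the question describes: a fixed pool of 2 threads and 3 clients whose tasks never return until disconnect. The third task is accepted but just sits in the pool's queue. The class name FixedPoolBlockingDemo and the latch standing in for "client disconnects" are made up for illustration; no real sockets are involved.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: each "client" task blocks its pool thread until disconnect.
class FixedPoolBlockingDemo {

    /** Returns {clients actually being served, clients still waiting in the queue}. */
    static int[] connect(int poolThreads, int clients) throws InterruptedException {
        // In the JDK, newFixedThreadPool returns a ThreadPoolExecutor; the cast is only for inspecting the queue.
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(poolThreads);
        CountDownLatch disconnect = new CountDownLatch(1);
        CountDownLatch serving = new CountDownLatch(Math.min(poolThreads, clients));
        AtomicInteger started = new AtomicInteger();
        for (int i = 0; i < clients; i++) {
            pool.execute(() -> {
                started.incrementAndGet();
                serving.countDown();
                try {
                    disconnect.await();         // stands in for the endless per-client read loop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        serving.await();                        // wait until every pool thread is occupied
        int[] result = {started.get(), pool.getQueue().size()};
        disconnect.countDown();                 // let all clients "disconnect" so the demo can exit
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(connect(2, 3))); // prints [2, 1]
    }
}
```

So the pool itself is asynchronous with respect to the submitter, but a task that blocks forever holds its thread forever.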

lmax disruptor is too slow in multi-producer mode compared to single-producer mode

Previously, when I used the single-producer mode of the Disruptor, e.g.
new Disruptor<ValueEvent>(ValueEvent.EVENT_FACTORY,
2048, moranContext.getThreadPoolExecutor(), ProducerType.SINGLE,
new BlockingWaitStrategy())
the performance was good. Now I am in a situation where multiple threads write to a single ring buffer. What I found is that ProducerType.MULTI makes the code several times slower than single-producer mode. That performance is not acceptable to me. So should I use single-producer mode and have multiple threads invoke the same event publish method under a lock? Is that OK? Thanks.

I'm somewhat new to the Disruptor, but after extensive testing and experimenting, I can say that ProducerType.MULTI is more accurate and faster for 2 or more producer threads.
With 14 producer threads on a MacBook, ProducerType.SINGLE shows more events published than consumed, even though my test code is waiting for all producers to end (which they do after a 10s run), and then waiting for the disruptor to end. Not very accurate: Where do those additional published events go?
Driver start: PID=38619 Processors=8 RingBufferSize=1024 Connections=Reuse Publishers=14[SINGLE] Handlers=1[BLOCK] HandlerType=EventHandler<Event>
Done: elpased=10s eventsPublished=6956894 eventsProcessed=4954645
Stats: events/sec=494883.36 sec/event=0.0000 CPU=82.4%
Using ProducerType.MULTI, fewer events are published than with SINGLE, but more events are actually consumed in the same 10 seconds than with SINGLE. And with MULTI, all of the published events are consumed, just what I would expect due to the careful way the driver shuts itself down after the elapsed time expires:
Driver start: PID=38625 Processors=8 RingBufferSize=1024 Connections=Reuse Publishers=14[MULTI] Handlers=1[BLOCK] HandlerType=EventHandler<Event>
Done: elpased=10s eventsPublished=6397109 eventsProcessed=6397109
Stats: events/sec=638906.33 sec/event=0.0000 CPU=30.1%
Again: 2 or more producers: Use ProducerType.MULTI.
By the way, each Producer publishes directly to the ring buffer by getting the next slot, updating the event, and then publishing the slot. And the handler gets the event whenever its onEvent method is called. No extra queues. Very simple.

IMHO, a single producer accessed by multiple threads under a lock won't resolve your problem, because it simply shifts the locking from the Disruptor side to your own program.
The solution to your problem depends on the type of event model you need, i.e. whether events must be consumed chronologically, merged, or handled under some other special requirement. Since you are dealing with the Disruptor and multiple producers, that sounds to me very much like an FX trading system :-) Anyway, based on my experience, assuming you need chronological order per producer but don't care about interleaving events between producers, I would recommend a queue-merging thread. The structure is:
Each producer produces data and puts it into its own named queue.
A worker thread constantly examines the queues. From each queue it removes one or several items and publishes them through the single producer of your single-producer Disruptor.
Note that in the above scenario:
Each producer queue is a single-producer, single-consumer queue.
The Disruptor is a single-producer, multi-consumer Disruptor.
Depending on your needs, to avoid a thread that spins forever: if the worker thread examines the queues for, say, 100 runs and all of them are empty, it can set a flag and call wait(), and the event producers can notify() it when they see that it is waiting.
I think this resolves your problem. If not, please post the event processing pattern you need and let's see.
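
A minimal sketch of that queue-merging structure, using only the JDK: each producer owns a queue, and one merger thread is the only thread that ever calls the sink. The sink here is a plain Consumer standing in for the publish call on a single-producer Disruptor, which is an assumption made to keep the example free of external dependencies. The class name QueueMergerSketch is hypothetical, and the idle strategy is a simple Thread.yield() rather than the wait()/notify() scheme mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

// Hypothetical sketch: N producer queues merged by one thread into a single-writer sink.
class QueueMergerSketch {

    /** Runs on ONE thread; sink stands in for the single-producer publish call. */
    static void merge(List<Queue<String>> queues, CountDownLatch producersDone, Consumer<String> sink) {
        boolean finished = false;
        while (!finished) {
            // Read the flag BEFORE sweeping, so an empty sweep afterwards proves nothing is left.
            boolean producersFinished = producersDone.getCount() == 0;
            boolean moved = false;
            for (Queue<String> queue : queues) {
                String event = queue.poll();    // take at most one item per queue per sweep
                if (event != null) {
                    sink.accept(event);
                    moved = true;
                }
            }
            if (!moved) {
                if (producersFinished) {
                    finished = true;
                } else {
                    Thread.yield();             // simple idle strategy for the sketch
                }
            }
        }
    }

    /** Starts the producers and the merger, and returns the merged event sequence. */
    static List<String> run(int producers, int eventsEach) throws InterruptedException {
        List<Queue<String>> queues = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(producers);
        for (int p = 0; p < producers; p++) {
            Queue<String> queue = new ConcurrentLinkedQueue<>();
            queues.add(queue);
            int id = p;
            new Thread(() -> {
                for (int i = 0; i < eventsEach; i++) {
                    queue.add("p" + id + "-" + i);  // each producer writes only to its own queue
                }
                done.countDown();
            }).start();
        }
        List<String> merged = new ArrayList<>();
        Thread merger = new Thread(() -> merge(queues, done, merged::add));
        merger.start();
        merger.join();                          // join makes the merged list safely visible here
        return merged;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3, 5));
    }
}
```

Per-producer order is preserved because each queue is FIFO and has exactly one reader; the interleaving between producers is not defined.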

Best way to configure a Threadpool for a Java RIA client app

I have a Java client which accesses our server side over HTTP, making several small requests to load each new page of data. We maintain a thread pool to handle all non-UI processing, i.e. any background client-side tasks and any tasks which need to make a connection to the server. I've been looking into some performance issues and I'm not certain we've got our thread pool set up as well as possible. Currently we use a ThreadPoolExecutor with a core pool size of 8, and we use a LinkedBlockingQueue for the work queue, so the max pool size is ignored. No doubt there's no simple "do this in all situations" answer, but are there any best practices? My thinking at the moment is:
1) I'll switch to using a SynchronousQueue instead of a LinkedBlockingQueue so the pool can grow to the max pool size figure.
2) I'll set the max pool size to be unlimited.
Basically, my current fear is that occasional performance issues on the server side are causing unrelated client-side processing to halt due to the upper limit on the thread pool size. My fear with unbounding it is the additional cost of managing those threads on the client, but that is possibly just the lesser of two evils.
Any suggestions, best practices or useful references?
Cheers,
Robin

It sounds like you'd probably be better off limiting the queue size: does your application still behave properly when there are many requests queued (is it acceptable for all tasks to be queued for a long time, and are some more important than others)? What happens if there are still queued tasks left and the user quits the application? If the queue grows very large, is there a chance that the server will catch up (soon enough) to hide the problem completely from the user?
I'd say create one queue for requests whose response is needed to update the user interface, and keep that queue very small. If this queue gets too big, notify the user.
For real background tasks, keep a separate pool with a longer queue, but not an infinite one. Define graceful behavior for this pool: when the queue grows, or when the user wants to quit but there are tasks left, what should happen?

In general, network latencies are easily orders of magnitude higher than anything related to memory allocation or thread management on the client side. So, as a general rule, if you are running into a performance bottleneck, look first and foremost at the network link.
If the issue is that your server simply cannot keep up with the requests from the clients, bumping up the number of threads on the client side is not going to help: you'll simply go from having 8 threads waiting for a response to having more threads waiting (and you may even aggravate the server-side issues by increasing its load, due to the higher number of connections it has to manage).
Both of the concurrent queues in the JDK are high performers; the choice really boils down to usage semantics. If you have non-blocking plumbing, then it is more natural to use the non-blocking queue. If you don't, then using the blocking queues makes more sense. (You can always specify Integer.MAX_VALUE as the limit.) If FIFO processing is not a requirement, make sure you do not specify fair ordering, as that will entail a substantial performance hit.

As alphazero said, if you've got a bottleneck, your number of client side waiting jobs will continue to grow regardless of what approach you use.
The real question is how you want to deal with the bottleneck. Or more correctly, how you want your users to deal with the bottleneck.
If you use an unbounded queue, then you don't get feedback that the bottleneck has occurred. And in some applications, this is fine: if the user is kicking off asynchronous tasks, then there's no need to report a backlog (assuming it eventually clears). However, if the user needs to wait for a response before doing the next client-side task, this is very bad.
If you use LinkedBlockingQueue.offer() on a bounded queue, then you'll immediately get a response that says the queue is full, and can take action such as disabling certain application features, popping a dialog, whatever. This will, however, require more work on your part, particularly if requests can be submitted from multiple places. I'd suggest, if you don't have it already, you create a GUI-aware layer over the server queue to provide common behavior.
And, of course, never ever call LinkedBlockingQueue.put() from the event thread (unless you don't mind a hung client, that is).
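
A minimal sketch of the offer() approach: offer() on a bounded LinkedBlockingQueue returns false immediately when the queue is full, whereas put() would block the caller. The wrapper class BoundedSubmitSketch and its method names are hypothetical, standing in for the GUI-aware layer suggested above; what the caller does on false (disable a feature, show a dialog) is left out.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical wrapper around the request queue; not a real library class.
class BoundedSubmitSketch {

    private final LinkedBlockingQueue<Runnable> requests;

    BoundedSubmitSketch(int capacity) {
        this.requests = new LinkedBlockingQueue<>(capacity);
    }

    /** Never blocks, so it is safe on the event thread. Returns false when the backlog is full. */
    boolean trySubmit(Runnable request) {
        return requests.offer(request);
    }

    /** Number of requests currently waiting. */
    int backlog() {
        return requests.size();
    }

    public static void main(String[] args) {
        BoundedSubmitSketch queue = new BoundedSubmitSketch(2);
        Runnable request = () -> { };
        System.out.println(queue.trySubmit(request)); // prints true
        System.out.println(queue.trySubmit(request)); // prints true
        System.out.println(queue.trySubmit(request)); // prints false: queue is full, caller can react
    }
}
```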

Why not create an unbounded queue, but reject tasks (and maybe even inform the user that the server is busy, depending on the app) when the queue reaches a certain size? You can then log this event and find out what happened on the server side to cause the backlog. Additionally, unless you are connecting to multiple remote servers, there is probably not much point in having more than a couple of threads in the pool, although this does depend on your app, what it does, and who it talks to.
Having an unbounded pool is usually dangerous as it generally doesn't degrade gracefully. Better to log the problem, raise an alert, prevent further actions being queued and figure out how to scale the server side, if the problem is there, to prevent this happening again.
