I am playing around with QuickFIX and I have a design question.
I process received messages in the function below:
void processFixMessage(Message message) {
    // do stuff here
}
It is almost certain that I will consume (process) messages more slowly than I receive them.
My question is: is there a way to handle the situation where, if I haven't finished processing one message and another arrives, a different thread picks it up and starts processing it?
You can hand the message off to another thread inside processFixMessage(Message message). Depending on the rate of incoming messages and the time it takes to process a single message, you can choose how many threads to create.
One way is to create a thread pool of n threads and submit your message processing to that pool.
You can refer to: https://www.journaldev.com/1069/threadpoolexecutor-java-thread-pool-example-executorservice
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
You can size the pool dynamically based on the machine:
int cores = Runtime.getRuntime().availableProcessors();
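For illustration, a minimal sketch of that hand-off, assuming QuickFIX/J's quickfix.Message and the processFixMessage method from the question (the class name and pool size are just illustrative starting points to tune):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import quickfix.Message;

public class FixMessageDispatcher {
    // One pool shared by all incoming messages; size it from the machine and tune.
    private final ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Called on the QuickFIX callback thread; returns immediately after queueing.
    void processFixMessage(Message message) {
        pool.submit(() -> handle(message));
    }

    private void handle(Message message) {
        // do the actual (slow) processing here
    }
}

Note that handing messages to a pool gives up ordering across messages; if messages for the same session must be processed in order, you will need to route them to a single worker per session.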
Imagine you have a task structure of:
Task1
Task2: 1 million separate independent Subtask[i] that can run concurrently
Task3: must run once after ALL Task2 subtasks have completed
And all of Task1, Subtask[i] and Task3 are represented by MQ messages.
How can this be solved with ActiveMQ? Especially the triggering of a Task3 message once all subtasks are complete.
I know it's not a queueing problem, it's a fork-join problem. Let's say the environment dictates that you must use ActiveMQ for it.
Using ActiveMQ features, dynamic queues and consumers, stuff like that, is allowed. Using external counters, like a database row representing Task2's progress, is not allowed.
Hidden in this fork-join problem is a state management and observability challenge. Since the database is ruled out, you have to rely on something in-memory or on-queue.
Create a unique id for the task run -- something short, but with enough space to avoid collisions, like an airline booking locator code -- e.g. 34FDSX
Send all messages for the task to a queue://TASK.34FDSX.DATA
Send a control message to queue://TASK.34FDSX.CONTROL that contains the task id and expected total # of messages (including each messageId would be helpful too)
When consumers from queue://TASK.34FDSX.DATA complete their work, they should send a 'done' message to queue://TASK.34FDSX.DONE with their messageId or some other identifier.
The consumers for the .CONTROL queue and the .DONE queue should be the same process and can track the expected and completed totals. Once everything is completed, it can fire the event to trigger Task #3 (see the sketch below).
This approach keeps everything 'online', and you can also time out the .CONTROL and .DONE reader if too much time passes before the task completes.
Queue deletion can be done using ActiveMQ destination GC, or as a clean-up step in the .CONTROL/.DONE reader on the occasions when everything completes successfully.
Advantages:
No infinite blocking consumers
No infinite open transactions
State of the TASK is online and observable via the presence of queues and queue metrics-- queue size, enqueue count, dequeue count
The entire solution can be multi-threaded; the only requirement is that, for a given task, the .CONTROL/.DONE listener is a single consumer. Multiple tasks can each have their own .CONTROL/.DONE listener to scale.
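A minimal sketch of that .CONTROL/.DONE listener, using plain JMS against ActiveMQ; the task id, queue names, and the Task 3 trigger queue are modelled on the example above, not prescribed:

import javax.jms.Connection;
import javax.jms.JMSException;
import javax.jms.MapMessage;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class TaskCompletionListener {

    // One instance per task run, e.g. taskId = "34FDSX".
    public void run(Connection connection, String taskId) throws JMSException {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer control = session.createConsumer(
                session.createQueue("TASK." + taskId + ".CONTROL"));
        MessageConsumer done = session.createConsumer(
                session.createQueue("TASK." + taskId + ".DONE"));

        // The expected total arrives once on the .CONTROL queue.
        MapMessage ctrl = (MapMessage) control.receive();
        int expected = ctrl.getInt("expectedCount");

        // Count 'done' messages until every subtask has reported in.
        int completed = 0;
        while (completed < expected) {
            Message doneMsg = done.receive(60_000); // timeout guards against a stuck task
            if (doneMsg == null) {
                // too long without progress: alert, extend the timeout, or abandon the run
                return;
            }
            completed++;
        }

        // Everything completed: fire the event that triggers Task 3.
        MessageProducer trigger = session.createProducer(session.createQueue("TASK3.TRIGGER"));
        trigger.send(session.createTextMessage("task " + taskId + " complete"));
    }
}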
The question here is a bit vague so my answer will have to be a bit vague as well.
Each of the million independent subtasks for "Task 2" can be represented by a single message. All these messages can be in the same queue. You can spin up as many consumers as you want and process all these messages (i.e. perform all the subtasks). Just ensure that these consumers either use client-acknowledge mode or a transacted session so that the message is not removed from the queue until they are done processing the message. Once there are no more messages in the queue then you know "Task 2" is done.
To detect when the queue is empty you can have a "special" consumer on the queue that periodically opens a transacted session and tries to consume a message from the queue. If the consumer receives a message then you can rollback the transacted session to put the message back on the queue and you know that the queue is not empty (i.e. "Task 2" is not done). If the consumer doesn't receive a message then you know the queue is empty and you can send another message indicating this. You could launch this special consumer as part of "Task 2" after all the messages for the subtasks have been sent to avoid detecting an empty queue prematurely.
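A minimal sketch of that "special" probing consumer, again using plain JMS (the queue name, wait time, and connection handling are assumptions):

import javax.jms.Connection;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

public class EmptyQueueProbe {

    // Returns true if the queue appears empty; any message it receives is rolled
    // back onto the queue, so the probe never consumes real work.
    public boolean isQueueEmpty(Connection connection, String queueName) throws JMSException {
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        try {
            MessageConsumer consumer = session.createConsumer(session.createQueue(queueName));
            Message message = consumer.receive(1000); // short wait; null means nothing was available
            if (message != null) {
                session.rollback();                   // put the message back; "Task 2" is not done
                return false;
            }
            return true;                              // queue drained; "Task 2" is done
        } finally {
            session.close();
        }
    }
}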
To be clear, this is a simple solution. You could certainly add more complexity depending on your requirements, but your question just outlined the basic problem so it's unclear what other requirements you have (if any).
I'm consistently seeing very long delays (60+ seconds) between two actors, from the time at which the first actor sends a message for the second, and when the second actor's onReceive method is actually called with the message. What kinds of things can I look for to debug this problem?
Details
Each instance of ActorA sends one message to ActorB with ActorRef.tell(Object, ActorRef). I collect a millisecond timestamp (with System.currentTimeMillis()) right after calling the tell method in ActorA, and another one at the start of ActorB's onReceive(Object). The interval between these timestamps is consistently 60 seconds or more. Specifically, when plotted over time, this interval follows a rough sawtooth pattern that ranges from just over 60 seconds to almost 120 seconds, as shown in the graph below.
These actors are early in the data flow of the system; several other actors follow after ActorB. This large gap only occurs between these two specific actors; the gaps between other pairs of adjacent actors are typically less than a millisecond, occasionally a few tens of milliseconds. Additionally, the actual time spent inside any given actor is never more than a second.
Generally, each actor in the system only passes a single message to another actor. One of the actors (subsequent to ActorB) sends a single message to each of a few different actors, and a small percentage (less than 0.1%) of the time, certain actors will send multiple messages to the same subsequent actor (i.e., multiple instances of the subsequent actor will be demanded). When this occurs, the number of multiple messages is typically on the order of a dozen or less.
Can this be explained (explicitly) by the normal reactive nature of Akka? Does it indicate a problem with the way work is distributed or the way the actors are configured? Is there something that can explicitly block a particular actor from spinning up? What other information should I collect or look at to understand the source of this, or to determine whether it is actually a problem?
You have a limited thread pool. If your Actors block, they still take up space in the thread pool. New threads will not be created if your thread pool is saturated.
You may want to configure
core-pool-size-factor,
core-pool-size-min, and
core-pool-size-max.
If you expect certain actions to block, you can instead wrap them in Future { blocking { ... } } and register a callback. But it's better to use asynchronous, non-blocking calls.
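For reference, those settings live under the dispatcher's thread-pool-executor section of application.conf; a sketch for a classic Akka default dispatcher (the values are placeholders, and exact paths can vary between Akka versions):

akka.actor.default-dispatcher {
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-min    = 8
    core-pool-size-factor = 3.0
    core-pool-size-max    = 64
  }
}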
I have a theoretical question regarding "message loops", particularly about returning the result of operations that happen in a message loop running in a different thread. I have a situation where a TCP server listens for incoming messages. For each incoming message the server authenticates the client who sent it, and one of two things may happen:
If the authenticated client has an attached handler the received message will be passed to the handler's message queue.
If the client has no handler, a new one will be created and, as above, the message will be passed to its message queue.
The handler is currently an object implementing the Callable interface, so that it runs in a different thread and it's simple enough to get the result of the operation. Now for my problem: each handler can have N messages to process. The handler has a "message loop"-like functionality that runs until a timeout occurs -- a timeout in this case being the socket's idle time reaching a predefined threshold. What I would like to know is: how can I return a value from within the message loop without actually terminating the thread? Something like the following:
while (true) {
    if (expired(socket))
        break; // the callable will finish the call() method

    // get the first item from the queue
    message = messageQueue.poll();
    result = process(message);
    // I want to return the result to the caller, which is in a different thread.
}
Now obviously a return statement would stop the message loop, and if the messageQueue contains more messages they'll be lost. Another naive approach would be a callback-like mechanism, which requires an extra object, and I would still need to synchronize the caller with the Callable in the background thread, something like wait & notify, although I have K threads running in the background.
What would be the sophisticated way to handle this situation of returning results of operations from within a message-loop in a different thread, without terminating the thread itself?
Edit:
I'll give a description of the whole process so that it clarifies what is happening here.
A client sends a message (an XML string) to the application through a TCP socket.
The application authenticates the client, and if the client has no associated handler it'll create one.
The app will push the message to the queue of the handler.
Each handler runs in a separate thread, waiting for incoming messages from the client it is associated with; it MUST NOT handle messages for other clients.
When the handler picks up a message, it transforms it into a SOAP message and forwards it to another system through a TCP socket.
When the handler receives the response, it needs to delegate it back to the caller without terminating its message loop.
So the caller is something like a dispatcher, dispatching messages to the threads that run the handlers associated with the sender of each message. It also collects the responses from the handlers and sends them back to the correct clients.
Each handler currently has its own message queue, onto which only the messages that that particular handler has to process are pushed. When a handler starts up, it opens a TCP socket to the target system, where it forwards the incoming messages after the transformations have been applied. When the handler reaches the maximum allowed idle time (the socket has been open without a request being sent), the socket is closed and the message loop stopped; at this point the handler finishes its execution. The purpose of this is to have a socket for each individual client through which they can send multiple requests, without the target system needing to authenticate again.
A few options/questions come to mind:
Is there a problem with terminating the task, checking the returned result, and then re-submitting it to the same thread pool? You would get a result, analyze it, then resubmit to the pool and continue the work.
As the thread runs, it can submit statuses to a different ("external") queue which is analyzed outside this thread. An independent thread always runs and checks this queue.
Those are the approaches that come to mind.
It depends on...
If you want to return a simple type, you can use a thread-safe result queue (global or per caller).
Probably a thread pool will be more suitable in your case.
I believe the most universal way is a callback mechanism.
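One way to tie these suggestions together is to pair each queued message with a CompletableFuture that the dispatcher holds and the handler completes from inside its loop; a sketch under those assumptions (class and field names are invented for illustration, and the SOAP forwarding is reduced to a placeholder):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// A message paired with the future the dispatcher is waiting on (or has
// registered a callback on), so the loop can hand back results without returning.
class PendingMessage {
    final String xmlPayload;
    final CompletableFuture<String> result = new CompletableFuture<>();

    PendingMessage(String xmlPayload) {
        this.xmlPayload = xmlPayload;
    }
}

class Handler implements Runnable {
    private final BlockingQueue<PendingMessage> messageQueue = new LinkedBlockingQueue<>();

    void submit(PendingMessage message) {
        messageQueue.offer(message);
    }

    @Override
    public void run() {
        while (true) {
            PendingMessage msg;
            try {
                // The poll timeout plays the role of the socket-idle threshold in the question.
                msg = messageQueue.poll(30, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (msg == null) {
                return;                              // idle timeout: end the message loop
            }
            try {
                String response = transformAndForward(msg.xmlPayload);
                msg.result.complete(response);       // hand the result back; keep looping
            } catch (Exception e) {
                msg.result.completeExceptionally(e); // surface failures to the caller too
            }
        }
    }

    private String transformAndForward(String xml) {
        return xml; // placeholder for the SOAP transformation and TCP forwarding
    }
}

The dispatcher can then either block on result.get() with a timeout or register a thenAccept callback, so no explicit wait/notify bookkeeping is needed.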
I have a method which is constantly passed real-time data.
The method then evaluates the data:
void processMessage(String messageBeingPassed) {
    // evaluate the message here and do something with it
    // depending on the current state of the message
    // if message.equals("test")
    //     call a separate thread to save to database, etc.
}
My question is, is there any advantage to putting the entire method body inside a thread for better performance?
such as:
void processMessage(String messageBeingPassed) {
    Runnable runnable = new Runnable() {
        public void run() {
            // evaluate the message here and do something
            // depending on the current state of the message
            // if message.equals("test")
            //     call a separate thread to save to database, etc.
        }
    };
    // start the main body thread for this current message
    new Thread(runnable).start();
}
Thanks for any response.
It will depend on various factors. If that method is a bottleneck for your application (i.e. you get long queues of messages waiting to be processed), then it will likely improve your performance up to a certain point, and then degrade again if you use too many threads. So you should use a thread pool and have like 4 threads responsible for that, or some other amount that works best.
However, if you don't get such queues of messages, then that's hardly going to help you.
Either way, the only way to know for sure is through testing and profiling of what performs best in your application.
The advantage is that you can process multiple messages at once, and the calling method won't need to block while the message is being processed (in other words, message processing will be asynchronous instead of synchronous). The disadvantage is that you open yourself up to data races / deadlocks / etc if you're not careful about designing your methods - generally, if your runnable will ONLY be operating on the messageBeingPassed object (and not e.g. on any static fields), then you should be fine. In addition, threads carry some overhead with them, which you can reduce by using an ExecutorService instead of constructing your own thread objects.
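A minimal sketch of that ExecutorService variant (the pool size of 4 is just the starting point mentioned above and should be tuned):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class MessageProcessor {
    // One shared, bounded pool instead of a new Thread per message.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    void processMessage(String messageBeingPassed) {
        pool.submit(() -> {
            // evaluate the message here and do something with it,
            // e.g. if (messageBeingPassed.equals("test")) save it to the database
        });
    }
}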
It depends on the rate of the data and the time taken by processMessage. If the next piece of data arrives before processMessage finishes executing for the previous one, it is a good idea to use a thread inside processMessage.
I initially asked this question here, but I've realized that my question is not about a while-true loop. What I want to know is, what's the proper way to do high-performance asynchronous message-passing in Java?
What I'm trying to do...
I have ~10,000 consumers, each consuming messages from their private queues. I have one thread that's producing messages one by one and putting them in the correct consumer's queue. Each consumer loops indefinitely, checking for a message to appear in its queue and processing it.
I believe the term is "single-producer/single-consumer", since there's one producer, and each consumer only works on their private queue (multiple consumers never read from the same queue).
Inside Consumer.java:
@Override
public void run() {
    while (true) {
        Message msg = messageQueue.poll();
        if (msg != null) {
            ... // do something with the message
        }
    }
}
The Producer is putting messages inside Consumer message queues at a rapid pace (several million messages per second). Consumers should process these messages as fast as possible!
Note: the while (true) { ... } is terminated by a KILL message sent by the Producer as its last message.
However, my question is about the proper way to design this message-passing. What kind of queue should I use for messageQueue? Should it be synchronous or asynchronous? How should Message be designed? Should I use a while-true loop? Should Consumer be a thread, or something else? Will 10,000 threads slow down to a crawl? What's the alternative to threads?
So, what's the proper way to do high-performance message-passing in Java?
I would say that the context-switching overhead of 10,000 threads is going to be very high, not to mention the memory overhead. By default, on 32-bit platforms, each thread uses a default stack size of 256 KB, so that's 2.5 GB just for your stacks. Obviously you're talking 64-bit, but even so, that's quite a large amount of memory. Due to the amount of memory used, the cache is going to be thrashing a lot, and the CPU will be throttled by the memory bandwidth.
I would look for a design that avoids using so many threads to avoid allocating large amounts of stack and context switching overhead. You cannot process 10,000 threads concurrently. Current hardware has typically less than 100 cores.
I would create one queue per hardware thread and dispatch messages in a round-robin fashion. If the processing times vary considerably, there is the danger that some threads finish processing their queue before they are given more work, while other threads never get through their allotted work. This can be avoided by using work stealing, as implemented in the JSR-166 ForkJoin framework.
Since communication is one way from the publisher to the subscribers, then Message does not need any special design, assuming the subscriber doesn't change the message once it has been published.
EDIT: Reading the comments, if you have 10,000 symbols, then create a handful of generic subscriber threads (one subscriber thread per core) that asynchronously receive messages from the publisher (e.g. via their message queue). The subscriber pulls a message from the queue, retrieves the symbol from the message, looks it up in a Map of message handlers, retrieves the handler, and invokes the handler to synchronously handle the message. Once done, it repeats, fetching the next message from the queue. If messages for the same symbol have to be processed in order (which is why I'm guessing you wanted 10,000 queues), you need to map symbols to subscribers. E.g. if there are 10 subscribers, then symbols 0-999 go to subscriber 0, 1000-1999 to subscriber 1, etc. A more refined scheme is to map symbols according to their frequency distribution, so that each subscriber gets roughly the same load. For example, if 10% of the traffic is symbol 0, then subscriber 0 will deal with just that one symbol and the other symbols will be distributed amongst the other subscribers.
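A sketch of that symbol-to-subscriber mapping; the hash-based partitioning and the handler lookup are illustrative choices, not the only option (a frequency-based mapping, as described above, would replace the hashing step):

import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;

class SymbolMessage {
    final String symbol;
    final String payload;

    SymbolMessage(String symbol, String payload) {
        this.symbol = symbol;
        this.payload = payload;
    }
}

interface MessageHandler {
    void handle(SymbolMessage msg);
}

class Publisher {
    private final List<BlockingQueue<SymbolMessage>> subscriberQueues; // one per subscriber/core

    Publisher(List<BlockingQueue<SymbolMessage>> subscriberQueues) {
        this.subscriberQueues = subscriberQueues;
    }

    // Messages for the same symbol always land on the same subscriber,
    // so per-symbol ordering is preserved.
    void publish(SymbolMessage msg) throws InterruptedException {
        int idx = Math.floorMod(msg.symbol.hashCode(), subscriberQueues.size());
        subscriberQueues.get(idx).put(msg);
    }
}

class Subscriber implements Runnable {
    private final BlockingQueue<SymbolMessage> queue;
    private final Map<String, MessageHandler> handlersBySymbol;

    Subscriber(BlockingQueue<SymbolMessage> queue, Map<String, MessageHandler> handlersBySymbol) {
        this.queue = queue;
        this.handlersBySymbol = handlersBySymbol;
    }

    @Override
    public void run() {
        try {
            while (true) {
                SymbolMessage msg = queue.take();             // blocks instead of spinning
                handlersBySymbol.get(msg.symbol).handle(msg); // synchronous per-symbol handling
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}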
You could use this (credit goes to Which ThreadPool in Java should I use?):
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class Main {
    static final ExecutorService threadPool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors() * 2);

    public static void main(String[] args) {
        Set<Consumer> consumers = getConsumers(threadPool);
        for (Consumer consumer : consumers) {
            threadPool.execute(consumer);
        }
    }
}
and
class Consumer implements Runnable {
    private final ExecutorService tp;
    private final MessageQueue messageQueue;

    Consumer(ExecutorService tp, MessageQueue queue) {
        this.tp = tp;
        this.messageQueue = queue;
    }

    @Override
    public void run() {
        Message msg = messageQueue.poll();
        if (msg != null) {
            try {
                ... // do something with the message
            } finally {
                // re-submit so this consumer gets scheduled again by the pool
                this.tp.execute(this);
            }
        }
    }
}
This way, you can have okay scheduling with very little hassle.
First of all, there's no single correct answer unless you either post a complete design doc or try different approaches for yourself.
I'm assuming your processing is not going to be computationally intensive, otherwise you wouldn't be thinking of processing 10,000 queues at the same time. One possible solution is to minimise context switching by having one or two threads per CPU. Unless your system needs to process data in strict real time, that may give you bigger delays on each queue but better overall throughput.
For example, have your producer thread run on its own CPU and hand batches of messages to the consumer threads. Each consumer thread would then distribute the messages to its N private queues, perform the processing step, receive a new batch of data, and so on. Again, it depends on your delay tolerance, so the processing step may mean processing all the queues, a fixed number of queues, or as many queues as it can before a time threshold is reached. Being able to easily tell which queue belongs to which consumer thread (e.g. if queues are numbered sequentially: int consumerThreadNum = queueNum & 0x03) would be beneficial, as looking them up in a hash table each time may be slow.
To minimise memory thrashing it may not be such a good idea to create and destroy queues all the time, so you may want to pre-allocate (max number of queues / number of cores) queue objects per thread. When a queue is finished, instead of being destroyed it can be cleared and reused. You don't want GC to get in your way too often or for too long.
Another unknown is whether your producer produces a complete set of data for each queue or sends data in chunks until the KILL command is received. If your producer sends complete data sets, you may do away with the queue concept entirely and just process the data as it arrives at a consumer thread.
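A tiny sketch of the queue-to-thread mapping mentioned above, assuming the number of consumer threads is a power of two so the bitmask trick applies:

class QueueRouter {
    private final int threadMask; // e.g. 4 consumer threads -> mask 0x03

    QueueRouter(int consumerThreads) {
        if (Integer.bitCount(consumerThreads) != 1) {
            throw new IllegalArgumentException("consumer thread count must be a power of two");
        }
        this.threadMask = consumerThreads - 1;
    }

    // Sequentially numbered queues map straight to consumer threads, no hash lookup needed.
    int consumerThreadFor(int queueNum) {
        return queueNum & threadMask;
    }
}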
Have a pool of consumer threads sized relative to the hardware and OS capacity. These consumer threads can poll your message queue.
I would either have the messages know how to process themselves, or register processors with the consumer thread classes when they are initialized (see the sketch below).
In the absence of more detail about the constraints of processing the symbols, it's hard to give very specific advice.
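A short sketch of the "messages know how to process themselves" option (the interface and class names are invented for illustration):

import java.util.concurrent.BlockingQueue;

interface SelfProcessingMessage {
    void process(); // each message carries its own handling logic
}

class ConsumerWorker implements Runnable {
    private final BlockingQueue<SelfProcessingMessage> queue;

    ConsumerWorker(BlockingQueue<SelfProcessingMessage> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (true) {
                queue.take().process(); // the message itself decides what to do
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}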
You should take a look at this slashdot article:
http://developers.slashdot.org/story/10/07/27/1925209/Java-IO-Faster-Than-NIO
It has quite a bit of discussion and actual measured data about the many-threads vs. single-select vs. thread-pool arguments.