I'm new to Akka. I made a ping-pong example between two actors (Ping Actor and Pong Actor) running on two nodes on my local machine and then tested them in two different ways. Basically, Ping Actor sends a message containing System.nanoTime() to Pong Actor. After getting the message, Pong Actor sends the received nano time back to Ping Actor.
Then I can calculate the time taken for a ping-pong round trip.
Way 1: main <-> pingActor <-> pongActor
Main1:
for (int i = 0; i < 10; i++) {
    pingActor.tell("start", null);
}
PingActor:
public Receive createReceive() {
    return receiveBuilder()
        .matchEquals("start", start -> {
            pongActor.tell(System.nanoTime(), self());
        })
        .match(Long.class, aboveTime -> {
            long timeDiff = System.nanoTime() - aboveTime;
            System.out.println(timeDiff);
        })
        .build();
}
Way 2: main -> pingActor <-> pongActor
Main2:
pingActor.tell("start", null);
PingActor:
public Receive createReceive() {
    return receiveBuilder()
        .matchEquals("start", start -> {
            pongActor.tell(System.nanoTime(), self());
        })
        .match(Long.class, aboveTime -> {
            long timeDiff = System.nanoTime() - aboveTime;
            System.out.println(timeDiff);
            pongActor.tell(System.nanoTime(), self());
        })
        .build();
}
My test results show that the 1st way is much slower (100 ms on average) than the 2nd way (1 ms on average). I need an explanation, and to know how to make a call from outside an actor as fast as a call made from inside one.
Thank you
The comparison between way 1 and way 2 is not fair.
way1
Because Main1 sends the "start" messages faster than Ping can handle them, Ping will have several messages queued in its mailbox at the same time; as a result, some of the response messages from Pong also sit queued in Ping's mailbox, so the difference you measure is not the real message-transfer time.
You need to be aware that an actor runs on one thread, processing one message at a time, unless you increase nr-of-instances. Also, Akka's actor model does not use coroutines; under the hood it still runs on a thread pool, so in more complex scenarios you may see more delay. You therefore need to handle blocking carefully in an Akka-based application.
way2
From way 1's explanation you can see that your way 2 in fact runs sequentially, so the difference reflects the real transfer speed between actors and does not include the time spent queued in the mailbox. That is why it is quicker than way 1. If your Main2 also used a for loop, its speed would drop as well; conversely, if you add some delay inside way 1's loop, such as Thread.sleep(1000), way 1's difference becomes smaller. However, I think your test data for way 2 is still a little slow; on my local machine the message transfer rate is 50,000+ messages per second.
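For illustration only, a "fairer" Main1 could space out the sends so the mailboxes never back up. A rough sketch (the 1000 ms value is arbitrary, and Thread.sleep must be declared or caught in real code):

for (int i = 0; i < 10; i++) {
    pingActor.tell("start", null);
    Thread.sleep(1000); // give the previous round trip time to finish before sending again
}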
As for how to make a call from outside an actor as fast as one made from inside:
This is really a big question. What you are doing is already the way to do it, but you need to design your actors carefully to make full use of the underlying thread pool, and you also need to understand the running mechanism of Akka's actors; then you can make Akka powerful in your application.
Related
I have a simple class named QueueService with some methods that wrap the methods from the AWS SQS SDK for Java. For example:
public ArrayList<Hashtable<String, String>> receiveMessages(String queueURL) {
    List<Message> messages = this.sqsClient.receiveMessage(queueURL).getMessages();
    ArrayList<Hashtable<String, String>> resultList = new ArrayList<Hashtable<String, String>>();
    for (Message message : messages) {
        Hashtable<String, String> resultItem = new Hashtable<String, String>();
        resultItem.put("MessageId", message.getMessageId());
        resultItem.put("ReceiptHandle", message.getReceiptHandle());
        resultItem.put("Body", message.getBody());
        resultList.add(resultItem);
    }
    return resultList;
}
I have another class named App that has a main and creates an instance of the QueueService.
I'm looking for a "pattern" to make the main in App listen for new messages in the queue. Right now I have a while(true) loop where I call the receiveMessages method:
while (true) {
    messages = queueService.receiveMessages(queueURL);
    for (Hashtable<String, String> message : messages) {
        String receiptHandle = message.get("ReceiptHandle");
        String messageBody = message.get("Body"); // "Body" is the key used in receiveMessages()
        System.out.println(messageBody);
        queueService.deleteMessage(queueURL, receiptHandle);
    }
}
Is this the correct way? Should I use the async message receive method in SQS SDK?
To my knowledge, there is no way in Amazon SQS to support an active listener model where Amazon SQS would "push" messages to your listener, or would invoke your message listener when there are messages.
So, you would always have to poll for messages. Two polling mechanisms are supported - Short Polling and Long Polling. Each has its own pros and cons, but Long Polling is the one you would typically end up using in most cases, although Short Polling is the default. Long Polling is definitely more efficient in terms of network traffic, is more cost efficient (because Amazon charges you by the number of requests made), and is also the preferred mechanism when you want your messages to be processed in a time-sensitive manner (roughly: processed as soon as possible).
There are more intricacies around Long Polling and Short Polling that are worth knowing, and it's somewhat difficult to paraphrase all of that here, but if you like, you can read a lot more about this in the following blog. It has a few code examples as well that should be helpful.
http://pragmaticnotes.com/2017/11/20/amazon-sqs-long-polling-versus-short-polling/
In terms of a while(true) loop, I would say it depends.
If you are using Long Polling, you can set the wait time to be (at most) 20 seconds; that way you do not poll SQS more often than every 20 seconds if there are no messages. If there are messages, you can decide whether to poll frequently (to process messages as soon as they arrive) or to always process them at time intervals (say every n seconds).
Another point to note is that you can read up to 10 messages in a single receiveMessages request, which also reduces the number of calls you make to SQS, thereby reducing costs. And as the above blog explains in detail, you may request 10 messages but not get 10 back, even if there are that many messages in the queue.
In general though, I would say you need to build appropriate hooks and exception handling to turn off the polling at runtime if you wish to, in case you are using a while(true) kind of structure.
Another aspect to consider is whether you would like to poll SQS in your main application thread or spawn another thread. Another option could be to create a ScheduledThreadPoolExecutor with a single thread in main to poll SQS periodically (every few seconds); then you may not need a while(true) structure.
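For example, a minimal sketch of that idea (it reuses the queueService and queueURL from the question; the 5-second period is just a placeholder):

import java.util.Hashtable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
poller.scheduleWithFixedDelay(() -> {
    // note: an uncaught exception here cancels further runs, so add try/catch in real code
    for (Hashtable<String, String> message : queueService.receiveMessages(queueURL)) {
        System.out.println(message.get("Body"));
        queueService.deleteMessage(queueURL, message.get("ReceiptHandle"));
    }
}, 0, 5, TimeUnit.SECONDS); // first run immediately, then 5 seconds after each run finishes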
There are a few things that you're missing:
Use the receiveMessages(ReceiveMessageRequest) and set a wait time to enable long polling.
Wrap your AWS calls in try/catch blocks. In particular, pay attention to OverLimitException, which can be thrown from receiveMessages() if you would have too many in-flight messages.
Wrap the entire body of the while loop in its own try/catch block, logging any exceptions that are caught (there shouldn't be -- this is here to ensure that your application doesn't crash because AWS changed their API or you neglected to handle an expected exception).
See doc for more information about long polling and possible exceptions.
As for using the async client: do you have any particular reason to use it? If not, then don't: a single receiver thread is much easier to manage.
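Putting those points together, a rough sketch might look something like this (AWS SDK for Java v1 is assumed; sqsClient, queueURL and the running flag stand in for your own fields, and the catch blocks only show where logging and back-off would go):

import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.OverLimitException;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

while (running) {
    try {
        ReceiveMessageRequest request = new ReceiveMessageRequest(queueURL)
                .withWaitTimeSeconds(20)       // long polling: wait up to 20 s for messages
                .withMaxNumberOfMessages(10);  // fetch up to 10 messages per call
        for (Message message : sqsClient.receiveMessage(request).getMessages()) {
            System.out.println(message.getBody());
            sqsClient.deleteMessage(queueURL, message.getReceiptHandle());
        }
    } catch (OverLimitException e) {
        // too many in-flight messages; back off before retrying
    } catch (RuntimeException e) {
        // log and keep polling so an unexpected error doesn't kill the loop
    }
}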
If you want to use SQS and then Lambda to process the requests, you can follow the steps given in the link, or you can always use Lambda instead of SQS and invoke Lambda for every request.
As of 2019 SQS can trigger lambdas:
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
I found one solution for actively listening to the queue.
For Node, I used the following package and it resolved my issue:
sqs-consumer
https://www.npmjs.com/package/sqs-consumer
(disclaimer: I'm less than a beginner with akka)
Suppose I have an actor calling a method that never terminates (it's an extreme example; you can think of it as calling a method that might take a very long time to terminate, or might never terminate).
For example (Java)
public static class InfiniteLoop {
    public static int neverReturns() {
        int x = 0;
        while (true) {
            x++;
        }
        // unreachable: the method never returns
    }
}
Now if, while processing a message, an actor calls
InfiniteLoop.neverReturns()
the actor will never terminate.
Is there a way to kill it while it is still processing a message? If yes, will the loop continue in background?
(what I'm trying to understand is whether there is a way to recover from an "infinite loop"-style fault in an Akka system)
There is no way to implement such a thing in Akka. All methods of stopping an actor rely on sending a message to the actor. If your actor is stuck processing the current message because of the loop, it will never process the message telling it to stop. This is fundamental to the Akka actor model: messages in the mailbox are processed in order, and I don't think you can find a way around that. Check this article for your options on stopping/killing an actor: https://petabridge.com/blog/how-to-stop-an-actor-akkadotnet/. You will see how the semantics of stopping change for each method, but they all start by sending a message to the actor.
It would be nice to know why you need such a thing, because maybe your underlying requirement can be implemented in a more Akka-ish way. For instance, the potentially blocking action could be wrapped in a future if it's okay for the actor to move on to the next message.
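As a rough illustration of that last idea (blockingExecutor is an assumed java.util.concurrent.Executor that you configure yourself, and note the hijacked thread still never finishes; only the actor stays responsive and can still be stopped):

import java.util.concurrent.CompletableFuture;
import akka.actor.ActorRef;

// inside the actor's message handler
final ActorRef self = self();
CompletableFuture
        .supplyAsync(() -> InfiniteLoop.neverReturns(), blockingExecutor)
        .thenAccept(result -> self.tell(result, ActorRef.noSender()));
// the actor returns immediately and can process its next message (including a stop request),
// while the runaway computation is confined to a thread of blockingExecutor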
I have this simple code:
List<ActorRef> actors = new ArrayList<>();
for (int i = 0; i < ACTOR_COUNT; i++) {
    actors.add(system.actorOf(...));
}

for (ActorRef actor : actors) {
    system.scheduler().schedule(FiniteDuration.create(0, TimeUnit.MILLISECONDS),
            FiniteDuration.create(1000, TimeUnit.MILLISECONDS), actor, "Run",
            system.dispatcher(), null);
}
It creates a number of actors and then creates a scheduler entry for each of them. The actors themselves are responsible for querying an MQ and then processing a message.
When ACTOR_COUNT > 30, everything is good. But otherwise we have a memory leak (instances of akka.dispatch.Envelope with the message "Run" pile up and can't be garbage collected).
It's pretty weird, because when we have more actors we have more messages (1 per second for each of them), but unexpectedly the envelopes STOP piling up when there are more actors/messages.
The time interval (1000 ms) doesn't really affect the situation; it just makes it slower or faster.
Could you please explain this behavior for me?
Thank you.
UPDATE
Here is a dummy actor, which can help to isolate a problem.
public class MessageQueueTestActor extends UntypedActor {
    private static final Logger log = LoggerFactory.getLogger(MessageQueueTestActor.class);

    @Override
    public void onReceive(Object message) throws Exception {
        Thread.sleep(3000);
    }
}
The problem is reproduced with ACTOR_COUNT = 5. Now it's obvious that when the actor's sleep time > the scheduler interval, envelopes pile up. If I reduce the sleep time from 3000 ms to 500 ms, the problem is gone.
But the messages also become available to the garbage collector if I increase the number of actors up to 30 (with the same sleep time = 3000 ms). Why? It looks like something in Akka starts working differently past that threshold.
This is a «debug my code» question, not sure whether it should be here, but I'll answer in any case.
The scheduler does not enqueue the message into the actor’s mailbox itself, it uses the given dispatcher to do that. Since you block the threads in the default dispatcher and also use that to do the enqueueing, there is a point at which messages from the scheduler do not reach the mailboxes anymore (I assume that your default dispatcher has 30 threads). More correctly: they reach it one by one while the actors process up to five messages during each turn they get.
So, nothing is GC-ed, you just enqueue a different thing (Runnable) at a different place (default dispatcher). Your program will never work sustainably if the processing time is greater than the tick period.
I'm consistently seeing very long delays (60+ seconds) between two actors, from the time at which the first actor sends a message for the second, and when the second actor's onReceive method is actually called with the message. What kinds of things can I look for to debug this problem?
Details
Each instance of ActorA is sending one message for ActorB with ActorRef.tell(Object, ActorRef). I collect a millisecond timestamp (with System.currentTimeMillis()) right after calling the tell method in ActorA, and getting another one at the start of ActorB's onReceive(Object). The interval between these timestamps is consistently 60 seconds or more. Specifically, when plotted over time, this interval follows a rough saw tooth pattern that ranges from more 60 second to almost 120 seconds, as shown in the graph below.
These actors are early in the data flow of the system; there are several other actors that follow after ActorB. This large gap only occurs between these two specific actors; the gaps between other pairs of adjacent actors are typically less than a millisecond, occasionally a few tens of milliseconds. Additionally, the actual time spent inside any given actor is never more than a second.
Generally, each actor in the system only passes a single message to another actor. One of the actors (subsequent to ActorB) sends a single message to each of a few different actors, and a small percentage (less than 0.1%) of the time, certain actors will send multiple messages to the same subsequent actor (i.e., multiple instances of the subsequent actor will be demanded). When this occurs, the number of multiple messages is typically on the order of a dozen or less.
Can this be explained (explicitly) by the normal reactive nature of Akka? Does it indicate a problem with the way work is distributed or the way the actors are configured? Is there something that can explicitly block a particular actor from spinning up? What other information should I collect or look at to understand the source of this, or to understand whether or not it is actually a problem?
You have a limited thread pool. If your Actors block, they still take up space in the thread pool. New threads will not be created if your thread pool is saturated.
You may want to configure
core-pool-size-factor,
core-pool-size-min, and
core-pool-size-max.
If you expect certain actions to block, you can instead wrap them in Future { blocking { ... } } and register a callback. But it's better to use asynchronous, non-blocking calls.
I have 2 classes. One (A) collects some data and the other (B) sends the data to TCP/IP clients. The process is asynchronous, with refresh rates from nearly zero to a few seconds.
Note that this application has no GUI so I won't be able to use many built in "onChange" listeners.
In normal conditions I would simply write the code so that A calls a "send" method on B, passing the data, no problems here.
Now, assume that the rate A collects data is critical (real time) and that A cannot wait for B to complete the sending process (note that B uses TCP, not UDP). The way I implemented this is
A places the data in a field in B
B has a continuous loop that checks whether the data is new or not. If it is new, B sends it out.
If during the send the data is updated a few times it doesn't matter, as long as it doesn't slow down A.
Spawning a new thread for each send would in principle not slow down A but it's likely gonna result in a mess.
You can see that B is working in synchronous mode (but A isn't) and it's implemented with a while loop with a Thread.sleep() call. My questions are:
Should I use a timer task instead of the while loop? I know that most people hate the Thread.sleep() call, but ultimately the only thing I'm interested in is keeping CPU usage low.
Isn't there a more elegant way than the synchronous approach? In some cases the data refresh rate of A is about 1 second, and it would be nice if I could just have a listener that acts on an event. In such a case a sleep time of 25 ms would be a waste of cycles. In other cases it's very fast and I'd like no sleep at all.
Example: imagine that A is capturing screenshots of your screen and B is sending them to the clients. Only the last one matters, and B should go as fast as possible.
Any ideas or suggestions? Please keep things as simple and low cpu as possible
thanks a lot!
I would make it like this:
A collects the data in whatever fashion is appropriate and then posts the "next message" to send. If there is already a message pending, the new message replaces/updates the previous one.
B checks for any pending messages; if one is available it grabs it and sends it to the client(s). However, if no message is pending, B waits for one to become available.
private final Object lock = new Object();
private Object pending = null;

public void post(Object message) {
    synchronized (lock) {
        pending = message;
        lock.notifyAll();
    }
}

public Object getNextMessage() {
    Object message;
    synchronized (lock) {
        while (pending == null) {
            try {
                lock.wait();
            } catch (InterruptedException e) {
                // Ignore
            }
        }
        message = pending;
        pending = null;
    }
    return message;
}
Using a queue you could instead do
private final BlockingDeque<Object> queue = new LinkedBlockingDeque<Object>(1);

public void postMessage(Object message) {
    // If a previous message is still pending we replace it.
    queue.clear();
    queue.offer(message);
}

public Object getNextMessage() {
    while (true) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            // Ignore interrupts
        }
    }
}
Of course, in both examples it would be better to use a shutdown signal instead of while (true) so you can shut down gracefully.
I would set up a LinkedBlockingQueue between A and B, with a capacity large enough that A is not blocked when posting to a full queue. In A, the method that collects the data posts it to the queue. In B, as long as there is an item in the queue, it is new and should be sent out.
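A rough sketch of that idea (the capacity, the use of offer() and the sendToClients name are placeholders to adapt):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>(1000);

// called by A whenever new data is collected
public void post(Object data) {
    queue.offer(data); // returns false instead of blocking if the queue is ever full
}

// B's sending loop
public void run() {
    try {
        while (true) {
            Object data = queue.take(); // blocks until A has posted something new
            sendToClients(data);        // placeholder for the TCP send
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // allow graceful shutdown
    }
}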
If you want B to take advantage of multiple edits to a message by A being merged and sent out as a single update, then I would do it using the Observer pattern (see the sketch after this list).
The message that A keeps updating is the Observable.
B is an observer of this message.
Every time A updates the message, it is an indication for B to take some action.
B can choose to send the update to the clients immediately
B can also choose to wait for a certain period of time using a Timer and send the update to clients only after the timer fires off. The code to send update will be the TimerTask.
B would not set the Timer again until A changes the message.
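A loose sketch of that arrangement (MessageListener, Sender, sendToClients and the 200 ms delay are invented for illustration; java.util.Observable would work similarly but is deprecated, so a plain listener interface is used):

import java.util.Timer;
import java.util.TimerTask;

interface MessageListener {
    void messageUpdated(Object newValue); // A calls this every time it updates the message
}

class Sender implements MessageListener {
    private final Timer timer = new Timer(true); // daemon timer thread
    private Object latest;
    private boolean timerPending = false;

    @Override
    public synchronized void messageUpdated(Object newValue) {
        latest = newValue;
        if (!timerPending) {            // do not re-arm until the previous send has fired
            timerPending = true;
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    Object toSend;
                    synchronized (Sender.this) {
                        toSend = latest;
                        timerPending = false;
                    }
                    sendToClients(toSend); // only the most recent value is sent
                }
            }, 200);                    // coalesce updates arriving within 200 ms
        }
    }

    private void sendToClients(Object value) {
        // placeholder for B's TCP send
    }
}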
You can use an Exchanger
B sends the information, then uses the exchanger to exchange (it might wait for A, and that's fine).
Once the exchange is made, it sends the information.
A uses the exchanger with a timeout of 0, which means that if B isn't already waiting we skip this exchange; if B is waiting, the exchange is made, A continues with its job, and B can now send the information.
Information that arrives while B is busy will be ignored (the exchange in A with timeout 0 will just throw an exception if B is busy; make sure you catch it).
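A small sketch of that scheme (method names and the TCP send are placeholders):

import java.util.concurrent.Exchanger;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

private final Exchanger<Object> exchanger = new Exchanger<>();

// A: hand the latest data to B only if B is already waiting; never block.
public void offerToB(Object data) {
    try {
        exchanger.exchange(data, 0, TimeUnit.MILLISECONDS); // timeout 0: skip if B is busy
    } catch (TimeoutException e) {
        // B was busy sending; this update is simply dropped
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}

// B: block until A hands over the next piece of data, then send it.
public void sendLoop() throws InterruptedException {
    while (true) {
        Object data = exchanger.exchange(null); // waits for A's next exchange
        // ... send 'data' to the TCP clients ...
    }
}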
The most elegant way is to use a message queue. A writes data to the queue as soon as it is available; B subscribes to the queue and is notified whenever new data is available. A message queue handles everything for you.
However, you should be more explicit: should B be notified for each and every message? What happens if an update is lost?