Akka: unexpected mailbox filling up

I have this simple code:
List<ActorRef> actors = new ArrayList<>();
for (int i = 0; i < ACTOR_COUNT; i++) {
actors.add(system.actorOf(...));
}
for (ActorRef actor : actors) {
system.scheduler().schedule(FiniteDuration.create(0, TimeUnit.MILLISECONDS),
FiniteDuration.create(1000, TimeUnit.MILLISECONDS), actor, "Run", system.dispatcher(), null);
}
It creates a number of actors and then schedules a recurring "Run" message for each of them. The actors themselves are responsible for querying an MQ and then processing a message.
When ACTOR_COUNT > 30, everything is fine. Otherwise, we have a memory leak: instances of akka.dispatch.Envelope carrying the message "Run" pile up and can't be garbage collected.
It's pretty weird, because more actors means more messages (1 per second for each of them), yet the build-up unexpectedly STOPS when there are more actors/messages.
The time interval (1000 ms) doesn't really affect the situation; it just makes the build-up slower or faster.
Could you please explain this behavior for me?
Thank you.
UPDATE
Here is a dummy actor, which can help to isolate a problem.
public class MessageQueueTestActor extends UntypedActor {
private static final Logger log = LoggerFactory.getLogger(MessageQueueTestActor.class);
@Override
public void onReceive(Object message) throws Exception {
Thread.sleep(3000);
}
}
The problem is reproduced with ACTOR_COUNT = 5. Now it's obvious that when the actor's sleep time is greater than the scheduler interval, envelopes pile up. If I reduce the sleep time from 3000 ms to 500 ms, the problem is gone.
But the messages also become available to the garbage collector if I increase the number of actors to 30 (with the same sleep time of 3000 ms). Why? It looks like something in Akka starts working differently after that threshold.

This is a «debug my code» question, not sure whether it should be here, but I'll answer in any case.
The scheduler does not enqueue the message into the actor’s mailbox itself, it uses the given dispatcher to do that. Since you block the threads in the default dispatcher and also use that to do the enqueueing, there is a point at which messages from the scheduler do not reach the mailboxes anymore (I assume that your default dispatcher has 30 threads). More correctly: they reach it one by one while the actors process up to five messages during each turn they get.
So, nothing is GC-ed, you just enqueue a different thing (Runnable) at a different place (default dispatcher). Your program will never work sustainably if the processing time is greater than the tick period.
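To make the last point concrete, here is a minimal back-of-the-envelope sketch in plain Java (no Akka involved). It assumes a single actor that always has a thread available, ticks arriving every tickMs starting at t = 0, and a fixed processMs per message. The class and method names are hypothetical.

```java
final class TickBacklog {
    /**
     * Counts the messages that have arrived by durationMs but are not yet
     * fully processed (still queued or in flight).
     */
    static int unfinishedAfter(long durationMs, long tickMs, long processMs) {
        long freeAt = 0;    // time at which the actor finishes the work accepted so far
        int unfinished = 0;
        for (long arrival = 0; arrival <= durationMs; arrival += tickMs) {
            // A message starts once it has arrived and all earlier ones are done.
            freeAt = Math.max(freeAt, arrival) + processMs;
            if (freeAt > durationMs) {
                unfinished++;
            }
        }
        return unfinished;
    }

    public static void main(String[] args) {
        // 1000 ms tick, 3000 ms sleep, observed after one minute
        System.out.println(unfinishedAfter(60_000, 1000, 3000));
        // 1000 ms tick, 500 ms sleep, observed after one minute
        System.out.println(unfinishedAfter(60_000, 1000, 500));
    }
}
```

With the numbers from the question (1000 ms tick, 3000 ms sleep) this model leaves 41 unfinished messages after one minute, and the count keeps growing with time. With a 500 ms sleep it stays at 1, which is just the message that arrived at the moment of observation.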

Related

Is there a way to pause akka actor for X time

We have an Akka actor that is reading a big folder with many files.
It reads them for the processing of other actors.
It seems that it reads them too fast, and eventually we run into OutOfMemoryException.
We would like to know if we can somehow pause/sleep it for X time.

You can pause it yourself. By that I mean you should have the actor stop processing files and set up a timer to send itself a resume message after X seconds/minutes.
But... you'll never find the right amount of time (X above); it'll always be too long or too short. This kind of problem is why akka-stream exists. I would recommend having a look at it, because it is the only good solution to this kind of problem.
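The underlying idea is backpressure: the reader is only allowed to run ahead of the processors by a bounded amount, so no value of X has to be guessed. The following is a minimal plain-JDK sketch of that idea using a bounded BlockingQueue. It is not Akka Streams code, and the class name, item type, and capacity are assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedHandoff {
    /**
     * Pushes itemCount items through a queue holding at most capacity items
     * and returns how many the consumer processed.
     */
    static int run(int itemCount, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger processed = new AtomicInteger();

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    queue.take();               // stands in for "process one file"
                    processed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 0; i < itemCount; i++) {
                queue.put(i);                   // blocks while the queue is full
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 4));
    }
}
```

However fast the producer reads, at most `capacity` items are waiting in the queue at any time, because put() simply blocks until the consumer has caught up.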

For the out-of-memory problem you can increase the queue size of the actor's mailbox.
And for sleeping/pausing you can use the scheduler to send a message to the actor after a specific amount of time.
There's no need to explicitly cause an actor to sleep: using loop and react for each actor means that the underlying thread pool will have waiting threads whilst there are no messages for the actors to process.
In the case that you want to schedule events for your actors to process, this is pretty easy using a single-threaded scheduler from the java.util.concurrent utilities:
object Scheduler {
  import java.util.concurrent.Executors
  import java.util.concurrent.TimeUnit

  // A single background thread runs all scheduled callbacks.
  private lazy val sched = Executors.newSingleThreadScheduledExecutor()

  // Runs f once, `time` milliseconds from now.
  def schedule(f: => Unit, time: Long): Unit = {
    sched.schedule(new Runnable {
      def run(): Unit = f
    }, time, TimeUnit.MILLISECONDS)
  }
}

calling an akka actor from its outside is very slow

I'm new to Akka. I made a ping-pong example between two actors (Ping Actor and Pong Actor) on two nodes on my local machine and then tested them in 2 different ways. Basically, Ping Actor sends a message containing System.nanoTime() to Pong Actor. After getting the message, Pong Actor sends the received nano time back to Ping Actor.
Then I can calculate the taken time of a ping-pong round.
Way 1: main <-> pingActor <-> pongActor
Main1:
for (int i = 0; i < 10; i++) {
pingActor.tell("start", null);
}
PingActor:
public Receive createReceive() {
return receiveBuilder()
.matchEquals("start", start -> {
pongActor.tell(System.nanoTime(), self());
})
.match(String.class, aboveTime -> {
long timeDiff = System.nanoTime() - Long.parseLong(aboveTime);
System.out.println(timeDiff);
})
.build();
}
Way 2: main -> pingActor <-> pongActor
Main2:
pingActor.tell("start", null);
PingActor:
public Receive createReceive() {
return receiveBuilder()
.matchEquals("start", start -> {
pongActor.tell(System.nanoTime(), self());
})
.match(String.class, aboveTime -> {
long timeDiff = System.nanoTime() - Long.parseLong(aboveTime);
System.out.println(timeDiff);
pongActor.tell(System.nanoTime(), self());
})
.build();
}
My test result shows that the 1st way is much slower (100 ms average) than the 2nd way (1 ms average). I need an explanation: how can I make a call from outside an actor as fast as one from inside it?
Thank you

The comparison between way 1 and way 2 is not fair.
way1
Ping actor has more than one message in flight at the same time. Because Main1 sends faster than the messages can be handled, some messages queue up in the mailbox, and as a result some responses from Pong also queue up in Ping's mailbox. So the difference you measure here is not the real speed of message transfer; it includes the time spent waiting in the queue.
You need to be aware that an actor processes one message at a time on one thread, unless you increase nr-of-instances. Also, Akka's actor model does not use coroutines; under the hood it still runs on a thread pool. So in a more complex scenario you may see more delay, and you need to take care to keep an Akka-based application non-blocking.
way2
From the explanation of way 1, you can see that way 2 in fact runs sequentially, so the difference reflects the real transfer speed between actors and does not include time buffered in a queue. That is why it is quicker than way 1. If your Main2 also used a for loop, the speed would drop too; conversely, if you add some delay in the for loop, such as Thread.sleep(1000), the difference measured in way 1 will become smaller. However, I think your test result for way 2 is still a little slow; on my local machine, the message transfer speed is 50,000+ per second.
As for how to make a call from outside of an actor as fast as from inside it:
This is really a big question. What you do is the right way, but you need to design your actors carefully to make full use of the underlying thread pool, and you need to understand how Akka's actors run. Then you can make Akka powerful in your application.

Managing Java threads in a CPU instruction pipeline simulator

I have implemented a 5-Stage CPU instruction pipeline simulator in Java using multi-threading.
Each stage is a thread that mainly performs the 3 functions below. There is also a queue (of capacity 1) between every two stages.
Receive from the previous stage.
Process i.e. perform its main responsibility.
Forward to the next stage.
@Override
public void run() {
while (!(latchQueue.isEmpty())) {
fetch();
process();
forward();
}
}
The simulation works fine. This is where I'm stuck: I want to be able to simulate only a specified number of clock cycles, so the simulator should stop/pause once it has reached the specified number of cycles.
As of now, I start all 5 threads and let them simulate the processing of all the instructions rather than limiting it by clock cycles.
How can I accomplish this? Do I need to pause the threads when the specified number of clock cycles has been reached? If so, how can I gracefully handle suspending/stopping the threads? Please help me choose the best possible approach.
Thanks in advance :)

You are already using some concurrent queue to communicate between the threads (exactly how it works isn't clear because your code example is quite incomplete).
So you can count cycles at the first stage and use that same mechanism to communicate: push a sentinel object, which represents "time to stop/pause this thread", onto the queue for the first stage. When a stage processes it, the stage pauses (and still forwards the sentinel to the next stage, so all stages progressively shut down). For example, you could extend the type of the objects passed in your queue so that the hierarchy contains both real payload objects (e.g., decoded instructions, etc.) and "command objects" like this stop/pause sentinel.
Another asynchronous solution would be to Thread.interrupt each thread and add an interrupt check in your processing loop - that's mostly to gracefully shut down, and not so much to support a "pause" functionality.
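The sentinel approach can be sketched roughly as follows. This is a minimal illustration, not the asker's simulator: it assumes one instruction is issued per clock cycle, uses plain Object payloads, and the names SentinelPipeline, STOP, stage, and run are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class SentinelPipeline {
    /** Hypothetical command object meaning "stop after forwarding me". */
    static final Object STOP = new Object();

    /** A stage that forwards everything it receives and exits after forwarding STOP. */
    static Thread stage(BlockingQueue<Object> in, BlockingQueue<Object> out) {
        return new Thread(() -> {
            try {
                while (true) {
                    Object item = in.take();
                    out.put(item);      // a real stage would process payloads here
                    if (item == STOP) {
                        return;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Issues `cycles` instructions, then STOP; returns what left the last stage. */
    static List<Object> run(int stageCount, int cycles) {
        List<BlockingQueue<Object>> latches = new ArrayList<>();
        for (int i = 0; i <= stageCount; i++) {
            latches.add(new ArrayBlockingQueue<>(1));   // capacity 1, as in the question
        }
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < stageCount; i++) {
            threads.add(stage(latches.get(i), latches.get(i + 1)));
        }
        // The feeder counts cycles and enqueues the sentinel once the budget is used up.
        threads.add(new Thread(() -> {
            try {
                for (int i = 0; i < cycles; i++) {
                    latches.get(0).put("instr-" + i);
                }
                latches.get(0).put(STOP);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        threads.forEach(Thread::start);

        List<Object> retired = new ArrayList<>();
        try {
            BlockingQueue<Object> last = latches.get(stageCount);
            for (Object item = last.take(); item != STOP; item = last.take()) {
                retired.add(item);
            }
            for (Thread t : threads) {
                t.join();               // every stage has exited once STOP passed through
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return retired;
    }

    public static void main(String[] args) {
        System.out.println(run(5, 20).size());
    }
}
```

Because the sentinel travels through the same queues as the payloads, every instruction issued before it still drains through all stages, and each stage shuts down only after it has forwarded the sentinel.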

Will the following work?
Share the following class, CyclesManager, between all the threads representing your stages. It has a tryReserve method: getting true from it means the thread has got enough "clock cycles" for its next run; getting false means there are not enough cycles left. The class is thread-safe.
After getting false, your thread should probably just stop (i.e., by returning from run()): there is no way it can get enough cycles (given your requirements, as I understood them) until the whole session is run again.
import java.util.concurrent.atomic.AtomicInteger;

class CyclesManager {
    private final AtomicInteger cycles;

    CyclesManager(int initialTotalCycles) {
        if (initialTotalCycles < 0)
            throw new IllegalArgumentException("Negative initial cycles: " + initialTotalCycles);
        cycles = new AtomicInteger(initialTotalCycles);
    }

    /**
     * Tries to reserve given nr of cycles from available total nr of cycles. Total nr is decreased accordingly.
     * Method is thread-safe: total nr of cycles stays consistent if called from several threads concurrently.
     *
     * @param cyclesToReserve how many cycles we want
     * @return {@code true} if cycles are ours, {@code false} if not -- there's not enough left
     */
    boolean tryReserve(int cyclesToReserve) {
        while (true) {
            int currentCycles = cycles.get();
            if (currentCycles < cyclesToReserve)
                return false;
            // Retry if another thread changed the total between get() and compareAndSet().
            if (cycles.compareAndSet(currentCycles, currentCycles - cyclesToReserve))
                return true;
        }
    }
}

Long delay between Akka actors

I'm consistently seeing very long delays (60+ seconds) between two actors, from the time at which the first actor sends a message for the second, and when the second actor's onReceive method is actually called with the message. What kinds of things can I look for to debug this problem?
Details
Each instance of ActorA sends one message to ActorB with ActorRef.tell(Object, ActorRef). I collect a millisecond timestamp (with System.currentTimeMillis()) right after calling the tell method in ActorA, and another one at the start of ActorB's onReceive(Object). The interval between these timestamps is consistently 60 seconds or more. Specifically, when plotted over time, this interval follows a rough sawtooth pattern that ranges from more than 60 seconds to almost 120 seconds, as shown in the graph below.
These actors are early in the data flow of the system; several other actors follow after ActorB. This large gap only occurs between these two specific actors. The gap between other pairs of adjacent actors is typically less than a millisecond, occasionally a few tens of milliseconds. Additionally, the actual time spent inside any given actor is never more than a second.
Generally, each actor in the system only passes a single message to another actor. One of the actors (subsequent to ActorB) sends a single message to each of a few different actors, and a small percentage (less than 0.1%) of the time, certain actors will send multiple messages to the same subsequent actor (i.e., multiple instances of the subsequent actor will be demanded). When this occurs, the number of multiple messages is typically on the order of a dozen or less.
Can this be explained (explicitly) by the normal reactive nature of Akka? Does it indicate a problem with the way work is distributed or the way the actors are configured? Is there something that can explicitly block a particular actor from spinning up? What other information should I collect or look at to understand the source of this, or to understand whether or not it is actually a problem?

You have a limited thread pool. If your Actors block, they still take up space in the thread pool. New threads will not be created if your thread pool is saturated.
You may want to configure
core-pool-size-factor,
core-pool-size-min, and
core-pool-size-max.
If you expect certain actions to block, you can instead wrap them in Future { blocking { ... } } and register a callback. But it's better to use asynchronous, non-blocking calls.
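The saturation effect can be reproduced with a plain java.util.concurrent pool. The sketch below is only an analogy for a dispatcher's thread pool, not Akka configuration, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SaturatedPool {
    /**
     * Submits `blockers` tasks that block until released, then one quick task.
     * Returns true if the quick task finishes within waitMs.
     */
    static boolean quickTaskRuns(int poolSize, int blockers, long waitMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockers; i++) {
                pool.submit(() -> {
                    release.await();    // simulates a blocking call inside an actor
                    return null;
                });
            }
            Future<String> quick = pool.submit(() -> "done");
            quick.get(waitMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;               // every thread was busy blocking
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();        // unblock the blockers so the pool can shut down
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(quickTaskRuns(2, 2, 200));   // both threads blocked
        System.out.println(quickTaskRuns(2, 1, 2000));  // one thread still free
    }
}
```

With two threads and two blocked tasks the quick task just sits in the queue, which is the same kind of delay as a message waiting for a free dispatcher thread. As soon as one thread is left free, the quick task runs immediately.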

Java execute task with a number of retries and a timeout

I'm trying to create a method that executes a given task in a maximum amount of time. If it fails to finish in that time, it should be retried a number of times before giving up. It should also wait a number of seconds between each try. Here's what I've come up with, and I'd like some critiques on my approach. Is there a simpler way to do this using the ScheduledExecutorService, or does my way of doing this suffice?
public static <T> T execute(Callable<T> task, int tries, int waitTimeSeconds, int timeout)
throws InterruptedException, TimeoutException, Exception {
Exception lastThrown = null;
for (int i = 0; i < tries; i++) {
try {
final Future<T> future = new FutureTask<T>(task);
return future.get(timeout, TimeUnit.SECONDS);
} catch (TimeoutException ex) {
lastThrown = ex;
} catch (ExecutionException ex) {
lastThrown = (Exception) ex.getCause();
}
Thread.sleep(TimeUnit.SECONDS.toMillis(waitTimeSeconds));
}
if (lastThrown == null) {
lastThrown = new TimeoutException("Reached max tries without being caused by some exception. " + task.getClass());
}
throw lastThrown;
}

I think, but it's my opinion, that if you are scheduling network related tasks, you should not retry but eventually run them in parallel. I describe this other approach later.
Regarding your code, you should pass the task to an executor, or the FutureTask to a thread. It will not spawn a thread or execute by itself. If you have an executor (see ExecutorService), you don't even need a FutureTask; you can simply submit the Callable and obtain a Future.
So, given that you have an ExecutorService, you can call :
Future<T> future = yourExecutor.submit(task);
Future.get(timeout) will wait for that timeout and eventually return with TimeoutException even if the task has never started at all, for example if the Executor is already busy doing other work and cannot find a free thread. So, you could end up trying 5 times and waiting for seconds without ever giving the task a chance to run. This may or may not be what you expect, but usually it is not. Maybe you should wait for it to start before giving it a timeout.
Also, you should explicitly cancel the Future when get throws TimeoutException; otherwise the task may keep running, since neither the documentation nor the code says it will stop when a get with timeout fails.
Even if you cancel it, unless the Callable has been "properly written", it could keep running for some time. There is nothing you can do about it in this part of the code; just keep in mind that no thread can "really stop" what another thread is doing in Java, and for good reasons.
However, I suppose your tasks will mostly be network related, so they should react correctly to a thread interruption.
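The points above (submit the task to an executor, cancel the Future when the timed get gives up) can be sketched as follows. This is one possible shape, not a definitive implementation: passing the executor as a parameter, using milliseconds instead of seconds, and the names RetryingCall and demo are assumptions made to keep the example short and quick to run.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class RetryingCall {
    /** Runs task on executor, retrying after a timeout or a failure, up to `tries` attempts. */
    static <T> T execute(ExecutorService executor, Callable<T> task,
                         int tries, long waitMillis, long timeoutMillis) throws Exception {
        Exception lastThrown = new TimeoutException("no attempts were made");
        for (int i = 0; i < tries; i++) {
            Future<T> future = executor.submit(task);   // the executor actually runs the task
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException ex) {
                future.cancel(true);                    // interrupt the attempt that overran
                lastThrown = ex;
            } catch (ExecutionException ex) {
                Throwable cause = ex.getCause();
                lastThrown = cause instanceof Exception ? (Exception) cause : ex;
            }
            if (i < tries - 1) {
                Thread.sleep(waitMillis);               // pause only between attempts
            }
        }
        throw lastThrown;
    }

    /** Demo: the first two attempts fail, the third succeeds. */
    static String demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger attempts = new AtomicInteger();
        try {
            String result = execute(executor, () -> {
                if (attempts.incrementAndGet() < 3) {
                    throw new IllegalStateException("transient failure");
                }
                return "ok";
            }, 5, 10, 1000);
            return result + " after " + attempts.get() + " attempts";
        } catch (Exception e) {
            return "failed: " + e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());     // prints "ok after 3 attempts"
    }
}
```

Note that the timeout still starts counting at submit time, so on a busy executor an attempt can time out before it ever starts, as described above.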
I usually use a different strategy in situations like this:
1. I would write public static <T> T execute(Callable<T> task, int maxTries, int timeout), taking the task, the max number of tries (potentially 1), and the max total timeout ("I want an answer in max 10 seconds, no matter how many times you try, 10 seconds or nothing").
2. I start the task by giving it to an executor, and then call future.get(timeout/tries).
3. If I receive a result, I return it. If I receive an exception, I will try again (see later).
4. If however I get a timeout, I DON'T cancel the future; instead I save it in a list.
5. I check if too much time has passed, or there have been too many retries. In that case I cancel all the futures in the list and throw an exception, return null, whatever.
6. Otherwise, I loop and schedule the task again (in parallel with the first one).
7. See point 2.
8. If I have not received a result, I check the future(s) in the list; maybe one of the previously spawned tasks managed to do it.
Assuming your tasks can be executed more than once (as I suppose they can, otherwise there is no way to retry), for network stuff I found this solution to work better.
Suppose your network is actually very busy and you ask for a network connection, allowing 20 retries of 2 seconds each. Since your network is busy, none of the 20 retries manages to get the connection in 2 seconds. However, a single execution lasting 40 seconds may manage to connect and receive data. It's like a person pressing F5 compulsively on a page when the net is slow: it will not do any good, since every time the browser has to start from the beginning.
Instead, I keep the various futures running; the first one that manages to get the data will return a result and the others will be stopped. If the first one hangs, the second one will work, or maybe the third one.
Compared with a browser, it is like opening another tab and retrying to load the page there without closing the first one. If the net is slow, the second one will take some time, but without stopping the first one, which will eventually load properly. If instead the first tab was hung, the second one will load rapidly. Whichever loads first, we can close the other tab.
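A rough sketch of this overlapping strategy, using the JDK's ExecutorCompletionService to wait for whichever attempt finishes first. It simplifies the steps above (the total timeout is split into equal slices, one per try), and the names OverlappingRetries and demo, the millisecond units, and the pool size are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class OverlappingRetries {
    /** Starts up to maxTries overlapping attempts; the first one to succeed wins. */
    static <T> T execute(ExecutorService pool, Callable<T> task,
                         int maxTries, long totalTimeoutMillis) throws Exception {
        CompletionService<T> attempts = new ExecutorCompletionService<>(pool);
        List<Future<T>> started = new ArrayList<>();
        Exception lastThrown = new TimeoutException("no attempt finished in time");
        long slice = totalTimeoutMillis / maxTries;
        try {
            for (int i = 0; i < maxTries; i++) {
                started.add(attempts.submit(task));
                // Wait one slice for ANY attempt, old or new, to finish.
                Future<T> finished = attempts.poll(slice, TimeUnit.MILLISECONDS);
                if (finished == null) {
                    continue;           // timed out: leave it running, start another
                }
                try {
                    return finished.get();
                } catch (ExecutionException ex) {
                    Throwable cause = ex.getCause();
                    lastThrown = cause instanceof Exception ? (Exception) cause : ex;
                }
            }
            throw lastThrown;
        } finally {
            for (Future<T> f : started) {
                f.cancel(true);         // stop whatever is still running
            }
        }
    }

    /** Demo: the first attempt hangs, the second answers quickly. */
    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        AtomicInteger calls = new AtomicInteger();
        try {
            return execute(pool, () -> {
                if (calls.incrementAndGet() == 1) {
                    Thread.sleep(10_000);   // simulates a hung connection
                }
                return "ok";
            }, 3, 900);
        } catch (Exception e) {
            return "failed: " + e;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo the hung first attempt is never waited for beyond its slice: the second attempt returns while the first is still sleeping, and the finally block then cancels the leftover attempt. The pool needs at least as many threads as overlapping attempts, otherwise later attempts just queue behind the hung one.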

The thread on which your execute is called will block for that whole time. Not sure if this is acceptable for you. Basically, for these types of tasks, ScheduledExecutorService is best. You can schedule a task and specify the timings. Take a look at ScheduledThreadPoolExecutor.
