Akka DeathWatch - Find reason for termination

Question: How can I find out if an actor was stopped gracefully (e.g. through its parent stopping) or through an exception?
Context: With the following deathwatch setup I only get the Terminated.class message in the good test, where I explicitly call stop. I expected a Terminated.class message only in the bad case. Using a supervisorStrategy that stops the child that threw an exception would make no difference, as this leads to the behaviour of the good test. And there I can't find a way to decide if it was caused by an exception or not.
My test setup is the following:
DeathWatch
public class DeathWatch extends AbstractActor {
    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .matchAny(this::logTerminated)
            .build();
    }
    private <P> void logTerminated(final P p) {
        log.info("terminated: {}", p);
    }
}
Actor
public class MyActor extends AbstractActor {
    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .matchEquals("good", s -> { getContext().stop(self()); })
            .matchEquals("bad", s -> { throw new Exception("baaaad"); })
            .build();
    }
}
Test
public class Test {
    private TestActorRef<Actor> actor;
    @Before
    public void setUp() throws Exception {
        actor = TestActorRef.create(ActorSystem.create(), Props.create(MyActor.class), "actor");
        TestActorRef.create(ActorSystem.create(), Props.create(DeathWatch.class), "deathwatch").watch(actor);
    }
    @Test
    public void good() throws Exception {
        actor.tell("good", ActorRef.noSender());
    }
    @Test
    public void bad() throws Exception {
        actor.tell("bad", ActorRef.noSender());
    }
}
Update: Adding the following supervisor leads to "terminated" being logged a second time, but yields no further context information.
public class Supervisor extends AbstractActor {
    private final ActorRef child;
    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(String.class, s -> child.tell(s, getSelf()))
            .build();
    }
    @Override
    public SupervisorStrategy supervisorStrategy() {
        return new OneForOneStrategy(DeciderBuilder.match(Exception.class, e -> stop()).build());
    }
}

The Terminated message is behaving as expected. From the documentation:
In order to be notified when another actor terminates (i.e. stops permanently, not temporary failure and restart), an actor may register itself for reception of the Terminated message dispatched by the other actor upon termination.
And here:
Termination of an actor proceeds in two steps: first the actor suspends its mailbox processing and sends a stop command to all its children, then it keeps processing the internal termination notifications from its children until the last one is gone, finally terminating itself (invoking postStop, dumping mailbox, publishing Terminated on the DeathWatch, telling its supervisor)....
The postStop() hook is invoked after an actor is fully stopped.
The Terminated message isn't reserved for the scenario in which an actor is stopped due to an exception or error; it comes into play whenever an actor is stopped, including scenarios in which the actor is stopped "normally." Let's go through each scenario in your test case:
1. "Good" case without an explicit supervisor: MyActor stops itself, calls postStop (which isn't overridden, so nothing happens in postStop), and sends a Terminated message to the actor that's watching it (your DeathWatch actor).
2. "Good" case with an explicit supervisor: same as 1.
3. "Bad" case without an explicit supervisor: The default supervision strategy is used, which is to restart the actor. A restart does not trigger the sending of a Terminated message.
4. "Bad" case with an explicit supervisor: the supervisor handles the Exception, then stops MyActor, again launching the termination chain described above, resulting in a Terminated message sent to the watching actor.
So how does one distinguish between the "good" and "bad" cases when an actor is stopped? Look at the logs. The SupervisorStrategy, by default, logs Stop failures at the ERROR level.
When an exception is thrown, if you want to do more than log the exception, consider restarting the actor instead of stopping it. A restart, unlike a stop, always indicates that something went wrong (as mentioned earlier, a restart is the default strategy when an exception is thrown). You could place post-exception logic inside the preRestart or postRestart hook.
Note that when an exception is thrown while an actor is processing a message, that message is lost, as described here. If you want to do something with that message, you have to catch the exception.
If you have an actor that you want to inform whenever an exception is thrown, you can send a message to this monitor actor from within the parent's supervisor strategy (the parent of the actor that can throw an exception). This assumes that the parent actor has a reference to this monitor actor. If the strategy is declared inside the parent and not in the parent's companion object, then the body of the strategy has access to the actor in which the exception was thrown (via sender). ErrorMessage below is a made-up class:
override val supervisorStrategy =
  OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
    case t: Throwable =>
      val problemActor = sender
      monitorActor ! ErrorMessage(t, problemActor)
      Stop
  }

Related

How to make the ChannelOutboundHandler.write() method asynchronous?

My project uses the Java Netty framework to transfer messages. The application is both a client and a server. When we send a message to the remote server, we want to do some processing of this message. I use ChannelOutboundHandler.write() in my project to achieve this purpose:
public class MyOutBoundHandler extends ChannelOutboundHandlerAdapter {
    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {
        process((ByteBuf) msg); // do some processing of this message
        ctx.write(msg, promise);
    }
}
I found that when process((ByteBuf) msg) throws an exception, it seems to block, and the next call, ctx.write(msg, promise), is not executed. How can I make the two asynchronous, so that process((ByteBuf) msg) does not affect writing the message to the remote server?

If 'ctx.write(msg, promise)' does not rely on the result of the 'process((ByteBuf) msg)', you can just wrap the 'process((ByteBuf) msg)' in a runnable task and submit the task to the ThreadPool.
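As a rough illustration of that suggestion, here is a minimal plain-Java sketch that needs no Netty on the classpath. The names OffloadSketch, process, write and the written list are hypothetical stand-ins, not Netty API: written plays the role of the pipeline, and the line that appends to it stands in for ctx.write(msg, promise). One thing to keep in mind with real Netty code is that a ByteBuf is reference-counted and may be released once it has been written, so the background task should work on a copy (as the sketch does) or on a retained buffer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class OffloadSketch {
    // Hypothetical stand-in for the pipeline: records what was "written".
    static final List<String> written = new ArrayList<>();

    // Hypothetical processing step that fails for empty messages.
    static void process(byte[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("empty message");
        }
    }

    // Same shape as write(): hand the processing to the pool, then write right away.
    static void write(ExecutorService pool, byte[] msg) {
        byte[] copy = msg.clone(); // the task works on its own copy of the data
        pool.execute(() -> {
            try {
                process(copy);
            } catch (RuntimeException e) {
                // A failure is only reported; it cannot prevent the write below.
                System.err.println("processing failed: " + e.getMessage());
            }
        });
        // Stands in for ctx.write(msg, promise).
        written.add(new String(msg, StandardCharsets.UTF_8));
    }

    // Writes one message whose processing fails and one whose processing succeeds.
    static List<String> demo() {
        written.clear();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            write(pool, new byte[0]);
            write(pool, "hello".getBytes(StandardCharsets.UTF_8));
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(written);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both messages end up in written, including the one whose processing failed, because the failure happens on a pool thread and is caught there.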

found a problem that when the process((ByteBuf) msg) method throws an exception, it will cause blocking, and the next method ctx.write(msg, promise) will not be executed
Unless you are executing blocking code, Netty will not block.
The behavior you are describing is not blocking; it is just how control flow in Java works. If an exception is thrown, none of the subsequent code is executed unless you explicitly catch the exception and resume.
Ideally, in your case, you want to add a try-catch block around the call to process() and, if it fails, fail the promise using promise.tryFailure().
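The control flow can be sketched without Netty. In the following plain-Java example, CompletableFuture is used as a stand-in for ChannelPromise (completeExceptionally plays the role of promise.tryFailure, and complete stands in for a successful ctx.write); WriteGuardSketch and process are hypothetical names. It is only meant to show that the code after process() runs again once the exception is caught.

```java
import java.util.concurrent.CompletableFuture;

final class WriteGuardSketch {
    // Hypothetical processing step that fails for empty messages.
    static void process(String msg) {
        if (msg.isEmpty()) {
            throw new IllegalStateException("empty message");
        }
    }

    // Stand-in for write(ctx, msg, promise).
    static void write(String msg, CompletableFuture<String> promise) {
        try {
            process(msg);
        } catch (RuntimeException e) {
            // Analogous to promise.tryFailure(e): report the failure to whoever
            // is waiting on the promise, and do not write the message.
            promise.completeExceptionally(e);
            return;
        }
        // Stands in for ctx.write(msg, promise) completing successfully.
        promise.complete(msg);
    }

    public static void main(String[] args) {
        CompletableFuture<String> ok = new CompletableFuture<>();
        write("ping", ok);
        System.out.println("ok failed? " + ok.isCompletedExceptionally());

        CompletableFuture<String> bad = new CompletableFuture<>();
        write("", bad);
        System.out.println("bad failed? " + bad.isCompletedExceptionally());
    }
}
```

If the message should be written even when processing fails, the catch block could log the exception and fall through to the write instead of failing the promise.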

You can just add a listener to the ChannelPromise before the process method is called.
Refer to the following code:
promise.addListener(new ChannelFutureListener() {
    @Override
    public void operationComplete(ChannelFuture future) throws Exception {
        Throwable cause = future.cause();
        if (cause != null) {
            // An exception thrown by process() ends up here; call ctx.write(msg)
            // to keep propagating the write event through the pipeline.
            ctx.write(msg);
        } else {
            // Netty notifies the listener here once msg has been written to the socket successfully.
        }
    }
});
process(msg);

How to detect akka actor termination is due to system shutdown and avoid restarting it

I have a Spring application that makes use of a small Akka actor system (using Java), where I have a MasterActor that extends Akka's AbstractActor that initialises a Router and sets up a few worker actors. It also watches the lifecycle of the workers. I want to restart a Worker actor if it happens to die because of some Exception.
public MasterActor(ActorPropsFactory actorPropsFactory) {
    this.actorPropsFactory = actorPropsFactory;
    int workers = Runtime.getRuntime().availableProcessors() - 1;
    List<Routee> routees = Stream.generate(this::createActorRefRoutee).limit(workers).collect(Collectors.toList());
    this.router = new Router(new ConsistentHashingRoutingLogic(getContext().system()), routees);
}
private ActorRefRoutee createActorRefRoutee() {
    ActorRef worker = getContext().actorOf(actorPropsFactory.create(getWorkerActorClass()));
    getContext().watch(worker);
    return new ActorRefRoutee(worker);
}
private void route(Object message, Supplier<String> routingKeySupplier) {
    String routingKey = routingKeySupplier.get();
    RouterEnvelope envelope = new ConsistentHashingRouter.ConsistentHashableEnvelope(message, routingKey);
    router.route(envelope, getSender());
}
@Override
public Receive createReceive() {
    return receiveBuilder()
        .match(
            EventMessage.class,
            message -> this.route(message, () -> message.getEvent().getId().toString()))
        .match(
            Terminated.class,
            message -> {
                logger.info("WorkerActor {} terminated, restarting", message.getActor());
                // todo: detect whether the system is shutting down before restarting the actor
                router = router.removeRoutee(message.actor())
                    .addRoutee(createActorRefRoutee());
            })
        .build();
}
The problem I am having is that if the Spring application fails to start up (for example, it fails to connect to the database, or some credentials are incorrect), I receive the Terminated message from all workers, and the Master actor tries to start new ones, which also get terminated immediately, resulting in an endless loop.
What is the correct way to detect such a scenario? Is there a way for the Master actor to detect that the actor system is shutting down so that the workers are not restarted again?

Can't you just set up a supervision strategy for your Router so you can inspect the type of Exception that causes the failure? This way you also don't need to restart your workers manually.
EDIT:
You set up SupervisorStrategy like this:
private static SupervisorStrategy strategy =
    new OneForOneStrategy(
        10,
        Duration.ofMinutes(1),
        DeciderBuilder.match(ArithmeticException.class, e -> SupervisorStrategy.resume())
            .match(NullPointerException.class, e -> SupervisorStrategy.restart())
            .match(IllegalArgumentException.class, e -> SupervisorStrategy.stop())
            .matchAny(o -> SupervisorStrategy.escalate())
            .build());
final ActorRef router =
    system.actorOf(
        new RoundRobinPool(5).withSupervisorStrategy(strategy).props(Props.create(Echo.class)));
You can read more about it here:
Router Actor supervision
Fault tolerance in Akka

StreamResourceWriter error handling via ErrorHandler interface

I have a FileCreator class that implements StreamResourceWriter interface and MainErrorHandler class that implements ErrorHandler. I'm using the MainErrorHandler class as a centralized Exception handler in my project which mostly logs the exception and shows a notification to the user. The problem is that StreamResourceWriter.accept() method runs in a non UI thread and when an Exception is thrown it is directed to the ErrorHandler which then fails to show a notification due to "IllegalStateException: UI instance is not available". Is there a way to show a notification window to the user from MainErrorHandler when FileCreator throws an error in accept() method?
Below FileCreator snippet.
public class FileCreator implements StreamResourceWriter {
    @Override
    public void accept(OutputStream stream, VaadinSession session) throws IOException {
        // Run in a non ui thread.
        // Writes to OutputStream but an Exception might be thrown during this process
    }
}
Below MainErrorHandler snippet.
/**
 * Centralized error handler
 */
public class MainErrorHandler implements ErrorHandler {
    private static final Logger log = LoggerFactory.getLogger(MainErrorHandler.class);
    @Override
    public void error(ErrorEvent event) {
        log.error("Error occurred", event.getThrowable());
        // Cannot show a notification if ErrorEvent came from FileCreator.
        // Will get an IllegalStateException: UI instance is not available.
        Notification.show("Error occurred");
        // Tried UI.getCurrent but it returns null if ErrorEvent came from FileCreator.
        UI.getCurrent();
    }
}
Using Vaadin 13.0.1.
Edit
One way to solve this issue is to pass UI reference to FileCreator directly. Below an example.
public class FileCreator implements StreamResourceWriter {
    private UI ui;
    // Pass UI reference directly
    public FileCreator(UI ui) {
        this.ui = ui;
    }
    @Override
    public void accept(OutputStream stream, VaadinSession session) throws IOException {
        try {
            // Run in a non ui thread.
            // Writes to OutputStream but an Exception might be thrown during this process
        } catch (Exception e) {
            // I don't like this since I have to catch all exceptions and call the ErrorHandler directly with a UI reference.
            // Also, what if the ErrorHandler is changed somewhere in the code and is no longer of type MainErrorHandler?
            ((MainErrorHandler) VaadinSession.getCurrent().getErrorHandler()).error(e, ui);
        }
    }
}
As I said in the comments, I really don't like this approach either, since I am forced to catch all exceptions, cast the ErrorHandler to MainErrorHandler, and call it directly.

There is a way, but it's not perfect.
You can get all UI instances via VaadinSession.getCurrent().getUIs().
To filter out the inactive/detached UIs you can check if ui.getSession() returns a VaadinSession (so, not null). The JavaDoc of getSession says:
The method will return null if the UI is not currently attached to a VaadinSession.
Then you can invoke the access method on each of the UIs and create and show the notification inside the UI-context.
for (UI ui : VaadinSession.getCurrent().getUIs()) {
    // Filtering out detached/inactive UIs
    if (ui.getSession() != null) {
        ui.access(() -> {
            // create Notification here
        });
    }
}
I said it's not perfect because you have to keep in mind that the user can have several UIs open at the same time (e.g. multiple tabs).

Wiring Akka PoisonPill to actor system and JVM shutdown hooks

Akka/Java here, although I have a basic understanding of Scala. New to Akka. I have a Master class that starts up when the actor system fires up, which manages three children: Fizz, Buzz and Foo.
When Master starts up, that call to doSomething() can throw a NoSuchElementException. If it does, I would like the Master to shut down its three children, kill itself, shut down the actor system as a whole, and then invoke a custom system shutdown hook. My best attempt thus far:
public class MyApp {
    public static void main(String[] args) {
        ActorRef master = actorSystem.actorOf(Props.create(Master.class));
        master.tell(new Init(), ActorRef.noSender());
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() { // run() must stay public when overriding Thread.run()
                System.out.println("Shutting down!");
            }
        });
    }
}
public class Master extends AbstractActor {
    private Logger log = LoggerFactory.getLogger(this.getClass());
    private ActorRef fizz;
    private ActorRef buzz;
    private ActorRef foo;
    public Master() {
        super();
    }
    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(Init.class, init -> {
                try {
                    fizz = context().actorOf(Props.create(Fizz.class));
                    buzz = context().actorOf(Props.create(Buzz.class));
                    foo = context().actorOf(Props.create(Foo.class));
                    long metric = doSomething();
                    log.info("After all the children started up \"metric\" was: {}", metric);
                } catch (NoSuchElementException ex) {
                    self().tell(PoisonPill.getInstance(), self());
                }
            }).build();
    }
}
My thinking here is:
Since Master is the top-most actor, I can't define a SupervisorStrategy to handle the thrown NoSuchElementException for me, so I have to put a try-catch in there to handle it
My understanding of PoisonPill is that it shuts down the receiving actor's children and then shuts the actor down
However I'm still fuzzy as to whether PoisonPill shuts the actor system down if the actor happens to be the root/top-level actor, and I'm also not seeing how I can wire the PoisonPill to not only shut the actor system down, but also engage the JVM's shutdown hook
When I run this code I don't see any evidence of the actor system shutting down; it just hangs. Any ideas how I can wire all this together to achieve the desired effect?

One way to achieve the desired behavior is to have the master actor call getContext().getSystem().terminate() and register a callback that contains the shutdown logic with ActorSystem.registerOnTermination:
actorSystem.registerOnTermination(new Runnable() {
    @Override
    public void run() {
        System.out.println("Shutting down!");
    }
});
// ...
try {
    // ...
} catch (NoSuchElementException ex) {
    getContext().getSystem().terminate();
}
Coordinated shutdown is available for shutdown procedures that are more involved.

You can change the supervision strategy for the user guardian:
https://doc.akka.io/docs/akka/current/general/supervision.html#user-the-guardian-actor
If you have the user guardian escalate any exceptions, then throw the NoSuchElementException, the exception will end up at the root guardian with the result that your system will exit.

Untyped Actors (Java+Akka): requeuing unhandled messages

I'm creating a system of actors using Java+Akka.
In particular, I define untyped actors by providing an implementation of the onReceive() method.
In that method, I implement the behavior of the actor by defining the logic to be executed upon reception of a message. It may be something like:
public void onReceive(Object msg) throws Exception {
    if (msg instanceof MsgType1) { ... }
    else if (msg instanceof MsgType2) { ... }
    else unhandled(msg);
}
However, what if the actor is interested in only a single type of msg?
Is it possible to implement a selective receive, so that the actor waits for a certain msg (and the system automatically re-queues all the other types of messages)?

This "a la Erlang" message processing mode is not available in Akka AFAIK. However, you can use Stash to obtain the effect you want.
So the code would look like this:
public void onReceive(Object msg) throws Exception {
    if (msg instanceof MsgType1) { ... }
    else if (msg instanceof MsgType2) { ... }
    else stash();
}
At some point in the message handling you would switch to another state (presumably by calling getContext().become). You would also call unstashAll() to put the messages you set aside until that point back at the front of the mailbox. Note that stash() and unstashAll() are only available if the actor extends a stash-enabled base class, such as UntypedActorWithStash in the classic Java API.
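To make the stash-then-unstash idea concrete, here is a small plain-Java model of it. This is not Akka code: StashModel, tell, run and the "start" trigger message are all hypothetical names, and the mailbox is just a deque. It assumes, as Akka's Stash does, that unstashing puts the set-aside messages back at the front of the mailbox in their original order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class StashModel {
    private final Deque<Object> mailbox = new ArrayDeque<>();
    private final Deque<Object> stash = new ArrayDeque<>();
    private final List<Object> handled = new ArrayList<>();
    private boolean ready = false; // false: waiting for "start"; true: normal processing

    // Enqueues a message at the end of the mailbox.
    void tell(Object msg) {
        mailbox.addLast(msg);
    }

    // Processes messages until the mailbox is empty and returns the handled ones in order.
    List<Object> run() {
        while (!mailbox.isEmpty()) {
            Object msg = mailbox.pollFirst();
            if (ready) {
                handled.add(msg);
            } else if ("start".equals(msg)) {
                ready = true;        // plays the role of getContext().become(...)
                unstashAll();
            } else {
                stash.addLast(msg);  // plays the role of stash()
            }
        }
        return handled;
    }

    // Moves stashed messages to the front of the mailbox, preserving their order.
    private void unstashAll() {
        while (!stash.isEmpty()) {
            mailbox.addFirst(stash.pollLast());
        }
    }

    public static void main(String[] args) {
        StashModel actor = new StashModel();
        actor.tell("a");
        actor.tell("b");
        actor.tell("start");
        actor.tell("c");
        System.out.println(actor.run());
    }
}
```

Messages that arrive before "start" are set aside and then handled ahead of anything that arrived later, which is the closest this approach gets to a selective receive.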
