I ran into a problem while trying out the Hello World example explained here.
Kindly note that I have only modified the HelloEntity.java file so that it returns something other than "Hello, World!". Most likely my changes are taking too long, and hence I am getting the timeout error below.
I am currently doing a PoC on a single node to understand the Lagom framework, and I do not have the liberty to deploy multiple nodes.
I have also tried modifying the default lagom.circuit-breaker in application.conf by setting call-timeout = 100s; however, this does not seem to have helped.
Following is the exact error message for your reference:
{"name":"akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://hello-impl-application/system/sharding/HelloEntity#1074448247]] after [5000 ms]. Sender[null] sent message of type \"com.lightbend.lagom.javadsl.persistence.CommandEnvelope\".","detail":"akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://hello-impl-application/system/sharding/HelloEntity#1074448247]] after [5000 ms]. Sender[null] sent message of type \"com.lightbend.lagom.javadsl.persistence.CommandEnvelope\".\n\tat akka.pattern.PromiseActorRef$.$anonfun$defaultOnTimeout$1(AskSupport.scala:595)\n\tat akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:605)\n\tat akka.actor.Scheduler$$anon$4.run(Scheduler.scala:140)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:866)\n\tat scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)\n\tat scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:864)\n\tat akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:279)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:283)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)\n\tat java.lang.Thread.run(Thread.java:748)\n"}
Question: Is there a way to increase the Akka timeout by modifying application.conf or any of the Java source files in the Hello World project? Can you please help me with the exact details?
Thanks in advance for your time and help.
The call timeout is the timeout for circuit breakers, which is configured using lagom.circuit-breaker.default.call-timeout. But that's not what is timing out above: what is timing out is the request (ask) to your HelloEntity, and that timeout is configured using lagom.persistence.ask-timeout. The reason there is a timeout on requests to entities is that, in a multi-node environment, your entities are sharded across nodes, so an ask on them may go to another node, and a timeout is needed in case that node is not responding.
All that said, I don't think changing the ask-timeout will solve your problem. If you have a single node, then your entities should respond instantly if everything is working ok.
Is that the only error you're seeing in the logs?
Are you seeing this in dev mode (i.e., using the runAll command), or are you running the Lagom service some other way?
Is your database responding?
Thanks James for the help/pointer.
Adding the following lines to resources/application.conf did the trick for me:
lagom.persistence.ask-timeout=30s
hello {
..
..
call-timeout = 30s
call-timeout = ${?CIRCUIT_BREAKER_CALL_TIMEOUT}
..
}
A call is service-to-service communication: a ServiceClient communicating with a remote server. It uses a circuit breaker. It is an extra-service call.
An ask (in the context of lagom.persistence) is sending a command to a persistent entity. That happens across the nodes inside your Lagom service. It does not use circuit breaking. It is an intra-service call.
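As an illustrative sketch (the timeout values here are arbitrary, not recommendations), the two distinct timeouts would be configured separately in application.conf:

```hocon
# intra-service: asks sent to persistent entities across the service's nodes
lagom.persistence.ask-timeout = 30s

# extra-service: circuit-breaker-protected calls from a ServiceClient
lagom.circuit-breaker.default.call-timeout = 30s
```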
Related
Here I wanted to register two endpoints and send requests to them; you can see this in the code below. I name one env1 and the other env2.
val client = Http.client
.configured(Transport.Options(noDelay = false, reuseAddr = false))
.newService("gexampleapi-env1.localhost.net:8081,gexampleapi-env2.localhost.net:8081")
So far everything was normal. But the env1 instance had to be down for some reason (a few hours' maintenance, etc.; not sure why). Under normal circumstances, our expectation was that it would continue to send requests through the env2 instance. But this didn't happen: requests could not be sent to either server. Normally it worked correctly, but it didn't that day, for a reason we don't know.
Since the event took place months ago, I only have the following log:
2022-02-15 12:09:40,181 [finagle/netty4-1-3] INFO com.twitter.finagle
FailureAccrualFactory marking connection to "gExampleAPI" as dead.
Remote Address:
Inet(gexampleapi-env1.localhost.net/10.0.0.1:8081,Map())
To solve the problem, we removed the gexampleapi-env1.localhost.net:8081 host from the config file, and after restarting it continued to process requests. If you have any ideas about why we may have experienced this problem and how to avoid it next time, I would appreciate it if you could share them.
I was looking for alternatives to Quartz Scheduler.
Though it is not a complete replacement, I tried out the RabbitMQ Delayed Message Plugin (it suits my use case).
I was able to get the scheduling to work, but I was not able to view the delayed messages (which are stored in Mnesia).
Is there a way to check the messages and/or number of messages in Mnesia?
Edit: I inferred that the messages are stored in Mnesia from the comment here.
There is no way to check the messages that RabbitMQ is persisting in its Mnesia database.
RabbitMQ is not a generalized datastore. It is a purpose-built message broker and queueing system. The datastore inside it exists to facilitate the persistence of messages, not to be queried and used as if it were a database in its own right.
To view the data inside Mnesia you could:
Write a simple Erlang program like this one; as a result you get:
(rabbit#gabrieles-MBP)5>
load:traverse_table_and_show('rabbit_delayed_messagerabbit#gabrieles-MBP').
{delay_entry,
{delay_key,1442258857832,
{exchange,
{resource,<<"/">>,exchange,<<"my-exchange">>},
'x-delayed-message',true,false,false,
[{<<"x-delayed-type">>,longstr,<<"direct">>}],
undefined,undefined, {[],[]}}},
{delivery,false,false,<0.2008.0>,
{basic_message,
{resource,<<"/">>,exchange,<<"my-exchange">>},
[<<>>],
{content,60,
{'P_basic',undefined,undefined,
[{<<"x-delay">>,signedint,100000}],
undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined,undefined,
undefined},
..
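For reference, the traversal program linked above is roughly like the following sketch. Everything here is illustrative: it creates a throwaway local Mnesia table instead of connecting to a real broker, and the demo table and record names are assumptions; on a real broker you would run this on the RabbitMQ node and pass the plugin's delayed-message table name.

```erlang
#!/usr/bin/env escript
%% Sketch: print every record of an Mnesia table using mnesia:foldl/3.
main(_) ->
    ok = mnesia:start(),
    %% Throwaway local table, purely so the traversal has data to show.
    {atomic, ok} = mnesia:create_table(demo, [{attributes, [key, val]}]),
    ok = mnesia:dirty_write({demo, hello, 42}),
    traverse_table_and_show(demo).

traverse_table_and_show(Table) ->
    Show = fun(Rec, Acc) -> io:format("~p~n", [Rec]), Acc end,
    mnesia:activity(transaction, fun() -> mnesia:foldl(Show, ok, Table) end).
```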
Or, alternatively:
Start an Erlang shell session using:
erl -setcookie ABCDEFGHI -sname monitorNode@gabrielesMBP
You have to use the same cookie that RabbitMQ is using, typically found in $HOME/.erlang.cookie.
Then execute the command: observer:start().
and you should have this:
Once you are connected to the rabbitmq node, open the Table Viewer and, from the menu, select the Mnesia tables view:
Here you can see your data:
I had code working with TcpNioClientConnectionFactory, and it had been working fine until recently, when I made a change so that the TCP client performs failover in case of server downtime by using FailoverClientConnectionFactory. It now returns the response for a different request, even with a single AbstractClientConnectionFactory provided to the failover factory.
My code uses @MessagingGateway, and the method is wrapped in a CompletableFuture; however, even without the CompletableFuture it still returns the wrong response (most of the time).
The log file shows:
ERROR o.s.i.i.t.TcpOutboundGateway - Cannot correlate response - no pending reply
I can always reproduce this issue using IT test.
Please help.
I have a tricky issue with Karaf, and having tried all day to fix it, I need your insights. Here is the problem:
I have Camel routes (pure Java DSL) that get data from two sources, process them, and then send the results to Redis:
- when run as a standalone application (with a Main class and the command line "java -jar myjar.jar"), the data are processed and saved in less than 20 minutes
- when run as a bundle (part of another feature, actually) on the same machine, it takes about 10 hours.
EDIT: I forgot to add: I use camel 2.1.0 and karaf 2.3.2
Now, we are in the process of refactoring our SI to Karaf features, so sadly it's not really possible to just keep the standalone app.
I tried playing with Karaf's Java memory options, using a cluster (I failed :d), playing with SEDA and thread pools, and replacing all direct routes with SEDA, all without success. A dev:create-dump shows a lot of:
thread #38 - Split" Id=166 BLOCKED on java.lang.Class#56d1396f owned by "Camel (camelRedisProvisioning)
Could it be an issue with split and parallelProcessing in Karaf? The standalone app indeed shows a LOT more CPU activity.
Here are my Camel routes:
//start with a quartz and a cron tab
from("quartz://provisioning/topOffersStart?cron=" + cronValue.replace(' ', '+')).multicast()
.parallelProcessing().to("direct:prodDAO", "direct:thesaurus");
//get from two sources and process
from("direct:prodDAO").bean(ProductsDAO.class)
.setHeader("_type", constant(TopExport.PRODUCT_TOP))
.setHeader("topOffer", constant("topOffer"))
.to("direct:topOffers");
from("direct:thesaurus")
.to(thesaurusUri).unmarshal(csv).bean(ThesaurusConverter.class, "convert")
.setHeader("_type", constant(TopExport.CATEGORY_TOP))
.setHeader("topOffer", constant("topOffer"))
.to("direct:topOffers");
//processing
from("direct:topOffers").choice()
.when(isCategory)
.to("direct:topOffersThesaurus")
.otherwise()
.when(isProduct)
.to("direct:topOffersProducts")
.otherwise()
.log(LoggingLevel.ERROR, "${header[_type]} is not valid !")
.endChoice()
.endChoice()
.end();
from("direct:topOffersThesaurus")
//here is where I think the problem comes
.split(body()).parallelProcessing().streaming()
.bean(someprocessing)
.to("direct:toRedis");
from("direct:topOffersProducts")
//here is where I think the problem comes
.split(body()).parallelProcessing().streaming()
.bean(someprocessing)
.to("direct:toRedis");
//save into redis
from("direct:toRedis")
.setHeader("CamelRedis.Key", simple("provisioning:${header[_topID]}"))
.setHeader("CamelRedis.Command", constant("SETEX"))
.setHeader("CamelRedis.Timeout", constant("90000"))//25h
.setHeader("CamelRedis.Value", simple("${body}"))
.to("spring-redis://?redisTemplate=#provisioningRedisTemplateStringSerializer");
NB: the body sent to direct:topOffers[Products|Thesaurus] is a list of POJOs (all of the same class).
Thanks to anyone who can help.
EDIT:
I think I narrowed it down to a deadlock in JAXB. Indeed, in my routes I make lots of calls to a Java client that calls a web service. When running under Karaf, threads are blocked here:
java.lang.Thread.State: BLOCKED (on object monitor) at com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:78)
Further down the stack trace, we see the unmarshalling method used to transform the XML into objects; these two lines look suspect to me:
final JAXBContext context = JAXBContext.newInstance(clazz.getPackage().getName());
final Unmarshaller um = context.createUnmarshaller();
I removed the final; no improvement. Maybe something to do with the JAXB used by Karaf? I do not install any JAXB implementation with the bundle.
Nailed it !
As seen above, it was indeed linked to a deadlock on the JAXB context in my web service client.
What I did:
- refactored the old code of that client, removing the final keyword on the Marshaller/Unmarshaller objects (I think the deadlock came from there, even though it was the exact same code when running standalone)
- instantiated the context based on the package, and only once.
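As a rough sketch of the second point (class and method names here are invented for illustration, not the author's actual client), the idea is to create the expensive but thread-safe JAXBContext only once and cache it, while still creating a fresh, non-thread-safe Unmarshaller per call. This variant caches per class rather than per package:

```java
// Sketch only: cache one JAXBContext per class. JAXBContext creation is
// expensive and the context is thread-safe, while Marshaller/Unmarshaller
// instances are cheap but NOT thread-safe, so they are created per use.
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.StringReader;
import java.util.concurrent.ConcurrentHashMap;

public class JaxbContextCache {
    private static final ConcurrentHashMap<Class<?>, JAXBContext> CACHE =
            new ConcurrentHashMap<>();

    // Create the context at most once per class.
    public static JAXBContext contextFor(Class<?> clazz) {
        return CACHE.computeIfAbsent(clazz, c -> {
            try {
                return JAXBContext.newInstance(c);
            } catch (JAXBException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    // A fresh Unmarshaller per call avoids sharing non-thread-safe state
    // between parallelProcessing threads.
    public static <T> T unmarshal(Class<T> clazz, String xml) throws JAXBException {
        Unmarshaller um = contextFor(clazz).createUnmarshaller();
        return clazz.cast(um.unmarshal(new StringReader(xml)));
    }

    // Tiny demo type, purely for illustration.
    @XmlRootElement
    static class Greeting {
        public String text;
    }

    public static void main(String[] args) throws Exception {
        Greeting g = unmarshal(Greeting.class, "<greeting><text>hi</text></greeting>");
        System.out.println(g.text);
    }
}
```

Under OSGi the context lookup may additionally need the bundle's classloader, as discussed in the linked jaxb.index question.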
I must admit the classloader issues with OSGi had me banging my head on my desk for a few hours, but thanks to Why can't JAXB find my jaxb.index when running inside Apache Felix?, I managed to fix that.
Granted, my threads are now sleeping instead of blocked, but I now process my data in less than 30 minutes, so that's good enough for me.
I've been working with ZooKeeper lately to fulfill a reliability requirement in distributed applications. I'm working with three computers, and I followed this tutorial:
http://sanjivblogs.blogspot.ie/2011/04/deploying-zookeeper-ensemble.html
I followed it step by step to make sure I did it right, but now when I start my ZooKeeper servers with
./zkServer.sh start
I'm getting these exceptions on all my computers:
2013-04-05 21:46:58,995 [myid:2] - WARN [SendWorker:1:QuorumCnxManager$SendWorker#679] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2038)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:342)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:831)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:62)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:667)
2013-04-05 21:46:58,995 [myid:2] - WARN [SendWorker:1:QuorumCnxManager$SendWorker#688] - Send worker leaving thread
2013-04-05 21:47:58,363 [myid:2] - WARN [RecvWorker:3:QuorumCnxManager$RecvWorker#762] - Connection broken for id 3, my id = 2, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:747)
But I don't know what I am doing wrong. My objective is to synchronize my ZooKeeper servers on different machines so that the service is always available. I went to the zookeeper.apache.org web page and looked for information on how to configure and start ZooKeeper, but they are the same steps I followed before.
If somebody could help me, I would really appreciate it. Thanks in advance.
I needed to follow some strict steps to achieve this, but finally got it done. If somebody is facing the same issue, to set up the ZooKeeper ensemble, please remember:
You need 3 ZooKeeper servers running (locally or over the network); this is the minimum number to achieve synchronization. On each server you need to create a file called "myid" (inside the ZooKeeper data directory); the content of each myid file must be a sequential number. For instance, I have three ZooKeeper servers (folders), so I have one myid with content 1, another with content 2, and another with content 3.
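The myid layout described above can be scripted; the paths here are illustrative and should match the dataDir values in each zoo.cfg:

```shell
# Create three local ZooKeeper data directories, each with its own myid.
BASE="${BASE:-$HOME/var}"            # illustrative parent directory
for i in 1 2 3; do
  mkdir -p "$BASE/zookeeper$i"
  printf '%s\n' "$i" > "$BASE/zookeeper$i/myid"
done
cat "$BASE/zookeeper2/myid"          # prints: 2
```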
Then, in zoo.cfg, it is necessary to establish the required parameters:
tickTime=2000
#dataDir=/var/lib/zookeeper
dataDir=/home/mtataje/var/zookeeper1
clientPort=2184
initLimit=10
syncLimit=20
server.1=192.168.3.41:2888:3888
server.2=192.168.3.41:2889:3889
server.3=192.168.3.41:2995:2999
The zoo.cfg varies from one server to another; in my case, because I was testing locally, I needed to change the port and the dataDir.
After that, execute:
./zkServer.sh start
Some exceptions may appear at first, but that is because at least two ZooKeeper servers must be in sync; once you start at least two of them, the exceptions should go away.
Best regards.