How to make Vert.x and Apache Ignite Client work together? - java

I deployed an Apache Ignite cluster and I need to perform different operations on caches from my Vert.x backend.
I successfully connect to the cluster using the Apache Ignite client (not the thin client). The Ignite client is run inside a Vert.x verticle:
vertx.deployVerticle(new IgniteVerticle(),
new DeploymentOptions().setInstances(1).setWorker(true),
apacheIgniteVerticleDeployment.completer());
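For context, here is a minimal sketch of what such a verticle might look like (not my exact code; it assumes the Vert.x 3 Future-based start API and an Ignite thick client started in client mode):
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Future;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class IgniteVerticle extends AbstractVerticle {

    private Ignite ignite;

    @Override
    public void start(Future<Void> startFuture) {
        // Starting a thick client node blocks; this verticle is deployed as a worker,
        // so start() already runs on a worker thread rather than the event loop.
        IgniteConfiguration cfg = new IgniteConfiguration().setClientMode(true);
        ignite = Ignition.start(cfg);
        startFuture.complete();
    }

    @Override
    public void stop() {
        if (ignite != null) {
            ignite.close();
        }
    }
}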
But some time later I start receiving the following messages:
SEVERE: Blocked system-critical thread has been detected.
This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=28s]
SEVERE: Critical system error detected. Will be handled accordingly to configured handler
[hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED,
err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1567112815022]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1567112815022]
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Such messages appear about every 10 seconds. My guess is that this is somehow related to the way Vert.x works.
What could be the reason for these exceptions?

You can try increasing systemWorkerBlockedTimeout on IgniteConfiguration to make this message go away. See more in the docs:
https://apacheignite.readme.io/docs/critical-failures-handling
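For example (assuming Ignite 2.7+, where IgniteConfiguration exposes this setter; the timeout value below is only an illustration):
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setClientMode(true);
// Value is in milliseconds; pick something larger than your longest expected pause.
cfg.setSystemWorkerBlockedTimeout(60_000L);
Ignite ignite = Ignition.start(cfg);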

Related

MSMQ error while trying to access remote private queue. Cannot open queue. (hr=unknown hr (-2147023071))

MSMQ error while trying to access remote private queue.
Exception: Cannot open queue. (hr=unknown hr (-2147023071))
I have already added these two registry values:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSMQ\Parameters\Security\AllowNonauthenticatedRPC and set the value to 1.
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSMQ\Parameters\Security\NewRemoteReadServerAllowNoneSecurityClient and set it to 1
-2147023071 is 0x80070721, which isn't an MSMQ-specific error code (those start with 0xC00Exxxx). I believe it is a security-related error code.
As you are receiving messages from a remote queue, you are using the RPC protocol, so this article will help:
Understanding how MSMQ security blocks RPC traffic
Sending a message uses the MSMQ protocol and so does not have the same problems.

Managed VM issue? Frequent error logs about VmApiProxyDelegate (using Datastore API, TaskQueue API) on an App Engine Managed VM instance

I have been using the Google App Engine Managed VM/Java since March 2015. Everything worked well, but since September/October 2015 I have frequently been seeing error logs from "com.google.apphosting.vmruntime.VmApiProxyDelegate" in the Managed VM instance log.
I noticed two groups of error logs.
The first group is related to Datastore operations on the Managed VM instance. It happens on:
datastore_v3.Get()
datastore_v3.RunQuery()
datastore_v3.Put()
memcache.Get()
A sample of the stack trace I saw in the log is below (this sample is for datastore_v3.Put()):
com.google.apphosting.vmruntime.VmApiProxyDelegate runSyncCall: HTTP ApiProxy I/O error for datastore_v3.Put: Read timed out
com.google.apphosting.api.ApiProxy$RPCFailedException: The remote RPC to the application server failed for the call datastore_v3.Get().
at com.google.apphosting.vmruntime.VmApiProxyDelegate.runSyncCall(VmApiProxyDelegate.java:182)
at com.google.apphosting.vmruntime.VmApiProxyDelegate.makeApiCall(VmApiProxyDelegate.java:141)
at com.google.apphosting.vmruntime.VmApiProxyDelegate.access$000(VmApiProxyDelegate.java:47)
at com.google.apphosting.vmruntime.VmApiProxyDelegate$MakeSyncCall.call(VmApiProxyDelegate.java:375)
at com.google.apphosting.vmruntime.VmApiProxyDelegate$MakeSyncCall.call(VmApiProxyDelegate.java:351)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
When this error occurs, at the application level (my code) I get an RPCFailedException, but right now I have not handled it with a retry mechanism (I only use a retry mechanism for ConcurrentModificationException with the App Engine Datastore API).
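For illustration, the kind of retry wrapper I have in mind would look roughly like this (a hypothetical sketch, not something from the App Engine docs; the class and method names are made up):
import com.google.apphosting.api.ApiProxy;
import java.util.concurrent.Callable;

public final class RpcRetry {
    // Retries the given call when the Managed VM API bridge reports an RPC failure.
    public static <T> T withRetry(Callable<T> call, int maxAttempts) throws Exception {
        ApiProxy.RPCFailedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (ApiProxy.RPCFailedException e) {
                last = e;
                Thread.sleep(100L * attempt); // simple linear backoff between attempts
            }
        }
        throw last;
    }
}
// Hypothetical usage: Entity entity = RpcRetry.withRetry(() -> datastore.get(key), 3);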
The second group of error logs is about the TaskQueue API on the Managed VM.
The error message I got is:
com.google.apphosting.vmruntime.VmApiProxyDelegate runSyncCall: Error body: RPC Error: /StubbyService.Send to (unknown) : APP_ERROR(2)
When tracing, the detailed stack trace is:
com.wat_suttiwat.batchengine.job.PushNotificationTaskExecutor executeTask: The remote RPC to the application server failed for the call taskqueue.QueryAndOwnTasks().
com.google.apphosting.api.ApiProxy$RPCFailedException: The remote RPC to the application server failed for the call taskqueue.QueryAndOwnTasks().
at com.google.apphosting.vmruntime.VmApiProxyDelegate.runSyncCall(VmApiProxyDelegate.java:161)
at com.google.apphosting.vmruntime.VmApiProxyDelegate.makeApiCall(VmApiProxyDelegate.java:141)
at com.google.apphosting.vmruntime.VmApiProxyDelegate.access$000(VmApiProxyDelegate.java:47)
at com.google.apphosting.vmruntime.VmApiProxyDelegate$MakeSyncCall.call(VmApiProxyDelegate.java:375)
at com.google.apphosting.vmruntime.VmApiProxyDelegate$MakeSyncCall.call(VmApiProxyDelegate.java:351)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"RPCFailedException" exception raised at the application level (like the first case). And I noticed the Google AppEngine front-end instance (not the managed-vm instance), it work as usual no error about these.
So my question is
Should I add retry-mechanism for RPCFailedException ? Is it useful to add retry? I don't see any documentation from the Google AppEngine document on this.
Does anyone has the same issues with me? If yes, please help to vote the issue at this issue-tracker (#12393): https://code.google.com/p/googleappengine/issues/detail?can=2&start=0&num=100&q=&colspec=ID%20Type%20Component%20Status%20Stars%20Summary%20Language%20Priority%20Owner%20Log&groupby=&sort=&id=12393
If you have any workaround, please share.
Thanks so much

How can I gracefully handle a Kafka outage?

I am connecting to Kafka using the 0.8.2.1 kafka-clients library. I am able to successfully connect when Kafka is up, but I want to handle failure gracefully when Kafka is down. Here is my configuration:
Properties kafkaProperties = new Properties();
kafkaProperties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaUrl);
kafkaProperties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
kafkaProperties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
kafkaProperties.setProperty(ProducerConfig.RETRIES_CONFIG, "3");
producer = new KafkaProducer<String, String>(kafkaProperties);
When Kafka is down, I get the following error in my logs:
WARN: 07 Apr 2015 14:09:49.230 org.apache.kafka.common.network.Selector:276 - [] Error in I/O with localhost/127.0.0.1
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.7.0_75]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) ~[na:1.7.0_75]
at org.apache.kafka.common.network.Selector.poll(Selector.java:238) ~[kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
This error repeats in an infinite loop and locks up my Java application. I have tried various configuration settings related to timeouts, retries, and acknowledgements, but I have been unable to prevent this loop from occurring.
Is there a configuration setting that can prevent this? Do I need to try a different version of the client? How can a Kafka outage be handled gracefully?
I figured out that this combination of settings allows the Kafka client to fail quickly without holding the thread or spamming the logs:
// Max time to block waiting for a metadata fetch before failing the send (ms)
kafkaProperties.setProperty(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG, "300");
// Max time the broker waits for acknowledgements before the request times out (ms)
kafkaProperties.setProperty(ProducerConfig.TIMEOUT_CONFIG, "300");
// Back off between retries of a failed request (ms)
kafkaProperties.setProperty(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "10000");
// Back off before attempting to reconnect to a broker (ms)
kafkaProperties.setProperty(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, "10000");
I dislike that the Kafka client holds the thread while trying to connect to the Kafka server, rather than being fully async, but this at least is functional.
In the 0.9 client there is also the max.block.ms property, which limits how long calls like KafkaProducer.send() and partitionsFor() are allowed to block.
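For example (assuming the 0.9+ client, where the constant ProducerConfig.MAX_BLOCK_MS_CONFIG is available; the value simply mirrors the timeouts above):
kafkaProperties.setProperty(ProducerConfig.MAX_BLOCK_MS_CONFIG, "300");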

Hazelcast keeps running after application stops on WebLogic (and also Tomcat)

I have just been playing around to see what Hazelcast provides. Only a few things in the application use Hazelcast.
On WebLogic Server 10.3.6 I saw that something was wrong with one of the maps: it kept throwing an exception about a missing class definition, even though it had been running for more than a few days without a problem. I then stopped the application, which was running on 8 WebLogic nodes. I assumed the Hazelcast clusters/instances would shut down as well, but even though the application is stopped on all the WebLogic nodes, I keep seeing Hazelcast merge exceptions in the logs.
I also tested my Spring-based application on Tomcat 7; even though I shut down the application, Hazelcast somehow resists shutting down.
Is this normal behaviour? How can we shut down all Hazelcast instances even after shutting down the application?
Note that I start Hazelcast when my application starts; there is no special client, only the 8 WebLogic server nodes.
Edit: here is the stack trace of the migration problem:
SEVERE: Problem while reading DataSerializable, namespace: 0, id: 0, class: 'com.hazelcast.partition.impl.MigrationRequestOperation', exception: com.hazelcast.partition.impl.MigrationRequestOperation
com.hazelcast.nio.serialization.HazelcastSerializationException: Problem while reading DataSerializable, namespace: 0, id: 0, class: 'com.hazelcast.partition.impl.MigrationRequestOperation', exception: com.hazelcast.partition.impl.MigrationRequestOperation
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:120)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:39)
at com.hazelcast.nio.serialization.StreamSerializerAdapter.toObject(StreamSerializerAdapter.java:65)
at com.hazelcast.nio.serialization.SerializationServiceImpl.toObject(SerializationServiceImpl.java:260)
at com.hazelcast.spi.impl.NodeEngineImpl.toObject(NodeEngineImpl.java:186)
at com.hazelcast.spi.impl.BasicOperationService$OperationPacketHandler.loadOperation(BasicOperationService.java:638)
at com.hazelcast.spi.impl.BasicOperationService$OperationPacketHandler.handle(BasicOperationService.java:621)
at com.hazelcast.spi.impl.BasicOperationService$OperationPacketHandler.access$1500(BasicOperationService.java:614)
at com.hazelcast.spi.impl.BasicOperationService$BasicDispatcherImpl.dispatch(BasicOperationService.java:566)
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.process(BasicOperationScheduler.java:466)
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.processPriorityMessages(BasicOperationScheduler.java:480)
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.doRun(BasicOperationScheduler.java:457)
at com.hazelcast.spi.impl.BasicOperationScheduler$OperationThread.run(BasicOperationScheduler.java:432)
Caused by: java.lang.ClassNotFoundException: com.hazelcast.partition.impl.MigrationRequestOperation
at weblogic.utils.classloaders.GenericClassLoader.findLocalClass(GenericClassLoader.java:297)
at weblogic.utils.classloaders.GenericClassLoader.findClass(GenericClassLoader.java:270)
at weblogic.utils.classloaders.ChangeAwareClassLoader.findClass(ChangeAwareClassLoader.java:64)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:179)
at weblogic.utils.classloaders.ChangeAwareClassLoader.loadClass(ChangeAwareClassLoader.java:43)
at com.hazelcast.nio.ClassLoaderUtil.tryLoadClass(ClassLoaderUtil.java:124)
at com.hazelcast.nio.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:113)
at com.hazelcast.nio.ClassLoaderUtil.newInstance(ClassLoaderUtil.java:66)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:109)
... 12 more
Mar 16, 2015 5:31:29 PM com.hazelcast.spi.OperationService
The Hazelcast docs say:
As a final step, if you are done with your client, you can shut it down as shown below. This will release all the used resources and will close connections to the cluster.
client.shutdown();
which should be called in the destroy() method of your bean:
import com.hazelcast.core.HazelcastInstance;
import org.springframework.beans.factory.DisposableBean;

public class ExampleBean implements DisposableBean {
    private HazelcastInstance client; // the client instance this bean uses
    @Override
    public void destroy() {
        client.shutdown();
    }
}
The exception you have pasted occurs because your application's class loader is destroyed when the app is shut down, while the Hazelcast instance (and its threads) keeps running and can no longer load classes through it.
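Since the question describes embedded members started by the application rather than a separate client, the equivalent is to shut the embedded instances down from the same lifecycle hook. A hedged sketch (the bean name is made up; Hazelcast.shutdownAll() stops every instance created in this JVM):
import com.hazelcast.core.Hazelcast;
import org.springframework.beans.factory.DisposableBean;

public class HazelcastShutdownBean implements DisposableBean {
    @Override
    public void destroy() {
        // Stops all Hazelcast instances created by this application.
        Hazelcast.shutdownAll();
    }
}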

Failed to connect to queue manager 'QUEUE-NAME' with connection mode 'Client' and host

I have developed a subscribe (topic) concept using Camel. It works fine in my local Tomcat, but it does not work in my test environment Tomcat, where I get the error mentioned below. Kindly help me resolve the issue and explain how to debug it.
Is it related to server configuration?
Error
org.apache.camel.component.jms.JmsMessageListenerContainer refreshConnectionUntilSuccessful
SEVERE: Could not refresh JMS Connection for destination 'TOPIC-NAME' - retrying in 5000 ms. Cause: JMSWMQ0018: Failed to connect to queue manager 'QUEUE-MANAGER' with connection mode 'Client' and host name 'HOST-NAME'.; nested exception is com.ibm.mq.MQException: JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED') reason '2059' ('MQRC_Q_MGR_NOT_AVAILABLE').
regards,
Gnana
There is almost no information to go on here and therefore no way to answer with any confidence. Instead, I'll provide a diagnostic process and hopefully you will find the problem. Note that in the future if you have similar issues, it would help to list the diagnostics you have already tried so that people responding can narrow down their answers.
In order for this to work, the QMgr must be running a listener, have a channel defined and available, have authorizations set up to allow the connection, and be able to resolve the queue or topic requested. With that in mind, the things I normally check and the order I check them in is as follows:
Is the QMgr running?
Is the listener running? On what port?
Can I telnet to the QMgr on the listener port? i.e. telnet mqhost 1414 (or use the socket check sketched after this list).
Is the channel defined? If so, is it available?
Do the sample client programs work? In this case, amqspubc is the one to try.
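If telnet is not available on the box, a quick Java equivalent of the port check might look like this (a sketch; mqhost and 1414 are just the placeholder host and port from the example above):
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ListenerCheck {
    public static void main(String[] args) {
        try (Socket socket = new Socket()) {
            // Fails fast if the QMgr host or listener port is unreachable.
            socket.connect(new InetSocketAddress("mqhost", 1414), 5000);
            System.out.println("Listener port is reachable");
        } catch (IOException e) {
            System.out.println("Cannot reach listener port: " + e.getMessage());
        }
    }
}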
There are other considerations and if all of the above work, it is time to look into the client code and configuration, the versions of the client and server, authorizations, etc. But until you know that the basic configuration is in place to support a client connection (which was not indicated in the question) then these are the things to start with.
