Vert.x Thread Naming - "vert.x-worker-thread-..." - java

I am making a blocking service call in one of my worker verticles that logged a warning. This was "addressed" by increasing the time limit, but I am more curious about how to read the thread name in the log line - vert.x-worker-thread-3,5,main. The full log was this -
io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-worker-thread-3,5,main] has been blocked for 64134 ms, time limit is 60000
io.vertx.core.VertxException: Thread blocked
What does the 3,5,main indicate? Is it some kind of trace from the main verticle? Thanks.

ThreadName: vert.x-worker-thread-3
ThreadPriority: 5
ThreadGroup: main (the name of the thread's thread group, not a reference to the main verticle)
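The main at the end is not a trace from the main verticle; the warning simply logs the thread via Thread.toString(), whose format is Thread[name,priority,threadGroupName]. A minimal sketch to illustrate the format (nothing Vert.x-specific in it):

public class ThreadToStringDemo {
    public static void main(String[] args) {
        // The main thread has priority 5 and belongs to the thread group named "main",
        // so this prints: Thread[main,5,main]
        System.out.println(Thread.currentThread());

        // A Vert.x worker thread created from that group is printed the same way,
        // e.g. Thread[vert.x-worker-thread-3,5,main]
    }
}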

Related

Restricting input queue size in Micronaut (Netty)

When the application is started but not yet warmed up (the JIT needs time), it cannot process the expected RPS.
The problem is the incoming queue. While the IO thread keeps processing requests, many requests sit in the queue and cannot be cleaned up by the GC. After the survivor space overflows, the GC starts performing major pauses, which slows request execution down even more, and after some time the application dies with an OOM.
My application has a self-warming readinessProbe (3k random requests).
I tried to configure the thread counts and queue size:
application.yml
micronaut:
  server:
    port: 8080
    netty:
      parent:
        threads: 2
      worker:
        threads: 2
  executors:
    io:
      n-threads: 1
      parallelism: 1
      type: FIXED
    scheduled:
      n-threads: 1
      parallelism: 1
      corePoolSize: 1
And some system properties:
System.setProperty("io.netty.eventLoop.maxPendingTasks", "16")
System.setProperty("io.netty.eventexecutor.maxPendingTasks", "16")
System.setProperty("io.netty.eventLoopThreads", "1")
But the queue keeps filling up.
I want to find some way to restrict the input queue size in Micronaut so that the application does not fail under high load.
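For illustration, what "restricting the queue" amounts to at the JVM level is a fixed pool over a bounded queue that rejects work instead of letting it pile up until the heap fills. A minimal sketch with made-up sizes (this is plain java.util.concurrent, not a Micronaut configuration key):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    public static void main(String[] args) {
        // 2 threads, at most 16 queued tasks; anything beyond that is rejected
        // immediately with RejectedExecutionException instead of growing the heap.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(16),
                new ThreadPoolExecutor.AbortPolicy());

        for (int i = 0; i < 100; i++) {
            try {
                pool.execute(BoundedQueueDemo::doWork);
            } catch (RejectedExecutionException e) {
                // Shed load here (e.g. answer 503) instead of queueing without bound.
            }
        }
        pool.shutdown();
    }

    private static void doWork() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
    }
}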

MarkLogic Java API deadlock detection

One of our applications just suffered from some nasty deadlocks. I had quite a hard time recreating the problem because the deadlock (or stack trace) did not show up immediately in my Java application logs.
To my surprise, the MarkLogic Java API retries failing requests (e.g. because of a deadlock). This might make sense if your request is not a multi-statement request, but otherwise I'm not sure that it does.
So let's stick with this deadlock problem. I created a simple code snippet in which I create a deadlock on purpose. The snippet creates a document test.xml and then tries to read and write it from two different transactions, each on a new thread.
public static void main(String[] args) throws Exception {
    final Logger root = (Logger) LoggerFactory.getLogger(Logger.ROOT_LOGGER_NAME);
    final Logger ok = (Logger) LoggerFactory.getLogger(OkHttpServices.class);
    root.setLevel(Level.ALL);
    ok.setLevel(Level.ALL);

    final DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000,
            new DatabaseClientFactory.DigestAuthContext("username", "password"));
    final StringHandle handle = new StringHandle("<doc><name>Test</name></doc>")
            .withFormat(Format.XML);
    client.newTextDocumentManager().write("test.xml", handle);

    root.info("t1: opening");
    final Transaction t1 = client.openTransaction();
    root.info("t1: reading");
    client.newXMLDocumentManager()
            .read("test.xml", new StringHandle(), t1);

    root.info("t2: opening");
    final Transaction t2 = client.openTransaction();
    root.info("t2: reading");
    client.newXMLDocumentManager()
            .read("test.xml", new StringHandle(), t2);

    new Thread(() -> {
        root.info("t1: writing");
        client.newXMLDocumentManager().write("test.xml",
                new StringHandle("<doc><t>t1</t></doc>").withFormat(Format.XML), t1);
        t1.commit();
    }).start();

    new Thread(() -> {
        root.info("t2: writing");
        client.newXMLDocumentManager().write("test.xml",
                new StringHandle("<doc><t>t2</t></doc>").withFormat(Format.XML), t2);
        t2.commit();
    }).start();

    TimeUnit.MINUTES.sleep(5);
    client.release();
}
This code will produce the following log:
14:12:27.437 [main] DEBUG c.m.client.impl.OkHttpServices - Connecting to localhost at 8000 as admin
14:12:27.570 [main] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction null
14:12:27.608 [main] INFO ROOT - t1: opening
14:12:27.609 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:27.962 [main] INFO ROOT - t1: reading
14:12:27.963 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 5298588351036278526
14:12:28.283 [main] INFO ROOT - t2: opening
14:12:28.283 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:28.286 [main] INFO ROOT - t2: reading
14:12:28.286 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 8819382734425123844
14:12:28.289 [Thread-1] INFO ROOT - t1: writing
14:12:28.289 [Thread-1] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 5298588351036278526
14:12:28.289 [Thread-2] INFO ROOT - t2: writing
14:12:28.290 [Thread-2] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 8819382734425123844
Neither t1 nor t2 will be committed. MarkLogic logs confirm that there actually is a deadlock:
==> /var/opt/MarkLogic/Logs/8000_AccessLog.txt <==
127.0.0.1 - admin [24/Nov/2018:14:12:30 +0000] "PUT /v1/documents?txid=5298588351036278526&category=content&uri=test.xml HTTP/1.1" 503 1034 - "okhttp/3.9.0"
==> /var/opt/MarkLogic/Logs/ErrorLog.txt <==
2018-11-24 14:12:30.719 Info: Deadlock detected locking Documents test.xml
This would not be a problem if one of the requests failed and threw an exception, but this is not the case. The MarkLogic Java API retries every request for up to 120 seconds, and one of the updates times out after about 120 seconds:
Exception in thread "Thread-1" com.marklogic.client.FailedRequestException: Service unavailable and maximum retry period elapsed: 121 seconds after 65 retries
at com.marklogic.client.impl.OkHttpServices.putPostDocumentImpl(OkHttpServices.java:1422)
at com.marklogic.client.impl.OkHttpServices.putDocument(OkHttpServices.java:1256)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:920)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:758)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:717)
at Scratch.lambda$main$0(scratch.java:40)
at java.lang.Thread.run(Thread.java:748)
What are possible ways to overcome this problem? One way might be to set a maximum time to live for a transaction (like 5 seconds), but this feels hacky and unreliable. Any other ideas? Are there any other settings I should check out?
I'm on MarkLogic 9.0-7.2 and using marklogic-client-api:4.0.3.
Edit: One way to solve the deadlock would be to synchronize the calling function; this is actually the way I solved it in my case (see comments). But I think the underlying problem still exists. Having a deadlock in a multi-statement transaction should not be hidden away behind a 120-second timeout. I would rather have an immediately failing request than a 120-second lock on one of my documents plus 64 failing retries per thread.
Deadlocks are usually resolvable by retrying. Internally, the server does an inner retry loop because deadlocks are usually transient and incidental, lasting a very short time. In your case you have constructed a case that will never succeed with any timeout that is equal for both threads.
Deadlocks can be avoided at the application layer by avoiding multi-statement transactions when using the REST API (which is what the Java API uses).
Multi-statement transactions over REST cannot be implemented 100% safely, due to the client's responsibility to manage the transaction ID and the server's inability to detect client-side errors or client-side identity. Very subtle problems can and do occur unless you are aggressively proactive with respect to handling errors and multithreading. If you 'push' the logic to the server (XQuery or JavaScript), the server is able to manage things much better.
As for whether it is 'good' or not for the Java API to implement retries for this case, that's debatable either way. (The compromise for a seemingly easy-to-use interface is that many things that would otherwise be options are decided for you as a convention. There's generally no one-size-fits-all answer. In this case I am presuming the thought was that a deadlock is more likely caused by independent code/logic by 'accident', as opposed to identical code running in tandem -- a retry in that case would be a good choice. In your example it's not, but then an earlier error would still fail predictably until you change your code to 'not do that'.)
If it doesn't already exist, a feature request for a configurable timeout and retry behaviour does seem reasonable. I would recommend, however, trying to avoid any REST calls that result in an open transaction -- that is inherently problematic, particularly if you don't notice the problem upfront (then it's more likely to bite you in production). Unlike JDBC, which keeps a connection open so that the server can detect client disconnects, HTTP and the ML REST API do not -- which leads to a different programming model than traditional database coding in Java.
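To illustrate the 'push the logic to the server' suggestion: the read-then-write can be done in a single server-side call, so no multi-statement transaction is ever left open from the client. A minimal sketch reusing the client from the snippet above; the XQuery body and the use of newServerEval() here are illustrative assumptions, not the only way to do it:

// Hedged sketch: one server-side eval call instead of an open client transaction.
// The XQuery body is just an example update of test.xml.
final String xquery =
        "xdmp:node-replace(fn:doc('test.xml')/doc, <doc><t>t1</t></doc>)";

client.newServerEval()
      .xquery(xquery)
      .eval()
      .close();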

Taking 5 seconds to shutdown a java grpc ManagedChannel

I have a client that needs to disconnect from one server and connect to another. It's taking about 16 seconds. I still haven't debugged the connection logic, but I can see the shutdown of the channel is taking 5 seconds. Is this expected behavior, or should I be looking for thread starvation in my code?
LOG.debug("==============SHUTTING DOWN MANAGED CHANNEL");
long startTime=System.currentTimeMillis();
channel.shutdown().awaitTermination(20, SECONDS);
long endTime=System.currentTimeMillis();
LOG.debug("Time to shutdown channel ms = {}",endTime-startTime);
LOG.debug("==============RETURN FROM SHUTTING DOWN MANAGED CHANNEL");
From the log
2018-07-09 14:41:23,143 DEBUG [com.ticomgeo.ftc.client.FTCClient] (EE-ManagedExecutorService-singleThreaded-Thread-1) ==============SHUTTING DOWN MANAGED CHANNEL
2018-07-09 14:41:28,151 INFO [io.grpc.internal.ManagedChannelImpl] (grpc-default-worker-ELG-1-1) [io.grpc.internal.ManagedChannelImpl-1] Terminated
2018-07-09 14:41:28,152 DEBUG [com.ticomgeo.ftc.client.FTCClient] (EE-ManagedExecutorService-singleThreaded-Thread-1) Time to shutdown channel ms = 5009
2018-07-09 14:41:28,152 DEBUG [com.ticomgeo.ftc.client.FTCClient] (EE-ManagedExecutorService-singleThreaded-Thread-1) ==============RETURN FROM SHUTTING DOWN MANAGED CHANNEL
There are two shutdown functions, shutdown and shutdownNow. Is there any chance you have calls in progress that are blocking shutdown? You may be better served by shutdownNow.
shutdown
Initiates an orderly shutdown in which preexisting calls continue but new calls are rejected.
shutdownNow
Initiates a forceful shutdown in which preexisting and new calls are rejected. Although forceful, the shutdown process is still not instantaneous; isTerminated() will likely return false immediately after this method returns.
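For example, a hedged sketch of the forceful variant (the 5-second wait is an arbitrary example value):

try {
    // Reject new calls and cancel in-flight ones, then bound how long we wait.
    channel.shutdownNow();
    if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
        LOG.warn("Channel did not terminate within 5 seconds");
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}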

Why doesn't Future.get(...) kill the thread?

I have an application that is using a Future for asynchronous execution.
I set the timeout parameter on the get method so that the thread should get killed after 10 seconds when it does not get the response:
Future<RecordMetadata> meta = producer.send(record, new ProducerCallBack());
RecordMetadata data = meta.get(10, TimeUnit.SECONDS);
But the thread gets killed after 60 seconds:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:1124)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:823)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:760)
at io.khinkali.KafkaProducerClient.main(KafkaProducerClient.java:49)
Caused by: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
What am I doing wrong?
From the docs:
The threshold for time to block is determined by max.block.ms after
which it throws a TimeoutException.
Check the Kafka appender config in logback.xml; look for:
<producerConfig>max.block.ms=60000</producerConfig>
I set the timeout parameter on the get method so that the thread should get killed after 10 seconds when it does not get the response:
If we are talking about Future.get(...), there is nothing about it that "kills" the thread at all. To quote from the javadocs, the Future.get(...) method:
Waits if necessary for at most the given time for the computation to complete, and then retrieves its result, if available.
If the get(...) method times out then it will throw TimeoutException, but your thread is free to continue to run. If you want to stop the thread running then you'll need to catch TimeoutException and then call meta.cancel(true), but even that doesn't guarantee that the thread will be "killed". It causes the thread to be interrupted, which means that certain methods will throw InterruptedException, or the thread needs to check Thread.currentThread().isInterrupted().
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
Yeah, this timeout has nothing to do with the Future.get(...) timeout.
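Putting the two answers together, a minimal sketch: lower max.block.ms on the producer itself so the metadata fetch gives up sooner, and cancel the future when get(...) times out. The broker address, topic name and the 10-second values are placeholder assumptions:

import java.util.Properties;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerTimeoutDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // example broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000); // don't block 60 s on metadata

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            Future<RecordMetadata> meta =
                    producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            try {
                RecordMetadata data = meta.get(10, TimeUnit.SECONDS);
                System.out.println("offset=" + data.offset());
            } catch (TimeoutException e) {
                // get() gave up waiting; the send itself may still be in flight.
                meta.cancel(true); // best effort; this does not forcibly kill any thread
            }
        }
    }
}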

Why do WebSphere's threads hang?

I have WAS 7 and FileNet CE 5.1, and I am having trouble.
Why do WebSphere's threads hang? Is it a JDBC driver error?
Could you kindly advise me? Thanks a lot!
[22.06.16 13:14:58:921 YEKT] 0000001d ThreadMonitor W WSVR0605W: Thread "WebContainer : 15" (00000047) has been active for 631301 msec and may be hung. Total threads that may be hung: 69.
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:140)
at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1782)
at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:4838)
at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:6150)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:402)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:350)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:5696)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1715)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:180)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:155)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.execute(SQLServerPreparedStatement.java:332)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecute(WSJdbcPreparedStatement.java:942)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.execute(WSJdbcPreparedStatement.java:618)
at com.filenet.engine.dbpersist.DBExecutionElement.execute(DBExecutionElement.java:218)
at com.filenet.engine.dbpersist.DBExecutionContext.getNextResult(DBExecutionContext.java:106)
at com.filenet.engine.dbpersist.DBStatementList.executeStatements(DBStatementList.java:161)
at com.filenet.engine.persist.DBStatementList2.executeStatementsNoResult(DBStatementList2.java:57)
at com.filenet.engine.persist.IndependentPersister.executeChangeWork(IndependentPersister.java:409)
at com.filenet.engine.persist.IndependentPersister.executeChange(IndependentPersister.java:225)
at com.filenet.engine.persist.SubscribablePersister.executeChange(SubscribablePersister.java:172)
at com.filenet.engine.jca.impl.RequestBrokerImpl.executeChanges(RequestBrokerImpl.java:1266)
at com.filenet.engine.jca.impl.RequestBrokerImpl.executeChanges(RequestBrokerImpl.java:1146)
at com.filenet.engine.ejb.EngineCoreBean._executeChanges(EngineCoreBean.java:618)
The stack indicates that the thread is waiting to receive data from your database.
Possible causes include:
the database is down (or unreachable over the network)
a deadlock has occurred in the database
you are fetching some really big data set and/or doing so inefficiently, such that the statement is taking an excessive amount of time. You never mentioned whether your query ever completes, but if it does, this option is the likely suspect (see the sketch below).
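If that third cause turns out to be the culprit, one generic safeguard is a statement-level query timeout, so a slow statement fails instead of pinning the web container thread indefinitely. A hedged plain-JDBC sketch of the mechanism only (the FileNet engine issues its own statements, so this is not a drop-in fix):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class QueryTimeoutDemo {
    // Bound how long a single statement may block the calling thread.
    static void runWithTimeout(Connection connection, String sql) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setQueryTimeout(60); // seconds; the driver cancels the statement after this
            try (ResultSet rs = statement.executeQuery()) {
                while (rs.next()) {
                    // process each row
                }
            }
        }
    }
}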
