I have Kafka records:
ConsumerRecords<String, Events> records = kafkaConsumer.poll(POLL_TIMEOUT);
I want to run the below code using parallel streams, not multithreading.
records.forEach((record) -> {
Event event = record.value();
I tried with multithreading, but I want to try a parallel stream:
for (ConsumerRecord<String, Event> record : records) {
executor.execute(new Runnable() {
public void run() {
Actually I'm facing issue with HTTP.send with multithreading (even with a thread pool of 1 thread). I'm getting
"Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target".
This is a request over HTTPS. The error occurs only the first time the request is made. Afterwards, the exception vanishes. Poof!
For multithreading I'm using:
int threadCOunt=1;
BlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(threadCOunt, true);
RejectedExecutionHandler handler = new ThreadPoolExecutor.CallerRunsPolicy();
ExecutorService executor = new ThreadPoolExecutor(threadCOunt, threadCOunt, 0L, TimeUnit.MILLISECONDS, queue, handler);
HTTPSend.send() is:
long sizeSend = 0;
SSLContext sc = null;
try {
sc = SSLContext.getInstance("TLS");
sc.init(null, TRUST_ALL_CERTS, new SecureRandom());
} catch (NoSuchAlgorithmException | KeyManagementException e) {
LOGGER.error("Failed to create SSL context", e);
// Ignore differences between given hostname and certificate hostname
HostnameVerifier hv = (hostname, session) -> true;
// Create the REST client and configure it to connect meta
Client client = ClientBuilder.newBuilder()
WebTarget baseTarget = client.target(getURL()).path(HTTP_PATH);
Response jsonResponse = null;
try {
StringBuilder eventsBatchString = new StringBuilder();
Entity<String> entity = Entity.entity(eventsBatchString.toString(), MediaType.APPLICATION_JSON_TYPE);
builder = baseTarget.request();
LOGGER.debug("about to send the event {} and URL {}", entity, getURL());
jsonResponse = builder.header(HTTP_ACK_CHANNEL, guid.toString())
.header("Content-type", MediaType.APPLICATION_JSON)
.header("Authorization", String.format("Meta %s", eventsModuleConfig.getSecretKey()))
I see what you want to do, and I'm not sure that's the best idea (I'm also not sure it's not).
The poll / commit model of Kafka allows simple backpressure and retention of the last item processed if you crashed. By returning to your poll loop "immediately" you are telling Kafka "I am ready for more", and committing the offset (manually or automatically) tells Kafka that you have successfully read up to that point.
What you seem to want to do is read off Kafka as fast as possible, committing offsets as you go, then put the Kafka records into an executor queue and balance your requests per second etc. from there.
I'm not 100% sure that's a good idea: what happens if your app crashes? You may have committed some Kafka messages that actually didn't make it upstream. If you do really want to do this, I would suggest manually committing the offset (via commitSync) upon completion of the Runnable, instead of letting the high level consumer do it for you.
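A minimal sketch of the "commit only after the work is done" idea, using a parallel stream as the question asks. It uses plain JDK types so it runs on its own: `batch` stands in for the `ConsumerRecords` returned by `poll` (which is an `Iterable`), `handler` for the HTTP send, and `commit` for a call such as `kafkaConsumer.commitSync()`. The class and method names are made up for illustration.

```java
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

class BatchThenCommit {
    /**
     * Handles every record of one polled batch in parallel, then runs the commit
     * action on the calling (polling) thread. If any handler throws, the exception
     * propagates and the commit is skipped, so the offsets stay uncommitted.
     */
    static <R> void processThenCommit(Iterable<R> batch,
                                      Consumer<? super R> handler,
                                      Runnable commit) {
        // forEach on a parallel stream returns only after all elements were handled.
        StreamSupport.stream(batch.spliterator(), true).forEach(handler);
        // In the real poll loop this would be something like kafkaConsumer.commitSync().
        commit.run();
    }
}
```

In the real poll loop this might be called as `processThenCommit(records, r -> send(r.value()), kafkaConsumer::commitSync)`, where `send` is your own method. Keep in mind that parallel streams run on the common ForkJoinPool, so blocking HTTP calls there compete with every other parallel stream in the JVM.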
Why might you want to use a thread executor: I think these can be accomplished with Kafka too.
You may want to post multiple messages to the web server at the same time. A well-partitioned Kafka topic lets multiple consumers in a consumer group consume multiple partitions, which (assuming a perfectly scaling HTTP server) would let you parallelize the posting of messages to your server. Yay for process-based concurrency!
Maybe the web server is not perfectly scalable, or slow for this request (say each request takes 1 second): you need to limit the number of requests per second the web server takes, if you have a queue you might have a couple threads posting while not backing up Kafka.
In this case you can set max.poll.records to a value that your web server can keep up with. There's probably a better way to do this too, although it's escaping me at the moment.
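As a rough illustration, the relevant consumer settings could be collected like this. The property names are standard Kafka consumer configs; the values are placeholders you would tune for your own server, and the class name is made up.

```java
import java.util.Properties;

class ConsumerTuning {
    /** Consumer settings for small batches and manual commits. The numbers are placeholders. */
    static Properties smallBatchProps() {
        Properties props = new Properties();
        // At most 50 records are returned by a single poll().
        props.setProperty("max.poll.records", "50");
        // Offsets are committed explicitly, after the batch has been processed.
        props.setProperty("enable.auto.commit", "false");
        // Longest allowed gap between two poll() calls before the consumer is considered failed.
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }
}
```

These entries would be merged with the usual bootstrap server, group id and deserializer settings before creating the consumer.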
If your web server takes a long time to respond you may get errors related to failing heartbeats. In that case I direct you to this SO answer on the timeout / heartbeat topic.
Instead of using a thread executor, thus making synchronous HTTP requests appear to be async, I would use an evented HTTP client (for example one built on Netty), thus achieving parallelism without thread-based concurrency.
For solving a "slow consumer" use case where you're doing I/O processing, you should use something like Parallel Consumer (PC) to avoid the "head of line blocking" problem you're describing.
By using PC, you can process all your keys in parallel, regardless of how long it takes to do your I/O.
It also comes with a non blocking Vert.x module which more efficiently uses the CPU.
PC directly solves for this, by sub partitioning the input partitions by key and processing each key in parallel.
It also tracks per record acknowledgement. Check out Parallel Consumer on GitHub (it's open source BTW, and I'm the author).
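To make the "sub-partition by key" idea concrete, here is a small JDK-only sketch. It is not the Parallel Consumer API, just an illustration of the technique: tasks for different keys run in parallel on a pool, while tasks for the same key are chained so they run in submission order. The class name `KeyOrderedExecutor` is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

/** Runs tasks for different keys in parallel while keeping tasks of one key in submission order. */
class KeyOrderedExecutor {
    // Last scheduled stage per key. Never cleaned up here, so this grows with the number of keys.
    private final ConcurrentHashMap<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final Executor pool;

    KeyOrderedExecutor(Executor pool) {
        this.pool = pool;
    }

    /** Chains the task behind the previous task of the same key and returns its future. */
    CompletableFuture<Void> submit(String key, Runnable task) {
        return tails.compute(key, (k, tail) ->
                tail == null
                        ? CompletableFuture.runAsync(task, pool)
                        // handle() swallows an earlier failure so later tasks of this key still run
                        : tail.handle((ok, err) -> null).thenRunAsync(task, pool));
    }
}
```

Ordering per key follows the order of `submit` calls, so this assumes a single thread (the polling thread) does the submitting. Offsets would still have to be tracked and committed separately, which is the part a library handles for you.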
We are setting up a cluster to handle inferencing (with TensorFlow Serving) over gRPC. We intend to use a layer-7 load balancer (AWS ALB) to distribute the load. For our workload, inferencing will occur many times per minute from each client account. It is my understanding that gRPC holds connection state for each of these channels. As a result, in order for the ALB to do its job, we need to periodically tear down and rebuild the connection on the client instance.
My question: what is the best practice for cycling a connection in Java?
Below is my proposed code, which would be called every couple of minutes on each client channel. I assume that while the first connection is being shut down, we can go about creating a new one and immediately issue a request on it. Or do we need to wait until the prior channel has shut down first? In our situation, the channel will (very likely) be idle, since the previous request will have been 10 seconds earlier.
if (mChannel != null)
mChannel = ManagedChannelBuilder.forAddress(mHost, mPort).usePlaintext().build();
mStub = PredictionServiceGrpc.newBlockingStub(mChannel);
The best practice is to use Lookaside Load Balancing.
However, you can make a few tweaks to terminate client connections.
var builder = ManagedChannelBuilder.forAddress(mHost, mPort)
.keepAliveTime(15, TimeUnit.SECONDS)
.keepAliveTimeout(5, TimeUnit.SECONDS);
The above config should ensure that sticky gRPC connections get terminated, so that AWS ALB can do its job and load balance requests uniformly.
There are other options that you can try depending on your use case, e.g. retries. See ManagedChannelBuilder.
I'm playing around with Vert.x and am quite new to servers based on an event loop, as opposed to the thread/connection model.
public void start(Future<Void> fut) {
.requestHandler(r -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request received - "+start.format(DateTimeFormatter.ISO_DATE_TIME));
final MyModel model = new MyModel();
try {
for(int i=0;i<10000000;i++){
//some simple operation
model.data = start.format(DateTimeFormatter.ISO_DATE_TIME) +" - "+LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);
} catch (Exception e1) {
// TODO Auto-generated catch block
new Gson().toJson(model)
.listen(4568, result -> {
if (result.succeeded()) {
} else {
System.out.println("Server started ..");
I'm just trying to simulate a long running request handler to understand how this model works.
What I've observed is the so called event loop is blocked until my first request completes. Whatever little time it takes, subsequent request is not acted upon until the previous one completes.
Obviously I'm missing a piece here and that's the question that I have here.
Edited based on the answers so far:
Isn't accepting all requests considered to be asynchronous? If a new
connection can only be accepted when the previous one is cleared
off, how is it async?
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request). So it means, the
event loop can't accept a new connection until the previous request
finishes(even if its winds up in a second). And If I as a programmer
have to think through all these and push such request handlers to a
worker thread, then how does it differ from a thread/connection model?
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve
c10k problem, when it can't start all concurrent requests parallely and have to wait till the previous one terminates?
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
Edits and topping this question for a bounty
I do not completely understand how this model is claimed to be asynchronous.
Vert.x has an async JDBC client (asynchronous is the keyword), which I tried to adapt with RxJava.
Here is a code sample (Relevant portions)
server.requestStream().toObservable().subscribe(req -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request for " + req.absoluteURI() +" received - " +start.format(DateTimeFormatter.ISO_DATE_TIME));
conn -> {
// Now chain some statements using flatmap composition
Observable<ResultSet> resa = conn.queryObservable("SELECT * FROM CALL_OPTION WHERE UNDERLYING='NIFTY'");
// Subscribe to the final result
resa.subscribe(resultSet -> {
System.out.println("Request for " + req.absoluteURI() +" Ended - " +LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME));
}, err -> {
System.out.println("Database problem");
// Could not connect
err -> {
The select query there takes 3 seconds approx to return the complete table dump.
When I fire concurrent requests (tried with just 2), I see that the second request waits for the first one to complete entirely.
If the JDBC select is asynchronous, isn't it a fair expectation that the framework handles the second connection while it waits for the select query to return anything?
The Vert.x event loop is, in fact, a classical event loop of the kind that exists on many platforms. Most explanations and docs can be found for Node.js, as it's the most popular framework based on this architecture pattern. Take a look at one more or less good explanation of the mechanics of the Node.js event loop. The Vert.x tutorial also has a fine explanation in its "Don’t call us, we’ll call you" and "Verticles" sections.
Edit for your updates:
First of all, when you are working with an event loop, the main thread should work very quickly for all requests. You shouldn't do any long job in this loop. And of course, you shouldn't wait for a response to your call to the database.
- Schedule a call asynchronously
- Assign a callback (handler) to result
- Callback will be executed in the worker thread, not event loop thread. This callback, for example, will return a response to the socket.
So, your operations in the event loop should just schedule all asynchronous operations with callbacks and go to the next request without awaiting any results.
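Here is a toy illustration of that flow in plain Java. It is not how Vert.x is implemented, only a sketch of the pattern: a single-threaded executor plays the event loop, a `CompletableFuture` stands in for a pending asynchronous IO call, and the callback is posted back to the loop thread (an assumption of this sketch; where callbacks run depends on the framework). `TinyEventLoop` and its methods are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy single-threaded event loop: handlers only schedule work and register callbacks. */
class TinyEventLoop {
    private final ExecutorService loop = Executors.newSingleThreadExecutor();
    final List<String> log = new CopyOnWriteArrayList<>();

    /** Accepts a request: registers a callback on the pending IO and returns at once. */
    void accept(int requestId, CompletableFuture<String> pendingIo) {
        loop.execute(() -> {
            log.add("received " + requestId);
            // The callback runs later on the loop thread; the handler itself never waits.
            pendingIo.thenAcceptAsync(result -> log.add("done " + requestId + ": " + result), loop);
        });
    }

    /** Blocks the caller until everything queued on the loop so far has run. */
    void drain() {
        CompletableFuture.runAsync(() -> { }, loop).join();
    }

    void shutdown() {
        loop.shutdown();
    }
}
```

If two requests are accepted while both IO futures are still pending, the log shows both "received" lines before any "done" line, and the "done" lines appear in the order the IO completes, not in the order the requests arrived.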
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request).
In that case, your request has some computation expensive parts or access to IO - your code in the event loop shouldn't wait for the result of these operations.
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or all the I/O op are handled asynchronously?
When you have too many concurrent requests and a traditional programming model, you create a thread per request. What will these threads do? They will mostly be waiting for IO operations (for example, a result from the database). It's a waste of resources. In the event loop model, you have one main thread that schedules operations and a preallocated number of worker threads for long tasks. None of these workers actually waits for a response; they can execute other code while waiting for an IO result (this can be implemented as callbacks or as periodic checks on the status of the IO jobs currently in progress).
I would recommend you go through Java NIO and Java NIO 2 to understand how this async IO can actually be implemented inside the framework. Green threads are a closely related concept that is also worth understanding. Green threads and coroutines are a kind of hidden event loop that tries to achieve the same thing: fewer threads, because a system thread can be reused while a green thread is waiting for something.
How does it even solve c10k problem, when it can't start all concurrent requests parallel and have to wait till the previous one terminates?
For sure we don't wait in the main thread for sending the response for the previous request. Get request, schedule long/IO tasks execution, next request.
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
If you do everything right, no. Even more, you get good data locality and predictable execution flow. One CPU core executes your short event loop and schedules async work without context switching, and nothing more. Other cores make the calls to the database and return the responses, and only that. Switching between callbacks or checking different channels for IO status doesn't require any system-thread context switching; it all happens within one worker thread. So we have one worker thread per core, and this one system thread awaits/checks the availability of results from multiple connections (to the database, for example). Revisit the Java NIO concepts to understand how it can work this way. (The classical NIO example is a proxy server that accepts many parallel connections (thousands), proxies requests to other remote servers, listens for responses and sends them back to the clients, all using one or two threads.)
About your code, I made a sample project for you to demonstrate that everything works as expected:
public class MyFirstVerticle extends AbstractVerticle {
public void start(Future<Void> fut) {
JDBCClient client = JDBCClient.createShared(vertx, new JsonObject()
.put("url", "jdbc:hsqldb:mem:test?shutdown=true")
.put("driver_class", "org.hsqldb.jdbcDriver")
.put("max_pool_size", 30));
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
// create a table
connection.execute("create table test(id int primary key, name varchar(255))", create -> {
if (create.failed()) {throw new RuntimeException(create.cause());}
.requestHandler(r -> {
int requestId = new Random().nextInt();
System.out.println("Request " + requestId + " received");
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
connection.execute("insert into test values ('" + requestId + "', 'World')", insert -> {
// query some data with arguments
.queryWithParams("select * from test where id = ?", new JsonArray().add(requestId), rs -> {
connection.close(done -> {if (done.failed()) {throw new RuntimeException(done.cause());}});
System.out.println("Result " + requestId + " returned");
.listen(8080, result -> {
if (result.succeeded()) {
} else {
public class MyFirstVerticleTest {
private Vertx vertx;
public void setUp(TestContext context) {
vertx = Vertx.vertx();
public void tearDown(TestContext context) {
public void testMyApplication(TestContext context) {
for (int i = 0; i < 10; i++) {
final Async async = context.async();
vertx.createHttpClient().getNow(8080, "localhost", "/",
response -> response.handler(body -> {
Request 1412761034 received
Request -1781489277 received
Request 1008255692 received
Request -853002509 received
Request -919489429 received
Request 1902219940 received
Request -2141153291 received
Request 1144684415 received
Request -1409053630 received
Request -546435082 received
Result 1412761034 returned
Result -1781489277 returned
Result 1008255692 returned
Result -853002509 returned
Result -919489429 returned
Result 1902219940 returned
Result -2141153291 returned
Result 1144684415 returned
Result -1409053630 returned
Result -546435082 returned
So, we accept a request - schedule a request to the database, go to the next request, we consume all of them and send a response for each request only when everything is done with the database.
About your code sample, I see two possible issues. First, it looks like you don't close() the connection, which is important for returning it to the pool. Second, how is your pool configured? If there is only one free connection, these requests will be serialized waiting for that connection.
I recommend adding timestamp printouts for both requests to find the place where they get serialized. Something is making the calls in the event loop block. Or... check that your test really sends the requests in parallel, not each one only after getting the response to the previous one.
How is this asynchronous? The answer is in your question itself
What I've observed is the so called event loop is blocked until my
first request completes. Whatever little time it takes, subsequent
request is not acted upon until the previous one completes
The idea is that instead of having a new thread for serving each HTTP request, the same thread is used, and you have blocked it with your long-running task.
The goal of the event loop is to save the time spent context switching from one thread to another and to make use of idle CPU time while a task is waiting on IO/network activity. If, while handling your request, it had to do some other IO/network operation (e.g. fetching data from a remote MongoDB instance), your thread would not be blocked during that time; instead another request would be served by the same thread, which is the ideal use case of the event loop model (considering that you have concurrent requests coming to your server).
If you have long-running tasks that do not involve network/IO operations, you should consider using a thread pool instead; if you block the main event loop thread itself, other requests are delayed. I.e. for long-running tasks you are okay to pay the price of context switching for the server to stay responsive.
The way a server can handle requests can vary:
1) Spawn a new thread for each incoming request (In this model the context switching would be high and there is additional cost of spawning a new thread every time)
2) Use a thread pool to serve the requests (the same set of threads is used to serve requests, and extra requests get queued up)
3) Use an event loop (a single thread for all the requests. Negligible context switching, because there would still be some threads running, e.g. to queue up the incoming requests)
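The three models can be compared with plain JDK executors. This is only a sketch with made-up names: `THREAD_PER_REQUEST` spawns a thread per task, a fixed pool plays model 2, and a single-thread executor plays the event loop of model 3. The helper counts how many distinct threads ended up serving a number of trivial requests.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

class ServerModels {
    /** Model 1: a brand-new thread for every request. */
    static final Executor THREAD_PER_REQUEST = task -> new Thread(task).start();

    /** Runs the given number of trivial handlers and returns how many distinct threads served them. */
    static int threadsUsed(Executor executor, int requests) {
        Set<Thread> threads = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            executor.execute(() -> {
                threads.add(Thread.currentThread());
                done.countDown();
            });
        }
        try {
            done.await(); // wait until every handler has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threads.size();
    }
}
```

With 8 requests, `threadsUsed(ServerModels.THREAD_PER_REQUEST, 8)` reports 8 threads, `Executors.newFixedThreadPool(2)` reports at most 2, and `Executors.newSingleThreadExecutor()` reports exactly 1. Remember to shut the pools down afterwards.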
First of all, context switching is not bad; it is required to keep the application server responsive. But too much context switching can be a problem if the number of concurrent requests goes too high (roughly more than 10k). If you want to understand this in more detail, I recommend reading the C10K article.
Assume a typical request takes anywhere between 100 ms to 1 sec (based
on the kind and nature of the request). So it means, the event loop
can't accept a new connection until the previous request finishes(even
if its winds up in a second).
If you need to respond to a large number of concurrent requests (more than 10k), I would consider anything over 500 ms a long-running operation. Secondly, like I said, there is some thread/context switching involved, e.g. to queue up incoming requests, but the context switching among threads is greatly reduced because there are only a few threads at a time. Thirdly, if a network/IO operation is involved in resolving the first request, the second request gets a chance to be resolved before the first one is; this is where the model plays well.
And If I as a programmer have to think
through all these and push such request handlers to a worker thread ,
then how does it differ from a thread/connection model?
Vert.x is trying to give you the best of both threads and the event loop, so as a programmer you can decide how to make your application efficient in both scenarios, i.e. long-running operations with and without network/IO.
I'm just trying to understand how is this model better from a
traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve c10k
problem, when it can't start all concurrent requests parallely and
have to wait till the previous one terminates?
The above explanation should answer this.
Even if I decide to push all these operations to a worker
thread(pooled), then I'm back to the same problem isn't it? Context
switching between threads?
Like I said, both have pros and cons. Vert.x gives you both models, and depending on your use case you have to choose what is ideal for your scenario.
In this sort of processing engine, you are supposed to turn long-running tasks into asynchronously executed operations, and there is a methodology for doing this, so that the critical thread can complete as quickly as possible and return to perform another task, i.e. any IO operations are passed to the framework, which calls you back when the IO is done.
The framework is asynchronous in the sense that it supports you producing and running these asynchronous tasks, but it doesn't change your code from being synchronous to asynchronous.
I am using the Oracle Jersey Client, and am trying to cancel a long running get or put operation.
The Client is constructed as:
JacksonJsonProvider provider = new JacksonJsonProvider(new ObjectMapper());
ClientConfig clientConfig = new DefaultClientConfig();
Client client = Client.create(clientConfig);
The following code is executed on a worker thread:
File bigZipFile = new File("/home/me/everything.zip");
WebResource resource = client.resource("https://putfileshere.com");
Builder builder = resource.getRequestBuilder();
builder.type("application/zip").put(bigZipFile); //This will take a while!
I want to cancel this long-running put. When I try to interrupt the worker thread, the put operation continues to run. From what I can see, the Jersey Client makes no attempt to check for Thread.interrupted().
I see the same behavior when using an AsyncWebResource instead of WebResource and using Future.cancel(true) on the Builder.put(..) call.
So far, the only solution I have come up with to interrupt this is throwing a RuntimeException in a ContainerListener:
client.addFilter(new ConnectionListenerFilter(
new OnStartConnectionListener(){
public ContainerListener onStart(ClientRequest cr) {
return new ContainerListener(){
public void onSent(long delta, long bytes) {
//If the thread has been interrupted, stop the operation
if (Thread.interrupted()) {
throw new RuntimeException("Upload or Download canceled");
//Report progress otherwise
I am wondering if there is a better solution (perhaps when creating the Client) that correctly handles interruptible I/O without using a RuntimeException.
I am wondering if there is a better solution (perhaps when creating the Client) that correctly handles interruptible I/O without using a RuntimeException.
Yeah, interrupting the thread will only work if the code is watching for the interrupts or calling other methods (such as Thread.sleep(...)) that watch for it.
Throwing an exception out of listener doesn't sound like a bad idea. I would certainly create your own RuntimeException class such as TimeoutRuntimeException or something so you can specifically catch and handle it.
Another thing to do would be to close the underlying IO stream that is being written to which would cause an IOException but I'm not familiar with Jersey so I'm not sure if you can get access to the connection.
Ah, here's an idea. Instead of putting the File, how about putting some sort of extension on a BufferedInputStream that is reading from the File but also has a timeout. So Jersey would be reading from the buffer and at some point it would throw an IOException if the timeout expires.
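A rough sketch of such a stream wrapper, using only the JDK. The class name `DeadlineInputStream` is made up, and it assumes the client is able to send an `InputStream` as the request entity. It checks before every read whether the deadline has passed or the reading thread has been interrupted, and fails with an `IOException` if so.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/** Wraps a stream and fails reads once a deadline has passed or the reading thread is interrupted. */
class DeadlineInputStream extends FilterInputStream {
    private final long deadlineNanos;

    DeadlineInputStream(InputStream in, long timeoutMillis) {
        super(in);
        this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
    }

    private void check() throws IOException {
        // isInterrupted() leaves the interrupt flag set for the caller to inspect.
        if (Thread.currentThread().isInterrupted()) {
            throw new InterruptedIOException("Transfer canceled");
        }
        if (System.nanoTime() - deadlineNanos >= 0) {
            throw new IOException("Transfer timed out");
        }
    }

    @Override
    public int read() throws IOException {
        check();
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        check();
        return super.read(b, off, len);
    }
}
```

The file would be wrapped as `new DeadlineInputStream(new BufferedInputStream(new FileInputStream(bigZipFile)), timeoutMillis)`. Note that the check only happens between reads, so a write that is already blocked on the socket is not aborted by this alone.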
As of Jersey 2.35, the above API has changed. A timeout has been introduced in the client builder, which can set a read timeout. If the server takes too long to respond, the underlying socket will time out. However, once the server starts sending the response, it will not time out. So this helps only if the server does not start sending a partial response, which depends on the server implementation.
.connectTimeout(1*1000, TimeUnit.MILLISECONDS)
.readTimeout(5*1000, TimeUnit.MILLISECONDS).build()
The current filters and interceptors are for data only and the solution posted in the original question will not work with filters and interceptors (though I admit I may have missed something there).
Another way is to get hold of the underlying HttpURLConnection (for the standard Jersey client configuration), which seems to be possible with org.glassfish.jersey.client.HttpUrlConnectorProvider:
HttpUrlConnectorProvider httpConProvider=new HttpUrlConnectorProvider();
httpConProvider.connectionFactory(new CustomHttpUrlConnectionfactory());
public static class CustomHttpUrlConnectionfactory implements
public HttpURLConnection getConnection(URL url) throws IOException {
System.out.println("CustomHttpUrlConnectionfactory ..... called");
return (HttpURLConnection)url.openConnection();
}//getConnection closing
}//inner-class closing
I did try the connection provider approach, however, I could not get that working. The idea would be to keep reference to the connection by some means (thread id etc.) and close it if the communication is taking too long. The primary problem was I could not find a way to register the provider with the client. The standard
mechanism does not seem to work (or perhaps it is not supposed to work like that) and the documentation is a bit sketchy in that direction.
I have a multi-client server application in Java. The server keeps accepting connections, and each client is handled by a separate thread. The client/server communication goes on until the socket is closed. Each request received from a client is put in a LinkedBlockingQueue, and another thread processes the requests from that queue. Since the client request is added to the queue, I am using a ConcurrentHashMap to look up the client socket later, once the request has been processed and the response is ready, so that I can send the response to the client.
Now I need to implement timeout functionality: if the request is not processed and the response is not ready within a time period, some sort of message is sent to the client saying that the request cannot be processed now. Can anybody tell me the best way to do this in a multithreaded environment? Remember that I have a client map in which the client connection is stored against each request id.
I am thinking of having a separate thread that keeps iterating over the map keys and checks the time. But since requests keep being added to the map, I want a better way to do it.
Guava's loading cache can solve the timeout and concurrent modifications for you: https://code.google.com/p/guava-libraries/wiki/CachesExplained Exchange your request map to a LoadingCache by setting it up like this:
LoadingCache<Request, Connection> requests = CacheBuilder.newBuilder()
.expireAfterAccess(1, TimeUnit.MINUTES)
new CacheLoader<Request, Connection>() {
public Connection load(Request request) throws AnyException {
return clientConnectionForRequest(request);
When a request comes in, you load it in the cache:
After this, the request will sit there waiting to be processed. If processing is started, then get the connection and invalidate the request, so it is removed from the cache. ①
Connection c = requests.getIfPresent(request);
if (c != null) {
requests.invalidate(request); // remove from the waiting area
// proceed with processing the request
} else {
// the request was evicted from the cache as it expired
In the removal listener you need to implement some simple logic that listens for evictions. (If you invalidate explicitly, then wasEvicted() will return false.)
MY_LISTENER = new RemovalListener<Request, Connection>() {
public void onRemoval(RemovalNotification<Request, Connection> notification) {
if (notification.wasEvicted()) {
Connection c = notification.getValue();
// send timeout response to client
You can order the requests by placing them in a queue and executing the method described at ①. That method also takes care of executing only those requests that have not timed out yet, so you need no additional housekeeping.
Use ConcurrentHashMap. It allows full concurrency for reads and adjustable concurrency for writes. It uses volatile variables to store the data, so even if a thread is modifying a bucket, the change will be visible to any other thread trying to read data from the same bucket.
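One way to get timeouts on top of a plain ConcurrentHashMap, without a thread scanning the keys, is to arm a timer per request and let `remove` decide who wins. This is a sketch with hypothetical names (`PendingRequests`, `register`, `claim`); `C` is whatever object represents the client connection, and `onTimeout` is where you would send the "cannot be processed now" message.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Pending requests with a per-request timeout. */
class PendingRequests<C> {
    private final ConcurrentHashMap<String, C> pending = new ConcurrentHashMap<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final Consumer<C> onTimeout;

    PendingRequests(Consumer<C> onTimeout) {
        this.onTimeout = onTimeout;
    }

    /** Remembers the connection and arms a timer for this request id. */
    void register(String requestId, C connection, long timeoutMillis) {
        pending.put(requestId, connection);
        timer.schedule(() -> {
            C c = pending.remove(requestId); // atomic: null if the worker already claimed it
            if (c != null) {
                onTimeout.accept(c);
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by the worker when the response is ready; returns null if the request already timed out. */
    C claim(String requestId) {
        return pending.remove(requestId);
    }

    void shutdown() {
        timer.shutdownNow();
    }
}
```

Because `remove` is atomic, exactly one side gets the connection: either the worker sends the real response or the timer sends the timeout message, never both.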
I have an Akka actor that owns an AsyncHttpClient. This actor must handle a lot of asynchronous requests. Because my system cannot handle thousands of requests simultaneously, I need to limit the number of concurrent requests.
Right now, I'm doing this :
AsyncHttpClientConfig config = new AsyncHttpClientConfig.Builder().setAllowPoolingConnection(true)
.addRequestFilter(new ThrottleRequestFilter(32))
final AsyncHttpClient httpClient = new AsyncHttpClient(new NettyAsyncHttpProvider(config));
When my actor receives a message, I use the client like this :
Future<Integer> f = httpClient.prepareGet(url).execute(
new AsyncCompletionHandler<Integer>() {
public Integer onCompleted(Response response) throws Exception {
// handle successful request
public void onThrowable(Throwable t){
// handle failed request
The problem is that requests are never put in the client queue and are all processed as if the configuration didn't matter. Why doesn't this work as it should?
From the maintainer:
setMaxConnectionsPerHost only caps the number of connections that can be open to a given host. There's no built-in queuing mechanism for requests that might need a connection while there's none available.
So basically, it's a hard limit. Also, in versions of the library prior to, I believe, 1.9.10, the maximumConnectionsPerHost field was not being properly utilized by the code to limit the number of concurrent connections per host. Instead, there was a bug where the client only looked at the maximumConnectionsTotal field.
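If you need queuing rather than a hard limit, one option is to put a small limiter in front of the client. The following is a generic JDK sketch, not part of AsyncHttpClient: `ConcurrencyLimiter` and `submit` are made-up names, and each operation is assumed to be a function that starts an async call and returns a `CompletableFuture`. At most `maxInFlight` operations run at once; the rest wait in a FIFO queue and start as slots free up, without blocking the caller.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Caps how many asynchronous operations are in flight; extra ones wait in a FIFO queue. */
class ConcurrencyLimiter {
    private final int maxInFlight;
    private int inFlight = 0;
    private final Queue<Runnable> waiting = new ArrayDeque<>();

    ConcurrencyLimiter(int maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    /** Starts the operation now if a slot is free, otherwise queues it. Never blocks the caller. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> operation) {
        CompletableFuture<T> result = new CompletableFuture<>();
        Runnable start = () -> {
            CompletableFuture<T> op;
            try {
                op = operation.get();
            } catch (RuntimeException e) {
                op = CompletableFuture.failedFuture(e); // a failed start still frees its slot
            }
            op.whenComplete((value, error) -> {
                if (error != null) result.completeExceptionally(error);
                else result.complete(value);
                releaseSlot();
            });
        };
        synchronized (this) {
            if (inFlight >= maxInFlight) {
                waiting.add(start);
                return result;
            }
            inFlight++;
        }
        start.run();
        return result;
    }

    private void releaseSlot() {
        Runnable next;
        synchronized (this) {
            next = waiting.poll();
            if (next == null) inFlight--; // nobody waiting: free the slot
        }
        if (next != null) next.run();     // hand the slot straight to the next operation
    }
}
```

Queued operations start on whichever thread completes the previous one, so inside an actor you may prefer to send yourself a message from the completion callback instead of running the next request inline.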
Link to issue referenced on GitHub