Java Async HttpClient request seems to block the main thread?

According to this the following snippet should be Async.
Therefore, the output should read: TP1, TP2, TP3, http://openjdk.java.net/.
However, when I run it I get: TP1, TP2, http://openjdk.java.net/, TP3.
It seems "sendAsync" is blocking the main thread. This is not what I expected from an Async method.
Am I doing something wrong?
public static void main(String[] args) {
HttpClient client = HttpClient.newHttpClient();
System.out.println("TP1");
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create("http://openjdk.java.net/"))
.build();
System.out.println("TP2");
client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
.thenApply(HttpResponse::uri)
.thenAccept(System.out::println)
.join();
System.out.println("TP3");
}

Explanation
You call join() and that will explicitly wait and block until the future is completed.
From CompletableFuture#join:
Returns the result value when complete, or throws an (unchecked) exception if completed exceptionally. [...]
Although it is not stated explicitly, it is obvious from the name (compare Thread#join, which "Waits for this thread to die.") that it can only return a result by waiting for the call to complete.
The method is very similar to CompletableFuture#get; they differ in how they report exceptional completion (join throws an unchecked CompletionException, get a checked ExecutionException). From CompletableFuture#get:
Waits if necessary for this future to complete, and then returns its result.
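To make the difference between join() and get() concrete, here is a minimal sketch. The class name JoinVersusGet and the "boom" failure are made up for illustration; only the CompletableFuture calls come from the JDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;

public class JoinVersusGet {
    // Describes how join() reports a future that completed exceptionally.
    public static String viaJoin(CompletableFuture<String> future) {
        try {
            return future.join();
        } catch (CompletionException e) { // unchecked, no throws clause needed
            return "join: CompletionException caused by " + e.getCause().getMessage();
        }
    }

    // Describes how get() reports a future that completed exceptionally.
    public static String viaGet(CompletableFuture<String> future) {
        try {
            return future.get();
        } catch (ExecutionException e) { // checked, must be handled or declared
            return "get: ExecutionException caused by " + e.getCause().getMessage();
        } catch (InterruptedException e) { // checked as well
            Thread.currentThread().interrupt();
            return "get: interrupted";
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("boom"));
        System.out.println(viaJoin(failed)); // join: CompletionException caused by boom
        System.out.println(viaGet(failed));  // get: ExecutionException caused by boom
    }
}
```

Both calls block until the future is complete; the only thing that changes is which exception type wraps the original failure.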
Solution
Put the future into a variable and join later, when you actually want to wait for it.
For example:
System.out.println("TP2");
var task = client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
.thenApply(HttpResponse::uri)
.thenAccept(System.out::println);
System.out.println("TP3");
task.join(); // wait later
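Or never wait on it. Your main thread might then finish earlier, and the JVM shuts down once all non-daemon threads are dead. Whether the response is still printed therefore depends on whether the threads HttpClient uses for the async task are daemon threads; with the default executor they typically are, so the program may exit before the callback runs.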
Note
Also, never rely on the order of multithreaded execution.
Even if you hadn't made a mistake, the order you observed would still be a valid order for a multithreaded execution.
Remember that the OS scheduler is free to decide in which order it executes what - it can be any order.
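The "join later" idea can be tried without any network access. The following is only a sketch: supplyAsync stands in for client.sendAsync, and the CountDownLatch (an addition for this demo, not part of the original code) holds the simulated response back so that the recorded order is deterministic instead of depending on the scheduler.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class JoinLater {
    // Returns the order in which the events were recorded.
    public static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch gate = new CountDownLatch(1); // keeps the "response" back

        events.add("TP2");
        // Stand-in for client.sendAsync(...): completes on another thread
        // once the gate has been opened.
        CompletableFuture<Void> task = CompletableFuture
                .supplyAsync(() -> {
                    try {
                        gate.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    return "response";
                })
                .thenAccept(events::add);

        events.add("TP3"); // reached immediately, nothing has blocked so far
        gate.countDown();  // let the simulated response arrive
        task.join();       // the only place where this thread waits
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [TP2, TP3, response]
    }
}
```

Without the latch, "response" could legitimately appear before "TP3", which is exactly the point of the note above.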

Related

Java 9 HttpClient hangs

I'm experimenting with HTTP/2 client from jdk 9-ea+171. The code is taken from this example:
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
.uri(new URI("https://www.google.com/"))
.build();
HttpResponse<String> response
= client.send(request, HttpResponse.BodyHandler.asString());
But the client hangs on the last line forever. Please advise how to fix it.
Debugging shows it infinitely waits in method waitUntilPrefaceSent().

This is a bug in the latest build's implementation of an HTTP/2 connection. It does not occur with previous builds.
First of all, you need to specify the GET method to avoid getting a null pointer exception.
What happens is that the main thread waits for the connection preface to be sent. It blocks on a countdown latch to await receipt of this preface. To wake it up, every HttpClient creates a helper thread that reads incoming traffic. This thread is supposed to wake up the main thread, but sometimes this never happens. If you run your example often enough, you will see that it sometimes works. I guess there is a race in reading the preface.
Unfortunately, the reading of the preface does not respect any timeout either, so there is no way of waking up the main thread, other than interrupting the main thread.
Here is an official ticket: https://bugs.openjdk.java.net/browse/JDK-8181430

How to interrupt a function call in Java

I am trying to use a third-party internal library that processes a given request. Unfortunately, it is synchronous in nature, and I have no control over its code. Basically it is a function call. This function seems to be a bit erratic in behavior: sometimes it takes 10 ms to complete processing, and sometimes it takes up to 300 seconds.
Can you suggest a way to write a wrapper around this function so that it throws an interrupted exception if the function does not complete processing within x ms/secs? I can live with not having the results and continue processing, but cannot tolerate a 3 min delay.
PS: This function internally sends an update to another system using JMS and waits for that system to respond, apart from doing some other calculations.

Can you suggest me a way to write a wrapper around this function so that it would throw an interrupted exception if the function does not complete processing with x ms/secs.
This is not possible. InterruptedException is only thrown by specific methods. You can certainly call thread.stop(), but this is deprecated and not recommended for a number of reasons.
A better alternative would be for your code to wait for the response for a certain amount of time and just abandon the call if it doesn't complete in time. For example, you could submit a Callable to a thread pool that actually makes the call to the "Third Party Internal Library". Then your main code would do a future.get(...) with a specific timeout.
// allows 5 JMS calls concurrently, change as necessary or use newCachedThreadPool()
ExecutorService threadPool = Executors.newFixedThreadPool(5);
...
// submit the call to be made in the background by thread-pool
Future<Response> future = threadPool.submit(new Callable<Response>() {
public Response call() {
// this damn call can take 3 to 3000ms to complete dammit
return thirdPartyInternalLibrary.makeJmsRequest();
}
});
// wait for some max amount of time
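Response response = null;
try {
// the timeout value comes first, then its unit
response = future.get(100, TimeUnit.MILLISECONDS);
} catch (TimeoutException te) {
// log that it timed out and continue or throw an exception
} catch (InterruptedException | ExecutionException e) {
// interrupted while waiting, or the library call itself threw; handle as appropriate
}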
The problem with this method is that you might spawn a whole bunch of threads waiting for the library to respond to the remote JMS query that you would not have a lot of control over.
No easy solution.
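Putting the pieces above together, a self-contained version of this approach might look like the sketch below. The names TimeoutWrapper, callWithTimeout and slowCall are hypothetical, and the sleeping lambda merely stands in for the third-party library call. Note that future.cancel(true) only interrupts the worker thread; it helps only if the library call reacts to interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutWrapper {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Runs the call on the pool; returns its result, or the fallback on timeout.
    public <T> T callWithTimeout(Callable<T> call, long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker thread
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // the call itself failed; surface its original cause
            throw new IllegalStateException("call failed", e.getCause());
        }
    }

    public void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        TimeoutWrapper wrapper = new TimeoutWrapper();
        try {
            // slowCall stands in for the erratic third-party function
            Callable<String> slowCall = () -> { Thread.sleep(5_000); return "late"; };
            Callable<String> fastCall = () -> "ok";
            System.out.println(wrapper.callWithTimeout(fastCall, 500, "timed out")); // ok
            System.out.println(wrapper.callWithTimeout(slowCall, 100, "timed out")); // timed out
        } finally {
            wrapper.shutdown();
        }
    }
}
```

If the library ignores interrupts, the worker thread stays busy until the call returns, which is the thread build-up problem mentioned above.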

This will throw a TimeoutException if the lambda doesn't finish in the time allotted:
CompletableFuture.supplyAsync(() -> yourCall()).get(1, TimeUnit.SECONDS)
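If you would rather continue with a default value than catch a TimeoutException, CompletableFuture#completeOnTimeout (Java 9 and later) is an alternative. This is a sketch only: getOrFallback and slowCall are made-up names, and the supplier keeps running in the background after the timeout, it is not cancelled.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class FallbackOnTimeout {
    // Returns the supplier's value, or the fallback if it is not ready in time.
    public static <T> T getOrFallback(Supplier<T> call, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call)
                // completes the future with the fallback if it is still pending
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                .join();
    }

    public static void main(String[] args) {
        // slowCall stands in for the third-party function
        Supplier<String> slowCall = () -> {
            try {
                Thread.sleep(2_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "late";
        };
        System.out.println(getOrFallback(() -> "ok", 1_000, "fallback")); // ok
        System.out.println(getOrFallback(slowCall, 100, "fallback"));     // fallback
    }
}
```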

Since this is third-party code, you cannot modify it. As such, you will need to do two things:
Launch the execution in a new thread.
Wait for execution in current thread, with timeout.
One possible way would be to use a Semaphore.
final Semaphore semaphore = new Semaphore(0);
Thread t = new Thread(new Runnable() {
@Override
public void run() {
// do work
semaphore.release();
}
});
t.start();
try {
if (!semaphore.tryAcquire(1, TimeUnit.SECONDS)) { // whatever your timeout is
// timed out: the work is still running in thread t
}
} catch (InterruptedException e) {
// handle cleanup
}
The above method is gross; I would suggest instead updating your design to use a dedicated worker queue or RxJava with a timeout, if possible.

Vert.x Event loop - How is this asynchronous?

I'm playing around with Vert.x and quite new to the servers based on event loop as opposed to the thread/connection model.
public void start(Future<Void> fut) {
vertx
.createHttpServer()
.requestHandler(r -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request received - "+start.format(DateTimeFormatter.ISO_DATE_TIME));
final MyModel model = new MyModel();
try {
for(int i=0;i<10000000;i++){
//some simple operation
}
model.data = start.format(DateTimeFormatter.ISO_DATE_TIME) +" - "+LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);
} catch (Exception e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
r.response().end(
new Gson().toJson(model)
);
})
.listen(4568, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
System.out.println("Server started ..");
}
I'm just trying to simulate a long running request handler to understand how this model works.
What I've observed is the so called event loop is blocked until my first request completes. Whatever little time it takes, subsequent request is not acted upon until the previous one completes.
Obviously I'm missing a piece here and that's the question that I have here.
Edited based on the answers so far:
Isn't accepting all requests considered to be asynchronous? If a new
connection can only be accepted when the previous one is cleared
off, how is it async?
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request). So it means, the
event loop can't accept a new connection until the previous request
finishes(even if its winds up in a second). And If I as a programmer
have to think through all these and push such request handlers to a
worker thread , then how does it differ from a thread/connection
model?
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve
c10k problem, when it can't start all concurrent requests in parallel and has to wait till the previous one terminates?
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
Edits and topping this question for a bounty
I do not completely understand how this model is claimed to be asynchronous.
Vert.x has an async JDBC client (asynchronous is the keyword), which I tried to adapt with RxJava.
Here is a code sample (Relevant portions)
server.requestStream().toObservable().subscribe(req -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request for " + req.absoluteURI() +" received - " +start.format(DateTimeFormatter.ISO_DATE_TIME));
jdbc.getConnectionObservable().subscribe(
conn -> {
// Now chain some statements using flatmap composition
Observable<ResultSet> resa = conn.queryObservable("SELECT * FROM CALL_OPTION WHERE UNDERLYING='NIFTY'");
// Subscribe to the final result
resa.subscribe(resultSet -> {
req.response().end(resultSet.getRows().toString());
System.out.println("Request for " + req.absoluteURI() +" Ended - " +LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME));
}, err -> {
System.out.println("Database problem");
err.printStackTrace();
});
},
// Could not connect
err -> {
err.printStackTrace();
}
);
});
server.listen(4568);
The select query there takes 3 seconds approx to return the complete table dump.
When I fire concurrent requests(tried with just 2), I see that the second request completely waits for the first one to complete.
If the JDBC select is asynchronous, isn't it a fair expectation that the framework handles the second connection while it waits for the select query to return anything?

The Vert.x event loop is, in fact, a classical event loop of the kind that exists on many platforms. Most explanations and docs can be found for Node.js, as it is the most popular framework based on this architecture pattern. Take a look at one of the more or less good explanations of the mechanics of the Node.js event loop. The Vert.x tutorial also has a fine explanation, between the "Don’t call us, we’ll call you" and "Verticles" sections.
Edit for your updates:
First of all, when you are working with an event loop, the main thread should work very quickly for all requests. You shouldn't do any long job in this loop. And of course, you shouldn't wait for a response to your call to the database.
- Schedule a call asynchronously
- Assign a callback (handler) to result
- Callback will be executed in the worker thread, not event loop thread. This callback, for example, will return a response to the socket.
So, your operations in the event loop should just schedule all asynchronous operations with callbacks and go to the next request without awaiting any results.
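The schedule-and-move-on pattern described above can be imitated with plain JDK executors. This is not Vert.x code, just a sketch under stated assumptions: a single-thread executor plays the event loop, a scheduled executor named io simulates a 200 ms asynchronous operation, and the callback runs on that worker thread, as in the list above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class EventLoopSketch {
    // Handles the given number of simulated requests and returns the event log.
    public static List<String> run(int requests) {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        // Stand-in for asynchronous I/O: invokes a callback later without
        // keeping the event loop thread busy in the meantime.
        ScheduledExecutorService io = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(requests);
        try {
            for (int i = 1; i <= requests; i++) {
                int id = i;
                eventLoop.execute(() -> {
                    log.add("received " + id);
                    // Schedule the slow part, assign a callback, return at once.
                    io.schedule(() -> {
                        log.add("returned " + id);
                        done.countDown();
                    }, 200, TimeUnit.MILLISECONDS);
                });
            }
            done.await(); // wait until every callback has fired
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            io.shutdown();
            eventLoop.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        run(3).forEach(System.out::println);
    }
}
```

Since each handler returns immediately, the "received" lines would normally all appear before the first "returned" line. If the handler slept for 200 ms instead of scheduling, each request would have to wait for the previous one, which is the behavior described in the question.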
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request).
In that case, your request has some computation expensive parts or access to IO - your code in the event loop shouldn't wait for the result of these operations.
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or all the I/O op are handled asynchronously?
When you have too many concurrent requests and a traditional programming model, you create a thread per request. What will these threads do? They will mostly be waiting for IO operations (for example, a result from the database). That is a waste of resources. In the event loop model, you have one main thread that schedules operations and a preallocated number of worker threads for long tasks. In addition, none of these workers actually waits for a response; they can execute other code while waiting for an IO result (this can be implemented with callbacks or by periodically checking the status of the IO jobs currently in progress). I would recommend going through Java NIO and Java NIO 2 to understand how this async IO can actually be implemented inside the framework. Green threads are a closely related concept that is also worth understanding. Green threads and coroutines are a kind of hidden event loop that tries to achieve the same thing: fewer threads, because the system thread can be reused while a green thread is waiting for something.
How does it even solve c10k problem, when it can't start all concurrent requests parallel and have to wait till the previous one terminates?
For sure we don't wait in the main thread for sending the response for the previous request. Get request, schedule long/IO tasks execution, next request.
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
If you do everything right, no. Moreover, you will get good data locality and predictable execution flow. One CPU core executes your short event loop and schedules async work without context switching, and nothing more. Other cores make the calls to the database and return the responses, and only that. Switching between callbacks or checking different channels for IO status doesn't require any system-thread context switching; it all happens in one worker thread. So we have one worker thread per core, and this one system thread awaits or checks result availability from multiple connections (to the database, for example). Revisit the Java NIO concept to understand how it can work this way. (The classic NIO example is a proxy server that accepts many parallel connections (thousands), proxies requests to other remote servers, listens for responses, and sends the responses back to clients, all using one or two threads.)
About your code, I made a sample project for you to demonstrate that everything works as expected:
public class MyFirstVerticle extends AbstractVerticle {
@Override
public void start(Future<Void> fut) {
JDBCClient client = JDBCClient.createShared(vertx, new JsonObject()
.put("url", "jdbc:hsqldb:mem:test?shutdown=true")
.put("driver_class", "org.hsqldb.jdbcDriver")
.put("max_pool_size", 30));
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
// create a table
connection.execute("create table test(id int primary key, name varchar(255))", create -> {
if (create.failed()) {throw new RuntimeException(create.cause());}
});
});
vertx
.createHttpServer()
.requestHandler(r -> {
int requestId = new Random().nextInt();
System.out.println("Request " + requestId + " received");
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
connection.execute("insert into test values ('" + requestId + "', 'World')", insert -> {
// query some data with arguments
connection
.queryWithParams("select * from test where id = ?", new JsonArray().add(requestId), rs -> {
connection.close(done -> {if (done.failed()) {throw new RuntimeException(done.cause());}});
System.out.println("Result " + requestId + " returned");
r.response().end("Hello");
});
});
});
})
.listen(8080, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
}
}
@RunWith(VertxUnitRunner.class)
public class MyFirstVerticleTest {
private Vertx vertx;
@Before
public void setUp(TestContext context) {
vertx = Vertx.vertx();
vertx.deployVerticle(MyFirstVerticle.class.getName(),
context.asyncAssertSuccess());
}
@After
public void tearDown(TestContext context) {
vertx.close(context.asyncAssertSuccess());
}
@Test
public void testMyApplication(TestContext context) {
for (int i = 0; i < 10; i++) {
final Async async = context.async();
vertx.createHttpClient().getNow(8080, "localhost", "/",
response -> response.handler(body -> {
context.assertTrue(body.toString().contains("Hello"));
async.complete();
})
);
}
}
}
Output:
Request 1412761034 received
Request -1781489277 received
Request 1008255692 received
Request -853002509 received
Request -919489429 received
Request 1902219940 received
Request -2141153291 received
Request 1144684415 received
Request -1409053630 received
Request -546435082 received
Result 1412761034 returned
Result -1781489277 returned
Result 1008255692 returned
Result -853002509 returned
Result -919489429 returned
Result 1902219940 returned
Result -2141153291 returned
Result 1144684415 returned
Result -1409053630 returned
Result -546435082 returned
So, we accept a request, schedule a request to the database, and go on to the next request; we consume all of them and send a response for each request only when everything is done with the database.
As for your code sample, I see two possible issues. First, it looks like you don't close() the connection, which is important for returning it to the pool. Second, how is your pool configured? If there is only one free connection, these requests will serialize while waiting for it.
I recommend printing a timestamp for both requests to find the place where you serialize. Something is making the calls in the event loop block. Or check that your test actually sends the requests in parallel, rather than sending each one only after getting the response to the previous one.

How is this asynchronous? The answer is in your question itself
What I've observed is the so called event loop is blocked until my
first request completes. Whatever little time it takes, subsequent
request is not acted upon until the previous one completes
The idea is that, instead of having a new thread serve each HTTP request, the same thread is used, and you have blocked it with your long-running task.
The goal of the event loop is to save the time involved in context switching from one thread to another and to make use of idle CPU time while a task is performing IO/network activity. If, while handling your request, it had to do another IO/network operation (e.g. fetching data from a remote MongoDB instance), your thread would not be blocked during that time; instead, another request would be served by the same thread. That is the ideal use case for the event loop model (assuming you have concurrent requests coming to your server).
If you have long-running tasks that do not involve network/IO operations, you should consider using a thread pool instead; if you block the main event loop thread itself, other requests will be delayed. That is, for long-running tasks it is acceptable to pay the price of context switching in order to keep the server responsive.
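Handing a long-running, CPU-bound task to a thread pool and returning to the event loop only for the cheap final step could be sketched like this with plain JDK classes. This is not the Vert.x API; the thread names "event-loop" and "worker" and the summing loop are invented for the demo.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadSketch {
    // Sums 1..n on a worker thread and reports which thread did which step.
    public static String run(int n) {
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "worker"));
        try {
            return CompletableFuture
                    // The long-running, CPU-bound part runs on the worker pool...
                    .supplyAsync(() -> {
                        long sum = 0;
                        for (int i = 1; i <= n; i++) {
                            sum += i;
                        }
                        return sum + " computed on " + Thread.currentThread().getName();
                    }, workers)
                    // ...and only the cheap response step goes back to the event loop.
                    .thenApplyAsync(
                            s -> s + ", answered on " + Thread.currentThread().getName(),
                            eventLoop)
                    .join();
        } finally {
            workers.shutdown();
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints: 5050 computed on worker, answered on event-loop
        System.out.println(run(100));
    }
}
```

The join() in run is only there so the demo can return a value; a real event loop would never block on the result like that.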
EDIT:
The way a server can handle requests can vary:
1) Spawn a new thread for each incoming request (In this model the context switching would be high and there is additional cost of spawning a new thread every time)
2) Use a thread pool to serve the requests (the same set of threads is used to serve requests, and extra requests get queued up)
3) Use an event loop (a single thread for all the requests, with negligible context switching; negligible rather than none because there would still be some other threads running, e.g. to queue up the incoming requests)
First of all, context switching is not bad; it is required to keep the application server responsive. But too much context switching can be a problem if the number of concurrent requests goes too high (roughly more than 10k). If you want to understand this in more detail, I recommend reading the C10K article.
Assume a typical request takes anywhere between 100 ms to 1 sec (based
on the kind and nature of the request). So it means, the event loop
can't accept a new connection until the previous request finishes(even
if its winds up in a second).
If you need to respond to a large number of concurrent requests (more than 10k), I would consider anything over 500 ms a long-running operation. Secondly, as I said, there are some threads and context switching involved (e.g. to queue up incoming requests), but context switching among threads is greatly reduced because there are so few threads at any one time. Thirdly, if a network/IO operation is involved in resolving the first request, the second request gets a chance to be resolved before the first one is; this is where this model works well.
And If I as a programmer have to think
through all these and push such request handlers to a worker thread ,
then how does it differ from a thread/connection model?
Vert.x tries to give you the best of both threads and the event loop, so as the programmer you can decide how to make your application efficient in both scenarios, i.e. long-running operations with and without network/IO operations.
I'm just trying to understand how is this model better from a
traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve c10k
problem, when it can't start all concurrent requests parallely and
have to wait till the previous one terminates?
The above explanation should answer this.
Even if I decide to push all these operations to a worker
thread(pooled), then I'm back to the same problem isn't it? Context
switching between threads?
As I said, both have pros and cons. Vert.x gives you both models, and you have to choose whichever is ideal for your use case.

In this sort of processing engine, you are supposed to turn long-running tasks into asynchronously executed operations, and there is a methodology for doing this, so that the critical thread can complete as quickly as possible and return to perform another task; i.e. any IO operations are passed to the framework, which calls you back when the IO is done.
The framework is asynchronous in the sense that it supports you producing and running these asynchronous tasks, but it doesn't change your code from being synchronous to asynchronous.

Behavior of HttpClient with caller thread being cancelled

We have a callable class A which actually makes HttpCalls through HttpClient.executeMethod(GetMethod) with a lot of other pre-computations. HttpClient is initialized in the constructor with MultiThreadedHttpConnectionManager.
Another class B creates list of threads for class A through ExecutorService and submits task to the pool and expects future objects to be returned. We have following logic in class B:
for (Future<String> f : futures) {
try{
String str = f.get(timeOut, TimeUnit.SECONDS);
}catch(TimeoutException te){
f.cancel(true);
}
}
This way, our thread gets terminated after a specified time and execution of the task will be terminated and this thread will be available for next task.
I want to confirm the following:
If an external connection is made through HttpClient, how is it handled when future.cancel is called on the thread?
In the above case, or in general, does the HTTP connection pool get the connection back, with the previous one properly released? We do release the connection in a finally block, but I don't think interrupting the thread will hit that block.
Could it cause any kind of leak on client or extra resource consumption on the server?
Thanks!

It depends.
If the HTTP client uses java.net.Socket, its I/O isn't interruptible, so the cancel will have no effect.
If it uses NIO, the interrupt will close the channel and cause an exception. At the server this will cause a premature end of stream or an exception on write, either of which the server should cope with correctly.
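The NIO behavior can be observed without any HTTP client. The sketch below uses a java.nio.channels.Pipe as a stand-in for a network channel (an assumption made for the demo; a real client would use a SocketChannel) and interrupts a thread that is blocked in read, which is what Future.cancel(true) does to the worker thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptibleReadSketch {
    // Blocks a thread in a channel read, interrupts it, and reports the outcome.
    public static String run() {
        try {
            Pipe pipe = Pipe.open(); // nothing is ever written to this pipe
            AtomicReference<String> outcome = new AtomicReference<>("no exception");
            CountDownLatch started = new CountDownLatch(1);

            Thread reader = new Thread(() -> {
                started.countDown();
                try {
                    // blocks, because no data will ever arrive
                    pipe.source().read(ByteBuffer.allocate(16));
                } catch (ClosedByInterruptException e) {
                    outcome.set("ClosedByInterruptException");
                } catch (IOException e) {
                    outcome.set(e.getClass().getSimpleName());
                }
            });
            reader.start();
            started.await();
            reader.interrupt(); // same effect as cancel(true) on the worker
            reader.join();

            String result = outcome.get() + ", channel open: " + pipe.source().isOpen();
            pipe.sink().close();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted while waiting";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The interrupt is expected to surface as a ClosedByInterruptException and to leave the channel closed, so a finally block around the read would still run and could release the connection.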

Cancelling Http connection in android

I am using org.apache.http and I've this code:
DefaultHttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet(url);
HttpResponse resp = client.execute(get);
HttpEntity entity = resp.getEntity();
InputStream input = entity.getContent();
...
//Read the bytes from input stream
This is the code I am using to download files over HTTP. I want to cancel the connection (perhaps because the user chooses to). What is a graceful way to close the connection? I found two ways; both cancel the download:
Closing the input stream with input.close(), which causes an IOException.
Aborting the HttpGet object with get.abort(), which causes a SocketException.
I have a try/catch, so there are no errors, but is there a way to cancel or abort the connection
without throwing an exception?
What is the right way to go about it?

The proper way of doing this is to send a FIN to the server side.
However, on Android you do not have the option to get involved at this level, so you can either implement it yourself in C or use one of the methods you mention in your question.

Using HttpUriRequest#abort is the right way in my opinion. This will cause immediate termination of the underlying connection and its eviction from the connection pool. Newer versions of HttpClient (4.2 and newer) intercept SocketExceptions caused by premature request termination by the user. The problem is that Google ships a fork of HttpClient based on an extremely outdated version (pre-beta1). If you are not able or willing to use a newer version of HttpClient, your only option is to catch and discard the SocketException in your code.

Use this
client.getConnectionManager().closeExpiredConnections();
client.getConnectionManager().shutdown();
Now you can decide where you would like to put these two lines in your code. They will close the connection using the DefaultHttpClient object that you created.
Let me know if this helps you.

Try to cancel the task when you want to interrupt the connection:
task.cancel(true);
This will cancel the task and the threads running in it.
Check this for reference:
public final boolean cancel (boolean mayInterruptIfRunning)
Since: API Level 3
Attempts to cancel execution of this task. This attempt will fail if the task has already completed, already been cancelled, or could not be cancelled for some other reason. If successful, and this task has not started when cancel is called, this task should never run. If the task has already started, then the mayInterruptIfRunning parameter determines whether the thread executing this task should be interrupted in an attempt to stop the task.
Calling this method will result in onCancelled(Object) being invoked on the UI thread after doInBackground(Object[]) returns. Calling this method guarantees that onPostExecute(Object) is never invoked. After invoking this method, you should check the value returned by isCancelled() periodically from doInBackground(Object[]) to finish the task as early as possible.
Parameters
mayInterruptIfRunning true if the thread executing this task should be interrupted; otherwise, in-progress tasks are allowed to complete.
Returns
false if the task could not be cancelled, typically because it has already completed normally; true otherwise
