Related
So here's the situation: I'm implementing the caching of our webapp using vertx-redis (we were formerly using lettuce). Pretty simple mechanism, there is an anotation we use on endpoints which is responsible to invoke the redis-client (whatever implementation we are using) and, if there is cached info for the given key it should be used as response body and the request should be finished with no processing.
But there's this really annoying behavior with the vertx-redis implementation in which ending the request doesn't stop the processing. I make the request, get the quick response since there was cached info, but I can still see in the logs that the app keeps the processing going on, as if the request was still open. I believe that it's because I'm ending the response inside the handler for the Redis client call, like this:
client.get("key", onResponse -> {
if (onResponse.succeeded() && onResponse.result() != null) {
//ending request from here
}
});
I realize that I could maybe reproduce the behavior as it was before if I could do something like this:
String cachedInfo = client.get("key").map(onResponse -> onResponse.result());
// endResponse
But as we know, vertx-redis is a semantic API and every method returns the same instance of RedisClient. I also thought about doing something like this:
private String cachedInfo;
...
client.get("key", onResult -> {
if (onResponse.succeeded()) {
this.cachedInfo = onResponse.result();
}
});
if (cachedInfo != null) { // The value could be unset since the lambda is running in other thread
//end request
}
Really don't know what to do, is there a way to return the contents of the AsyncResult to a variable or maybe set it to a variable synchronously somehow? I've also been searching for ways to somehow stop the whole flow of the current request but couldn't find any satisfactory, non-aggressive solution so far, but I'm really open to this option either.
After spending the day of learning about the java Concurrency API, I still dont quite get how could I create the following functionality with the help of CompletableFuture and ExecutorService classes:
When I get a request on my REST endpoint I need to:
Start an asynchronous task (includes DB query, filtering, etc.), which will give me a list of String URLs at the end
In the meanwhile, responde back to the REST caller with HTTP OK, that the request was received, I'm working on it
When the asynchronous task is finished, I need to send HTTP requests (with the payload, the REST caller gave me) to the URLs I got from the job. At most the number of URLs would be around a 100, so I need these to happen in parallel.
Ideally I have some syncronized counter which counts how many of the http requests were a success/fail, and I can send this information back to the REST caller (the URL I need to send it back to is provided inside the request payload).
I have the building blocks (methods like: getMatchingObjectsFromDB(callerPayload), getURLs(resultOfgetMachingObjects), sendHttpRequest(Url, methodType), etc...) written for these already, I just cant quite figure out how to tie step 1 and step 3 together. I would use CompletableFuture.supplyAsync() for step 1, then I would need the CompletableFuture.thenComponse method to start step 3, but it's not clear to me how parallelism can be done with this API. It is rather intuitive with ExecutorService executor = Executors.newWorkStealingPool(); though, which creates a thread pool based on how much processing power is available and the tasks can be submitted via the invokeAll() method.
How can I use CompletableFutureand ExecutorService together? Or how can I guarantee parallel execution of a list of tasks with CompletableFuture? Demonstrating code snippet would be much appreciated. Thanks.
You should use join() to wait for all thread finish.
Create Map<String, Boolean> result to store your request result.
In your controller:
public void yourControllerMethod() {
CompletableFuture.runAsync(() -> yourServiceMethod());
}
In your service:
// Execute your logic to get List<String> urls
List<CompletableFuture> futures = urls.stream().map(v ->
CompletableFuture.supplyAsync(url -> requestUrl(url))
.thenAcceptAsync(requestResult -> result.put(url, true or false))
).collect(toList()); // You have list of completeable future here
Then use .join() to wait for all thread (Remember that your service are executed in its own thread already)
CompletableFuture.allOf(futures).join();
Then you can determine which one success/fail by accessing result map
Edit
Please post your proceduce code so that other may understand you also.
I've read your code and here are the needed modification:
When this for loop was not commented out, the receiver webserver got
the same request twice,
I dont understand the purpose of this for loop.
Sorry in my previous answer, I did not clean it up. That's just a temporary idea on my head that I forgot to remove at the end :D
Just remove it from your code
// allOf() only accepts arrays, so the List needed to be converted
/* The code never gets over this part (I know allOf() is a blocking call), even long after when the receiver got the HTTP request
with the correct payload. I'm not sure yet where exactly the code gets stuck */
Your map should be a ConcurrentHashMap because you're modifying it concurrently later.
Map<String, Boolean> result = new ConcurrentHashMap<>();
If your code still does not work as expected, I suggest to remove the parallelStream() part.
CompletableFuture and parallelStream use common forkjoin pool. I think the pool is exhausted.
And you should create your own pool for your CompletableFuture:
Executor pool = Executors.newFixedThreadPool(10);
And execute your request using that pool:
CompletableFuture.supplyAsync(YOURTASK, pool).thenAcceptAsync(Yourtask, pool)
For the sake of completion here is the relevant parts of the code, after clean-up and testing (thanks to Mạnh Quyết Nguyễn):
Rest controller class:
#POST
#Path("publish")
public Response publishEvent(PublishEvent eventPublished) {
/*
Payload verification, etc.
*/
//First send the event to the right subscribers, then send the resulting hashmap<String url, Boolean subscriberGotTheRequest> back to the publisher
CompletableFuture.supplyAsync(() -> EventHandlerService.propagateEvent(eventPublished)).thenAccept(map -> {
if (eventPublished.getDeliveryCompleteUri() != null) {
String callbackUrl = Utility
.getUri(eventPublished.getSource().getAddress(), eventPublished.getSource().getPort(), eventPublished.getDeliveryCompleteUri(), isSecure,
false);
try {
Utility.sendRequest(callbackUrl, "POST", map);
} catch (RuntimeException e) {
log.error("Callback after event publishing failed at: " + callbackUrl);
e.printStackTrace();
}
}
});
//return OK while the event publishing happens in async
return Response.status(Status.OK).build();
}
Service class:
private static List<EventFilter> getMatchingEventFilters(PublishEvent pe) {
//query the database, filter the results based on the method argument
}
private static boolean sendRequest(String url, Event event) {
//send the HTTP request to the given URL, with the given Event payload, return true if the response is positive (status code starts with 2), false otherwise
}
static Map<String, Boolean> propagateEvent(PublishEvent eventPublished) {
// Get the event relevant filters from the DB
List<EventFilter> filters = getMatchingEventFilters(eventPublished);
// Create the URLs from the filters
List<String> urls = new ArrayList<>();
for (EventFilter filter : filters) {
String url;
try {
boolean isSecure = filter.getConsumer().getAuthenticationInfo() != null;
url = Utility.getUri(filter.getConsumer().getAddress(), filter.getPort(), filter.getNotifyUri(), isSecure, false);
} catch (ArrowheadException | NullPointerException e) {
e.printStackTrace();
continue;
}
urls.add(url);
}
Map<String, Boolean> result = new ConcurrentHashMap<>();
Stream<CompletableFuture> stream = urls.stream().map(url -> CompletableFuture.supplyAsync(() -> sendRequest(url, eventPublished.getEvent()))
.thenAcceptAsync(published -> result.put(url, published)));
CompletableFuture.allOf(stream.toArray(CompletableFuture[]::new)).join();
log.info("Event published to " + urls.size() + " subscribers.");
return result;
}
Debugging this was a bit harder than usual, sometimes the code just magically stopped. To fix this, I only put code parts into the async task which was absolutely necessary, and I made sure the code in the task was using thread-safe stuff. Also I was a dumb-dumb at first, and my methods inside the EventHandlerService.class used the synchronized keyword, which resulted in the CompletableFuture inside the Service class method not executing, since it uses a thread pool by default.
A piece of logic marked with synchronized becomes a synchronized block, allowing only one thread to execute at any given time.
I'm playing around with Vert.x and quite new to the servers based on event loop as opposed to the thread/connection model.
public void start(Future<Void> fut) {
vertx
.createHttpServer()
.requestHandler(r -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request received - "+start.format(DateTimeFormatter.ISO_DATE_TIME));
final MyModel model = new MyModel();
try {
for(int i=0;i<10000000;i++){
//some simple operation
}
model.data = start.format(DateTimeFormatter.ISO_DATE_TIME) +" - "+LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);
} catch (Exception e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
r.response().end(
new Gson().toJson(model)
);
})
.listen(4568, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
System.out.println("Server started ..");
}
I'm just trying to simulate a long running request handler to understand how this model works.
What I've observed is the so called event loop is blocked until my first request completes. Whatever little time it takes, subsequent request is not acted upon until the previous one completes.
Obviously I'm missing a piece here and that's the question that I have here.
Edited based on the answers so far:
Isn't accepting all requests considered to be asynchronous? If a new
connection can only be accepted when the previous one is cleared
off, how is it async?
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request). So it means, the
event loop can't accept a new connection until the previous request
finishes(even if its winds up in a second). And If I as a programmer
have to think through all these and push such request handlers to a
worker thread , then how does it differ from a thread/connection
model?
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve
c10k problem, when it can't start all concurrent requests parallely and have to wait till the previous one terminates?
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
Edits and topping this question for a bounty
Do not completely understand how this model is claimed to asynchronous.
Vert.x has an async JDBC client (Asyncronous is the keyword) which I tried to adapt with RXJava.
Here is a code sample (Relevant portions)
server.requestStream().toObservable().subscribe(req -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request for " + req.absoluteURI() +" received - " +start.format(DateTimeFormatter.ISO_DATE_TIME));
jdbc.getConnectionObservable().subscribe(
conn -> {
// Now chain some statements using flatmap composition
Observable<ResultSet> resa = conn.queryObservable("SELECT * FROM CALL_OPTION WHERE UNDERLYING='NIFTY'");
// Subscribe to the final result
resa.subscribe(resultSet -> {
req.response().end(resultSet.getRows().toString());
System.out.println("Request for " + req.absoluteURI() +" Ended - " +LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME));
}, err -> {
System.out.println("Database problem");
err.printStackTrace();
});
},
// Could not connect
err -> {
err.printStackTrace();
}
);
});
server.listen(4568);
The select query there takes 3 seconds approx to return the complete table dump.
When I fire concurrent requests(tried with just 2), I see that the second request completely waits for the first one to complete.
If the JDBC select is asynchronous, Isn't it a fair expectation to have the framework handle the second connection while it waits for the select query to return anything.?
Vert.x event loop is, in fact, a classical event loop existing on many platforms. And of course, most explanations and docs could be found for Node.js, as it's the most popular framework based on this architecture pattern. Take a look at one more or less good explanation of mechanics under Node.js event loop. Vert.x tutorial has fine explanation between "Don’t call us, we’ll call you" and "Verticles" too.
Edit for your updates:
First of all, when you are working with an event loop, the main thread should work very quickly for all requests. You shouldn't do any long job in this loop. And of course, you shouldn't wait for a response to your call to the database.
- Schedule a call asynchronously
- Assign a callback (handler) to result
- Callback will be executed in the worker thread, not event loop thread. This callback, for example, will return a response to the socket.
So, your operations in the event loop should just schedule all asynchronous operations with callbacks and go to the next request without awaiting any results.
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request).
In that case, your request has some computation expensive parts or access to IO - your code in the event loop shouldn't wait for the result of these operations.
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or all the I/O op are handled asynchronously?
When you have too many concurrent requests and a traditional programming model, you will make thread per each request. What this thread will do? They will be mostly waiting for IO operations (for example, result from database). It's a waste of resources. In our event loop model, you have one main thread that schedule operations and preallocated amount of worker threads for long tasks. + None of these workers actually wait for the response, they just can execute another code while waiting for IO result (it can be implemented as callbacks or periodical checking status of IO jobs currently in progress). I would recommend you go through Java NIO and Java NIO 2 to understand how this async IO can be actually implemented inside the framework. Green threads is a very related concept too, that would be good to understand. Green threads and coroutines are a type of shadowed event loop, that trying to achieve the same thing - fewer threads because we can reuse system thread while green thread waiting for something.
How does it even solve c10k problem, when it can't start all concurrent requests parallel and have to wait till the previous one terminates?
For sure we don't wait in the main thread for sending the response for the previous request. Get request, schedule long/IO tasks execution, next request.
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
If you make everything right - no. Even more, you will get good data locality and execution flow prediction. One CPU core will execute your short event loop and schedule async work without context switching and nothing more. Other cores make a call to the database and return response and only this. Switching between callbacks or checking different channels for IO status doesn't actually require any system thread's context switching - it's actually working in one worker thread. So, we have one worker thread per core and this one system thread await/checks results availability from multiple connections to database for example. Revisit Java NIO concept to understand how it can work this way. (Classical example for NIO - proxy-server that can accept many parallel connections (thousands), proxy requests to some other remote servers, listen to responses and send responses back to clients and all of this using one or two threads)
About your code, I made a sample project for you to demonstrate that everything works as expected:
public class MyFirstVerticle extends AbstractVerticle {
#Override
public void start(Future<Void> fut) {
JDBCClient client = JDBCClient.createShared(vertx, new JsonObject()
.put("url", "jdbc:hsqldb:mem:test?shutdown=true")
.put("driver_class", "org.hsqldb.jdbcDriver")
.put("max_pool_size", 30));
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
// create a table
connection.execute("create table test(id int primary key, name varchar(255))", create -> {
if (create.failed()) {throw new RuntimeException(create.cause());}
});
});
vertx
.createHttpServer()
.requestHandler(r -> {
int requestId = new Random().nextInt();
System.out.println("Request " + requestId + " received");
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
connection.execute("insert into test values ('" + requestId + "', 'World')", insert -> {
// query some data with arguments
connection
.queryWithParams("select * from test where id = ?", new JsonArray().add(requestId), rs -> {
connection.close(done -> {if (done.failed()) {throw new RuntimeException(done.cause());}});
System.out.println("Result " + requestId + " returned");
r.response().end("Hello");
});
});
});
})
.listen(8080, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
}
}
#RunWith(VertxUnitRunner.class)
public class MyFirstVerticleTest {
private Vertx vertx;
#Before
public void setUp(TestContext context) {
vertx = Vertx.vertx();
vertx.deployVerticle(MyFirstVerticle.class.getName(),
context.asyncAssertSuccess());
}
#After
public void tearDown(TestContext context) {
vertx.close(context.asyncAssertSuccess());
}
#Test
public void testMyApplication(TestContext context) {
for (int i = 0; i < 10; i++) {
final Async async = context.async();
vertx.createHttpClient().getNow(8080, "localhost", "/",
response -> response.handler(body -> {
context.assertTrue(body.toString().contains("Hello"));
async.complete();
})
);
}
}
}
Output:
Request 1412761034 received
Request -1781489277 received
Request 1008255692 received
Request -853002509 received
Request -919489429 received
Request 1902219940 received
Request -2141153291 received
Request 1144684415 received
Request -1409053630 received
Request -546435082 received
Result 1412761034 returned
Result -1781489277 returned
Result 1008255692 returned
Result -853002509 returned
Result -919489429 returned
Result 1902219940 returned
Result -2141153291 returned
Result 1144684415 returned
Result -1409053630 returned
Result -546435082 returned
So, we accept a request - schedule a request to the database, go to the next request, we consume all of them and send a response for each request only when everything is done with the database.
About your code sample I see two possible issues - first, it looks like you don't close() connection, which is important to return it to pool. Second, how your pool is configured? If there is only one free connection - these requests will serialize waiting for this connection.
I recommend you to add some printing of a timestamp for both requests to find a place where you serialize. You have something that makes the calls in the event loop to be blocking. Or... check that you send requests in parallel in your test. Not next after getting a response after previous.
How is this asynchronous? The answer is in your question itself
What I've observed is the so called event loop is blocked until my
first request completes. Whatever little time it takes, subsequent
request is not acted upon until the previous one completes
The idea is instead of having a new for serving each HTTP request, same thread is used which you have blocked by your long running task.
The goal of event loop is to save the time involved in context switching from one thread to another thread and utilize the ideal CPU time when a task is using IO/Network activities. If while handling your request it had to other IO/Network operation eg: fetching data from a remote MongoDB instance during that time your thread will not be blocked and instead an another request would be served by the same thread which is the ideal use case of event loop model (Considering that you have concurrent requests coming to your server).
If you have long running tasks which does not involve Network/IO operation, you should consider using thread pool instead, if you block your main event loop thread itself other requests would be delayed. i.e. for long running tasks you are okay to pay the price of context switching for for server to be responsive.
EDIT:
The way a server can handle requests can vary:
1) Spawn a new thread for each incoming request (In this model the context switching would be high and there is additional cost of spawning a new thread every time)
2) Use a thread pool to server the request (Same set of thread would be used to serve requests and extra requests gets queued up)
3) Use a event loop (single thread for all the requests. Negligible context switching. Because there would be some threads running e.g: to queue up the incoming requests)
First of all context switching is not bad, it is required to keep application server responsive, but, too much context switching can be a problem if the number of concurrent requests goes too high (roughly more than 10k). If you want to understand in more detail I recommend you to read C10K article
Assume a typical request takes anywhere between 100 ms to 1 sec (based
on the kind and nature of the request). So it means, the event loop
can't accept a new connection until the previous request finishes(even
if its winds up in a second).
If you need to respond to large number of concurrent requests (more than 10k) I would consider more than 500ms as a longer running operation. Secondly, Like I said there are some threads/context switching involved e.g.: to queue up incoming requests, but, the context switching amongst threads would be greatly reduced as there would be too few threads at a time. Thirdly, if there is a network/IO operation involved in resolving first request second request would get a chance to be resolved before first is resolved, this is where this model plays well.
And If I as a programmer have to think
through all these and push such request handlers to a worker thread ,
then how does it differ from a thread/connection model?
Vertx is trying to give you best of threads and event loop, so, as programmer you can make a call on how to make your application efficient under both the scenario i.e. long running operation with and without network/IO operation.
I'm just trying to understand how is this model better from a
traditional thread/conn server models? Assume there is no I/O op or
all the I/O op are handled asynchronously? How does it even solve c10k
problem, when it can't start all concurrent requests parallely and
have to wait till the previous one terminates?
The above explanation should answer this.
Even if I decide to push all these operations to a worker
thread(pooled), then I'm back to the same problem isn't it? Context
switching between threads?
Like I said, both have pros and cons and vertx gives you both the model and depending on your use case you got to choose what is ideal for your scenario.
In these sort of processing engines, you are supposed to turn long running tasks in to asynchronously executed operations and these is a methodology for doing this, so that the critical thread can complete as quickly as possible and return to perform another task. i.e. any IO operations are passed to the framework to call you back when the IO is done.
The framework is asynchronous in the sense that it supports you producing and running these asynchronous tasks, but it doesn't change your code from being synchronous to asynchronous.
I'm fairly new to Java (I'm using Java SE 7) and the JVM and trying to write an asynchronous controller using:
Tomcat 7
Spring MVC 4.1.1
Spring Servlet 3.0
I have a component that my controller is delegating some work to that has an asynchronous portion and returns a ListenableFuture. Ideally, I'd like to free up the thread that initially handles the controller response as I'm waiting for the async operation to return, hence the desire for an async controller.
I'm looking at returning a DeferredResponse -- it seems pretty easy to bridge this with ListenableFuture -- but I can't seem to find any resources that explain how the response is delivered back to the client once the DeferredResponse resolves.
Maybe I'm not fully grok'ing how an asynchronous controller is supposed to work, but could someone explain how the response gets returned to the client once the DeferredResponse resolves? There has to be some thread that picks up the job of sending the response, right?
I recently used Spring's DeferredResponse to excellent effect in a long-polling situation that I recently coded. Focusing on the 'how' of the response getting back to the user is, I believe, not the correct way to think about the object. Depending upon where it's used, it returns messages to the user in exactly the same way as a regular, synchronous call would only in a delayed, asynchronous manner. Again, the object does not define nor propose a delivery mechanism. Just a way to 'insert' an asynchronous response into existing channels.
Per your query, yes, it does so by creating a thread that has a timeout of the user's specification. If the code completes before the timeout, using 'setResult', the object returns the code's result. Otherwise, if the timeout fires before the result, the default, also set by the user, is returned. Either way, the object does not return anything (other than the object itself) until one of these mechanisms is called. Also, the object has to then be discarded as it cannot be reused.
In my case, I was using a HTTP request/response function that would wrap the returned response in a DeferredResponse object that would provide a default response - asking for another packet from client so the browser would not time out - if the computation the code was working on did not return before the timeout. Whenever the computation was complete, it would send the response via the 'setResult' function call. In this situation both cases would simply use the HTTP response to send a packet back to the user. However, in neither case would the response go back to the user immediately.
In practice the object worked flawlessly and allowed me to implement an effective long-polling mechanism.
Here is a snippet of the code in my example:
#RequestMapping(method = RequestMethod.POST, produces = "application/text")
#ResponseBody
// public DeferredResult<String> onMessage(#RequestBody String message, HttpSession session) {
public DeferredResult<String> onMessage(InputStream is, HttpSession session) {
String message = convertStreamToString(is);
// HttpSession session = null;
messageInfo info = getMessageInfo(message);
String state = info.getState();
String id = info.getCallID();
DeferredResult<String> futureMessage =
new DeferredResult<>(refreshIntervalSecs * msInSec, getRefreshJsonMessage(id));
if(state != null && id != null) {
if(state.equals("REFRESH")) {
// Cache response for future and "swallow" call as it is restocking call
LOG.info("Refresh received for call " + id);
synchronized (lock) {
boolean isReplaceable = callsMap.containsKey(id) && callsMap.get(id).isSetOrExpired();
if (isReplaceable)
callsMap.put(id, futureMessage);
else {
LOG.warning("Refresh packet arrived on a non-existent call");
futureMessage.setResult(getExitJsonMessage(id));
}
}
} else if (state.equals("NEW")){
// Store response for future and pass the call onto the processing logic
LOG.info("New long-poll call received with id " + id);
ClientSupport cs = clientSupportMap.get(session.getId());
if(cs == null) {
cs = new ClientSupport(this, session.getId());
clientSupportMap.put(session.getId(), cs);
}
callsMap.put(id, futureMessage);
// *** IMPORTANT ****
// This method sets up a separate thread to do work
cs.newCall(message);
}
} else {
LOG.warning("Invalid call information");
// Return value immediately when return is called
futureMessage.setResult("");
}
return futureMessage;
}
I have a list of hostnames which I am supposed to make a call by making the proper URL from it. Let's say if I have four hostname (hostA, hostB, hostC, hostD) in the linked list -
Execute hostA url and if hostA is UP then get the data and return the response.
But if hostA is down, then add hostA to block list of hostnames and make sure no other thread is making call to hostA. And then try executing hostB url and return the response back.
But if hostB is also down, then add hostB to block list of hostnames as well and repeat the same thing.
Also, I have a background thread running in my application which will have the list of block hostnames (from my another service) which we are not supposed to make a call but it runs every 10 minutes so the list of block hostnames will get updated only after 10 minutes so if any block list of hostnames is present, then I won't make a call to that hostname from the main thread and I will try making a call to another hostname. Meaning if hostA is blocked, then it will have hostA present in the block list but if hostA is up, then that list will not have hostA in it.
Below is my background thread code which gets the data from my service URL and keep on running every 10 minutes once my application has started up. It will then parse the data coming from the URL and store it in a ClientData class variable -
TempScheduler
public class TempScheduler {
// .. scheduledexecutors service code to start the background thread
// call the service and get the data and then parse
// the response.
private void callServiceURL() {
String url = "url";
RestTemplate restTemplate = new RestTemplate();
String response = restTemplate.getForObject(url, String.class);
parseResponse(response);
}
// parse the response and store it in a variable
private void parseResponse(String response) {
//...
// get the block list of hostnames
Map<String, List<String>> coloExceptionList = gson.fromJson(response.split("blocklist=")[1], Map.class);
List<String> blockList = new ArrayList<String>();
for(Map.Entry<String, List<String>> entry : coloExceptionList.entrySet()) {
for(String hosts : entry.getValue()) {
blockList.add(hosts);
}
}
// store the block list of hostnames which I am not supposed to make a call
ClientData.replaceBlockedHosts(blockList);
}
}
Below is my ClientData class. replaceBlockedHosts method will only be called by a background thread meaning only one writer. But isHostBlocked method will be called by main application threads multiple times to check whether a particular hostname is blocked or not. And also blockHost method will be called from catch block multiple times to add the down host in the blockedHosts list so I need to make sure all the read threads can see the consistent data and are not making calls to that down host, instead they are making calls to next host in the hostnames linked list.
ClientData
public class ClientData {
// .. some other variables here which in turn used to decide the list of hostnames
private static final AtomicReference<ConcurrentHashMap<String, String>> blockedHosts =
new AtomicReference<ConcurrentHashMap<String, String>>(new ConcurrentHashMap<String, String>());
public static boolean isHostBlocked(String hostName) {
return blockedHosts.get().containsKey(hostName);
}
public static void blockHost(String hostName) {
blockedHosts.get().put(hostName, hostName);
}
public static void replaceBlockedHosts(List<String> hostNames) {
ConcurrentHashMap<String, String> newBlockedHosts = new ConcurrentHashMap<>();
for (String hostName : hostNames) {
newBlockedHosts.put(hostName, hostName);
}
blockedHosts.set(newBlockedHosts);
}
}
And below is my main application thread code in which I have list of hostnames which I am supposed to make a call. If the hostname is null or in the block list category, then I won't make a call to that particular hostname and will try next hostname in the list.
#Override
public DataResponse call() {
List<String> hostnames = new LinkedList<String>();
// .. some separate code here to populate the hostnames list
// from ClientData class
for (String hostname : hostnames) {
// If host name is null or host name is in block list category, skip sending request to this host
if (hostname == null || ClientData.isHostBlocked(hostname)) {
continue;
}
try {
String url = generateURL(hostname);
response = restTemplate.getForObject(url, String.class);
break;
} catch (RestClientException ex) {
// add host to block list,
// Is this call fully atomic and thread safe for blockHost method
// in ClientData class?
ClientData.blockHost(hostname);
}
}
}
I don't need to make a call to the hostname whenever it is down from the main thread. And my background thread gets these detail from one of my service as well, whenever any server is down, it will have the list of hostnames which are block hosts and whenever they are up, that list will get updated.
And also, whenever any RestClientException is being thrown, I will add that hostname in the blockedHosts concurrentmap since my background thread is running every 10 minutes so that map won't have this hostname until 10 minutes is done. And whenever this server came back up, my background will update this list automatically.
Is my above code for block list of hostnames is fully atomic and thread safe? Because what I want is - If hostA is down, then no other thread should make a call to hostA until the blocked host list is updated.
Keep in mind that communication with other hosts takes considerably more time than anything you are doing in your threads. I wouldn't worry about atomic operations in this case.
Let's say we have the threads t1 and t2. t1 sends a requests to hostA and waits for a response. When the timeout is reached, a RestClientException will be thrown. Now there is a very tiny timespan between throwing the exception and adding that host to the list of blocked host. It could happen that t2 tries to sends a request to hostA in this moment, before the host is blocked - but it is way more likely that t2 already send it during the long time t1 was waiting for a response, which you can't prevent.
You can try to set reasonable timeouts. Of course there are other types of errors that don't await a timeout, but even those way more time than handling the exception.
Using a ConcurrentHashMap is thread safe and should be enough to keep track of blocked hosts.
An AtomicReference by itself doesn't do much unless you use methods like compareAndSet, so the call is not atomic (but as explained above doesn't need to be in my opinion). If you really want to block a host immediately after you got an exception, you should use some kind of synchronization. You could use a synchronized set to store blocked hosts. This still wouldn't solve the problem that it takes some time until any connection error is actually detected.
Regarding the update: As said in the comments the Future timeout should be larger than the request timeout. Otherwise the Callable might be canceled and the host won't be added to the list. You probably don't even need a timeout when using Future.get because the request will eventually succeed or fail.
The actual problem why you see many exceptions when host A goes down could simply be that many thread are still waiting for the response of host A. You only check for blocked hosts before starting a request, not during any requests. Any thread still waiting for a response from that host will continue to do so until the the timeout is reached.
If you want to prevent this you could try to periodically check if the current host isn't blocked yet. This is a very naive solution and kind of defeats the purpose of futures since it's basically polling. It should help understanding the general problem though.
// bad pseudo code
DataTask dataTask = new DataTask(dataKeys, restTemplate);
future = service.submit(dataTask);
while(!future.isDone()) {
if( blockedHosts.contains(currentHost) ) {
// host unreachable, don't wait for http timeout
future.cancel();
}
thread.sleep(/* */);
}
A better way would be to send an interrupt to all DataTask threads that are waiting for the same hosts when it goes down, so they can abort the request and try the next host.
The atomicity of your operation doesn’t change when you put the ConcurrentHashMap into an AtomicReference. put and get are atomic anyway and the only affected operation, replaceBlockedHosts, would work with a simple volatile reference as well. But I don’t know why you need this though.
What you have within your call() method is a check-then-act pattern:
First, you call ClientData.isHostBlocked(hostname)
then you call restTemplate.getForObject(generateURL(hostname), …).
So the atomicity of blockHost and isHostBlocked does prevent a thread from being right after the isHostBlocked call when another thread calls blockHost. Therefore the former would still proceed with the network operation after the latter has added the host to the block list.
If you want to limit the number of threads that may fail at the same host, you must limit the number of threads accessing the same host. There’s no way around it.