How to send parallel GET requests and wait for result responses? - java

I'm using apache http client within spring mvc 3.2.2 to send 5 get requests synchronously as illustrated.
How can I send all of these asynchronously (in parallel) and wait for the requests to return in order to return a parsed payload string from all GET requests?
public String myMVCControllerGETdataMethod()
{
// Send 1st request
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://api/data?type=1");
ResponseHandler<String> responseHandler = new BasicResponseHandler();
String responseBody = httpclient.execute(httpget, responseHandler);
// Send 2st request
HttpClient httpclient2 = new DefaultHttpClient();
HttpGet httpget2 = new HttpGet("http://api/data?type=2");
ResponseHandler2<String> responseHandler2 = new BasicResponseHandler();
String responseBody2 = httpclient.execute(httpget, responseHandler2);
// o o o more gets here
// Perform some work here...and wait for all requests to return
// Parse info out of multiple requests and return
String results = doWorkwithMultipleDataReturned();
model.addAttribute(results, results);
return "index";
}

Just in general, you need to encapsulate your units of work in a Runnable or java.util.concurrent.Callable and execute them via java.util.concurrent.Executor (or org.springframework.core.task.TaskExecutor). This allows each unit of work to be executed separately, typically in an asynchronous fashion (depending on the implementation of the Executor).
So for your specific problem, you could do something like this:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
#Controller
public class MyController {
//inject this
private Executor executor;
#RequestMapping("/your/path/here")
public String myMVCControllerGETdataMethod(Model model) {
//define all async requests and give them to injected Executor
List<GetRequestTask> tasks = new ArrayList<GetRequestTask>();
tasks.add(new GetRequestTask("http://api/data?type=1", this.executor));
tasks.add(new GetRequestTask("http://api/data?type=2", this.executor));
//...
//do other work here
//...
//now wait for all async tasks to complete
while(!tasks.isEmpty()) {
for(Iterator<GetRequestTask> it = tasks.iterator(); it.hasNext();) {
GetRequestTask task = it.next();
if(task.isDone()) {
String request = task.getRequest();
String response = task.getResponse();
//PUT YOUR CODE HERE
//possibly aggregate request and response in Map<String,String>
//or do something else with request and response
it.remove();
}
}
//avoid tight loop in "main" thread
if(!tasks.isEmpty()) Thread.sleep(100);
}
//now you have all responses for all async requests
//the following from your original code
//note: you should probably pass the responses from above
//to this next method (to keep your controller stateless)
String results = doWorkwithMultipleDataReturned();
model.addAttribute(results, results);
return "index";
}
//abstraction to wrap Callable and Future
class GetRequestTask {
private GetRequestWork work;
private FutureTask<String> task;
public GetRequestTask(String url, Executor executor) {
this.work = new GetRequestWork(url);
this.task = new FutureTask<String>(work);
executor.execute(this.task);
}
public String getRequest() {
return this.work.getUrl();
}
public boolean isDone() {
return this.task.isDone();
}
public String getResponse() {
try {
return this.task.get();
} catch(Exception e) {
throw new RuntimeException(e);
}
}
}
//Callable representing actual HTTP GET request
class GetRequestWork implements Callable<String> {
private final String url;
public GetRequestWork(String url) {
this.url = url;
}
public String getUrl() {
return this.url;
}
public String call() throws Exception {
return new DefaultHttpClient().execute(new HttpGet(getUrl()), new BasicResponseHandler());
}
}
}
Note that this code has not been tested.
For your Executor implementation, check out Spring's TaskExecutor and task:executor namespace. You probably want a reusable pool of threads for this use-case (instead of creating a new thread every time).

You should use AsyncHttpClient. You can make any number of requests, and it will make a call back to you when it gets a response. You can configure how many connections it can create. All the threading is handled by the library, so it's alot easier than managing the threads yourself.
take a look a the example here: https://github.com/AsyncHttpClient/async-http-client

Move your request code to separate method:
private String executeGet(String url){
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet(url);
ResponseHandler<String> responseHandler = new BasicResponseHandler();
return httpclient.execute(httpget, responseHandler);
}
And submit them to ExecutorService:
ExecutorService executorService = Executors.newCachedThreadPool();
Future<String> firstCallFuture = executorService.submit(() -> executeGet(url1));
Future<String> secondCallFuture = executorService.submit(() -> executeGet(url2));
String firstResponse = firstCallFuture.get();
String secondResponse = secondCallFuture.get();
executorService.shutdown();
Or
Future<String> firstCallFuture = CompletableFuture.supplyAsync(() -> executeGet(url1));
Future<String> secondCallFuture = CompletableFuture.supplyAsync(() -> executeGet(url2));
String firstResponse = firstCallFuture.get();
String secondResponse = secondCallFuture.get();
Or use RestTemplate as described in How to use Spring WebClient to make multiple calls simultaneously?

For parallel execution of multiple request with single HttpClient instance.
configure PoolingHttpClientConnectionManager for parallel execution.
HttpClientBuilder builder = HttpClientBuilder.create();
PlainConnectionSocketFactory plainConnectionSocketFactory = new PlainConnectionSocketFactory();
Registry<ConnectionSocketFactory> registry = RegistryBuilder.<ConnectionSocketFactory>create()
.register("http", plainConnectionSocketFactory).build();
PoolingHttpClientConnectionManager ccm = new PoolingHttpClientConnectionManager(registry);
ccm.setMaxTotal(BaseConstant.CONNECTION_POOL_SIZE); // For Example : CONNECTION_POOL_SIZE = 10 for 10 thread parallel execution
ccm.setDefaultMaxPerRoute(BaseConstant.CONNECTION_POOL_SIZE);
builder.setConnectionManager((HttpClientConnectionManager) ccm);
HttpClient objHttpClient = builder.build();

Related

Reactive Spring Boot API wrapping Elasticsearch's async bulk indexing

I am developing prototype for a new project. The idea is to provide a Reactive Spring Boot microservice to bulk index documents in Elasticsearch. Elasticsearch provides a High Level Rest Client which provides an Async method to bulk process indexing requests. Async delivers callbacks using listeners are mentioned here. The callbacks receive index responses (per requests) in batches. I am trying to send this response back to the client as Flux. I have come up with something based on this blog post.
Controller
#RestController
public class AppController {
#SuppressWarnings("unchecked")
#RequestMapping(value = "/test3", method = RequestMethod.GET)
public Flux<String> index3() {
ElasticAdapter es = new ElasticAdapter();
JSONObject json = new JSONObject();
json.put("TestDoc", "Stack123");
Flux<String> fluxResponse = es.bulkIndex(json);
return fluxResponse;
}
ElasticAdapter
#Component
class ElasticAdapter {
String indexName = "test2";
private final RestHighLevelClient client;
private final ObjectMapper mapper;
private int processed = 1;
Flux<String> bulkIndex(JSONObject doc) {
return bulkIndexDoc(doc)
.doOnError(e -> System.out.print("Unable to index {}" + doc+ e));
}
private Flux<String> bulkIndexDoc(JSONObject doc) {
return Flux.create(sink -> {
try {
doBulkIndex(doc, bulkListenerToSink(sink));
} catch (JsonProcessingException e) {
sink.error(e);
}
});
}
private void doBulkIndex(JSONObject doc, BulkProcessor.Listener listener) throws JsonProcessingException {
System.out.println("Going to submit index request");
BiConsumer<BulkRequest, ActionListener<BulkResponse>> bulkConsumer =
(request, bulkListener) ->
client.bulkAsync(request, RequestOptions.DEFAULT, bulkListener);
BulkProcessor.Builder builder =
BulkProcessor.builder(bulkConsumer, listener);
builder.setBulkActions(10);
BulkProcessor bulkProcessor = builder.build();
// Submitting 5,000 index requests ( repeating same JSON)
for (int i = 0; i < 5000; i++) {
IndexRequest indexRequest = new IndexRequest(indexName, "person", i+1+"");
String json = doc.toJSONString();
indexRequest.source(json, XContentType.JSON);
bulkProcessor.add(indexRequest);
}
System.out.println("Submitted all docs
}
private BulkProcessor.Listener bulkListenerToSink(FluxSink<String> sink) {
return new BulkProcessor.Listener() {
#Override
public void beforeBulk(long executionId, BulkRequest request) {
}
#SuppressWarnings("unchecked")
#Override
public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
for (BulkItemResponse bulkItemResponse : response) {
JSONObject json = new JSONObject();
json.put("id", bulkItemResponse.getResponse().getId());
json.put("status", bulkItemResponse.getResponse().getResult
sink.next(json.toJSONString());
processed++;
}
if(processed >= 5000) {
sink.complete();
}
}
#Override
public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
failure.printStackTrace();
sink.error(failure);
}
};
}
public ElasticAdapter() {
// Logic to initialize Elasticsearch Rest Client
}
}
I used FluxSink to create the Flux of Responses to send back to the Client. At this point, I have no idea whether this correct or not.
My expectation is that the calling client should receive the responses in batches of 10 ( because bulk processor processess it in batches of 10 - builder.setBulkActions(10); ). I tried to consume the endpoint using Spring Webflix Client. But unable to work it out. This is what I tried
WebClient
public class FluxClient {
public static void main(String[] args) {
WebClient client = WebClient.create("http://localhost:8080");
Flux<String> responseFlux = client.get()
.uri("/test3")
.retrieve()
.bodyToFlux(String.class);
responseFlux.subscribe(System.out::println);
}
}
Nothing is printing on console as I expected. I tried to use System.out.println(responseFlux.blockFirst());. It prints all the responses as a single batch at the end and not in batches at .
If my approach is correct, what is the correct way to consume it? For the solution in my mind, this client will reside is another Webapp.
Notes: My understanding of Reactor API is limited. The version of elasticsearch used is 6.8.
So made the following changes to your code.
In ElasticAdapter,
public Flux<Object> bulkIndex(JSONObject doc) {
return bulkIndexDoc(doc)
.subscribeOn(Schedulers.elastic(), true)
.doOnError(e -> System.out.print("Unable to index {}" + doc+ e));
}
Invoked subscribeOn(Scheduler, requestOnSeparateThread) on the Flux, Got to know about it from, https://github.com/spring-projects/spring-framework/issues/21507
In FluxClient,
Flux<String> responseFlux = client.get()
.uri("/test3")
.headers(httpHeaders -> {
httpHeaders.set("Accept", "text/event-stream");
})
.retrieve()
.bodyToFlux(String.class);
responseFlux.delayElements(Duration.ofSeconds(1)).subscribe(System.out::println);
Added "Accept" header as "text/event-stream" and delayed Flux elements.
With the above changes, was able to get the response in real time from the server.

Java Parallel HTTP requests with CompleteableFuture not very performant

I have a web service that makes http calls to another service. The web service breaks down one-to-many requests and attempts to make parallel one-to-one requests. For testing performance, I have kept the throughput to the backend constant. For example, I was able to achieve a throughput of 1000 req/sec with a 99th percentile latency of 100ms. So to test parallel requests that get broken down to 2 requests to the backend per each request to the web service, I sent 500 req/sec but achieved only a 150ms 99th percentile latency. Am I creating thread contention and/or making blocking http calls with the following code?
import java.util.HashMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
public class Foo {
private HTTPClient myHTTPClient = new HTTPClient("http://my_host.com"); //java ws rs http client
private interface Handler<REQ, RES> {
RES work(REQ req);
}
private <REQ, RES> CompletableFuture<RES> getAsync(REQ req, Handler<REQ, RES> handler) {
CompletableFuture<RES> future = CompletableFuture.supplyAsync(() -> {
return handler.work(req);
});
return future;
}
public RouteCostResponse getRouteCost(Point sources, List<Point> destinations) {
Map<String, Request> requests = new HashMap<>();
// create request bodies and keep track of request id's
for (Point destination : destinations) {
requests.put(destination.getId(), new RouteCostRequest(source, destination))
}
//create futures
ConcurrentMap<String, CompletableFuture<RouteCost>> futures = requests.entrySet().parallelStream()
.collect(Collectors.toConcurrentMap(
entry -> entry.getKey(),
entry -> getAsync(entry.getValue(), route -> myHTTPClient.getRoute(route)))
));
//retrieve results
ConcurrentMap<String, RouteCost> result = futures.entrySet().parallelStream()
.collect(Collectors.toConcurrentMap(
entry -> entry.getKey(),
entry -> entry.getValue().join()
));
RouteCostResponse response = new RouteCostResponse(result);
return response;
}
}
There is no thread contention with the following code, though it seems i have run into I/O issues. The key is to use an explicit thread pool. ForkJoinPool or Executors.fixedThreadPool
import java.util.HashMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
public class Foo {
private HTTPClient myHTTPClient = new HTTPClient("http://my_host.com"); //java ws rs http client
private static final ForkJoinPool pool = new ForkJoinPool(1000);
private interface Handler<REQ, RES> {
RES work(REQ req);
}
private <REQ, RES> CompletableFuture<RES> getAsync(REQ req, Handler<REQ, RES> handler) {
CompletableFuture<RES> future = CompletableFuture.supplyAsync(() -> {
return handler.work(req);
});
return future;
}
public RouteCostResponse getRouteCost(Point sources, List<Point> destinations) {
Map<String, Request> requests = new HashMap<>();
// create request bodies and keep track of request id's
for (Point destination : destinations) {
requests.put(destination.getId(), new RouteCostRequest(source, destination))
}
//create futures
ConcurrentMap<String, CompletableFuture<RouteCost>> futures = requests.entrySet().stream()
.collect(Collectors.toConcurrentMap(
entry -> entry.getKey(),
entry -> getAsync(entry.getValue(), route -> myHTTPClient.getRoute(route)))
));
//retrieve results
ConcurrentMap<String, RouteCost> result = futures.entrySet().stream()
.collect(Collectors.toConcurrentMap(
entry -> entry.getKey(),
entry -> entry.getValue().join()
));
RouteCostResponse response = new RouteCostResponse(result);
return response;
}
}

Cancel rest request if response takes too long

I would like to cancel a REST request, if the response takes longer than 3 seconds, but I haven't managed to figure out how to do it.
So let's say I have a thread A:
#Override
public void run() {
RestTemplate restTemplate = new RestTemplate();
IsAliveMessage isAliveMessage = new IsAliveMessage(nodeInfo.getHostname(), nodeInfo.getPort());
IsAliveResponse isAliveResponse = restTemplate.postForObject(
"http://" + nodeInfo.getHostname() + ":" + nodeInfo.getPort() + "/node/isAlive",
isAliveMessage,
IsAliveResponse.class);
}
that makes a request and expects an answer from B:
#RequestMapping( value="/isAlive", method= RequestMethod.POST)
public IsAliveResponse isAlive() throws ConnectException {
try {
Thread.sleep(100000);
IsAliveResponse response = new IsAliveResponse("here here!" ,true);
return response;
} catch (Exception e) {
Thread.currentThread().interrupt();
}
}
B "sleeps" and doesn't answer, but A keeps waiting for that answer to come. How can I make A give up the waiting after a certain time span?
Thanks in advance
You can configure your RestTemplate to wait three seconds for response like this:
RestTemplate restTemplate = new RestTemplate(getClientHttpRequestFactory());
private ClientHttpRequestFactory getClientHttpRequestFactory() {
int timeout = 3000;
HttpComponentsClientHttpRequestFactory clientHttpRequestFactory =
new HttpComponentsClientHttpRequestFactory();
clientHttpRequestFactory.setConnectTimeout(timeout);
clientHttpRequestFactory.setReadTimeout(timeout);
return clientHttpRequestFactory;
}

How to use RestTemplate efficiently with RequestFactory?

I am working on a project in which I need to make a HTTP URL call to my server which is running Restful Service which returns back the response as a JSON String. I am using RestTemplate here along with HttpComponentsClientHttpRequestFactory to execute an url.
I have setup a http request timeout (READ and CONNECTION time out) on my RestTemplate by using HttpComponentsClientHttpRequestFactory.
Below is my Interface:
public interface Client {
// for synchronous
public String getSyncData(String key, long timeout);
// for asynchronous
public String getAsyncData(String key, long timeout);
}
Below is my implementation of Client interface -
public class DataClient implements Client {
private final RestTemplate restTemplate = new RestTemplate();
private ExecutorService executor = Executors.newFixedThreadPool(10);
// for synchronous call
#Override
public String getSyncData(String key, long timeout) {
String response = null;
try {
Task task = new Task(key, restTemplate, timeout);
// direct call, implementing sync call as async + waiting is bad idea.
// It is meaningless and consumes one thread from the thread pool per a call.
response = task.call();
} catch (Exception ex) {
PotoLogging.logErrors(ex, DataErrorEnum.CLIENT_ERROR, key);
}
return response;
}
// for asynchronous call
#Override
public Future<String> getAsyncData(String key, long timeout) {
Future<String> future = null;
try {
Task task = new Task(key, restTemplate, timeout);
future = executor.submit(task);
} catch (Exception ex) {
PotoLogging.logErrors(ex, DataErrorEnum.CLIENT_ERROR, key);
}
return future;
}
}
And below is my simple Task class
class Task implements Callable<String> {
private RestTemplate restTemplate;
private String key;
private long timeout; // in milliseconds
public Task(String key, RestTemplate restTemplate, long timeout) {
this.key = key;
this.restTemplate = restTemplate;
this.timeout = timeout;
}
public String call() throws Exception {
String url = "some_url_created_by_using_key";
// does this looks right the way I am setting request factory?
// or is there any other effficient way to do this?
restTemplate.setRequestFactory(clientHttpRequestFactory());
String response = restTemplate.exchange(url, HttpMethod.GET, null, String.class);
return response;
}
private static ClientHttpRequestFactory clientHttpRequestFactory() {
// is it ok to create a new instance of HttpComponentsClientHttpRequestFactory everytime?
HttpComponentsClientHttpRequestFactory factory = new HttpComponentsClientHttpRequestFactory();
factory.setReadTimeout(timeout); // setting timeout as read timeout
factory.setConnectTimeout(timeout); // setting timeout as connect timeout
return factory;
}
}
Now my question is - Does the way I am using RestTemplate along with setRequestFactory in the call method of Task class everytime is efficient? Since RestTemplate is very heavy to be created so not sure whether I got it right.
And is it ok to create a new instance of HttpComponentsClientHttpRequestFactory everytime? Will it be expensive?
What is the right and efficient way to use RestTemplate if we need to setup Read and Connection timeout on it.
This library will be used like this -
String response = DataClientFactory.getInstance().getSyncData(keyData, 100);
From what I can tell, you're reusing the same RestTemplate object repeatedly, but each Task is performing this line: restTemplate.setRequestFactory(clientHttpRequestFactory());. This seems like it can have race conditions, e.g. one Task can set the RequestFactory that another Task will then accidentally use.
Otherwise, it seems like you're using RestTemplate correctly.
How often do your timeouts change? If you mostly use one or two timeouts, you can create one or two RestTemplates using the RequestFactory constructor with the pre-loaded timeout. If you're a stickler for efficiency, create a HashMap<Integer, RestTemplate> that caches a RestTemplate with a particular timeout each time a new timeout is requested.
Otherwise, looking at the code for RestTemplate's constructor, and for HttpComponentsClientHttpRequestFactory's constructor, they don't look exceptionally heavy, so calling them repeatedly probably won't be much of a bottleneck.

Handling multiple threads/outgoing http requests

I am working on a Java/Spring web application that for each incoming request does the following:
fires off a number of requests to third party web servers,
retrieves the response from each,
parses each response into a list of JSON objects,
collates the lists of JSON objects into a single list and returns it.
I am creating a separate thread for each request sent to the third party web servers. I am using the Apache PoolingClientConnectionManager. Here is an outline of the code I am using:
public class Background {
static class CallableThread implements Callable<ArrayList<JSONObject>> {
private HttpClient httpClient;
private HttpGet httpGet;
public CallableThread(HttpClient httpClient, HttpGet httpGet) {
this.httpClient = httpClient;
this.httpGet = httpGet;
}
#Override
public ArrayList<JSONObject> call() throws Exception {
HttpResponse response = httpClient.execute(httpGet);
return parseResponse(response);
}
private ArrayList<JSONObject> parseResponse(HttpResponse response) {
ArrayList<JSONObject> list = null;
// details omitted
return list;
}
}
public ArrayList<JSONObject> getData(List<String> urlList, PoolingClientConnectionManager connManager) {
ArrayList<JSONObject> jsonObjectsList = null;
int numThreads = urlList.size();
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
List<Future<ArrayList<JSONObject>>> list = new ArrayList<Future<ArrayList<JSONObject>>>();
HttpClient httpClient = new DefaultHttpClient(connManager);
for (String url : urlList) {
HttpGet httpGet = new HttpGet(url);
CallableThread worker = new CallableThread(httpClient, httpGet);
Future<ArrayList<JSONObject>> submit = executor.submit(worker);
list.add(submit);
}
for (Future<ArrayList<JSONObject>> future : list) {
try {
if (future != null) {
if (jsonObjectsList == null) {
jsonObjectsList = future.get();
} else {
if (future.get() != null) {
jsonObjectsList.addAll(future.get());
}
}
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
executor.shutdown();
return jsonObjectsList;
}
}
This all works fine. My question is in relation to how well this code will scale as the traffic to my website increases? Is there a better way to implement this? For example, by implementing non-blocking I/O to reduce the number of threads being created. Are there libraries or frameworks that might help?
At the moment, I am using Java 6 and Spring Framework 3.1
Thanks in advance
I wouldn't recommend to implement this as a synchronous service. Do it asynchronously. Get your request, pool the callables, and return a resource location where the client can later request the result.
You've got to be pooling this callables in an executor. Poll the executor in a background process and make avalable the results in the location you returned at the first request. Doing it this way, it would be easier to control your available resuources, and deny cleanly a processing requests if there aren't any more resources available.
Non blocking IO won't reduce the number of threads, it just delegates the "job" to another thread, in order for the service thread not to be blocked and to be able to receive more requests.
use REST.
Receive a POST request, and answer with something like this:
HTTP/1.1 202 Accepted
Location: /result/to/consult/later
The client can then request the resutl at the given location. If the processing has not finished, then answer with:
HTTP/1.1 201 Created
If its done then return a HTTP/1.1 200 OK with the resulting JSON.

Categories

Resources