Why does async-http-client not throttle my requests?

I have an Akka actor that owns an AsyncHttpClient. This actor must handle a lot of asynchronous requests. Because my system cannot handle thousands of requests simultaneously, I need to limit the number of concurrent requests.
Right now, I'm doing this:
AsyncHttpClientConfig config = new AsyncHttpClientConfig.Builder().setAllowPoolingConnection(true)
.addRequestFilter(new ThrottleRequestFilter(32))
.setMaximumConnectionsPerHost(16)
.setMaxRequestRetry(5)
.build();
final AsyncHttpClient httpClient = new AsyncHttpClient(new NettyAsyncHttpProvider(config));
When my actor receives a message, I use the client like this :
Future<Integer> f = httpClient.prepareGet(url).execute(
    new AsyncCompletionHandler<Integer>() {
        @Override
        public Integer onCompleted(Response response) throws Exception {
            // handle successful request
        }
        @Override
        public void onThrowable(Throwable t) {
            // handle failed request
        }
    }
);
The problem is that requests are never put in the client queue; they are all processed as if the configuration did not matter. Why doesn't this work as it should?

From the maintainer:
setMaxConnectionsPerHost only caps the number of connections that can be open to a given host. There's no built-in queuing mechanism for requests that might need a connection while there's none available.
So basically, it's a hard limit. Also, in versions of the library prior to, I believe, 1.9.10, the maximumConnectionsPerHost field was not being properly utilized by the code to limit the number of concurrent connections per host. Instead, there was a bug where the client only looked at the maximumConnectionsTotal field.
Link to issue referenced on GitHub
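Since the connection cap does not queue excess requests, one option is to queue them on the caller's side. The following is a minimal sketch using only the JDK, not part of async-http-client. ConcurrencyLimiter and its methods are hypothetical names, and each request is modeled as a Supplier of a CompletableFuture rather than a real client call:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical helper: at most maxInFlight requests run at once, the rest wait in a FIFO queue.
public class ConcurrencyLimiter {
    private final int maxInFlight;
    private int inFlight = 0;
    private final Queue<Runnable> waiting = new ArrayDeque<>();

    public ConcurrencyLimiter(int maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    // Starts the request now if a slot is free, otherwise queues it without blocking the caller.
    public <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> request) {
        CompletableFuture<T> result = new CompletableFuture<>();
        Runnable start = () -> {
            CompletableFuture<T> f;
            try {
                f = request.get();
            } catch (Throwable t) {
                f = new CompletableFuture<>();
                f.completeExceptionally(t);
            }
            f.whenComplete((value, error) -> {
                if (error != null) {
                    result.completeExceptionally(error);
                } else {
                    result.complete(value);
                }
                onFinished();
            });
        };
        synchronized (this) {
            if (inFlight >= maxInFlight) {
                waiting.add(start);
                return result;
            }
            inFlight++;
        }
        start.run();
        return result;
    }

    // Hands the freed slot to the next queued request, or releases it if none is waiting.
    private void onFinished() {
        Runnable next;
        synchronized (this) {
            next = waiting.poll();
            if (next == null) {
                inFlight--;
                return;
            }
        }
        next.run();
    }

    public synchronized int inFlight() {
        return inFlight;
    }

    public synchronized int queued() {
        return waiting.size();
    }
}
```

In the actor, the supplier would wrap the real call to the client and complete the future from onCompleted and onThrowable. That adapter is an assumption and is not shown here. Note that requests which complete synchronously are started recursively, so a production version would probably hand queued work to an executor.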

Related

parallel stream with kafka consumer records

I have kafka records:
ConsumerRecords<String, Events> records = kafkaConsumer.poll(POLL_TIMEOUT);
I want to run the below code using parallel streams, not multithreading.
records.forEach((record) -> {
Event event = record.value();
HTTPSend.send(event);
});
I tried with multithreading, but I want to try a parallel stream:
for (ConsumerRecord<String, Event> record : records) {
    executor.execute(new Runnable() {
        @Override
        public void run() {
            // send this record's event, not the Event class itself
            HTTPSend.send(record.value());
        }
    });
}
Actually, I'm facing an issue with HTTPSend.send when multithreading (even with a thread pool of 1 thread). I'm getting
"Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target".
This is a request over HTTPS. The error occurs only the first time the request is made. Afterwards, the exception vanishes. Poof!
For multithreading I'm using:
int threadCOunt=1;
BlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(threadCOunt, true);
RejectedExecutionHandler handler = new ThreadPoolExecutor.CallerRunsPolicy();
ExecutorService executor = new ThreadPoolExecutor(threadCOunt, threadCOunt, 0L, TimeUnit.MILLISECONDS, queue, handler);
HTTPSend.send() is:
long sizeSend = 0;
SSLContext sc = null;
try {
sc = SSLContext.getInstance("TLS");
sc.init(null, TRUST_ALL_CERTS, new SecureRandom());
} catch (NoSuchAlgorithmException | KeyManagementException e) {
LOGGER.error("Failed to create SSL context", e);
}
// Ignore differences between given hostname and certificate hostname
HostnameVerifier hv = (hostname, session) -> true;
// Create the REST client and configure it to connect meta
Client client = ClientBuilder.newBuilder()
.hostnameVerifier(hv)
.sslContext(sc).build();
WebTarget baseTarget = client.target(getURL()).path(HTTP_PATH);
Response jsonResponse = null;
try {
StringBuilder eventsBatchString = new StringBuilder();
eventsBatchString.append(this.getEvent(event));
Entity<String> entity = Entity.entity(eventsBatchString.toString(), MediaType.APPLICATION_JSON_TYPE);
builder = baseTarget.request();
LOGGER.debug("about to send the event {} and URL {}", entity, getURL());
jsonResponse = builder.header(HTTP_ACK_CHANNEL, guid.toString())
.header("Content-type", MediaType.APPLICATION_JSON)
.header("Authorization", String.format("Meta %s", eventsModuleConfig.getSecretKey()))
.post(entity);

I see what you want to do, and I'm not sure that's the best idea (I'm also not sure it's not).
The poll / commit model of Kafka allows simple backpressure and retention of the last item processed if you crashed. By returning to your poll loop "immediately" you are telling Kafka "I am ready for more", and committing the offset (manually or automatically) tells Kafka that you have successfully read up to that point.
What you seem to want to do is read off Kafka as fast as possible, commit offsets, put the Kafka records into an executor queue, and then balance your requests per second etc. from there.
I'm not 100% sure that's a good idea: what happens if your app crashes? You may have committed some Kafka messages that actually didn't make it upstream. If you do really want to do this, I would suggest manually committing the offset (via commitSync) upon completion of the Runnable, instead of letting the high level consumer do it for you.
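The "commit only after the work has finished" idea can be sketched with the JDK alone. This is an illustration under assumptions, not Kafka client code: BatchThenCommit is a hypothetical name, send stands in for HTTPSend.send, and commit stands in for a call such as commitSync on the consumer:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical helper: run every send on the executor, and commit only if all of them succeeded.
public class BatchThenCommit {
    public static <T> void processBatch(List<T> batch, Consumer<T> send,
                                        Executor executor, Runnable commit) {
        CompletableFuture<?>[] sends = batch.stream()
            .map(item -> CompletableFuture.runAsync(() -> send.accept(item), executor))
            .toArray(CompletableFuture[]::new);
        // join() throws CompletionException if any send failed, so commit is skipped in that case
        CompletableFuture.allOf(sends).join();
        commit.run();
    }
}
```

If any send throws, the batch is not committed, so after a crash or restart the same records would be polled again. That means the receiving server should tolerate duplicates.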
Here are some reasons you might want a thread executor. I think these can be accomplished with Kafka too.
You may want to post multiple messages to the web server at the same time. A well-partitioned Kafka topic lets multiple consumers / consumer groups consume multiple partitions, which, assuming a perfectly scaling HTTP server, lets you parallelize the posting of messages to your server. Yay for process-based concurrency!
Maybe the web server is not perfectly scalable, or is slow for this request (say each request takes 1 second), so you need to limit the number of requests per second the web server receives. With a queue you might have a couple of threads posting without backing up Kafka.
In this case you can set max.poll.records to a value that matches what your web server can handle. There's probably a better way to do this too, although it's escaping me at the moment.
If your web server takes a long time to respond, you may get errors related to failing heartbeats. In that case I direct you to this SO answer on the timeout / heartbeat topic.
Instead of using a thread executor, which makes synchronous HTTP requests appear to be async, I would use an evented HTTP client like Netty, thus achieving parallelism without thread-based concurrency.
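For the parallel-stream variant the question asks about, any Iterable (which ConsumerRecords is) can be turned into a parallel stream with StreamSupport. A minimal sketch, with ParallelForEach as a hypothetical name and action standing in for the HTTP send. Running the stream inside a dedicated ForkJoinPool to bound parallelism relies on commonly observed behavior of the stream implementation rather than a documented guarantee:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

// Hypothetical helper: applies the action to every item using a parallel stream.
public class ParallelForEach {
    public static <T> void forEachParallel(Iterable<T> items, int parallelism, Consumer<T> action) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // submit(...) runs the stream inside this pool; join() waits for all items
            pool.submit(() -> StreamSupport.stream(items.spliterator(), true).forEach(action)).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

The call returns only after every item has been processed, so offsets can be committed afterwards. Processing order is not preserved, and the action must be thread-safe.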

For solving a "slow consumer" use case where you're doing I/O processing, you should use something like Parallel Consumer (PC) to avoid the "head of line blocking" problem you're describing.
By using PC, you can process all your keys in parallel, regardless of how long it takes to do your I/O.
It also comes with a non blocking Vert.x module which more efficiently uses the CPU.
PC directly solves for this, by sub partitioning the input partitions by key and processing each key in parallel.
It also tracks per record acknowledgement. Check out Parallel Consumer on GitHub (it's open source BTW, and I'm the author).

Non-blocking reverse proxy with netty

I'm trying to write a non-blocking proxy with netty 4.1. I have a "FrontHandler" which handles incoming connections, and then a "BackHandler" which handles outgoing ones. I'm following the HexDumpProxyHandler (https://github.com/netty/netty/blob/ed4a89082bb29b9e7d869c5d25d6b9ea8fc9d25b/example/src/main/java/io/netty/example/proxy/HexDumpProxyFrontendHandler.java#L67)
In this code I have found:
@Override
public void channelRead(final ChannelHandlerContext ctx, Object msg) {
    if (outboundChannel.isActive()) {
        outboundChannel.writeAndFlush(msg).addListener(new ChannelFutureListener() {
This means that the incoming message is only written if the outbound client connection is already active. This is obviously not ideal for an HTTP proxy, so I am wondering what the best way to handle it would be.
I am wondering if disabling auto-read on the front-end connection (and only triggering reads manually once the outgoing client connection is ready) is a good option. I could then enable autoRead on the child socket again in the "channelActive" event of the backend handler. However, I am not sure how many messages I would get in the handler for each "read()" invocation (using HttpDecoder, I assume I would get the initial HttpRequest, but I'd really like to avoid getting the subsequent HttpContent / LastHttpContent messages until I manually trigger read() again and enable autoRead on the channel).
Another option would be to use a Promise to get the Channel from the client ChannelPool:
private void setCurrentBackend(HttpRequest request) {
pool.acquire(request, backendPromise);
backendPromise.addListener((FutureListener<Channel>) future -> {
Channel c = future.get();
if (!currentBackend.compareAndSet(null, c)) {
pool.release(c);
throw new IllegalStateException();
}
});
}
and then do the copying from input to output through that promise, e.g.:
private void handleLastContent(ChannelHandlerContext frontCtx, LastHttpContent lastContent) {
doInBackend(c -> {
c.writeAndFlush(lastContent).addListener((ChannelFutureListener) future -> {
if (future.isSuccess()) {
future.channel().read();
} else {
pool.release(c);
frontCtx.close();
}
});
});
}
private void doInBackend(Consumer<Channel> action) {
Channel c = currentBackend.get();
if (c == null) {
backendPromise.addListener((FutureListener<Channel>) future -> action.accept(future.get()));
} else {
action.accept(c);
}
}
but I'm not sure how good an idea it is to keep the promise there forever and do all the writes from "front" to "back" by adding listeners to it. I'm also not sure how to instantiate the promise so that the operations are performed in the right thread. Right now I'm using:
backendPromise = group.next().<Channel> newPromise(); // bad
// or
backendPromise = frontCtx.channel().eventLoop().newPromise(); // OK?
(where group is the same eventLoopGroup as used in the ServerBootstrap of the frontend).
If they're not handled through the right thread, I assume it could be problematic to have the "else { }" optimization in the "doInBackend" method, which avoids the Promise and writes to the channel directly.

The no-autoread approach doesn't work by itself, because the HttpRequestDecoder creates several messages even if only one read() was performed.
I have solved it by using chained CompletableFutures.
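The answer does not show its code, so the following is only one possible reading of "chained CompletableFutures", written with the JDK alone. OrderedBackendWrites is a hypothetical name, C stands in for the backend Channel, and each write is modeled as a function returning a CompletableFuture<Void> instead of a Netty ChannelFuture:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper: writes are chained so that each one starts only after the backend
// is available and the previous write has completed.
public class OrderedBackendWrites<C> {
    private CompletableFuture<C> tail;

    public OrderedBackendWrites(CompletableFuture<C> backendReady) {
        this.tail = backendReady;
    }

    // Appends a write to the chain and returns a future that completes when that write is done.
    public synchronized CompletableFuture<Void> enqueue(Function<C, CompletableFuture<Void>> write) {
        CompletableFuture<C> next = tail.thenCompose(c -> write.apply(c).thenApply(ignored -> c));
        tail = next;
        return next.thenApply(c -> null);
    }
}
```

With this shape, the HttpRequest, HttpContent and LastHttpContent messages can be enqueued as they arrive, even before the backend channel has been acquired. A failed write fails every later write in the chain, which is usually the point at which the front connection should be closed.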

I have worked on a similar proxy application based on the MQTT protocol, which was used to build a real-time chat application. The application I had to design was asynchronous in nature, so I did not face this problem, because when
outboundChannel.isActive() == false
then I can simply keep the messages in a queue or a persistent DB and process them once the outboundChannel is up. However, since you are talking about an HTTP application, the application is synchronous in nature, meaning that the client cannot keep sending packets until the outboundChannel is up and running. The option you suggest is that the packet will only be read once the channel is active, and that you handle the message reads manually by disabling auto-read in ChannelConfig.
What I would suggest instead is that you check whether the outboundChannel is active. If the channel is active, send the packet forward for processing. If the channel is not active, reject the packet by sending back an error response, similar to a 404.
Along with this, you should configure your client to keep retrying at certain intervals, and handle what needs to be done if the channel takes too long to become active and readable. Manually handling channelRead is generally not preferred and is an anti-pattern. You should let Netty handle that for you in the most efficient way.

Interrupt a long running Jersey Client operation

I am using the Oracle Jersey Client, and am trying to cancel a long running get or put operation.
The Client is constructed as:
JacksonJsonProvider provider = new JacksonJsonProvider(new ObjectMapper());
ClientConfig clientConfig = new DefaultClientConfig();
clientConfig.getSingletons().add(provider);
Client client = Client.create(clientConfig);
The following code is executed on a worker thread:
File bigZipFile = new File("/home/me/everything.zip");
WebResource resource = client.resource("https://putfileshere.com");
Builder builder = resource.getRequestBuilder();
builder.type("application/zip").put(bigZipFile); //This will take a while!
I want to cancel this long-running put. When I try to interrupt the worker thread, the put operation continues to run. From what I can see, the Jersey Client makes no attempt to check for Thread.interrupted().
I see the same behavior when using an AsyncWebResource instead of WebResource and using Future.cancel(true) on the Builder.put(..) call.
So far, the only solution I have come up with to interrupt this is throwing a RuntimeException in a ContainerListener:
client.addFilter(new ConnectionListenerFilter(
new OnStartConnectionListener(){
public ContainerListener onStart(ClientRequest cr) {
return new ContainerListener(){
public void onSent(long delta, long bytes) {
//If the thread has been interrupted, stop the operation
if (Thread.interrupted()) {
throw new RuntimeException("Upload or Download canceled");
}
//Report progress otherwise
}
}...
I am wondering if there is a better solution (perhaps when creating the Client) that correctly handles interruptible I/O without using a RuntimeException.

I am wondering if there is a better solution (perhaps when creating the Client) that correctly handles interruptible I/O without using a RuntimeException.
Yeah, interrupting the thread will only work if the code is watching for the interrupts or calling other methods (such as Thread.sleep(...)) that watch for it.
Throwing an exception out of listener doesn't sound like a bad idea. I would certainly create your own RuntimeException class such as TimeoutRuntimeException or something so you can specifically catch and handle it.
Another thing to do would be to close the underlying IO stream that is being written to which would cause an IOException but I'm not familiar with Jersey so I'm not sure if you can get access to the connection.
Ah, here's an idea. Instead of putting the File, how about putting some sort of extension on a BufferedInputStream that is reading from the File but also has a timeout. So Jersey would be reading from the buffer and at some point it would throw an IOException if the timeout expires.
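The stream-wrapper idea could look roughly like the sketch below, which uses only java.io. CancellableInputStream is a hypothetical name, the timeout is an overall deadline for the whole transfer, and whether a given Jersey version accepts an InputStream as the request entity is an assumption to verify:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Hypothetical wrapper: every read first checks for thread interruption and an overall deadline.
public class CancellableInputStream extends FilterInputStream {
    private final long deadlineNanos;

    public CancellableInputStream(InputStream in, long timeoutMillis) {
        super(in);
        this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
    }

    private void checkCancelled() throws IOException {
        if (Thread.currentThread().isInterrupted()) {
            throw new InterruptedIOException("Transfer interrupted");
        }
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new IOException("Transfer timed out");
        }
    }

    @Override
    public int read() throws IOException {
        checkCancelled();
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        checkCancelled();
        return super.read(b, off, len);
    }
}
```

Wrapping a FileInputStream for the zip file in this class and passing it to put(...) in place of the File would make the upload fail with an IOException once the worker thread is interrupted or the deadline passes. The check only runs between reads, so a write that is blocked on the socket is not affected.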

As of Jersey 2.35, the above API has changed. A timeout has been introduced in the client builder, which can set a read timeout. If the server takes too long to respond, the underlying socket will time out. However, once the server starts sending the response, it will not time out. This helps only if the server does not start sending a partial response early, which depends on the server implementation.
client = (JerseyClient) JerseyClientBuilder
    .newBuilder()
    .connectTimeout(1 * 1000, TimeUnit.MILLISECONDS)
    .readTimeout(5 * 1000, TimeUnit.MILLISECONDS)
    .build();
The current filters and interceptors are for data only and the solution posted in the original question will not work with filters and interceptors (though I admit I may have missed something there).
Another way is to get hold of the underlying HttpUrlConnection (for standard Jersey client configuration) and it seems to be possible with org.glassfish.jersey.client.HttpUrlConnectorProvider
HttpUrlConnectorProvider httpConProvider=new HttpUrlConnectorProvider();
httpConProvider.connectionFactory(new CustomHttpUrlConnectionfactory());
public static class CustomHttpUrlConnectionfactory implements
HttpUrlConnectorProvider.ConnectionFactory{
@Override
public HttpURLConnection getConnection(URL url) throws IOException {
System.out.println("CustomHttpUrlConnectionfactory ..... called");
return (HttpURLConnection)url.openConnection();
}//getConnection closing
}//inner-class closing
I did try the connection provider approach, however, I could not get that working. The idea would be to keep reference to the connection by some means (thread id etc.) and close it if the communication is taking too long. The primary problem was I could not find a way to register the provider with the client. The standard
.register(httpConProvider)
mechanism does not seem to work (or perhaps it is not supposed to work like that) and the documentation is a bit sketchy in that direction.

Modify Netty ServerBootstrap ChannelInitializer

I have a ServerBootstrap configured with a fairly standard Http-Codec ChannelInitializer.
On shutdown my server waits for a grace period where it can still handle incoming requests. My server supports keep-alive, but on shutdown I want to make sure every HttpResponse sent closes the connection with HTTP header "Connection: close" and that the channel is closed after the write. This is only necessary on server shutdown.
I have a ChannelHandler to support that:
@ChannelHandler.Sharable
public class CloseConnectionHandler extends ChannelOutboundHandlerAdapter {
    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {
        HttpResponse response = (HttpResponse) msg;
        if (isKeepAlive(response)) {
            // switch the response to "Connection: close" and close the channel after the write
            setKeepAlive(response, false);
            promise.addListener(ChannelFutureListener.CLOSE);
        }
        ctx.write(msg, promise);
    }
}
I keep track of all connected clients using a ChannelGroup, so I can dynamically modify the pipeline of each client at the point of shutdown to include my CloseConnectionHandler. This works without problems.
However, new connections in the grace period have their pipeline configuration provided by the original ServerBootstrap ChannelInitializer, and I can't see a way of dynamically re-configuring that?
As a work-around I can have the CloseConnectionHandler configured in the standard pipeline and turned off with a boolean, only activating it on shutdown. But I'd rather avoid that if possible, seems a bit unnecessary.

There is currently no way to "replace" the initializer at run-time, so using a flag etc. would be the best bet.
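The flag itself can be a single shared object that is consulted on every write, so connections accepted during the grace period see it too. A minimal JDK-only sketch of that logic, with DrainSwitch as a hypothetical name and a plain Map standing in for the response headers (this is not Netty API):

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical shared switch: off during normal operation, flipped once when shutdown begins.
public class DrainSwitch {
    private final AtomicBoolean draining = new AtomicBoolean(false);

    public void startDraining() {
        draining.set(true);
    }

    // Marks the response as "Connection: close" while draining.
    // Returns true if the caller should close the connection after the write.
    public boolean markIfDraining(Map<String, String> responseHeaders) {
        if (!draining.get()) {
            return false;
        }
        responseHeaders.put("Connection", "close");
        return true;
    }
}
```

In the handler from the question, the same check would wrap the existing isKeepAlive / setKeepAlive logic, and the one handler instance could stay in the standard pipeline because it is already marked sharable.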

Atmosphere: Multiple subscriptions over single HttpConnection

I'm using Atmosphere in my Spring MVC app to facilitate push, using a streaming transport.
Throughout the lifecycle of my app, the client will subscribe and unsubscribe for many different topics.
Atmosphere seems to use a single http connection per subscription - ie., every call to $.atmosphere.subscribe(request) creates a new connection. This quickly exhausts the number of connections allowed from the browser to the atmosphere server.
Instead of creating a new resource each time, I'd like to be able to add and remove the AtmosphereResource to broadcasters after its initial creation.
However, as the AtmosphereResource is a one-to-one representation of the inbound request, each time the client sends a request to the server, it arrives on a new AtmosphereResource, meaning I have no way to reference the original resource and append it to the topic's Broadcaster.
I've tried using both $.atmosphere.subscribe(request) and calling atmosphereResource.push(request) on the resource returned from the original subscribe() call. However, this made no difference.
What is the correct way to approach this?

Here's how I got it working:
First, when the client does their initial connect, ensure that the atmosphere-specific headers are accepted by the browser before calling suspend():
@RequestMapping("/subscribe")
public ResponseEntity<HttpStatus> connect(AtmosphereResource resource)
{
resource.getResponse().setHeader("Access-Control-Expose-Headers", ATMOSPHERE_TRACKING_ID + "," + X_CACHE_DATE);
resource.suspend();
}
Then, when the client sends additional subscribe requests, although they come in on a different resource, they contain the ATMOSPHERE_TRACKING_ID of the original resource. This allows you to look it up via the resourceFactory:
@RequestMapping(value="/subscribe", method=RequestMethod.POST)
public ResponseEntity<HttpStatus> addSubscription(AtmosphereResource resource, @RequestParam("topic") String topic)
{
String atmosphereId = resource.getResponse().getHeader(ATMOSPHERE_TRACKING_ID);
if (atmosphereId == null || atmosphereId.isEmpty())
{
log.error("Cannot add subscription, as the atmosphere tracking ID was not found");
return new ResponseEntity<HttpStatus>(HttpStatus.BAD_REQUEST);
}
AtmosphereResource originalResource = resourceFactory.find(atmosphereId);
if (originalResource == null)
{
log.error("The provided Atmosphere tracking ID is not associated to a known resource");
return new ResponseEntity<HttpStatus>(HttpStatus.BAD_REQUEST);
}
Broadcaster broadcaster = broadcasterFactory.lookup(topic, true);
broadcaster.addAtmosphereResource(originalResource);
log.info("Added subscription to {} for atmosphere resource {}",topic, atmosphereId);
return getOkResponse();
}
