Flux behaving differently for Webclient vs simple Flux.just - java

I have a basic REST controller in Spring. To try out Spring WebFlux and understand its non-blocking nature, I created two controller mappings: one to read and one to serve the WebClient call (as shown below).
@GetMapping("/slow-service-tweets")
private List<String> getAllTweets() {
try {
Thread.sleep(2000L); // delay
} catch (InterruptedException e) {
e.printStackTrace();
}
return Arrays.asList(
"Item1", "Item2","Item3");
}
And here is my test GET endpoint, which just triggers the code given below (first version):
@GetMapping("/test")
public void doSomething(){
log.info("Starting NON-BLOCKING Controller!");
Flux<String> tweetFlux = WebClient.create()
.get()
.uri("http://localhost:9090/slow-service-tweets")
.retrieve()
.bodyToFlux(String.class);
tweetFlux.subscribe(tweet ->{
try {
log.info("i am back");
Thread.sleep(6000L);
} catch (InterruptedException e) {
e.printStackTrace();
}
log.info(tweet.toString());});
log.info("Exiting NON-BLOCKING Controller!");
}
The above code behaves exactly as it should. The output is:
Starting NON-BLOCKING Controller!
Exiting NON-BLOCKING Controller!
Item1
Item2
Item3
The reason is that the thread does not block on the Flux subscription and proceeds ahead.
Now please look at the second version below.
@GetMapping("/test")
public void doSomething(){
System.out.println("i am here");
Flux<Integer> f= Flux.just(1,2,3,4,5);
// Flux<Integer> f= Flux.fromIterable(testService.returnStaticList());
f.subscribe(consumer->{
try {
log.info("consuming");
Thread.sleep(2000L);
} catch (InterruptedException e) {
e.printStackTrace();
}
log.info(consumer);
});
log.info("doing something else");
}
Ideally, as in the earlier example, "doing something else" should be printed immediately.
But no matter what I do, it takes 10 seconds to print all the elements, and only then is "doing something else" printed. Output below:
i am here
consuming
1
2
3
4
5
doing something else
Can anyone please explain what I am missing here?

I feel like I need to start with a warning - this is most certainly not how to use Webflux. Any time you call subscribe, you're almost certainly doing something wrong - that should be left to the framework. Instead, you should be using:
doOnNext() for side effects like logging;
delayElements() if you want to delay each element in a Flux, rather than using Thread.sleep();
Returning your Flux (or returning some publisher created by chaining your Flux using operators) from your doSomething() method, to allow the framework to subscribe and therefore execute your reactive chain.
If you follow the above "normal" way of doing things, you likely won't run into these sorts of problems with blocking / threading causing unexpected problems. If you do more niche things like subscribe yourself, block threads without thinking if they should be blocked, etc. - then you'll likely cause yourself a world of issues.
However, to answer the question directly as to why this behaviour happens - it's because of threading. Flux.just() does not switch threads on its own, so everything runs on the subscribing thread (which means the same thread that's actually executing your doSomething() method in the first place). Since there's only one thread at play here, your subscriber blocks this one thread until it's complete, so nothing else can execute. If you were to tell your Flux to subscribe on the boundedElastic() scheduler, for instance like so, you'd find it behaves as you expect:
Flux<Integer> f = Flux.just(1, 2, 3, 4, 5).subscribeOn(Schedulers.boundedElastic());
In your other example, when you use WebClient, it's publishing on a different thread to the one executing your method - so blocking this different thread in a subscriber doesn't hold up the overall execution of your doSomething() method.
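To make the threading point concrete without pulling in Reactor, here is a plain-JDK analogy (not Reactor code). The class and method names are made up for illustration. A "direct" executor plays the role of subscribing to Flux.just() as-is, and a separate pool plays the role of adding subscribeOn(...). The sketch only checks which thread ends up running the consumer:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Plain-JDK analogy, NOT Reactor: the executor decides which thread runs the consumer.
final class CallerThreadDemo {

    // Runs a task on the given executor and returns the thread that executed it.
    static Thread threadThatRan(Executor executor) {
        AtomicReference<Thread> ran = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            ran.set(Thread.currentThread());
            done.countDown();
        });
        try {
            done.await(); // wait until the task has actually run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ran.get();
    }

    // Analogous to subscribing to Flux.just(...) directly: the work runs on the caller.
    static boolean directRunsOnCaller() {
        Executor direct = Runnable::run;
        return threadThatRan(direct) == Thread.currentThread();
    }

    // Analogous to adding subscribeOn(...): the work is handed to another thread.
    static boolean pooledRunsOnCaller() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return threadThatRan(pool) == Thread.currentThread();
        } finally {
            pool.shutdown();
        }
    }
}
```

With the direct executor, any Thread.sleep() inside the task would stall the caller, which is what happens in the second version of the question. With the pool, the caller is free to carry on.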

Related

How can I turn blocking code into a reactive style?

I have always been used to blocking programming and working in the Spring MVC framework. Recently, I considered learning reactive programming. I was full of doubts about how to convert the previous logic into a new style.
See the following processing (pseudocode):
public Mono<List<String>> a() {
// 1...
List<String> strings = new ArrayList<>();
for (int i = 0; i < 100; i++) {
strings.add("hello " + i);
}
Mono<List<String>> mono = Mono.just(strings);
// 2...
mono.subscribe(e -> {
b();
});
// 3...
mono.subscribe(e -> {
c();
});
mono.subscribeOn(Schedulers.boundedElastic());
return mono;
}
// Simulate a time-consuming process.
public void b() {
try {
Thread.sleep(100);
} catch (InterruptedException err) {
throw new RuntimeException(err);
}
}
// Simulate the process of requesting an external HTTP interface once.
public int c() {
try {
Thread.sleep(300);
} catch (InterruptedException err) {
throw new RuntimeException(err);
}
return 1;
}
I tried to convert it into code that conforms to the reactive programming style, but found that the time-consuming logic blocks the current thread, which is not what I expected.
I tested WebFlux and Tomcat respectively, and the results show that the performance of the former is very poor. I suspect that the I/O thread is blocked, which can be seen from the thread sleep time.
Thread.sleep() will pause the JVM thread running the request. You shouldn't call this method in a Spring WebFlux application. By design, WebFlux uses very few threads to handle requests in order to avoid context switching, but if your code intentionally blocks them, you break the whole design.
In practice Spring WebFlux can be faster than a regular Spring MVC if the application workload is I/O bound e.g. a micro-service that calls multiple external APIs and doesn't perform significant calculations.
I'd suggest that you try simulating the I/O operations by making a network call to an actual server using a reactive library like Reactor Netty. Otherwise you would have to dig into the code and figure out how to create a meaningful mock for a network I/O operation, which might be tricky.
Reactive programming is good for juggling work across threads when "someone else other than this JVM" is busy.
So, hand over to the OS for a file write, to the DB for a record update, or to another remote server for an API call. Let them call you back when they have the result. You are still juggling, but the work goes on somewhere else and the results flow in. This is where reactivity shines.
So, it's difficult to simulate with Thread.sleep or a for loop running 10_000 times.
Also, it means the whole flow has to be reactive - your disk I/O library should be reactive, your DB library should be reactive, your REST client for calling other networked services should be reactive. Reactor should be helping with solutions for each of these.
Also, it's not all-or-nothing. Even if your disk I/O is blocking, you will still gain benefits if at least the REST client is non-blocking.

Async method followed by a parallelly executed method in Java 8

After spending the day learning about the Java concurrency API, I still don't quite get how I could create the following functionality with the help of the CompletableFuture and ExecutorService classes:
When I get a request on my REST endpoint I need to:
1. Start an asynchronous task (includes DB query, filtering, etc.), which will give me a list of String URLs at the end.
2. In the meantime, respond to the REST caller with HTTP OK, to say that the request was received and I'm working on it.
3. When the asynchronous task is finished, send HTTP requests (with the payload the REST caller gave me) to the URLs I got from the job. At most the number of URLs would be around 100, so I need these to happen in parallel.
4. Ideally, have some synchronized counter which counts how many of the HTTP requests succeeded or failed, so that I can send this information back to the REST caller (the URL I need to send it back to is provided inside the request payload).
I have the building blocks (methods like: getMatchingObjectsFromDB(callerPayload), getURLs(resultOfgetMachingObjects), sendHttpRequest(Url, methodType), etc...) written for these already, I just can't quite figure out how to tie step 1 and step 3 together. I would use CompletableFuture.supplyAsync() for step 1, then I would need the CompletableFuture.thenCompose method to start step 3, but it's not clear to me how parallelism can be done with this API. It is rather intuitive with ExecutorService executor = Executors.newWorkStealingPool(); though, which creates a thread pool based on how much processing power is available, and the tasks can be submitted via the invokeAll() method.
How can I use CompletableFuture and ExecutorService together? Or how can I guarantee parallel execution of a list of tasks with CompletableFuture? A demonstrating code snippet would be much appreciated. Thanks.
You should use join() to wait for all the threads to finish.
Create a Map<String, Boolean> result to store the result of each request.
In your controller:
public void yourControllerMethod() {
CompletableFuture.runAsync(() -> yourServiceMethod());
}
In your service:
// Execute your logic to get List<String> urls
List<CompletableFuture<Void>> futures = urls.stream().map(url ->
CompletableFuture.supplyAsync(() -> requestUrl(url))
.thenAcceptAsync(requestResult -> result.put(url, requestResult)) // assumes requestUrl returns true/false for success/failure
).collect(toList()); // You now have a list of CompletableFutures
Then use .join() to wait for all of them to finish (remember that your service is already executing in its own thread):
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
Then you can determine which requests succeeded or failed by reading the result map.
Edit
Please post the code you ended up with so that others can follow along too.
I've read your code and here are the needed modification:
When this for loop was not commented out, the receiver webserver got
the same request twice,
I don't understand the purpose of this for loop.
Sorry in my previous answer, I did not clean it up. That's just a temporary idea on my head that I forgot to remove at the end :D
Just remove it from your code
// allOf() only accepts arrays, so the List needed to be converted
/* The code never gets over this part (I know allOf() is a blocking call), even long after when the receiver got the HTTP request
with the correct payload. I'm not sure yet where exactly the code gets stuck */
Your map should be a ConcurrentHashMap because you're modifying it concurrently later.
Map<String, Boolean> result = new ConcurrentHashMap<>();
If your code still does not work as expected, I suggest removing the parallelStream() part.
CompletableFuture and parallelStream() both use the common ForkJoinPool by default. I think the pool is exhausted.
You should also create your own pool for your CompletableFutures:
Executor pool = Executors.newFixedThreadPool(10);
And execute your request using that pool:
CompletableFuture.supplyAsync(YOURTASK, pool).thenAcceptAsync(Yourtask, pool)
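Putting the pieces of this answer together, a minimal self-contained sketch might look like the following. The class name, the sendToAll method and the sender parameter are hypothetical stand-ins (sender represents whatever actually performs the HTTP call and reports success), and the pool size of 10 is simply the value used above:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

final class FanOutSketch {

    // Sends one request per URL in parallel on a dedicated pool and records
    // success or failure per URL. 'sender' is a hypothetical stand-in for the
    // real HTTP call (true = delivered).
    static Map<String, Boolean> sendToAll(List<String> urls, Predicate<String> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        Map<String, Boolean> result = new ConcurrentHashMap<>();
        try {
            CompletableFuture<?>[] futures = urls.stream()
                    .map(url -> CompletableFuture
                            .supplyAsync(() -> sender.test(url), pool)
                            .exceptionally(ex -> false) // a thrown exception counts as a failed delivery
                            .thenAccept(ok -> result.put(url, ok)))
                    .toArray(CompletableFuture[]::new);
            CompletableFuture.allOf(futures).join(); // wait until every request has finished
        } finally {
            pool.shutdown(); // let the worker threads exit once the work is done
        }
        return result;
    }
}
```

In a real service you would probably keep one long-lived pool instead of creating one per call; it is created inside the method here only to keep the sketch self-contained.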
For the sake of completion here is the relevant parts of the code, after clean-up and testing (thanks to Mạnh Quyết Nguyễn):
Rest controller class:
@POST
@Path("publish")
public Response publishEvent(PublishEvent eventPublished) {
/*
Payload verification, etc.
*/
//First send the event to the right subscribers, then send the resulting hashmap<String url, Boolean subscriberGotTheRequest> back to the publisher
CompletableFuture.supplyAsync(() -> EventHandlerService.propagateEvent(eventPublished)).thenAccept(map -> {
if (eventPublished.getDeliveryCompleteUri() != null) {
String callbackUrl = Utility
.getUri(eventPublished.getSource().getAddress(), eventPublished.getSource().getPort(), eventPublished.getDeliveryCompleteUri(), isSecure,
false);
try {
Utility.sendRequest(callbackUrl, "POST", map);
} catch (RuntimeException e) {
log.error("Callback after event publishing failed at: " + callbackUrl);
e.printStackTrace();
}
}
});
//return OK while the event publishing happens in async
return Response.status(Status.OK).build();
}
Service class:
private static List<EventFilter> getMatchingEventFilters(PublishEvent pe) {
//query the database, filter the results based on the method argument
}
private static boolean sendRequest(String url, Event event) {
//send the HTTP request to the given URL, with the given Event payload, return true if the response is positive (status code starts with 2), false otherwise
}
static Map<String, Boolean> propagateEvent(PublishEvent eventPublished) {
// Get the event relevant filters from the DB
List<EventFilter> filters = getMatchingEventFilters(eventPublished);
// Create the URLs from the filters
List<String> urls = new ArrayList<>();
for (EventFilter filter : filters) {
String url;
try {
boolean isSecure = filter.getConsumer().getAuthenticationInfo() != null;
url = Utility.getUri(filter.getConsumer().getAddress(), filter.getPort(), filter.getNotifyUri(), isSecure, false);
} catch (ArrowheadException | NullPointerException e) {
e.printStackTrace();
continue;
}
urls.add(url);
}
Map<String, Boolean> result = new ConcurrentHashMap<>();
Stream<CompletableFuture> stream = urls.stream().map(url -> CompletableFuture.supplyAsync(() -> sendRequest(url, eventPublished.getEvent()))
.thenAcceptAsync(published -> result.put(url, published)));
CompletableFuture.allOf(stream.toArray(CompletableFuture[]::new)).join();
log.info("Event published to " + urls.size() + " subscribers.");
return result;
}
Debugging this was a bit harder than usual; sometimes the code just magically stopped. To fix this, I only put the parts of the code that were absolutely necessary into the async task, and I made sure the code in the task used thread-safe constructs. Also, I was a dumb-dumb at first: my methods inside EventHandlerService used the synchronized keyword, which resulted in the CompletableFuture inside the service class method not executing, since it uses a thread pool by default.
A piece of logic marked with synchronized becomes a synchronized block, allowing only one thread to execute at any given time.

Async API giving worse performance

Interestingly, I would have thought that with 255 concurrent users, an async API would have better performance. Here are two of the endpoints in my Spring server:
@RequestMapping("/async")
public CompletableFuture<String> g(){
CompletableFuture<String> f = new CompletableFuture<>();
f.runAsync(() -> {
try {
Thread.sleep(500);
f.complete("Finished");
} catch (InterruptedException e) {
e.printStackTrace();
}
});
return f;
}
@RequestMapping("/sync")
public String h() throws InterruptedException {
Thread.sleep(500);
return "Finished";
}
The /async endpoint runs the work on a different thread. I am using Siege for load testing as follows:
siege http://localhost:8080/sync --concurrent=255 --time=10S > /dev/null
For the async endpoint, I got a transaction number of 27 hits
For the sync endpoint, I got a transaction number of 1531 hits
So why is this? Why isn't the async endpoint able to handle more transactions?
Because the async endpoint is using a shared (the small ForkJoinPool.commonPool()) threadpool to execute the sleeps, whereas the sync endpoint uses the larger threadpool of the application server. Since the common pool is so small, you're running maybe 4-8 operations (well, if you call sleeping an operation) at a time, while others are waiting for their turn to even get in the pool. You can use a bigger pool with CompletableFuture.runAsync(Runnable, Executor) (you're also calling the method wrong, it's a static method that returns a CompletableFuture).
Async isn't a magical "make things faster" technique. Your example is flawed as all the requests take 500ms and you're only adding overhead in the async one.
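As a rough sketch of the fix mentioned above (calling the static method and passing an explicit executor), something like the following could be used. The class and method names are made up for illustration, the 50 ms sleep is a shortened stand-in for the 500 ms in the question, and sizing the pool to the number of tasks is just an assumption to show that nothing has to queue:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class AsyncPoolSketch {

    // Size of the shared common pool that supplyAsync/runAsync use when no executor is given.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Static supplyAsync with an explicit executor: the blocking sleep runs on
    // 'executor' instead of the small common pool.
    static CompletableFuture<String> slowTask(Executor executor) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(50); // stand-in for the blocking work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "Finished";
        }, executor);
    }

    // Runs 'count' slow tasks on a pool with 'count' threads and reports how many finished.
    static long runAll(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            List<CompletableFuture<String>> futures = IntStream.range(0, count)
                    .mapToObj(i -> slowTask(pool))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .filter("Finished"::equals)
                    .count();
        } finally {
            pool.shutdown();
        }
    }
}
```

Even with a larger pool, each request still takes the full sleep time, so this only removes the queueing in front of the common pool; it does not make the async endpoint faster than the sync one.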

How to interrupt a function call in Java

I am trying to use a third-party internal library which processes a given request. Unfortunately it is synchronous in nature, and I have no control over its code. Basically it is a function call. This function seems to be a bit erratic in behavior. Sometimes it takes 10 ms to complete processing and sometimes it takes up to 300 secs to process the request.
Can you suggest a way to write a wrapper around this function so that it would throw an interrupted exception if the function does not complete processing within x ms/secs? I can live with not having the results and continue processing, but cannot tolerate a 3 min delay.
PS: Apart from some other calculations, this function internally sends an update to another system using JMS and waits for that system to respond.
Can you suggest a way to write a wrapper around this function so that it would throw an interrupted exception if the function does not complete processing within x ms/secs?
This is not possible. InterruptedException only gets thrown by specific methods. You can certainly call thread.stop() but this is deprecated and not recommended for a number of reasons.
A better alternative would be for your code to wait for the response for a certain amount of time and just abandon the call if it doesn't return in time. For example, you could submit a Callable to a thread pool that actually makes the call to the "Third Party Internal Library". Then your main code would do a future.get(...) with a specific timeout.
// allows 5 JMS calls concurrently, change as necessary or use newCachedThreadPool()
ExecutorService threadPool = Executors.newFixedThreadPool(5);
...
// submit the call to be made in the background by thread-pool
Future<Response> future = threadPool.submit(new Callable<Response>() {
public Response call() {
// this damn call can take 3 to 3000ms to complete dammit
return thirdPartyInternalLibrary.makeJmsRequest();
}
});
// wait for some max amount of time
Response response = null;
try {
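response = future.get(100, TimeUnit.MILLISECONDS);
} catch (TimeoutException te) {
// log that it timed out and continue or throw an exception
} catch (InterruptedException | ExecutionException e) {
// the wait was interrupted, or the call itself threw an exception
}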
The problem with this method is that you might spawn a whole bunch of threads waiting for the library to respond to the remote JMS query that you would not have a lot of control over.
No easy solution.
This will throw a TimeoutException if the lambda doesn't finish in the time allotted:
CompletableFuture.supplyAsync(() -> yourCall()).get(1, TimeUnit.SECONDS)
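The Future.get-with-timeout approach from the answers above can be folded into a small reusable wrapper. This is only a sketch: the TimeoutWrapper class and callWithTimeout method are hypothetical names, and returning an empty Optional on timeout is just one possible design. Note that cancel(true) merely interrupts the worker thread, so it only helps if the library call reacts to interruption:

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutWrapper {

    // Daemon threads, so an abandoned call cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timeout-wrapper");
        t.setDaemon(true);
        return t;
    });

    // Runs 'call' in the background and waits at most 'timeout' for its result.
    // Returns an empty Optional if the call did not finish in time.
    static <T> Optional<T> callWithTimeout(Callable<T> call, long timeout, TimeUnit unit) {
        Future<T> future = POOL.submit(call);
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException e) {
            future.cancel(true); // only effective if the call responds to interruption
            return Optional.empty();
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("wrapped call failed", e.getCause());
        }
    }
}
```

Usage would look something like callWithTimeout(() -> thirdPartyInternalLibrary.makeJmsRequest(), 100, TimeUnit.MILLISECONDS). As noted above, calls that time out may still be occupying a thread in the background until the library returns.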
Being that this is 3rd party you cannot modify the code. As such you will need to do two things
Launch the execution in a new thread.
Wait for execution in current thread, with timeout.
One possible way would be to use a Semaphore.
final Semaphore semaphore = new Semaphore(0);
Thread t = new Thread(new Runnable() {
@Override
public void run() {
// do work
semaphore.release();
}
});
t.start();
try {
semaphore.tryAcquire(1, TimeUnit.SECONDS); // Whatever your timeout is
} catch (InterruptedException e) {
// handle cleanup
}
The above method is gross; I would suggest instead updating your design to use a dedicated worker queue or RxJava with a timeout if possible.

Correct way to retry with delay on Couchbase getAndLock if lock was already held?

I am using the new Couchbase Java Client API 2.1.1 and therefore RxJava to access my Couchbase cluster.
When using asynchronous getAndLock on an already locked document, getAndLock fails with a TemporaryLockFailureException. In another SO question (rxjava: Can I use retry() but with delay?) I found out how to retry with delay.
Here is my adapted code:
CountDownLatchWithResultData<JsonDocument> resultCdl = new CountDownLatchWithResultData<>(1);
couchbaseBucket.async().getAndLock(key, LOCK_TIME).retryWhen((errorObserver) -> {
return errorObserver.flatMap((Throwable t) -> {
if (t instanceof TemporaryLockFailureException) {
return Observable.timer(RETRY_DELAY_MS, TimeUnit.MILLISECONDS);
}
return Observable.error(t);
});
}).subscribe(new Subscriber<JsonDocument>() {
@Override
public void onCompleted() {
resultCdl.countDown();
}
@Override
public void onError(Throwable e) {
resultCdl.countDown();
}
@Override
public void onNext(JsonDocument t) {
resultCdl.setInformation(t);
}
});
........
resultCdl.await();
if (resultCdl.getInformation() == null) {
//do stuff
} else ....
(CountDownLatchWithResultData simply extends a normal CountDownLatch and adds two methods to store some information before the count has reached 0 and retrieve it afterwards)
So basically I'd like this code to
retry getting the lock indefinitely, once every RETRY_DELAY_MS milliseconds, if a TemporaryLockFailureException occurred, and then call onNext
or to fail completely on other exceptions
or to directly call onNext if there is no exception at all
The problem now is that when retrying, it only retries once and the JsonDocument from resultCdl.getInformation() is always null in this case even though the document exists. It seems onNext is never called.
If there is no exception, the code works fine.
So apparently I am doing something wrong here but I have no clue as to where the problem might be. Does returning Observable.timer imply that with this new Observable the previously associated retryWhen is also executed again? Is it the CountDownLatch with count 1 getting in the way?
This one is subtle. Up to version 2.2.0, the Observables from the SDK are in the "hot" category. In effect that means that even if no subscription is made, they start emitting. They will also emit the same data to every newcoming Subscriber, so in effect they cache the data.
So what your retry does is resubscribe to an Observable that will always emit the same thing (in this case an error). I suspect it comes out of the retry loop just because the lock maximum duration is LOCK_TIME...
Try wrapping the call to asyncBucket.getAndLock inside an Observable.defer (or migrate to the 2.2.x SDK if that's something you could do, see release and migration notes starting from 2.2.0).
