Observable.just(1, 2, 3, 4, 5)
.flatMap(
a -> {
if (a < 3) {
return Observable.just(a).delay(3, TimeUnit.SECONDS);
} else {
return Observable.just(a);
}
})
.doOnNext(a -> System.out.println("Element: " + a))
.subscribe();
If 1 and 2 both wait 3 seconds, why does 2 sometimes come before 1? Shouldn't 1 always come first?
Sometimes I get:
Element: 3
Element: 4
Element: 5
Element: 2
Element: 1
and
Element: 3
Element: 4
Element: 5
Element: 1
Element: 2
Shouldn't it always come out like this (3, 4, 5, 1, 2)?
By default, the delay operator uses the computation scheduler:
@CheckReturnValue
@SchedulerSupport(SchedulerSupport.COMPUTATION)
public final Observable<T> delay(long delay, TimeUnit unit) {
return delay(delay, unit, Schedulers.computation(), false);
}
That means each delay is executed on a thread taken from the computation pool.
flatMap does not wait for the previous item to be completely processed, so 1 and 2 are processed in parallel on different threads (taken from the computation pool). Different threads mean no order is guaranteed.
All time-based operators use the computation scheduler by default. You can take a look at my other answer here
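To see the same effect without RxJava, here is a minimal plain-JDK sketch. It is only an analogy, not how RxJava is implemented: a two-thread ScheduledExecutorService stands in for the computation scheduler, and the names DelayOrderSketch and run are made up for illustration. The undelayed values 3, 4, 5 are recorded first; 1 and 2 become due at nearly the same instant on a multi-threaded pool, so either one may be recorded first.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Plain-JDK analogy (assumption): a two-thread scheduler plays the role of Schedulers.computation().
class DelayOrderSketch {

    static List<Integer> run() {
        List<Integer> seen = new CopyOnWriteArrayList<>();
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        for (int a = 1; a <= 5; a++) {
            final int value = a;
            if (value < 3) {
                // Both delayed values become due at almost the same time and may be
                // picked up by different worker threads, so their relative order can vary.
                pool.schedule(() -> { seen.add(value); }, 300, TimeUnit.MILLISECONDS);
            } else {
                // Undelayed values are recorded immediately on the calling thread.
                seen.add(value);
            }
        }
        pool.shutdown(); // by default, already-scheduled delayed tasks still run
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 3, 4, 5 first, then 1 and 2 in either order
    }
}
```

If the order matters in the RxJava version, concatMap is the usual alternative to flatMap: it subscribes to the inner Observables one at a time, at the cost of waiting for each one to finish before starting the next.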
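For the following program, I am trying to figure out why using two separate streams parallelizes the tasks, while using a single stream and calling join/get on the CompletableFuture makes them take as long as if they were processed sequentially.
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
public class HelloConcurrency {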
private static Integer sleepTask(int number) {
System.out.println(String.format("Task with sleep time %d", number));
try {
TimeUnit.SECONDS.sleep(number);
} catch (InterruptedException e) {
e.printStackTrace();
return -1;
}
return number;
}
public static void main(String[] args) {
List<Integer> sleepTimes = Arrays.asList(1,2,3,4,5,6);
System.out.println("WITH SEPARATE STREAMS FOR FUTURE AND JOIN");
ExecutorService executorService = Executors.newFixedThreadPool(6);
long start = System.currentTimeMillis();
List<CompletableFuture<Integer>> futures = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.collect(Collectors.toList());
executorService.shutdown();
List<Integer> result = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
long finish = System.currentTimeMillis();
long timeElapsed = (finish - start)/1000;
System.out.println(String.format("done in %d seconds.", timeElapsed));
System.out.println(result);
System.out.println("WITH SAME STREAM FOR FUTURE AND JOIN");
ExecutorService executorService2 = Executors.newFixedThreadPool(6);
start = System.currentTimeMillis();
List<Integer> results = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.map(CompletableFuture::join)
.collect(Collectors.toList());
executorService2.shutdown();
finish = System.currentTimeMillis();
timeElapsed = (finish - start)/1000;
System.out.println(String.format("done in %d seconds.", timeElapsed));
System.out.println(results);
}
}
Output
WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 6
Task with sleep time 5
Task with sleep time 1
Task with sleep time 3
Task with sleep time 2
Task with sleep time 4
done in 6 seconds.
[1, 2, 3, 4, 5, 6]
WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
Task with sleep time 2
Task with sleep time 3
Task with sleep time 4
Task with sleep time 5
Task with sleep time 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]
The two approaches are quite different; let me try to explain.
1st approach: you start the async requests for all 6 tasks and only then call join on each of them to get the results.
2nd approach: you call join immediately after starting the async request for each task. For example, after starting the async thread for task 1, calling join waits for that task to complete, and only then is the second task started.
Note: if you look at the output closely, in the 1st approach the output appears in random order because all six tasks were executed asynchronously, whereas in the 2nd approach the tasks were executed sequentially, one after another.
I believe you have an idea of how the stream map operation is performed; if not, you can get more information from here or here:
To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.
The stream framework does not define the order in which map operations are executed on stream elements, because it is not intended for use cases in which that might be a relevant issue. As a result, the particular way your second version is executing is equivalent, essentially, to
List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
results.add(CompletableFuture
.supplyAsync(() -> sleepTask(sleepTime), executorService2)
.exceptionally(ex -> { ex.printStackTrace(); return -1; })
.join());
}
...which is itself essentially equivalent to
List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
results.add(sleepTask(sleepTime));
}
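The loop equivalence above rests on a sequential stream handing each element to the whole chain of map stages before it touches the next element. A small sketch to observe this (the class name StreamOrderSketch and the "start"/"join" labels are just illustrative; as noted, this ordering is what the current JDK implementation does for a sequential stream, not something the specification promises):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamOrderSketch {

    // Records the order in which two chained map() stages see each element.
    static List<String> trace() {
        List<String> events = new ArrayList<>();
        Arrays.asList(1, 2, 3).stream()
                .map(n -> { events.add("start " + n); return n; }) // stands in for supplyAsync
                .map(n -> { events.add("join " + n); return n; })  // stands in for join
                .collect(Collectors.toList());
        return events;
    }

    public static void main(String[] args) {
        // Prints start 1, join 1, start 2, join 2, start 3, join 3 (one per line).
        trace().forEach(System.out::println);
    }
}
```

Because "join 1" is reached before "start 2", a blocking join in the second stage holds back the creation of the next future.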
@Deadpool answered it pretty well; I'm just adding my answer in case it helps someone understand it better.
I was able to get an answer by adding more printing to both methods.
TLDR
2 stream approach: we start all 6 tasks asynchronously and then call join on each of them in a separate stream to get the results.
1 stream approach: we call join immediately after starting each task. For example, after starting a thread for task 1, calling join waits for task 1 to complete, and only then is the second task started.
Note: if we look at the output closely, in the 1 stream approach the output appears in sequential order because all six tasks were executed in order. In the 2 stream approach all tasks were executed in parallel, hence the random order.
Note 2: if we replace stream() with parallelStream() in the 1 stream approach, it can behave much like the 2 stream approach, although how many joins run at once then depends on the parallelism of the common fork-join pool.
More proof
I added more printing to the streams, which gave the following outputs and confirmed the note above:
1 stream:
List<Integer> results = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.map(f -> {
int num = f.join();
System.out.println(String.format("doing join on task %d", num));
return num;
})
.collect(Collectors.toList());
WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
doing join on task 1
Task with sleep time 2
doing join on task 2
Task with sleep time 3
doing join on task 3
Task with sleep time 4
doing join on task 4
Task with sleep time 5
doing join on task 5
Task with sleep time 6
doing join on task 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]
2 streams:
List<CompletableFuture<Integer>> futures = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.collect(Collectors.toList());
List<Integer> result = futures.stream()
.map(f -> {
int num = f.join();
System.out.println(String.format("doing join on task %d", num));
return num;
})
.collect(Collectors.toList());
WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 2
Task with sleep time 5
Task with sleep time 3
Task with sleep time 1
Task with sleep time 4
Task with sleep time 6
doing join on task 1
doing join on task 2
doing join on task 3
doing join on task 4
doing join on task 5
doing join on task 6
done in 6 seconds.
[1, 2, 3, 4, 5, 6]
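Timing is one way to see the difference; another is to count how many tasks are running at the same moment. The sketch below does that with shorter sleeps (200 ms, four tasks). The class name JoinPlacementSketch and its helper methods are made up for this illustration, and the numbers are just what I would expect on an otherwise idle machine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class JoinPlacementSketch {
    private static final List<Integer> IDS = Arrays.asList(1, 2, 3, 4);

    // Sleeps briefly while tracking how many tasks are running at once.
    private static int task(int id, AtomicInteger running, AtomicInteger peak) {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        running.decrementAndGet();
        return id;
    }

    // Returns the highest number of tasks that were running simultaneously.
    static int peakConcurrency(boolean joinInSameStream) {
        ExecutorService pool = Executors.newFixedThreadPool(IDS.size());
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            if (joinInSameStream) {
                // Each future is joined before the next one is created.
                IDS.stream()
                        .map(id -> CompletableFuture.supplyAsync(() -> task(id, running, peak), pool))
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList());
            } else {
                // All futures are created first, then joined.
                List<CompletableFuture<Integer>> futures = IDS.stream()
                        .map(id -> CompletableFuture.supplyAsync(() -> task(id, running, peak), pool))
                        .collect(Collectors.toList());
                futures.forEach(CompletableFuture::join);
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("same stream peak: " + peakConcurrency(true));
        System.out.println("separate streams peak: " + peakConcurrency(false));
    }
}
```

With join in the same stream the peak is always 1, because a task has finished before the next one is even submitted. With separate streams I would expect the peak to reach the pool size.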
I have code where I poll on an interval until a condition is met and then send back the result in the subscribe callback.
But since it is an interval, the subscription continues.
I was wondering if there is any way to unsubscribe from an interval Observable once it emits something.
Here is the code:
Subscription subscriber = Observable.interval(0, 5, TimeUnit.MILLISECONDS)
.map(i -> eventHandler.getProcessedEvents())
.filter(eventsProcessed -> eventsProcessed >= 10)
.doOnNext(eventsProcessed -> eventHandler.initProcessedEvents())
.doOnNext(eventsProcessed -> logger.info(null, "Total number of events processed:" + eventsProcessed))
.subscribe(t -> resumeRequest(asyncResponse));
new TestSubscriber((Observer) subscriber).awaitTerminalEvent(10, TimeUnit.SECONDS);
subscriber.unsubscribe();
For now, as a hack, I use a timer and then unsubscribe, but that's bad!
Regards
You can use the first operator
Subscription subscriber = Observable.interval(0, 5, TimeUnit.MILLISECONDS)
.map(i -> eventHandler.getProcessedEvents())
.first(eventsProcessed -> eventsProcessed >= 10)
.doOnNext(eventsProcessed -> eventHandler.initProcessedEvents())
.doOnNext(eventsProcessed -> logger.info(null, "Total number of events processed:" + eventsProcessed))
.subscribe(t -> resumeRequest(asyncResponse));
instead of filter. This ensures that you get only a single emission once your condition is met. Note that you will get an exception if your interval Observable terminates without your condition being met.
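For comparison, the same "poll until the first match, then stop polling" idea can be sketched with the plain JDK. This is not RxJava and not a replacement for first; it only illustrates the behaviour. The class FirstMatchSketch, the method awaitFirstAtLeast and the AtomicInteger standing in for eventHandler.getProcessedEvents() are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FirstMatchSketch {

    // Polls the counter every 5 ms and stops polling at the first value >= threshold.
    static int awaitFirstAtLeast(AtomicInteger processedEvents, int threshold) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<Integer> firstMatch = new CompletableFuture<>();
        scheduler.scheduleAtFixedRate(() -> {
            int current = processedEvents.get();
            if (current >= threshold) {
                firstMatch.complete(current); // later calls are ignored once completed
            }
        }, 0, 5, TimeUnit.MILLISECONDS);
        try {
            return firstMatch.get(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("condition not met in time", e);
        } finally {
            scheduler.shutdownNow(); // plays the role of unsubscribe()
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        // Hypothetical event source: bumps the counter every 2 ms.
        ScheduledExecutorService producer = Executors.newSingleThreadScheduledExecutor();
        producer.scheduleAtFixedRate(counter::incrementAndGet, 0, 2, TimeUnit.MILLISECONDS);
        try {
            System.out.println("first value >= 10: " + awaitFirstAtLeast(counter, 10));
        } finally {
            producer.shutdownNow();
        }
    }
}
```

The finally block is the important part: polling is cancelled as soon as the first matching value is delivered, so no separate timer is needed to clean up.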
Does an Observable cache emitted items? I have two tests that lead me to different conclusions:
From test #1 I conclude that it does:
Test #1:
Observable<Long> clock = Observable
.interval(1000, TimeUnit.MILLISECONDS)
.take(10)
.map(i -> i++);
// subscribe for the first time
clock.subscribe(i -> System.out.println("a: " + i));
//subscribe with 2.5 seconds delay
Executors.newScheduledThreadPool(1).schedule(
() -> clock.subscribe(i -> System.out.println(" b: " + i)),
2500,
TimeUnit.MILLISECONDS
);
Output #1:
a: 0
a: 1
a: 2
b: 0
a: 3
b: 1
But the second test shows that we get different values for two observers:
Test #2:
Observable<Integer> observable = Observable
.range(1, 1000000)
.sample(7, TimeUnit.MILLISECONDS);
observable.subscribe(i -> System.out.println("Subscriber #1:" + i));
observable.subscribe(i -> System.out.println("Subscriber #2:" + i));
Output #2:
Subscriber #1:72745
Subscriber #1:196390
Subscriber #1:678171
Subscriber #2:336533
Subscriber #2:735521
There are two kinds of Observables: hot and cold. Cold observables tend to generate the same sequence for each of their Observers unless external effects, such as a timer-based action, are involved.
In the first example, you get the same sequence twice because there are no external effects other than the timer ticks, which you receive one by one. In the second example, you sample a fast source, and sampling by time is non-deterministic: every nanosecond counts, so even the slightest timing imprecision leads to a different value being reported.
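One way to picture a cold source is as a recipe that is re-run for every subscriber, rather than a cache of values. The sketch below models that with the plain JDK; it is a deliberately simplified model, not RxJava's implementation, and ColdSourceSketch, subscribe and RUNS are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class ColdSourceSketch {
    // Counts how often the producing code runs.
    static final AtomicInteger RUNS = new AtomicInteger();

    // A "cold" source modeled as a recipe: every subscribe call re-runs the producer.
    static void subscribe(Consumer<Integer> observer) {
        RUNS.incrementAndGet();
        for (int i = 0; i < 3; i++) {
            observer.accept(i);
        }
    }

    // Subscribes once and returns everything that subscriber received.
    static List<Integer> collect() {
        List<Integer> received = new ArrayList<>();
        subscribe(received::add);
        return received;
    }

    public static void main(String[] args) {
        System.out.println("a: " + collect()); // a: [0, 1, 2]
        System.out.println("b: " + collect()); // b: [0, 1, 2]
        System.out.println("producer runs: " + RUNS.get()); // producer runs: 2
    }
}
```

Both subscribers see the same values, yet nothing was cached: the producer ran twice. Once the producer depends on wall-clock time, as with sample, two runs no longer have to agree.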
I'm trying to get behaviour like this:
Thread 1: |--- 3 mins of doing task A ---|--- 3 mins of doing task B ---|--- 3 mins of task C ---|
Thread 2: |--- 3 mins of doing task A ---|--- 3 mins of doing task B ---|--- 3 mins of task C ---|
...
Thread k: |--- 3 mins of doing task A ---|--- 3 mins of doing task B ---|--- 3 mins of task C ---|
and accumulate the results of the task B segment of each thread.
I don't need the threads to do the tasks in a synchronized way (i.e. have the start times of the task B phase aligned), but do need negligible gaps between task execution for each individual thread.
I don't have much in terms of code since I'm not sure how to structure this. So far I think I have to create Callable classes since I want to return a value at the end of task B, but I'm not sure if I should be creating 3 different callables for each task or if it's possible to have all of my code for the 3 tasks in 1.
The only way I can think of is to do the following in each thread (in a Callable), but there seems to be some extra time taken by each while-loop header check and by the start and end time computations, and I really need precision both in the time period dedicated to each task and, to a reasonable extent, in their adjacency.
long start = System.currentTimeMillis();
long end = start + 180*1000;
while (System.currentTimeMillis() < end)
{
// task A
}
start = System.currentTimeMillis();
end = start + 180*1000;
while (System.currentTimeMillis() < end)
{
// task B
}
start = System.currentTimeMillis();
end = start + 180*1000;
while (System.currentTimeMillis() < end)
{
// task C
}
return task_b_results;
Is there a better way to do this? (new to Java multithreading)
This does not seem to have anything to do with multithreading. Everything you describe is happening serially.
Something like this perhaps:
List<Task> tasks = Arrays.asList(
new TaskA(),
new TaskB(),
new TaskC()
);
List<TaskResult> results = new ArrayList<>();
for(Task task: tasks) {
long until = System.currentTimeMillis() + 180000L;
TaskResult result = task.createResult();
while(System.currentTimeMillis() < until) {
result.add(task.call());
}
results.add(result);
}
return results.get(1);
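If you do want k threads each running the three phases, a rough sketch could look like the following. Everything here is an assumption for illustration: the class PhasedWorkerSketch, the methods worker and runWorkers, the loop counter standing in for real task work, and the short phase length used in main. It uses System.nanoTime() for the deadlines, since that clock is intended for measuring elapsed time, and submits one Callable per thread that runs all three phases back to back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class PhasedWorkerSketch {

    // One worker: runs phases A, B, C back to back and returns how many task-B iterations ran.
    static Callable<Long> worker(long phaseMillis) {
        return () -> {
            long phaseB = 0;
            for (int phase = 0; phase < 3; phase++) {
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(phaseMillis);
                long iterations = 0;
                // Subtract before comparing so the check stays correct if nanoTime wraps around.
                while (System.nanoTime() - deadline < 0) {
                    iterations++; // placeholder for one unit of task A, B or C
                }
                if (phase == 1) {
                    phaseB = iterations; // keep only the task-B result
                }
            }
            return phaseB;
        };
    }

    // Runs k workers in parallel and gathers each worker's task-B result.
    static List<Long> runWorkers(int k, long phaseMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(k);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < k; i++) {
                jobs.add(worker(phaseMillis));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : pool.invokeAll(jobs)) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 100 ms phases for a quick run; the question would use 3 minutes.
        System.out.println(runWorkers(3, 100));
    }
}
```

The gap between phases is just one loop iteration plus one nanoTime call, which should be negligible next to a 3 minute phase as long as a single unit of task work is short.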