Let's say that I have to run some (mostly independent) expensive tasks in parallel. Normally this can easily be done using the fork/join framework.
My problem is that some of those tasks may also spawn subtasks using a different ForkJoinPool (in some method deeper down in the call hierarchy). I know that this will spawn many threads, which may slow down my application, and I would like to avoid that. One solution is to use the global pool and add tasks there, but this is not an option in my case.
The reason this is useful for me is that some of the original tasks are dependent and may wait for each other. For example, say A1 and A2 are two tasks which need the results of B (which is parallelizable) in order to proceed to C1 and C2 respectively. In that case, the threads running A1 and A2 can focus on B to improve CPU utilization. A simple example is shown below.
ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();

public int expensiveComputation(int x) {
    int result = x;
    // do stuff using different ForkJoinPool!
    return result;
}

public abstract class A {
    protected final int x;

    protected A(int x) {
        this.x = x;
    }

    public abstract void run();
}

public class A1 extends A {
    public A1(int x) {
        super(x);
    }

    @Override
    public void run() {
        // do stuff
        // Only 1 thread will run this for a given value of x
        map.putIfAbsent(x, expensiveComputation(x));
        // do stuff
    }
}

public class A2 extends A {
    public A2(int x) {
        super(x);
    }

    @Override
    public void run() {
        // do stuff
        // Only 1 thread will run this for a given value of x
        map.putIfAbsent(x, expensiveComputation(x));
        // do stuff
    }
}

public static void main(String[] args) {
    LinkedList<A> tasks = new LinkedList<>();
    tasks.add(new A1(0));
    tasks.add(new A2(0));
    // More tasks
    ForkJoinPool pool = new ForkJoinPool(parallelism);
    pool.submit(() -> tasks.parallelStream().forEach(x -> {
        x.run();
    }));
}
Is it possible to utilize the "parent" pool from within those tasks? In the example above, the parent pool is the one in the main method.
Naturally, I would like to do this without passing it as an argument through a long chain of method calls or using a global variable. Ideally I would like to restrict my program to the number of threads used by the parent pool, without doing any such tricks.
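A minimal sketch of one way this might be approached, assuming the subtasks can be written as ForkJoinTasks. ForkJoinTask.getPool() returns the pool hosting the current worker thread (or null outside any pool), and fork()/invoke() called from a worker thread schedule work in that same pool, so nothing has to be passed down the call chain. SumTask, its threshold, and this version of expensiveComputation are hypothetical stand-ins for the real work.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

public class ParentPoolSketch {
    // Hypothetical subtask: sums the range [from, to) by splitting it in half.
    static class SumTask extends RecursiveTask<Long> {
        final int from, to;

        SumTask(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= 1_000) {
                long sum = 0;
                for (int i = from; i < to; i++) sum += i;
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(from, mid);
            left.fork(); // queued in the pool that runs the current worker thread
            return new SumTask(mid, to).compute() + left.join();
        }
    }

    // Stand-in for expensiveComputation: reuses the calling pool when there is one.
    static long expensiveComputation(int n) {
        ForkJoinPool current = ForkJoinTask.getPool(); // null outside any ForkJoinPool
        SumTask task = new SumTask(0, n);
        return current != null ? task.invoke() : ForkJoinPool.commonPool().invoke(task);
    }

    public static void main(String[] args) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(4);
        long result = pool.submit(() -> expensiveComputation(100_000)).get();
        System.out.println(result);
        pool.shutdown();
    }
}
```

With this shape, the subtasks compete for the parent pool's worker threads instead of creating new ones, and a worker that joins a forked subtask can help run other queued tasks while it waits.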
I'm reading a Java multithreading tutorial which says that a thread gives up the lock only when it completes a synchronized method. However, when I run the following code (about 20 times):
public class SyncDemo implements Runnable{
@Override
public void run() {
for (int i = 0; i < 10; i++) {
sync();
}
}
private synchronized void sync() {
System.out.println(Thread.currentThread().getName());
}
public static void main(String[] args) {
SyncDemo s = new SyncDemo();
Thread a = new Thread(s, "a");
Thread b = new Thread(s, "b");
a.start();
b.start();
}
}
it only prints a and then b. I expected a mixed sequence of them, because the current thread releases the lock every time sync() returns inside the loop. Shouldn't that give the other thread a chance to print its name?
There is nothing in your program that would demand a certain execution order, so the runtime schedules the threads in whatever way makes the most sense in the current situation. Factors that may influence the order include the number of processors, the current load, ...
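If a particular order is actually required, the program has to impose it itself. A minimal sketch, assuming strict alternation between the two threads is what is wanted: a shared "turn" field plus wait()/notifyAll() on the same monitor. The class name AlternatingDemo and the turn field are hypothetical and not part of the original code.

```java
public class AlternatingDemo {
    private final StringBuilder log = new StringBuilder();
    private String turn = "a"; // name of the thread allowed to print next

    // Appends the caller's name only when it is that thread's turn.
    private synchronized void sync(String me, String other) throws InterruptedException {
        while (!turn.equals(me)) {
            wait(); // releases the monitor until the other thread hands over
        }
        log.append(me);
        turn = other;
        notifyAll();
    }

    private void loop(String me, String other, int rounds) {
        try {
            for (int i = 0; i < rounds; i++) {
                sync(me, other);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs both threads to completion and returns the recorded sequence.
    static String run(int rounds) throws InterruptedException {
        AlternatingDemo d = new AlternatingDemo();
        Thread a = new Thread(() -> d.loop("a", "b", rounds), "a");
        Thread b = new Thread(() -> d.loop("b", "a", rounds), "b");
        a.start();
        b.start();
        a.join();
        b.join();
        return d.log.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10));
    }
}
```

Without such explicit coordination, a first and then b, b first and then a, and any interleaving of the two are all legal outcomes.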
I have some class which is not thread safe:
class ThreadUnsafeClass {
long i;
long incrementAndGet() { return ++i; }
}
(I've used a long as the field here, but we should think of its field as being some thread-unsafe type).
I now have a class which looks like this
class Foo {
final ThreadUnsafeClass c;
Foo(ThreadUnsafeClass c) {
this.c = c;
}
}
That is, the thread unsafe class is a final field of it. Now I'm going to do this:
public class JavaMM {
    public static void main(String[] args) {
        // The task returns a Foo, so the task type is ForkJoinTask<Foo>.
        final ForkJoinTask<Foo> work = ForkJoinTask.adapt(() -> {
            ThreadUnsafeClass t = new ThreadUnsafeClass();
            t.incrementAndGet();
            return new Foo(t);
        });
        assert (work.fork().join().c.i == 1);
    }
}
That is, from thread T (main), I invoke some work on T' (the fork-join-pool) which creates and mutates an instance of my unsafe class and then returns the result wrapped in a Foo. Please note that all mutation of my thread unsafe class happens on a single thread, T'.
Question 1: Am I guaranteed that the end-state of the instance of the thread-unsafe-class is safely ported across the T' ~> T thread boundary at the join?
Question 2: What if I had done this using parallel streams? For example:
Map<Long, Foo> results =
    Stream
        .of(new ThreadUnsafeClass())
        .parallel()
        .map(tuc -> {
            tuc.incrementAndGet();
            return new Foo(tuc);
        })
        .collect(
            Collectors.toConcurrentMap(
                foo -> foo.c.i,
                Function.identity()
            )
        );
// The keys are Long, so the lookup must use 1L rather than the int 1.
assert (results.get(1L) != null);
I think ForkJoinTask.join() has the same memory effects as Future.get() (because the join() Javadoc says that it is basically get() with differences in interruption and exception handling). And Future.get() is specified as:
Actions taken by the asynchronous computation represented by a Future happen-before actions subsequent to the retrieval of the result via Future.get() in another thread.
In other words, this is basically a "safe publication" via Future/FJT. Which means, anything that the executor thread did and published via FJT result is visible to FJT.join() users. Since the example allocates the object and populates its field only within the executor thread, and nothing happens with the object after it gets returned from the executor, it stands to reason that we are only allowed to see the values the executor thread produced.
Note that making the field final does not bring any additional benefit here. Even with plain field stores, you would still be guaranteed this:
public static void main(String... args) throws Exception {
ExecutorService s = Executors.newCachedThreadPool();
Future<MyObject> f = s.submit(() -> new MyObject(42));
assert (f.get().x == 42); // guaranteed!
s.shutdown();
}
public class MyObject {
int x;
public MyObject(int x) { this.x = x; }
}
But notice that in the Stream example (if we assume the symmetry between Stream.of.parallel and Executor.submit, and between Stream.collect and FJT.join/Future.get), you have created the object in the caller thread and then passed it to the executor to do something. This is a subtle difference, but it still does not matter much, because we also have HB on submit, which precludes seeing the old state of the object:
public static void main(String... args) throws Exception {
ExecutorService s = Executors.newCachedThreadPool();
MyObject o = new MyObject(42);
Future<?> f = s.submit(() -> o.x++); // new --hb--> submit
f.get(); // get -->hb--> read o.x
assert (o.x == 43); // guaranteed
s.shutdown();
}
public static class MyObject {
int x;
public MyObject(int x) { this.x = x; }
}
(In formal speak, that is because all the HB paths from read(o.x) go via the action of the executor thread that does store(o.x, 43))
I have a class as below with three methods
public class MyRunnable implements Runnable {
@Override
public void run() {
// what code need to write here
//to call the specific methods based on request types
}
public int add(int a, int b){
return a+b;
}
public int multiply(int a , int b){
return a*b;
}
public int division(int a , int b){
return a/b;
}
}
and my main class is as below.
Here r.multiply(), add() and division() are executed sequentially, but I want to execute them in a multi-threaded way so that I get the result faster. How do I call a method of a class dynamically based on inputs? How do I pass it to a thread, and how do I return the result from the thread to the calling thread?
public class ThreadDemo {
public static void main(String[] args) {
MyRunnable r = new MyRunnable();
// how to do the calculation below using multithreading
// how to call a method and get its result from a thread of the same class
int result = r.multiply(1, 2) + r.add(4, 5) + r.division(10, 5);
System.out.println(result);
int newResult = r.add(20, 50);
System.out.println(newResult);
}
}
Multi-threading would slow down this application, because the amount of processing per step is far too small to justify the overhead of distributing the work across threads. The application probably finishes well before you would perceive a difference anyway.
Assuming it's a simplified example, you can write
MyRunnable r = new MyRunnable();
Executor exec = Executors.newFixedThreadPool(3);
// supplyAsync (not runAsync) is needed because each task returns a value
CompletableFuture<Integer> mult = CompletableFuture.supplyAsync(() -> r.multiply(1, 2), exec);
CompletableFuture<Integer> add = CompletableFuture.supplyAsync(() -> r.add(4, 5), exec);
CompletableFuture<Integer> div = CompletableFuture.supplyAsync(() -> r.division(10, 5), exec);
CompletableFuture<Integer> result = mult.thenCombine(add, (multRes, addRes) -> multRes + addRes)
        .thenCombine(div, (total, divRes) -> total + divRes);
int answer = result.join();
UPDATE Why use an explicitly defined Executor?
It shows readers how to explicitly define an executor (the alternative is straightforward)
By defining the Executor as a variable, you can switch between the Common ForkJoinPool (or any other executor type) by changing just that variable assignment (you don't have to refactor all of the methods). E.g.
//ForkJoinPool common
Executor exec = ForkJoinPool.commonPool();
//Expanding thread pool
Executor exec = Executors.newCachedThreadPool();
//Just execute on the current thread
Executor exec = (Runnable r) -> r.run();
By default CompletableFuture.*Async methods share the Common ForkJoinPool and so do Parallel Streams, along with ForkJoinTasks without a specific executor. Unless all members of the team think carefully about when to / not to use the Common ForkJoinPool you could end up mixing async I/O operations with CPU bound processing in the same pool accidentally.
Also, by default the common pool's parallelism is set to Runtime.getRuntime().availableProcessors() - 1, which again may or may not suit the use case at hand (for some users it might mean this example ran single-threaded). It is configurable via the system property "java.util.concurrent.ForkJoinPool.common.parallelism" if you need to change the default.
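To check what the defaults amount to on a given machine, a small sketch like the following could be used. The class name CommonPoolInfo and the helper commonParallelism() are hypothetical. The calls themselves, Runtime.availableProcessors() and ForkJoinPool.getCommonPoolParallelism(), are standard JDK API.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {
    // Target parallelism of the common pool, as reported by the JDK.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("available processors:    " + cores);
        System.out.println("common pool parallelism: " + commonParallelism());
    }
}
```

The system property has to be set before the common pool is first used, for example on the command line with -Djava.util.concurrent.ForkJoinPool.common.parallelism=4.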
I am trying to run two methods in parallel. To do that I have written code like the following:
After getting critiques, I thought I should be clearer. method1() is a local method which runs on my local computer. method2() is a web method which sends some data to a remote computer to process and returns the result. Since the work takes too long, I divide the data into 2 pieces: one part is processed locally and the other remotely.
After the jobs finish, I combine the results.
//str1 and str2 are defined outside the main method.
Thread[] threads = new Thread[2];
threads[0] = new Thread() {
@Override
public void run() { str1 = method1(); }
};
threads[1] = new Thread() {
@Override
public void run() { str2 = method2(); }
};
threads[0].start();
threads[1].start();
I get null from method2() when I try it this way. But if I run method2() outside the thread definition, str2 is not null.
Thread thread = new Thread() {
@Override
public void run() { str1 = method1(); }
};
thread.start();
str2 = method2();
method2() returns what it should return. The explanation could be that threads[1].start() does not start the thread.
You should ask yourself the question: why would you like to run two methods in parallel?
When working with threads, always think in terms of the task (the work to be done) and the executor of the task (which is a thread in this case).
So if we say that the code inside a method is a task that we need to perform, then the next question is who will carry out that task.
In your case you have two methods to run, which means two tasks at hand.
The next question is: are these tasks independent of each other? If yes, then the approach you adopted above is right.
If the tasks are not independent and depend on common state, as below:
// x and y are state variables shared by the two tasks (method1 and method2)
int x; int y;
void method1() { x++; y--; }
void method2() { x--; y++; }
Thread[] threads = new Thread[2];
threads[0] = new Thread() {
    @Override
    public void run() { method1(); }
};
threads[1] = new Thread() {
    @Override
    public void run() { method2(); }
};
threads[0].start();
threads[1].start();
Here we have to make sure that the two threads access the state variables x and y one at a time. We do that by synchronizing access to these variables using thread synchronization techniques.
There are various approaches; the simplest is to add the synchronized keyword to method1 and method2 as follows:
synchronized void method1() { x++; y--; }
synchronized void method2() { x--; y++; }
The rest remains the same. For reference, you can look at the various thread synchronization approaches for a more efficient one.
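Put together as a runnable sketch, under the assumption that both methods live on the same object (so that synchronized locks the same monitor for both). The class name SharedState, the accessor methods and runBoth() are hypothetical additions for illustration.

```java
public class SharedState {
    // State shared by the two tasks.
    private int x;
    private int y;

    // Both methods lock the same SharedState instance, so their updates never interleave.
    synchronized void method1() { x++; y--; }
    synchronized void method2() { x--; y++; }

    synchronized int getX() { return x; }
    synchronized int sum() { return x + y; }

    // Runs method1 and method2 the same number of times on two threads.
    static SharedState runBoth(int iterations) throws InterruptedException {
        SharedState s = new SharedState();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.method1();
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.method2();
        });
        t1.start();
        t2.start();
        t1.join(); // wait for both threads before reading the state
        t2.join();
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedState s = runBoth(100_000);
        System.out.println("x=" + s.getX() + ", x+y=" + s.sum());
    }
}
```

Because each increment is matched by a decrement, x ends at 0 when both loops have finished. Without synchronized, lost updates could leave it at some other value.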
Your code should work for what you described. (Before your edit ...)
Nevertheless, let me suggest using one of Java's Executors. They may be of great use to you when it comes to extending the functionality or adding more tasks ...
After your edit:
I suspect a race condition. That is: you check for the result before the thread has finished and set the variable.
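A minimal sketch of how an executor could avoid that race, assuming method1() and method2() each return a String. The bodies shown here are hypothetical stand-ins for the local and the remote call. Future.get() blocks until the task has finished, so neither result can be read too early.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TwoMethodsInParallel {
    // Hypothetical stand-ins for the local and the remote call.
    static String method1() { return "local"; }
    static String method2() { return "remote"; }

    static String runBoth() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> f1 = pool.submit(() -> method1());
            Future<String> f2 = pool.submit(() -> method2());
            // get() waits for completion and makes the task's result visible here.
            String str1 = f1.get();
            String str2 = f2.get();
            return str1 + "+" + str2;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBoth());
    }
}
```

With plain Thread objects the equivalent fix would be to call join() on both threads before reading str1 and str2.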
I have a method which takes a list, does some processing on it, and updates another global list. I need to run multiple instances of this method in parallel, each with a different input list.
Does multi-threading support this? If yes, how can I use it, i.e. what should I put in the thread? Examples are highly appreciated.
I am thinking of having a static list in the thread class which gets updated by the different thread instances while they run (the list contains strings and counters, so an update either adds a new string or increases the counter of an existing one). I need to read whatever gets added to this global list every 10 seconds and print it. Is a static list suitable for this, and how can I make it thread safe?
Yes, that's a very common usage of multithreaded programming.
class ListProcessor implements Runnable {
/* field/s representing param/s */
public ListProcessor(/* param/s */) {
/* ... */
}
@Override
public void run() {
/* process list */
}
}
Then, when you want to actually process some lists.
class SomeClass {
ExecutorService listProcessor;
public SomeClass(/* ... */) {
listProcessor = Executors.newFixedThreadPool(numThreads);
/* for each thread, however you want to do it */
listProcessor.execute(new ListProcessor(/* param/s */));
/* when finished adding threads */
listProcessor.shutdown();
/* note that the above two lines of code (execute/shutdown) can be
* placed anywhere in the code. I just put them in the constructor to
* facilitate this example.
*/
}
}
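For the shared structure of strings and counters described in the question, one option might be a ConcurrentHashMap updated with merge(), which performs the add-or-increment step atomically. This is a sketch under that assumption. The class name SharedCounts, the method countAll() and the pool size of 4 are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedCounts {
    // Processes every input list on a pool thread and counts each string in a shared map.
    static Map<String, Integer> countAll(List<List<String>> inputs) throws InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (List<String> input : inputs) {
            pool.execute(() -> {
                for (String s : input) {
                    // Adds the string with count 1, or increments the existing count, atomically.
                    counts.merge(s, 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for all submitted lists
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<String>> inputs = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c"));
        System.out.println(countAll(inputs));
    }
}
```

For the periodic printout, a ScheduledExecutorService with scheduleAtFixedRate could print the map every 10 seconds while the workers are still running.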
@purtip31 has a start for the parallel processing stuff.
I'm concerned about the results - you mention that you update a "global list". If multiple threads try to update that list at the same time, there could be problems. A couple of options:
Make sure that list is properly thread safe. This may or may not be easy - depends on exactly what is getting changed.
Use ExecutorService, but with the invokeAll() method, which runs a bunch of Callables in parallel and waits till they are all done. Then you can go through all of the results and update them one at a time. No threading issues with the results. This means that your code will have to implement Callable instead of Runnable (not a big deal). I have a blog with an example here
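The invokeAll() option could look roughly like this. It is a sketch only: the per-list processing (upper-casing each string) is a hypothetical placeholder, as are the class name InvokeAllSketch and the pool size. Each Callable works on its own local list, and only the calling thread touches the combined result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeAllSketch {
    static List<String> processAll(List<List<String>> inputs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (List<String> input : inputs) {
                tasks.add(() -> {
                    // Placeholder processing: each task builds its own private result list.
                    List<String> out = new ArrayList<>();
                    for (String s : input) out.add(s.toUpperCase());
                    return out;
                });
            }
            List<String> global = new ArrayList<>();
            // invokeAll blocks until every task is done; futures come back in task order.
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                global.addAll(f.get()); // merged by the calling thread only
            }
            return global;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"))));
    }
}
```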
Well Sam, your question isn't entirely clear to me, but try this out.
The following code should help you run multiple instances.
Main thread
import java.util.LinkedList;

public class mainprocess
{
    public static LinkedList<Object> globallist;
    public static String str;
    public int num;

    public static void main(String Data[]) throws InterruptedException
    {
        globallist = new LinkedList<>();
        // The same LinkedList reference is shared with the child threads.
        // globallist is static and assigned like this for global use.
        childprocess.assignlist(globallist);
        childprocess p1 = new childprocess("string input"); // a string input
        childprocess p2 = new childprocess(42); // a number input (example value)
        // Wait for both threads to finish.
        p1.t1.join();
        p2.t2.join();
    }
}
The child thread:
import java.util.LinkedList;

public class childprocess implements Runnable
{
    public Thread t1, t2;
    public boolean inttype, stringtype;
    String string;
    int num;
    public static LinkedList<Object> temp = new LinkedList<>();

    public static void assignlist(LinkedList<Object> ll)
    {
        temp = ll;
    }

    public childprocess(String str)
    {
        string = str;
        t1 = new Thread(this, "stringThread");
        t1.start();
    }

    public childprocess(int n)
    {
        num = n;
        t2 = new Thread(this, "numberThread");
        t2.start();
    }

    @Override
    public void run()
    {
        // Each thread takes the branch that matches its name.
        if (Thread.currentThread().getName().equals("stringThread"))
        {
            // your processing using the string
            // LinkedList is not thread safe, so guard the shared list
            synchronized (temp) { temp.add(string); }
        }
        else if (Thread.currentThread().getName().equals("numberThread"))
        {
            // your processing using the number
            synchronized (temp) { temp.add(num); }
        }
    }
}
If you are using functions that should be restricted to only one thread at a time,
declare them with the synchronized keyword:
public synchronized func_type func_name()
{
}