This question comes from the "Sharing resources" section of Thinking in Java.
Note that in this example the class that can be canceled is not Runnable. Instead, all the EvenChecker tasks that depend on the IntGenerator object test it to see whether it’s been canceled, as you can see in run( ).
And
For example, a task cannot depend on another task, because task
shutdown order is not guaranteed. Here, by making tasks depend on a
nontask object, we eliminate the potential race condition.
How should I understand this?
public abstract class IntGenerator {
private volatile boolean canceled = false;
public abstract int next();
public void cancel() { canceled = true; }
public boolean isCanceled() { return canceled; }
}
public class EvenChecker implements Runnable {
private IntGenerator generator;
private final int id;
public EvenChecker(IntGenerator g, int ident) {
generator = g;
id = ident;
}
public void run() {
while(!generator.isCanceled()) {
int val = generator.next();
if(val % 2 != 0) {
System.out.println(val + " not even!");
generator.cancel();
}
}
}
// ...
}
A race condition occurs when two or more tasks start in parallel and, depending on which task gets there first, your program reacts differently, unexpectedly, or even crashes. Without appropriate precautions (an ExecutorService, for example) you can't entirely control the order of execution; the underlying operating system's scheduler always has the final say.
For example: you have an
ArrayList<String> listA
and three independent Runnables.
Runnable A is supposed to add 20 Strings to that list.
Runnable B has to convert them all to lower case.
Runnable C drops duplicates.
Starting them in parallel would cause chaos.
Maybe the intended order happens to hold. Then the expected result will be a list with no duplicates in which all Strings are lower case.
But what if Runnable C runs first and even B finishes before A?
Then your listA would neither be free of duplicates nor would your Strings be converted to lower case.
That, put in simple words, is what a race condition is generally about.
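A minimal sketch of that scenario (class and variable names here are just placeholders) might look like this; starting all three threads at once means the final contents of listA differ from run to run:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class RaceDemo {
    // Not thread-safe: all three Runnables mutate it without any coordination.
    static final List<String> listA = new ArrayList<>();

    public static void main(String[] args) throws InterruptedException {
        Runnable a = () -> {                    // adds 20 Strings (with duplicates and upper case)
            for (int i = 0; i < 20; i++) listA.add("Item" + (i % 5));
        };
        Runnable b = () -> {                    // lower-cases everything currently in the list
            for (int i = 0; i < listA.size(); i++)
                listA.set(i, listA.get(i).toLowerCase());
        };
        Runnable c = () -> {                    // drops duplicates
            List<String> unique = new ArrayList<>(new HashSet<>(listA));
            listA.clear();
            listA.addAll(unique);
        };

        Thread ta = new Thread(a), tb = new Thread(b), tc = new Thread(c);
        ta.start(); tb.start(); tc.start();     // started in parallel: no ordering guarantee
        ta.join(); tb.join(); tc.join();

        // May print upper-case entries, duplicates, or a thread may even die with
        // a ConcurrentModificationException; a different outcome on different runs.
        System.out.println(listA);
    }
}

Running the same three steps strictly one after another would give the deterministic result described above.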
So back to your example.
If IntGenerator were a Runnable too, you'd surely have a lot of trouble harmonizing the two Runnables so that they interact properly with each other. I wouldn't go so far as to say it is impossible, but it would be troublesome.
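To see the book's point in running form: below is roughly how the example is driven (the EvenGenerator is the deliberately non-thread-safe implementation from the same chapter, and the ExecutorService launch is just one way to start the tasks). Every EvenChecker watches the single shared IntGenerator; the first one to see an odd value calls cancel() on it, and all the others notice isCanceled() and exit on their own, so no task ever has to wait for, or depend on, another task.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Deliberately not thread-safe: another thread can call next() between the two
// increments and observe an odd value, which is exactly what EvenChecker detects.
class EvenGenerator extends IntGenerator {
    private int currentEvenValue = 0;
    @Override
    public int next() {
        ++currentEvenValue;   // danger point: not an atomic operation
        ++currentEvenValue;
        return currentEvenValue;
    }
}

class EvenCheckerDemo {
    public static void main(String[] args) {
        IntGenerator gen = new EvenGenerator();        // one shared non-task object
        ExecutorService exec = Executors.newCachedThreadPool();
        for (int i = 0; i < 10; i++)
            exec.execute(new EvenChecker(gen, i));     // every task watches the same generator
        exec.shutdown();                               // tasks stop once gen is canceled
    }
}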
I'm working at the moment on a simple chess AI (calculate possible future turns, rate them, choose the best one, plus some tricks so you don't have to calculate every single turn). The code is written in Java and I'm using NetBeans. To make the calculations faster, I use multithreading. The code works roughly like this:
The main function first makes some calculations and then starts 8 threads.
The threads execute the main calculations.
When they finish, they set a boolean value in a boolean array (finished[]) to true. This array is in the "main class" (if you want to call it that), where the main function is as well.
During all this time the main function waits and constantly checks whether every value of the finished[] array is true. If that is the case, it continues its work.
Now I have a strange problem. The code works perfectly on my PC, but when I run the EXACT same code on my laptop, the main function won't continue its work after all values of the finished[] array are true. I already made some changes in the code so I could try it with different numbers of threads, but the result is always the same.
I have totally no idea what's going on here and would really appreciate it, if someone of you had any answers and/or suggestions!
If you need any more Information just ask, I'll try my best. :)
(Sorry for possible grammar mistakes, english isn't my native language, but I'm trying my best. ;))
So I was asked to show some Code I used in the program:
(Perhaps first a warning: yes, I am still a big noob in Java and this is the first time I've worked with threads, so don't be shocked if you see any terrible mistakes I may have made. xD)
The main Class looks something like this:
public class Chess_ai_20 {
static boolean finished[] = new boolean[8];
static Distributor[] Distributors = new Distributor[8]; // static so main() can use it
...
public static void main(String[] args) {
boolean testing=false;
...
//some calculations and other stuff
...
Distributors[0] = new Distributor(...., "0"); //the String "0" will be the thread name.
Distributors[1] = new ...
...
Distributors[7] = new Distributor(...., "7");
for (int i = 0; i < 8; i++) {
Distributors[i].start();
}
testing=false;
while(testing==false){
if(finished[0]==true && finished[1]==true && ... && finished[7]==true){
testing=true; //That's the point where I get stuck I suppose
}
}
System.out.println("I made it!");
}
public static void setFinished(int i) {
finished[i] = true;
System.out.println("finished [" + i + "] = " + finished[i]);
System.out.println(Arrays.toString(finished)); //To check how many values already are true
}
}
Then, of course, we have the class "Distributor":
public class Distributor extends Thread {
Thread t;
String threadname;
boolean running=false;
...
Distributor(......, String s) {
threadname=s;
...
...
}
@Override
public void start() {
running=true;
if (t == null) {
t = new Thread(this,threadname);
t.start();
}
}
@Override
public void run() {
if(running){
...
//Do the main calculations etc.
...
//All the calculations have been done at this point
Chess_ai_20.setFinished(Character.getNumericValue(threadname.charAt(0))); //Set the value of finished[] true in the main class
running=false;
}
}
}
As others have mentioned, using a Future would be much simpler and easier to understand. Below is a snippet demonstrating how you could rewrite your code.
First, you write a Callable to define the task that you want to do.
public class MyCallable implements Callable<Boolean> {
@Override
public Boolean call() {
// Do some job and return the result.
return Boolean.TRUE;
}
}
And then you submit this task to an Executor. There are a lot of Executors in the JDK; you may want to go through the Concurrency Tutorial first.
ExecutorService executor = Executors.newFixedThreadPool(Runtime
.getRuntime().availableProcessors());
List<Callable<Boolean>> callables = new ArrayList<>();
for (int counter = 0; counter < 8; counter++) {
callables.add(new MyCallable());
}
List<Future<Boolean>> futures = executor.invokeAll(callables);
for (Future<Boolean> future : futures) {
System.out.println(future.get()); // You'd want to store this into an array or wherever you see fit.
}
executor.shutdown();
Remember that the futures returned by the executor are in the same order as the Callables you submitted (or added) to the Collection (in this case, an ArrayList). So you don't need to worry about returning the index, an ID or even the name of the Thread (if you assigned one) to map the corresponding result.
My program searches for a solution (any solution) to a problem through a divide-and-conquer approach, implemented using recursion and RecursiveTasks: I fork a task for the first branch of the division, then recurse into the second branch; if the second branch has found a solution, I cancel the first branch, otherwise I wait for its result.
This is perhaps not optimal. One approach would be for any of the launched tasks to throw an exception if a solution is found. But then, how would I cancel all the launched tasks? Does cancelling a task also cancel all sub-tasks?
You can use a simple approach with a task manager. For example:
public class TaskManager<T> {
private List<ForkJoinTask<T>> tasks;
public TaskManager() {
tasks = new ArrayList<>();
}
public void addTask(ForkJoinTask<T> task) {
tasks.add(task);
}
public void cancelAllExcludeTask(ForkJoinTask<T> cancelTask) {
for (ForkJoinTask<T> task : tasks) {
if (task != cancelTask) {
task.cancel(true);
}
}
}
public void cancelTask(ForkJoinTask<T> cancelTask) {
for (ForkJoinTask<T> task : tasks) {
if (task == cancelTask) {
task.cancel(true);
}
}
}
}
And the task:
public class YourTask extends RecursiveTask<Integer> {
private TaskManager<Integer> taskManager;
@Override
protected Integer compute() {
// stuff and fork
newTask.fork();
// do not forget to save in managers list
taskManager.addTask(newTask);
// another logic
// if current task should be cancelled
taskManager.cancelTask(this);
// or if you have decided to cancel all other tasks
taskManager.cancelAllExcludeTask(this);
return null; // placeholder: return the task's actual result here
}
}
The framework cannot cancel a task for the same reason you cannot cancel a thread. See the documentation on Thread.stop() for all the reasons. What locks could the task be holding? What outside resources could it have linkage to? All the same Thread.stop() reasons apply to tasks as well (after all, tasks run under threads.) You need to tell the task to stop just like you tell a thread to stop.
I manage another fork/join project that uses the scatter-gather technique. The way I do a cancel, or short-circuit, is that every task I create is passed an object (PassObject) that has a
protected volatile boolean stop_now = false;
and a method for stopping the task
protected void stopNow() {stop_now = true; }
Each task periodically checks the stop_now and when true it gracefully ends the task.
Unfortunately, the stop_now needs to be volatile since another thread is going to set it. This can add significant overhead if you check it frequently.
How to set this field in another task gets a little tricky. Each task I create also contains a reference to the array of references to every other task
int nbr_tasks = nbr_elements / threshold;
// this holds the common class passed to each task
PassObject[] passList = new PassObject[nbr_tasks];
for (int i = 0; i < nbr_tasks; i++)
passList[i] = new PassObject( passList,… other parms);
Once the list is formed I fork() each object in passList. Each PassObject contains a reference to the array, passList, which contains a reference to every object that is passed to each task. Therefore, every task knows about every other task, and when one task wants to cancel the others it simply calls the cancelOthers method with a reference to the passList.
private void cancelOthers (PassObject[] others) {
// tell all tasks to stop
for (int i = 0, max = others.length; i < max; i++)
others[i].stopNow();
}
If you're using Java 8 then you can do a form of scatter-gather with the CountedCompleter class instead of RecursiveTask. For Java 7, or if you still want to use RecursiveTask, the first task in the recursion needs to create an AtomicBoolean field (AtomicBoolean stop_now = new AtomicBoolean(false);) and include a reference to this field in every new RecursiveTask it creates. With recursion, you don't know how many levels of tasks you'll need in the beginning.
Again, you'll need to check the boolean periodically in your code and, when it is true, end the task gracefully.
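A hedged sketch of that Java 7 / RecursiveTask variant (every name here is illustrative; trySolve() stands in for whatever leaf computation your divide-and-conquer actually performs):

import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: a search task that shares one stop flag with every subtask it creates.
class SearchTask extends RecursiveTask<Integer> {
    private final int from, to;
    private final AtomicBoolean stopNow;   // created once by the root task, passed down

    SearchTask(int from, int to, AtomicBoolean stopNow) {
        this.from = from;
        this.to = to;
        this.stopNow = stopNow;
    }

    @Override
    protected Integer compute() {
        if (stopNow.get()) return null;              // someone else already found a solution
        if (to - from <= 1) {
            Integer result = trySolve(from);         // hypothetical leaf computation
            if (result != null) stopNow.set(true);   // tell every other task to wind down
            return result;
        }
        int mid = (from + to) / 2;
        SearchTask first = new SearchTask(from, mid, stopNow);   // same flag propagated
        first.fork();
        Integer second = new SearchTask(mid, to, stopNow).compute();
        if (second != null) {
            first.cancel(false);   // optional: the shared flag already stops it gracefully
            return second;
        }
        return first.join();
    }

    private Integer trySolve(int candidate) {        // placeholder for the real work
        return candidate == 42 ? candidate : null;
    }
}

The root invocation would then be something like new ForkJoinPool().invoke(new SearchTask(0, n, new AtomicBoolean(false))).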
The above is just a hint of how you can do a cancel. Every application is different. What I do works for my application – but the logic is the same. You need something common in every task that a task can set and every other task can see.
I'd add more code but the code insert only is taking one line at a time and it isn't practical.
I have been trying to parallelize a portion of a method within my code (as shown in the Example class's function_to_parallelize(...) method). I have examined the executor framework and found that Futures & Callables can be used to create several worker threads that ultimately return values. However, the online examples shown with the executor framework are usually very simple, and none of them covers my particular case, where the worker needs methods from the class that contains the bit of code I'm trying to parallelize. Following one Stack Overflow thread, I've written an external class Solver that implements Callable and its call() method, and set up the executor framework as shown in function_to_parallelize(...). Some of the computation that occurs in each worker thread requires methods such as *subroutine_A(...)* that operate on the data members of the Example class (and further, some of these subroutines make use of random numbers for various sampling functions).
My issue is that while my program executes and produces results (sometimes accurate, sometimes not), every time I run it the combined results of the various worker threads are different. I figured it must be a shared-memory problem, so I passed copies of every data member of the Example class into the Solver constructor, including the utility that contains the Random rng. Further, I copied the subroutines I require directly into the Solver class (even though it can call those methods on Example without this). Why would I be getting different values each time? Is there something I need to implement, such as locking mechanisms or synchronization?
Alternatively, is there a simpler way to inject some parallelization into that method? Rewriting the "Example" class or drastically changing my class structuring is not an option as I need it in its current form for a variety of other aspects of my software/system.
Below is my code vignette (well, it's an incredibly abstracted/reduced form so as to show you basic structure and the target area, even if it's a bit longer than usual vignettes):
public class Tools{
Random rng;
public Tools(Random rng){
this.rng = rng;
}...
}
public class Solver implements Callable<Tuple>{
public Tools toolkit;
public Item W;
public Item v;
Item input;
double param;
public Solver(Item input, double param, Item W, Item v, Tools toolkit){
this.input = input;
this.param = param;
//...so on & so forth for rest of arguments
}
public Tuple call() throws Exception {
//does computation that utilizes the data members W, v
//and calls some methods housed in the "toolkit" object
}
public Item subroutine_A(Item in){....}
public Item subroutine_B(Item in){....}
}
public class Example{
private static final int NTHREDS = 4;
public Tools toolkit;
public Item W;
public Item v;
public Example(...,Tools toolkit...){
this.toolkit = toolkit; ...
}
public Item subroutine_A(Item in){
// some of its internal computation involves sampling & random # generation using
// a call to toolkit, which houses functions that use the initialize Random rng
...
}
public Item subroutine_B(Item in){....}
public void function_to_parallelize(Item input, double param,...){
ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);
List<Future<Tuple>> list = new ArrayList<Future<Tuple>>();
while(some_stopping_condition){
// extract subset of input and feed into Solver constructor below
Callable<Tuple> worker = new Solver(input, param, W, v, toolkit);
Future<Tuple> submit = executor.submit(worker);
list.add(submit);
}
for(Future<Tuple> future : list){
try {
Tuple out = future.get();
// update W via some operation using "out" (like multiplying matrices for example)
}catch(InterruptedException e) {
e.printStackTrace();
}catch(ExecutionException e) {
e.printStackTrace();
}
}
executor.shutdown(); // properly terminate the threadpool
}
}
ADDENDUM: While flob's answer below did address a problem with my vignette/code (you should make sure you set your code up to wait for all threads to catch up with .await()), the issue did not go away after I made this correction. It turns out that the problem lies in how Random works with threads. In essence, the threads are scheduled in varying orders (by the OS/scheduler) and hence do not repeat the same execution order on every run of the program, so a purely deterministic result cannot be guaranteed. I examined the thread-safe version of Random (and used it to gain a bit more efficiency), but alas it does not allow you to set the seed. Still, I highly recommend that anyone looking to incorporate random computations in their thread workers use it as the RNG for multi-threaded work.
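For reference, assuming the thread-safe RNG meant here is java.util.concurrent.ThreadLocalRandom, the worker-side usage is just this (and indeed its setSeed() is unsupported, so runs remain non-reproducible):

// Inside call(): don't share one java.util.Random instance across workers;
// ask for the current thread's own generator at the point of use instead.
double draw = ThreadLocalRandom.current().nextDouble();
int pick = ThreadLocalRandom.current().nextInt(0, 100);   // bounded variant also exists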
The problem I see is that you don't wait for all the tasks to finish before updating W, and because of that some of the Callable instances will get the updated W instead of the one you were expecting.
At this point W is already updated even though not all tasks have finished:
// update W via some operation using "out" (like multiplying matrices for example)
The tasks that are not yet finished will pick up the W updated above instead of the one you expect.
A quick solution (if you know how many Solver tasks you'll have) would be to use a CountDownLatch in order to see when all the tasks have finished:
public void function_to_parallelize(Item input, double param,...){
ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);
List<Future<Tuple>> list = new ArrayList<Future<Tuple>>();
CountDownLatch latch = new CountDownLatch(<number_of_tasks_created_in_next_loop>);
while(some_stopping_condition){
// extract subset of input and feed into Solver constructor below
Callable<Tuple> worker = new Solver(input, param, W, v, toolkit,latch);
Future<Tuple> submit = executor.submit(worker);
list.add(submit);
}
try {
latch.await(); // wait here until every Solver has counted the latch down
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
for(Future<Tuple> future : list){
try {
Tuple out = future.get();
// update W via some operation using "out" (like multiplying matrices for example)
}catch(InterruptedException e) {
e.printStackTrace();
}catch(ExecutionException e) {
e.printStackTrace();
}
}
executor.shutdown(); // properly terminate the threadpool
}
then in the Solver class you have to decrement the latch when the call() method ends:
public Tuple call() throws Exception {
//does computation that utilizes the data members W, v
//and calls some methods housed in the "toolkit" object
latch.countDown();
}
This question is a follow-up to this question (a similar question, but the execution is different). In the code below I don't have any lock on the object, so I'm trying to understand clearly whether I am right or not.
What I understand so far from reading books and articles:
Each thread will enter the run method and get an id from the appropriate pool (existPool or newPool) depending on the if / else-if block; then it will go into attributeMethod, which has to be synchronized, right? And there is another method called from attributeMethod which doesn't need to be synchronized, right?
So suppose a second thread also launches at the same time, will I have any problem with the example below?
private static final class Task implements Runnable {
private BlockingQueue<Integer> existPool;
private BlockingQueue<Integer> newPool;
private int existId;
private int newId;
private Service service;
public Task(Service service, BlockingQueue<Integer> pool1, BlockingQueue<Integer> pool2) {
this.service = service;
this.existPool = pool1;
this.newPool = pool2;
}
public void run() {
try {
if(service.getCriteria().equals("Previous")) {
existId = existPool.take();
attributeMethod(existId);
} else if(service.getCriteria().equals("New")) {
newId = newPool.take();
attributeMethod(newId);
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // take() can block and be interrupted while waiting
}
}
// So do I need to make this method synchronized or not? Currently I have made it synchronized
private synchronized void attributeMethod(int range) {
// And suppose If I am calling any other method here-
sampleMethod();
}
// What about this method? I don't think it needs to be synchronized as well, since it runs within the scope of the previous synchronized method, whichever caller invokes it. Right? Or not?
private void sampleMethod() {
}
}
So suppose a second thread also launches at the same time, will I have any problem with the example below?
Potentially, yes you will. Reread the second bullet point in my answer to your previous question.
Basically, the problem is that the threads will each synchronize on a different instance of the Task class ... and that won't provide any mutual exclusion.
Whether this is actually a problem here will depend on whether the threads need to synchronize. In this case, it appears that the threads will be sharing Service and BlockingQueue instances. If that is the extent of their sharing AND you are using thread-safe implementation classes, then synchronization may not be necessary.
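If it does turn out that the tasks need to exclude each other, the usual minimal fix (just a sketch, and only one of several options) is to have every Task synchronize on one shared object instead of on its own instance:

private static final class Task implements Runnable {
    // One lock object shared by every Task instance (it could equally well be
    // injected through the constructor). Synchronizing on it gives real mutual
    // exclusion, unlike a synchronized instance method of Task.
    private static final Object SHARED_LOCK = new Object();

    private void attributeMethod(int range) {
        synchronized (SHARED_LOCK) {
            sampleMethod();   // runs under the same lock; needs no extra synchronization
        }
    }

    private void sampleMethod() {
        // ...
    }

    public void run() { /* ... as in the question ... */ }
}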
My advice to you would be to go back to your Java textbook(s) / tutorial(s) and review what they say about what synchronized and primitive mutexes actually do. They are really quite simple ... but you need to fully understand the primitives before you can put them together correctly to achieve what you are trying to do.
There is a huge number of tasks.
Each task belongs to a single group. The requirement is that each group of tasks should be executed serially, just as if they were executed in a single thread, and throughput should be maximized in a multi-core (or multi-CPU) environment. Note: there is also a huge number of groups, proportional to the number of tasks.
The naive solution is to use a ThreadPoolExecutor and synchronize (or lock). However, threads would block each other and the throughput would not be maximized.
Any better idea? Or does there exist a third-party library that satisfies the requirement?
A simple approach would be to "concatenate" all of a group's tasks into one super task, thus making the sub-tasks run serially. But this will probably delay other groups, which will not start until some other group completely finishes and frees some space in the thread pool.
As an alternative, consider chaining a group's tasks. The following code illustrates it:
public class MultiSerialExecutor {
private final ExecutorService executor;
public MultiSerialExecutor(int maxNumThreads) {
executor = Executors.newFixedThreadPool(maxNumThreads);
}
public void addTaskSequence(List<Runnable> tasks) {
executor.execute(new TaskChain(tasks));
}
private void shutdown() {
executor.shutdown();
}
private class TaskChain implements Runnable {
private List<Runnable> seq;
private int ind;
public TaskChain(List<Runnable> seq) {
this.seq = seq;
}
@Override
public void run() {
seq.get(ind++).run(); //NOTE: No special error handling
if (ind < seq.size())
executor.execute(this);
}
}
}
The advantage is that no extra resources (threads/queues) are being used, and that the granularity of tasks is better than in the naive approach. The disadvantage is that all of a group's tasks must be known in advance.
--edit--
To make this solution generic and complete, you may want to decide on error handling (i.e. whether a chain continues even if an error occurs), and it would also be a good idea to implement ExecutorService and delegate all calls to the underlying executor.
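A sketch of that last point (extending AbstractExecutorService keeps the delegation small, since it supplies submit/invokeAll/invokeAny on top of the handful of methods below; everything else is forwarded to the wrapped pool):

import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MultiSerialExecutor extends AbstractExecutorService {
    private final ExecutorService executor;

    public MultiSerialExecutor(int maxNumThreads) {
        executor = Executors.newFixedThreadPool(maxNumThreads);
    }

    public void addTaskSequence(List<Runnable> tasks) {
        executor.execute(new TaskChain(tasks));
    }

    // Plain delegation: the rest of the ExecutorService contract is forwarded as-is.
    @Override public void execute(Runnable command) { executor.execute(command); }
    @Override public void shutdown() { executor.shutdown(); }
    @Override public List<Runnable> shutdownNow() { return executor.shutdownNow(); }
    @Override public boolean isShutdown() { return executor.isShutdown(); }
    @Override public boolean isTerminated() { return executor.isTerminated(); }
    @Override public boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException {
        return executor.awaitTermination(timeout, unit);
    }

    // Chains a group's tasks, as in the original answer.
    private class TaskChain implements Runnable {
        private final List<Runnable> seq;
        private int ind;
        TaskChain(List<Runnable> seq) { this.seq = seq; }
        @Override
        public void run() {
            seq.get(ind++).run();
            if (ind < seq.size())
                executor.execute(this);
        }
    }
}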
I would suggest using task queues:
For every group of tasks you have, create a queue and insert all tasks from that group into it.
Now all your queues can be executed in parallel while the tasks inside one queue are executed serially.
A quick Google search suggests that the Java API has no task/thread queues by itself. However, there are many tutorials available on coding one. Feel free to list good tutorials / implementations if you know some.
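One concrete shape this can take (a sketch; the names are mine, and the inner class is essentially the SerialExecutor example from the java.util.concurrent.Executor javadoc, applied per group):

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

// Sketch: one lightweight serial "lane" per group, all lanes backed by a shared pool.
class GroupedExecutor {
    private final Executor pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
    private final ConcurrentHashMap<String, SerialLane> lanes = new ConcurrentHashMap<>();

    void submit(String groupId, Runnable task) {
        lanes.computeIfAbsent(groupId, id -> new SerialLane()).execute(task);
    }

    // Tasks offered to one lane run strictly one after another on the shared pool.
    private class SerialLane implements Executor {
        private final Queue<Runnable> tasks = new ArrayDeque<>();
        private Runnable active;

        public synchronized void execute(Runnable r) {
            tasks.offer(() -> {
                try { r.run(); } finally { scheduleNext(); }
            });
            if (active == null) scheduleNext();
        }

        private synchronized void scheduleNext() {
            if ((active = tasks.poll()) != null) pool.execute(active);
        }
    }
}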
I mostly agree with Dave's answer, but if you need to slice CPU time across all "groups", i.e. all task groups should progress in parallel, you might find this kind of construct useful (using removal as a "lock"; this worked fine in my case, although I imagine it tends to use more memory):
class TaskAllocator {
private final ConcurrentLinkedQueue<Queue<Runnable>> entireWork
= childQueuePerTaskGroup();
public Queue<Runnable> lockTaskGroup(){
return entireWork.poll();
}
public void release(Queue<Runnable> taskGroup){
entireWork.offer(taskGroup);
}
}
and
class DoWork implements Runnable {
private final TaskAllocator allocator;
public DoWork(TaskAllocator allocator){
this.allocator = allocator;
}
public void run(){
for(;;){
Queue<Runnable> taskGroup = allocator.lockTaskGroup();
if(taskGroup==null){
//No more work
return;
}
Runnable work = taskGroup.poll();
if(work == null){
//This group is done
continue;
}
//Do work, but never forget to release the group to
// the allocator.
try {
work.run();
} finally {
allocator.release(taskGroup);
}
}//for
}
}
You can then use an optimum number of threads to run the DoWork tasks. It's kind of a round-robin load balance.
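For example (a fragment; it assumes the TaskAllocator has already been populated with one queue per task group):

int nThreads = Runtime.getRuntime().availableProcessors();
ExecutorService pool = Executors.newFixedThreadPool(nThreads);
TaskAllocator allocator = new TaskAllocator();   // assumed to be pre-filled with the group queues
for (int i = 0; i < nThreads; i++)
    pool.execute(new DoWork(allocator));
pool.shutdown();   // each DoWork returns once the allocator runs dry, then the pool winds down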
You can even do something more sophisticated by using this instead of a simple queue in TaskAllocator (task groups with more tasks remaining tend to get executed first):
ConcurrentSkipListSet<MyQueue<Runnable>> sophisticatedQueue =
new ConcurrentSkipListSet<>(new SophisticatedComparator());
where SophisticatedComparator is
class SophisticatedComparator implements Comparator<MyQueue<Runnable>> {
public int compare(MyQueue<Runnable> o1, MyQueue<Runnable> o2){
int diff = o2.size() - o1.size();
if(diff==0){
//This is crucial. You must assign unique ids to your
//Subqueue and break the equality if they happen to have same size.
//Otherwise your queues will disappear...
return o1.id - o2.id;
}
return diff;
}
}
Actors are another solution for this type of problem.
Scala has actors, and so does Java, provided by Akka.
I had a problem similar to yours, and I used an ExecutorCompletionService, which works with an Executor to complete collections of tasks.
Here is an extract from the java.util.concurrent API docs (since Java 7):
Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers)
throws InterruptedException, ExecutionException {
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
for (Callable<Result> s : solvers)
ecs.submit(s);
int n = solvers.size();
for (int i = 0; i < n; ++i) {
Result r = ecs.take().get();
if (r != null)
use(r);
}
}
So, in your scenario, every task will be a single Callable<Result>, and tasks will be grouped in a Collection<Callable<Result>>.
Reference:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorCompletionService.html