Java Threading Tutorial Type Question

I am fairly naive when it comes to the world of Java Threading and Concurrency. I am currently trying to learn. I made a simple example to try to figure out how concurrency works.
Here is my code:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class ThreadedService {
private ExecutorService exec;
/**
* @param poolSize number of threads in the pool; a value below 1 selects a cached thread pool
*/
public ThreadedService(int poolSize) {
if (poolSize < 1) {
this.exec = Executors.newCachedThreadPool();
} else {
this.exec = Executors.newFixedThreadPool(poolSize);
}
}
public void add(final String str) {
exec.execute(new Runnable() {
public void run() {
System.out.println(str);
}
});
}
public static void main(String args[]) {
ThreadedService t = new ThreadedService(25);
for (int i = 0; i < 100; i++) {
t.add("ADD: " + i);
}
}
}
What do I need to do to make the code print out the numbers 0-99 in sequential order?

Thread pools are usually used for operations which do not need synchronization or are highly parallel.
Printing the numbers 0-99 sequentially is not a concurrent problem and requires threads to be synchronized to avoid printing out of order.
I recommend taking a look at the Java concurrency lesson to get an idea of concurrency in Java.

The idea of threads is not to do things sequentially.
You will need some shared state to coordinate the threads. In this example, adding instance fields to your outer class will work. Remove the parameter from add. Add a lock object and a counter. Then, in each task: grab the lock, print the number, increment the number, release the lock.
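A minimal sketch of that lock-and-counter idea, assuming a hypothetical class name LockedCounterService and an extra list that exists only so the printed order can be checked afterwards:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical variant of ThreadedService: tasks no longer carry their own number.
class LockedCounterService {
    private final ExecutorService exec = Executors.newFixedThreadPool(25);
    private final Object lock = new Object();                 // guards counter and printed
    private int counter = 0;                                  // shared state: next number to print
    private final List<Integer> printed = new ArrayList<>();  // assumption: kept only for checking

    public void add() {
        exec.execute(() -> {
            synchronized (lock) {                             // grab the lock
                System.out.println("ADD: " + counter);        // print the number
                printed.add(counter);
                counter++;                                    // increment while holding the lock
            }                                                 // release the lock
        });
    }

    // Waits for all submitted tasks and returns the numbers in the order they were printed.
    public List<Integer> finish() throws InterruptedException {
        exec.shutdown();
        exec.awaitTermination(10, TimeUnit.SECONDS);
        synchronized (lock) {
            return new ArrayList<>(printed);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockedCounterService t = new LockedCounterService();
        for (int i = 0; i < 100; i++) {
            t.add();
        }
        t.finish();
    }
}
```

Whichever thread wins the lock prints the next value, so the output is ordered even though the order in which the threads run is not.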

The simplest solution to your problem is to use a ThreadPool size of 1. However, this isn't really the kind of problem one would use threads to solve.
To expand, if you create your executor with:
this.exec = Executors.newSingleThreadExecutor();
then your tasks will all be scheduled and executed in the order they were submitted for execution. There are a few scenarios where this is a logical thing to do, but in most cases threads are the wrong tool to use to solve this problem.
This kind of thing makes sense to do when you need to execute the task in a different thread -- perhaps it takes a long time to execute and you don't want to block a GUI thread -- but you don't need or don't want the submitted tasks to run at the same time.
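As a rough illustration of the single-thread executor, here is a sketch; the class name SingleThreadOrderDemo and the helper runInOrder are made up for this example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleThreadOrderDemo {
    // Submits "ADD: 0" .. "ADD: n-1" and returns the strings in the order the tasks actually ran.
    static List<String> runInOrder(int n) throws InterruptedException {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < n; i++) {
            final String str = "ADD: " + i;
            exec.execute(() -> {
                System.out.println(str);
                seen.add(str);
            });
        }
        exec.shutdown();                              // no new tasks; queued ones still run
        exec.awaitTermination(10, TimeUnit.SECONDS);
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        runInOrder(100);
    }
}
```

The printing happens on the executor's one worker thread rather than on the submitting thread, which is the only real difference from calling System.out.println directly.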

The problem is by definition not suited to threads. Threads are run independently and there isn't really a way to predict which thread is run first.
If you want to change your code to run sequentially, change add to:
public void add(final String str) {
System.out.println(str);
}
You are not using threads (not your own at least) and everything happens sequentially.

Related

Confused about java concurrency results

So I am studying up on java concurrency by trying to create bad concurrent examples, watch them fail and then fix them.
But the code never seems to be breaking... What am I missing here?
I have a "shared object", being my HotelWithMaximum instance. As far as I can tell, this class is not thread safe:
package playground.concurrent;
import java.util.ArrayList;
import java.util.List;
public class HotelWithMaximum {
private static final int MAXIMUM = 20;
private List<String> visitors = new ArrayList<String>();
public void register(IsVisitor visitor) {
System.out.println("Registering : " + visitor.getId());
System.out.println("Amount of visitors atm: " + visitors.size());
if(visitors.size() < MAXIMUM) {
//At some point, I do expect a thread to be interfering here where the condition is actually evaluated to
//true, but some other thread interfered, adds another visitor, causing the previous thread to go over the limit
System.out.println("REGISTERING ---------------------------------------------------------------------");
//The interference might also happen here i guess...
visitors.add(visitor.getId());
}
else{
System.out.println("We cant register anymore, we have reached our limit! " + visitors.size());
}
}
public int getAmountOfRegisteredVisitors() {
return visitors.size();
}
public void printVisitors() {
for(String visitor: visitors) {
System.out.println(visitors.indexOf(visitor) + " - " + visitor);
}
}
}
The visitors are 'Runnables' (they implement my interface IsVisitor which extends from Runnable), and they are implemented like this:
package playground.concurrent.runnables;
import playground.concurrent.HotelWithMaximum;
import playground.concurrent.IsVisitor;
public class MaxHotelVisitor implements IsVisitor{
private final String id;
private final HotelWithMaximum hotel;
public MaxHotelVisitor(String id, HotelWithMaximum hotel) {
this.hotel = hotel;
this.id = id;
}
public void run() {
System.out.println(String.format("My name is %s and I am trying to register...", id));
hotel.register(this);
}
public String getId() {
return this.id;
}
}
Then, to make all of this run in an example, I have the following code in a different class:
public static void executeMaxHotelExample() {
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(6);
HotelWithMaximum hotel = new HotelWithMaximum();
for(int i = 0; i<100; i++) {
executor.execute(new MaxHotelVisitor("MaxHotelVisitor-" + i, hotel));
}
executor.shutdown();
try{
boolean finished = executor.awaitTermination(30, TimeUnit.SECONDS);
if(finished) {
System.out.println("FINISHED WITH THE MAX HOTEL VISITORS EXAMPLE");
hotel.printVisitors();
}
}
catch(InterruptedException ie) {
System.out.println("Something interrupted me....");
}
}
public static void main(String[] args) {
executeMaxHotelExample();
}
Now, what am I missing? Why does this never seem to fail? The hotel class is not thread safe, right? And the only thing to make it 'enough' thread safe for this example (since no other code is messing with the thread unsafe List in the hotel class ), I should just make the register method "synchronized", right?
The result of the "printVisitors()" method in the main method, always looks like this:
FINISHED WITH THE MAX HOTEL VISITORS EXAMPLE
0 - MaxHotelVisitor-0
1 - MaxHotelVisitor-6
2 - MaxHotelVisitor-7
3 - MaxHotelVisitor-8
4 - MaxHotelVisitor-9
5 - MaxHotelVisitor-10
6 - MaxHotelVisitor-11
7 - MaxHotelVisitor-12
8 - MaxHotelVisitor-13
9 - MaxHotelVisitor-14
10 - MaxHotelVisitor-15
11 - MaxHotelVisitor-16
12 - MaxHotelVisitor-17
13 - MaxHotelVisitor-18
14 - MaxHotelVisitor-19
15 - MaxHotelVisitor-20
16 - MaxHotelVisitor-21
17 - MaxHotelVisitor-22
18 - MaxHotelVisitor-23
19 - MaxHotelVisitor-24
There are never more than 20 visitors in the list... I find that quite weird...

ThreadPoolExecutor is from the java.util.concurrent package.
The Java Concurrency Utilities framework in the java.util.concurrent package is a library of thread-safe types that are used to handle concurrency in Java applications.
So ThreadPoolExecutor is taking care of the synchronized processing.
Take note: ThreadPoolExecutor uses a BlockingQueue to manage its job queue.
java.util.concurrent.BlockingQueue is an interface that requires all implementations to be thread-safe.
From my understanding, one of the main goals of java.util.concurrent was that you can, to a large extent, operate without Java's low-level concurrency primitives synchronized, volatile, wait(), notify(), and notifyAll(), which are difficult to use.
Also note that ThreadPoolExecutor implements ExecutorService, which does not guarantee that all implementations are thread-safe, but according to the documentation
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html
Actions in a thread prior to the submission of a Runnable or Callable task to an ExecutorService happen-before any actions taken by that task, which in turn happen-before the result is retrieved via Future.get().
Explanation of happen-before:
Java™ Language Specification defines the happens-before relation on memory operations such as reads and writes of shared variables. The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation.
In other words - generally not thread-safe. BUT
The methods of all classes in java.util.concurrent and its subpackages extend these guarantees to higher-level synchronization.

the code never seems to be breaking... What am I missing here?
The Java Language Specification gives implementors a lot of leeway to make the most efficient use of any given multi-processor architecture.
If you obey the rules for writing "safe" multi-threaded code, then that's supposed to guarantee that a correctly implemented JVM will run your program in the way that you expect. But if you break the rules, that does not guarantee that your program will misbehave.
Finding concurrency bugs by testing is a hard problem. A non "thread-safe" program might work 100% of the time on one platform (i.e., architecture/OS/JVM combination), it might always fail on some other platform, and its performance in some third platform might depend on what other processes are running, on the time of day or, on other variables that you can only guess at.

You are right.
You can reproduce the concurrency issues when you use more threads at the same time, say Executors.newFixedThreadPool(100) instead of 6. Then more threads will attempt the registration at the same time and the probability of a collision is higher. Because the race condition/overflow can only happen once, you will have to run your main several times to get more visitors.
Further, you need to add a Thread.yield() at both places where you expect the "interference", to make it more likely to happen. If the execution is very short/fast there will often be no task switch, and the execution will appear atomic (but this is not guaranteed).
You might also write the code using ThreadWeaver which does byte code manipulation (adding yields) to make such issues more likely.
With both changes I get 30 and more visitors in the hotel from time to time. I have 2x2 CPUs.
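For the fix proposed in the question (making register synchronized), a minimal sketch could look like this. SynchronizedHotel is a simplified stand-in that takes the visitor id as a plain String, which is an assumption made to keep the example short:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for HotelWithMaximum (assumption: ids are passed directly as Strings).
class SynchronizedHotel {
    private static final int MAXIMUM = 20;
    private final List<String> visitors = new ArrayList<>();

    // synchronized turns the size check and the add into one atomic step per caller.
    public synchronized boolean register(String visitorId) {
        if (visitors.size() < MAXIMUM) {
            Thread.yield(); // widening the window is harmless while the lock is held
            visitors.add(visitorId);
            return true;
        }
        return false;
    }

    public synchronized int getAmountOfRegisteredVisitors() {
        return visitors.size();
    }

    // Runs `tasks` registrations on `threads` pool threads and returns the final visitor count.
    static int runExample(int threads, int tasks) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        SynchronizedHotel hotel = new SynchronizedHotel();
        for (int i = 0; i < tasks; i++) {
            final String id = "MaxHotelVisitor-" + i;
            executor.execute(() -> hotel.register(id));
        }
        executor.shutdown();
        executor.awaitTermination(30, TimeUnit.SECONDS);
        return hotel.getAmountOfRegisteredVisitors();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Registered visitors: " + runExample(100, 1000));
    }
}
```

Even with 100 threads and the yield in place, the count cannot pass the maximum, because no second thread can run the check while the first is between its check and its add.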

Other Threads stops when one thread reaches its destination

I am currently working on understanding the Java concept of multithreading. I went through a tutorial which uses the Tortoise and the Hare example to explain the concept of multithreading, and to a large extent I understood the syntax and the logic of the video tutorial. At the end of the video tutorial, the Youtuber gave an assignment that involves applying Multithreading to an olympic race track.
Using my knowledge from the example, I was able to create 10 threads (representing the athletes) that each run a loop that executes 100 times (representing 100 meters).
My challenge is that when the thread scheduler lets one athlete get to 100 meters before the other 9 athletes, the remaining 9 threads never complete their race. This is not what happens on a real race track. The fact that a thread called Usain Bolt gets to 100 first does not mean Yohan Blake should stop running if he is at 90m at that time.
I am also interested in getting the distance (note that they are all using the same variable) for each thread, so that I can use a function to return the positions of each Thread at the end of the race.
What I have done (that did not work):
1) I have tried to use an if-else construct (containing nine "else" statements) to assign the distance of each executing thread to a new integer variable (using Thread.currentThread().getName() and the name of each thread), but that did not work well for me. This was an attempt to give positions to the athletes using their distance alone, but it does nothing about the 9 athletes not finishing the race.
2) I have also tried to use an ArrayList to collect the distances at runtime, but for some strange reason this still overwrites the distance each time it wants to add another distance.
Below is my code:
package olympics100meters;
import java.util.ArrayList;
public class HundredMetersTrackRules implements Runnable {
public static String winner;
public void race() {
for (int distance=1;distance<=50;distance++) {
System.out.println("Distance covered by "+Thread.currentThread ().getName ()+" is "+distance+" meters.");
boolean isRaceWon=this.isRaceWon(distance);
if (isRaceWon) {
ArrayList<Integer> numbers = new ArrayList();
numbers.add(distance);
System.out.println("testing..."+numbers);
break;
}
}
}
private boolean isRaceWon(int totalDistanceCovered) {
boolean isRaceWon=false;
if ((HundredMetersTrackRules.winner==null)&& (totalDistanceCovered==50)) {
String winnerName=Thread.currentThread().getName();
HundredMetersTrackRules.winner=winnerName;
System.out.println("The winner is "+HundredMetersTrackRules.winner);
isRaceWon=true;
}
else if (HundredMetersTrackRules.winner==null) {
isRaceWon=false;
}
else if (HundredMetersTrackRules.winner!=null) {
isRaceWon=true;
}
return isRaceWon;
}
public void run() {
this.race();
}
}
This is my main method (I reduced it to 5 Athletes till I sort out the issues):
public class Olympics100Meters {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
HundredMetersTrackRules racer=new HundredMetersTrackRules();
Thread UsainBoltThread=new Thread(racer,"UsainBolt");
Thread TysonGayThread=new Thread (racer,"TysonGay");
Thread AsafaPowellThread=new Thread(racer,"AsafaPowell");
Thread YohanBlakeThread=new Thread (racer,"YohanBlake");
Thread JustinGatlinThread=new Thread (racer,"JustinGatlin");
UsainBoltThread.start();
TysonGayThread.start();
AsafaPowellThread.start();
YohanBlakeThread.start();
JustinGatlinThread.start();
}
}

My challenge is that ... the remaining 9 threads always do not complete their race.
This is caused by the isRaceWon() method implementation. Every runner checks it at every meter. As soon as the first runner reaches the finish distance (50 in the posted code), break is called on the next step of every other runner's loop (the race counts as won for every loop).
By the way, it makes sense to declare the winner's name as a volatile static String, to avoid the Java memory model's visibility ambiguities.
I am also interested in getting the distance ... for each thread, so that I can use a function to return the positions of each Thread at the end of the race.
If the final aim is to get the positions, create a class field public List<String> finishingOrder = new ArrayList<String>(); and a method finish:
private synchronized void finish() {
finishingOrder.add(Thread.currentThread().getName());
}
and call it after the loop in race().
Do not forget to call join() on all runner threads in your main. After that, finishingOrder will contain the names in order of finishing.
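Put together, the finish/join approach might look like the sketch below. The class FinishingOrderRace and its helper runRace are hypothetical names, and the loop body is a placeholder for the real per-meter work:

```java
import java.util.ArrayList;
import java.util.List;

class FinishingOrderRace implements Runnable {
    private final List<String> finishingOrder = new ArrayList<>();

    // synchronized so concurrent finishers cannot corrupt the list
    private synchronized void finish() {
        finishingOrder.add(Thread.currentThread().getName());
    }

    public synchronized List<String> getFinishingOrder() {
        return new ArrayList<>(finishingOrder);
    }

    @Override
    public void run() {
        for (int distance = 1; distance <= 100; distance++) {
            // Every runner covers the full distance, whoever has already finished.
            Thread.yield();
        }
        finish();
    }

    // Starts one thread per name, waits for all of them, and returns the finishing order.
    static List<String> runRace(String... names) throws InterruptedException {
        FinishingOrderRace race = new FinishingOrderRace();
        List<Thread> runners = new ArrayList<>();
        for (String name : names) {
            Thread t = new Thread(race, name);
            runners.add(t);
            t.start();
        }
        for (Thread t : runners) {
            t.join(); // main waits here until this runner is done
        }
        return race.getFinishingOrder();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runRace("UsainBolt", "TysonGay", "AsafaPowell", "YohanBlake", "JustinGatlin"));
    }
}
```

The first element of the returned list is the winner; the order of the rest varies from run to run.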

The code snippet below is causing isRaceWon to return true for every instance of HundredMetersTrackRules as soon as the shared winner field is set to non-null (i.e. someone wins.):
else if (HundredMetersTrackRules.winner!=null) {
isRaceWon=true;
}
This in turn causes the loop in race() to break for every instance of your Runnable. The run() method exits, terminating the thread.
The issue is just a logic error and not really specific to threading. But, as other posters have mentioned, there are some threading best practices you can also adopt in this code, such as using volatile for fields shared by threads.

Actually, for a race you need to start all the threads at once; only then is it a race.
CountDownLatch is a better way to implement a race program.
There are many other ways to write a race program without using CountDownLatch.
If we need to implement it using low-level primitives, we can use a volatile boolean flag and a counter variable in synchronized blocks, or wait() and notifyAll() logic, etc.
I introduced some time delay inside the for loop of your program. Only then can you observe the effect, because otherwise the threads are not all started at once.
I assume you are practicing at a beginner level, so I made only a few changes for better understanding, and addressed all your queries.
import java.util.ArrayList;
import java.util.List;
import java.util.Collections;
class HundredMetersTrackRules implements Runnable {
private final Main main; // lock object guarding Main.finish
HundredMetersTrackRules(Main main){
this.main=main;
}
public static String winner;
public void race() {
try{
System.out.println(Thread.currentThread().getName()+" Waiting for others...");
while(!Main.start){
Thread.sleep(3);
}
for (int distance=1;distance<=50;distance++) {
System.out.println("Distance covered by "+Thread.currentThread().getName()+" is "+distance+" meters.");
Thread.sleep(1000);
}
synchronized(main){
Main.finish--;
}
Main.places.add(Thread.currentThread().getName());
}catch(InterruptedException ie){
ie.printStackTrace();
}
}
public void run() {
this.race();
}
}
public class Main
{
public static volatile boolean start = false;
public static volatile int finish = 5; // volatile so the polling loop in main sees the runners' decrements
final static List<String> places =
Collections.synchronizedList(new ArrayList<String>());
public static void main(String[] args) {
HundredMetersTrackRules racer=new HundredMetersTrackRules(new Main());
Thread UsainBoltThread=new Thread(racer,"UsainBolt");
Thread TysonGayThread=new Thread (racer,"TysonGay");
Thread AsafaPowellThread=new Thread(racer,"AsafaPowell");
Thread YohanBlakeThread=new Thread (racer,"YohanBlake");
Thread JustinGatlinThread=new Thread (racer,"JustinGatlin");
UsainBoltThread.start();
TysonGayThread.start();
AsafaPowellThread.start();
YohanBlakeThread.start();
JustinGatlinThread.start();
Main.start=true;
while(Main.finish!=0){
try{
Thread.sleep(100);
}catch(InterruptedException ie){
ie.printStackTrace();
}
}
System.out.println("The winner is "+places.get(0));
System.out.println("All Places :"+places);
}
}
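The answer recommends CountDownLatch, but the code above polls flags instead. A sketch of the latch-based version, under the assumption of a hypothetical class LatchRace, could be:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LatchRace {
    // Starts one thread per name, releases them together, and returns the finishing order.
    static List<String> runRace(String... names) throws InterruptedException {
        CountDownLatch startGun = new CountDownLatch(1);            // opened once by main
        CountDownLatch finished = new CountDownLatch(names.length); // one count per runner
        List<String> places = Collections.synchronizedList(new ArrayList<>());

        for (String name : names) {
            new Thread(() -> {
                try {
                    startGun.await();                               // wait for the common start
                    for (int distance = 1; distance <= 50; distance++) {
                        Thread.sleep(1);                            // placeholder for running one meter
                    }
                    places.add(Thread.currentThread().getName());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                } finally {
                    finished.countDown();                           // always report back to main
                }
            }, name).start();
        }

        startGun.countDown();                                       // all runners start now
        finished.await();                                           // block until every runner is done
        return new ArrayList<>(places);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> places = runRace("UsainBolt", "TysonGay", "AsafaPowell", "YohanBlake", "JustinGatlin");
        System.out.println("The winner is " + places.get(0));
        System.out.println("All Places :" + places);
    }
}
```

Compared with the polling version, neither the runners nor main have to sleep and re-check a flag; both latches block until the event they wait for has happened.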

Java Multithreading large arrays access

My main class generates multiple threads based on some rules (20-40 threads that live for a long time).
Each thread creates several short-lived threads --> I am using an executor for this.
I need to work on multi-dimensional arrays in the short-lived threads --> I wrote it as in the code below --> but I think that it is not efficient, since I pass the array so many times to so many threads / tasks. I tried to access it directly from the threads (by declaring it as public --> no success) --> I will be happy to get comments / advice on how to improve it.
My next step is to return a one-dimensional array as a result (it might be better just to update it in the AssetFactory class) --> and I am not sure how to do that.
please see the code below.
thanks
Paz
import java.util.concurrent.*;
import java.util.logging.Level;
public class AssetFactory implements Runnable{
private volatile boolean stop = false;
private volatile String feed ;
private double[][][] PeriodRates= new double[10][500][4];
private String TimeStr,Bid,periodicalRateIndicator;
private final BlockingQueue<String> workQueue;
ExecutorService IndicatorPool = Executors.newCachedThreadPool();
public AssetFactory(BlockingQueue<String> workQueue) {
this.workQueue = workQueue;
}
@Override
public void run(){
while (!stop) {
try{
feed = workQueue.take();
periodicalRateIndicator = CheckPeriod(TimeStr, Bid) ;
if (periodicalRateIndicator.length() >0) {
IndicatorPool.submit(new CalcMvg(periodicalRateIndicator,PeriodRates));
}
if ("Stop".equals(feed)) {
stop = true ;
}
} // try
catch (InterruptedException ex) {
logger.log(Level.SEVERE, null, ex);
stop = true;
}
} // while
} // run
Here is the CalcMVG class
public class CalcMvg implements Runnable {
private double [][][] PeriodRates = new double[10][500][4];
public CalcMvg(String Periods, double[][][] PeriodRates) {
System.out.println(Periods);
this.PeriodRates = PeriodRates ;
}
@Override
public void run(){
try{
// do some work with the data of PeriodRates array e.g. print it (no changes to array)
System.out.println(PeriodRates[1][1][1]);
}
catch (Exception ex){
System.out.println(Thread.currentThread().getName() + ex.getMessage());
logger.log(Level.SEVERE, null, ex);
}
}//run
} // mvg class

There are several things going on here which seem to be wrong, but it is hard to give a good answer with the limited amount of code presented.
First the actual coding issues:
There is no need to define a variable as volatile if only one thread ever accesses it (stop, feed)
You should declare variables that are only used in a local context (run method) locally in that function and not globally for the whole instance (almost all variables). This allows the JIT to do various optimizations.
The InterruptedException should terminate the thread. Because it is thrown as a request to terminate the thread's work.
In your code example the workQueue doesn't seem to do anything but to put the threads to sleep or stop them. Why doesn't it just immediately feed the actual worker-threads with the required workload?
And then the code structure issues:
You use threads to feed threads with work. This is inefficient, as you only have a limited amount of cores that can actually do the work. As the execution order of threads is undefined, it is likely that the IndicatorPool is either mostly idle or overfilling with tasks that have not yet been done.
If you have a finite set of work to be done, the ExecutorCompletionService might be helpful for your task.
I think you will gain the best speed increase by redesigning the code structure. Imagine the following (assuming that I understood your question correctly):
There is a blocking queue of tasks that is fed by some data source (e.g. file-stream, network).
A set of worker-threads equal to the amount of cores is waiting on that data source for input, which is then processed and put into a completion queue.
A specific data set is the "terminator" for your work (e.g. "null"). If a thread encounters this terminator, it finishes its loop and shuts down.
Now the following holds true for this construct:
Case 1: The data source is the bottleneck. It cannot be sped up by using multiple threads, as your hard disk/network won't work faster if you ask more often.
Case 2: The processing power of your machine is the bottleneck, as you cannot process more data than the worker threads/cores on your machine can handle.
In both cases the conclusion is that the worker threads need to be the ones that ask for new data as soon as they are ready to process it, since either they need to be put on hold or they need to throttle the incoming data. This will ensure maximum throughput.
If all worker threads have terminated, the work is done. This can be tracked e.g. through the use of a CyclicBarrier or Phaser.
Pseudo-code for the worker threads:
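public void run() {
DataType e;
try {
while ((e = dataSource.next()) != null) {
process(e);
}
barrier.await();
} catch (InterruptedException ex) {
// Interruption is a request to stop: restore the flag and let the thread end.
Thread.currentThread().interrupt();
} catch (BrokenBarrierException ex) {
// Another worker failed or was interrupted while waiting at the barrier.
}
}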
I hope this is helpful on your case.
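A runnable sketch of that worker structure follows, under a few assumptions: the class name QueueWorkers is hypothetical, the terminator is the string "Stop" (as in the question's feed), "processing" is just summing string lengths, and a CountDownLatch stands in for the CyclicBarrier/Phaser mentioned above:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class QueueWorkers {
    private static final String TERMINATOR = "Stop"; // assumption: reserved value that ends a worker

    // Feeds all items to `workers` threads and returns the summed length of the processed items.
    static long processAll(Iterable<String> feed, int workers) throws InterruptedException {
        BlockingQueue<String> dataSource = new LinkedBlockingQueue<>();
        CountDownLatch done = new CountDownLatch(workers);
        AtomicLong totalLength = new AtomicLong();

        for (int w = 0; w < workers; w++) {
            new Thread(() -> {
                try {
                    String e;
                    // Each worker pulls the next item as soon as it is ready for more work.
                    while (!TERMINATOR.equals(e = dataSource.take())) {
                        totalLength.addAndGet(e.length());   // placeholder for process(e)
                    }
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();      // treat interruption as a stop request
                } finally {
                    done.countDown();
                }
            }).start();
        }

        for (String item : feed) {
            dataSource.put(item);
        }
        for (int w = 0; w < workers; w++) {
            dataSource.put(TERMINATOR);                      // one terminator per worker
        }
        done.await();                                        // all workers have terminated
        return totalLength.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(processAll(java.util.Arrays.asList("a", "bb", "ccc"), cores));
    }
}
```

One terminator is queued per worker because each worker consumes exactly one before it exits.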

Passing the array as an argument to the constructor is a reasonable approach, although unless you intend to copy the array it isn't necessary to initialize PeriodRates with a large array. It seems wasteful to allocate a large block of memory and then reassign its only reference straight away in the constructor. I would initialize it like this:
private final double [][][] PeriodRates;
public CalcMvg(String Periods, double[][][] PeriodRates) {
System.out.println(Periods);
this.PeriodRates = PeriodRates;
}
The other option is to define CalcMvg as an inner class of AssetFactory and declare PeriodRates as final. This would allow instances of CalcMvg to access PeriodRates in the outer instance of AssetFactory.
Returning the result is more difficult since it involves publishing the result across threads. One way to do this is to use synchronized methods:
private double[] result = null;
private synchronized void setResult(double[] result) {
this.result = result;
}
public synchronized double[] getResult() {
if (result == null) {
throw new RuntimeException("Result has not been initialized for this instance: " + this);
}
return result;
}
There are more advanced multi-threading concepts available in the Java libraries, e.g. Future, that might be appropriate in this case.
Regarding your concerns about the number of threads, allowing a library class to manage the allocation of work to a thread pool might solve this concern. Something like an Executor might help with this.
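As an illustration of the Future route, here is a sketch in which the task is a Callable that returns the one-dimensional result. CalcMvgTask, its period parameter, and the column-average calculation are all placeholders invented for this example, not the asker's actual logic:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical Callable variant of CalcMvg: reads the shared array, returns a new 1-D array.
class CalcMvgTask implements Callable<double[]> {
    private final double[][][] periodRates; // shared reference, never modified here
    private final int period;

    CalcMvgTask(double[][][] periodRates, int period) {
        this.periodRates = periodRates;
        this.period = period;
    }

    @Override
    public double[] call() {
        // Placeholder calculation: average of each column over all rows of one period.
        double[][] rows = periodRates[period];
        double[] result = new double[rows[0].length];
        for (double[] row : rows) {
            for (int c = 0; c < result.length; c++) {
                result[c] += row[c];
            }
        }
        for (int c = 0; c < result.length; c++) {
            result[c] /= rows.length;
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        double[][][] periodRates = new double[10][500][4];
        periodRates[1][0][2] = 500.0;
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Future<double[]> future = pool.submit(new CalcMvgTask(periodRates, 1));
            double[] averages = future.get(); // blocks until the task has finished
            System.out.println(averages[2]);
        } finally {
            pool.shutdown();
        }
    }
}
```

Future.get() publishes the result safely to the calling thread, so no extra synchronized getter is needed for the returned array.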

Junit test the correct number of threads has started

So I have a method that starts five threads. I want to write a unit test just to check that the five threads have been started. How do I do that? Sample code is much appreciated.

Instead of writing your own method to start threads, why not use an Executor, which can be injected into your class? Then you can easily test it by passing in a dummy Executor.
Edit: Here's a simple example of how your code could be structured:
public class ResultCalculator {
private final ExecutorService pool;
private final List<Future<Integer>> pendingResults;
public ResultCalculator(ExecutorService pool) {
this.pool = pool;
this.pendingResults = new ArrayList<Future<Integer>>();
}
public void startComputation() {
for (int i = 0; i < 5; i++) {
Future<Integer> future = pool.submit(new Robot(i));
pendingResults.add(future);
}
}
public int getFinalResult() throws ExecutionException, InterruptedException {
int total = 0;
for (Future<Integer> robotResult : pendingResults) {
total += robotResult.get();
}
return total;
}
}
public class Robot implements Callable<Integer> {
private final int input;
public Robot(int input) {
this.input = input;
}
@Override
public Integer call() throws InterruptedException {
// Some very long calculation
Thread.sleep(10000);
return input * input;
}
}
And here's how you'd call it from your main():
public static void main(String[] args) throws Exception {
// Note that the number of threads is now specified here
ExecutorService pool = Executors.newFixedThreadPool(5);
ResultCalculator calc = new ResultCalculator(pool);
try {
calc.startComputation();
// Maybe do something while we're waiting
System.out.printf("Result is: %d\n", calc.getFinalResult());
} finally {
pool.shutdownNow();
}
}
And here's how you'd test it (assuming JUnit 4 and Mockito):
@Test
@SuppressWarnings("unchecked")
public void testStartComputationAddsRobotsToQueue() {
ExecutorService pool = mock(ExecutorService.class);
Future<Integer> future = mock(Future.class);
when(pool.submit(any(Callable.class))).thenReturn(future);
ResultCalculator calc = new ResultCalculator(pool);
calc.startComputation();
verify(pool, times(5)).submit(any(Callable.class));
}
Note that all this code is just a sketch which I have not tested or even tried to compile yet. But it should give you an idea of how the code can be structured.
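If a mocking library is not available, the "dummy Executor" can also be written by hand. The sketch below assumes the class under test only needs execute(Runnable), so it depends on the narrower Executor interface; FiveTaskStarter and RecordingExecutor are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical class under test: starts five tasks through an injected Executor.
class FiveTaskStarter {
    private final Executor executor;

    FiveTaskStarter(Executor executor) {
        this.executor = executor;
    }

    void start() {
        for (int i = 0; i < 5; i++) {
            final int n = i;
            executor.execute(() -> System.out.println("task " + n));
        }
    }
}

// Dummy Executor for tests: records the tasks instead of running them on threads.
class RecordingExecutor implements Executor {
    final List<Runnable> submitted = new ArrayList<>();

    @Override
    public void execute(Runnable task) {
        submitted.add(task);
    }
}

class FiveTaskStarterDemo {
    public static void main(String[] args) {
        RecordingExecutor executor = new RecordingExecutor();
        new FiveTaskStarter(executor).start();
        System.out.println("Tasks submitted: " + executor.submitted.size());
    }
}
```

In a JUnit test the last line would become an assertion on executor.submitted.size(); production code would pass in a real pool instead.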

Rather than saying you are going to "test the five threads have been started", it would be better to step back and think about what the five threads are actually supposed to do. Then test to make sure that that "something" is actually being done.
If you really just want to test that the threads have been started, there are a few things you could do. Are you keeping references to the threads somewhere? If so, you could retrieve the references, count them, and call isAlive() on each one (checking that it returns true).
The Java platform also has methods for this: Thread.activeCount() returns an estimate of the number of active threads in the current thread's thread group and its subgroups, and ThreadGroup.enumerate(Thread[]) copies those threads into an array. Both give only a snapshot, so they are better suited to debugging than to exact assertions.
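The "keep the references and call isAlive()" idea can be sketched as follows. The class AliveThreadCheck is hypothetical, and the latch is there only to keep the threads alive until they have been counted:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class AliveThreadCheck {
    // Starts `count` threads that block on a latch and returns how many report isAlive().
    static int startAndCountAlive(int count) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await();              // stay alive until the check is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i);
            threads.add(t);
            t.start();
        }

        int alive = 0;
        for (Thread t : threads) {
            if (t.isAlive()) {                    // started and not yet terminated
                alive++;
            }
        }

        release.countDown();                      // let the workers finish
        for (Thread t : threads) {
            t.join();
        }
        return alive;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Alive threads: " + startAndCountAlive(5));
    }
}
```

Without something like the latch, a short task may already have finished by the time isAlive() is called, which would make such a test flaky.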
More thoughts in response to your comment
If your code is as simple as new Thread(runnable).start(), I wouldn't bother to test that the threads are actually starting. If you do so, you're basically just testing that the Java platform works (it does). If your code for initializing and starting the threads is more complicated, I would stub out the thread.start() part and make sure that the stub is called the desired number of times, with the correct arguments, etc.
Regardless of what you do about that, I would definitely test that the task is completed correctly when running in multithreaded mode. From personal experience, I can tell you that as soon as you start doing anything remotely complicated with threads, it is devilishly easy to get subtle bugs which only show up under certain conditions, and perhaps only occasionally. Dealing with the complexity of multithreaded code is a very slippery slope.
Because of that, if you can do it, I would highly recommend you do more than just simple unit testing. Do stress tests where you run your task with many threads, on a multicore machine, on very large data sets, and make sure all the answers are exactly as expected.
Also, although you are expecting a performance increase from using threads, I highly recommend that you benchmark your program with varying numbers of threads, to make sure that the desired performance increase is actually achieved. Depending on how your system is designed, it's possible to wind up with concurrency bottlenecks which may make your program hardly faster with threads than without. In some cases, it can even be slower!

Which ThreadPool in Java should I use?

There is a huge number of tasks.
Each task belongs to a single group. The requirement is that each group of tasks should be executed serially, just as if executed in a single thread, and that throughput should be maximized in a multi-core (or multi-CPU) environment. Note: there is also a huge number of groups, proportional to the number of tasks.
The naive solution is to use a ThreadPoolExecutor and synchronize (or lock). However, threads would block each other and throughput would not be maximized.
Any better idea? Or is there a third-party library that satisfies the requirement?

A simple approach would be to "concatenate" all group tasks into one super task, thus making the sub-tasks run serially. But this will probably cause delay in other groups that will not start unless some other group completely finishes and makes some space in the thread pool.
As an alternative, consider chaining a group's tasks. The following code illustrates it:
public class MultiSerialExecutor {
private final ExecutorService executor;
public MultiSerialExecutor(int maxNumThreads) {
executor = Executors.newFixedThreadPool(maxNumThreads);
}
public void addTaskSequence(List<Runnable> tasks) {
executor.execute(new TaskChain(tasks));
}
public void shutdown() {
executor.shutdown();
}
private class TaskChain implements Runnable {
private final List<Runnable> seq;
private int ind;
public TaskChain(List<Runnable> seq) {
this.seq = seq;
}
@Override
public void run() {
seq.get(ind++).run(); //NOTE: No special error handling; assumes seq is not empty
if (ind < seq.size())
executor.execute(this);
}
}
}
The advantage is that no extra resource (thread/queue) is being used, and that the granularity of tasks is better than the one in the naive approach. The disadvantage is that all group's tasks should be known in advance.
--edit--
To make this solution generic and complete, you may want to decide on error handling (i.e. whether a chain continues even if an error occurs). It would also be a good idea to implement ExecutorService and delegate all calls to the underlying executor.

I would suggest using task queues:
For every group of tasks, create a queue and insert all tasks from that group into it.
Now all your queues can be executed in parallel, while the tasks inside one queue are executed serially.
A quick Google search suggests that the Java API has no task / thread queues by itself. However, there are many tutorials available on coding one. Everyone, feel free to list good tutorials / implementations if you know some.
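One possible shape for such per-group queues is sketched below. SerialQueue is adapted from the SerialExecutor example in the java.util.concurrent.Executor Javadoc; GroupedExecutor and the String group key are assumptions made for this illustration:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;

// One queue per group: runs its tasks one at a time, in order, on a shared pool.
class SerialQueue implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor pool;
    private Runnable active; // the task currently handed to the pool, or null

    SerialQueue(Executor pool) {
        this.pool = pool;
    }

    @Override
    public synchronized void execute(Runnable task) {
        tasks.add(() -> {
            try {
                task.run();
            } finally {
                scheduleNext(); // hand the next task of this group to the pool
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        active = tasks.poll();
        if (active != null) {
            pool.execute(active);
        }
    }
}

// Routes each task to the queue of its group; different groups run in parallel.
class GroupedExecutor {
    private final ConcurrentMap<String, SerialQueue> queues = new ConcurrentHashMap<>();
    private final Executor pool;

    GroupedExecutor(Executor pool) {
        this.pool = pool;
    }

    void submit(String group, Runnable task) {
        queues.computeIfAbsent(group, g -> new SerialQueue(pool)).execute(task);
    }
}
```

No pool thread is ever blocked waiting for a group, since a group only occupies a thread while one of its tasks is running. This sketch never removes idle queues from the map, which would matter with a very large number of groups.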

I mostly agree with Dave's answer, but if you need to slice CPU time across all "groups", i.e. all task groups should progress in parallel, you might find this kind of construct useful (using removal as a "lock"). This worked fine in my case, although I imagine it tends to use more memory:
class TaskAllocator {
private final ConcurrentLinkedQueue<Queue<Runnable>> entireWork
= childQueuePerTaskGroup();
public Queue<Runnable> lockTaskGroup(){
return entireWork.poll();
}
public void release(Queue<Runnable> taskGroup){
entireWork.offer(taskGroup);
}
}
and
class DoWork implements Runnable {
private final TaskAllocator allocator;
public DoWork(TaskAllocator allocator){
this.allocator = allocator;
}
public void run(){
for(;;){
Queue<Runnable> taskGroup = allocator.lockTaskGroup();
if(taskGroup==null){
//No more work
return;
}
Runnable work = taskGroup.poll();
if(work == null){
//This group is done
continue;
}
//Do work, but never forget to release the group to
// the allocator.
try {
work.run();
} finally {
allocator.release(taskGroup);
}
}//for
}
}
You can then use an optimum number of threads to run the DoWork task. It's a kind of round-robin load balancing.
You can even do something more sophisticated by using this instead of a simple queue in TaskAllocator (task groups with more tasks remaining tend to get executed first):
ConcurrentSkipListSet<MyQueue<Runnable>> sophisticatedQueue =
new ConcurrentSkipListSet<>(new SophisticatedComparator());
where SophisticatedComparator is
class SophisticatedComparator implements Comparator<MyQueue<Runnable>> {
public int compare(MyQueue<Runnable> o1, MyQueue<Runnable> o2){
int diff = o2.size() - o1.size();
if(diff==0){
//This is crucial. You must assign unique ids to your
//Subqueue and break the equality if they happen to have same size.
//Otherwise your queues will disappear...
return o1.id - o2.id;
}
return diff;
}
}

Actors are another solution for this type of problem.
Scala has actors, and so does Java, where they are provided by Akka.

I had a problem similar to yours, and I used an ExecutorCompletionService, which works with an Executor to complete collections of tasks.
Here is an extract from the java.util.concurrent API documentation (Java 7):
Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers)
throws InterruptedException, ExecutionException {
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
for (Callable<Result> s : solvers)
ecs.submit(s);
int n = solvers.size();
for (int i = 0; i < n; ++i) {
Result r = ecs.take().get();
if (r != null)
use(r);
}
}
So, in your scenario, every task will be a single Callable<Result>, and tasks will be grouped in a Collection<Callable<Result>>.
Reference:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorCompletionService.html
