Assigning each thread a particular range - java

I am using ThreadPoolExecutor in my multithreaded program. I want each thread to have its own range of IDs. If ThreadSize is 10, Start = 1 and End = 1000, then each thread gets a range of 100 IDs (basically the size of the range divided by the thread count) that it can use without stepping on other threads.
Thread1 will use 1 to 100 (id's)
Thread2 will use 101 to 200 (id's)
Thread3 will use 201 to 300 (id's)
-----
-----
Thread10 will use 901 to 1000
I know the basic logic; it can be like this:
Each thread gets `N = (End - Start + 1) / ThreadSize` numbers.
Thread number `i` (counting from 0) gets the range from `Start + i*N` to `Start + i*N + N - 1`, both inclusive.
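The formula above can be sketched as a small helper. This is only an illustration: the class name `RangeSplitter` and the method `split` are hypothetical, and the handling of a remainder (when the number of IDs is not evenly divisible by the thread count) is an assumption that goes beyond the formula, which covers the evenly divisible case only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original code.
final class RangeSplitter {

    /** Returns one {from, to} pair (both inclusive) per worker. */
    static List<int[]> split(int start, int end, int workers) {
        int total = end - start + 1;
        int base = total / workers;       // N in the formula above
        int remainder = total % workers;  // IDs left over when the division is not exact
        List<int[]> ranges = new ArrayList<>(workers);
        int from = start;
        for (int i = 0; i < workers; i++) {
            // Assumption: the first `remainder` workers take one extra ID each.
            int size = base + (i < remainder ? 1 : 0);
            ranges.add(new int[] {from, from + size - 1});
            from += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : split(1, 1000, 10)) {
            System.out.println(r[0] + " to " + r[1]);
        }
    }
}
```

Each pair could then be handed to the constructor of one task before the task is submitted to the pool, so the range belongs to the task rather than to whichever pool thread happens to run it.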
As I am working with ThreadPoolExecutor for the first time, I am not sure where to put this logic in my code so that each thread uses a predefined range of IDs without stepping on other threads. Any suggestions will be appreciated.
public class CommandExecutor {
private List<Command> commands;
ExecutorService executorService;
private static int noOfThreads = 3;
// Singleton
private static CommandExecutor instance;
public static synchronized CommandExecutor getInstance() {
if (instance == null) {
instance = new CommandExecutor();
}
return instance;
}
private CommandExecutor() {
try {
executorService = Executors.newFixedThreadPool(noOfThreads);
} catch(Exception e) {
System.out.println(e);
}
}
// Get the next command to execute based on percentages
private synchronized Command getNextCommandToExecute() {
}
// Runs the next command
public synchronized void runNextCommand() {
// Return early unless there is a free thread in the thread pool
if (!(((ThreadPoolExecutor) executorService).getActiveCount() < noOfThreads))
return;
// Get command to execute
Command nextCommand = getNextCommandToExecute();
// Create a runnable wrapping that command
Task nextCommandExecutorRunnable = new Task(nextCommand);
executorService.submit(nextCommandExecutorRunnable); // Submit it for execution
}
// Implementation of runnable (the real unit level command executor)
private static final class Task implements Runnable {
private Command command;
public Task(Command command) {
this.command = command;
}
public void run() {
// Run the command
command.run();
}
}
// A wrapper class that is invoked at a fixed frequency and asks CommandExecutor to execute the next command (if any free threads are available)
private static final class CoreTask implements Runnable {
public void run() {
CommandExecutor commandExecutor = CommandExecutor.getInstance();
commandExecutor.runNextCommand();
}
}
// Main Method
public static void main(String args[]) {
// Scheduling the execution of any command every 10 milli-seconds
Runnable coreTask = new CoreTask();
ScheduledFuture<?> scheduledFuture = Executors.newScheduledThreadPool(1).scheduleWithFixedDelay(coreTask, 0, 10, TimeUnit.MILLISECONDS);
}
}

Whether this is a good idea or not, I will leave for you to decide. But to give you a hand, I wrote a little program that does what you want; in my case I am just summing over the IDs.
Here is the code:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Driver {
    private static final int N = 5;

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        int startId = 1;
        int endId = 1000;
        // assumes the number of ids is evenly divisible by N
        int range = (1 + endId - startId) / N;
        ExecutorService ex = Executors.newFixedThreadPool(N);
        List<Future<Integer>> futures = new ArrayList<Future<Integer>>(N);
        // submit all N tasks
        for (int i = startId; i < endId; i += range) {
            futures.add(ex.submit(new SumCallable(i, range + i - 1)));
        }
        // get all the results
        int result = 0;
        for (int i = 0; i < futures.size(); i++) {
            result += futures.get(i).get();
        }
        System.out.println("Result of summing over everything is : " + result);
        // without this the non-daemon pool threads keep the JVM alive
        ex.shutdown();
    }

    private static class SumCallable implements Callable<Integer> {
        private int from, to, count;
        // only touched from the constructor, which runs on the main thread
        private static int countInstance = 1;

        public SumCallable(int from, int to) {
            this.from = from;
            this.to = to;
            this.count = countInstance;
            System.out.println("Thread " + countInstance++ + " will use " + from + " to " + to);
        }

        // example implementation: sums over all integers between from and to, inclusive.
        @Override
        public Integer call() throws Exception {
            int result = 0;
            for (int i = from; i <= to; i++) {
                result += i;
            }
            System.out.println("Thread " + count + " got result : " + result);
            return result;
        }
    }
}
which produces the following output (notice that in true multi-thread fashion, you have print statements in random order, as the threads are executed in whatever order the system decides):
Thread 1 will use 1 to 200
Thread 2 will use 201 to 400
Thread 1 got result : 20100
Thread 3 will use 401 to 600
Thread 2 got result : 60100
Thread 4 will use 601 to 800
Thread 3 got result : 100100
Thread 5 will use 801 to 1000
Thread 4 got result : 140100
Thread 5 got result : 180100
Result of summing over everything is : 500500

Related

Java Set multithread processing

I'm trying out multithreaded processing of a data set that contains the numbers from 1 to N. For example, I want to sum all these numbers:
1) Hold the sum (result).
public class ResultHolder {
public static AtomicLong total_time = new AtomicLong(0);
public static Long sum = 0l;
public Long getSum() {
return sum;
} // END: getSum()
@PostConstruct
public void init() {
} // END: init()
public void setSum(Long sum) {
this.sum = sum;
} // END: setSum()
public void printSum() {
System.out.println("Sum is " + sum);
}
public void clearSum() {
sum = 0l;
}
} // ENDC: ResultHolder
2) Process part of the number set:
public class SumProcessor {
private static int global_id = 0;
final public int processor_id;
private final ArrayList<Long> numbers;
private Long processor_sum = 0l;
@Autowired
private final ResultHolder sumHoldder = null;
public SumProcessor(ArrayList<Long> numbers) {
this.numbers = numbers;
processor_id = ++global_id;
} // END: constructor
public void work() throws Exception {
long t1 = new java.util.Date().getTime();
int i = 0;
try {
if (numbers == null) throw new Exception("Failed to get the array of numbers.");
for (i = 0; i < numbers.size(); i++) {
Long o = null;
try {
o = numbers.get(i);
if (o == null) throw new Exception("no number");
} catch (Exception e) {
throw new Exception("Error extracting number from the array: " + e);
}
processor_sum += o;
} // END: for
if (sumHoldder == null) throw new Exception("No sum holder");
synchronized (sumHoldder) {
sumHoldder.setSum(sumHoldder.getSum() + processor_sum);
}
long t2 = new java.util.Date().getTime();
this.sumHoldder.total_time.addAndGet(t2 - t1);
} catch (Exception e) {
System.out.println("Work() error (" + i + ") " + e);
}
return;
} //END: method1
@PostConstruct
public void init() {
System.out.println("Initializated B: " + this);
} //END: method2
@PreDestroy
public void destroy() {
System.out.println("Destroy B: " + this);
} //END: method3
@Override
public String toString() {
return "" +
"Processor " + processor_id + " " +
"contain " + numbers.size() + " " +
"numbers from " + numbers.get(0) +
" to " + numbers.get(numbers.size() - 1);
} //END: toString()
} //END: class SumProcessor
3) Very simple profiler (calculates processing time)
@Aspect
public class MethodLoggerBasic {
@Pointcut("execution(* *.work(..))")
void around_work() {};
@Around("around_work()")
public void logMethodName(ProceedingJoinPoint joinPoint) throws Throwable {
long starttime = new Date().getTime();
joinPoint.proceed();
long endtime = new Date().getTime();
long time = endtime - starttime;
MainApp.time += time;
} // END:
} // ENDC
4) Main program (can run the processing sequentially or in parallel)
public class MainApp {
static AnnotationConfigApplicationContext context;
public static long time = 0l;
public final static int SIZE = 40_000_000;
public final static int DIVIDE_FACTOR = 4;
public static ArrayList<Long>[] numbers = new ArrayList[DIVIDE_FACTOR];
public static ArrayList<SumProcessor> processors = new ArrayList<>();
public static void main(String[] args) throws Exception {
context = new AnnotationConfigApplicationContext(myConfig.class);
// form 4 datasets
int part_size = SIZE / DIVIDE_FACTOR;
int i;
int j;
for (j = 0; j < DIVIDE_FACTOR; j++) {
numbers[j] = new ArrayList<>();
for (i = 0; i < (int) part_size; i++) {
numbers[j].add(((j * part_size) + i + 1l));
}
}
// create 4 processors (bean)
for (i = 0; i < DIVIDE_FACTOR; i++) {
SumProcessor bean = context.getBean(SumProcessor.class, numbers[i]);
if (bean == null) throw new Exception("Error receiving bean SumProcessor.class");
processors.add(bean);
}
// create 4 threads for the processors
thread_process thread1 = new thread_process();
thread_process thread2 = new thread_process();
thread_process thread3 = new thread_process();
thread_process thread4 = new thread_process();
ResultHolder a;
a = context.getBean(ResultHolder.class);
try {
boolean isByPool = true; // flag
time = 0;
if (isByPool) {
System.out.println("-------------------");
System.out.println("Multithread compute");
System.out.println("-------------------");
ExecutorService pool = new ThreadPoolExecutor(
4,
4,
0,
TimeUnit.MICROSECONDS,
new ArrayBlockingQueue<>(4)
);
List<Callable<Boolean>> tasks = new ArrayList();
tasks.add(thread1);
tasks.add(thread2);
tasks.add(thread3);
tasks.add(thread4);
pool.invokeAll(tasks);
pool.shutdown();
pool.awaitTermination(60, TimeUnit.SECONDS);
} else {
thread1.start();
thread2.start();
thread3.start();
thread4.start();
thread1.join();
thread2.join();
thread3.join();
thread4.join();
}
a.printSum();
a.clearSum();
System.out.println("total time is " + a.total_time);
System.out.println("basic time is " + MainApp.time);
System.out.println("-------------");
System.out.println("Single thread");
System.out.println("-------------");
ArrayList<Long> numbers_tolal = new ArrayList<>();
for (i = 0; i < SIZE; i++) {
numbers_tolal.add((i + 1l));
}
SumProcessor sumProcessor = context.getBean(SumProcessor.class, numbers_tolal);
a.total_time.set(0l);
time = 0l;
sumProcessor.work();
a.printSum();
System.out.println("total time is " + a.total_time);
System.out.println("basic time is " + MainApp.time);
} catch (Exception e) {
throw new Exception("MainApp error: " + e);
}
context.close();
} // END: main
} // END: class MainApp
5) Thread process:
public class thread_process extends Thread implements Callable, Runnable {
static int index = 0;
@Override
public void run() {
try {
SumProcessor next = MainApp.processors.get(index++);
if (next == null) {
System.out.println("No processor");
System.exit(-1);
}
next.work();
System.out.println("Thread " + this + " complete!");
} catch (Exception e) {
System.out.println("Error in thread " + this + ": " + e);
}
} //END: run()
@Override
public Boolean call() throws Exception {
run();
return true;
} //END: call()
}; //END: class thread_process
The output is:
Initializated B: Processor 1 contain 10000000 numbers from 1 to 10000000
Initializated B: Processor 2 contain 10000000 numbers from 10000001 to 20000000
Initializated B: Processor 3 contain 10000000 numbers from 20000001 to 30000000
Initializated B: Processor 4 contain 10000000 numbers from 30000001 to 40000000
-------------------
Multithread compute
-------------------
Thread Thread[Thread-3,5,main] complete!
Thread Thread[Thread-4,5,main] complete!
Thread Thread[Thread-2,5,main] complete!
Thread Thread[Thread-1,5,main] complete!
Sum is 800000020000000
total time is 11254
basic time is 11254
-------------
Single thread
-------------
Initializated B: Processor 5 contain 40000000 numbers from 1 to 40000000
Sum is 800000020000000
total time is 6995
basic time is 6995
Is there a method to make it faster in parallel than linear? Or do I perhaps not need to fork this task? Or maybe my profiler is not so good...
You are trying to perform a sequential task using multithreading, which isn't a correct use of multithreading. Here you have a resource on which you need to perform some work. You use multiple threads to distribute that work, but at the same time you block one thread while another thread is using the resource. So why have worker threads in the first place if you don't want them to access the resource in parallel?
If it is not necessary, you can drop the Set implementation of the dataset and use a List or arrays, where you can access elements by index without blocking the worker threads.
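One way to read the advice above is to let each task sum its own index range of a shared array and return a partial result, so that no lock is needed while the work runs. The sketch below makes that concrete under some assumptions: it uses a primitive `long[]` instead of the question's `ArrayList<Long>`, it leaves out the Spring beans and the aspect, and the names `PartialSums` and `parallelSum` are made up for illustration. Whether it ends up faster than a single thread depends on the data size and the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class, not part of the question's project.
final class PartialSums {

    /** Splits data into at most `parts` index ranges and sums them in parallel. */
    static long parallelSum(long[] data, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunk = (data.length + parts - 1) / parts; // ceiling division
            for (int p = 0; p < parts; p++) {
                final int from = Math.min(data.length, p * chunk);
                final int to = Math.min(data.length, from + chunk);
                // Each task reads only its own slice and keeps a local sum.
                tasks.add(() -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += data[i];
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partial sum failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1L;
        }
        System.out.println("Sum is " + parallelSum(data, 4));
    }
}
```

The partial sums are combined on the calling thread after the tasks finish, which is why no `synchronized` block appears anywhere in the sketch.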
Update 1: Just add one more line after the pool.shutdown() call.
pool.shutdown(); // stops accepting new tasks; tasks that were already submitted still run to completion
pool.awaitTermination(60, TimeUnit.SECONDS); // blocks the main thread until the pool finishes or the timeout elapses
// ...
// now you can do your single thread task
Also, don't create too many threads, since a single thread is fast enough to handle a million array elements.
Update 2: I don't know why, but moving the single-thread run out of the try block seems to give me the expected result.
public class MainApp {
static AnnotationConfigApplicationContext context;
public static long time = 0;
public final static int SIZE = 28_000_000;
public final static int DIVIDE_FACTOR = 4;
public static ArrayList<Long>[] numbers = new ArrayList[DIVIDE_FACTOR];
public static ArrayList<SumProcessor> processors = new ArrayList<>();
public static void main(String[] args) throws Exception {
context = new AnnotationConfigApplicationContext(AppConfig.class);
ResultHolder a = context.getBean(ResultHolder.class);
// form 4 datasets
int part_size = SIZE / DIVIDE_FACTOR;
int i;
int j;
for (j = 0; j < DIVIDE_FACTOR; j++) {
numbers[j] = new ArrayList<>(part_size);
for (i = 0; i < (int) part_size; i++) {
numbers[j].add(((j * part_size) + i + 1l));
}
}
// create 4 processors (bean)
for (i = 0; i < DIVIDE_FACTOR; i++) {
SumProcessor bean = context.getBean(SumProcessor.class, numbers[i]);
if (bean == null) throw new Exception("Error receive bean SumProcessor.class");
processors.add(bean);
}
// create 4 threads for the processors
thread_process thread1 = new thread_process();
thread_process thread2 = new thread_process();
thread_process thread3 = new thread_process();
thread_process thread4 = new thread_process();
try {
boolean isByThread = true; // flag
time = 0;
System.out.println("-------------------");
System.out.println("Multithread compute");
System.out.println("-------------------");
ExecutorService pool = new ThreadPoolExecutor(
4,
4,
0,
TimeUnit.MICROSECONDS,
new LinkedBlockingDeque<>(4) // or ArrayBlockingQueue<>(4)
);
List<Callable<Boolean>> tasks = new ArrayList();
tasks.add(thread1);
tasks.add(thread2);
tasks.add(thread3);
tasks.add(thread4);
List<Future<Boolean>> futures = pool.invokeAll(tasks);
pool.shutdown();
pool.awaitTermination(60, TimeUnit.SECONDS);
System.out.println("Time is: " + time);
a.printSum();
a.clearSum();
time = 0;
} catch (Exception e) {
throw new Exception("MainApp error: " + e);
} // <---- moved single thread out of try block
ArrayList<Long> numbers_total = new ArrayList<>(SIZE);
for (i = 0; i < SIZE; i++) {
numbers_total.add((i + 1l));
}
System.out.println("-------------");
System.out.println("Single thread");
System.out.println("-------------");
SumProcessor sumProcessor = context.getBean(SumProcessor.class, numbers_total);
sumProcessor.work();
System.out.println("Time is: " + time);
a.printSum();
a.clearSum();
time = 0;
context.close();
} // END: main
}
Output:
Initialized B: Processor 1 contain 7000000 numbers from 1 to 7000000
Initialized B: Processor 2 contain 7000000 numbers from 7000001 to 14000000
Initialized B: Processor 3 contain 7000000 numbers from 14000001 to 21000000
Initialized B: Processor 4 contain 7000000 numbers from 21000001 to 28000000
-------------------
Multithread compute
-------------------
Thread[Thread-3,5,main] complete task.
Thread[Thread-2,5,main] complete task.
Thread[Thread-1,5,main] complete task.
Thread[Thread-4,5,main] complete task.
Time is: 5472
Sum is 392000014000000
-------------
Single thread
-------------
Initialized B: Processor 5 contain 28000000 numbers from 1 to 28000000
Time is: 10653
Sum is 392000014000000
Output [Reverse order]:
-------------
Single thread
-------------
Initialized B: Processor 1 contain 28000000 numbers from 1 to 28000000
Time is: 2265
Sum is 392000014000000
Initialized B: Processor 2 contain 7000000 numbers from 1 to 7000000
Initialized B: Processor 3 contain 7000000 numbers from 7000001 to 14000000
Initialized B: Processor 4 contain 7000000 numbers from 14000001 to 21000000
Initialized B: Processor 5 contain 7000000 numbers from 21000001 to 28000000
-------------------
Multithread compute
-------------------
Thread[Thread-2,5,main] complete task.
Thread[Thread-4,5,main] complete task.
Thread[Thread-1,5,main] complete task.
Thread[Thread-3,5,main] complete task.
Time is: 2115
Sum is 392000014000000

Thread Pool per key in Java

Suppose that you have a grid G of n x m cells, where n and m are huge.
Further, suppose that we have numerous tasks, where each task belongs to a single cell in G, and the tasks should be executed in parallel (in a thread pool or other resource pool).
However, tasks belonging to the same cell must be done serially, that is, each should wait for the previous task in the same cell to finish.
How can I solve this issue?
I've searched and tried several thread pools (Executors, Thread), but no luck.
Minimum Working Example
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class MWE {
public static void main(String[] args) {
ExecutorService threadPool = Executors.newFixedThreadPool(16);
Random r = new Random();
for (int i = 0; i < 10000; i++) {
int nx = r.nextInt(10);
int ny = r.nextInt(10);
Runnable task = new Runnable() {
public void run() {
try {
System.out.println("Task is running");
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
};
threadPool.submit(new Thread(task)); // Should use nx,ny here somehow
}
}
}
You can create a list of n single-thread executors (Executors.newFixedThreadPool(1)).
Then submit each task to the corresponding executor by using a hash function.
E.g. threadPool[key % n].submit(task).
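The suggestion above could look roughly like the following. This is a sketch, not a tuned implementation: `StripedExecutor`, its method names and the `31 * x + y` hash are all made up for illustration. Note the trade-off that comes with the approach: two different cells that hash to the same lane are also serialized with respect to each other.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: hashes each grid cell onto one of a fixed number of single-thread lanes.
final class StripedExecutor {
    private final ExecutorService[] lanes;

    StripedExecutor(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Tasks for the same (x, y) always land on the same lane, so they run in submission order. */
    void submit(int x, int y, Runnable task) {
        int key = 31 * x + y; // assumption: any stable hash of the cell works here
        lanes[Math.floorMod(key, lanes.length)].execute(task);
    }

    /** Waits up to the timeout for each lane; returns false on timeout or interruption. */
    boolean shutdownAndWait(long timeout, TimeUnit unit) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(timeout, unit)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Submits 100 tasks for one cell and returns the order in which they ran. */
    static List<Integer> demo() {
        StripedExecutor executor = new StripedExecutor(4);
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 100; i++) {
            final int id = i;
            executor.submit(3, 7, () -> order.add(id));
        }
        executor.shutdownAndWait(10, TimeUnit.SECONDS);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the number of lanes is fixed, the memory cost does not grow with n and m; the price is reduced parallelism whenever unrelated cells collide on a lane.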
A callback mechanism with a synchronized block could work efficiently here.
I have previously answered a similar question here.
There are some limitations (see the linked answer), but it is simple enough to keep track of what is going on (good maintainability).
I have adapted the source code and made it more efficient for your case, where most tasks will be executed in parallel (since n and m are huge) but on occasion must be serial (when a task is for the same point in the grid G).
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.ReentrantLock;
// Adapted from https://stackoverflow.com/a/33113200/3080094
public class GridTaskExecutor {
public static void main(String[] args) {
final int maxTasks = 10_000;
final CountDownLatch tasksDone = new CountDownLatch(maxTasks);
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(16);
try {
GridTaskExecutor gte = new GridTaskExecutor(executor);
Random r = new Random();
for (int i = 0; i < maxTasks; i++) {
final int nx = r.nextInt(10);
final int ny = r.nextInt(10);
Runnable task = new Runnable() {
public void run() {
try {
// System.out.println("Task " + nx + " / " + ny + " is running");
Thread.sleep(1);
} catch (Exception e) {
e.printStackTrace();
} finally {
tasksDone.countDown();
}
}
};
gte.addTask(task, nx, ny);
}
tasksDone.await();
System.out.println("All tasks done, task points remaining: " + gte.size());
} catch (Exception e) {
e.printStackTrace();
} finally {
executor.shutdownNow();
}
}
private final Executor executor;
private final Map<Long, List<CallbackPointTask>> tasksWaiting = new HashMap<>();
// make lock fair so that adding and removing tasks is balanced.
private final ReentrantLock lock = new ReentrantLock(true);
public GridTaskExecutor(Executor executor) {
this.executor = executor;
}
public void addTask(Runnable r, int x, int y) {
Long point = toPoint(x, y);
CallbackPointTask pr = new CallbackPointTask(point, r);
boolean runNow = false;
lock.lock();
try {
List<CallbackPointTask> pointTasks = tasksWaiting.get(point);
if (pointTasks == null) {
if (tasksWaiting.containsKey(point)) {
pointTasks = new LinkedList<CallbackPointTask>();
pointTasks.add(pr);
tasksWaiting.put(point, pointTasks);
} else {
tasksWaiting.put(point, null);
runNow = true;
}
} else {
pointTasks.add(pr);
}
} finally {
lock.unlock();
}
if (runNow) {
executor.execute(pr);
}
}
private void taskCompleted(Long point) {
lock.lock();
try {
List<CallbackPointTask> pointTasks = tasksWaiting.get(point);
if (pointTasks == null || pointTasks.isEmpty()) {
tasksWaiting.remove(point);
} else {
System.out.println(Arrays.toString(fromPoint(point)) + " executing task " + pointTasks.size());
executor.execute(pointTasks.remove(0));
}
} finally {
lock.unlock();
}
}
// for a general callback-task, see https://stackoverflow.com/a/826283/3080094
private class CallbackPointTask implements Runnable {
final Long point;
final Runnable original;
CallbackPointTask(Long point, Runnable original) {
this.point = point;
this.original = original;
}
@Override
public void run() {
try {
original.run();
} finally {
taskCompleted(point);
}
}
}
/** Amount of points with tasks. */
public int size() {
int l = 0;
lock.lock();
try {
l = tasksWaiting.size();
} finally {
lock.unlock();
}
return l;
}
// https://stackoverflow.com/a/12772968/3080094
public static long toPoint(int x, int y) {
return (((long)x) << 32) | (y & 0xffffffffL);
}
public static int[] fromPoint(long p) {
return new int[] {(int)(p >> 32), (int)p };
}
}
This is where systems like Akka make sense in the Java world. If both X and Y are large, you may want to process them using a message-passing mechanism rather than bunching them up in a huge chain of callbacks and futures. One actor has the list of tasks to be done and is handed a cell, and the actor eventually computes the result and persists it. If something fails in an intermediate step, it's not the end of the world.
If I get you right, you want to execute X tasks (X is very big) in Y queues (Y is much smaller than X).
Java 8 has the CompletableFuture class, which represents an (asynchronous) computation. Basically, it's Java's implementation of a Promise. Here is how you can organize a chain of computations (generic types omitted):
// start the queue with a "completed" task
CompletableFuture queue = CompletableFuture.completedFuture(null);
// append a first task to the queue
queue = queue.thenRunAsync(() -> System.out.println("first task running"));
// append a second task to the queue
queue = queue.thenRunAsync(() -> System.out.println("second task running"));
// ... and so on
When you use thenRunAsync(Runnable), tasks will be executed using a thread pool (there are other possibilities; see the API docs). You can also supply your own thread pool.
You can create Y of such chains (possibly keeping references to them in some table).
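Keeping one such chain per key could be sketched as follows. The class `KeyedChains` and its methods are hypothetical names chosen for this example. Two choices in it are assumptions rather than part of the answer: `handle(...)` is used so that a failed task does not stop later tasks for the same key, and entries are never removed from the map, which would need addressing if the number of distinct keys is very large.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: one CompletableFuture chain per key.
final class KeyedChains {
    // Maps each key to the last future appended to that key's chain.
    private final ConcurrentHashMap<Long, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final Executor executor;

    KeyedChains(Executor executor) {
        this.executor = executor;
    }

    /** Appends the task to the chain of its key; chains of different keys run independently. */
    CompletableFuture<Void> submit(long key, Runnable task) {
        // compute(...) is atomic per key, so concurrent submitters cannot fork a chain.
        return tails.compute(key, (k, tail) -> {
            CompletableFuture<Void> previous =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            // handle(...) discards the previous outcome so one failure does not break the chain.
            return previous.handle((result, error) -> null).thenRunAsync(task, executor);
        });
    }

    /** Submits 50 tasks for a single key and returns the order in which they ran. */
    static List<Integer> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            KeyedChains chains = new KeyedChains(pool);
            List<Integer> order = Collections.synchronizedList(new ArrayList<>());
            CompletableFuture<Void> last = CompletableFuture.completedFuture(null);
            for (int i = 0; i < 50; i++) {
                final int id = i;
                last = chains.submit(42L, () -> order.add(id));
            }
            last.join(); // the chain is sequential, so the last task finishing means all have run
            return order;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

For the grid in the question, the key could be derived from the cell coordinates, for example by packing x and y into a single long.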
This library should do the job: https://github.com/jano7/executor
int maxTasks = 16;
ExecutorService threadPool = Executors.newFixedThreadPool(maxTasks);
KeySequentialBoundedExecutor executor = new KeySequentialBoundedExecutor(maxTasks, threadPool);
Random r = new Random();
for (int i = 0; i < 10000; i++) {
int nx = r.nextInt(10);
int ny = r.nextInt(10);
Runnable task = new Runnable() {
public void run() {
try {
System.out.println("Task is running");
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
};
executor.execute(new KeyRunnable<>((ny * 10) + nx, task));
}
The Scala example given below demonstrates how the keys in a map can be processed in parallel while the values of each key are processed serially. Change it to Java syntax if you want to try it in Java (Scala uses JVM libraries). Basically, chain the tasks' futures to have them execute sequentially.
import java.util.concurrent.{CompletableFuture, ExecutorService, Executors, Future, TimeUnit}
import scala.collection.concurrent.TrieMap
import scala.collection.mutable.ListBuffer
import scala.util.Random
/**
* For a given Key-Value pair with tasks as values, demonstrates sequential execution of tasks
* within a key and parallel execution across keys.
*/
object AsyncThreads {
val cachedPool: ExecutorService = Executors.newCachedThreadPool
var initialData: Map[String, ListBuffer[Int]] = Map()
var processedData: TrieMap[String, ListBuffer[Int]] = TrieMap()
var runningTasks: TrieMap[String, CompletableFuture[Void]] = TrieMap()
/**
* synchronous execution across keys and values
*/
def processSync(key: String, value: Int, initialSleep: Long) = {
Thread.sleep(initialSleep)
if (key.equals("key_0")) {
println(s"${Thread.currentThread().getName} -> sleep: $initialSleep. Inserting key_0 -> $value")
}
processedData.getOrElseUpdate(key, new ListBuffer[Int]).addOne(value)
}
/**
* parallel execution across keys
*/
def processASync(key: String, value: Int, initialSleep: Long) = {
val task: Runnable = () => {
processSync(key, value, initialSleep)
}
// 1. Chain the futures for sequential execution within a key
val prevFuture = runningTasks.getOrElseUpdate(key, CompletableFuture.completedFuture(null))
runningTasks.put(key, prevFuture.thenRunAsync(task, cachedPool))
// 2. Parallel execution across keys and values
// cachedPool.submit(task)
}
def process(key: String, value: Int, initialSleep: Int): Unit = {
//processSync(key, value, initialSleep) // synchronous execution across keys and values
processASync(key, value, initialSleep) // parallel execution across keys
}
def main(args: Array[String]): Unit = {
checkDiff()
0.to(9).map(kIndex => {
var key = "key_" + kIndex
var values = ListBuffer[Int]()
initialData += (key -> values)
1.to(10).map(vIndex => {
values += kIndex * 10 + vIndex
})
})
println(s"before data:$initialData")
initialData.foreach(entry => {
entry._2.foreach(value => {
process(entry._1, value, Random.between(0, 100))
})
})
cachedPool.awaitTermination(5, TimeUnit.SECONDS)
println(s"after data:$processedData")
println("diff: " + (initialData.toSet diff processedData.toSet).toMap)
cachedPool.shutdown()
}
def checkDiff(): Unit = {
var a1: TrieMap[String, List[Int]] = new TrieMap()
a1.put("one", List(1, 2, 3, 4, 5))
a1.put("two", List(11, 12, 13, 14, 15))
var a2: TrieMap[String, List[Int]] = new TrieMap()
a2.put("one", List(2, 1, 3, 4, 5))
a2.put("two", List(11, 12, 13, 14, 15))
println("a1: " + a1)
println("a2: " + a2)
println("check.diff: " + (a1.toSet diff a2.toSet).toMap)
}
}

Java CountDownLatch with Threads

I am looking to learn about using the Java CountDownLatch to control the execution of a thread.
I have two classes. One is called Poller and the other is Referendum. The threads are created in the Referendum class and their run() methods are contained in the Poller class.
In the Poller and Referendum classes I have imported the java countdown latch via import java.util.concurrent.CountDownLatch.
I am mainly looking to understand why and where the *.countDown(); and *.await(); statements need to be applied, and also whether I have correctly initialised the CountDownLatch within the Poller constructor.
The complete code for the two classes is:
import java.util.concurrent.CountDownLatch;
public class Poller extends Thread
{
private String id; // pollster id
private int pollSize; // number of samples
private int numberOfPolls; // number of times to perform a poll
private Referendum referendum; // the referendum (implies voting population)
private int sampledVotes[]; // the counts of votes for or against
static CountDownLatch pollsAreComplete; //the CountDownLatch
/**
* Constructor for polling organisation.
* @param r A referendum on which the poller is gathering stats
* @param id The name of this polling organisation
* @param pollSize The size of the poll this poller will use
* @param pollTimes The number of times this poller will conduct a poll
* @param aLatch The count down latch that prevents the referendum results from being published
*/
public Poller(Referendum r, String id, int pollSize, int pollTimes, CountDownLatch aLatch)
{
this.referendum = r;
this.id = id;
this.pollSize = pollSize;
this.numberOfPolls = pollTimes;
this.pollsAreComplete = aLatch;
aLatch = new CountDownLatch(3);
// for and against votes to be counted
sampledVotes = new int[2];
}
// getter for numberOfPolls
public int getNumberOfPolls()
{
return numberOfPolls;
}
@Override
//to use the countdown latch
public void run()
{
for (int i = 0; i < getNumberOfPolls(); i++)
{
resetVotes();
pollVotes();
publishPollResults();
}
}
// make sure all sampledVotes are reset to zero
protected void resetVotes()
{
// initialise the vote counts in the poll
for (int i = 0; i < sampledVotes.length; i++)
{
sampledVotes[i] = 0;
}
}
// sampling the way citizens will vote in a referendum
protected void pollVotes()
{
for (int n = 0; n < pollSize; n++)
{
Citizen c = referendum.pickRandomCitizen();
//As things stand, pickRandomCitizen can return null
//because we haven't protected access to the collection
if (c != null)
{
sampledVotes[c.voteFor()]++;
}
}
}
protected void publishPollResults()
{
int vfor = 100 * sampledVotes[Referendum.FOR] / pollSize;
int vagainst = 100 * sampledVotes[Referendum.AGAINST] / pollSize;
System.out.printf("According to %-20s \t(", this.id + ":");
System.out.print("FOR " + vfor);
try
{
Thread.sleep(1000);
} catch (Exception e)
{
e.printStackTrace();
}
System.out.println(", AGAINST " + vagainst + ")");
}
}
And
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
public class Referendum
{
private List<Citizen> citizens; //voters
private List<Poller> pollers; //vote samplers
public static final int FOR = 0; //index for votes array
public static final int AGAINST = 1; //index for votes array
private int votes[]; //for and against votes counters
public Referendum(int population)
{
citizens = new LinkedList<Citizen>();
pollers = new LinkedList<Poller>();
// initialise the referendum with the population
for (int i = 0; i < population; i++)
{
Citizen c = new Citizen(i % 4); //suppose equal party membership
citizens.add(c);
}
votes = new int[2]; //in this example, only For or Against
}
public void addPoller(Poller np)
{
pollers.add(np);
}
public Citizen removeCitizen(int i)
{
return citizens.remove(i);
}
public List<Poller> getPollers()
{
return pollers;
}
public void startPollsWithLatch()
{
//create some poller threads that use a latch
addPoller(new Poller(this, "The Daily Day", 100, 3, Poller.pollsAreComplete));
addPoller(new Poller(this, "Stats people", 100, 3, Poller.pollsAreComplete));
addPoller(new Poller(this, "TV Pundits", 100, 3, Poller.pollsAreComplete));
// start the polls
for (Poller p : pollers)
{
p.start();
}
}
// pick a citizen randomly - access not controlled yet
public Citizen pickRandomCitizen()
{
//TODO add code to this method for part (b)
Citizen randomCitizen;
// first get a random index
int index = (int) (Math.random() * getPopulationSize());
randomCitizen = citizens.remove(index);
return randomCitizen;
}
// Counting the actual votes cast in the referendum
public void castVotes()
{
for (int h = 0; h < getPopulationSize(); h++)
{
Citizen c = citizens.get(h);
votes[c.voteFor()]++;
}
}
// tell the size of population
public int getPopulationSize()
{
return citizens.size();
}
// display the referendum results
public void revealResults()
{
System.out.println(" **** The Referendum Results are out! ****");
System.out.println("FOR");
System.out.printf("\t %.2f %%\n", 100.0 * votes[FOR] / getPopulationSize());
System.out.println("AGAINST");
System.out.printf("\t %.2f %%\n", 100.0 * votes[AGAINST] / getPopulationSize());
}
public static void main(String[] args)
{
// Initialise referendum. The number of people
// has been made smaller here to reduce the simulation time.
Referendum r = new Referendum(50000);
r.startPollsWithLatch();
r.castVotes();
// reveal the results of referendum
r.revealResults();
}
}
In a nutshell...
All threads must execute the publishPollResults(); statement BEFORE the revealResults(); is executed.
OK,
Now, if publishPollResults must be done by all pollers before revealResults, then you simply need to wait for the latch to reach zero in your reveal method. But to do so, the latch must be shared with the Referendum object as well, not only with the Pollers.
So let the referendum create the latch and pass it to the pollers:
public class Referendum
{
CountDownLatch pollsAreComplete;
...
public void startPollsWithLatch()
{
pollsAreComplete = new CountDownLatch(3); //create new latch to know when the voting is done
//create some poller threads that use a latch
addPoller(new Poller(this, "The Daily Day", 100, 3, pollsAreComplete)); //pass it to pollers
addPoller(new Poller(this, "Stats people", 100, 3, pollsAreComplete));
addPoller(new Poller(this, "TV Pundits", 100, 3, pollsAreComplete));
// start the polls
for (Poller p : pollers)
{
p.start();
}
}
public void revealResults() throws InterruptedException // await() can be interrupted, so callers must handle or declare this
{
pollsAreComplete.await(); // returns only once the latch count has reached 0
System.out.println(" **** The Referendum Results are out! ****");
....
}
}
So the Pollers should share the latch. You are using a static variable, which is OKish, but then you won't be able to use the Pollers with different referendums.
So it is better to make it an instance field and pass it in through the constructor. (You kind of started with the constructor, but then you assigned the value to a static variable, which makes no sense, and it was actually always null.)
public class Poller extends Thread
{
...
private CountDownLatch pollsAreComplete; //the CountDownLatch shared with referendum
public Poller(Referendum r, String id, int pollSize, int pollTimes, CountDownLatch aLatch)
{
...
this.pollsAreComplete = aLatch;
}
public void run()
{
for (int i = 0; i < getNumberOfPolls(); i++)
{
resetVotes();
pollVotes();
publishPollResults();
}
pollsAreComplete.countDown(); //voting is finished, let the referendum publish the results.
}
}
So once a Poller has finished its work it counts the latch down, and when all of them have done so the referendum can continue and print the results.
Mind you, every Poller thread publishes its results 3 times (because of the for loop), and only when all 3 cycles are done does it signal the referendum.
If you wanted 3 separate phases of the referendum, that would be very difficult to achieve with a latch, as it cannot be reset once it has reached 0.
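If separate phases are ever needed, a reusable barrier is one option. The following is only a minimal sketch, assuming three pollers and three rounds; PhasedPolls, runPhases and the logged messages are hypothetical names that do not appear in the original code. It relies on java.util.concurrent.CyclicBarrier, which resets itself after every trip and can run a barrier action once per round.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical stand-in for the Referendum/Poller pair: each worker "polls"
// once per round and then waits at the barrier until all others are done.
class PhasedPolls {

    // Runs `rounds` rounds with `pollers` threads and returns the log written
    // by the barrier action, one entry per completed round.
    static List<String> runPhases(int pollers, int rounds) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        int[] round = {0}; // only touched by the barrier action (once per trip)
        CyclicBarrier barrier = new CyclicBarrier(pollers,
                () -> log.add("round " + (++round[0]) + " complete"));

        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < pollers; p++) {
            Thread t = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        // pollVotes() and publishPollResults() would go here
                        barrier.await(); // blocks until all pollers arrive, then resets
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt(); // give up on this worker
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every worker before reading the log
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        runPhases(3, 3).forEach(System.out::println);
    }
}
```

The barrier action is where a per-round revealResults() style step could go, since it runs exactly once after all pollers have arrived and before any of them starts the next round.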
If I understood correctly, you want all threads to finish before the results are shown. This requires a single CountDownLatch instance in the Referendum class that is made available to each Poller thread. Each Poller calls countDown() on the latch once it ends the poll, and Referendum calls await() to block until the latch count reaches zero:
class Referendum {
private CountDownLatch latch;
public CountDownLatch getLatch() {
return latch;
}
// ...
public void startVotesWithLatch() {
// You don't need to pass the latch in constructor,
// as you can retrieve it from the referendum object passed
addPoller(new Poller(this, "Stats people", 100, 3));
// Add other pollers
// Start all pollers
for (Poller p : pollers) {
p.start();
}
// Wait for all pollers to finish
latch.await();
}
}
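And in the Poller class remove the latch variable as it is not needed, then count down in the publishPollResults() method. Note that this method is called once per poll, so the latch has to be created with a count equal to the total number of polls across all pollers: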
public void publishPollResults() {
// Everything stays the same here, except we decrease the latch
// when finished...
referendum.getLatch().countDown();
}
Note, however, that this type of synchronization is quite simple and does not necessarily require a CountDownLatch: you can simply start your Poller threads and then call join() on each of them from the main thread (this pauses the main thread until the child threads finish execution).
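The join() variant could look roughly like the sketch below. It is not the original Referendum code: JoinDemo and runAndCount are hypothetical names, and the workers only bump a counter where a real Poller would run its polls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class JoinDemo {

    // Starts `n` worker threads, waits for each of them with join(), and
    // returns how many had finished by the time the last join() returned.
    static int runAndCount(int n) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // A real Poller would do its polling here before finishing.
            Thread t = new Thread(() -> finished.incrementAndGet());
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // the calling thread blocks until t has terminated
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return finished.get(); // safe to read: every worker has terminated
    }

    public static void main(String[] args) {
        System.out.println("finished workers: " + runAndCount(3));
    }
}
```

Because join() is called on each worker in turn, the order in which the workers actually finish does not matter; the loop ends only after the slowest one is done.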

Java ThreadPoolExecutor

I'm having a lot of trouble understanding the Java ThreadPoolExecutor. For example, I want to calculate the squares of the numbers 1-1000:
public static void main(String[] args) throws InterruptedException, ExecutionException {
Callable<ArrayList<Integer>> c = new squareCalculator(1000);
ExecutorService executor = Executors.newFixedThreadPool(5);
Future<ArrayList<Integer>> result = executor.submit(c);
for(Integer i: result.get()){
System.out.println(i);
}
}
And the Callable:
public class squareCalculator implements Callable<ArrayList<Integer>>{
private int i;
private int max;
private int threadID;
private static int id;
private ArrayList<Integer> squares;
public squareCalculator(int max){
this.max = max;
this.i = 1;
this.threadID = id;
id++;
squares = new ArrayList<Integer>();
}
public ArrayList<Integer> call() throws Exception {
while(i <= max){
squares.add(i*i);
System.out.println("Proccessed number " +i + " in thread "+this.threadID);
Thread.sleep(1);
i++;
}
return squares;
}
}
Now my problem is, that I only get one thread doing the calculations. I expected to get 5 threads.
If you want the Callable to run 5 times concurrently, you need to submit it 5 times.
You submitted it only once, so only one pool thread ever runs it; the other four threads have nothing to do.
Javadoc of submit():
Submits a value-returning task for execution and returns a
Future representing the pending results of the task. The
Future's get method will return the task's result upon
successful completion.
You see that Javadoc for submit() uses the singular for "task", not "tasks".
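The fix is easy: submit it multiple times. Use a separate instance for each submission, because squareCalculator keeps mutable state (i and squares) that must not be shared between threads:
Future<ArrayList<Integer>> result1 = executor.submit(new squareCalculator(1000));
Future<ArrayList<Integer>> result2 = executor.submit(new squareCalculator(1000));
Future<ArrayList<Integer>> result3 = executor.submit(new squareCalculator(1000));
// etc..
result1.get();
result2.get();
result3.get();
// etc..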
The ExecutorService will use one thread to execute each Callable task that you submit. Therefore, if you want to have multiple threads calculating the squares, you have to submit multiple tasks, for example one task for each number. You would then get a Future<Integer> from each task, which you can store in a list and call get() on each one to get the results.
public class SquareCalculator implements Callable<Integer> {
private final int i;
public SquareCalculator(int i) {
this.i = i;
}
@Override
public Integer call() throws Exception {
System.out.println("Processing number " + i + " in thread " + Thread.currentThread().getName());
return i * i;
}
public static void main(String[] args) throws Exception {
ExecutorService executor = Executors.newFixedThreadPool(5);
List<Future<Integer>> futures = new ArrayList<>();
// Create a Callable for each number, submit it to the ExecutorService and store the Future
for (int i = 1; i <= 1000; i++) {
Callable<Integer> c = new SquareCalculator(i);
Future<Integer> future = executor.submit(c);
futures.add(future);
}
// Wait for the result of each Future
for (Future<Integer> future : futures) {
System.out.println(future.get());
}
executor.shutdown();
}
}
The output then looks something like this:
Processing number 2 in thread pool-1-thread-2
Processing number 1 in thread pool-1-thread-1
Processing number 6 in thread pool-1-thread-1
Processing number 7 in thread pool-1-thread-2
Processing number 8 in thread pool-1-thread-2
Processing number 9 in thread pool-1-thread-2
...
1
4
9
...
This is a funny problem to try to do in parallel: merely creating the result array (or list) already takes O(n) time, because it is initialized with zeros on creation.
public static void main(String[] args) throws InterruptedException {
final int chunks = Runtime.getRuntime().availableProcessors();
final int max = 1001;
ExecutorService executor = Executors.newFixedThreadPool(chunks);
final List<ArrayList<Long>> results = new ArrayList<>(chunks);
for (int i = 0; i < chunks; i++) {
final int start = i * max / chunks;
final int end = (i + 1) * max / chunks;
final ArrayList<Long> localResults = new ArrayList<>(0);
results.add(localResults);
executor.submit(new Runnable() {
@Override
public void run() {
// Reallocate enough space locally so it's done in parallel.
localResults.ensureCapacity(end - start);
for (int j = start; j < end; j++) {
localResults.add((long)j * (long)j);
}
}
});
}
executor.shutdown();
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.MICROSECONDS);
int i = 0;
for (List<Long> list : results) {
for (Long l : list) {
System.out.printf("%d: %d\n", i, l);
++i;
}
}
}
The overhead of the boxed wrapper classes will kill performance here, so you should use something like Fastutil. You could then join the per-chunk lists with something like Guava's Iterables.concat, only in a List version that's compatible with Fastutil's LongList.
This might also make a good ForkJoinTask, but again, you'll need efficient logical List concatenation (mapping, not copying; the reverse of List.subList) to realize a speedup.
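One way to sidestep both the boxing and the concatenation is to let every task write into its own slice of a single preallocated long[]. This is a different technique from the list-based code above, shown only as a sketch: SquaresTask, squares and the THRESHOLD value are hypothetical, and the cut-off is an untuned assumption.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Fills out[from..to) with squares; large ranges are split in half and the
// halves are processed as separate fork/join subtasks.
class SquaresTask extends RecursiveAction {
    private static final int THRESHOLD = 128; // assumed cut-off, tune by measuring

    private final long[] out;
    private final int from; // inclusive
    private final int to;   // exclusive

    SquaresTask(long[] out, int from, int to) {
        this.out = out;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: compute directly. Slices never overlap, so no
            // locking is needed on the shared array.
            for (int j = from; j < to; j++) {
                out[j] = (long) j * j;
            }
            return;
        }
        int mid = (from + to) >>> 1;
        invokeAll(new SquaresTask(out, from, mid), new SquaresTask(out, mid, to));
    }

    // Returns an array where element j holds j * j, for 0 <= j < maxExclusive.
    static long[] squares(int maxExclusive) {
        long[] out = new long[maxExclusive];
        ForkJoinPool.commonPool().invoke(new SquaresTask(out, 0, maxExclusive));
        return out;
    }

    public static void main(String[] args) {
        long[] s = squares(1001);
        System.out.println(s[1000]);
    }
}
```

For only 1001 values the splitting overhead almost certainly outweighs any gain, so this is about the shape of the solution rather than a speedup at this size.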

Multithreading benchmark test

I want to measure how long it takes for the 2 threads to count till 1000. How can I make a benchmark test of the following code?
public class Main extends Thread {
public static int number = 0;
public static void main(String[] args) {
Thread t1 = new Main();
Thread t2 = new Main();
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
@Override
public void run() {
for (int i = 0; i <= 1000; i++) {
increment();
System.out.println(this.getName() + " " + getNumber());
}
}
public synchronized void increment() {
number++;
}
public synchronized int getNumber() {
return number;
}
}
And why am I still getting the following result (extract) even though I use the synchronized keyword?
Thread-0 9
Thread-0 11
Thread-0 12
Thread-0 13
Thread-1 10
You are not synchronized. A synchronized instance method is equivalent to wrapping its body in synchronized (this) {}, but you are incrementing a static number, which does not belong to any single object. You actually have 2 objects/threads, and each of them synchronizes on itself, not on the other.
Making the field volatile is not enough on its own, because number++ is a read-modify-write and therefore not atomic. Use a shared lock object like this:
public static int number = 0;
public static final Object lock = new Object();
public void increment() {
synchronized (lock) {
number++;
}
}
public int getNumber() {
synchronized (lock) {
return number;
}
}
Output is not synchronized. The scenario is:
Thread-0 runs 9 iterations alone.
Thread-1 calls increment and getNumber, which returns 10.
Thread-0 runs three more iterations.
Thread-1 calls println with 10.
You are not synchronizing this:
for (int i = 0; i <= 1000; i++) {
increment();
System.out.println(this.getName() + " " + getNumber());
}
So a thread can execute increment(), get paused while the other thread runs, and only afterwards continue with getNumber() (which explains your results). Given how fast an increment is, a single thread switch gives the other thread time for several iterations.
Do this:
public static final Object LOCK = new Object(); // a dedicated lock object; an interned String literal could be shared with unrelated code
synchronized (LOCK) {
for (int i = 0; i <= 1000; i++) {
increment();
System.out.println(this.getName() + " " + getNumber());
}
}
With that in place you do not need synchronized on the methods (as I explain in my comment).
why am I still getting the following result (extract) even though I use the synchronized keyword?
You synchronize access to the number variable, however the increment and get are synchronized separately, and this does not make your println() atomic either. This sequence is perfectly possible:
0 -> inc
1 -> inc
0 -> getnumber
1 -> getnumber
1 -> print
0 -> print
First, if you want to solve the "increment and get" problem, you can use an AtomicInteger:
private static final AtomicInteger count = new AtomicInteger(0);
// ...
@Override
public void run()
{
final String me = getName();
for (int i = 0; i < 1000; i++)
System.out.println(me + ": " + count.incrementAndGet());
}
However, even this will not guarantee printing order. With the code above, this scenario is still possible:
0 -> inc
0 -> getnumber
1 -> inc
1 -> getnumber
1 -> print
0 -> print
To solve this problem, you need to use, for instance, a ReentrantLock:
private static final Lock lock = new ReentrantLock();
private static int count;
// ...
@Override
public void run()
{
final String me = getName();
for (int i = 0; i < 1000; i++) {
// ALWAYS lock() in front of a try block and unlock() in finally
lock.lock();
try {
count++;
System.out.println(me + ": " + count);
} finally {
lock.unlock();
}
}
}
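As for the timing part of the question, a rough measurement can be taken with System.nanoTime() around the start/join section. The sketch below is an assumption-laden illustration rather than a proper benchmark: CountTimer and timeCount are hypothetical names, and println is left out of the loop on purpose because console output would dominate the measured time. For serious measurements a harness such as JMH is the usual recommendation, since a single run like this is distorted by JIT warm-up and thread start-up cost.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CountTimer {

    // Lets `threads` threads each add `perThread` increments to one shared
    // counter and returns { final count, elapsed nanoseconds }.
    static long[] timeCount(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementAndGet(); // atomic, no lock required
                }
            });
        }

        long start = System.nanoTime(); // monotonic clock, suited to elapsed time
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        long elapsed = System.nanoTime() - start;

        return new long[] {counter.get(), elapsed};
    }

    public static void main(String[] args) {
        long[] result = timeCount(2, 500); // two threads counting to 1000 together
        System.out.println("count = " + result[0] + ", elapsed ns = " + result[1]);
    }
}
```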
