Using Java, how can I create 50 threads to make a simple HTTP GET request to a particular URL?
I want each thread to make maybe 100-1k requests.
Is it possible to guarantee that all these threads connect at the same time?
I basically want something similar to Apache Bench, but written in Java so I can learn some Java along the way.
So the input would be:
1. # of requests in total
2. # of threads to use
3. url to make a request with
Update
I guess to keep track of request statistics, i.e. how long a particular request took on average, I would need a global collection that is thread-safe?
Here is some (incomplete) code:
public class Test {
private static int REQUESTS;
private static int NUM_THREADS;
private static String URL;
private static Statistic[] result; // one slot per thread, allocated in main()
private static class ThreadTask implements Runnable {
private int tid;
public ThreadTask(int tid) {
this.tid = tid;
}
@Override
public void run() {
Statistic stat = new Statistic();
for(int i = 0; i < REQUESTS; i++) {
// make request
// add results to stat
}
result[tid] = stat; // no need to lock because each
// thread writes to a dedicated slot
}
}
public static void main(String[] args) {
// take command line arguments
REQUESTS = Integer.parseInt(args[0]);
NUM_THREADS = Integer.parseInt(args[1]);
URL = args[2];
result = new Statistic[NUM_THREADS]; // one slot per thread
Thread[] threads = new Thread[NUM_THREADS];
// start threads
for(int i = 0; i < NUM_THREADS; i++) {
threads[i] = new Thread(new ThreadTask(i));
threads[i].start();
}
// wait for threads to finish
for(int i = 0; i < NUM_THREADS; i++) {
try {
threads[i].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
Class Statistic is something defined by you to collect whatever statistics you want.
Of course, many improvements can be suggested, this is just what I wrote in 5 minutes. :) Hope it helps.
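To fill in the // make request placeholder, one possibility is a plain HttpURLConnection. This is only a sketch; the HttpTimer class name is a placeholder of mine, and feeding the returned duration into Statistic is left to whatever API you give that class:
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
public class HttpTimer {
    // Issues one GET to the given URL and returns the elapsed wall-clock
    // time in nanoseconds. ThreadTask.run() could call this in its loop
    // and record the result in its Statistic instance.
    static long timeOneGet(String url) throws IOException {
        long start = System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // read and discard the response body
            }
        }
        return System.nanoTime() - start;
    }
}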
You want to use a combination of a ThreadPoolExecutor for scheduling threads and a CyclicBarrier for activating all threads at the same time. Both classes are in the java.util.concurrent package.
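For example, here is a minimal sketch of that combination; the thread count, request count, and URL are placeholder values, and the actual GET plus timing is left as a comment:
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class BarrierLoadTest {
    public static void main(String[] args) throws Exception {
        final int NUM_THREADS = 50;             // placeholder values
        final int REQUESTS_PER_THREAD = 100;
        final String url = "http://example.com/";
        // Every worker waits on the barrier, so no request is sent until
        // all threads have been started and are ready.
        final CyclicBarrier startBarrier = new CyclicBarrier(NUM_THREADS);
        ExecutorService pool = Executors.newFixedThreadPool(NUM_THREADS);
        for (int i = 0; i < NUM_THREADS; i++) {
            pool.submit(() -> {
                startBarrier.await();   // block until all threads reach this point
                for (int r = 0; r < REQUESTS_PER_THREAD; r++) {
                    // issue one HTTP GET to url and record its timing here
                }
                return null;            // lambda is a Callable, so checked exceptions are allowed
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }
}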
I have used JMeter for this purpose. http://jmeter.apache.org.
There is a bit of a learning curve for the tool.
In JMeter, a Thread Group lets you set the number of threads, the ramp-up period, and the loop count.
You can create an HTTP Request sampler and view results with the View Results Tree or View Results in Table listeners. Hope this helps.
I have a problem which I would like to solve using Java's ExecutorService and Future classes. I am currently taking many samples from a function that is very expensive for me to compute (each sample can take several minutes) using a for loop. I have a class FunctionEvaluator that evaluates this function for me and this class is quite expensive to instantiate, since it contains a lot of internal memory, so I have made this class easily reusable with some internal counters and a reset() method. So my current situation looks like this:
int numSamples = 100;
int amountOfData = 1000000;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to initialise
FunctionEvaluator fe = new FunctionEvaluator();
for(int i=0; i<numSamples; i++) {
results[i] = fe.sampleAt(i, data);//very expensive computation
}
but I would like to get some multithreading going to speed things up. It should be easy enough, because while each sample will share whatever is inside of data, it is a read-only operation and each sample is independent of any other. Now I wouldn't be having any trouble with this since I've used Java's Future and ExecutorService before, but never in a context where the Callable had to be re-used. So in general, how would I go about setting this scenario up given that I can afford to run n instantiations of FunctionEvaluator? Something (very roughly) like this:
int numSamples = 100;
int amountOfData = 1000000;
int N = 10;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to initialise
FunctionEvaluator[] fe = new FunctionEvaluator[N];
for(int i=0; i<numSamples; i++) {
//Somehow add available FunctionEvaluators to an ExecutorService
//so that N FunctionEvaluators can run in parallel. When a
//FunctionEvaluator is finished, reset then compute a new sample
//until numSamples samples have been taken.
}
Any help would be greatly appreciated! Many thanks.
EDIT
So here is a toy example (which doesn't work :P). In this case the "expensive function" that I want to sample is just squaring an integer and the "expensive to instantiate class" that does it for me is called CallableComputation:
In TestConc.java:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class TestConc {
public static void main(String[] args) {
SquareCalculator squareCalculator = new SquareCalculator();
int numFunctionEvaluators = 2;
int numSamples = 10;
ExecutorService executor = Executors.newFixedThreadPool(2);
CallableComputation c1 = new CallableComputation(2);
CallableComputation c2 = new CallableComputation(3);
CallableComputation[] callables = new CallableComputation[numFunctionEvaluators];
Future<Integer>[] futures = (new Future[numFunctionEvaluators]);
int[] results = new int[numSamples];
for(int i=0; i<numFunctionEvaluators; i++) {
callables[i] = new CallableComputation(i);
futures[i] = executor.submit(callables[i]);
}
futures[0] = executor.submit(c1);
futures[1] = executor.submit(c2);
for(int i=numFunctionEvaluators; i<numSamples; ) {
for(int j=0; j<futures.length; j++) {
if(futures[j].isDone()) {
try {
results[i] = futures[j].get();
}
catch (InterruptedException e) {
e.printStackTrace();
}
catch (ExecutionException e) {
e.printStackTrace();
}
callables[j].set(i);
System.out.printf("Function evaluator %d given %d\n", j, i+1);
executor.submit(callables[j]);
i++;
}
}
}
executor.shutdown();
try {
executor.awaitTermination(1, TimeUnit.MINUTES);
}
catch (InterruptedException e) {
e.printStackTrace();
}
for (int i=0; i<results.length; i++) {
System.out.printf("res%d=%d, ", i, results[i]);
}
System.out.println();
}
private static boolean areDone(Future<Integer>[] futures) {
for(int i=0; i<futures.length; i++) {
if(!futures[i].isDone()) {
return false;
}
}
return true;
}
private static void printFutures(Future<Integer>[] futures) {
for (int i=0; i<futures.length; i++) {
System.out.printf("f%d=%s | ", i, futures[i].isDone()?"done" : "not done");
}System.out.printf("\n");
}
}
In CallableComputation.java:
import java.util.concurrent.Callable;
public class CallableComputation implements Callable<Integer>{
int input = 0;
public CallableComputation(int input) {
this.input = input;
}
public void set(int i) {
input = i;
}
@Override
public Integer call() throws Exception {
System.out.printf("currval=%d\n", input);
Thread.sleep(500);
return input * input;
}
}
In Java 8:
double[] result = IntStream.range(0, numSamples)
.parallel()
.mapToDouble(i->fe.sampleAt(i, data))
.toArray();
The question asks how to execute heavy computational functions in parallel by loading as many CPUs as possible.
Excerpt from the Parallelism tutorial:
Parallel computing involves dividing a problem into subproblems,
solving those problems simultaneously (in parallel, with each
subproblem running in a separate thread), and then combining the
results of the solutions to the subproblems. Java SE provides the
fork/join framework, which enables you to more easily implement
parallel computing in your applications. However, with this framework,
you must specify how the problems are subdivided (partitioned). With
aggregate operations, the Java runtime performs this partitioning and
combining of solutions for you.
The actual solution includes:
IntStream.range will generate the stream of integers from 0 (inclusive) to numSamples (exclusive).
parallel() will split the stream and execute it with all available CPUs on the box.
mapToDouble() will convert the stream of integers to a stream of doubles by applying the lambda expression that does the actual work.
toArray() is a terminal operation that will aggregate the result and return it as an array.
No special code change is required; you can use the same Callable again and again without any issue. Also, to improve efficiency, since (as you say) creating an instance of FunctionEvaluator is expensive, you can use only one instance and ensure that sampleAt is thread-safe. One option is to use only method-local variables and not modify any of the passed arguments at any point while any of the threads is running.
Please find a quick example below:
Code Snippet:
ExecutorService executor = Executors.newFixedThreadPool(2);
Callable<String> task1 = new Callable<String>() {
    public String call() {
        System.out.println(Thread.currentThread() + " currentThread");
        return null;
    }
};
executor.submit(task1);
executor.submit(task1);
executor.shutdown();
You can wrap each FunctionEvaluator's actual work as a Callable/Runnable, then use a fixed thread pool backed by a queue; you just need to submit the target Callable/Runnable to the thread pool.
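For example, a rough sketch of that idea, assuming your FunctionEvaluator class with the sampleAt(int, double[]) and reset() methods from the question; giving each pool thread its own evaluator via ThreadLocal is just one way to reuse the expensive instances:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
class Sampler {
    static double[] sampleAll(final double[] data, int numSamples, int nThreads) throws Exception {
        // One FunctionEvaluator per pool thread, created lazily and then reused.
        final ThreadLocal<FunctionEvaluator> evaluators =
                ThreadLocal.withInitial(FunctionEvaluator::new);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        List<Future<Double>> futures = new ArrayList<>();
        for (int i = 0; i < numSamples; i++) {
            final int sampleIndex = i;
            futures.add(pool.submit(new Callable<Double>() {
                @Override
                public Double call() {
                    FunctionEvaluator fe = evaluators.get(); // reused by this thread
                    fe.reset();                              // clear internal counters
                    return fe.sampleAt(sampleIndex, data);
                }
            }));
        }
        double[] results = new double[numSamples];
        for (int i = 0; i < numSamples; i++) {
            results[i] = futures.get(i).get(); // blocks until sample i is done
        }
        pool.shutdown();
        return results;
    }
}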
I would like to get some multithreading going to speed things up.
Sounds like a good idea, but your code is massively overcomplicated. @Pavel has a dead simple Java 8 solution, but even without Java 8 you can make it a lot easier.
All you need to do is to submit the jobs into the executor and then call get() on each one of the Futures that are returned. A Callable class is not needed although it does make the code a lot cleaner. But you certainly don't need the arrays which are a bad pattern anyway because a typo can easily generate out-of-bounds exceptions. Stick to collections or Java 8 streams.
ExecutorService executor = Executors.newFixedThreadPool(2);
List<Future<Integer>> futureList = new ArrayList<Future<Integer>>();
for (int i = 0; i < numSamples; i++ ) {
// start the jobs running in the background
futureList.add(executor.submit(new CallableComputation(i)));
}
// shutdown executor if done submitting tasks, submitted jobs will keep running
executor.shutdown();
for (Future<Integer> future : futureList) {
// this will wait for the future to finish; it also throws checked exceptions you need to handle or declare
Integer result = future.get();
// add result to a collection or something here
}
This question already has answers here:
How should I unit test multithreaded code?
(29 answers)
Closed 5 years ago.
How do I test something like this in a multithreaded environment? I know it's going to fail, because this code is not thread-safe. I just want to know how I can prove it. By creating a bunch of threads and trying to add from those different threads? This code is intentionally not written properly, for testing purposes!
public class Response_Unit_Manager {
private static HashMap<String, Response_Unit> Response_Unit_DB =
new HashMap<> ();
/**
*
* This subprogram adds a new Response_Unit to the data store. The
* new response unit must be a valid Response_Unit object and its ID must be
* unique (i.e., must not already exist in the data store).
*
* Exceptions Thrown: Null_Object_Exception
*/
public static void Add_Response_Unit (Response_Unit New_Unit)
throws Null_Object_Exception, Duplicate_Item_Exception {
if (New_Unit == null)
    throw new Null_Object_Exception ();
String Unit_ID = New_Unit.Unit_ID ();
if (Response_Unit_Exists (Unit_ID))
throw new Duplicate_Item_Exception (Unit_ID);
else
Response_Unit_DB.put (Unit_ID, New_Unit);
} //end Add_Response_Unit
You may get lucky and see a failure when running a test, but non-failing code doesn't mean that it's thread-safe code. The only automated way to check thread-safety is with static analysis tools that let you put annotations on methods/classes and scan for potential issues. For example, I know FindBugs supports some annotations and does concurrency checking based on them; a small sketch follows the links below. You should be able to apply this to your single Tester class. There is still a lot of room for improvement in the industry on this topic, but here are some current examples:
http://robertfeldt.net/publications/grahn_2010_comparing_static_analysis_tools_for_concurrency_bugs.pdf
http://homepages.inf.ed.ac.uk/dts/students/spathoulas/spathoulas.pdf
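For example, the JCIP annotations (net.jcip.annotations), which FindBugs understands, let you declare the intended locking policy so the tool can flag code that violates it. A minimal sketch, assuming the jcip-annotations jar is on the classpath:
import java.util.HashMap;
import java.util.Map;
import net.jcip.annotations.GuardedBy;
import net.jcip.annotations.ThreadSafe;
// Declaring the lock that guards the map lets FindBugs flag any access
// that is not performed while holding that lock.
@ThreadSafe
public class AnnotatedUnitStore {
    @GuardedBy("this")
    private final Map<String, Object> units = new HashMap<>();
    public synchronized void add(String id, Object unit) {
        units.put(id, unit);
    }
}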
As others have noted, you can't write a test that will guarantee failure as the thread schedule might "just work out", but you can write tests that have a very low probability of passing if there are thread safety issues. For example, your code attempts to disallow duplicate items in your DB but due to thread safety issues it can't do that. So spawn a ton of threads, have them all wait on a CountDownLatch or something to maximize your chances of triggering the race, then have them all try to insert the same item. Finally you can check that (a) all but one thread saw a Duplicate_Item_Exception and (b) Response_Unit_DB contains only a single item. For these kinds of tests you can also run it several times (in the same test) to maximize your chances of triggering the issue.
Here's an example:
@Test
public void testIsThreadSafe() throws InterruptedException {
    final int NUM_ITERATIONS = 100;
    for(int i = 0; i < NUM_ITERATIONS; ++i) {
        oneIsThreadSafeTest();
    }
}
public void oneIsThreadSafeTest() throws InterruptedException {
    final int NUM_THREADS = 1000;
    final int UNIT_ID = 1;
    ExecutorService exec = Executors.newFixedThreadPool(NUM_THREADS);
    CountDownLatch allThreadsWaitOnThis = new CountDownLatch(1);
    AtomicInteger numThreadsSawException = new AtomicInteger(0);
    for (int i = 0; i < NUM_THREADS; ++i) {
        // this is a Java 8 lambda; with Java 7 or less you'd use a
        // class that implements Runnable
        exec.submit(() -> {
            try {
                allThreadsWaitOnThis.await();
                // making some assumptions here about how you construct
                // a Response_Unit
                Response_Unit unit = new Response_Unit(UNIT_ID);
                Response_Unit_Manager.Add_Response_Unit(unit);
            } catch (Duplicate_Item_Exception e) {
                numThreadsSawException.incrementAndGet();
            } catch (Exception e) {
                // interruption or an unexpected Null_Object_Exception
                throw new RuntimeException(e);
            }
        });
    }
    // release all the threads
    allThreadsWaitOnThis.countDown();
    // wait for them all to finish
    exec.shutdown();
    exec.awaitTermination(10, TimeUnit.MINUTES);
    assertThat(numThreadsSawException.get()).isEqualTo(NUM_THREADS - 1);
}
You can construct similar tests for the other potential thread safety issues.
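For instance, a sketch of one such test: many threads insert distinct units, and lost updates in the unsynchronized HashMap show up as missing entries afterwards. This assumes Response_Unit can be built from a string ID, Response_Unit_Exists is visible to the test, and the fragment lives in a test method declared throws InterruptedException:
final int NUM_THREADS = 100;
ExecutorService exec = Executors.newFixedThreadPool(NUM_THREADS);
CountDownLatch startGate = new CountDownLatch(1);
for (int i = 0; i < NUM_THREADS; ++i) {
    final String unitId = "unit-" + i;
    exec.submit(() -> {
        try {
            startGate.await();
            Response_Unit_Manager.Add_Response_Unit(new Response_Unit(unitId));
        } catch (Exception e) {
            throw new RuntimeException(e); // no duplicates expected here
        }
    });
}
startGate.countDown();        // release all the threads at once
exec.shutdown();
exec.awaitTermination(10, TimeUnit.MINUTES);
for (int i = 0; i < NUM_THREADS; ++i) {
    // every distinct unit should have survived the concurrent puts
    assertTrue(Response_Unit_Manager.Response_Unit_Exists("unit-" + i));
}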
The easiest way to find errors like the one contained in your class through testing is to use a test runner like, for example, the following:
package com.anarsoft.mit;
import java.util.concurrent.atomic.AtomicInteger;
public class Test_Response_Unit_Manager implements Runnable {
private final AtomicInteger threadCount = new AtomicInteger();
public void test() throws Exception
{
for(int i = 0; i < 2 ;i++)
{
Thread thread = new Thread(this, "Thread " + i);
this.threadCount.incrementAndGet();
thread.start();
}
while( this.threadCount.get() > 0 )
{
Thread.sleep(1000);
}
Thread.sleep(10 * 1000);
}
public void run()
{
exec();
threadCount.decrementAndGet();
}
protected void exec()
{
    try {
        Response_Unit_Manager.Add_Response_Unit(new Response_Unit(Thread.currentThread().getId()));
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
public static void main(String[] args) throws Exception
{
(new Test_Response_Unit_Manager()).test();
}
}
Then use a dynamic race condition detection tool like http://vmlens.com, a lightweight race condition detector. It will show you the detected race conditions and the stack traces leading to the bug, with the write on the left and the read on the right.
http://vmlens.com works with Eclipse, so whether it is useful for you depends on the IDE you are using.
I am trying to simulate a triathlon competition using CyclicBarrier but it doesn't work as expected and I don't know why.
Each part of the competition has to wait until all the runners have completed the previous one, but it seems like it's waiting forever.
This is the piece of code for the phase one:
class Runner implements Runnable
{
private CyclicBarrier bar = null;
private static int runners;
private static double[] time;
private int number;
public static String name;
public Runner(int runners, String name)
{
time = new double[runners];
for (int i=0; i<runners; i++)
time[i] = 0;
this.name= name;
}
public Runner(CyclicBarrier bar, int number)
{
this.bar = bar;
this.number = number;
}
public void run()
{
try { int i = bar.await(); }
catch(InterruptedException e) {}
catch (BrokenBarrierException e) {}
double tIni = System.nanoTime();
try { Thread.sleep((int)(100*Math.random())); } catch(InterruptedException e) {}
double t = System.nanoTime() - tIni;
time[number] += t;
}
}
public class Triatlon
{
public static void main(String[] args)
{
int runners = 100;
CyclicBarrier Finish_Line_1 = new CyclicBarrier (runners);
Runner c = new Runner(runners, "Triatlon");
ExecutorService e = Executors.newFixedThreadPool(runners);
for (int i=0; i<runners; i++)
e.submit(new Runner(Finish_Line_1, i));
System.out.println(Finish_Line_1.getNumberWaiting()); // this always shows 99
try { int i = Finish_Line_1.await(); }
catch(InterruptedException e01) {}
catch (BrokenBarrierException e02) {}
System.out.println("Swimming phase completed");
// here the rest of the competition, which works the same way
}
}
You have an off-by-one error: you create a CyclicBarrier for 100 threads, but execute 101 awaits, the one-off being in the main method. Due to the semantics of the cyclic barrier, and subject to nondeterministic conditions, your main thread will be the last to execute await, thereby being left alone waiting for another 99 threads to join in.
After you fix this problem, you'll find out that the application keeps running even after all work is done. This is because you didn't call e.shutdown(), so all the threads in the pool stay alive after the main thread is done.
BTW getNumberWaiting always shows 0 for me, which is the expected value after the barrier has been lowered due to 100 submitted threads reaching it. This is nondeterministic, however, and could change at any time.
CyclicBarrier cycles around once all parties have called await and the barrier is opened. Hence the name.
So if you create it with 5 parties and there are 6 calls to await the last one will trigger it to be waiting again for 4 more parties to join.
That's basically what happens here as you have the 1 extra await call in your main. It is waiting for another runners-1 calls to happen.
The simple fix is to create the CyclicBarrier with runners+1 parties.
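For instance, a minimal sketch of that fix (inside a main that declares throws Exception, since await() throws checked exceptions), sizing the barrier for the workers plus the main thread and adding the missing shutdown:
int runners = 100;
// +1 so the main thread's own await() counts as a party
CyclicBarrier finishLine1 = new CyclicBarrier(runners + 1);
ExecutorService e = Executors.newFixedThreadPool(runners);
for (int i = 0; i < runners; i++) {
    e.submit(new Runner(finishLine1, i));
}
finishLine1.await(); // trips once all 100 runners have also called await()
System.out.println("Swimming phase completed");
// ... remaining phases ...
e.shutdown(); // lets the pool threads exit once all phases are done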
I have encountered a problem when trying to measure elapsed time.
Basically the problem is like this: there is a class A, which starts a private thread inside itself, and I have an instance of A in my class B; in the main method of B I invoke some methods of A and want to measure the time it takes to run these methods.
A a = new A();
//start time counter
for (int i = 0; i < 10; i++){ invoke a.method() that takes some time}
//end time counter and prints the time elapsed
But by doing so, the method called in the for loop will run in a separate thread inside A, and the print in the last line would probably be executed before the loop ends. So I want to access the thread in a and invoke join() on it to wait until all the work in the for loop has finished. Could you help me figure out how to achieve this? Any ideas would be greatly appreciated.
List All Threads and their Groups
public class Main
{
public static void visit(final ThreadGroup group, final int level)
{
final Thread[] threads = new Thread[group.activeCount() * 2];
final int numThreads = group.enumerate(threads, false);
for (int i = 0; i < numThreads; i++)
{
Thread thread = threads[i];
System.out.format("%s:%s\n", group.getName(), thread.getName());
}
final ThreadGroup[] groups = new ThreadGroup[group.activeGroupCount() * 2];
final int numGroups = group.enumerate(groups, false);
for (int i = 0; i < numGroups; i++)
{
visit(groups[i], level + 1);
}
}
public static void main(final String[] args)
{
ThreadGroup root = Thread.currentThread().getThreadGroup().getParent();
while (root.getParent() != null)
{
root = root.getParent();
}
visit(root, 0);
}
}
Based on your edits, you might be able to find out what group and name the thread has, get a reference to it that way, and do what you need to do.
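For example, a small sketch of that idea; the thread name here is hypothetical, and Thread.getAllStackTraces() is just one convenient way to enumerate the live threads:
// Find a live thread by its name and wait for it to finish.
Thread target = null;
for (Thread t : Thread.getAllStackTraces().keySet()) {
    if ("worker-in-A".equals(t.getName())) { // hypothetical name given to A's thread
        target = t;
        break;
    }
}
if (target != null) {
    try {
        target.join(); // blocks until that thread terminates
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}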
For your own code in the future
You want to look at ExecutorCompletionService and the other thread management facilities in java.util.concurrent. You should not be managing threads manually in Java anymore; pretty much every case you can imagine is handled by one or more of the ExecutorService implementations.
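A brief sketch of the ExecutorCompletionService approach; the squaring task below is only a stand-in for the expensive a.method() calls:
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class CompletionExample {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Long> completion = new ExecutorCompletionService<>(pool);
        int tasks = 10;
        for (int i = 0; i < tasks; i++) {
            final long n = i;
            completion.submit(new Callable<Long>() {
                @Override
                public Long call() {
                    return n * n; // stand-in for the real work
                }
            });
        }
        // take() blocks until some task finishes (in completion order), so this
        // loop only ends once all submitted work is done.
        for (int i = 0; i < tasks; i++) {
            Long result = completion.take().get();
            System.out.println("finished: " + result);
        }
        pool.shutdown();
    }
}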
First and once more, thanks to all that already answered my question. I am not a very experienced programmer and it is my first experience with multithreading.
I got an example that is working quite like my problem. I hope it could ease our case here.
public class ThreadMeasuring {
private static final int TASK_TIME = 1; // milliseconds
private static class Batch implements Runnable {
CountDownLatch countDown;
public Batch(CountDownLatch countDown) {
this.countDown = countDown;
}
@Override
public void run() {
long t0 =System.nanoTime();
long t = 0;
while(t<TASK_TIME*1e6){ t = System.nanoTime() - t0; }
if(countDown!=null) countDown.countDown();
}
}
public static void main(String[] args) {
ThreadFactory threadFactory = new ThreadFactory() {
int counter = 1;
@Override
public Thread newThread(Runnable r) {
Thread t = new Thread(r, "Executor thread " + (counter++));
return t;
}
};
// the total duty to be divided in tasks is fixed (problem dependent).
// Increase ntasks will mean decrease the task time proportionally.
// 4 Is an arbitrary example.
// These tasks will be executed thousands of times, inside a loop alternating
// with serial processing that needs their result and prepare the next ones.
int ntasks = 4;
int nthreads = 2;
int ncores = Runtime.getRuntime().availableProcessors();
if (nthreads<ncores) ncores = nthreads;
Batch serial = new Batch(null);
long serialTime = System.nanoTime();
serial.run();
serialTime = System.nanoTime() - serialTime;
ExecutorService executor = Executors.newFixedThreadPool( nthreads, threadFactory );
CountDownLatch countDown = new CountDownLatch(ntasks);
ArrayList<Batch> batches = new ArrayList<Batch>();
for (int i = 0; i < ntasks; i++) {
batches.add(new Batch(countDown));
}
long start = System.nanoTime();
for (Batch r : batches){
executor.execute(r);
}
// wait for all threads to finish their task
try {
countDown.await();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
long tmeasured = (System.nanoTime() - start);
System.out.println("Task time= " + TASK_TIME + " ms");
System.out.println("Number of tasks= " + ntasks);
System.out.println("Number of threads= " + nthreads);
System.out.println("Number of cores= " + ncores);
System.out.println("Measured time= " + tmeasured);
System.out.println("Theoretical serial time= " + TASK_TIME*1000000*ntasks);
System.out.println("Theoretical parallel time= " + (TASK_TIME*1000000*ntasks)/ncores);
System.out.println("Speedup= " + (serialTime*ntasks)/(double)tmeasured);
executor.shutdown();
}
}
Instead of doing the calculations, each batch just busy-waits for a given time. The program calculates the speedup, which in theory should always be 2 but can drop below 1 (actually a slowdown) if TASK_TIME is small.
My calculations take at most 1 ms and are usually faster. For 1 ms I see a small speedup of around 30%, but in practice, with my real program, I notice a slowdown.
The structure of this code is very similar to my program, so if you could help me to optimise the thread handling I would be very grateful.
Kind regards.
Below, the original question:
Hi.
I would like to use multithreading in my program, since I believe it could increase its efficiency considerably. Most of its running time is due to independent calculations.
My program has thousands of independent calculations (several linear systems to solve), but they only happen at the same time in small groups of a few dozen or so. Each of these groups takes a few milliseconds to run. After one of these groups of calculations, the program has to run sequentially for a little while, and then I have to solve the linear systems again.
Actually, you can think of these independent linear systems as sitting inside a loop that iterates thousands of times, alternating with sequential calculations that depend on the previous results. My idea to speed up the program is to compute these independent calculations in parallel threads, by dividing each group into as many batches of independent calculation as the number of processors I have available. So, in principle, there isn't any queuing at all.
I tried using the FixedThreadPool and CachedThreadPool and it got even slower than serial processing. It seems to take too much time creating new threads each time I need to solve the batches.
Is there a better way to handle this problem? The pools I've used seem better suited to cases where each task takes longer, rather than thousands of smaller tasks...
Thanks!
Best Regards!
Thread pools don't create new threads over and over. That's why they're pools.
How many threads were you using and how many CPUs/cores do you have? What is the system load like (normally, when you execute them serially, and when you execute with the pool)? Is synchronization or any kind of locking involved?
Is the algorithm for parallel execution exactly the same as the serial one? (Your description seems to suggest that the serial version was reusing some results from the previous iteration.)
From what I've read: "thousands of independent calculations... happen at the same time... would take some milliseconds to run", it seems to me that your problem is perfect for GPU programming.
And I think it answers your question. GPU programming is becoming more and more popular. There are Java bindings for CUDA & OpenCL. If it is possible for you to use it, I say go for it.
I'm not sure how you perform the calculations, but if you're breaking them up into small groups, then your application might be ripe for the Producer/Consumer pattern.
Additionally, you might be interested in using a BlockingQueue. The calculation consumers will block until there is something in the queue and the block occurs on the take() call.
private static class Batch implements Runnable {
CountDownLatch countDown;
public Batch(CountDownLatch countDown) {
this.countDown = countDown;
}
CountDownLatch getLatch(){
return countDown;
}
@Override
public void run() {
long t0 =System.nanoTime();
long t = 0;
while(t<TASK_TIME*1e6){ t = System.nanoTime() - t0; }
if(countDown!=null) countDown.countDown();
}
}
class CalcProducer implements Runnable {
private final BlockingQueue<Batch> queue;
private final int ntasks = 4; // tasks per group, matching the example above
CalcProducer(BlockingQueue<Batch> q) { queue = q; }
public void run() {
try {
while(true) {
CountDownLatch latch = new CountDownLatch(ntasks);
for(int i = 0; i < ntasks; i++) {
queue.put(produce(latch));
}
// don't need to wait for the latch, only consumers wait
}
} catch (InterruptedException ex) { /* ... handle ... */ }
}
Batch produce(CountDownLatch latch) {
return new Batch(latch);
}
}
class CalcConsumer implements Runnable {
private final BlockingQueue<Batch> queue;
CalcConsumer(BlockingQueue<Batch> q) { queue = q; }
public void run() {
try {
while(true) { consume(queue.take()); }
} catch (InterruptedException ex) { /* ... handle ... */ }
}
void consume(Batch batch) throws InterruptedException {
    batch.run();
batch.getLatch().await();
}
}
class Setup {
    public static void main(String[] args) {
BlockingQueue<Batch> q = new LinkedBlockingQueue<Batch>();
int numConsumers = 4;
CalcProducer p = new CalcProducer(q);
Thread producerThread = new Thread(p);
producerThread.start();
Thread[] consumerThreads = new Thread[numConsumers];
for(int i = 0; i < numConsumers; i++)
{
consumerThreads[i] = new Thread(new CalcConsumer(q));
consumerThreads[i].start();
}
}
}
Sorry if there are any syntax errors, I've been chomping away at C# code and sometimes I forget the proper java syntax, but the general idea is there.
If you have a problem which does not scale to multiple cores, you need to change your program or you have a problem which is not as parallel as you think. I suspect you have some other type of bug, but cannot say based on the information given.
This test code might help.
Time per million tasks 765 ms
Code:
ExecutorService es = Executors.newFixedThreadPool(4);
Runnable task = new Runnable() {
@Override
public void run() {
// do nothing.
}
};
long start = System.nanoTime();
for(int i=0;i<1000*1000;i++) {
es.submit(task);
}
es.shutdown();
es.awaitTermination(10, TimeUnit.SECONDS);
long time = System.nanoTime() - start;
System.out.println("Time per million tasks "+time/1000/1000+" ms");
EDIT: Say you have a loop which serially does this.
for(int i=0;i<1000*1000;i++)
doWork(i);
You might assume that changing to a loop like this would be faster, but the problem is that the overhead could be greater than the gain.
for(int i=0;i<1000*1000;i++) {
final int i2 = i;
ex.execute(new Runnable() {
public void run() {
doWork(i2);
}
});
}
So you need to create batches of work (at least one per thread) so there are enough tasks to keep all the threads busy, but not so many tasks that your threads are spending time in overhead.
final int batchSize = 10*1000;
for(int i=0;i<1000*1000;i+=batchSize) {
final int i2 = i;
ex.execute(new Runnable() {
public void run() {
for(int i3=i2;i3<i2+batchSize;i3++)
doWork(i3);
}
});
}
EDIT 2: Running a test which copies data between threads.
for (int i = 0; i < 20; i++) {
ExecutorService es = Executors.newFixedThreadPool(1);
final double[] d = new double[4 * 1024];
Arrays.fill(d, 1);
final double[] d2 = new double[4 * 1024];
es.submit(new Runnable() {
@Override
public void run() {
// nothing.
}
}).get();
long start = System.nanoTime();
es.submit(new Runnable() {
@Override
public void run() {
synchronized (d) {
System.arraycopy(d, 0, d2, 0, d.length);
}
}
});
es.shutdown();
es.awaitTermination(10, TimeUnit.SECONDS);
// read back the values in d2.
for (double x : d2) ;
long time = System.nanoTime() - start;
System.out.printf("Time to pass %,d doubles to another thread and back was %,d ns.%n", d.length, time);
}
starts badly but warms up to ~50 us.
Time to pass 4,096 doubles to another thread and back was 1,098,045 ns.
Time to pass 4,096 doubles to another thread and back was 171,949 ns.
... deleted ...
Time to pass 4,096 doubles to another thread and back was 50,566 ns.
Time to pass 4,096 doubles to another thread and back was 49,937 ns.
Hmm, CachedThreadPool seems to be created just for your case. It does not recreate threads if you reuse them soon enough, and if you go a whole minute before you need a new thread, the overhead of thread creation is comparatively negligible.
But you can't expect parallel execution to speed up your calculations unless you can also access the data in parallel. If you employ extensive locking, many synchronized methods, etc., you'll spend more on overhead than you gain from parallel processing. Check that your data can be efficiently processed in parallel and that you don't have non-obvious synchronizations lurking in the code.
Also, CPUs process data efficiently if the data fully fits into cache. If each thread's data set is bigger than half the cache, two threads will compete for the cache and issue many RAM reads, while a single thread, even though it only employs one core, may perform better because it avoids RAM reads in the tight loop it executes. Check this, too.
Here's a pseudo-outline of what I'm thinking:
class WorkerThread extends Thread {
Queue<Calculation> calcs;
MainCalculator mainCalc;
public void run() {
while(true) {
while(calcs.isEmpty()) {
    try { Thread.sleep(500); } catch (InterruptedException e) { return; }
    // busy waiting? Context switching probably won't be so bad.
}
Calculation calc = calcs.poll(); // poll() gets and removes the head of the queue
CalculationResult result = calc.calc();
mainCalc.returnResultFor(calc,result);
}
}
}
Another option, if you're calling external programs: don't put them in a loop that runs them one at a time, or they won't run in parallel. You can put them in a loop that PROCESSES them one at a time, but not one that execs them one at a time.
Process calc1 = Runtime.getRuntime().exec("myCalc paramA1 paramA2 paramA3");
Process calc2 = Runtime.getRuntime().exec("myCalc paramB1 paramB2 paramB3");
Process calc3 = Runtime.getRuntime().exec("myCalc paramC1 paramC2 paramC3");
Process calc4 = Runtime.getRuntime().exec("myCalc paramD1 paramD2 paramD3");
calc1.waitFor();
calc2.waitFor();
calc3.waitFor();
calc4.waitFor();
InputStream is1 = calc1.getInputStream();
InputStreamReader isr1 = new InputStreamReader(is1);
BufferedReader br1 = new BufferedReader(isr1);
String resultStr1 = br1.readLine();
InputStream is2 = calc2.getInputStream();
InputStreamReader isr2 = new InputStreamReader(is2);
BufferedReader br2 = new BufferedReader(isr2);
String resultStr2 = br2.readLine();
InputStream is3 = calc3.getInputStream();
InputStreamReader isr3 = new InputStreamReader(is3);
BufferedReader br3 = new BufferedReader(isr3);
String resultStr3 = br3.readLine();
InputStream is4 = calc4.getInputStream();
InputStreamReader isr4 = new InputStreamReader(is4);
BufferedReader br4 = new BufferedReader(isr4);
String resultStr4 = br4.readLine();