I am working on a project in which I need to measure the total time and the average time taken by a multithreaded program.
In that program, each thread works on a particular range of ids. The input parameters are the number of threads and the number of tasks.
If the number of threads is 2 and the number of tasks is 10, then each thread performs 10 tasks, so the 2 threads perform 20 tasks in total.
That means the first thread should use ids between 1 and 10 and the second thread should use ids between 11 and 20.
I have the above scenario working. Now I want to measure the total time and the average time taken by all the threads, so I have the setup below in my program.
Problem Statement:-
Can anyone tell me whether the way I am measuring the total time and average time taken by all the threads in the program below is correct?
//create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(noOfThreads);
long startTime = 0L;
try {
readPropertyFiles();
startTime = System.nanoTime();
// queue some tasks
for (int i = 0, nextId = startRange; i < noOfThreads; i++, nextId += noOfTasks) {
service.submit(new XMPTask(nextId, noOfTasks, tableList));
}
service.shutdown();
service.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
} finally {
long estimatedTime = System.nanoTime() - startTime;
logTimingInfo(estimatedTime, noOfTasks, noOfThreads);
}
private static void logTimingInfo(long elapsedTime, int noOfTasks, int noOfThreads) {
long timeInMilliseconds = elapsedTime / 1000000L;
float avg = (float) (timeInMilliseconds) / noOfTasks * noOfThreads;
LOG.info(CNAME + "::" + "Total Time taken " + timeInMilliseconds + " ms. And Total Average Time taken " + avg + " ms");
}
service.submit is executed only noOfThreads times, so an XMPTask object is created the same number of times.
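One detail in logTimingInfo is worth checking regardless of how the time is measured: in Java, `/` and `*` have equal precedence and associate left to right, so `elapsed / noOfTasks * noOfThreads` divides by noOfTasks and then multiplies by noOfThreads. A minimal sketch of the intended calculation follows, assuming the goal is elapsed wall-clock time divided by the total number of tasks (noOfTasks * noOfThreads). The class and method names are hypothetical.

```java
public class TimingMath {
    /**
     * Average wall-clock milliseconds per task, assuming each of the
     * noOfThreads workers performs noOfTasks tasks.
     */
    public static double averageMillisPerTask(long elapsedNanos, int noOfTasks, int noOfThreads) {
        double elapsedMillis = elapsedNanos / 1_000_000.0;
        // Parentheses matter: without them the expression is
        // (elapsedMillis / noOfTasks) * noOfThreads.
        return elapsedMillis / ((long) noOfTasks * noOfThreads);
    }

    public static void main(String[] args) {
        // 2 s elapsed, 2 threads x 10 tasks = 20 tasks, so this prints 100.0
        System.out.println(averageMillisPerTask(2_000_000_000L, 10, 2));
    }
}
```

Note that this figure is elapsed time per completed task for the whole run, which is not the same thing as how long an individual task took.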
The time you measure is the elapsed (wall-clock) time, not the CPU time consumed.
If the program under test (the JVM) is the only one running on the computer, the measurement may be relatively accurate, but in the real world many processes run concurrently.
I have already done this job using a native call to the OS, on Windows (I'll complete this post Monday at my office) and on Linux (/proc).
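As an alternative to native calls, the JDK itself exposes per-thread CPU time through java.lang.management.ThreadMXBean on JVMs that support it. The sketch below is not the native approach described above. It is only an illustration of the difference between elapsed and consumed time, and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuVersusElapsed {
    /** Returns {elapsedNanos, cpuNanos}; cpuNanos is -1 if the JVM cannot report it. */
    public static long[] measure(Runnable work) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = threads.isCurrentThreadCpuTimeSupported()
                && threads.isThreadCpuTimeEnabled();
        long cpuStart = cpuAvailable ? threads.getCurrentThreadCpuTime() : -1L;
        long wallStart = System.nanoTime();
        work.run();
        long elapsed = System.nanoTime() - wallStart;
        long cpu = cpuAvailable ? threads.getCurrentThreadCpuTime() - cpuStart : -1L;
        return new long[] { elapsed, cpu };
    }

    public static void main(String[] args) {
        // A sleeping thread accumulates elapsed time but almost no CPU time.
        long[] result = measure(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ns: " + result[0] + ", cpu ns: " + result[1]);
    }
}
```

The measurement covers only the calling thread, so inside a pool it would have to be taken within each task.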
I think you would need to measure the time within the task class itself (XMPTask). Within that task you should be able to extract the id of the thread that is executing it and log that. Using this approach will require reading the logs and doing some calculations on them.
Another approach would be to keep running totals and averages as time progresses. To do this you could write a simple class that is passed into each task that has some static (per jvm) variables for tracking what each thread is doing. Then you could have a single thread outside the Threadpool that did the calculations. So if you wanted to report the average cpu time for each thread every second, this calculation thread could sleep for a second, then calculate and log all the average times, then sleep for a second....
EDIT: After re-reading the requirements, you don't need a background thread, but I am not sure whether we are tracking the average time per thread or the average time per task. I have assumed total time and average time per thread and fleshed out the idea in the code below. It has not been tested or debugged but should give you a good idea of how to start:
public class Runner
{
public void startRunning() throws InterruptedException // awaitTermination can be interrupted
{
// Create your thread pool
ExecutorService service = Executors.newFixedThreadPool(noOfThreads);
readPropertyFiles();
MeasureTime measure = new MeasureTime();
// queue some tasks
for (int i = 0, nextId = startRange; i < noOfThreads; i++, nextId += noOfTasks)
{
service.submit(new XMPTask(nextId, noOfTasks, tableList, measure));
}
service.shutdown();
service.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
measure.printTotalsAndAverages();
}
}
public class MeasureTime
{
HashMap<Long, Long> threadIdToTotalCPUTimeNanos = new HashMap<Long, Long>();
HashMap<Long, Long> threadIdToStartTimeMillis = new HashMap<Long, Long>();
HashMap<Long, Long> threadIdToStartTimeNanos = new HashMap<Long, Long>();
private void addThread(Long threadId)
{
threadIdToTotalCPUTimeNanos.put(threadId, 0L);
threadIdToStartTimeMillis.put(threadId, System.currentTimeMillis()); // wall-clock time at which this thread was first seen
}
public void startTimeCount(Long threadId)
{
synchronized (threadIdToStartTimeNanos)
{
if (!threadIdToStartTimeNanos.containsKey(threadId))
{
addThread(threadId);
}
long nanos = System.nanoTime();
threadIdToStartTimeNanos.put(threadId, nanos);
}
}
public void endTimeCount(long threadId)
{
synchronized (threadIdToStartTimeNanos)
{
long endNanos = System.nanoTime();
long startNanos = threadIdToStartTimeNanos.get(threadId);
long nanos = threadIdToTotalCPUTimeNanos.get(threadId);
nanos = nanos + (endNanos - startNanos);
threadIdToTotalCPUTimeNanos.put(threadId, nanos);
}
}
public void printTotalsAndAverages()
{
long totalForAllThreadsNanos = 0L;
int numThreads = 0;
long totalWallTimeMillis = 0;
synchronized (threadIdToStartTimeNanos)
{
numThreads = threadIdToStartTimeMillis.size();
for (Long threadId: threadIdToStartTimeNanos.keySet())
{
totalWallTimeMillis += System.currentTimeMillis() - threadIdToStartTimeMillis.get(threadId);
long totalCPUTimeNanos = threadIdToTotalCPUTimeNanos.get(threadId);
totalForAllThreadsNanos += totalCPUTimeNanos;
}
}
long totalCPUMillis = (totalForAllThreadsNanos)/1000000;
System.out.println("Total milli-seconds for all threads: " + totalCPUMillis);
// Cast to double so the results are not truncated by integer division.
double averageMillis = (double) totalCPUMillis / numThreads;
System.out.println("Average milli-seconds for all threads: " + averageMillis);
double averageCPUUtilisation = (double) totalCPUMillis / totalWallTimeMillis;
System.out.println("Average CPU utilisation for all threads: " + averageCPUUtilisation);
}
}
public class XMPTask implements Callable<String>
{
private final MeasureTime measure;
public XMPTask(// your parameters first
MeasureTime measure)
{
// Save your things first
this.measure = measure;
}
@Override
public String call() throws Exception
{
measure.startTimeCount(Thread.currentThread().getId());
try
{
// do whatever work here that burns some CPU.
}
finally
{
measure.endTimeCount(Thread.currentThread().getId());
}
return "Your return thing";
}
}
After writing all this, one thing seems a bit strange: the XMPTask seems to know too much about the list of tasks. I think you should create an XMPTask for every task that you have, give it enough information to do its job, and submit each one to the service as you create it.
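A minimal sketch of that one-task-per-submission idea, assuming each unit of work can be expressed as a Runnable: every submitted task times itself and returns its elapsed nanoseconds through its Future, so no shared map is needed. PerTaskTiming and timeAll are hypothetical names, and the empty Runnable stands in for the real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerTaskTiming {
    /** Runs every task on a fixed pool and returns each task's elapsed nanoseconds. */
    public static List<Long> timeAll(List<Runnable> tasks, int noOfThreads) {
        ExecutorService service = Executors.newFixedThreadPool(noOfThreads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Runnable task : tasks) {
                Callable<Long> timed = () -> {
                    long start = System.nanoTime();
                    task.run();
                    return System.nanoTime() - start;
                };
                futures.add(service.submit(timed));
            }
            List<Long> durations = new ArrayList<>();
            for (Future<Long> future : futures) {
                durations.add(future.get()); // blocks until that task is done
            }
            return durations;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            tasks.add(() -> { /* placeholder for one unit of work */ });
        }
        List<Long> nanos = timeAll(tasks, 2);
        long total = 0;
        for (long n : nanos) {
            total += n;
        }
        System.out.println("tasks: " + nanos.size() + ", total ns: " + total
                + ", average ns: " + (double) total / nanos.size());
    }
}
```

The durations are wall-clock times per task. Their sum can exceed the elapsed time of the whole run because tasks overlap.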
Related
I'm studying Java multithreading and trying to check whether multiple threads perform better than a single thread.
So I wrote code that counts up to a limit.
It works as I expected (multiple threads are faster than a single thread) when the limit is large, but not when the limit is small, such as 100000L.
Is this due to context switching? And is the code below appropriate for checking the performance of multithreading?
public class MultiThreadingSum {
long count = 0;
static long limit = 1000000000L;
static void compareMultipleThreadToSingleThread(int threadCnt) {
Runnable r = () -> {
MultiThreadingSum mts = new MultiThreadingSum();
long startTime = System.nanoTime();
while(++mts.count<=limit);
long endTime = System.nanoTime();
long estimatedTime = endTime - startTime;
double seconds = estimatedTime / 1000000000.0;
System.out.println(Thread.currentThread().getName()+", elapsed time : "+seconds);
};
for(int i=0; i<threadCnt; i++) {
new Thread(r, "multiThread"+i).start();
}
Runnable r2 = () -> {
MultiThreadingSum mts = new MultiThreadingSum();
long startTime = System.nanoTime();
while(++mts.count<=limit*threadCnt);
long endTime = System.nanoTime();
long estimatedTime = endTime - startTime;
double seconds = estimatedTime / 1000000000.0;
System.out.println(Thread.currentThread().getName()+", elapsed time : "+seconds);
};
new Thread(r2, "singleThread").start();
}
public static void main(String[] args) {
compareMultipleThreadToSingleThread(3);
}
}
Your code does not wait for the 3-thread experiment to finish before running the single-thread experiment, so you may be contaminating your results.
Your code seems needlessly complicated. Can't we run two separate experiments, one with 3 threads and one with 1 thread, and reuse the same code for both?
In modern Java, we rarely need to address the Thread class directly. Instead, use the executor service framework added in Java 5.
Putting this all together, perhaps your experiment should look more like the following.
Caveat: This is just a very rough cut, I've not thought it through, and my caffeination has been exhausted. So revise this code thoughtfully. Perhaps I can revise this code in a day or two.
package work.basil.threading;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
public class Loopy
{
public static void main ( String[] args )
{
Loopy app = new Loopy();
List < Integer > inputThreadsLimit = List.of( 1 , 3 , Math.max( 1 , Runtime.getRuntime().availableProcessors() - 1 ) ); // At least 1, so a single-core machine does not request a zero-thread pool.
for ( Integer numberOfThreads : inputThreadsLimit )
{
System.out.println("----------| Experiment for thread count: " + numberOfThreads + " |--------------------------");
Duration duration = app.demo( numberOfThreads ); // Waits here for the experiment to run to completion.
System.out.println( numberOfThreads + " = " + duration + " total, each: " + duration.dividedBy( numberOfThreads ) );
}
}
// Member fields
final private AtomicInteger count = new AtomicInteger( 0 );
private Duration demo ( final int numberOfThreads )
{
ExecutorService executorService = Executors.newFixedThreadPool( numberOfThreads );
long start = System.nanoTime();
for ( int i = 0 ; i < numberOfThreads ; i++ )
{
executorService.submit( new Task() );
}
executorService.shutdown(); // Ask the executor service to shutdown its backing pool of threads after all submitted tasks are done/canceled/failed.
try { executorService.awaitTermination( 1 , TimeUnit.HOURS ); } catch ( InterruptedException e ) { e.printStackTrace(); } // Blocks until all tasks finish or the timeout elapses; it does not force a shutdown.
Duration elapsed = Duration.ofNanos( System.nanoTime() - start );
return elapsed;
}
class Task implements Runnable
{
@Override
public void run ( )
{
int countSoFar = count.incrementAndGet(); // Thread-safe way to access, increment, and write a counter.
// … add code here to do some kind of work …
System.out.println( "Thread ID " + Thread.currentThread().getId() + " is finishing run after incrementing the countSoFar to: " + countSoFar + " at " + Instant.now() );
}
}
}
This is not a good example. The multi-threaded and single-threaded solutions run simultaneously and on the same counter, so in practice you run one multi-threaded process with four threads. You need to run one solution until its threads complete and shut down, then run the other. The easiest solution would be to run the single-threaded process as a simple loop in the main method and run the multi-threaded solution after the loop completes. Also, I would use two separate counters, or you can reset the counter to zero after the single-threaded loop completes.
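A rough sketch of running the two experiments strictly one after the other, each with its own counter, might look like the following. The class and method names are hypothetical, and the empty counting loop is the kind of code a JIT compiler may optimise away, so a harness such as JMH would be the safer choice for real measurements.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class SequentialExperiments {
    /** Counts up to limit and returns the count so the loop has an observable result. */
    static long countTo(long limit) {
        long count = 0;
        while (count < limit) {
            count++;
        }
        return count;
    }

    /** Runs one counting task per thread, waits for all of them, returns elapsed nanoseconds. */
    public static long timeWithThreads(int threads, long limitPerThread, LongAdder sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> sink.add(countTo(limitPerThread)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("experiment timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long limit = 100_000_000L;
        int threads = 3;
        LongAdder singleCounter = new LongAdder(); // separate counter per experiment
        LongAdder multiCounter = new LongAdder();
        // The first experiment runs to completion before the second one starts.
        long singleNanos = timeWithThreads(1, limit * threads, singleCounter);
        long multiNanos = timeWithThreads(threads, limit, multiCounter);
        System.out.println("1 thread:  " + singleNanos / 1_000_000 + " ms, count " + singleCounter.sum());
        System.out.println(threads + " threads: " + multiNanos / 1_000_000 + " ms, count " + multiCounter.sum());
    }
}
```

Both experiments perform the same total number of iterations, so the two elapsed times are comparable.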
I have a client-server application in which I need to measure the rate of request arrival per second (the request rate). For this, I have a timer object that fires every second, reads a synchronized counter, and then sets it to zero. The counter is incremented on each request arrival. I used the following code to detect the request rate. There are many other threads and timers running in my application. The problem is that, due to the inaccuracy of timers, I am not getting an accurate request rate. Is there any alternative way of measuring the request rate other than using timers?
public class FrequencyDetector extends TimerTask {
RequestCounter requestCounter;
FrequencyHolder frequencyHolder;
public FrequencyDetector(RequestCounter requestCounter,FrequencyHolder frequencyHolder){
this.requestCounter=requestCounter; // was never assigned, so run() would throw a NullPointerException
this.frequencyHolder=frequencyHolder;
}
@Override
public void run() {
int newFrequency=requestCounter.getCounter();
frequencyHolder.setFrequency(newFrequency);
requestCounter.setCounterToZero();
//calls to other fuctions
}
}
Instead of checking the counter per unit of time, you can check the time per unit of counter. That will probably give you more accurate results. The algorithm is given below.
1. Increment the counter on every request.
2. When the counter reaches a certain FIXED_LIMIT, calculate the frequency as frequency = FIXED_LIMIT / duration since last record.
3. Reset the counter and start again with step 1.
However, this records the frequency at unpredictable intervals, and if the request frequency decreases, the duration between successive records increases.
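The fixed-limit steps above can be sketched as follows. This is an illustration under assumptions rather than part of the original answer: the class name is hypothetical, the clock is injected as a LongSupplier so the logic can be exercised without waiting, and the method is synchronized on the assumption that requests arrive on several threads.

```java
import java.util.OptionalDouble;
import java.util.function.LongSupplier;

public class FixedLimitRateMeter {
    private final long fixedLimit;
    private final LongSupplier nanoClock;
    private long count;
    private long lastRecordNanos;

    public FixedLimitRateMeter(long fixedLimit, LongSupplier nanoClock) {
        this.fixedLimit = fixedLimit;
        this.nanoClock = nanoClock;
        this.lastRecordNanos = nanoClock.getAsLong();
    }

    /**
     * Call once per request. Returns the frequency in requests per second
     * each time fixedLimit requests have arrived, otherwise an empty value.
     */
    public synchronized OptionalDouble onRequest() {
        count++;                                   // step 1
        if (count < fixedLimit) {
            return OptionalDouble.empty();
        }
        long now = nanoClock.getAsLong();
        double seconds = (now - lastRecordNanos) / 1e9;
        count = 0;                                 // step 3
        lastRecordNanos = now;
        // step 2: frequency = FIXED_LIMIT / duration since last record
        return seconds > 0 ? OptionalDouble.of(fixedLimit / seconds) : OptionalDouble.empty();
    }
}
```

In an application the meter would typically be created as `new FixedLimitRateMeter(100, System::nanoTime)` and `onRequest()` called from the request handler.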
To handle this we can implement an adaptive algorithm, given below.
1. Increment the counter on every request.
2. When the counter reaches a certain ADAPTIVE_LIMIT, record the frequency as frequency = ADAPTIVE_LIMIT / duration since last record.
3. Change ADAPTIVE_LIMIT to ADAPTIVE_LIMIT = frequency * DESIRED RECORD INTERVAL.
4. Reset the counter and start again with step 1.
The above algorithm resets the limit based on the last recorded frequency. It will not record at exactly the desired intervals, but it will be pretty close.
It will also give you highly accurate frequencies, as it does not depend on any scheduled thread.
Following is an implementation of such an adaptive counter.
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
public class TestCounter {
//Keep initial counterInterval to a small value otherwise first record may take long time
final AtomicLong counterInterval = new AtomicLong(10);
AtomicLong requestCounter = new AtomicLong();
volatile long lastTime;
/**OPTIMAL_DURATION is the duration after which frequency is expected to be recorded
* Program adaptively tries to reach this duration
*/
static final double OPTIMAL_DURATION = 1.0; // 1 second
static final Random random = new Random();
public static void main(String[] args) {
System.out.println("Started ");
TestCounter main = new TestCounter();
for(int i = 0; i < 1000; i++) {
main.requestArrived();
}
}
/*
* Simulating requests
*/
public void requestArrived() {
printCounter();
try {
Thread.sleep(random.nextInt(100));
} catch (InterruptedException e) {
e.printStackTrace();
}
}
//This will be in some Utility class
private void printCounter() {
requestCounter.incrementAndGet();
long currentTime = System.nanoTime();
long currentInterval = counterInterval.get();
if(requestCounter.get() > currentInterval) {
if(lastTime != 0) {
long timeDelta = currentTime - lastTime;
long frequency = (long)(currentInterval / (timeDelta / 1e9));
System.out.printf("time=%.2f, frequency=%d\n", (timeDelta / 1e9), frequency);
//updating the currentInterval for the miss
long newCounterInterval = (long)(frequency * OPTIMAL_DURATION);
counterInterval.set(newCounterInterval);
}
requestCounter.set(0);
lastTime = currentTime;
}
}
}
Output
Started
time=0.54, frequency=18
time=0.98, frequency=18
time=1.01, frequency=17
time=0.96, frequency=17
time=0.99, frequency=17
time=0.85, frequency=19
time=0.96, frequency=19
time=0.82, frequency=23
time=1.08, frequency=21
time=0.98, frequency=21
time=0.94, frequency=22
time=1.06, frequency=20
time=1.07, frequency=18
time=0.99, frequency=18
time=0.98, frequency=18
time=1.02, frequency=17
time=0.92, frequency=18
time=0.92, frequency=19
time=0.89, frequency=21
time=0.82, frequency=25
time=1.31, frequency=19
time=1.02, frequency=18
Recently a use case came up where I had to kick off several blocking IO tasks at the same time and use them in sequence. I did not want to change the order of operation on the consumption side and since this was a web app and these were short-lived tasks in the request path, I didn't want to bottleneck on a fixed threadpool and was looking to mirror the .Net async/await coding style. The FutureTask<> seemed ideal for this but required an ExecutorService. This is an attempt to remove the need for one.
Order of operation:
Kick off tasks
Do some stuff
Consume Task 1
Do some other stuff
Consume Task 2
Finish up
...
I wanted to spawn a new thread for each FutureTask<> but simplify the thread management. After run() completed, the calling thread could be joined.
The solution I came up with was:
package com.staples.search.util;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
public class FutureWrapper<T> extends FutureTask<T> implements Future<T> {
private Thread myThread;
public FutureWrapper(Callable<T> callable) {
super(callable);
myThread = new Thread(this);
myThread.start();
}
@Override
public T get() {
T val = null;
try {
val = super.get();
myThread.join();
}
catch (Exception ex)
{
this.setException(ex);
}
return val;
}
}
Here are a couple of JUnit tests I created to compare FutureWrapper to CachedThreadPool.
@Test
public void testFutureWrapper() throws InterruptedException, ExecutionException {
long startTime = System.currentTimeMillis();
int numThreads = 2000;
List<FutureWrapper<ValueHolder>> taskList = new ArrayList<FutureWrapper<ValueHolder>>();
System.out.printf("FutureWrapper: Creating %d tasks\n", numThreads);
for (int i = 0; i < numThreads; i++) {
taskList.add(new FutureWrapper<ValueHolder>(new Callable<ValueHolder>() {
public ValueHolder call() throws InterruptedException {
int value = 500;
Thread.sleep(value);
return new ValueHolder(value);
}
}));
}
for (int i = 0; i < numThreads; i++)
{
FutureWrapper<ValueHolder> wrapper = taskList.get(i);
ValueHolder v = wrapper.get();
}
System.out.printf("Test took %d ms\n", System.currentTimeMillis() - startTime);
Assert.assertTrue(true);
}
@Test
public void testCachedThreadPool() throws InterruptedException, ExecutionException {
long startTime = System.currentTimeMillis();
int numThreads = 2000;
List<Future<ValueHolder>> taskList = new ArrayList<Future<ValueHolder>>();
ExecutorService esvc = Executors.newCachedThreadPool();
System.out.printf("CachedThreadPool: Creating %d tasks\n", numThreads);
for (int i = 0; i < numThreads; i++) {
taskList.add(esvc.submit(new Callable<ValueHolder>() {
public ValueHolder call() throws InterruptedException {
int value = 500;
Thread.sleep(value);
return new ValueHolder(value);
}
}));
}
for (int i = 0; i < numThreads; i++)
{
Future<ValueHolder> wrapper = taskList.get(i);
ValueHolder v = wrapper.get();
}
System.out.printf("Test took %d ms\n", System.currentTimeMillis() - startTime);
Assert.assertTrue(true);
}
class ValueHolder {
private int value;
public ValueHolder(int val) { value = val; }
public int getValue() { return value; }
public void setValue(int val) { value = val; }
}
Repeated runs put the FutureWrapper at ~925ms vs. ~935ms for the CachedThreadPool. Both tests bump into OS thread limits.
Things seem to work and the thread spawning is pretty fast (10k threads with random sleeps in ~4s). Does anyone see something wrong with this implementation?
Creating and starting thousands of threads is usually a very bad idea, because creating threads is expensive, and having more threads than processors brings no performance gain but causes thread context switches that consume CPU cycles instead. (See the note at the very end.)
So in my opinion, your test-code contains a big error in reasoning: You are simulating CPU load by calling Thread.sleep(500). But in fact, this does not really cause the CPU to do anything. It is possible to have many sleeping threads in parallel - no matter how many processors you have, but it is not possible to run more CPU consuming tasks than processors in (real) parallel.
If you simulate real CPU load, you'll see, that more threads will just increase the overhead due to thread-management, but not decrease the total processing time.
So let's compare different ways to run CPU consuming tasks in parallel!
First, let's assume we've got some CPU consuming task that always takes the same amount of time:
public Integer task() throws Exception {
// do some computations here (e.g. Fibonacci, primes, cipher, ...)
return 1;
}
Our goal is to run this task NUM_TASKS times using different execution strategies. For our tests, we set NUM_TASKS = 2000.
(1) Using a thread-per-task strategy
This strategy is very comparable to your approach, with the difference that it is not necessary to subclass FutureTask and fiddle around with threads. Instead, you can use FutureTask directly, as it is both a Runnable and a Future:
@Test
public void testFutureTask() throws InterruptedException, ExecutionException {
List<RunnableFuture<Integer>> taskList = new ArrayList<RunnableFuture<Integer>>();
// run NUM_TASKS FutureTasks in NUM_TASKS threads
for (int i = 0; i < NUM_TASKS; i++) {
RunnableFuture<Integer> rf = new FutureTask<Integer>(this::task);
taskList.add(rf);
new Thread(rf).start();
}
// now wait for all tasks
int sum = 0;
for (Future<Integer> future : taskList) {
sum += future.get();
}
Assert.assertEquals(NUM_TASKS, sum);
}
Running this test with JUnitBenchmarks (10 test iterations + 5 warmup iterations) yields the following result:
ThreadPerformanceTest.testFutureTask: [measured 10 out of 15 rounds, threads: 1 (sequential)]
round: 0.66 [+- 0.01], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 66, GC.time: 0.06, time.total: 10.59, time.warmup: 4.02, time.bench: 6.57
So one round (one run of the test method, i.e. 2000 executions of task()) takes about 0.66 seconds.
(2) Using a thread-per-cpu strategy
This strategy uses a fixed number of threads to execute all tasks. Therefore, we create an ExecutorService via Executors.newFixedThreadPool(...). The number of threads should be equal to the number of CPUs (Runtime.getRuntime().availableProcessors()), which is 8 in my case.
To be able to track the results, we simply use a CompletionService. It automatically takes care of the results - no matter in which order they arrive.
@Test
public void testFixedThreadPool() throws InterruptedException, ExecutionException {
ExecutorService exec = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
CompletionService<Integer> ecs = new ExecutorCompletionService<Integer>(exec);
// submit NUM_TASKS tasks
for (int i = 0; i < NUM_TASKS; i++) {
ecs.submit(this::task);
}
// now wait for all tasks
int sum = 0;
for (int i = 0; i < NUM_TASKS; i++) {
sum += ecs.take().get();
}
Assert.assertEquals(NUM_TASKS, sum);
}
Again we run this test with JUnitBenchmarks with the same settings. The results are:
ThreadPerformanceTest.testFixedThreadPool: [measured 10 out of 15 rounds, threads: 1 (sequential)]
round: 0.41 [+- 0.01], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 22, GC.time: 0.04, time.total: 6.59, time.warmup: 2.53, time.bench: 4.05
Now one round takes only 0.41 seconds (almost a 40% runtime reduction)! Also note the fewer GC calls.
(3) Sequential execution
For comparison we should also measure the non-parallelized execution:
@Test
public void testSequential() throws Exception {
int sum = 0;
for (int i = 0; i < NUM_TASKS; i++) {
sum += this.task();
}
Assert.assertEquals(NUM_TASKS, sum);
}
The results:
ThreadPerformanceTest.testSequential: [measured 10 out of 15 rounds, threads: 1 (sequential)]
round: 1.50 [+- 0.01], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+-0.00], GC.calls: 244, GC.time: 0.15, time.total: 22.81, time.warmup: 7.77, time.bench: 15.04
Note that 1.5 seconds is for 2000 executions, so a single execution of task() takes 0.75 ms.
Interpretation
According to Amdahl's law, the time T(n) to execute an algorithm on n processors is:
$$T(n) = T(1)\left(B + \frac{1 - B}{n}\right)$$
B is the fraction of the algorithm that cannot be parallelized and must run sequentially. For purely sequential algorithms, B is 1; for purely parallel algorithms it would be 0 (but this is not possible, as there is always some sequential overhead).
T(1) can be taken from our sequential execution: T(1) = 1.5 s
If we had no overhead (B = 0), on 8 CPUs we'd get: T(8) = 1.5 / 8 = 0.1875 s.
But we do have overhead! So let's compute B for our two strategies:
B(thread-per-task) = 0.36
B(thread-per-cpu) = 0.17
In other words: The thread-per-task strategy has twice the overhead!
Finally, let's compute the speedup S(n). That's the number of times an algorithm runs faster on n CPUs compared to sequential execution (S(1) = 1):
$$S(n) = \frac{T(1)}{T(n)} = \frac{1}{B + \frac{1 - B}{n}}$$
Applied to our two strategies, we get:
thread-per-task: S(8) = 2.27
thread-per-cpu: S(8) = 3.66
So the thread-per-cpu strategy has about 60% more speedup than thread-per-task.
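The figures above can be reproduced from the measured round times by solving Amdahl's law, T(n) = T(1) * (B + (1 - B) / n), for B, and by taking S(n) = T(1) / T(n). The following is a small sketch of that arithmetic, with a hypothetical class name, using the measured values 1.5 s, 0.66 s and 0.41 s and n = 8.

```java
import java.util.Locale;

public class Amdahl {
    /** Solves T(n) = T(1) * (B + (1 - B) / n) for the sequential fraction B. */
    public static double serialFraction(double t1, double tn, int n) {
        double ratio = tn / t1;
        return (ratio - 1.0 / n) / (1.0 - 1.0 / n);
    }

    /** Speedup S(n) = T(1) / T(n). */
    public static double speedup(double t1, double tn) {
        return t1 / tn;
    }

    public static void main(String[] args) {
        double t1 = 1.5;        // sequential round time in seconds
        double perTask = 0.66;  // thread-per-task round time
        double perCpu = 0.41;   // thread-per-cpu round time
        int n = 8;
        // prints: B(thread-per-task) = 0.36, S(8) = 2.27
        System.out.println(String.format(Locale.ROOT, "B(thread-per-task) = %.2f, S(8) = %.2f",
                serialFraction(t1, perTask, n), speedup(t1, perTask)));
        // prints: B(thread-per-cpu) = 0.17, S(8) = 3.66
        System.out.println(String.format(Locale.ROOT, "B(thread-per-cpu) = %.2f, S(8) = %.2f",
                serialFraction(t1, perCpu, n), speedup(t1, perCpu)));
    }
}
```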
TODO
We should also measure and compare memory consumption.
Note: This all is only true for CPU consuming tasks. If instead, your tasks perform lots of I/O related stuff, you might benefit from having more threads than CPUs as waiting for I/O will put a thread in idle mode, so the CPU can execute another thread meanwhile. But even in this case, there is a reasonable upper limit which is usually far below 2000 on a PC.
I want to know whether a single-threaded program or a multithreaded program is the better approach when I need to measure elapsed time.
Below is my single-threaded program that measures the response time of our service:
private static void serviceCall() {
histogram = new HashMap<Long, Long>();
keys = histogram.keySet();
long total = 5;
long runs = total;
while (runs > 0) {
long start_time = System.currentTimeMillis();
result = restTemplate.getForObject("SOME URL",String.class);
long difference = (System.currentTimeMillis() - start_time);
Long count = histogram.get(difference);
if (count != null) {
count++;
histogram.put(Long.valueOf(difference), count);
} else {
histogram.put(Long.valueOf(difference), Long.valueOf(1L));
}
runs--;
}
for (Long key : keys) {
Long value = histogram.get(key);
System.out.println("MEASUREMENT " + key + ":" + value);
}
}
The output I get from this single-threaded program is below. The total number of calls was 5.
MEASUREMENT 163:1
MEASUREMENT 42:3
MEASUREMENT 47:1
This means 1 call came back in 163 ms, 3 calls came back in 42 ms, and so on.
I also tried using a multithreaded program to measure the elapsed time, meaning I hit the service in parallel with a few threads and then measured how long each thread takes.
Below is the code for that as well:
//create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(10);
// queue some tasks
for (int i = 0; i < 1 * 5; i++) {
service.submit(new ThreadTask(i, histogram));
}
public ThreadTask(int id, HashMap<Long, Long> histogram) {
this.id = id;
this.hg = histogram;
}
@Override
public void run() {
long start_time = System.currentTimeMillis();
result = restTemplate.getForObject("", String.class);
long difference = (System.currentTimeMillis() - start_time);
Long count = hg.get(difference);
if (count != null) {
count++;
hg.put(Long.valueOf(difference), count);
} else {
hg.put(Long.valueOf(difference), Long.valueOf(1L));
}
}
And below is the result I get from the above program-
{176=1, 213=1, 182=1, 136=1, 155=1}
One call came back in 176 ms, and so on
So my question is: why does the multithreaded program take a lot more time than the single-threaded program above? If there is a flaw in my multithreaded program, can anyone help me improve it?
Your multithreaded program likely makes all the requests at the same time, which puts more strain on the server and causes it to respond more slowly to every request.
As an aside, the way you are doing the update isn't threadsafe, so your count will likely be off in the multithreaded scenario given enough trials.
For instance, Thread A and B both return in 100 ms at the same time. The count in histogram for 100 is 3. A gets 3. B gets 3. A updates 3 to 4. B updates 3 to 4. A puts the value 4 in the histogram. B puts the value 4 in the histogram. You've now had 2 threads believe they incremented the count but the count in the histogram only reflects being incremented once.
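One way to avoid the lost update described above is to let the map perform the read-modify-write atomically. The sketch below uses ConcurrentHashMap.merge for that. The LatencyHistogram class and its method names are hypothetical and stand in for the shared HashMap in the question.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class LatencyHistogram {
    private final ConcurrentHashMap<Long, Long> counts = new ConcurrentHashMap<>();

    /** Atomically adds one observation to the bucket for this latency. */
    public void record(long millis) {
        // merge() inserts 1 if the key is absent, otherwise applies Long::sum,
        // and does so atomically for that key.
        counts.merge(millis, 1L, Long::sum);
    }

    /** Number of observations recorded for this latency so far. */
    public long countFor(long millis) {
        return counts.getOrDefault(millis, 0L);
    }

    /** Sorted copy of the histogram for printing. */
    public Map<Long, Long> snapshot() {
        return new TreeMap<>(counts);
    }
}
```

Each task would then call `histogram.record(difference)` in place of the get, increment and put sequence.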
What is the best way to create 500,000 threads (Runnables) in 5 seconds? I wrote a for loop, but it takes a lot of time. For example:
startTime = System.currentTimeMillis();
for (int i=0;i<500000; i++){
// create thread
thread.start();
}
resultTime = (System.currentTimeMillis() - startTime);
So resultTime is greater than 5 seconds. I know it depends on my hardware and OS configuration, but I just want to know the best way to create many threads within a certain time.
Thanks.
I really can't imagine this is a good idea. Each thread takes a reasonable amount of resource (each one reserves its own stack, which lives outside the heap and commonly defaults to somewhere between 512 KB and 1 MB depending on the platform and JVM), so even if you create all your threads, your JVM will be fighting for resources.
If you have a requirement for 500,000 work units, I think you're better off creating these as Runnables (and not all at once!) and passing them to a ThreadPool tuned to your environment/machine (e.g. a naive/simple tuning would be one thread per CPU).
The fastest way to create many tasks is to use an ExecutorService
int processors = Runtime.getRuntime().availableProcessors();
ExecutorService es = Executors.newFixedThreadPool(processors);
long start = System.nanoTime();
int tasks = 500 * 1000;
for (int i = 0; i < tasks; i++) {
es.execute(new Runnable() {
@Override
public void run() {
// do something.
}
});
}
long time = System.nanoTime() - start;
System.out.printf("Took %.1f ms to create/submit %,d tasks%n", time / 1e6, tasks);
es.shutdown();
prints
Took 143.6 ms to create/submit 500,000 tasks
Maybe you can make a couple of special threads that generate 250000 threads each.
Or maybe this one, to make your computer smoke even more:
Concept: share the job among the cores.
public class Example {
public static void main(String[] args) {
for (int i = 0; i < 4; i++) { // 4 creator threads x 125000 = 500000 threads
new Thread(new ThreadCreator()).start(); // with 4 cores on your processor
}
}
}
class ThreadCreator implements Runnable {
@Override
public void run() {
for (int i = 0; i < 125000; i++) {
new Thread().start(); // each core creating remaining thread
}
}
}
Took only 0,6 ms !!