I am trying to test the performance (in terms of execution time) of my web crawler, but I am having trouble timing it because of the multi-threading involved.
My main class:
class WebCrawlerTest {
//methods and variables etc
WebCrawlerTest(List<String> websites){
//
}
if(!started){
startTime = System.currentTimeMillis();
executor = Executors.newFixedThreadPool(32); //this is the value I'm tweaking
started=true;
}
for(String site : websites){
executor.submit(webProcessor = new AllWebsiteProcessorTest(site, deepSearch));
}
executor.shutdown();
//tried grabbing end time here with no luck
AllWebsiteProcessorTest class:
class AllWebsiteProcessorTest implements Runnable{
//methods and var etc
AllWebsiteProcessorTest(String site, boolean deepSearch) {
}
public void run() {
scanSingleWebsite(websites);
for(String email:emails){
System.out.print(email + ", ");
}
private void scanSingleWebsite(String website){
try {
String url = website;
Document document = Jsoup.connect(url).get();
grabEmails(document.toString());
}catch (Exception e) {}
With another class (with a main method), I create an instance of WebCrawlerTest and then pass in an array of websites. The crawler works fine but I can't seem to figure out how to time it.
I can get the start time (System.getCurrentTime...();), but the problem is the end time. I've tried adding the end time like this:
//another class
public static void main(.....){
long start = getCurrent....();
WebCrawlerTest w = new WebCrawlerTest(listOfSites, true);
long end = getCurrent....();
}
Which doesn't work. I also tried adding the end after executor.shutdown(), which again doesn't work (instantly triggered). How do I grab the time for the final completed thread?
After shutting down your executors pool
executor.shutdown();
//tried grabbing end time here with no luck
You can simply call
executor.awaitTermination(timeout, unit);
This call blocks until all tasks have completed (or until the timeout elapses or the waiting thread is interrupted). Take the time at that point, subtract the start time from it, and voilà, you have the execution time.
The shutdown() method only ensures that no new tasks are accepted into the execution queue. Tasks already in the queue will still be performed (shutdownNow() drops pending tasks). To wait for all submitted tasks to complete, you have to call awaitTermination().
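The steps above can be sketched as follows. This is a minimal illustration, not the asker's crawler: the class name CrawlTimingSketch, the helper timeAll and the sleeping placeholder task are assumptions made for the example, and the one-hour timeout is an arbitrary generous upper bound.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CrawlTimingSketch {

    // Runs every task on a fixed pool and returns the wall-clock time in milliseconds.
    static long timeAll(List<Runnable> tasks, int threads) {
        long start = System.nanoTime();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (Runnable task : tasks) {
            executor.submit(task);
        }
        executor.shutdown(); // no new tasks are accepted; queued ones still run
        try {
            // Blocks until all tasks have finished or the timeout elapses.
            if (!executor.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("tasks did not finish within the timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder work standing in for AllWebsiteProcessorTest (hypothetical).
        Runnable fakeSite = () -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long elapsed = timeAll(List.of(fakeSite, fakeSite, fakeSite, fakeSite), 2);
        System.out.println("Elapsed ms: " + elapsed);
    }
}
```

The end time is taken only after awaitTermination returns, which is why it reflects the last finished task rather than the moment shutdown() was called.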
Related
I am looking for ways to process list entries in parallel, a task that is quite long (say 24 hours: I stream data from huge databases and each row then takes about 1-2 seconds to process). I have an application that has 2 methods, each processing a list of data. My initial idea was to use ForkJoin, which works, but not quite. The simplified dummy code mimicking my app's behaviour is as follows:
@Service
@Slf4j
public class ListProcessing implements Runnable {
@Async
private void processingList() {
// can change to be a 100 or 1000 to speed up the processing,
// but the point is to see the behaviour after the task runs for a long time
// so just using 12.
ForkJoinPool newPool = new ForkJoinPool(12);
newPool.execute(() -> {
List<Integer> testInt = IntStream.rangeClosed(0, 50000)
.boxed().toList();
long start = System.currentTimeMillis();
Map<Integer,DummyModel> output = testInt.stream().parallel()
.map(item -> {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
log.info("I slept at item {} for map",item);
return new DummyModel(UUID.randomUUID(), item); // a model class with 2 fields and no logic save for getters/setters
}).collect(Collectors.toConcurrentMap(DummyModel::getNum, item -> item));
long end = System.currentTimeMillis();
log.info("Processing time {}",(end-start));
log.info("Size is {}",output.size());
});
newPool.shutdown();
}
// method is identical to the one above for simplicity & demo purposes
@Async
private void processingList2() {
ForkJoinPool newPool = new ForkJoinPool(12);
newPool.execute(() -> {
List<Integer> testInt = IntStream.rangeClosed(0, 50000)
.boxed().toList();
long start = System.currentTimeMillis();
Map<Integer,DummyModel> output = testInt.stream().parallel()
.map(item -> {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
log.info("I slept at item {} for map2",item);
return new DummyModel(UUID.randomUUID(), item);
}).collect(Collectors.toConcurrentMap(DummyModel::getNum, item -> item));
long end = System.currentTimeMillis();
log.info("Processing time {}",(end-start));
log.info("Size is {}",output.size());
});
newPool.shutdown();
}
@Override
public void run() {
processingList();
processingList2();
}
}
The class is then being called by my controller which is as follows:
@PostMapping
public void startTest() {
Thread startRun = new Thread(new ListProcessing());
startRun.start();
}
This works perfectly: both methods are executed asynchronously and I can see that they are using separate pools with 12 worker threads each. However, about an hour into running this app I can see that the number of threads used by each method starts dropping. After some research, I learnt that parallel streams might be the problem (according to this discussion).
Now, I can change my ForkJoinPools to have more worker threads (which will shorten the execution time and so avoid the problem), but that sounds like a temporary fix, with the problem still there if execution exceeds the 1-hour mark. So I decided to try something else, although I would really like to make ForkJoin work.
Another solution that seems to be able to do what I want is using CompletableFuture with Custom Executor as described here. So I removed Runnable & ForkJoin and implemented CompletableFuture as described in the article. The only difference being that I have a separate pool for each method and both methods are being called by controller which looks like so now:
@Autowired
private ListProcessing listProcessing;
@PostMapping
public void startTest() {
listProcessing.processingList();
listProcessing.processingList2();
}
However, the custom Executors never get used and each testInt item gets executed synchronously, one by one. I tried to make it work with only 1 method but that also didn't work: the custom executor seems to just be ignored. The method looked like so:
private CompletableFuture<List<DummyModel>> processingList() {
List<Integer> testInt = IntStream.rangeClosed(0, 50000)
.boxed().toList();
long start = System.currentTimeMillis();
List<CompletableFuture<DummyModel>> myDummyies = new ArrayList<>();
testInt.forEach(item -> {
myDummyies.add(createDummy(item));
log.info("I slept at item {} for list", item);
});
// waiting for all CompletableFutures to complete and collect them into a list
CompletableFuture<List<DummyModel>> output = CompletableFuture.allOf(myDummyies.toArray(new CompletableFuture[0]))
.thenApply(item -> myDummyies.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList()));
long end = System.currentTimeMillis();
log.info("Processing time {} \n", (end - start));
return output;
}
@Async("myPool")
private CompletableFuture<DummyModel> createDummy(Integer item) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
return CompletableFuture.completedFuture(new DummyModel(UUID.randomUUID(), item));
}
So my questions are as follows:
Can I somehow set up ForkJoin to replace blocked worker threads with fresh ones, so that the number of worker threads remains the same all the time? Or maybe ask for the pool to be replaced by a newly created one after some time and continue the work? Or is it all just a limitation of the ForkJoin framework and I should look elsewhere?
If ForkJoin cannot do this, how can I make CompletableFuture work? Where did I go wrong with what I have implemented?
Is there any other way to process a long-running task with a custom number of worker threads that run in parallel? What would be the best way to process a lot of data in parallel for a prolonged period of time?
I have a thread pool with 8 threads
private static final ExecutorService SERVICE = Executors.newFixedThreadPool(8);
My mechanism emulating the work of 100 user (100 Tasks):
List<Callable<Boolean>> callableTasks = new ArrayList<>();
for (int i = 0; i < 100; i++) { // Number of users == 100
callableTasks.add(new Task(client));
}
SERVICE.invokeAll(callableTasks);
SERVICE.shutdown();
The user performs the Task of generating a document.
Get UUID of Task;
Get Task status every 10 seconds;
If Task is ready get document.
public class Task implements Callable<Boolean> {
private final ReportClient client;
public Task(ReportClient client) {
this.client = client;
}
@Override
public Boolean call() {
final var uuid = client.createDocument(documentId);
GetStatusResponse status = null;
do {
try {
Thread.sleep(10000); // This blocks the whole thread, not just the Task!
} catch (InterruptedException e) {
return Boolean.FALSE;
}
status = client.getStatus(uuid);
} while (Status.PENDING.equals(status.status()));
final var document = client.getReport(uuid);
return Boolean.TRUE;
}
}
I want to give the idle time (10 seconds) to another task. But when Thread.sleep(10000) is called, the current thread suspends its execution: the first 8 Tasks sleep while the other 92 Tasks wait in the queue for those 10 seconds. How can I have all 100 Tasks in progress at the same time?
The Answer by Yevgeniy looks correct, regarding Java today. You want to have your cake and eat it too, in that you want a thread to sleep before repeating a task but you also want that thread to do other work. That is not possible today, but may be in the future.
Project Loom
In current Java, a Java thread is mapped directly to a host OS thread. In all common OSes such as macOS, BSD, Linux, Windows, and such, when code executing in a host thread blocks (stops to wait for sleep, or storage I/O, or network I/O, etc.) the thread too blocks. The blocked thread suspends, and the host OS generally runs another thread on that otherwise unused core. But the crucial point is that the suspended thread performs no further work until your blocking call to sleep returns.
This picture may change in the not-so-distant future. Project Loom seeks to add virtual threads to the concurrency facilities in Java.
In this new technology, many Java virtual threads are mapped to each host OS thread. Juggling the many Java virtual threads is managed by the JVM rather than by the OS. When the JVM detects that a virtual thread’s executing code is blocking, that virtual thread is "parked", set aside by the JVM, with another virtual thread swapped in for execution on that "real" host OS thread. When the parked thread's blocking call returns, it can be reassigned to a "real" host OS thread for further execution. Under Project Loom, the host OS threads are kept busy, never idled while any pending virtual thread has work to do.
This swapping between virtual threads is highly efficient, so that thousands, even millions, of threads can be running at a time on conventional computer hardware.
Using virtual threads, your code will indeed work as you had hoped: A blocking call in Java will not block the host OS thread. But virtual threads are experimental, still in development, scheduled as a preview feature in Java 19. Early-access builds of Java 19 with Loom technology included are available now for you to try. But for production deployment today, you'll need to follow the advice in the Answer by Yevgeniy.
Take my coverage here with a grain of salt, as I am not an expert on concurrency. You can hear it from the actual experts, in the articles, interviews, and presentations by members of the Project Loom team including Ron Pressler and Alan Bateman.
EDIT: I just posted this answer and realized that you seem to be using that code to emulate real user interactions with some system. I would strongly recommend just using a load testing utility for that, rather than trying to come up with your own. However, in that case just using a CachedThreadPool might do the trick, although probably not a very robust or scalable solution.
Thread.sleep() behavior here is working as intended: it suspends the thread to let the CPU execute other threads.
Note that in this state a thread can be interrupted for a number of reasons unrelated to your code, and in that case your Task returns false: I'm assuming you actually have some retry logic down the line.
So you want two mutually exclusive things: on the one hand, if the document isn't ready, the thread should be free to do something else, but should somehow return and check that document's status again in 10 seconds.
That means you have to choose:
You definitely need that once-every-10-seconds check for each document - in that case, maybe use a cachedThreadPool and have it generate as many threads as necessary, just keep in mind that you'll carry the overhead for numerous threads doing virtually nothing.
Or, you can first initiate that asynchronous document creation process and then only check for status in your callables, retrying as needed.
Something like:
public class Task implements Callable<Boolean> {
private final ReportClient client;
private final UUID uuid;
// all args constructor omitted for brevity
@Override
public Boolean call() {
GetStatusResponse status = client.getStatus(uuid);
if (!Status.PENDING.equals(status.status())) { // no longer pending: the document is ready
final var document = client.getReport(uuid);
return Boolean.TRUE;
} else {
return Boolean.FALSE; //retry next time
}
}
}
List<Callable<Boolean>> callableTasks = new ArrayList<>();
for (int i = 0; i < 100; i++) {
var uuid = client.createDocument(documentId); //not sure where documentId comes from here in your code
callableTasks.add(new Task(client, uuid));
}
List<Future<Boolean>> results = SERVICE.invokeAll(callableTasks);
// retry logic until all results come back as `true` here
This assumes that createDocument is relatively efficient. If it is not, that stage can be parallelized just as well: you just need a separate list of Callable tasks (so that each one can return its UUID) and invoke them using the executor service.
Note that we also assume that the document's status will indeed eventually change to something other than PENDING, and that might very well not be the case. You might want to have a timeout for retries.
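The retry-with-timeout idea could look roughly like this. It is only a sketch under assumptions: the class RetryRoundsSketch and the method retryUntilDone are invented for the illustration, the tasks are plain Callable<Boolean> stand-ins for the status checks, and a failed check is simply treated as "not ready yet".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class RetryRoundsSketch {

    // Re-submits the tasks that returned false until none are left or the timeout passes.
    // Returns the number of tasks that never reported true.
    static int retryUntilDone(ExecutorService service, List<Callable<Boolean>> tasks,
                              long pauseMillis, long timeoutMillis) {
        List<Callable<Boolean>> pending = new ArrayList<>(tasks);
        long start = System.nanoTime();
        try {
            while (!pending.isEmpty()) {
                // invokeAll returns the futures in the same order as the submitted list.
                List<Future<Boolean>> results = service.invokeAll(pending);
                List<Callable<Boolean>> stillPending = new ArrayList<>();
                for (int i = 0; i < results.size(); i++) {
                    boolean done;
                    try {
                        done = results.get(i).get();
                    } catch (ExecutionException e) {
                        done = false; // assumption: a failed check counts as "not ready yet"
                    }
                    if (!done) {
                        stillPending.add(pending.get(i));
                    }
                }
                pending = stillPending;
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
                if (pending.isEmpty() || elapsedMillis >= timeoutMillis) {
                    break;
                }
                Thread.sleep(pauseMillis); // wait before the next round of checks
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pending.size();
    }

    public static void main(String[] args) {
        ExecutorService service = Executors.newFixedThreadPool(4);
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            AtomicInteger checks = new AtomicInteger();
            tasks.add(() -> checks.incrementAndGet() >= 3); // "ready" on the third check
        }
        int unfinished = retryUntilDone(service, tasks, 50, 5_000);
        service.shutdown();
        System.out.println("Unfinished tasks: " + unfinished);
    }
}
```

Between rounds only the calling thread sleeps, so the pool threads are free, and the timeout guarantees the loop ends even if a document stays PENDING forever.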
In your case, it seems like you need to check every x seconds whether a certain condition is met. In fact, from your code the document generation seems to be asynchronous, and all the Task does after starting it is wait for the document generation to finish.
You could launch every document generation from your Thread-Main and use a ScheduledThreadPoolExecutor to verify every x seconds whether the document generation has been completed. At that point, you retrieve the result and cancel the corresponding Task's scheduling.
Basically, one ConcurrentHashMap (mapRes) is shared between the main thread and the Tasks you've scheduled, while the other map, mapTasks, is used only locally within the main thread to keep track of the ScheduledFuture returned for every Task.
public class Main {
public static void main(String[] args) throws InterruptedException {
ScheduledThreadPoolExecutor pool = (ScheduledThreadPoolExecutor) Executors.newScheduledThreadPool(8);
//ConcurrentHashMap shared among the submitted tasks, keyed by uuid, where each Task updates its outcome to true as soon as the document has been produced
ConcurrentHashMap<String, Boolean> mapRes = new ConcurrentHashMap<>();
//HashMap containing the ScheduledFuture returned by scheduling each Task, used to cancel their repetition as soon as the document has been produced
Map<String, ScheduledFuture<?>> mapTasks = new HashMap<>();
for (int i = 0; i < 100; i++) {
//Starting the document generation from the main thread (client and documentId come from your code)
String uuid = client.createDocument(documentId);
mapRes.put(uuid, false);
//Re-running each Task 10 seconds after its previous check, with an initial delay of i*10 ms so that they do not all start at the same time
ScheduledFuture<?> schedFut = pool.scheduleWithFixedDelay(new Task(client, uuid, mapRes), i * 10, 10000, TimeUnit.MILLISECONDS);
//Adding the ScheduledFuture to the map
mapTasks.put(uuid, schedFut);
}
//Keep checking the outcome of each task until all of them have been canceled due to completion
while (!mapTasks.values().stream().allMatch(v -> v.isCancelled())) {
for (String key : mapTasks.keySet()) {
//Canceling the task's scheduling if:
// - Its result is positive (i.e. its verification is terminated)
// - The task hasn't been canceled already
if (mapRes.get(key) && !mapTasks.get(key).isCancelled()) {
mapTasks.get(key).cancel(true);
}
}
//Sleeping between checks so that the main thread does not busy-wait
Thread.sleep(1000);
}
pool.shutdown();
}
}
class Task implements Runnable {
private final ReportClient client;
private final String uuid;
private final ConcurrentHashMap<String, Boolean> mapRes;
public Task(ReportClient client, String uuid, ConcurrentHashMap<String, Boolean> mapRes) {
this.client = client;
this.uuid = uuid;
this.mapRes = mapRes;
}
@Override
public void run() {
//This is taken from your code and I'm assuming that if it's not pending then it's completed
if (!Status.PENDING.equals(client.getStatus(uuid).status())) {
mapRes.replace(uuid, true);
}
}
}
I've tested your case locally, by emulating a scenario where n Tasks wait for a folder with their same id to be created (or uuid in your case). I'll post it right here as a sample in case you'd like to try something simpler first.
public class Main {
public static void main(String[] args) {
ScheduledThreadPoolExecutor pool = (ScheduledThreadPoolExecutor) Executors.newScheduledThreadPool(2);
ConcurrentHashMap<Integer, Boolean> mapRes = new ConcurrentHashMap<>();
for (int i = 0; i < 16; i++) {
mapRes.put(i, false);
}
ScheduledFuture<?> schedFut;
Map<Integer, ScheduledFuture<?>> mapTasks = new HashMap<>();
for (int i = 0; i < 16; i++) {
schedFut = pool.scheduleWithFixedDelay(new MyTask(i, mapRes), i * 20, 3000, TimeUnit.MILLISECONDS);
mapTasks.put(i, schedFut);
}
while (!mapTasks.values().stream().allMatch(v -> v.isCancelled())) {
for (Integer key : mapTasks.keySet()) {
if (mapRes.get(key) && !mapTasks.get(key).isCancelled()) {
schedFut = mapTasks.get(key);
schedFut.cancel(true);
}
}
}
pool.shutdown();
}
}
class MyTask implements Runnable {
private int num;
private ConcurrentHashMap mapRes;
public MyTask(int num, ConcurrentHashMap mapRes) {
this.num = num;
this.mapRes = mapRes;
}
@Override
public void run() {
System.out.println("Task " + num + " is checking whether the folder exists: " + Files.exists(Path.of("./" + num)));
if (Files.exists(Path.of("./" + num))) {
mapRes.replace(num, true);
}
}
}
I want to create two threads in my application that will run two methods. I'm using the builder design pattern, and inside the build method I have something like this (request is the object that is passed):
Rules rule;
Request build() {
Request request = new Request(this);
//I want one thread to call this method
Boolean isExceeding = this.rule.volumeExceeding(request);
//Another thread to call this method
Boolean isRepeating = this.rule.volumeRepeating(request);
//Some sort of timer that will wait until both values are received,
//If one value takes too long to be received kill the thread and continue with
//whatever value was received.
..Logic based on 2 booleans..
return request;
}
Here's what this class looks like:
public class Rules {
public Boolean volumeExceeding(Request request) {
...some...logic...
return true/false;
}
public Boolean volumeRepeating(Request request) {
...some...logic...
return true/false;
}
}
I have commented in the code what I'd like to happen. Basically, I'd like to create two threads that'll run their respective method. It'll wait until both are finished, however, if one takes too long (example: more than 10ms) then return the value that was completed. How do I create this? I'm trying to understand the multithreading tutorials, but the examples are so generic that it's hard to take what they did and apply it to something more complicated.
One way to do that is to use CompletableFutures:
import java.util.concurrent.CompletableFuture;
class Main {
private static final long timeout = 1_000; // 1 second
static Boolean volumeExceeding(Object request) {
System.out.println(Thread.currentThread().getName());
final long startpoint = System.currentTimeMillis();
// do stuff with request but we do dummy stuff
for (int i = 0; i < 1_000_000; i++) {
if (System.currentTimeMillis() - startpoint > timeout) {
return false;
}
Math.log(Math.sqrt(i));
}
return true;
}
static Boolean volumeRepeating(Object request) {
System.out.println(Thread.currentThread().getName());
final long startpoint = System.currentTimeMillis();
// do stuff with request but we do dummy stuff
for (int i = 0; i < 1_000_000_000; i++) {
if (System.currentTimeMillis() - startpoint > timeout) {
return false;
}
Math.log(Math.sqrt(i));
}
return true;
}
public static void main(String[] args) {
final Object request = new Object();
CompletableFuture<Boolean> isExceedingFuture = CompletableFuture.supplyAsync(
() -> Main.volumeExceeding(request));
CompletableFuture<Boolean> isRepeatingFuture = CompletableFuture.supplyAsync(
() -> Main.volumeRepeating(request));
Boolean isExceeding = isExceedingFuture.join();
Boolean isRepeating = isRepeatingFuture.join();
System.out.println(isExceeding);
System.out.println(isRepeating);
}
}
Notice that one task takes significantly longer than the other.
What's happening? By calling CompletableFuture.supplyAsync you submit those tasks to the common pool for execution, so the two tasks are executed by different threads. What you've asked for is that a task is stopped when it takes too long. Therefore you can simply remember the time when a task has started and periodically check it against a timeout. Important: do this check at points where the task can return while leaving the data in a consistent state. Also note that you can of course place multiple checks.
Here's a nice guide about CompletableFuture: Guide To CompletableFuture
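If the caller should also stop waiting after a deadline ("continue with whatever value was received"), one option is the timed get that CompletableFuture inherits from Future. The following is a sketch under assumptions, not the only way to do it: the class TimedRuleCheckSketch, the helper getOrFallback and the sleeping "slow rule" are made up for the illustration, and false is an arbitrary fallback value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedRuleCheckSketch {

    // Waits up to timeoutMillis for the result; returns fallback if it is not ready in time.
    static Boolean getOrFallback(CompletableFuture<Boolean> future, long timeoutMillis, Boolean fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // only the wait is abandoned; the task itself keeps running
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the rule itself threw an exception
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Boolean> fast = CompletableFuture.supplyAsync(() -> true);
        CompletableFuture<Boolean> slow = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(2_000); // stand-in for a rule that takes too long
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return true;
        });
        // Both tasks are already running, so the waits overlap with the work.
        System.out.println("isExceeding: " + getOrFallback(fast, 500, false));
        System.out.println("isRepeating: " + getOrFallback(slow, 500, false));
    }
}
```

Note that the two waits happen one after the other, so the worst case is twice the timeout, and that a timed-out task is not stopped by this; combining it with the periodic check inside the task covers both sides.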
If I understand your question correctly, then you should do this with a ticketing system (also known as provider-consumer pattern or producer-consumer pattern), so your threads are reused (which is a significant performance boost, if those operations are time critical).
The general idea should be:
application initialization
Initialize 2 or more "consumer" threads, which can work tickets (also called jobs).
runtime
Feed the consumer threads tickets (or jobs) that will be waited on for (about) as long as you like. However depending on the JVM, the waiting period will most likely not be exactly n milliseconds, as most often schedulers are more 'lax' in regards to waiting periods for timeouts. e.g. Thread.sleep() will almost always be off by a bunch of milliseconds (always late, never early - to my knowledge).
If the thread does not return after a given waiting period, then that result must be neglected (according to your logic), and the ticket (and thus the thread) must be informed to abort that ticket. It is important that you not interrupt the thread, since that can lead to exceptions, or prevent locks from being unlocked.
Remember, that halting or stopping threads from the outside is almost always problematic with locks, so I would suggest, your jobs visit a possible exit point periodically, so if you stop caring about a result, they can be safely terminated.
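A job that visits such an exit point periodically might be sketched like this. The class CancellableJobSketch, its cancel() method and the counting loop are assumptions made for the example; the loop body stands in for one unit of real work.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableJobSketch implements Runnable {

    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private volatile long iterationsDone = 0;

    // Called by whoever has stopped caring about the result.
    void cancel() {
        cancelled.set(true);
    }

    long iterationsDone() {
        return iterationsDone;
    }

    @Override
    public void run() {
        for (long i = 0; i < Long.MAX_VALUE; i++) {
            if (cancelled.get()) {
                return; // safe exit point: no locks held, state is consistent
            }
            iterationsDone = i + 1; // stand-in for one unit of real work
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CancellableJobSketch job = new CancellableJobSketch();
        Thread worker = new Thread(job);
        worker.start();
        Thread.sleep(10); // the caller's waiting period
        job.cancel();     // ask the job to stop instead of interrupting the thread
        worker.join();    // returns promptly because the job checks the flag
        System.out.println("Stopped after " + job.iterationsDone() + " iterations");
    }
}
```

The worker decides where it is safe to stop, so the caller never has to interrupt or kill the thread from the outside.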
Good day,
I am writing a program where a method is called for each line read from a text file. As each call of this method is independent of any other line read, I can run them in parallel. To maximize CPU usage I use an ExecutorService to which I submit each run() call. As the text file has 15 million lines, I need to stagger the submissions so as not to create too many jobs at once (OutOfMemory exception). I also want to keep track of how long each submitted run has been running, as I have seen that some are not finishing. The problem is that when I try to use the Future.get method with a timeout, the timeout refers to the time since the task got into the queue of the ExecutorService, not since it started running, if it even started. I would like to get the time since it started running, not since it got into the queue.
The code looks like this:
ExecutorService executorService= Executors.newFixedThreadPool(ncpu);
line = reader.readLine();
long start = System.currentTimeMillis();
HashMap<MyFut,String> runs = new HashMap<MyFut, String>();
HashMap<Future, MyFut> tasks = new HashMap<Future, MyFut>();
while ( (line = reader.readLine()) != null ) {
String s = line.split("\t")[1];
final String m = line.split("\t")[0];
MyFut f = new MyFut(s, m);
tasks.put(executorService.submit(f), f);
runs.put(f, line);
while (tasks.size()>ncpu*100){
try {
Thread.sleep(100);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Iterator<Future> i = tasks.keySet().iterator();
while(i.hasNext()){
Future task = i.next();
if (task.isDone()){
i.remove();
} else {
MyFut fut = tasks.get(task);
if (fut.elapsed()>10000){
System.out.println(line);
task.cancel(true);
i.remove();
}
}
}
}
}
private static class MyFut implements Runnable{
private long start;
String copy;
String id2;
public MyFut(String m, String id){
super();
copy=m;
id2 = id;
}
public long elapsed(){
return System.currentTimeMillis()-start;
}
@Override
public void run() {
start = System.currentTimeMillis();
// do something...
}
}
As you can see I try to keep track of how many jobs I have sent and if a threshold is passed I wait a bit until some have finished. I also try to check if any of the jobs is taking too long to cancel it, keeping in mind which failed, and continue execution. This is not working as I hoped. 10 seconds execution for one task is much more than needed (I get 1000 lines done in 70 to 130s depending on machine and number of cpu).
What am I doing wrong? Shouldn't the run method in my Runnable class be called only when some Thread in the ExecutorService is free and starts working on it? I get a lot of results that take more than 10 seconds. Is there a better way to achieve what I am trying?
Thanks.
If you are using Future, I would recommend changing Runnable to Callable and returning the total execution time of the task as the result. Below is sample code:
import java.util.concurrent.Callable;
public class MyFut implements Callable<Long> {
String copy;
String id2;
public MyFut(String m, String id) {
super();
copy = m;
id2 = id;
}
@Override
public Long call() throws Exception {
long start = System.currentTimeMillis();
//do something...
long end = System.currentTimeMillis();
return (end - start);
}
}
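Reading that value back on the submitting side could look like the sketch below. The names TimedCallableSketch, timed and runAndMeasure are assumptions for the illustration, and the sleeping lambdas are placeholders for the real per-line work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TimedCallableSketch {

    // Same idea as MyFut above: the task reports its own run time in milliseconds.
    static Callable<Long> timed(Runnable work) {
        return () -> {
            long start = System.nanoTime();
            work.run();
            return (System.nanoTime() - start) / 1_000_000L;
        };
    }

    // Submits the work and returns the run time measured inside the worker thread.
    static long runAndMeasure(ExecutorService executor, Runnable work) {
        Future<Long> future = executor.submit(timed(work));
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Runnable sleeper = () -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        executor.submit(sleeper); // keeps the only thread busy, so the next job has to queue
        long runMillis = runAndMeasure(executor, sleeper);
        executor.shutdown();
        System.out.println("Run time without queue time: " + runMillis + " ms");
    }
}
```

Because the clock starts inside call(), the time the job spent waiting in the queue is not part of the reported value.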
You are making your work harder than it needs to be. Java’s framework provides everything you want; you only have to use it.
Limiting the number of pending work items works by using a bounded queue, but the ExecutorService returned by Executors.newFixedThreadPool() uses an unbounded queue. The policy to wait once the bounded queue is full can be implemented via a RejectedExecutionHandler. The entire thing looks like this:
static class WaitingRejectionHandler implements RejectedExecutionHandler {
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
try {
executor.getQueue().put(r);// block until capacity available
} catch(InterruptedException ex) {
throw new RejectedExecutionException(ex);
}
}
}
public static void main(String[] args)
{
final int nCPU=Runtime.getRuntime().availableProcessors();
final int maxPendingJobs=100;
ExecutorService executorService=new ThreadPoolExecutor(nCPU, nCPU, 1, TimeUnit.MINUTES,
new ArrayBlockingQueue<Runnable>(maxPendingJobs), new WaitingRejectionHandler());
// start flooding the `executorService` with jobs here
That’s all.
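For a complete run of this setup, here is a small self-contained sketch. It repeats the handler so that it compiles on its own; the class BoundedSubmitSketch, the method submitAll and the 1 ms placeholder job are assumptions for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedSubmitSketch {

    static class WaitingRejectionHandler implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            try {
                executor.getQueue().put(r); // block the submitter until a slot frees up
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(ex);
            }
        }
    }

    // Pushes jobCount short jobs through a queue holding at most maxPending entries.
    // Returns {number of completed jobs, largest queue size seen by the submitter}.
    static int[] submitAll(int jobCount, int threads, int maxPending) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(threads, threads, 1, TimeUnit.MINUTES,
                new ArrayBlockingQueue<Runnable>(maxPending), new WaitingRejectionHandler());
        AtomicInteger completed = new AtomicInteger();
        int largestQueue = 0;
        for (int i = 0; i < jobCount; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(1); // placeholder for the real per-line work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
            largestQueue = Math.max(largestQueue, executor.getQueue().size());
        }
        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {completed.get(), largestQueue};
    }

    public static void main(String[] args) {
        int[] result = submitAll(1_000, Runtime.getRuntime().availableProcessors(), 100);
        System.out.println("Completed: " + result[0] + ", largest queue: " + result[1]);
    }
}
```

The submitting loop needs no sleep or bookkeeping of its own: execute() simply blocks whenever the queue is full, which keeps memory use bounded.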
Measuring the elapsed time within a job is quite easy as it has nothing to do with multi-threading:
long startTime=System.nanoTime();
// do your work here
long elapsedTimeSoFar = System.nanoTime()-startTime;
But maybe you don’t need it anymore once you are using the bounded queue.
By the way the Future.get method with timeout does not refer to the time since it got into the queue of the ExecutorService, it refers to the time of invoking the get method itself. In other words, it tells how long the get method is allowed to wait, nothing more.
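A small experiment can make this visible. It is only a sketch with assumed names (FutureGetTimeoutSketch, resultArrivesInTime) and arbitrary durations: the task takes longer than the get timeout, yet get succeeds because the caller started waiting late.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureGetTimeoutSketch {

    // Submits a task that sleeps taskMillis, waits waitBeforeGetMillis, then calls
    // get(getTimeoutMillis). Returns true if get() delivered the result in time.
    static boolean resultArrivesInTime(long taskMillis, long waitBeforeGetMillis, long getTimeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(() -> {
                Thread.sleep(taskMillis);
                return "done";
            });
            Thread.sleep(waitBeforeGetMillis);
            // The timeout clock starts here, not when the task was submitted.
            future.get(getTimeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            executor.shutdownNow(); // interrupts the task if it is still sleeping
        }
    }

    public static void main(String[] args) {
        // The task needs 600 ms, more than the 400 ms timeout, but get() is called 500 ms in.
        System.out.println("late get succeeds: " + resultArrivesInTime(600, 500, 400));
        // Calling get() immediately with a 100 ms timeout is not enough for a 600 ms task.
        System.out.println("early get succeeds: " + resultArrivesInTime(600, 0, 100));
    }
}
```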
I am emulating a simple connection between a client and a server. The client's requests are sent and the server processes them concurrently: the server class extends Thread and the task is run when the object is created.
The server is always open, listening for requests. When there is one, an object is created using the socket as a parameter, and the task is then run as I said.
I am trying to measure the time it takes to process all the requests one client sends at once, but I can't manage to do it. With threads, pools and such I would usually take the initial time, take the time when I know everything has finished, and voilà (usually after a join or after checking whether the pool is terminated).
But now I can't manage to know when all the tasks are done, because the server is always running.
Any ideas?
I'm going to try to sum up the code in case someone didn't understand:
import java.net.*;
import java.io.*;
public class MyServer extends Thread
{
Socket socket;
public MyServer(Socket s) { socket=s; this.start(); }
public void run()
{
// processing of the data sent by the client (just printing values)
// data is read properly, don't worry
socket.close();
}
public static void main(String[] args)
{
int port = 2001; // the same one the client is using
try
{
ServerSocket chuff = new ServerSocket(port, 3000);
while (true)
{
Socket connection = chuff.accept();
new MyServer(connection);
}
} catch (Exception e) {}
}
}
It's not clear from your question whether a client will (a) send more work down a single connection later, or (b) open multiple connections at once.
If it won't ever do either, then the processing of one connection is the unit of work to time (and in fact I think all you need to time is how long the thread is alive for).
If a client might do one of those things, then if you can, change your protocol so that clients send work in one single packet; then measure how long it takes to process one of those packets. This gives you an unambiguous definition of what you are actually measuring, the lack of which might be what is causing you problems here.
For each incoming connection, I would do it as follows:
Handover the connection to a Runnable class that performs the work.
Measure the time taken by the run method and at the end of run method, prepare a Statistics object that contains the client details and the time taken to run and post it to a LinkedBlockingQueue.
Have another thread that would poll this queue, extracts the Statistics object and updates the database or data where per-client run times are tracked.
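The three steps above can be sketched as follows. Everything named here (RunStatisticsSketch, the Statistics class, timed, collect) is a hypothetical illustration, a plain map stands in for the database, and the collector runs on the main thread rather than on a dedicated polling thread to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class RunStatisticsSketch {

    // Hypothetical statistics record: which client, and how long its run took.
    static final class Statistics {
        final String clientId;
        final long elapsedNanos;

        Statistics(String clientId, long elapsedNanos) {
            this.clientId = clientId;
            this.elapsedNanos = elapsedNanos;
        }
    }

    // Wraps the per-connection work so that its run time is posted to the queue.
    static Runnable timed(String clientId, Runnable work, BlockingQueue<Statistics> queue) {
        return () -> {
            long start = System.nanoTime();
            try {
                work.run();
            } finally {
                queue.add(new Statistics(clientId, System.nanoTime() - start));
            }
        };
    }

    // Takes `expected` entries from the queue and sums the run time per client.
    static Map<String, Long> collect(BlockingQueue<Statistics> queue, int expected) {
        Map<String, Long> totals = new HashMap<>();
        try {
            for (int i = 0; i < expected; i++) {
                Statistics s = queue.take(); // blocks until a worker posts its result
                totals.merge(s.clientId, s.elapsedNanos, Long::sum);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return totals;
    }

    public static void main(String[] args) {
        BlockingQueue<Statistics> queue = new LinkedBlockingQueue<>();
        Runnable work = () -> {
            try {
                Thread.sleep(20); // stand-in for handling one connection
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        new Thread(timed("client-A", work, queue)).start();
        new Thread(timed("client-A", work, queue)).start();
        new Thread(timed("client-B", work, queue)).start();
        collect(queue, 3).forEach((client, nanos) ->
                System.out.println(client + ": " + nanos / 1_000_000 + " ms"));
    }
}
```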
If you want to be notified when no more connections are incoming, you must set SO_TIMEOUT; otherwise accept() blocks forever. Timeouts are enabled by invoking ServerSocket.setSoTimeout(int).
To measure performance, each thread could update a shared variable with the time at which it completed its task. Mark this variable as volatile so that the latest value is visible to all threads (if several threads can finish at nearly the same time, an AtomicLong that keeps the maximum is safer), then wait until all your threads have terminated and accept() has raised a java.net.SocketTimeoutException.
Note that you're also measuring the network latency between the incoming requests. Is this intended?
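The accept loop with SO_TIMEOUT might look roughly like this. It is a sketch under assumptions: the names AcceptTimeoutSketch, acceptUntilIdle and demo are invented, the accepted sockets are closed immediately instead of being handed to a worker, and the test clients connect over the loopback interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class AcceptTimeoutSketch {

    // Accepts connections until none arrives for idleMillis; returns how many were accepted.
    static int acceptUntilIdle(ServerSocket server, int idleMillis) {
        int accepted = 0;
        try {
            server.setSoTimeout(idleMillis); // accept() now throws instead of blocking forever
            while (true) {
                try (Socket connection = server.accept()) {
                    accepted++; // a real server would hand the socket to a worker here
                } catch (SocketTimeoutException e) {
                    return accepted; // no client connected within idleMillis
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a server on any free port, connects `clients` local clients, then accepts them.
    static int demo(int clients, int idleMillis) {
        try (ServerSocket server = new ServerSocket(0)) {
            for (int i = 0; i < clients; i++) {
                new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort()).close();
            }
            return acceptUntilIdle(server, idleMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int accepted = demo(3, 200);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("Accepted " + accepted + " connections in " + elapsedMillis + " ms");
    }
}
```

The measured time includes the final idle period, so the timeout value should be subtracted when only the processing time matters.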
I would highly recommend using an ExecutorService instead of creating a new Thread every time the server accepts a client task.
If you want to measure how long the server takes to perform a number of tasks, you could send the list of tasks in one go, as mentioned above, and use a CompletionService to measure the total time needed to complete all tasks (Runnables). Below is a sample test class that shows how to capture the completion time:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.*;
public class ServerPerformanceTest {
public static void main(String[] args) {
System.out.println("Total time taken : " + totalTimeTaken(1000, 16));
}
public static long totalTimeTaken(final int taskCount, final int threadCount) {
//Mocking Dummy task send by client
Runnable clientTask = new Runnable() {
@Override
public void run() {
System.out.println("task done");
}
};
long startTime = System.currentTimeMillis();
//Prepare list of tasks for performance test
List<Runnable> tasks = Collections.nCopies(taskCount, clientTask);
ExecutorService executorService = Executors.newFixedThreadPool(threadCount);
ExecutorCompletionService<String> completionService = new ExecutorCompletionService<String>(executorService);
//Submit all tasks
for (Runnable _task : tasks) {
completionService.submit(_task, "Done");
}
//Get from all Future tasks till all tasks completed
for (int i = 0; i < tasks.size(); i++) {
try {
completionService.take().get();
} catch (InterruptedException e) {
e.printStackTrace(); //do something
} catch (ExecutionException e) {
e.printStackTrace(); //do something
}
}
executorService.shutdown(); //release the pool threads so that the JVM can exit
long endTime = System.currentTimeMillis();
return (endTime - startTime);
}
}