I currently have code that does the following:
private final static ExecutorService pool = Executors.newCachedThreadPool();
public void foo(){
FutureTask<MyObject> first_task = createFutureTask();
FutureTask<MyObject> second_task = createFutureTask();
...
pool.execute(first_task);
pool.execute(second_task);
....
first_task.get();
second_task.get();
...
System.out.println(time taken);
}
The problem is this: each future task prints the time its computation takes, so on the console I see, for example:
first_task : 20000ms
second_task : 18000ms
...
but the total time (System.out.println(time taken)) is much larger than the longest time taken by any future task. In this example, the method foo would take around 1 minute (compared to the 20s of first_task).
I was under the impression that these future tasks run in parallel but from the timings it seems as though they are being run one after the other. Am I using this API correctly?
You're using the API correctly, but keep in mind that each task runs in a separate thread of the same JVM process, so the tasks are concurrent but not necessarily parallel.
Each thread would have to run on a separate CPU core to actually execute at the same time. Whether or not this is possible depends on your machine, its current load, and how the JVM and OS are able to schedule the threads across cores.
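A quick way to see whether two submitted tasks overlap at all is to time them around both get() calls. The following is only a minimal sketch: ParallelTimingDemo, sleepingTask and runBoth are hypothetical names, and the tasks merely sleep, so they overlap regardless of core count. CPU-bound tasks would only overlap like this if free cores are available.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ParallelTimingDemo {

    // Hypothetical stand-in for the real computation: blocks for the given time
    // and returns how long it actually took, in milliseconds.
    static Callable<Long> sleepingTask(long millis) {
        return () -> {
            long start = System.nanoTime();
            Thread.sleep(millis);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        };
    }

    // Submits both tasks to a cached pool, waits for both, and returns the
    // total wall-clock time in milliseconds.
    static long runBoth(long firstMillis, long secondMillis) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            long start = System.nanoTime();
            Future<Long> first = pool.submit(sleepingTask(firstMillis));
            Future<Long> second = pool.submit(sleepingTask(secondMillis));
            first.get();
            second.get();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("total ms: " + runBoth(200, 180));
    }
}
```

If the total is close to the longer task rather than the sum of both, the tasks did overlap. If the real tasks do not, it is worth checking whether they contend for a shared lock or other resource.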
I was reading ScheduledThreadPoolExecutor JavaDoc and came across the following thing:
Delayed tasks execute no sooner than they are enabled, but without any
real-time guarantees about when, after they are enabled, they will
commence. Tasks scheduled for exactly the same execution time are
enabled in first-in-first-out (FIFO) order of submission.
So, if I write something like this:
ScheduledExecutorService ses = Executors.newScheduledThreadPool(4); //uses ScheduledThreadPoolExecutor internally
Callable<Integer> c;
//initialize c
ses.schedule(c, 10, TimeUnit.SECONDS);
are there no guarantees that the callable will start executing 10 seconds after scheduling? As far as I understand, the specification allows it to execute even an hour after scheduling (without any real-time guarantees, as stated in the documentation).
How does it work in practice? Should I expect some really long delay?
Your understanding is correct. The Executor is not claiming to be a real-time system with any sort of timing guarantees. The only thing it will guarantee is that it doesn't run tasks too early.
In practice, the timing of a well-tuned Executor is very accurate. In my experience, tasks typically start within 10ms of the scheduled time. The only time you will see scheduling get pushed back very far is if your Executor lacks the resources to run its workload. So this is more of a tuning issue.
Realistically, if you give your Executor enough resources to work with, the timing will be quite accurate.
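To check this on your own machine, you can measure how late a one-shot task starts. This is a rough sketch under assumptions: ScheduleDelayDemo and measureLatenessNanos are names made up for illustration, and the result depends on your hardware and load.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class ScheduleDelayDemo {

    // Schedules a task after delayMillis and returns how late it started,
    // in nanoseconds, relative to the requested delay.
    static long measureLatenessNanos(long delayMillis) {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(1);
        try {
            long requested = TimeUnit.MILLISECONDS.toNanos(delayMillis);
            // The task reports the time at which it actually began running.
            Callable<Long> reportStart = () -> System.nanoTime();
            long submitted = System.nanoTime();
            ScheduledFuture<Long> started =
                    ses.schedule(reportStart, delayMillis, TimeUnit.MILLISECONDS);
            return (started.get() - submitted) - requested;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            ses.shutdown();
        }
    }

    public static void main(String[] args) {
        long lateNanos = measureLatenessNanos(100);
        System.out.println("started " + lateNanos / 1_000 + " us after the requested delay");
    }
}
```

The lateness should never be negative, which matches the "no sooner than" guarantee; how large it gets is the part that is not guaranteed.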
One thing you don't want to do with an Executor is use the scheduling as part of a rate-based calculation. For example, don't schedule a task to run every 1 second and use it to compute <somemetric> per second without factoring in the time at which the task actually runs.
Another thing to be mindful of is the cost of context switching. If you schedule multiple tasks to run every 1ms, the Executor will not be able to keep up with both running your tasks and context switching every 1ms.
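For the rate calculation, one option is to divide by the measured elapsed time rather than the nominal period. A minimal sketch, assuming a hypothetical RateMeter class (not part of java.util.concurrent):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class RateMeter {
    private final AtomicLong events = new AtomicLong();
    private long lastCount;
    private long lastNanos;

    RateMeter(long startNanos) {
        this.lastNanos = startNanos;
    }

    // Called by the code being measured, once per event.
    void record() {
        events.incrementAndGet();
    }

    // Returns events per second since the previous sample, using the measured
    // elapsed time instead of the nominal scheduling period.
    synchronized double sample(long nowNanos) {
        long count = events.get();
        long elapsed = nowNanos - lastNanos;
        double rate = elapsed <= 0 ? 0.0 : (count - lastCount) * 1e9 / elapsed;
        lastCount = count;
        lastNanos = nowNanos;
        return rate;
    }

    public static void main(String[] args) throws InterruptedException {
        RateMeter meter = new RateMeter(System.nanoTime());
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(1);
        // The sampler may run late; passing the real clock keeps the rate honest.
        ses.scheduleAtFixedRate(
                () -> System.out.printf("%.1f events/s%n", meter.sample(System.nanoTime())),
                250, 250, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 100; i++) {
            meter.record();
            Thread.sleep(10);
        }
        ses.shutdown();
    }
}
```

If the sampler fires 50ms late, the extra events counted in that window are divided by the longer interval, so the reported rate stays correct.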
So, I have a loop where I create thousands of threads which process my data.
I checked and storing a Thread slows down my app.
This is from my loop:
Record r = new Record(id, data, outPath, debug);
//r.start();
threads.add(r);
//id is 4 digits
//data is something like 500 chars long
It stalls my for loop for a while (one run takes a second or more, which is too much!).
Only init > duration: 0:00:06.369
With adding thread to ArrayList > duration: 0:00:07.348
Questions:
what is the best way of storing Threads?
how to make Threads faster?
should I create Threads and run them with a special executor, meaning, for example, 10 at once, then the next 10, etc.? (if yes, then how?)
Consider that having a very high number of threads is not very useful.
At most, the number of threads you can execute at the same time equals the number of cores of your CPU.
The best approach is to reuse existing threads. To do that, you can use the Executor framework.
For example, to create an Executor that internally handles at most 10 threads, you can do the following:
List<Record> records = ...;
ExecutorService executor = Executors.newFixedThreadPool(10);
for (Record r : records) {
executor.submit(r);
}
// At the end stop the executor
executor.shutdown();
With code similar to this, you can submit many thousands of commands (Runnable implementations), but no more than 10 threads will be created.
I'm guessing that it is not the .add method that is really slowing you down. My guess is that the hundreds of threads running in parallel are the real problem. Of course, a simple command like "add" will be queued in the pipeline and can take a long time to be executed, even if the execution itself is fast. It is also possible that your data structure has an add method that is O(n).
Possible solutions for this:
* Find a real wait-free solution for this. E.g. prioritising threads.
* Add them all to your data-structure before executing them
While it is possible to work like this, creating more than a few threads for this kind of work is strongly discouraged. You should use the Executor framework, as David Lorenzo already pointed out.
I have a loop where I create thousands of threads...
That's a bad sign right there. Creating threads is expensive.
Presumably your program creates thousands of threads because it has thousands of tasks to perform. The trick is to decouple the threads from the tasks. Create just a few threads, and reuse them.
That's what a thread pool does for you.
Learn about the java.util.concurrent.ThreadPoolExecutor class and related classes (e.g., Future). It implements a thread pool, and chances are very likely that it provides all of the features that you need.
If your needs are simple enough, you can use one of the static methods in java.util.concurrent.Executors to create and configure a thread pool (e.g., Executors.newFixedThreadPool(N) will create a new thread pool with exactly N threads).
If your tasks are all compute bound, then there's no reason to have any more threads than the number of CPUs in the machine. If your tasks spend time waiting for something (e.g., waiting for commands from a network client), then the decision of how many threads to create becomes more complicated: It depends on how much of what resources those threads use. You may need to experiment to find the right number.
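For the compute-bound case, a pool sized to the core count could look roughly like the sketch below. PoolSizingDemo, work and runAll are hypothetical names, and the summing loop is only a placeholder for real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizingDemo {

    // Hypothetical compute-bound task: sums the integers 1..n.
    static long work(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs taskCount tasks on a pool with one thread per available core
    // and returns the sum of all task results.
    static long runAll(int taskCount) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                tasks.add(() -> work(1000));
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("total: " + runAll(5000));
    }
}
```

Thousands of tasks are queued, but only as many threads as there are cores are ever created. For tasks that wait on I/O, the thread count would need to be tuned by experiment, as described above.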
I apologize in advance if this is a basic question, but I'm new to the material. I've got a piece of software that is kicked off by users submitting jobs through a website. Because the software is itself designed to capitalize on parallel processing, all I want to do is queue up these jobs so they can kick off one after the other. To do this, I've tried to capitalize on the Executor framework built into Java. The code I've developed is:
public JobManager()
{
mcpExecutor = Executors.newSingleThreadExecutor();
}
public Future<MatlabProcessResults> startProcess(inputs)
{
MyProcess myProcess = new MyProcess(inputs);
Future<MyProcessResults> future = mcpExecutor.submit(myProcess);
Long newKey = System.currentTimeMillis();
futures.putIfAbsent(newKey, future);
}
Where startProcess is run every time the "submit" button is pressed. Now, the description of the newSingleThreadExecutor reads:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
This led me to think that it would take multiple tasks, queue them, and only run one instance of the software at a time. As you might suspect, I'm writing because it doesn't do that. It is starting as many tasks as I submit in parallel (I know, the opposite problem of what most people probably want to do). Any help on this issue is much appreciated, and thank you in advance.
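The documented behaviour can be checked in isolation with a small test that counts how many jobs are active at once. This sketch does not diagnose the code above; SingleThreadQueueDemo and maxConcurrentJobs are hypothetical names, and it assumes all jobs go to one shared executor instance. If, hypothetically, a new JobManager (and therefore a new executor) were created for every request, each job would get its own worker thread and the jobs would run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SingleThreadQueueDemo {

    // Submits jobCount short jobs to ONE shared single-thread executor and
    // returns the highest number of jobs that were ever active at once.
    static int maxConcurrentJobs(int jobCount) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < jobCount; i++) {
                futures.add(executor.submit(() -> {
                    int now = active.incrementAndGet();
                    maxActive.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // placeholder for the real job
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    active.decrementAndGet();
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every job to finish
            }
            return maxActive.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("max concurrent jobs: " + maxConcurrentJobs(5));
    }
}
```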
I wrote a multithreaded program that has two to four threads running at the same time.
One of the threads is time critical: it is called every 500 milliseconds and must not be delayed by more than 10 milliseconds. But when the other threads are under heavier load, I see some delay, around two milliseconds (I print the timestamp to show it). So I worry that after running for a long time the delay will exceed 10 milliseconds. Apart from checking the timestamp and adjusting the loop interval to keep the delay under 10 milliseconds, is there any way to make it safe?
Thanks.
Sounds like you need Real-Time Java
If timing is critical, I use a busy wait on a core which is dedicated to that thread. This can give you << 10 micro-second jitter most of the time. It's a bit extreme and will result in the logical thread not being used for anything else.
This is the library I use. You can use it to reserve a logical thread or a whole core. https://github.com/peter-lawrey/Java-Thread-Affinity
By using isolcpus= in grub.conf on Linux you can ensure that the logical thread or core is not used for anything else (except the 100 Hz timer and power management, which are relatively small and cause < 2 us delay).
You can set your threads' priorities:
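The busy-wait part on its own can be sketched with plain JDK calls. This is an illustration only: BusyWaitTimer, spinUntil and runPeriodic are hypothetical names, and the sketch does not pin the thread to a core, which would need the affinity library or OS configuration mentioned above.

```java
class BusyWaitTimer {

    // Spins until System.nanoTime() reaches deadlineNanos and returns the
    // overshoot in nanoseconds (how far past the deadline the loop exited).
    static long spinUntil(long deadlineNanos) {
        long now = System.nanoTime();
        while (now - deadlineNanos < 0) {
            Thread.onSpinWait(); // Java 9+: hints to the CPU that this is a spin loop
            now = System.nanoTime();
        }
        return now - deadlineNanos;
    }

    // Runs the action `ticks` times, one period apart, and returns the worst
    // overshoot seen, in nanoseconds.
    static long runPeriodic(Runnable action, long periodNanos, int ticks) {
        long next = System.nanoTime() + periodNanos;
        long worst = 0;
        for (int i = 0; i < ticks; i++) {
            worst = Math.max(worst, spinUntil(next));
            action.run();
            next += periodNanos; // absolute deadlines, so lateness does not accumulate
        }
        return worst;
    }

    public static void main(String[] args) {
        long worst = runPeriodic(() -> { }, 5_000_000L, 20); // 5 ms period, 20 ticks
        System.out.println("worst overshoot (ns): " + worst);
    }
}
```

Advancing the deadline by a fixed period, rather than by "now plus period", keeps a single late tick from shifting all the following ones.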
myCriticalThread.setPriority(Thread.MAX_PRIORITY);
otherThread.setPriority(Thread.NORM_PRIORITY); // the default
yetAnotherThread.setPriority(Thread.MIN_PRIORITY);
It won't really guarantee anything though.
There is no guarantee that your thread isn't delayed, because the OS may decide to give other processes precedence (unless you put effort in setting up a complete real-time system including a modified OS). That being said, for basic tasks, you should use a ScheduledExecutorService like this:
class A {
private final ScheduledExecutorService exe = Executors.newScheduledThreadPool(1);
public void startCriticalAction(Runnable command) {
this.exe.scheduleAtFixedRate(command, 100, 100, TimeUnit.MILLISECONDS);
}
public void shutdown() {
this.exe.shutdown();
}
}
The executor service will do its best to execute the task every 100ms. You should not develop this functionality yourself, because a lot of things can go wrong.
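To see how close "its best" is on a given machine, the start time of each tick can be compared with its ideal time. A rough sketch, with FixedRateJitterDemo and measure as hypothetical names; the numbers depend entirely on hardware and load.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedRateJitterDemo {

    // Runs `ticks` executions at the given period and returns, for each tick,
    // its offset in nanoseconds from the ideal time firstStart + i * period.
    static long[] measure(long periodMillis, int ticks) {
        ScheduledExecutorService exe = Executors.newScheduledThreadPool(1);
        long[] starts = new long[ticks];
        AtomicInteger index = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(ticks);
        try {
            exe.scheduleAtFixedRate(() -> {
                int i = index.getAndIncrement();
                if (i < ticks) {
                    starts[i] = System.nanoTime();
                    done.countDown();
                }
            }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
            done.await(); // wait until all ticks have been recorded
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            exe.shutdownNow();
        }
        long periodNanos = TimeUnit.MILLISECONDS.toNanos(periodMillis);
        long[] offsets = new long[ticks];
        for (int i = 0; i < ticks; i++) {
            offsets[i] = starts[i] - (starts[0] + i * periodNanos);
        }
        return offsets;
    }

    public static void main(String[] args) {
        for (long offset : measure(100, 10)) {
            System.out.println("offset from ideal (us): " + offset / 1_000);
        }
    }
}
```

Offsets are measured relative to the first tick, so they can be slightly negative if the first tick itself started late.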
Creep up on the timeout:
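// requires: import java.util.concurrent.TimeUnit;
static void waitFor(long timeoutMs) throws InterruptedException {
    long wallTimeEnd = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    long interval = timeoutMs / 2;
    // Far from the deadline: sleep for half of the remaining time.
    while (interval > 10) {
        Thread.sleep(interval);
        interval = TimeUnit.NANOSECONDS.toMillis(wallTimeEnd - System.nanoTime()) / 2;
    }
    // Close to the deadline: yield in a tight loop until it has passed.
    while (wallTimeEnd - System.nanoTime() > 0) {
        Thread.sleep(0);
    }
}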
This only wastes a core for 5-10ms
We are using Hazelcast distributed tasks a lot and realized that sometimes just starting a task takes more than 2 seconds, even before the task itself is executed. We did this on a single machine; that is, no network overhead. The task itself has just a single line of code in its call() method (we placed a System.currentTimeMillis() at the beginning and end) and stores the passed argument "client" in its constructor; nothing else.
Task is started as follows:
FutureTask<Member> task = new DistributedTask<Member>(new NotifyWaitingClientTask(client),
theId);
ExecutorService executorService = hazelcastInstance.getExecutorService();
executorService.execute(task);
...
task.get();
The question is: is this a usual time? We expected milliseconds on a local machine.
This is not normal unless you have too many tasks that take too much time and the Executor threads are already occupied. In that case the task will not start until there is an available thread to execute it, and you'll see that latency.
If this is not the case, can you come up with code that we can run to reproduce the issue?
Fuad
Hazelcast
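The effect of occupied executor threads can be reproduced with a plain java.util.concurrent pool. This sketch deliberately does not use the Hazelcast API; QueueLatencyDemo and startLatencyMillis are hypothetical names, and the one-thread pool is an assumption chosen to make the queueing visible.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class QueueLatencyDemo {

    // Returns the time in ms between submitting a trivial task and the moment
    // it starts, while the pool's only thread is busy for busyMillis.
    static long startLatencyMillis(long busyMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Occupy the single worker thread.
            pool.submit(() -> {
                try {
                    Thread.sleep(busyMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // This task does almost nothing, yet it cannot start until the worker is free.
            Callable<Long> reportStart = () -> System.nanoTime();
            long submitted = System.nanoTime();
            Future<Long> started = pool.submit(reportStart);
            return TimeUnit.NANOSECONDS.toMillis(started.get() - submitted);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("start latency (ms): " + startLatencyMillis(200));
    }
}
```

Logging the submit time and the start time separately, as here, helps tell queueing delay apart from the task's own run time.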