time required to finish the multithreaded program?

A Java process starts 5 threads, and each thread takes 5 minutes. What will be the minimum and maximum time taken by the process? It would be a great help if someone could explain this in terms of Java threads and OS threads.
Edit: I want to know how Java schedules threads at the OS level.

This depends on the number of logical processor cores you have, on the processes already running, and on the priority of the threads. The theoretical minimum is 5 minutes plus a small overhead for starting and controlling the threads, provided at least five logical cores are available. The theoretical maximum is 25 minutes plus that overhead, if only one logical core is available. The overhead is usually no more than a few milliseconds.
The maximum can, however, be unpredictably (much) higher if many higher-priority threads from processes other than the JVM are running at the same time.
Regarding "Edit: I want to know how Java schedules threads at the OS level":
On mainstream JVMs, the JVM just spawns another native thread, which belongs to the process of the JVM itself. The operating system then schedules it like any other native thread.

Minimum time, 5 minutes, assuming that threads run entirely concurrently with no interdependencies and have a dedicated core available. Maximum time, 25 minutes, assuming that each thread has to have exclusive use of some global resource and so can't run in parallel with any other thread.
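The two extremes described above can be reproduced at small scale. The following is only a sketch: the class name, the `RESOURCE` lock and the millisecond sleeps are assumptions standing in for the 5-minute jobs and the "global resource" mentioned in the answer.

```java
final class ExclusiveResourceTiming {
    // Hypothetical global resource that only one thread may use at a time.
    private static final Object RESOURCE = new Object();

    // Starts `threads` workers that each "work" for taskMillis and returns the
    // elapsed wall-clock time in milliseconds. When `exclusive` is true, every
    // worker must hold RESOURCE while working, so the workers run one by one.
    static long run(int threads, long taskMillis, boolean exclusive) {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                if (exclusive) {
                    synchronized (RESOURCE) {
                        work(taskMillis);
                    }
                } else {
                    work(taskMillis);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Stand-in for the real job; sleeping keeps the demo independent of core count.
    private static void work(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("independent: " + run(5, 200, false) + " ms");
        System.out.println("exclusive:   " + run(5, 200, true) + " ms");
    }
}
```

With independent workers the elapsed time stays close to one task's duration; with the shared lock it approaches five times that, which mirrors the 5-minute and 25-minute bounds.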

A glib (but realistic) answer for the maximum is that they might take an infinite amount of time to complete, as multi-threaded programs often contain deadlock bugs.

It depends! There isn't enough information to quantify this.
Missing info: Hardware: how many threads can run at the same time on your CPU? Workload: does each thread take 5 minutes because it is doing something for 5 minutes of elapsed time, or because it performs a calculation that usually takes about 5 minutes and uses a lot of CPU resources?
When you run multiple threads concurrently, there can be lock waits for resources, or the threads may even have to take turns executing. Although they have been running for 5 minutes, they may only have had a few seconds of CPU time.
5 threads never equals 5x output. It can get close but will never reach 5x.

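I am not sure whether you are looking for the CPU time spent by a thread. If so, you can measure it as shown below.
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
ThreadMXBean tb = ManagementFactory.getThreadMXBean();
long startTime = tb.getCurrentThreadCpuTime();
Take the reading above when the thread starts its work, from inside that thread (the method reports on the calling thread), and the reading below when the work ends:
long endTime = tb.getCurrentThreadCpuTime();
The difference endTime - startTime is the CPU time, in nanoseconds, that the thread used.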
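Put together as a runnable sketch, the measurement could look like this. The class and method names are hypothetical; the important detail is that both readings are taken by the thread being measured.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.atomic.AtomicLong;

final class ThreadCpuTimeDemo {
    // Runs `work` on a new thread and returns the CPU time in nanoseconds that
    // the thread consumed, or -1 if this JVM cannot measure it.
    static long cpuNanosOf(Runnable work) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported()) {
            return -1;
        }
        AtomicLong used = new AtomicLong(-1);
        Thread t = new Thread(() -> {
            // Both readings are taken inside the measured thread.
            long start = bean.getCurrentThreadCpuTime();
            work.run();
            used.set(bean.getCurrentThreadCpuTime() - start);
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return used.get();
    }

    public static void main(String[] args) {
        long nanos = cpuNanosOf(() -> {
            double x = 0;
            for (int i = 1; i < 5_000_000; i++) {
                x += Math.sqrt(i); // some CPU-bound busy work
            }
            if (x < 0) {
                System.out.println(x); // keeps the loop result in use
            }
        });
        System.out.println("CPU time: " + nanos / 1_000_000 + " ms");
    }
}
```

A thread that mostly sleeps or waits on I/O will report far less CPU time than its elapsed time, which is the distinction the answers above draw.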

Related

Does the number of processes that can be executed depend on the number of cores? [duplicate]

I am confused about something.
What I know is that the maximum number of threads that can run concurrently on a typical modern CPU ranges from 8 to 16.
On the other hand, on GPUs thousands of threads can run concurrently without the scheduler interrupting any thread to schedule another one.
In several posts, such as:
Java virtual machine - maximum number of threads https://community.oracle.com/message/10312772
people state that they run thousands of Java threads concurrently on normal CPUs.
How can this be?
And how can I find out the maximum number of threads that can run concurrently, so that my code adjusts itself dynamically to the underlying architecture?

Threads aren't tied to or limited by the number of available processors/cores. The operating system scheduler can switch back and forth between any number of threads on a single CPU. This is the meaning of "preemptive multitasking."
Of course, if you have more threads than cores, not all threads will be executing simultaneously. Some will be on hold, waiting for a time slot.
In practice, the number of threads you can have is limited by the scheduler - but that number is usually very high (thousands or more). It will vary from OS to OS and with individual versions.
As for how many threads are useful from a performance standpoint: as you said, it depends on the number of available processors and on whether the task is I/O bound or CPU bound. Experiment to find the optimal number, and make it configurable if possible.
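To see that the thread count is not tied to the core count, here is a small sketch (class and method names are made up for illustration) that starts far more threads than a typical machine has cores and counts how many are alive at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class ManyThreadsDemo {
    // Starts `count` threads that all block until released, and returns how
    // many of them were alive at the same moment.
    static int startAndCountAlive(int count) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await(); // blocked threads do not occupy a core
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }
        int alive = 0;
        for (Thread t : threads) {
            if (t.isAlive()) {
                alive++;
            }
        }
        release.countDown(); // let every thread finish
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("cores reported: " + Runtime.getRuntime().availableProcessors());
        System.out.println("threads alive:  " + startAndCountAlive(2000));
    }
}
```

All of these threads exist concurrently, but only as many as there are cores can execute at any instant; the rest are waiting, which is why the demo costs memory rather than CPU.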

There is hardware concurrency and software concurrency. The 8 to 16 threads refer to the hardware you have: one or more CPUs that can execute 8 to 16 threads in parallel. The thousands of threads refer to software threads, which the scheduler has to swap in and out so that every software thread gets its time slice on the hardware.
To get the number of hardware threads, you can try Runtime.getRuntime().availableProcessors().
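A minimal sketch of that call, used here to size a thread pool (the class name and the pool are illustrative assumptions, not part of the answer):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CoreCount {
    // Number of processors the JVM reports as available to it. This may be
    // lower than the physical count, for example under container CPU limits.
    static int hardwareThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        int cores = hardwareThreads();
        System.out.println("available processors: " + cores);

        // One worker per reported processor is a common starting point for
        // CPU-bound work; I/O-bound work usually benefits from more.
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        pool.shutdown();
    }
}
```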

At any given time, a processor runs at most as many threads as it has cores. This means that on a uniprocessor system, only one thread (or no thread) is being run at any given moment.
However, processors do not run each thread to completion one after another; rather, they switch between multiple threads rapidly to simulate concurrent execution. If this weren't the case, you wouldn't even be able to start multiple applications, let alone create multiple threads.
A Java thread (compared to processor instructions) is a very high-level abstraction of a set of instructions for the CPU to process. When it gets down to the processor level, there is no guarantee which threads will run on which core at any given time. But given that processors rapidly switch between these threads, it is theoretically possible to create a very large number of threads (bounded in practice by memory and OS limits), albeit at the cost of performance.
If you think about it, a modern computer has thousands of threads alive at the same time (combining all applications) while typically having only 1 to 16 cores. Without this task switching, nothing would ever get done.
If you are optimizing your application, you should choose the number of threads based on the work at hand, not on the underlying architecture. Performance gains from parallelism should be weighed against the increasing overhead of thread execution. Since every machine and every runtime environment is different, it is impractical to work out some golden thread count (however, a ballpark estimate may be made by benchmarking and looking at the number of cores).

The other answers have already explained how you can, in theory, have thousands of threads in your application at the cost of memory and other overheads. It is worth adding that ConcurrentHashMap in the java.util.concurrent package takes a concurrencyLevel constructor parameter whose default value is 16. (Since Java 8 the implementation uses this value only as a sizing hint.)
On older versions you can run into contention issues if you don't account for it.
Using a significantly higher value than you need can waste space and time, and a significantly lower value can lead to thread contention.
Make sure you have set an appropriate concurrencyLevel if you are running into concurrency-related issues with a higher number of threads.
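As an illustration of where that parameter goes, here is a sketch using the three-argument ConcurrentHashMap constructor. The class name, the capacity of 64 and the 16 keys are arbitrary assumptions for the demo.

```java
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrencyLevelDemo {
    // Lets `threads` workers increment shared counters and returns the grand
    // total, which should equal threads * incrementsPerThread.
    static long countWithThreads(int threads, int incrementsPerThread) {
        // Arguments: initialCapacity, loadFactor, concurrencyLevel (the
        // estimated number of concurrently updating threads).
        ConcurrentHashMap<Integer, Long> counts =
                new ConcurrentHashMap<>(64, 0.75f, threads);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge(i % 16, 1L, Long::sum); // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        long total = 0;
        for (long v : counts.values()) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(32, 10_000));
    }
}
```

The result is correct regardless of the concurrencyLevel chosen; the parameter affects internal sizing and contention, not correctness.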

Java multithreading doesn't use the threads at 100%

I have a program that has to run a specific, fairly heavy method 10k times, each time with different input.
I tried threading it naively (one thread per input), and I also tried nr_threads = Runtime.getRuntime().availableProcessors() threads, or 2-3 times that amount. The method's complexity varies with the input, so I found that if I deploy exactly nr_threads threads, one is usually still alive when all the other threads are dead.
When I run it locally on my computer (4 physical cores, 8 counting the virtual ones) it runs at 100%, but every time I run it on a server (Amazon instances, if that matters) with 36 or 72 cores, the average load per thread is between 15 and 25%.
I use a Callable class for the multithreading, and from it I call only static methods. I also update a matrix, but I'm sure that no two threads try to access the same cell, so it shouldn't be a concurrency issue. RAM usage is also fairly safe (40 GB out of 60), so I would rule out intense GC activity, but I have no idea how to test GC activity.
Does anyone know why it's using only 25% of each thread?
Also, I get that 10k threads might be a bit overwhelming for the computer to handle, but I found no info about it. Is there a best practice when it comes to the number of threads deployed?
This is another section of the program, that deploys 10000 threads all of them heavily loaded. It takes 15 minutes to complete.
This is the code I am referring to, it deploys 5 times the available processors (so 180 in total). It takes around 10 minutes to complete. Nothing changes if I run 10000 threads instead.
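On the thread-count part of the question, one common pattern is to keep the pool at a fixed size and submit each input as its own small task, so a slow input does not leave the other workers idle. The sketch below assumes a made-up `heavy` method and a plain array in place of the asker's matrix; it is not the asker's code and does not by itself explain the low utilisation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FixedPoolBatch {
    // Hypothetical stand-in for the heavy method; its cost grows with the input.
    static long heavy(int input) {
        long acc = 0;
        for (int i = 0; i <= input; i++) {
            acc += i;
        }
        return acc;
    }

    // Processes inputs 0..inputs-1 on a pool with one thread per reported core.
    static long[] processAll(int inputs) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long[] results = new long[inputs];
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < inputs; i++) {
                final int index = i;
                tasks.add(() -> {
                    results[index] = heavy(index); // each task owns exactly one cell
                    return null;
                });
            }
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // rethrows any exception raised inside a task
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        long[] results = processAll(10_000);
        System.out.println("processed " + results.length + " inputs");
    }
}
```

Idle workers pull the next task from the queue as soon as they finish one, so uneven task costs balance out without creating 10k threads.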

Why in Java my CPU bound threads can cause operations in kernel space?

I have large array of type C and a pool of threads. Each thread has a range of indexes (they don't overlap) and does some CPU bound operations to populate them.
After submitting the tasks to the executor (created with newFixedThreadPool), I monitor the output of the 'top' command and notice that the CPU spends a significant amount of time in kernel space ("%sy" in the 'top' output), between 15 and 25%, while those tasks execute (it is low beforehand and decreases again afterwards).
On some test runs it does happen that "%sy" stays close to 0 and then the execution is much faster.
The number of threads is equal to the number of logical cpus on the test machine and this is also the number of tasks that I submit to the executor (so it's like 1 thread - 1 CPU bound task). Therefore I wouldn't expect here a lot of context switching.
In this part of code there is no explicit synchronization done by me, I rely only on the guarantees provided by the executor service as the threads don't share any variables.
Operating system is Amazon Linux AMI 2014.09, the program runs on Java 8.
Any ideas why this could happen? How can I debug such an issue?

You might need to use a profiler.

Program execution slows down the more threads I have running (Java)

I'm experiencing some strange behaviour in a Java program. Basically, I have a list of items to process, which I can choose to process one at a time or all at once (which means 3-4 at a time). Each item needs about 10 threads to be processed, so processing 1 item at a time = 10 threads, 2 at a time = 20 threads, 4 at a time = 40 threads, etc.
Here's the strange thing: if I process just one item, it's done in approximately 50-150 ms. But if I process 2 at a time, it goes up to 200-300 ms per item; 3 at a time = 300-500 ms per item, 4 at a time = 400-700 ms per item, etc.
Why is this happening? I've done prior research which says that the JVM can handle up to 3000-4000 threads easily, so why does it slow down with just 30-40 threads for me? Is this normal behaviour? I thought that having 40 threads would mean each thread would work in parallel rather than in a queue, as seems to be the case.

How many CPU cores do you have?
If I have one CPU core and max it out with a single-threaded application, the CPU is always busy. If I give it two threads, both doing this heavy task, I don't get double the CPU. Instead, each thread gets about 0.5 seconds of CPU time per second, minus the time the OS needs to switch threads.
So each thread takes twice as long to do its work, but both might finish at about the same time (depending on the scheduler).
If you have two CPU cores, then the two threads would (theoretically, again) finish in the same time as one thread alone, because one thread can't use two CPU cores at the same time.
Then there are hardware threads, threads that yield or sleep, threads that are reading or writing (the OS will run other threads while they are blocked), and so forth.
Does this help?
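The arithmetic in this answer can be written down as a rough estimate. This is a deliberately simplified model (CPU-bound threads, fair scheduling, no switching overhead), and the class and method names are made up for illustration.

```java
final class SharedCoreEstimate {
    // Rough wall-clock time for each of `threads` CPU-bound threads sharing
    // `cores` cores, given that one thread alone needs `soloSeconds`.
    static double perThreadSeconds(int threads, int cores, double soloSeconds) {
        // A thread can use at most one core, so its share is capped at 1.0.
        double share = Math.min(1.0, (double) cores / threads);
        return soloSeconds / share;
    }

    public static void main(String[] args) {
        System.out.println(perThreadSeconds(2, 1, 1.0)); // two threads, one core
        System.out.println(perThreadSeconds(2, 2, 1.0)); // two threads, two cores
        System.out.println(perThreadSeconds(1, 2, 1.0)); // one thread cannot use both cores
    }
}
```

Under this model, going from 10 to 40 CPU-bound threads on a machine with only a few cores stretches every thread's elapsed time roughly in proportion, which is consistent with the slowdown described in the question.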

It would be nice to see some source code.
Without it, I can only make 4 guesses:
1) You haven't done any load balancing. You should think about the optimal number of threads.
2) The work executed by each thread does not justify the time needed to set up and start the thread (plus context-switching time).
3) There are real problems with your code quality.
4) Weak hardware.

Optimising max number of threads running on a CPU

I'm wondering what the best way is to decide when to stop creating new threads on a single-core machine, where each thread runs the same code.
The threads are fetching web content and doing a bit of processing, which means the load of each thread is not constant all the way until the thread terminates.
I'm thinking of having a thread that monitors the CPU/RAM load and stops creating threads if the load reaches a certain threshold, but also stops creating threads once a certain thread count has been reached, to make sure the CPU doesn't get overloaded.
Any feedback on what techniques are out there to achieve this?
Many thanks,
Vladimir

It is going to be difficult to do this by monitoring the CPU used by the current process. Those numbers tend to lag reality and the result is going to be peaks and valleys to a large degree. The problem is that your threads are mostly going to be blocked by IO and there is not any good way to anticipate when bytes will be available to be read in the near future.
That said, you could start out with a ThreadPoolExecutor with a certain maximum number of threads (for a single processor, let's say 4) and then check the load average every 10 seconds or so. If the load average is below what you want, you could call setMaximumPoolSize(...) with a larger value to increase it for the next 10 seconds. You may need to wait 30 seconds or more between polls to smooth out the performance of your application.
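You could use the following code to track the total CPU time for all threads. I'm not sure it's the best way to do it.
// requires java.lang.management.ManagementFactory and java.lang.management.ThreadMXBean
ThreadMXBean threadMxBean = ManagementFactory.getThreadMXBean();
long total = 0;
for (long id : threadMxBean.getAllThreadIds()) {
    // returns -1 if the thread is no longer alive or CPU timing is disabled
    long cpuTime = threadMxBean.getThreadCpuTime(id);
    if (cpuTime > 0) {
        total += cpuTime;
    }
}
// CPU time is reported in nanoseconds; convert to milliseconds
long currentCpuMillis = total / 1000000;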
Instead of trying to maximize the CPU level for your spider, you might consider trying to maximize throughput. Take the sample of the number of pages spidered per a unit of time and increase or decrease the max number of threads in your ExecutorService until this is maximized.
One thing to consider is to use NIO and selectors so your threads are always busy as opposed to always waiting for IO. Here's a good example tutorial about NIO/Selectors. You might also consider using Pyronet which seems to provide some good features around NIO.
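The pool-resizing idea from this answer (a ThreadPoolExecutor whose size is adjusted from the load average) could be sketched as follows. The class, the `nextSize` policy, the target load of 1.0 and the bounds are all assumptions for illustration. Note that with an unbounded queue the pool never grows past its core size, so the sketch changes the core size together with the maximum.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class AdaptivePool {
    // Sets both pool limits to `size`, in an order that keeps core <= max at
    // every step (newer JDKs reject a core size above the maximum).
    static void resize(ThreadPoolExecutor pool, int size) {
        if (size > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(size);
            pool.setCorePoolSize(size);
        } else {
            pool.setCorePoolSize(size);
            pool.setMaximumPoolSize(size);
        }
    }

    // One polling step: add a thread while the load average is below the
    // target, drop one when it is at or above it, clamped to [min, max].
    static int nextSize(int current, double loadAverage, double targetLoad, int min, int max) {
        if (loadAverage < 0) {
            return current; // load average is not available on this platform
        }
        int next = loadAverage < targetLoad ? current + 1 : current - 1;
        return Math.max(min, Math.min(max, next));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        int size = nextSize(pool.getMaximumPoolSize(), load, 1.0, 1, 32);
        resize(pool, size);
        System.out.println("load average " + load + ", pool size now " + size);
        pool.shutdown();
    }
}
```

In a real spider, the body of `main` would run on a timer every 10 to 30 seconds, and the same `resize` helper could be driven by pages-per-second throughput instead of load average.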

If async I/O is not a good fit, I would consider using thread pools, e.g. ThreadPoolExecutor, so you don't have the overhead of creating, destroying and recreating threads.
Then I would do performance testing to find the maximum number of threads that offers the best performance.
You could start with 10 threads, then rerun your performance test with 20 threads, and so on, until you home in on an optimal value. At the same time I would use system tools (depending on your OS) to monitor the thread run queue, the JVM, etc.
For the performance test you would have to ensure that your test is repeatable (i.e. using the same inputs) and representative of the actual input that your program would be using.
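A repeatable test of this kind might be sketched like this. The sleep is an assumed stand-in for a page fetch, and the pool sizes and task counts are arbitrary; a real test would replay recorded inputs instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolSizeSweep {
    // Runs `tasks` identical jobs that each wait `waitMillis` (simulated I/O)
    // on a pool of `poolSize` threads; returns elapsed wall-clock milliseconds.
    static long elapsedMillis(int poolSize, int tasks, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Callable<Void>> jobs = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            jobs.add(() -> {
                Thread.sleep(waitMillis); // same input every run keeps the test repeatable
                return null;
            });
        }
        long start = System.nanoTime();
        try {
            pool.invokeAll(jobs); // returns once every job has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 2, 4, 8, 16}) {
            System.out.println(size + " threads: " + elapsedMillis(size, 32, 50) + " ms");
        }
    }
}
```

For wait-dominated work like this, elapsed time keeps dropping as the pool grows; with real fetching and processing the curve flattens at some point, and that knee is the value worth keeping.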
