What is the difference between concurrency and multithreading? Is concurrency only possible on a multicore CPU? Can anybody explain it with an example?
What is the difference between concurrency and multithreading?
Concurrency describes the way in which processes run. They are either sequential (one after another), concurrent (able to make progress "at the same time" although not necessarily at the same instant), or parallel (they happen simultaneously).
Multi-threading is a technique in which a program creates several threads of execution. Threads are essentially lightweight processes that share resources (such as memory) with their parent process.
If you pay close attention, multi-threading is possible on both concurrent and non-concurrent systems. A thread is lightweight compared with a process, but having multiple threads on a non-concurrent system would not result in parallel execution: each thread would still start and run until finished before the next one began. On a concurrent system each thread would get its fair share of CPU time, so they would all be making progress concurrently.
Is concurrency only possible in multicore cpu?
I think we now know that the answer to this is no. Concurrent execution of processes is taken for granted to the point that it is widely mistaken for parallelism, which is a much more powerful tool.
To give an example that provides some insight, think about your machine. It does all kinds of stuff all the time and you do not (hopefully) experience any lag in its performance. All these processes run concurrently, giving you, the user, a perception of parallelism even on a single-core machine (I know because I'm old :)).
But what about a merge sort? Couldn't we perform two merge sorts simultaneously on two halves of the data? Yes, but only if we have multiple cores/CPUs.
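The two-halves idea can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class and method names are made up, and Arrays.sort stands in for the recursive sort of each half (it is not a merge sort for int arrays); only the final merge step is the one from merge sort. Whether the two threads really run at the same instant depends on the machine having at least two cores available.

```java
import java.util.Arrays;

class TwoHalvesSort {
    // Sorts each half in its own thread, then merges the two sorted halves.
    static int[] sortInTwoThreads(int[] data) throws InterruptedException {
        int mid = data.length / 2;
        int[] left = Arrays.copyOfRange(data, 0, mid);
        int[] right = Arrays.copyOfRange(data, mid, data.length);

        // Arrays.sort is a stand-in for "sort this half" (assumption, see lead-in).
        Thread t1 = new Thread(() -> Arrays.sort(left));
        Thread t2 = new Thread(() -> Arrays.sort(right));
        t1.start();
        t2.start();
        t1.join(); // wait for both halves; join() also makes their writes visible here
        t2.join();

        // Standard merge step of merge sort.
        int[] out = new int[data.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] sorted = sortInTwoThreads(new int[] {5, 3, 9, 1, 7, 2});
        System.out.println(Arrays.toString(sorted)); // [1, 2, 3, 5, 7, 9]
    }
}
```

On a single core the same program still produces the right result; the two sorts are then merely interleaved rather than simultaneous.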
Concurrency means working on multiple tasks over the same period of time. The tasks do not have to run at the very same instant; they only need to be able to make progress independently, and for that you need multiple threads.
So concurrency is achieved by multithreading.
Now coming to your question:
Is concurrency only possible in multicore cpu?
The answer is No.
Suppose I have 2 threads and only 1 core. In this case the CPU gives each thread a slice of time in turn until it completes its task. So multithreading is possible even on a single-core CPU.
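A small sketch of that situation, assuming nothing beyond the standard library (the class and method names are invented for illustration): two threads each count to the same limit, and both finish whether the JVM sees one core or many. On one core the scheduler simply alternates between them.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TwoThreadsAnyCoreCount {
    // Runs two counting threads and returns how far each one got.
    static int[] runTwoCounters(int steps) throws InterruptedException {
        AtomicInteger first = new AtomicInteger();
        AtomicInteger second = new AtomicInteger();

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < steps; i++) first.incrementAndGet();
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < steps; i++) second.incrementAndGet();
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return new int[] {first.get(), second.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of cores the JVM may use; the result below does not depend on it.
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        int[] counts = runTwoCounters(100_000);
        System.out.println(counts[0] + " " + counts[1]); // 100000 100000
    }
}
```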
I learned from a cookbook that there is a key difference between the words parallel and concurrency.
Parallel: working on multiple tasks with multiple cores / CPUs.
Concurrency: working on multiple tasks with one core / CPU, using a time-slicing mechanism.
I also learned that Java is a language that uses the power of multiple cores to process multiple tasks simultaneously. But the package that provides this ability is called java.util.concurrent.
Why does Java use the word concurrency instead of parallelism?
I don't think that definition of "concurrency" is correct.
The definition provided by Wikipedia is:
"In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or at the same time simultaneously partial order, without affecting the final outcome. This allows for parallel execution of the concurrent units, which can significantly improve overall speed of the execution in multi-processor and multi-core systems. In more technical terms, concurrency refers to the decomposability of a program, algorithm, or problem into order-independent or partially-ordered components or units of computation."
In other words, concurrency includes parallelism. (Or at least, the kind of parallelism you get with multiple conventional CPUs or hyperthreads.)
The article goes into more detail, and gives literature citations for the introduction of the term.
Furthermore, what the Wikipedia article says is consistent with the way that other sources use the term "concurrent"; e.g. Java Concurrency in Practice by Brian Goetz et al.
Why does Java use the word concurrency instead of parallelism?
Java is correct in using the term concurrent / concurrency.
The mistake is in the "cookbook" you are using.
Concurrency and parallelism are orthogonal.
You can have parallelism without concurrency; e.g. a single thread running on a CPU with superscalar execution, instruction pipelines and SIMD instructions.
You can have concurrency without parallelism; e.g. a bunch of threads running on a system with a single CPU (executing instructions sequentially).
Often concurrency enables parallelism. E.g. if you have split up the processing logic so that parts can run concurrently using different threads, then often these threads can run in parallel on different CPUs.
According to my superficial knowledge, Java's multitasking is based on the JVM's multithreading model, while parallelism is a matter of tasks physically running at the same time on the CPU. It is difficult for Java to control the CPU to achieve task parallelism; it can only guarantee that the threads are concurrent.
My application is supposed to have a "realtime with pause" functionality. The user can pause execution, do some things that modify what's going to happen, then unpause and let stuff happen. Stuff happens at regular intervals as specified by the user, can be slow, can be fast.
My goal in using threading here is to improve performance on multicore systems. The amount of data that the application is supposed to crunch at each time interval can be arbitrarily large (I expect lots and lots of loops over collections, modifying object properties and generating random numbers, but precious little disk access). I don't want the application to be constrained by the capacity of a single core if it can use more to run faster.
Will this actually work this way?
I've run some tests (made a program crunch numbers a lot, and looked at CPU usage during its activity), but it's not really conclusive - usage is certainly in the proximity of 100% on my dual core machine, but hardly ever 100%. Does a single-threaded (main only) Java application use all available cores for computation?
Does a single-threaded (main only) Java application use all available cores for computation?
No, it will normally use a single core.
Making a program do computations in parallel with multiple threads may make it faster, but it's not a magical solution for any kind of problem. Whether this is a suitable solution for your program depends on what your program is doing exactly, and if the algorithm can be parallelized. If, for example, you are doing lots of computations where the next computation depends on the result of the previous computation, then making it multi-threaded will not help a lot, because you can't do the computations at the same time - the next one first has to wait for the answer of the previous one. So, you first have to think about what computations in your program could be run in parallel.
Java has a lot of support for multi-threading. You can program with threads directly, or use an executor service, or use the fork/join framework. Whatever is appropriate depends on what exactly you want to do.
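As one illustration of the fork/join option, here is a minimal sketch that sums an array by splitting it recursively. The class name SumTask, the helper parallelSum and the THRESHOLD value are assumptions made for this example, not something from the question; the right cutoff depends on the workload.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    // Below this many elements, sum sequentially (assumed value, tune per workload).
    private static final int THRESHOLD = 10_000;

    private final long[] data;
    private final int from;
    private final int to; // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                     // schedule the left half on the pool
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;   // wait for the left half and combine
    }

    // Hypothetical convenience wrapper using the shared common pool.
    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(parallelSum(data)); // 5000050000
    }
}
```

This only pays off because the partial sums are independent of each other; a computation where each step needs the previous result would not split this way.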
Does a single-threaded (main only) Java application use all available cores for computation?
Not usually, but you could make use of some higher-level APIs in Java that use threads for you, so that you are not even using threads directly: most obviously fork/join and executors, less obviously the new Streams API on collections (i.e. parallelStream).
In general, though, to make use of all cores, you need to do some kind of concurrency. Further, it's really hard to tell what is going on just by watching your OS monitor (especially with only 2 cores): your OS has other things going on (trying to manage itself, running your IDE, running crontab, running a browser to post to Stack Overflow ;).
Finally, just implementing concurrency may not help by itself; you have to do it "right" for your code/algorithm.
A Java thread runs on a single CPU at a time. To use multiple CPUs, you need multiple threads.
Imagine that you have to do various tasks with your hands. You will do them slowly using one hand and more efficiently using both. Similarly, in Java or in any other language, multithreading provides the system with many hands. The good news is that you can have many threads doing different tasks. Running all operations in a single thread will make the program sluggish and sometimes unresponsive. A good practice is to do long-running tasks in a separate thread. For example, loading large chunks of data from a database should be done in a separate thread. Downloading data from the internet should also be done in a separate thread. What happens if you do long-running operations in the main thread? The program hangs and stays unresponsive until the task completes, and the user will think that there is something wrong. I hope you get it.
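For instance, a parallel stream can spread a computation over several cores without any explicit thread code. A minimal sketch (the class and method names are made up for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelStreamDemo {
    // Sums the squares of the numbers; parallelStream() lets the runtime
    // split the work across the common fork/join pool.
    static long sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToLong(n -> (long) n * n)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());
        System.out.println(sumOfSquares(numbers)); // 333833500
    }
}
```

For a list this small the parallel version is unlikely to be faster than a sequential stream; it is shown only to make the API concrete.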
I am new to multithreading in Java, after looking at Java virtual machine - maximum number of threads it would appear there isn't a limit to how many threads a Java/Android app can run. However, is there an advisable limit? What I mean by this is, is there a number of threads where if you run past this number then it is unwise because you are unable to determine what thread does what at what time? I hope my question makes sense.
There are some advisable limits, but they don't really have anything to do with keeping track of the threads.
Most multithreading comes with locking. If you are using central data storage or global mutable state then the more threads you have, the more lock contention you will get. This is app-specific and depends on how much of said state you have and how often threads read and write it.
There are no limits in desktop JVMs by default, but there are OS limits. It should be in the tens of thousands for modern Windows machines, but don't rely on the ability to create much more than that.
Running multiple tasks in parallel is great, but the hardware can only cope with so much. If you are using small threads that get fired up sometimes and spend most of their time idle, that's no biggie (Java servers were written like this for years). However, if your threads are very intensive, making more of them than the number of cores you have is not likely to give you any benefit. (I believe the standard practice is twice the number of cores if you anticipate threads going idle sometimes.)
Threads have a cost to them. Whenever you switch Threads you switch context, and while it isn't that expensive, doing it constantly will hurt performance. It's not a good idea to create a Thread to sum up two integers and write back a result.
If Threads need visibility of each other's state, then they are greatly slowed down, since a lot of their writes have to be written back to main memory. Threads are best used for standalone tasks that require little interaction with each other.
TL;DR
Depends on OS and hardware: on servers creating thousands of threads is fine; on desktop machines you should limit yourself to 50-200 and choose carefully what you do with them.
Note: Android's default and suggested "UI multithread helper", the AsyncTask, is not actually a thread. It's a task invoked from a ThreadPool, and as such there is no limit or penalty to using it. The pool has an upper limit on the number of threads it spawns and reuses them rather than creating new ones. Most Android apps should use it instead of spawning their own threads. In general, thread pools are fairly widespread and are a great choice unless you are forced into blocking operations.
This is a similar question to the one appearing at: How to ensure Java threads run on different cores. However, there might have been a lot of progress in that in Java, and also, I couldn't find the answer I am looking for in that question.
I just finished writing a multithreaded program. The program spawns several threads, but it doesn't seem to be using more than a single core. The program is faster (I am parallelizing something which makes it faster), but it definitely does not use all cores available, judging by running "top".
Any ideas? Is that an expected behavior?
The general code structure is as follows:
for (some values in i)
{
start a thread instantiated as MyThread(i)
(this thread uses heavily ConcurrentHashMap and arrays and basic arithmetic, no IO)
add the thread to a list T
}
foreach (thread in T)
{
do thread.join()
}
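For reference, the structure above might look roughly like this in real Java. This is a sketch under assumptions, not the actual program: the class name, the arithmetic inside each thread and the use of a lambda instead of a MyThread class are all invented placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

class SpawnAndJoin {
    // Starts one thread per value of i, then joins them all.
    static ConcurrentHashMap<Integer, Long> run(int threadCount) throws InterruptedException {
        ConcurrentHashMap<Integer, Long> results = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < threadCount; i++) {
            final int id = i; // lambdas may only capture effectively final variables
            Thread t = new Thread(() -> {
                // Placeholder CPU-bound work (assumption): plain arithmetic, no I/O.
                long acc = 0;
                for (int k = 0; k < 1_000_000; k++) acc += (long) k * id;
                results.put(id, acc);
            });
            t.start();
            threads.add(t);
        }

        for (Thread t : threads) {
            t.join(); // wait for every worker before reading the results
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4).size()); // 4
    }
}
```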
If it's almost exactly 100% of one CPU, it can mean you really have either:
one core thread which is doing all the work while the others are not doing much, or
one resource which you are locking on, so only one thread has a chance to run at a time.
If you are using approximately one CPU, it can mean this is all the work your CPUs have because you are waiting for something such as IO (network and/or disk).
I suggest you look at the state of your threads in VisualVM. It will help you identify which threads are running and give you an idea of their pattern of behaviour. I also suggest you use a CPU profiler to help find your bottlenecks.
I think I read in the SCJP book by Katherine Sierra that JVMs ask the underlying OS to create a new OS thread for every Java thread.
So it's up to the underlying operating system to decide how to balance Java (and any other kind of) threads between the available CPUs.
I am implementing a worker pool in Java.
This is essentially a whole load of objects which will pick up chunks of data, process the data and then store the result. Because of IO latency there will be significantly more workers than processor cores.
The server is dedicated to this task and I want to wring the maximum performance out of the hardware (but no, I don't want to implement it in C++).
The simplest implementation would be to have a single Java process which creates and monitors a number of worker threads. An alternative would be to run a Java process for each worker.
Assuming, for argument's sake, a quad-core Linux server, which of these solutions would you anticipate being more performant, and why?
You can assume the workers never need to communicate with one another.
One process, multiple threads - for a few reasons.
When context-switching between jobs, it's cheaper on some processors to switch between threads than between processes. This is especially important in this kind of I/O-bound case with more workers than cores. The more work you do between getting I/O blocked, the less important this is. Good buffering will pay for threads or processes, though.
When switching between threads in the same JVM, at least some Linux implementations (x86, in particular) don't need to flush the cache. See Tsuna's blog. Cache pollution between threads will be minimized, since they can share the program cache, are performing the same task, and are sharing the same copy of the code. We're talking savings on the order of hundreds of nanoseconds to several microseconds per switch. If that's small potatoes for you, then read on...
Depending on the design, the I/O data path may be shorter for one process.
The startup and warmup time for a thread is generally much shorter. The OS doesn't have to start a process, Java doesn't have to start another JVM, classloading is only done once, JIT-compilation is only done once, and HotSpot optimizations are done once, and sooner.
Well, usually, when discussing multiprocessing (with one thread per process) versus multithreading in the same process, the theoretical overhead is bigger in the first case than in the latter (and thus multiprocessing is theoretically slower than multithreading), but in reality on most modern OSs this is not such a big issue. However, in the Java context, starting a new process is a lot more costly than starting a new thread. Starting a new process means starting up a new instance of the JVM, which is very costly, especially in terms of memory. I recommend that you start multiple threads in the same JVM.
Moreover, if you say inter-thread communication is not an issue, you can use Java's ExecutorService to get a fixed thread pool of size 2 x (number of available CPUs). The number of available CPUs can be detected at runtime via Java's Runtime class. This way you get simple multithreading going quickly without any boilerplate code.
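A minimal sketch of that suggestion follows. The WorkerPool class, the processAll method and the "processing" done per chunk (taking the string length) are hypothetical stand-ins for the real work; only Executors.newFixedThreadPool and Runtime.availableProcessors come from the standard library. The factor of 2 is the rule of thumb mentioned above, not a measured optimum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkerPool {
    // Processes every chunk on a fixed pool sized at twice the core count.
    static List<Integer> processAll(List<String> chunks)
            throws InterruptedException, ExecutionException {
        int poolSize = 2 * Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                // Hypothetical processing step: replace with the real work.
                Callable<Integer> task = () -> chunk.length();
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } finally {
            pool.shutdown(); // let the worker threads exit once the queue is empty
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Arrays.asList("a", "bb", "ccc"))); // [1, 2, 3]
    }
}
```

Results come back in submission order because the futures are read in the order they were created, regardless of which worker finished first.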
Actually, if you do this with large-scale tasks, using multiple JVM processes is way faster than one JVM with multiple threads. At least we never got one JVM running as fast as multiple JVMs.
We do some calculations where each task uses around 2-3 GB of RAM and does some heavy number crunching. If we spawn 30 JVMs and run 30 tasks, they perform around 15-20% better than spawning 30 threads in one JVM. We tried tuning the GC and the various memory sections and never caught up with the first variant.
We did this on various machines: 14 tasks on a 16-core server, 34 tasks on a 36-core server, etc. Multithreading in Java always performed worse than multiple JVM processes.
It may not make any difference on simple tasks, but on heavy calculations it seems the JVM performs badly with threads.