In the book Core Java: Volume 1 Fundamentals, in the chapter on multithreading, the author writes:
"All modern desktop and server operating systems use preemptive
scheduling. However, smaller devices such as cell phones may use
cooperative scheduling...."
I am aware of the definitions and workings of both types of scheduling, but I want to understand why cooperative scheduling is preferred over preemptive scheduling on smaller devices.
Can anyone explain the reasons?

Preemptive scheduling has to solve a hard problem -- getting all kinds of software from all kinds of places to efficiently share a CPU.
Cooperative scheduling solves a much simpler problem -- allowing CPU sharing among programs that are designed to work together.
So cooperative scheduling is cheaper and easier when you can get away with it. The key thing about small devices that allows cooperative scheduling to work is that all the software comes from one vendor and all the programs can be designed to work together.

The big benefit of cooperative scheduling over preemptive scheduling is that it avoids forced context switches. Context switching involves storing and restoring the state of an application (or thread), which is costly. A cooperative task hands over the CPU only at points it chooses itself, where there is usually less state to save.
The reason smaller devices can get away with cooperative scheduling for now is that there is only one user on a small device. The problem with cooperative scheduling is that one application can hog the CPU. In preemptive scheduling, every application will eventually be given an opportunity to use the CPU for a few cycles. For bigger systems, where multiple daemons or users are involved, cooperative scheduling may cause issues.
Reducing context switching is kind of a big thing in modern programming. You see it in Node.js, Nginx, epoll, ReactiveX and many other places.

First, you have to understand the meaning of the word "preemption".
Preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such changes of the executed task are known as context switches.
Therefore, the difference is as follows.
In a preemptive model, the operating system's thread scheduler is allowed to step in and hand control from one thread to another at any time (tasks can be forcibly suspended).
In a cooperative model, once a thread is given control, it continues to run until it explicitly yields control (hands the CPU over to the next task) or until it blocks.
Both models have their advantages and disadvantages. Preemptive scheduling works better when the CPU has to run all kinds of unrelated software. Cooperative scheduling works better when running programs that are designed to work together.
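To make the cooperative model concrete, here is a minimal Java sketch of a round-robin cooperative scheduler running on a single thread. It is not taken from any of the systems named in this answer; the Task interface and the counter and runAll helpers are hypothetical names invented for the illustration. Each task keeps the CPU until its step() call returns, and that return is the "yield".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration: a round-robin cooperative scheduler on one thread.
class CooperativeDemo {

    /** A task does one slice of work per call and returns true while it has more to do. */
    interface Task {
        boolean step();
    }

    /** A task that logs name1, name2, ... up to the limit, one entry per step. */
    static Task counter(String name, int limit, List<String> log) {
        int[] next = {1};
        return () -> {
            log.add(name + next[0]);
            return next[0]++ < limit;
        };
    }

    /** Runs tasks round-robin; nothing interrupts a task while it is inside step(). */
    static void runAll(List<Task> tasks) {
        Deque<Task> ready = new ArrayDeque<>(tasks);
        while (!ready.isEmpty()) {
            Task current = ready.poll();
            if (current.step()) {    // returning from step() is the "yield"
                ready.add(current);  // still has work: go to the back of the queue
            }
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        runAll(List.of(counter("A", 2, log), counter("B", 3, log)));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A1, B1, A2, B2, B3]
    }
}
```

If any step() looped forever, every other task would starve, which is the "hogging" risk that only works out when all tasks are written to cooperate.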
Examples of cooperatively scheduled threads:
Windows fibers
Sony’s PlayStation 4 SDK
If you want to learn about the underlying implementations of these cooperatively scheduled fibers, refer to this book.
Your book says "smaller devices such as cell phones"; maybe the author is referring to cell phones from several years back. They had only a few programs to run, all provided by the phone manufacturer, so we can assume those programs were designed to work together.

Cooperative scheduling has fewer synchronization problems.
Cooperative scheduling can have better performance in some, mostly contrived, scenarios.
Cooperative scheduling introduces constraints upon design and implementation of threads.
Cooperative scheduling is basically useless for most real purposes because of dire I/O performance, which is why almost nobody uses it.
Even small devices will prefer to use preemptive scheduling if they can possibly get away with it. Smartphones, streaming (especially video), and other apps that require good I/O are essentially not possible with cooperative systems.
What you are left with are trivial embedded toaster-controllers and the like.

Hard real-time control applications often demand that at least one thread/task not be preemptively interrupted while other threads are more forgiving. Additionally, the highest priority task may require that it be executed on a rigid schedule rather than being left to the mercy of a scheduler that will eventually provide a time-slot. For these applications, cooperative multitasking seems much closer to what is needed than preemptive multitasking but it still isn't an exact fit since some tasks may need immediate on-demand interrupt response while other tasks are less sensitive to the multi-tasking scheme.

Cooperative Scheduling
A task gives up the CPU at a point called a synchronization point. In POSIX, it can do this with an explicit yield call.
Preemptive Scheduling
The main difference here is that in preemptive scheduling, the scheduler may force a task to relinquish the CPU. For instance, when two tasks have the same priority and one of them is running, it is preempted once its time slice ends.


Why don't Kotlin/Java have an option for a preemptive scheduler?

A heavy CPU-bound task can block its thread and delay other tasks waiting to execute. That's because the JVM can't interrupt a running thread on its own; it needs the programmer's help in the form of manual interruption points.
So writing CPU-bound tasks in Java/Kotlin requires manual intervention to make things run smoothly, such as using a Sequence in Kotlin, as in the code below.
    fun simple(): Sequence<Int> = sequence { // sequence builder
        for (i in 1..3) {
            Thread.sleep(100) // pretend we are computing it
            yield(i) // yield next value
        }
    }

    fun main() {
        simple().forEach { value -> println(value) }
    }
As far as I understand, the reason is that a preemptive scheduler with the ability to interrupt running threads has a performance overhead.
But wouldn't it be better to have a switch, so you can choose? You could run the JVM with a faster non-preemptive scheduler, or with a slower preemptive one (interrupting and switching the thread after N instructions) that runs things smoothly without requiring manual labor.
I wonder why Java/Kotlin doesn't have a JVM switch that would let you choose the mode you want.
When you program using Kotlin coroutines or Java virtual threads (after Loom), you get preemptive scheduling from the OS.
Following usual practices, tasks that are not blocked (i.e., they need CPU) are multiplexed over real OS threads in the Kotlin default dispatcher or Java ForkJoinPool. Those OS threads are scheduled preemptively by the OS.
Unlike old-style multithreading, however, tasks are not assigned to a thread when they are blocked waiting for I/O. This makes no difference in terms of preemption, since a task that is waiting for I/O couldn't possibly preempt another running task anyway.
What you don't get when programming with coroutines, is preemptive scheduling over a large number of tasks simultaneously. If you have many tasks that require the CPU, then the first N will be assigned to a real thread and the OS will time slice them. The remaining ones will wait in the queue until those ones are done.
But in real life, when you have 10000 tasks that need to be simultaneously interactive, they are I/O bound tasks. On average, there aren't many that require the CPU at any one time, so the number of real threads you get from the default dispatcher or ForkJoinPool is plenty. In normal operation, the queue of tasks waiting for threads is almost always empty.
If you really had a situation where 10000 CPU-bound tasks needed to be simultaneously interactive, well, then you would be sad anyway, because time slicing would not provide a very smooth experience.
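The queueing behaviour described above can be sketched in plain Java. This is only an analogy: it uses an ordinary fixed thread pool rather than the Kotlin default dispatcher or Loom's scheduler, and it sets N = 1 so the result is deterministic. The class and method names are made up for the example. Each CPU-bound task runs to completion before the next queued task gets a thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: more CPU-bound tasks than pool threads.
class QueueingDemo {

    /** Submits three CPU-bound tasks to a one-thread pool and records start/end order. */
    static List<String> run() throws InterruptedException {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(1); // N = 1, an assumption for the demo
        for (String name : List.of("A", "B", "C")) {
            pool.execute(() -> {
                events.add(name + " start");
                long sum = 0;
                for (int i = 0; i < 5_000_000; i++) {
                    sum += i; // stand-in for CPU-bound work; the task never blocks
                }
                events.add(name + " end");
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [A start, A end, B start, B end, C start, C end]
        System.out.println(run());
    }
}
```

With a larger pool, the first N tasks would be time-sliced against each other by the OS, and only the tasks beyond N would wait like B and C do here.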
This question is based on a false premise: In the JVM, a preemptive scheduler is your only choice. No modern JVM uses co-operative multitasking.
No modern JVM implements user space threads or a scheduler of its own. JVMs use native operating system threads instead. Native threads are scheduled by the operating system, and operating system schedulers are preemptive.
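A small experiment can make this visible. The sketch below (class and method names are hypothetical) starts twice as many busy-looping threads as there are cores. None of them ever yields, sleeps, or blocks, yet the main thread still gets CPU time to wake up and set the stop flag, which would not happen if the threads were scheduled cooperatively.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: busy threads cannot keep the main thread off the CPU.
class PreemptionDemo {

    static volatile boolean stop = false;

    /** Returns true if all spinning threads were stopped by the main thread. */
    static boolean spinnersStopped() throws InterruptedException {
        stop = false;
        int count = Runtime.getRuntime().availableProcessors() * 2;
        List<Thread> spinners = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                while (!stop) {
                    // busy loop: no yield, no sleep, no blocking call
                }
            });
            t.setDaemon(true); // do not keep the JVM alive if something goes wrong
            t.start();
            spinners.add(t);
        }
        Thread.sleep(100); // main needs a core again to wake up from this
        stop = true;       // reached only because the OS takes a core away from a spinner
        for (Thread t : spinners) {
            t.join(5_000);
        }
        return spinners.stream().noneMatch(Thread::isAlive);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all spinners stopped: " + spinnersStopped());
    }
}
```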
The fact that JVM threads map 1-to-1 to native operating system threads is a problem for applications that need a high level of concurrency. Threads are relatively scarce and expensive. To address this, Project Loom is investigating adding "virtual threads" that may allow using native threads more sparingly, especially for I/O bound tasks.
Project Loom is under active development and there is no set schedule for when it will become a part of standard Java. Regarding how Project Loom schedules "virtual threads", the latest (May 2020) update from Project Loom claims "virtual threads are preemptive, not cooperative" but then goes on to say "none of the schedulers in the JDK currently employs time-slice-based preemption of virtual threads". It sounds like in its current state the "virtual thread" scheduler in Project Loom is somewhere between fully cooperative and fully preemptive. It will be interesting to see how the project develops and what we will get when it is integrated into mainstream Java.
In a July 28 Q&A, the Loom project lead Ron Pressler mentioned that you will be able to plug in a scheduler of your own for virtual threads, but did not go into details of how much control you get over the scheduling algorithm.

A confusion over a concurrent architecture concept

I have basic idea about concurrency but I have a confusion about the following architecture. I think it is concurrent but my colleague thinks it is not. The architecture is as follows:
I have multiple robots that each publish their data to an individual gateway, and there's a Java service that listens on the gateways. The service creates a new thread to listen to each gateway.
My understanding is that the service is performing concurrent execution but my colleague says this is not concurrent as concurrency involves sharing of hardware.
I'd appreciate it if someone could clarify or elaborate on this topic.
My understanding is that the service is performing concurrent execution but my colleague says this is not concurrent as concurrency involves sharing of hardware.
TL/DR: Words are squishy. That's why we have code.
"Concurrent" simply means two or more things happening at the same time. As it applies to computation, true concurrency means two or more threads of execution running at the same time, which requires separate hardware. That certainly can be separate cores of the same CPU or separate CPUs in the same chassis, so that there is some degree of shared hardware. It can also be separate cores in different chassis, however, such as in a computational cluster, though perhaps this is where your colleague is drawing his line. Such a line would be pretty arbitrary, though.
In contrast, long before it was common for even servers to feature multiple CPU (core)s, many computer systems implemented one flavor or another of multitasking, whereby multiple tasks can all be in progress at the same time by virtue of the operating system allotting slices of CPU time to each and switching them in and out. All modern general-purpose operating systems still do this. On a single core, however, this provides only simulated concurrency, because at any given instant in time, only one computation is actually making progress.
Your colleague does have a point, however, that multiple, spatially distributed robots all operating at the same time without coordination is a bit beyond what people usually mean when they talk about concurrent computation. Certainly such robots are operating concurrently, in the general-use sense of "at the same time", but it's a bit of a stretch to characterize them as participating in a concurrent computation.
The server that allocates a separate thread to handle communication with each robot may thereby be performing a concurrent computation. But as long as we're splitting hairs, do recognize that communication over a single network interface is serialized, so unless your server has multiple network interfaces, the actual communication cannot be truly concurrent. If the server is primarily just recording the data as it arrives, as opposed to incorporating it into an ongoing concurrent computation, then it would be potentially misleading to describe it as performing a concurrent operation.
Even by your colleague's definition, this is a concurrent system since there are multiple threads executing on the hardware on which the service resides.
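For reference, the thread-per-gateway design from the question might look roughly like the Java sketch below. The real gateways are not described in the question, so each one is modelled here as an in-memory BlockingQueue, and the END marker, message names, and class name are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: one listener thread per gateway.
class GatewayListenerDemo {

    static final String END = "END"; // assumed end-of-stream marker

    /** Starts one thread per gateway and returns the total number of messages received. */
    static int listenToAll(List<BlockingQueue<String>> gateways) throws InterruptedException {
        AtomicInteger received = new AtomicInteger();
        List<Thread> listeners = new ArrayList<>();
        for (BlockingQueue<String> gateway : gateways) {
            Thread t = new Thread(() -> {
                try {
                    while (!gateway.take().equals(END)) { // blocks until this robot publishes
                        received.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            listeners.add(t);
        }
        for (Thread t : listeners) {
            t.join();
        }
        return received.get();
    }

    /** Three simulated robots, each publishing two messages. */
    static int demo() throws InterruptedException {
        List<BlockingQueue<String>> gateways = new ArrayList<>();
        for (int robot = 0; robot < 3; robot++) {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();
            queue.add("pose");
            queue.add("battery");
            queue.add(END);
            gateways.add(queue);
        }
        return listenToAll(gateways);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // prints 6
    }
}
```

The listener threads can all be in progress at once, and the shared counter is the point where they need coordination, which is why an atomic type is used for it.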

What is the difference between concurrency and multithreading?

What is the difference between concurrency and multithreading? Is concurrency only possible on a multicore CPU? Can anybody explain it with an example?
What is the difference between concurrency and multithreading?
Concurrency describes the way in which processes run. They are either sequential (one after another), concurrent (able to make progress "at the same time" although not necessarily at the same instant), or parallel (they happen simultaneously).
Multi-threading is a technique which allocates individual threads of execution; they are essentially lightweight processes with some advantages with respect to shared resources from their parent.
If you pay close attention, multi-threading is possible on both concurrent and non-concurrent systems. A thread is lightweight compared with a process, so having multiple threads on a non-concurrent system would not result in parallel execution: each thread would still start and run until finished before the next one. On a concurrent system, they would each get their fair share of CPU time and would all be making progress concurrently.
Is concurrency only possible in multicore cpu?
I think we know now, the answer to this is no. Concurrent execution of processes is taken for granted to the point it's widely misunderstood as parallelism; a much more powerful tool.
To give an example that provides some insight, think about your machine. It does all kinds of stuff all the time and you do not (hopefully) experience any lag in its performance. All these processes are running concurrently giving you, the user, a perception of parallelism even when on a single core machine (I know cause I'm old :)).
But what about a merge sort? Couldn't we perform two merge sorts simultaneously on two halves of the data; yes. But only if we have multiple cores/CPUs.
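As a rough Java sketch of that idea, the code below sorts the two halves of an array as two separate asynchronous tasks and then merges them. It uses Arrays.sort for each half rather than a hand-written merge sort, and the class and method names are invented for the example. Whether the two halves really run at the same instant depends on having more than one core available.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: sort two halves as separate tasks, then merge.
class ParallelHalvesSort {

    static int[] sort(int[] data) {
        int mid = data.length / 2;
        int[] left = Arrays.copyOfRange(data, 0, mid);
        int[] right = Arrays.copyOfRange(data, mid, data.length);
        // Each half is sorted in its own task; with two or more cores they can overlap in time.
        CompletableFuture<Void> leftTask = CompletableFuture.runAsync(() -> Arrays.sort(left));
        CompletableFuture<Void> rightTask = CompletableFuture.runAsync(() -> Arrays.sort(right));
        CompletableFuture.allOf(leftTask, rightTask).join();
        return merge(left, right);
    }

    /** Standard two-way merge of two sorted arrays. */
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 5, 7, 9]
        System.out.println(Arrays.toString(sort(new int[] {5, 2, 9, 1, 7, 3})));
    }
}
```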
Concurrency means doing multiple tasks simultaneously, that is, multiple tasks running in parallel. So to run multiple tasks in parallel, you definitely need multiple threads.
So concurrency is achieved by multithreading.
Now, coming to your question:
Is concurrency only possible in multicore cpu?
The answer is no.
Suppose I have 2 threads and only 1 core. In this case, the CPU will give time to each thread to complete its task. So multithreading is possible even on a single-core CPU.

Will threading help improve efficiency in Java?

My application is supposed to have a "realtime with pause" functionality. The user can pause execution, do some things that modify what's going to happen, then unpause and let stuff happen. Stuff happens at regular intervals as specified by the user, can be slow, can be fast.
My goal at using threading here is to improve performance on multicore systems. The amount of data that the application is supposed to crunch at the time intervals is supposed to be arbitrarily large (I expect lots and lots of loops over collections, modifying object properties and generating random numbers, but precious little disk access). I don't want the application to be constrained by the capacity of a single core, if it can use more to run faster.
Will this actually work this way?
I've run some tests (made a program crunch numbers a lot, and looked at CPU usage during its activity), but it's not really conclusive - usage is certainly in the proximity of 100% on my dual core machine, but hardly ever 100%. Does a single-threaded (main only) Java application use all available cores for computation?
Does a single-threaded (main only) Java application use all available cores for computation?
No, it will normally use a single core.
Making a program do computations in parallel with multiple threads may make it faster, but it's not a magical solution for any kind of problem. Whether this is a suitable solution for your program depends on what your program is doing exactly, and if the algorithm can be parallelized. If, for example, you are doing lots of computations where the next computation depends on the result of the previous computation, then making it multi-threaded will not help a lot, because you can't do the computations at the same time - the next one first has to wait for the answer of the previous one. So, you first have to think about what computations in your program could be run in parallel.
Java has a lot of support for multi-threading. You can program with threads directly, or use an executor service, or use the fork/join framework. Whatever is appropriate depends on what exactly you want to do.
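As one example of the executor-service route, here is a minimal sketch that splits an independent computation (summing the numbers 1 to n) into one chunk per worker thread. The class name and the chunking scheme are assumptions made for the illustration; this works only because the chunks do not depend on each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: splitting independent work across a thread pool.
class ParallelSum {

    /** Sums 1..n by giving each worker thread its own contiguous range. */
    static long sum(long n, int workers) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = n / workers;
            for (int w = 0; w < workers; w++) {
                long from = w * chunk + 1;
                long to = (w == workers - 1) ? n : (w + 1) * chunk; // last worker takes the remainder
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (long i = from; i <= to; i++) {
                        s += i;
                    }
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that chunk to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sum(1_000_000, cores)); // prints 500000500000
    }
}
```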
Does a single-threaded (main only) Java application use all available cores for computation?
Not usually, but you can use some higher-level APIs in Java that use threads for you, so you are not working with threads directly: most obviously fork/join and executors, and less obviously the Streams API on collections (i.e. parallelStream).
In general, though, to make use of all cores, you need some kind of concurrency. Furthermore, it's really hard to tell what is going on just by watching your OS monitor (especially with only 2 cores): your OS has other things going on (managing itself, running your IDE, running crontab, running a browser to post to Stack Overflow ;).
Finally, just adding concurrency may not help by itself; you have to do it "right" for your code/algorithm.
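A minimal sketch of the parallelStream route mentioned above, with a made-up class and method name. No thread is created explicitly; the stream implementation decides how to split the work across its worker threads.

```java
import java.util.List;

// Hypothetical illustration: letting the Streams API use threads for you.
class ParallelStreamSum {

    /** Sums the squares of the given numbers using a parallel stream. */
    static long sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToLong(x -> (long) x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(List.of(1, 2, 3, 4))); // prints 30
    }
}
```

For a list this small the parallel version is unlikely to be faster than a sequential stream; it only pays off when there is enough work per element or enough elements.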
A Java thread runs on a single CPU at a time. To use multiple CPUs, you need multiple threads.
Imagine that you have to do various tasks by hand. You will do them slowly using one hand and more efficiently using both hands. Similarly, in Java or in any other language, multithreading gives the system many hands. The good news is that you can have many threads doing different tasks. Running every operation in a single thread can make the program sluggish and sometimes unresponsive. A good practice is to do long-running tasks in a separate thread. For example, loading large chunks of data from a database should be done in a separate thread, and so should downloading data from the internet. What happens if you do long-running operations in the main thread? The program hangs and stays unresponsive until the task completes, and the user will think that something is wrong. I hope that makes it clear.
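A minimal sketch of moving a long-running task off the main thread, assuming a hypothetical slowLoad method that stands in for a database query or a download. The main thread starts the work, stays free to do other things, and collects the result when it is needed.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: running a slow operation on a background thread.
class BackgroundLoad {

    /** Stand-in for a slow database query or download (an assumption for the demo). */
    static String slowLoad() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "rows loaded";
    }

    /** Starts the slow operation on another thread and returns immediately. */
    static CompletableFuture<String> loadAsync() {
        return CompletableFuture.supplyAsync(BackgroundLoad::slowLoad);
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = loadAsync();
        System.out.println("main thread is still responsive");
        System.out.println(pending.join()); // waits here only when the result is needed
    }
}
```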

How long does a Java thread switch take?

I'm learning reactive programming techniques, with async I/O etc, and I just can't find decent authoritative comparative data about the benefits of not switching threads.
Apparently switching threads is "expensive" compared to computations. But what scale are we talking on?
The essential question is "How many processor cycles/instructions does it take to switch a Java thread?" (I'm expecting a range.)
Is it affected by OS?
I presume it's affected by number of threads, which is why async IO is so much better than blocking - the more threads, the further away the context has to be stored (presumably even out of the cache into main memory).
I've seen Approximate timings for various operations which although it's (way) out of date, is probably still useful for relating processor cycles (network would likely take more "instructions", SSD disk probably less).
I understand that reactive applications enable web apps to go from 1000's to 10,000's requests per second (per server), but that's hard to tell too - comments welcome
NOTE - I know this is a bit of a vague, useless, fluffy question at the moment because I have little idea on the inputs that would affect the speed of a context switch. Perhaps statistical answers would help - as an example I'd guess >=60% of threads would take between 100-10000 processor cycles to switch.
Thread switching is done by the OS, so Java has little to do with it. Also, on Linux at least, and I presume on many other operating systems, the scheduling cost does not depend much on the number of threads. Linux got an O(1) scheduler in version 2.6; its successor, CFS (since 2.6.23), still picks the next thread to run cheaply.
The thread switch overhead on Linux is some 1.2 µs (article from 2018). Unfortunately the article doesn't list the clock speed at which that was measured, but the overhead should be some 1000-2000 clock cycles or thereabout. On a given machine and OS the thread switching overhead should be more or less constant, not a wide range.
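The conversion from time to clock cycles is just duration multiplied by clock frequency, since 1 GHz is one cycle per nanosecond. Because the article does not state the clock speed, the frequencies in the sketch below are assumptions chosen only to show the range; the class and method names are made up.

```java
// Hypothetical illustration: converting a switch time into clock cycles.
class SwitchCostInCycles {

    /** cycles = duration in nanoseconds x clock in GHz (1 GHz = 1 cycle per ns). */
    static long cycles(long nanos, double clockGhz) {
        return Math.round(nanos * clockGhz);
    }

    public static void main(String[] args) {
        long switchNanos = 1_200; // the 1.2 µs figure quoted above
        for (double ghz : new double[] {1.0, 1.5, 3.0}) { // assumed clock speeds
            System.out.println(ghz + " GHz -> " + cycles(switchNanos, ghz) + " cycles");
        }
        // 1.0 GHz -> 1200 cycles, 1.5 GHz -> 1800 cycles, 3.0 GHz -> 3600 cycles
    }
}
```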
Apart from this direct switching cost there's also the cost of changing workload: the new thread is most likely using a different set of instructions and data, which need to be loaded into the cache, but this cost doesn't differ between a thread switch or an asynchronous programming 'context switch'. And for completeness, switching to an entirely different process has the additional overhead of changing the memory address space, which is also significant.
By comparison, the switching overhead between goroutines in the Go programming language (which uses userspace threads that are very similar to asynchronous programming techniques) was around 170 ns, so about one seventh of a Linux thread switch.
Whether that is significant for you depends on your use case of course. But for most tasks, the time you spend doing computation will be far more than the context switching overhead. Unless you have many threads that do an absolutely tiny amount of work before switching.
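If you want a rough number for your own machine, a ping-pong between two threads is one way to get a feel for it. The sketch below is an assumption-laden microbenchmark, not a rigorous measurement: it times a full round trip through two SynchronousQueue handoffs, which includes queue overhead and wake-up latency, and on a multi-core machine part of the handoff may happen by spinning rather than by a real context switch. All names are invented for the example.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical illustration: timing handoffs between two threads.
class HandoffTiming {

    /** Average time in nanoseconds for one main -> echo -> main round trip. */
    static long averageRoundTripNanos(int rounds) throws InterruptedException {
        SynchronousQueue<Integer> ping = new SynchronousQueue<>();
        SynchronousQueue<Integer> pong = new SynchronousQueue<>();
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    pong.put(ping.take()); // send each value straight back
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            ping.put(i);  // hand the value to the echo thread
            pong.take();  // wait until it comes back
        }
        long elapsed = System.nanoTime() - start;
        echo.join();
        return elapsed / rounds;
    }

    public static void main(String[] args) throws InterruptedException {
        averageRoundTripNanos(10_000); // warm-up run so the JIT has compiled the loop
        System.out.println(averageRoundTripNanos(100_000) + " ns per round trip");
    }
}
```

Compare the result with the time your tasks spend computing between switches; only when the two are of similar size does the switching cost start to matter.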
Threading overhead has improved a lot since the early 2000s, and according to the linked article, running 10,000 threads in production shouldn't be a problem on a recent server with a lot of memory. General claims of thread switching being slow are often based on yesteryear's computers, so take those with a grain of salt.
One remaining fundamental advantage of asynchronous programming is that the userspace scheduler has more knowledge about the tasks, and so can in principle make smarter scheduling decisions. It also doesn't have to deal with processes from different users doing wildly different things that still need to be scheduled fairly. But even that can be worked around, and with the right kernel extensions these Google engineers were able to reduce the thread switching overhead to the same range as goroutine switches (200 ns).
Rugal has a point. In modern architectures, theoretical turn-around times are usually far off from actual measurements because both the hardware and the software have become so much more complex. It also inherently depends on your application. Many web applications, for example, are I/O-bound, and there the context switch time matters a lot less.
Also note that context switching (what you refer to as thread switching) is an OS thing and not a Java thing. There is no guarantee as to how "heavy" a context switch in your OS is. It used to take tens if not hundreds of thousands of CPU cycles to do a kernel-level switch, but there are also user-level switches, as well as experimental systems, where even kernel-level switches can take only a few hundred cycles.

