Java Threads: On Windows & On Linux - java

I have a chat server application which we are going to deploy on 3 servers. The chat application uses a lot of multithreading.
Basically I have to decide which OS I should use for those 3 servers, so I want to know how Linux and Windows handle Java threads differently. What is the difference? Who creates the operating system threads? What memory is assigned to them?
If, looking ahead, scalability and clustering are in scope, which option is better?

If, looking ahead, scalability and clustering are in scope, which option is better?
Scalability and clustering are most likely hampered by the internal design of your code, not by the JVM or the underlying OS. And without taking a very deep look at the code, any statement about this is just noise rather than a well-founded assessment.
But the nice thing about Java is that it will run on both platforms without changing your code. So the best you can do is benchmark both OSes on the same hardware (but do not use any kind of virtualization!) and pick the one that works best for your purpose.

how Linux and Windows handle Java threads differently.
The beauty of Java is that you don't really care: they just work. But if you are really curious, modern JVMs delegate thread handling to the operating system, so it is more of an OS question than a Java one.
what is the difference?
See above. Java has little to do here. It is about how threading is implemented in the host OS.
Who creates the operating system threads?
The JVM asks the OS to create them and provides a thin wrapper between the Java Thread object and the native thread.
What memory is assigned to them?
Each thread gets its own stack (see the -Xss JVM option), and all threads share the same heap.
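To make that concrete, here is a minimal sketch: the default stack size can be set globally with the -Xss flag, and a per-thread stack size hint can be passed to the Thread constructor that takes a stackSize argument. The class name and sizes below are just placeholders for illustration.

    // Run with a global default stack size, e.g.:  java -Xss512k StackSizeDemo
    public class StackSizeDemo {
        public static void main(String[] args) throws InterruptedException {
            Runnable task = () -> System.out.println(
                    Thread.currentThread().getName() + " running on its own stack");

            // Uses the default stack size (taken from -Xss or the platform default)
            Thread defaultStack = new Thread(task, "default-stack");

            // Per-thread stack size hint: Thread(ThreadGroup, Runnable, String, long stackSize)
            Thread smallStack = new Thread(null, task, "small-stack", 256 * 1024);

            defaultStack.start();
            smallStack.start();
            defaultStack.join();
            smallStack.join();
            // Both threads share the same heap; only their stacks are separate.
        }
    }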

Related

What does the OS do when you run java bytecode? How it interact with the JVM?

I have been studying operating systems in my free time, but I am confused about how they work with Java and the JVM.
Some questions:
When running Java bytecode with a command like java MyClass:
I understand that the JVM interprets the program and optimizes it, or performs JIT compilation.
How does the JVM get CPU allocation for doing this in a multi-threaded application?
My assumption: each thread of this application uses the same per-process JVM to perform these tasks. (Is this correct?)
What is the role of the operating system with respect to the JVM, and how do they interact?
JVM is just an app
The Java Virtual Machine (JVM) as seen in the OpenJDK project is just another app, usually written in C and C++, sometimes written in Java.
From the point of view of the host operating system, running a JVM is just like running a word-processor, a spreadsheet, or a web browser. All of these apps are large, consuming much memory and spawning threads.
As someone commented, a JVM is technically any software and/or hardware that complies with the official specifications. Indeed, there have been attempts at building hardware chips that knew how to execute Java bytecode (see Jazelle and others), but as far as I know they did not succeed. In practice today, the JVMs we download from Oracle or AdoptOpenJDK or other distributors are simply C/C++ apps that run like any other app on your Mac, BSD, Linux, Windows, AIX, Solaris, or similar machine.
I understand that JVM optimize and interpret or perform JIT of the program
HyperCard from Apple is vintage software similar to Java in that it too executed code internally through an interpreter with a JIT so that repeated runs of the same code blocks would suddenly run faster. HyperCard too was just another app from the point of view of the Mac operating system.
How does the JVM get CPU allocation for doing these in a multi-threaded application?
By scheduling threads on CPU cores like any other app does. Word processors use threads for writing to storage in the background and for re-rendering the document in the background. Web browsers might allocate threads for handling each web page in separate windows/tabs.
In each thread of this application, they all use the same per-process JVM to perform these tasks. (Is this correct?)
Yes, with OpenJDK, you will see one process on your OS for the JVM. All the threads of all the Java apps running within that JVM are housed within that single OS process. However, as someone commented, these are mere implementation details. People are free to implement a JVM as they see fit, in any manner they choose, as long as they comply with the Java specifications.
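If you want to see that single process for yourself, one quick check (on Java 9 or later) is to print the JVM's own process ID and look that PID up in Task Manager or top; the class name below is just an illustration.

    public class JvmPid {
        public static void main(String[] args) throws InterruptedException {
            // One OS process hosts the JVM and every Java thread running inside it
            System.out.println("JVM process id: " + ProcessHandle.current().pid());

            // Keep the process alive long enough to inspect it with OS tools
            Thread.sleep(60_000);
        }
    }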
See the source code
OpenJDK is open-source. So if you are really curious, peruse that source code. Note that you will find areas of code specific to each OS (macOS versus Linux versus MS Windows) and to each CPU type (x86, ARM, SPARC, and so on) where the JVM interacts with the host OS.

Multithread applications on MPP architecture

In short:
Is it worth the effort to add multithreading scalability (vertical scalability) to an application that will always run on an MPP infrastructure such as Tandem HPNS (which is horizontally scalable)?
Now, let me go deeper:
I have seen in many places that developers working on MPP (Massively Parallel Processing) systems with Java tend to think that, since it is Java, they can use everything Java provides (you know, write once, run anywhere!), and that multithreading libraries (threads, Akka, thread pools, etc.) can help a lot by speeding up performance through parallelism.
They forget that if it is MPP, it is horizontally scalable: if you need a faster app, you have to design it to run multiple copies of the application, each on a different processor.
On the other side we have SMP (Symmetric Multi-Processing) infrastructures (any Windows, Linux, or UNIX-like environment). There you don't have to worry about that, since the scalability is vertical: you can have more threads, and their execution will be distributed across the different cores the OS has available (here I do agree with using multithreading libraries).
So, with this in mind, my question is: suppose there is a need to create an application that will process a heavy load of data, with a lot of validations and other requirements, where the use of parallelism would help a lot to improve the load time, but it has to run in an MPP environment (such as Tandem HPNS).
Should the developer invest time in adding multithreading libraries to gain parallelism and concurrency?
Just a couple of side notes:
1) I'm not saying SMP is better or MPP is better; they are just different infrastructures. My point is only about the use of multithreading libraries in MPP environments, given that an application using multithreading on MPP will use just one CPU of the N CPUs the server may have.
2) I'm not saying the MPP server does not support multithreading libraries; you can have multiple threads running on HPNS, but even if you have 20 threads there is no real parallelism, since one thread blocks the others, unless you distribute the application (several copies running) across different CPUs.
No, I don't think it makes sense to add multithreaded scalability to an application that will always run on Tandem, because Tandem does not provide kernel-level threads, so even if you write a multithreaded application it will not give you any benefit.
Even though Tandem HPNS Java provides multithreading as per the Java spec, its performance is not comparable with Linux or any other OS that supports kernel-level threading.
The actual purpose of Tandem is high availability, thanks to its hardware redundancy.

Compile Java to behave like GO code

Would it be possible to write a Java compiler or virtual machine that would let you compile a legacy Java application that uses threads and blocking system calls the same way Go programs are compiled?
Thus new Thread(runnable).start() would create a lightweight thread, and every blocking system call would instead become an asynchronous operating system call that makes the lightweight thread yield.
If not, what is the main reason this would be impossible?
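For context, the kind of legacy code in question is plain thread-per-connection blocking I/O like the sketch below (the port and echo handler are made up for illustration); the idea would be for such a compiler/VM to turn each of these OS threads into a lightweight, cooperatively scheduled task, Go-style.

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class BlockingEchoServer {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9000)) {
                while (true) {
                    Socket client = server.accept();          // blocking system call
                    new Thread(() -> handle(client)).start(); // one OS thread per connection
                }
            }
        }

        private static void handle(Socket client) {
            try (client) {
                int b;
                while ((b = client.getInputStream().read()) != -1) { // blocks waiting for data
                    client.getOutputStream().write(b);
                }
            } catch (IOException e) {
                // connection dropped; nothing more to do in this sketch
            }
        }
    }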
Earlier versions of Sun's Java runtime on Solaris (and other UNIX systems) made use of a user space threading system known as "green threads". As described in the Java 1.1 for Solaris documentation:
Implementations of the many-to-one model (many user threads to one kernel thread) allow the application to create any number of threads that can execute concurrently. In a many-to-one (user-level threads) implementation, all threads activity is restricted to user space. Additionally, only one thread at a time can access the kernel, so only one schedulable entity is known to the operating system. As a result, this multithreading model provides limited concurrency and does not exploit multiprocessors. The initial implementation of Java threads on the Solaris system was many-to-one, as shown in the following figure.
This was replaced fairly early on by the use of the operating system's threading support. In the case of Solaris prior to Solaris 9, this was an M:N "many to many" system similar to Go, where the threading library schedules a number of program threads over a smaller number of kernel-level threads. On systems like Linux and newer versions of Solaris that use a 1:1 system where user threads correspond directly with kernel-level threads, this is not the case.
I don't think there have been any serious plans to move the Sun/Oracle JVM away from using the native threading libraries since that time. As history shows, it certainly would be possible for a JVM to use such a model, but it doesn't seem to have been considered a direction worth pursuing.
James Henstridge has already provided good background on Java green threads, and the efficiency problems introduced by exposing native OS threads to the programmer because their use is expensive.
There have been several university attempts to recover from this situation. Two such are JCSP from Kent and CTJ (albeit probably defunct) from Twente. Both offer easy design of concurrency in the Go style (based on Hoare's CSP). But both suffer from the poor JVM performance of coding in this way because JVM threads are expensive.
If performance is not critical, CSP is a superior way to achieve a concurrent design because it avoids the complexities of asynchronous programming. You can use JCSP in production code - I do.
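As a rough sketch of the CSP style (assuming the org.jcsp.lang API with Channel.one2one(), CSProcess and Parallel; check the JCSP documentation for the exact signatures in your version), two processes communicating over a channel look something like this:

    import org.jcsp.lang.CSProcess;
    import org.jcsp.lang.Channel;
    import org.jcsp.lang.One2OneChannel;
    import org.jcsp.lang.Parallel;

    public class CspSketch {
        public static void main(String[] args) {
            One2OneChannel channel = Channel.one2one();

            // Producer: writes numbers into the channel (each write blocks until it is read)
            CSProcess producer = () -> {
                for (int i = 0; i < 5; i++) {
                    channel.out().write(i);
                }
            };

            // Consumer: reads from the channel and prints what it receives
            CSProcess consumer = () -> {
                for (int i = 0; i < 5; i++) {
                    System.out.println("got " + channel.in().read());
                }
            };

            // Run both processes in parallel; run() returns when both have finished
            new Parallel(new CSProcess[] { producer, consumer }).run();
        }
    }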
There were reports that the JCSP team also had an experimental JNI-add-on to the JVM to modify the thread semantics to be much more efficient, but I've never seen that in action.
Fortunately for Go you can "have your cake and eat it". You get CSP-based happen-before simplicity, plus top performance. Yay!
Aside: an interesting Oxford University paper reported on a continuation-passing style modification for concurrent Scala programs that allows CSP to be used on the JVM. I'm hoping for further news on this at the CPA2014 conference in Oxford this August (forgive the plug!).

Is Java's multithreading visible to the operating system?

For example, suppose I use Java to write a multi-threaded program with 5 threads. When I execute it, does the operating system (e.g. Windows 7) know that, or does it just see one task?
That depends on the JVM implementation.
However, on the Linux platform there is USUALLY a one-to-one mapping between Java threads and native threads.
Alternatively, the JVM could choose to implement a many-to-one mapping, that is, many Java threads running on a single native thread. These are called green threads.
Modern JVMs tend to use operating system threads, but it isn't specified, and the JVM is free to do otherwise.
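A quick way to check on your own machine is to start a few threads with recognizable names and then inspect the running process: jstack <pid> lists them by name, and with a 1:1 mapping a tool like top -H on Linux or Process Explorer on Windows will show one native thread per Java thread. The names and sleep time below are arbitrary.

    public class FiveThreads {
        public static void main(String[] args) {
            for (int i = 0; i < 5; i++) {
                Thread t = new Thread(() -> {
                    try {
                        Thread.sleep(60_000); // keep the thread alive long enough to inspect it
                    } catch (InterruptedException ignored) {
                    }
                }, "worker-" + i);
                t.start();
            }
            // While this runs, `jstack <pid>` should list the worker-* threads,
            // and the OS thread view should show a matching number of native threads.
        }
    }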

How to force two Java threads to run on same processor/core?

I would like a solution that doesn't involve critical sections or similar synchronization alternatives. I'm looking for something similar to Fibers (user-level threads) on Windows.
The OS manages which threads are processed on which core. You will need to assign the threads to a single core in the OS.
For instance, on Windows, open Task Manager, go to the Processes tab, right-click on the Java process and set its affinity to a specific core.
That is the best you are going to get.
To my knowledge there is no way you can achieve that, simply because the OS manages the running threads and distributes resources according to its scheduler.
Edit:
Since your goal is to have a "spare" core to run other processes on, I'd suggest you use a thread pool, get the number of cores on the system (x), and spawn at most x-1 threads. That way you'll have your spare core.
The earlier statements still apply: you cannot specify which cores the threads run on unless you do so in the OS. From Java, no.
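A small sketch of that suggestion (the pool and the placeholder task below are just for illustration): size a fixed thread pool to the number of available processors minus one, so one core is left free for other work.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class SpareCorePool {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            int poolSize = Math.max(1, cores - 1); // leave one core "spare"

            ExecutorService pool = Executors.newFixedThreadPool(poolSize);
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> {
                    long sum = 0;
                    for (long j = 0; j < 1_000_000_000L; j++) sum += j; // CPU-bound placeholder work
                    return sum;
                });
            }
            pool.shutdown();
            // Note: this only limits how many threads compete for CPU time; the OS scheduler
            // still decides which cores they actually run on.
        }
    }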
Short of assigning the entire JVM to a single core, I'm not sure how you'd be able to do this. In Linux, you can use taskset:
http://www.cyberciti.biz/tips/setting-processor-affinity-certain-task-or-process.html
I suppose you could run your JVM within a virtualized environment (e.g., VirtualBox/VMWare instance) with one processor allocated, but I'm not sure that that gets you what you want.
I read this as asking whether a Java application can control its thread affinity itself. Java does not provide any way to control this; it is treated as the business of the host operating system.
If anything can do it, the OS can, and OSes typically can, though the tools you use for thread pinning will be OS-specific. (But if the OS is itself virtualized, there are two levels of pinning; I don't know whether that would work or be practical.)
There don't appear to be any relevant Hotspot JVM thread tuning options in modern JVMs.
If you were using a JRockit JVM you could choose between "native threads" (where there is a 1:1 mapping between Java and OS threads) and "thin threads", where multiple Java threads are multiplexed onto a small number of OS threads. But AFAIK, JRockit "thin threads" are only supported in 32-bit mode, and they don't allow you to tune the number of OS threads used.
This is really the kind of question that you should be asking under a Sun support contract. They have people who have spent years figuring out how to get the best performance out of big Java apps.
