Are Threads in Java platform dependent? - java

It is obvious that OS scheduling/threading algorithms have their impact on Java threads, but
can we safely say that threads are OS/machine dependent?
If that is the case, then doesn't it make Java platform dependent?

Yes, the details of the scheduling of threads in Java depend on the JVM implementation and (usually) on the OS implementation as well.
But the specifics of that scheduling are not laid down in the Java SE specification either; only a selected few ground rules are specified.
This means that as long as the OS-specific scheduling conforms to those ground rules, it also conforms to the JVM spec.
If your code depends on scheduling specifics that are not specified in the JVM spec, then it depends on implementation details and cannot be expected to work everywhere.
That's pretty much the same situation as file I/O: if you hard-code paths and use a fixed directory separator, then you're working outside the spec and cannot expect your code to work cross-platform.
Edit: The JVM implementation itself (i.e. the JRE) is platform dependent, of course. It provides the layer that allows pure Java programs not to care about platform specifics. To achieve this, the JRE has to be platform specific.
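To make the scheduling point concrete, here is a minimal sketch of my own (not from the spec; the class name and the choice of CountDownLatch are arbitrary) contrasting code that leans on unspecified scheduling behaviour with code that relies only on guarantees every conforming JVM must honour:

    import java.util.concurrent.CountDownLatch;

    public class SchedulingDemo {
        public static void main(String[] args) throws InterruptedException {
            // Outside the spec: hoping a higher priority makes this thread run first.
            // Priorities are only hints; some platforms ignore them entirely.
            Thread hopeful = new Thread(() -> System.out.println("maybe first, maybe not"));
            hopeful.setPriority(Thread.MAX_PRIORITY);
            hopeful.start();

            // Inside the spec: an explicit ordering guarantee via CountDownLatch.
            CountDownLatch ready = new CountDownLatch(1);
            Thread worker = new Thread(() -> {
                try {
                    ready.await();   // blocks until countDown() is called
                    System.out.println("guaranteed to run after countDown()");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            ready.countDown();       // releases the worker
            worker.join();
            hopeful.join();
        }
    }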

... Java will usually use native threads, but on some operating
systems it uses so-called "green threads", which the JVM handles
itself and which are executed by a single native thread.
You shouldn't have to worry about this. It is all handled by the JVM,
and is invisible to the programmer. The only real difference I can
think of is that on an implementation that uses green threads, there
will be no performance gain from multi-threaded divide-and-conquer
algorithms. However, the same lack of performance gain is true for
implementations that use native threads, but run on a machine with a
single core.
Excerpt from JVM & Java Threads Scheduling

Even on the same platform, if you write unsafe multi-thread code, behavior can depend on the full configuration details, the rest of the machine load, and a lot of luck, as well as hardware and OS. An unsafe program can work apparently correctly one day, and fail the next on the same hardware with more-or-less the same workload.
If you write safe multi-thread code, code that depends only on what is promised in the Java Language Specification and the library APIs, the choice of platform can, of course, affect performance, but not whether it works functionally.
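As a small illustration of that distinction (a sketch of my own, with arbitrary names), the following contrasts an unsafe shared counter, whose result genuinely depends on platform, timing and luck, with a counter whose result is guaranteed by the library API on every conforming JVM:

    import java.util.concurrent.atomic.AtomicInteger;

    public class CounterDemo {
        static int unsafeCount = 0;                                  // plain field: increments can be lost
        static final AtomicInteger safeCount = new AtomicInteger();  // atomic: no lost updates

        public static void main(String[] args) throws InterruptedException {
            Runnable task = () -> {
                for (int i = 0; i < 100_000; i++) {
                    unsafeCount++;                 // data race: read-modify-write is not atomic
                    safeCount.incrementAndGet();   // guaranteed by the library API
                }
            };
            Thread a = new Thread(task), b = new Thread(task);
            a.start(); b.start();
            a.join();  b.join();
            // unsafeCount may print anything up to 200000 depending on platform and timing;
            // safeCount always prints 200000.
            System.out.println(unsafeCount + " vs " + safeCount.get());
        }
    }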

Related

Why was the Fork/Join framework introduced when all Java threads are native threads created using OS libraries?

What I know is that since JDK 1.2 all Java threads have been created using the 'native thread model', which associates each Java thread with an OS thread with the help of JNI and the OS thread library.
So from the following text I believe that all Java threads created nowadays can make use of multi-core processors:
Multiple native threads can coexist. Therefore it is also called many-to-many model. Such characteristic of this model allows it to take complete advantage of multi-core processors and execute threads on separate individual cores concurrently.
But when I read about the Fork/Join Framework introduced in JDK 7 in Java: The Complete Reference:
Although the original concurrent API was impressive in its own right, it was significantly expanded by JDK 7. The most important addition was the Fork/Join Framework. The Fork/Join Framework facilitates the creation of programs that make use of multiple processors (such as those found in multicore systems). Thus, it streamlines the development of programs in which two or more pieces execute with true simultaneity (that is, true parallel execution), not just time-slicing.
It makes me wonder why the framework was introduced when the 'Java native thread model' has existed since JDK 3.
The fork/join framework does not replace the original low-level thread API; it makes it easier to use for certain classes of problems.
The original, low-level thread API works: you can use all the CPUs and all the cores on the CPUs installed on the system. If you ever try to actually write multithreaded applications, you'll quickly realize that it is hard.
The low-level thread API works well for problems where threads are largely independent and don't have to share information with each other - in other words, embarrassingly parallel problems. Many problems, however, are not like this. With the low-level API it is very difficult to implement complex algorithms in a way that is safe (produces correct results and does not have unwanted effects like deadlock) and efficient (does not waste system resources).
The Java fork/join framework, an implementation of the fork/join model, was created as a high-level mechanism to make it easier to apply parallel computing to divide-and-conquer algorithms, as sketched below.
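For readers who have not used it, here is a minimal, illustrative fork/join sketch (the task name and threshold are arbitrary choices of mine) that sums an array by divide and conquer, splitting the work across the cores available to the common pool:

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Divide-and-conquer sum of an array using the Fork/Join framework.
    public class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000;
        private final long[] data;
        private final int from, to;

        SumTask(long[] data, int from, int to) {
            this.data = data; this.from = from; this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {              // small enough: compute directly
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                               // run the left half asynchronously
            return right.compute() + left.join();      // compute right half, then combine
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            java.util.Arrays.fill(data, 1L);
            long sum = ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
            System.out.println(sum);                   // prints 1000000
        }
    }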

Why are operating systems not written in Java?

All the operating systems to date have been written in C/C++, while there is none in Java. There are tonnes of Java applications but not an OS. Why?
Because we have operating systems already, mainly. Java isn't designed to run on bare metal, but that's not as big of a hurdle as it might seem at first. As C compilers provide intrinsic functions that compile to specific instructions, a Java compiler (or JIT, the distinction isn't meaningful in this context) could do the same thing. Handling the interaction of GC and the memory manager would be somewhat tricky also. But it could be done. The result is a kernel that's 95% Java and ready to run jars. What's next?
Now it's time to write an operating system. Device drivers, a filesystem, a network stack, all the other components that make it possible to do things with a computer. The Java standard library normally leans heavily on system calls to do the heavy lifting, both because it has to and because running a computer is a pain in the ass. Writing a file, for example, involves the following layers (at least, I'm not an OS guy so I've surely missed stuff):
The filesystem, which has to find space for the file, update its directory structure, handle journaling, and finally decide what disk blocks need to be written and in what order.
The block layer, which has to schedule concurrent writes and reads to maximize throughput while maximizing fairness.
The device driver, which has to keep the device happy and poke it in the right places to make things happen. And of course every device is broken in its own special way, requiring its own driver.
And all this has to work fine and remain performant with a dozen threads accessing the disk, because a disk is essentially an enormous pile of shared mutable state.
At the end, you've got Linux, except it doesn't work as well because it doesn't have nearly as much effort invested into functionality and performance, and it only runs Java. Possibly you gain performance from having a single address space and no kernel/userspace distinction, but the gain isn't worth the effort involved.
There is one place where a language-specific OS makes sense: VMs. Let the underlying OS handle the hard parts of running a computer, and the tenant OS handles turning a VM into an execution environment. BareMetal and MirageOS follow this model. Why would you bother doing this instead of using Docker? That's a good question.
Indeed there is a JavaOS: http://en.wikipedia.org/wiki/JavaOS
And here is a discussion about why there are not many OSes written in Java: Is it possible to make an operating system using java?
In short, Java needs to run on a JVM, and the JVM needs to run on an OS, so writing an OS using Java is not a good choice.
An OS needs to deal with hardware, which is not doable using Java (except via JNI). That is because the JVM provides only a limited set of instructions that can be used from Java: add, call a method, and so on. But dealing with hardware needs instructions that operate on registers, memory, the CPU and hardware drivers directly. These are not supported directly in the JVM, so JNI is needed. That is back to the start - you still need to write part of the OS in C/assembly.
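To illustrate the JNI point, here is a minimal, purely hypothetical sketch; the class name, library name and method are invented for illustration, and the actual hardware access would still have to be written in C or assembly on the native side:

    // Hypothetical sketch: the 'native' keyword tells the JVM that the method body
    // lives in platform-specific code loaded from a shared library, which is why
    // hardware access ultimately has to leave Java.
    public class PortAccess {
        static {
            System.loadLibrary("portaccess");   // hypothetical native library written in C
        }
        // Implemented in C; could read a memory-mapped hardware register,
        // something pure Java bytecode has no instruction for.
        public static native int readRegister(long address);
    }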
Hope this helps.
One of the main benefits of using Java is that it abstracts away a lot of low-level details that you usually don't really need to care about. It's those details that are required when you build an OS. So while you could work around this to write an OS in Java, it would have a lot of limitations, and you'd spend a lot of time fighting with the language and its initial design principles.
For operating systems you need to work really low-level. And that is a pain in Java. You need, for example, unsigned data types, and Java only has signed data types. You need struct objects that have exactly the memory alignment the driver expects (and no object header like the one Java adds to every object).
Even key components of Java itself are no longer written in Java.
And this is by no means a temporary thing. More and more gets rewritten in native code to get better performance. The HotSpot VM adds "intrinsics" for performance-critical native code, and there is work underway to reduce the overall cost of native calls.
For example JavaFX: the reason it is much faster than AWT/Swing ever were is that it contains/uses a huge amount of native code. It relies on native code for rendering, and if you add the "WebView" browser component, for example, it is actually using the WebKit library to provide the browser.
There are a number of things Java does really well. It is a nicely structured language with a fantastic toolchain. Python is much more compact to write, but its toolchain is a mess; refactoring tools, for example, are disappointing. And where Java shines is at optimizing polymorphism at run time. Where C++ compilers would need to emit expensive virtual calls - because at compile time it is not known which implementation will be used - HotSpot can aggressively inline code to get better performance. But for operating systems you do not need this much. You can afford to manually optimize call sites and inlining.
This answer does not mean to be exhaustive in any way, but I'd like to share my thoughts on the (very vast) topic.
Although it is theoretically possible to write some OS in pure Java, there are practical matters that make this task really difficult. The main problem is that there is no (currently up-to-date and reliable) Java compiler able to compile Java to native machine code. So there is no existing tool to make writing a whole OS from the ground up feasible in Java, at least as far as my knowledge goes.
Java was designed to run in some implementation of the Java virtual machine. There exist implementations for Windows, Mac, Linux, Android, etc. The design of the language is strongly based on the assumption that the JVM exists and will do some magic for you at runtime (think garbage collection, JIT compilation, reflection, etc.). This is most likely part of the reason why such a compiler does not exist: where would all this functionality go? Compiled down into the native code? It's possible, but at this point I believe it would be difficult to do. Even Android, whose SDK is purely Java based, runs Dalvik (a virtual machine that executes its own bytecode compiled from Java sources and supports a subset of the platform) on a Linux kernel.

Compile Java to behave like Go code

Would it be possible to write a Java compiler or virtual machine that would let you compile legacy Java applications that use threads and blocking system calls the same way Go programs are compiled?
Thus new Thread(...).start(); would create a lightweight thread, and all blocking system calls would instead become asynchronous operating system calls that make the lightweight thread yield.
If not, what is the main reason this would be impossible?
Earlier versions of Sun's Java runtime on Solaris (and other UNIX systems) made use of a user space threading system known as "green threads". As described in the Java 1.1 for Solaris documentation:
Implementations of the many-to-one model (many user threads to one kernel thread) allow the application to create any number of threads that can execute concurrently. In a many-to-one (user-level threads) implementation, all threads activity is restricted to user space. Additionally, only one thread at a time can access the kernel, so only one schedulable entity is known to the operating system. As a result, this multithreading model provides limited concurrency and does not exploit multiprocessors. The initial implementation of Java threads on the Solaris system was many-to-one, as shown in the following figure.
This was replaced fairly early on by the use of the operating system's threading support. In the case of Solaris prior to Solaris 9, this was an M:N "many to many" system similar to Go, where the threading library schedules a number of program threads over a smaller number of kernel-level threads. On systems like Linux and newer versions of Solaris that use a 1:1 system where user threads correspond directly with kernel-level threads, this is not the case.
I don't think there have been any serious plans to move the Sun/Oracle JVM away from using the native threading libraries since that time. As history shows, it certainly would be possible for a JVM to use such a model, but it doesn't seem to have been considered a direction worth pursuing.
James Henstridge has already provided good background on Java green threads, and the efficiency problems introduced by exposing native OS threads to the programmer because their use is expensive.
There have been several university attempts to recover from this situation. Two such are JCSP from Kent and CTJ (albeit probably defunct) from Twente. Both offer easy design of concurrency in the Go style (based on Hoare's CSP). But both suffer from the poor JVM performance of coding in this way because JVM threads are expensive.
If performance is not critical, CSP is a superior way to achieve a concurrent design because it avoids the complexities of asynchronous programming. You can use JCSP in production code - I do.
There were reports that the JCSP team also had an experimental JNI-add-on to the JVM to modify the thread semantics to be much more efficient, but I've never seen that in action.
Fortunately for Go you can "have your cake and eat it". You get CSP-based happen-before simplicity, plus top performance. Yay!
Aside: an interesting Oxford University paper reported on a continuation-passing style modification for concurrent Scala programs that allows CSP to be used on the JVM. I'm hoping for further news on this at the CPA2014 conference in Oxford this August (forgive the plug!).

Can a kernel be written in something other than assembly language?

I have never done kernel programming. I am a good programmer in the Java language and use it frequently. Now I feel like doing something interesting with kernels. A kernel resides between hardware and OS. It communicates with hardware using system calls. Every programming language requires a compiler to compile the code written in a high-level language and then generate low-level code, which is generally assembly language code. Here comes my doubt: if we have a kernel written in C, then should we have a C compiler installed on the machine? At the end, when the kernel interacts with hardware it uses assembly language, so can I create a kernel in the Java language? If yes, then what are the requirements for the same? Thank you.
A kernel resides between hardware and OS
Usually, the kernel is considered to be part of the operating system.
It communicates with hardware using system calls
System calls are the interface that is provided by the OS to user applications. The operating system communicates with the hardware through other mechanisms (for example interrupts or memory-mapped registers).
Every programming language requires a compiler to compile the code written in a high-level language and then generate low-level code, which is generally assembly language code.
The compiler output is typically either native machine code or a language-specific bytecode (like in the case of Java). Sometimes, compilers also target another programming language such as C or Javascript (transpilation).
Here comes my doubt: if we have a kernel written in C, then should we have a C compiler installed on the machine?
That's not necessary. The C compiler produces output that can execute directly on the hardware without interpretation.
At the end, when the kernel interacts with hardware it uses assembly language
The CPU doesn't understand assembly. It understands machine code.
can I create a kernel in the Java language?
It has been done.
If yes, then what are the requirements for the same?
If you want to write a kernel in Java, then you either have to
compile your entire Java codebase to machine code
get yourself a CPU that can execute Java bytecode
find or build a Java VM and runtime that can run on bare metal and run your Java code in it (if you do it cleverly, you can write much of the runtime and maybe also parts of the VM in Java itself).
Now to the unspoken, almost rhetorical question:
Is this a good idea?
Probably not. Why? First of all, because it would take ages to set up. Second, because you couldn't just code the way you develop an average business application. You'd have to think about the performance of very time-critical code (e.g. context switching, which often requires hand-tuned assembly to be fast enough), manual memory management (as in: your MMU might expect you to give it the physical address where the page table lies), system-/hardware-specific mechanisms (how to access a XYZ controller on this particular architecture?), ...
So you would lose many of the advantages that Java has over a low-level language like C in the first place.
Yes, a kernel can be written in Java; see JNode. It would have the advantage of having no problems with dangling pointers, mix-ups of pointers and array addresses, uninitialised data, and many other problematic features of C.

Are threading issues for C/C++ "system level programmers" significantly different from those faced by Java programmers?

I'm looking for a development job and see that many listings specify that the developers must be versed in multithreading. This appears both for Java job listings, and for C++ listings that involve "system programming" on UNIX.
In the past few years I have been working with Java and using its various synchronization mechanisms.
In the late 90s I did a lot of C++ work, though very little threads. In college, however, we used threads on Solaris.
My question is whether there are significant differences in the issues that developers in C/C++ face compared to developers in Java, and whether any of the techniques to address them are fundamentally different. Java obviously includes some nicer mechanisms and synchronized versions of collections, etc.
If I want to refresh or relearn threading on UNIX, what's the best approach? Which library should I look at? Is there some great current tutorial on threads in C++?
The fundamental challenges of threading (e.g. synchronization, race conditions, inter-thread communication, resource cleanup) are the same, but Java makes threads much more manageable with garbage collection, exceptions, advanced synchronization objects, and advanced debugging support via reflection.
With C++, you are much more likely to have memory corruption and "impossible" race conditions. And you will need to write a lot more low-level thread primitives or rely on libraries (like boost) that are not part of the standardized language.
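As a small illustration of those higher-level facilities (a sketch of my own, not tied to any particular library mentioned above), Java's java.util.concurrent package lets you hand work to a pool and collect results without managing thread lifetimes or low-level primitives yourself:

    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class PoolDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(4);  // thread lifecycle handled for you
            List<Callable<Integer>> tasks = List.of(
                    () -> 1 + 1,
                    () -> 2 + 2);
            List<Future<Integer>> results = pool.invokeAll(tasks);   // runs all tasks, waits for completion
            for (Future<Integer> f : results) {
                System.out.println(f.get());                         // exceptions thrown by tasks surface here
            }
            pool.shutdown();                                         // no manual join/cleanup of each thread
        }
    }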
C++ is actually easier to write complex threaded code in than Java, because it has a feature Java lacks - RAII, or "resource acquisition is initialisation". This idiom is used for all resource control in well-written C++ code, but it is particularly appropriate in multi-threaded code, where automatic management of synchronisation is a must.
Look at pthreads and boost (the pthreads one was a random link, but it looks OK as a starting point).
At a high level the issues for Java/C/C++ are the same. The specifics of how you solve the problem (functions to call, classes to create, etc...) vary from language to language.
Garbage collection makes programming threads that do not leak memory easier, and there are fancy things you can do to address the timing of the collections.
Deterministic destructors make programming threads that do not spawn zombies easier, see ACM paper here
It depends on what level you choose to work at. Intel TBB and OpenMP handle a lot of common cases from a pretty high level. Posix threads, Windows APIs, and portable libraries like Boost threads bring you closer to the same level as primitives in Java.
C++0x threading (especially with acquire and release memory barriers) allows you to go to an even lower level for more control and complexity than Java offers (marking a variable volatile in Java gives it both an acquire and a release memory barrier, but in Java you can't ask for just the acquire or just the release barrier, whereas in C++0x you can).
Please note that C++0x's threading model is intentionally low level with the hope that people will build things like TBB on top of it and the next time the standards committee meets they'll be able to figure out which of those higher level libraries and toolkits work well enough to learn from.
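To make the volatile remark above concrete, here is a small sketch (the class and field names are mine) of the combined acquire/release behaviour a Java volatile gives you, which C++0x lets you split into separate barriers:

    public class VolatileFlag {
        private static int payload;                 // plain data written before the flag
        private static volatile boolean ready;      // volatile: release on write, acquire on read

        public static void main(String[] args) {
            new Thread(() -> {
                payload = 42;       // ordinary write...
                ready = true;       // ...published by the volatile write (release)
            }).start();

            new Thread(() -> {
                while (!ready) { }  // volatile read (acquire); busy-wait kept short for brevity
                // Guaranteed to print 42: the volatile write/read pair establishes happens-before.
                System.out.println(payload);
            }).start();
        }
    }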
Regardless of the programming language being used, the idiosyncrasies of threads are common. For instance, even across operating systems, POSIX threads and Win32 threads have the same set of logical idiosyncrasies, even though the API calls and the native implementation with respect to the underlying hardware/kernel may differ. For systems programmers, reasoning about threads and making them work as expected is the hardest part. The same is true across programming languages: if you really understand the concepts of threading and thread synchronization, you are good to go and can use them in any programming language you like, since these languages provide syntactic sugar on top of the native thread/synchronization implementation.
