What is the easiest way to achieve multi-core execution in Java? By this I mean specifically choosing on which core certain parts of the project execute, so good old "normal" Java threads are not an option.
So far I have been suggested JConqurr (an Eclipse toolkit for multi-core programming in Java), JaMP (which extends Java for OpenMP), and MPJ Express, about which I don't know much. Which of these do you consider the best, or do you have other suggestions? It would be preferable to be able to somehow measure the performance boost/gain, but that is not essential.
Any help would be much appreciated.
Thanks,
twentynine.
Even though it is easy to write multi-threaded code in Java, there is nothing in the Java standard runtime that lets you tell the JVM or the operating system how to schedule your program.
Hence you will need code specific to your JVM and/or your operating system, and that code may not be doable in Java (unless you dive into JNI or JNA). External programs can pin processes to a CPU in many Unix versions (and probably Windows too), but I don't think you can do this for individual threads.
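To make the external-program route concrete, here is a minimal sketch of my own (not a standard API): a JVM asking the OS to pin itself. It assumes a Linux machine with taskset installed and Java 9+ for ProcessHandle, and it pins the whole process, not individual threads.

    import java.io.IOException;

    public class PinSelf {
        public static void main(String[] args) throws IOException, InterruptedException {
            long pid = ProcessHandle.current().pid();   // this JVM's own process id (Java 9+)
            // Ask the OS (not the JVM) to restrict the whole process to cores 0 and 1.
            Process p = new ProcessBuilder("taskset", "-cp", "0,1", Long.toString(pid))
                    .inheritIO()
                    .start();
            p.waitFor();
            // From here on, every thread in this JVM is scheduled only on cores 0-1.
        }
    }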
Scala is quite popular for this. It runs on the JVM and has bindings for Java to hook them together.
Related
All the operating systems to date have been written in C/C++, while there is none in Java. There are tons of Java applications, but not an OS. Why?
Because we have operating systems already, mainly. Java isn't designed to run on bare metal, but that's not as big a hurdle as it might seem at first. Just as C compilers provide intrinsic functions that compile to specific instructions, a Java compiler (or JIT; the distinction isn't meaningful in this context) could do the same thing. Handling the interaction between the GC and the memory manager would also be somewhat tricky. But it could be done. The result is a kernel that's 95% Java and ready to run jars. What's next?
Now it's time to write an operating system. Device drivers, a filesystem, a network stack, all the other components that make it possible to do things with a computer. The Java standard library normally leans heavily on system calls to do the heavy lifting, both because it has to and because running a computer is a pain in the ass. Writing a file, for example, involves the following layers (at least, I'm not an OS guy so I've surely missed stuff):
The filesystem, which has to find space for the file, update its directory structure, handle journaling, and finally decide what disk blocks need to be written and in what order.
The block layer, which has to schedule concurrent writes and reads to maximize throughput while maximizing fairness.
The device driver, which has to keep the device happy and poke it in the right places to make things happen. And of course every device is broken in its own special way, requiring its own driver.
And all this has to work fine and remain performant with a dozen threads accessing the disk, because a disk is essentially an enormous pile of shared mutable state.
At the end, you've got Linux, except it doesn't work as well because it doesn't have nearly as much effort invested in functionality and performance, and it only runs Java. Possibly you gain some performance from having a single address space and no kernel/userspace distinction, but the gain isn't worth the effort involved.
There is one place where a language-specific OS makes sense: VMs. Let the underlying OS handle the hard parts of running a computer, and the tenant OS handles turning a VM into an execution environment. BareMetal and MirageOS follow this model. Why would you bother doing this instead of using Docker? That's a good question.
Indeed there is a JavaOS http://en.wikipedia.org/wiki/JavaOS
And here is a discussion about why there are not many OSes written in Java: Is it possible to make an operating system using java?
In short, Java needs to run on a JVM, and the JVM needs to run on an OS, so writing an OS in Java is not a good choice.
An OS needs to deal with hardware, which is not doable in Java (except via JNI). That is because the JVM only provides a limited set of operations that Java code can express: bytecodes for arithmetic, method calls, and so on. Dealing with hardware requires instructions that operate on registers, memory, the CPU, and device drivers directly. These are not supported by the JVM, so JNI is needed, and that brings you back to the start: you still end up writing the OS in C/assembly.
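To make that concrete, here is a hypothetical sketch of the shape such code takes: Java can only declare the operation, while the body that actually pokes registers or I/O ports has to be written in C and loaded as a shared library (the library name portio and both methods are made up for illustration).

    // Hypothetical sketch: no Java bytecode exists for "read an I/O port",
    // so the implementation must live in a C library (e.g. libportio.so)
    // built against a JNI header and loaded at runtime.
    public class PortIo {
        static {
            System.loadLibrary("portio");   // loads the native C implementation
        }

        public static native int readPort(int port);     // body written in C, not Java

        public static native void writePort(int port, int value);
    }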
Hope this helps.
One of the main benefits of using Java is that it abstracts away a lot of low-level details that you usually don't really need to care about. It's exactly those details that are required when you build an OS. So while you could work around this to write an OS in Java, it would have a lot of limitations, and you'd spend a lot of time fighting the language and its original design principles.
For operating systems you need to work at a really low level, and that is a pain in Java. You need, for example, unsigned data types, and Java only has signed ones. You need struct objects that have exactly the memory layout the driver expects (and no object header like the one Java adds to every object).
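As a small sketch of that pain (the "device header" layout here is invented for illustration): reading a little-endian structure with an unsigned 8-bit status and an unsigned 16-bit length forces you into ByteBuffer plus manual masking, because Java has neither unsigned integer types nor user-controlled struct layout.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class DeviceHeader {
        public static void main(String[] args) {
            // Pretend these 4 bytes were read from a device: status, padding, length (LE).
            byte[] raw = { (byte) 0xFF, 0x00, (byte) 0xA0, 0x0F };

            ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
            int status = buf.get() & 0xFF;         // byte is signed, so mask to get 0..255
            buf.get();                             // skip the padding byte
            int length = buf.getShort() & 0xFFFF;  // short is signed, so mask to get 0..65535

            System.out.printf("status=0x%02X length=%d%n", status, length);  // status=0xFF length=4000
        }
    }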
Even key components of Java itself are no longer written in Java.
And this is by no means a temporary thing. More and more gets rewritten in native code for better performance. The HotSpot VM adds "intrinsics" for performance-critical native code, and there is work underway to reduce the overall cost of native calls.
Take JavaFX, for example: the reason it is much faster than AWT/Swing ever were is that it contains/uses a huge amount of native code. It relies on native code for rendering, and if you add the "WebView" browser component, it actually uses the native WebKit library to provide the browser.
There are a number of things Java does really well. It is a nicely structured language with a fantastic toolchain. Python is much more compact to write, but its toolchain is a mess; refactoring tools, for example, are disappointing. Where Java shines is in optimizing polymorphism at run time: where C++ compilers have to emit expensive virtual calls, because at compile time it is not known which implementation will be used, HotSpot can aggressively inline code to get better performance. But for operating systems you do not need this much; you can afford to optimize call sites and inlining manually.
This answer does not mean to be exhaustive in any way, but I'd like to share my thoughts on the (very vast) topic.
Although it is theoretically possible to write an OS in pure Java, there are practical matters that make the task really difficult. The main problem is that there is no (currently up-to-date and reliable) Java compiler able to compile Java to native machine code, so there is no existing tool that would make writing a whole OS from the ground up feasible in Java, at least as far as my knowledge goes.
Java was designed to run on some implementation of the Java virtual machine. There exist implementations for Windows, Mac, Linux, Android, etc. The design of the language is strongly based on the assumption that the JVM exists and will do some magic for you at runtime (think garbage collection, the JIT compiler, reflection, etc.). This is most likely part of the reason why such a compiler does not exist: where would all this functionality go? Compiled down into the native binary? It's possible, but at this point I believe it would be difficult to do. Even Android, whose SDK is purely Java-based, runs Dalvik (a virtual machine that supports a subset of the language) on a Linux kernel.
I need to use Java for a desktop application. I heard that there are many tools that compile Java natively; is this true? Do these tools compile a Java program into machine code?
Thank you!
Since the (Sun/Oracle) Java VM has a good JIT (just-in-time) compiler, you don't have to compile your Java program to machine code yourself. The compiler will do that on the fly when it's necessary.
So: speed up your Java program just as you would any other program:
reduce algorithmic complexity
exploit parallelism
compute at the right moment
find and remove bottlenecks
...
Since Java is a garbage-collected language, there is one more important route to speed: reduce allocations! Reducing allocations helps you at least twice over: you save the cost of the allocation itself, and the garbage collector has less work to do (which also saves time).
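A tiny sketch of what "reduce allocations" means in practice (an illustration, not a universal rule; measure before and after):

    import java.util.List;

    public class AllocationDemo {
        // Allocates a fresh String on every iteration: lots of short-lived garbage.
        static String wastefulJoin(List<String> parts) {
            String result = "";
            for (String p : parts) {
                result = result + p + ",";   // each concatenation builds a new String
            }
            return result;
        }

        // Reuses one StringBuilder, so far fewer objects ever reach the garbage collector.
        static String frugalJoin(List<String> parts) {
            StringBuilder sb = new StringBuilder();
            for (String p : parts) {
                sb.append(p).append(',');
            }
            return sb.toString();
        }
    }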
I agree with the others that compiling to machine code does not make much sense: bear in mind that C's malloc/free has the same or higher cost than Java's new/garbage collection.
The NetBeans IDE comes with a built-in Profiler; so you could profile your application in that IDE to find bottlenecks.
Are you writing the app yourself, or is it someone else's?
It looks like you're trying to run a Java app that is slow. Try increasing the memory when running it. You can change the shell script to specify these params:
java -Xms64m -Xmx512m
I need to use Java for a desktop application. I heard that there are many tools that compile Java natively; is this true? Do these tools compile a Java program into machine code?
Such tools do exist, but they may come with trade-offs around some of the more dynamic capabilities of the Java platform; for example, you may lose the ability to load new classes at runtime. The JVM may have a slow start-up, but it's plenty fast once it gets going.
That said, one solution that I didn't see anyone mention here is to replace code written in Swing with SWT. The SWT toolkit uses native code underneath.
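For reference, a minimal SWT window looks roughly like the sketch below (it assumes the platform-specific SWT jar is on the classpath; everything you see on screen is drawn by the OS's native widgets):

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.widgets.Display;
    import org.eclipse.swt.widgets.Label;
    import org.eclipse.swt.widgets.Shell;

    public class SwtHello {
        public static void main(String[] args) {
            Display display = new Display();          // connects to the native windowing system
            Shell shell = new Shell(display);
            shell.setText("SWT example");

            Label label = new Label(shell, SWT.NONE);
            label.setText("Hello from native widgets");
            label.pack();

            shell.pack();
            shell.open();
            while (!shell.isDisposed()) {             // standard SWT event loop
                if (!display.readAndDispatch()) {
                    display.sleep();
                }
            }
            display.dispose();
        }
    }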
I've heard great things about JRuby, and I know you can run it without knowing any Java. My development skills are strong; Java is just not one of the tools I know. It's a massive ecosystem with a myriad of accompanying tools such as Maven/Ant/JUnit, etc.
Is it worth moving my current Rails applications to JRuby for performance reasons alone? Perhaps if I pick up some basic Java alongside it, there could be some added benefits that aren't obvious, such as better debugging and performance-optimization tools?
Would love some advice on this one.
I think you pretty much nailed it.
JRuby is just one more Ruby execution engine, like MRI, YARV, IronRuby, Rubinius, MacRuby, MagLev, SmallRuby, Ruby.NET, XRuby, RubyGoLightly, tinyrb, HotRuby, BlueRuby, Red Sun and all the others.
The main differences are:
portability: for example, YARV is only officially supported on x86 32 Bit Linux. It is not supported on OSX or Windows or 64 Bit Linux. Rubinius only works on Unix, not on Windows. JRuby OTOH runs everywhere: desktops, servers, phones, App Engine, you name it. It runs on the Oracle JDK, OpenJDK, IBM J9, Apple SoyLatte, RedHat IcedTea and Oracle JRockit JVMs (and probably a couple of others I forgot about) and also on the Dalvik VM. It runs on Windows, Linux, OSX, Solaris, several BSDs, other proprietary and open Unices, OpenVMS and several mainframe OSs, Android and Google App Engine. In fact, on Windows, JRuby passes more RubySpec tests than "Ruby" (meaning MRI or YARV) itself!
extensibility: Ruby programs running on JRuby can use any arbitrary Java library. Through JRuby-FFI, they can also use any arbitrary C library. And with the new C extension support in JRuby 1.6, they can even use a large subset of MRI and YARV C extensions, like Mongrel for example. (And note that "Java" or "C" library does not actually mean written in those languages, it only means with a Java or C API. They could be written in Scala or Clojure or C++ or Haskell.)
tooling: whenever someone writes a new tool for YARV or MRI (like e.g. memprof), it turns out that JRuby already had a tool 5 years ago which does the same thing, only better. The Java ecosystem has some of the best tools for "runtime behavior comprehension" (which is a term I just made up, by which I mean much more than just simple profiling, I mean tools for deeply understanding what exactly your program does at runtime, what its performance characteristics are, where the bottlenecks are, where the memory is going, and most importantly why all of that is happening) and visualization available on the market, and pretty much all of those work with JRuby, at least to some extent.
deployment: assuming that your target system already has a JVM installed, deploying a JRuby app (and I'm not just talking about Rails, I also mean desktop, mobile, other kinds of servers) is literally just copying one JAR (or WAR) and a double-click.
performance: JRuby has much higher startup overhead. In return you get much higher throughput. In practice, this means that deploying a Rails app to JRuby is a good idea, as is running your integration tests, but for developer unit tests and scripts, MRI, YARV or Rubinius are better choices. Note that many Rails developers simply develop and unit test on MRI and integration test and deploy on JRuby. There's no need to choose a single execution engine for everything.
concurrency: JRuby runs Ruby threads concurrently. This means two things: if your locking is correct, your program will run faster, and if your locking is incorrect, your program will break. (Unfortunately, neither MRI nor YARV nor Rubinius run threads concurrently, so there's still some broken multithreaded Ruby code out there that doesn't know it's broken, because obviously concurrency bugs can only show up if there's actual concurrency.)
platforms (this is somewhat related to portability): there are some amazing Java platforms out there, e.g. the Azul JCA with 768 GiBytes of RAM and 864 CPU cores specifically designed for memory-safe, pointer-safe, garbage-collected, object-oriented languages. Android. Google App Engine. All of those run JRuby.
I would modify what Peter said slightly. JRuby may use more memory compared to standard Ruby, but that's usually because you're doing in a single process work that would take several processes with Ruby.
You should try the Rails.threadsafe! option with a single JRuby runtime (for example, the Trinidad gem with the --threadsafe option). We've heard several stories where it gives you great performance and low memory usage, while leveraging multiple CPU cores with a single process.
JRuby is one of the few implementations that uses native threads. So if you care to do some multithreading, go for it.
As far as hosting is concerned, you have to put your app in some sort of Java container, which I personally find far less straightforward than using something like Passenger (for Rack apps).
I use JRuby for an app as we communicate over JMS and it works fine, but if I wasn't using any Java I would certainly stick to CRuby. My biggest beef is that in testing, running tests takes forever with JRuby as you have to spin up a VM each time you run them. This makes it a lot harder to TDD as it's a significant hit on your testing time.
JRuby has advantages if you're on Windows: it supports 64-bit, and you can use a lot of proprietary databases through standard JDBC drivers.
The latest releases are significantly faster than Ruby but also use significantly more memory. If performance is your only reason for using JRuby, I wouldn't bother unless you have a specific performance need that it solves, simply because, while it is pretty popular, it is less standard for hosting and fewer people use it compared to standard Ruby. That being said, there are many other reasons to use JRuby, such as a need for interoperability with existing Java code, or the need to deploy in environments where Java has been "blessed" by the operations department and Ruby has not.
Does anybody know of a way to lock down individual threads within a Java process to specific CPU cores (on Linux)? I've done this in C, but can't find how to do this in Java. My instincts are that this will require a JNI call, but I was hoping someone here might have some insight or might have done it before.
Thanks!
You can't do this in pure Java. But if you really need it, you can use JNI to call native code that does the job. This is the place to start:
http://ovatman.blogspot.com/2010/02/using-java-jni-to-set-thread-affinity.html
http://blog.toadhead.net/index.php/2011/01/22/cputhread-affinity-in-java/
UPD: After some thinking, I've decided to create my own class for this: ThreadAffinity.java. It's JNA-based and very simple, so if you want to use it in production you should maybe spend some time making it more stable, but for benchmarking and testing it works well as is.
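For a rough idea of what a JNA-based helper like that boils down to, here is a minimal sketch of my own (not the linked ThreadAffinity.java; it assumes 64-bit Linux, glibc's sched_setaffinity, and JNA 5.x on the classpath, and it omits error handling beyond the return code):

    import com.sun.jna.Library;
    import com.sun.jna.Native;

    public class PinCurrentThread {

        // Minimal glibc binding: on 64-bit Linux, cpu_set_t is just a bit mask of longs.
        public interface CLib extends Library {
            CLib INSTANCE = Native.load("c", CLib.class);
            int sched_setaffinity(int pid, long cpusetsize, long[] mask);
        }

        /** Pin the calling thread (pid 0 means "current thread" on Linux) to a single core. */
        static void pinTo(int core) {
            long[] mask = { 1L << core };
            int rc = CLib.INSTANCE.sched_setaffinity(0, 8L * mask.length, mask);
            if (rc != 0) {
                throw new IllegalStateException("sched_setaffinity returned " + rc);
            }
        }

        public static void main(String[] args) {
            pinTo(0);
            System.out.println("This thread should now run only on core 0");
        }
    }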
UPD 2: There is another library for working with thread affinity in Java. It uses the same method as noted previously but has a different interface.
I know it's been a while, but if anyone comes across this thread, here's how I solved this problem. I wrote a script that would do the following:
Run "jstack -l <pid>" against the Java process.
Take the results and find the "nid" values of the threads I want to manually lock down to cores.
Use taskset to pin those threads to cores.
You might want to take a look at https://github.com/peter-lawrey/Java-Thread-Affinity/blob/master/src/test/java/com/higherfrequencytrading/affinity/AffinityLockBindMain.java
IMO, this will not be possible unless you use native calls. The JVM is supposed to be platform-independent; any system calls made to achieve this will not result in portable code.
It's not possible (at least with plain Java).
You can use thread pools to limit the number of threads (and therefore cores) used for different types of work, but there is no way to specify a core to use.
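A minimal sketch of that approach: size the pool to the number of available cores and let the OS decide where each thread actually runs.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class PoolSizedToCores {
        public static void main(String[] args) throws InterruptedException {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);   // at most 'cores' busy threads

            for (int i = 0; i < 100; i++) {
                final int task = i;
                pool.submit(() ->
                        System.out.println("task " + task + " on " + Thread.currentThread().getName()));
            }

            pool.shutdown();                          // accept no new tasks, finish the queued ones
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }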
There is even the (small) possibility that your Java runtime doesn't support native threading for your OS or hardware. In this case, green threads are used and only one core will be used for the whole JVM.
Suppose I write a program using immutable data structures in Java. Even though it is not a functional language, the program should be able to execute in parallel. How do I ensure that my program is executed using all the cores of my processor? How does the computer decide which code can be run in parallel?
P.S. My intent in asking this question was not to find out how to parallelize Java programs, but to learn how the computer parallelizes code. Can it do so for a functional-style program written in a non-functional language?
Java programs are parallelized through threads. The computer can't magically figure out how to distribute the pieces of your application across all the cores in an imperative language like Java. Only a functional language like Erlang or Haskell could do that. Read up on Java threads.
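A minimal example of that explicit parallelism: the programmer, not the runtime, decides what runs on separate threads, and the OS scheduler decides which cores they land on.

    public class TwoThreads {
        public static void main(String[] args) throws InterruptedException {
            Runnable work = () ->
                    System.out.println("running on " + Thread.currentThread().getName());

            Thread t1 = new Thread(work, "worker-1");
            Thread t2 = new Thread(work, "worker-2");
            t1.start();                // the two workers may now run on different cores,
            t2.start();                // but it is the OS scheduler that decides where
            t1.join();
            t2.join();
        }
    }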
I am not aware of any auto-parallelizing JVMs. Auto-parallelizing compilers do exist for other languages such as FORTRAN.
You might find the JSR166y fork-join framework scheduled for JDK7 interesting.
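That framework later shipped in JDK 7 as java.util.concurrent.ForkJoinPool; here is a small divide-and-conquer sketch using it (the threshold value is arbitrary, for illustration only):

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    public class ParallelSum extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000;   // arbitrary cut-off for this sketch
        private final long[] data;
        private final int from, to;

        ParallelSum(long[] data, int from, int to) {
            this.data = data; this.from = from; this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {              // small enough: sum sequentially
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }
            int mid = (from + to) >>> 1;               // otherwise split and recurse in parallel
            ParallelSum left = new ParallelSum(data, from, mid);
            ParallelSum right = new ParallelSum(data, mid, to);
            left.fork();
            return right.compute() + left.join();
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            java.util.Arrays.fill(data, 1L);
            long sum = new ForkJoinPool().invoke(new ParallelSum(data, 0, data.length));
            System.out.println(sum);                   // 1000000
        }
    }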
I don't think you can "force" the JVM to parallelize your program, but having a separate thread execute each "task", if you can break your program down that way, would probably do the trick in most cases. Parallelism is still not guaranteed, however.
You can write functions that automatically parallelise tasks; it is fairly easy to do for specific cases. However, I am not aware of any built-in Java API that does this (except perhaps Executor/ExecutorService).
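For what it's worth, here is a sketch of the ExecutorService route for a batch of independent tasks (invokeAll blocks until every Callable has finished):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class InvokeAllDemo {
        public static void main(String[] args) throws InterruptedException, ExecutionException {
            ExecutorService pool =
                    Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < 8; i++) {
                final int n = i;
                tasks.add(() -> n * n);        // each task is independent, so they can run in parallel
            }

            int total = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) {   // waits for all tasks to complete
                total += f.get();
            }
            System.out.println("sum of squares = " + total);    // 140

            pool.shutdown();
        }
    }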
Here is something I used in school that did a lot of the work for you:
http://www.cs.rit.edu/~ark/pj.shtml
It has the capability to do SMP or message based parallelism.
Whether or not it is useful to you is a different question :-)