Parallel processing with Java on desktop and mobile CPUs

I am developing parallel-processing code to solve an optimization problem.
First I run a single optimization job; call it task(1). Then I run two copies of the same job in parallel; call it task(2). In theory, both runs should take the same amount of time.
My question is this:
When I run these tasks on my i5 desktop computer, task(1) and task(2) take the same time to finish.
There is no problem with that; it is what I expected.
But on my i7 notebook, task(2) takes around 30% longer than task(1).
I do not understand this gap. Why do two identical parallel jobs take 30% more time?
Could it be a hardware difference between mobile and desktop CPUs?
The code and data are exactly the same.
The only difference is that the i7 notebook is 64-bit and the i5 desktop is 32-bit.
Does anybody have an idea or experience related to this?
Thanks in advance.

This might explain why the 64-bit version runs slower. From the Java HotSpot VM FAQ:
Generally, the benefits of being able to address larger amounts of memory come with a small performance loss in 64-bit VMs versus running the same application on a 32-bit VM. This is due to the fact that every native pointer in the system takes up 8 bytes instead of 4. The loading of this extra data has an impact on memory usage which translates to slightly slower execution depending on how many pointers get loaded during the execution of your Java program. The good news is that with AMD64 and EM64T platforms running in 64-bit mode, the Java VM gets some additional registers which it can use to generate more efficient native instruction sequences. These extra registers increase performance to the point where there is often no performance loss at all when comparing 32 to 64-bit execution speed.
The performance difference comparing an application running on a 64-bit platform versus a 32-bit platform on SPARC is on the order of 10-20% degradation when you move to a 64-bit VM. On AMD64 and EM64T platforms this difference ranges from 0-15% depending on the amount of pointer accessing your application performs.

I believe it's due to the processor hardware; it seems that mobile CPUs are slower than desktop ones. I've noticed a similar difference between my laptop (i7-2600 QM) and my desktop (i7-2600), where running the same parallel job in Java takes 33% less time on the desktop computer.
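To compare machines on equal terms, it may help to measure the ratio directly. The following is only a minimal sketch: `cpuBoundTask` is a hypothetical stand-in for the optimization job (pure CPU work, no shared data), and the iteration count is an arbitrary assumption. It times one copy, then two copies in parallel, and prints the ratio.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelTiming {

    // Hypothetical stand-in for the optimization job: CPU-bound, no shared state.
    static double cpuBoundTask(int iterations) {
        double acc = 0.0;
        for (int i = 1; i <= iterations; i++) {
            acc += Math.sqrt(i) * Math.sin(i);
        }
        return acc;
    }

    // Runs `copies` identical tasks on `copies` threads and returns wall-clock milliseconds.
    static long timeCopies(int copies, int iterations) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(copies);
        try {
            List<Callable<Double>> jobs = new ArrayList<>();
            for (int c = 0; c < copies; c++) {
                jobs.add(() -> cpuBoundTask(iterations));
            }
            long start = System.nanoTime();
            for (Future<Double> result : pool.invokeAll(jobs)) {
                result.get(); // rethrows if a task failed
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int iterations = 50_000_000; // arbitrary; adjust so one run takes a few seconds
        timeCopies(1, iterations);   // warm-up so the JIT has compiled the loop
        long one = timeCopies(1, iterations);
        long two = timeCopies(2, iterations);
        System.out.printf("1 copy: %d ms, 2 copies: %d ms, ratio: %.2f%n",
                one, two, (double) two / one);
    }
}
```

A ratio close to 1.0 corresponds to the desktop behaviour described in the question; a ratio around 1.3 would reproduce the notebook result. Running the same class on both machines, and with both a 32-bit and a 64-bit JVM on the notebook, would help separate the hardware effect from the JVM effect.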

Related

JRE 32bit vs 64bit

I've been using Java for a while now, and my typical ritual of setting up a new dev machine requires the norm of downloading and installing the latest JDK from Oracle's site.
This prompted an unusual question today, does it matter if I use the 32bit or 64bit JRE bundle?
Thinking back on it, I've installed both versions before and my normal toolchain (Eclipse) plugs in happily. In my day-to-day programming, I do not recall ever having to change something or think about something in a different way just because I was using the 64bit JRE (or targeting the 64bit JRE, for that matter).
From my understanding of 64bit vs. 32bit - it really boils down to how the numbers are stored underneath the covers... and I do know that int is 32 bits and long is 64 bits... same with float being 32 bits and double being 64 bits -- so is it just that Java has abstracted even this subtlety away, and perhaps has been "64 bit compatible" all along?
I'm sure I'm missing something here besides not being able to install a 64 bit JRE onto a 32 bit system.

64-bit vs. 32-bit really boils down to the size of object references, not the size of numbers.
In 32-bit mode, references are four bytes, allowing the JVM to uniquely address 2^32 bytes of memory. This is the reason 32-bit JVMs are limited to a maximum heap size of 4GB (in reality, the limit is smaller due to other JVM and OS overhead, and differs depending on the OS).
In 64-bit mode, references are (surprise) eight bytes, allowing the JVM to uniquely address 2^64 bytes of memory, which should be enough for anybody. JVM heap sizes (specified with -Xmx) in 64-bit mode can be huge.
But 64-bit mode comes with a cost: references are double the size, increasing memory consumption. This is why Oracle introduced "Compressed oops". With compressed oops enabled (which I believe is now the default), object references are shrunk to four bytes, with the caveat that the heap is limited to four billion objects (and 32GB Xmx). Compressed oops are not free: there is a small computational cost to achieve this big reduction in memory consumption.
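If you want to check what your own JVM is doing, here is a small sketch. It assumes a HotSpot/OpenJDK-based JVM: the `sun.arch.data.model` property and the `com.sun.management.HotSpotDiagnosticMXBean` interface are HotSpot-specific, and the `UseCompressedOops` option does not exist on every JVM or configuration, so the code falls back to a placeholder string rather than failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class JvmBitness {

    // HotSpot-specific property, normally "32" or "64"; may be absent on other JVMs.
    static String dataModel() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    // Returns "true"/"false" if the option exists, otherwise "not available".
    static String compressedOops() {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (hotspot == null) {
                return "not available";
            }
            return hotspot.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            // e.g. IllegalArgumentException when this JVM has no such option
            return "not available";
        }
    }

    public static void main(String[] args) {
        System.out.println("os.arch             = " + System.getProperty("os.arch"));
        System.out.println("sun.arch.data.model = " + dataModel());
        System.out.println("UseCompressedOops   = " + compressedOops());
    }
}
```

On a 32-bit JVM the compressed oops line should report "not available", since there is nothing to compress when references are already four bytes.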
As a personal preference, I always run the 64-bit JVM at home. The CPU is x64 capable, the OS is too, so I like the JVM to run in 64-bit mode as well.

As you note, primitive numeric types in Java are well-defined.
However, the choice between 32-bit and 64-bit JVMs can matter if your Java application is using native-code libraries, which may be built for use in a 32-bit application, a 64-bit application, or both.
If you have native libraries that support only 32-bit applications, you either need to use a 32-bit JVM, or build 64-bit versions of the libraries.

It depends on context. For local development I always use a 64-bit JDK, primarily because I will likely need the whole memory space for builds and the IDE.
That said, for integration through to production I would recommend 32-bit if it is possible. Why?
For some Java EE servers, licensing for production use depends on factors such as the machine and the number of cores. With WebSphere Liberty Profile specifically, you are also limited to 2GB.
64-bit JREs take up slightly more memory, so if you are trying to stay within something like 2GB (or, better yet, a 2x 1GB cluster), a 32-bit JRE leaves you more room to work with without paying a cent.
From https://plumbr.eu/blog/java/should-i-use-32-or-64-bit-jvm
Problem 1: 30-50% of more heap is required on 64-bit. Why so? Mainly
because of the memory layout in 64-bit architecture. First of all –
object headers are 12 bytes on 64-bit JVM. Secondly, object references
can be either 4 bytes or 8 bytes, depending on JVM flags and the size
of the heap. This definitely adds some overhead compared to the 8
bytes on headers on 32-bit and 4 bytes on references. You can also dig
into one of our earlier posts for more information about calculating
the memory consumption of an object.
Problem 2: Longer garbage collection pauses. Building up more heap
means there is more work to be done by GC while cleaning it up from
unused objects. What it means in real life is that you have to be
extra cautious when building heaps larger than 12-16GB. Without fine
tuning and measuring you can easily introduce full GC pauses spanning
several minutes. In applications where latency is not crucial and you
can optimize for throughput only this might be OK, but on most cases
this might become a showstopper.
To limit your impact for your Java EE environment, offload parts of it to other microservices such as ElasticSearch for search, Hazelcast for caching, your database for data storage and keep your Java EE server to host your application core itself rather than running the services inside it.

I think there are two main differences to consider. One has been mentioned here but not the other.
On the one hand, as others mentioned, memory and data types. 32-bit and 64-bit JVMs use different native data type sizes and memory address spaces.
64-bit JVMs can allocate (and use) more memory than 32-bit ones.
64-bit JVMs use native data types with more capacity, but these occupy more space. Because of that, the same object may occupy more space too.
For JVMs whose garbage collector (GC) pauses the whole application, the 64-bit versions may be slower because the GC must check bigger heaps/objects, which takes more time.
There is an IBM presentation explaining these differences.
And on the other hand, the supported native libraries. Java programs that use JNI to access native libraries require different versions depending on the type of JVM.
32-bit JVMs use 32-bit native libraries and 64-bit JVMs use 64-bit libraries.
That means that if your program uses libraries that rely on native code, such as SWT, you will need different versions of them. Note that the SWT download page offers different versions for 32- and 64-bit Linux/Windows, and that there are different versions of Eclipse (each one with a different version of SWT) for 32- and 64-bit.
Some applications, such as Alloy, are packaged with 32-bit native libraries. They fail with 64-bit JVMs. You can solve these problems by downloading the corresponding 64-bit native libraries and configuring JNI appropriately.
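As an illustration of "configuring JNI appropriately", one common pattern is to choose the library name from the JVM's data model before loading it. This is only a sketch: the `mylib32`/`mylib64` naming scheme and the base name `mylib` are hypothetical, and `sun.arch.data.model` is a HotSpot-specific property. Loading a library built for the wrong bitness makes `System.loadLibrary` throw `UnsatisfiedLinkError`.

```java
class NativeLibChooser {

    // Hypothetical naming scheme: "<base>32" for 32-bit JVMs, "<base>64" for 64-bit JVMs.
    static String libraryNameFor(String baseName, String dataModel) {
        switch (dataModel) {
            case "32":
                return baseName + "32";
            case "64":
                return baseName + "64";
            default:
                throw new IllegalStateException("Unknown JVM data model: " + dataModel);
        }
    }

    public static void main(String[] args) {
        String model = System.getProperty("sun.arch.data.model", "unknown");
        String name = libraryNameFor("mylib", model);
        // mapLibraryName shows the platform file name (e.g. a .dll or lib*.so form).
        System.out.println("Would load: " + System.mapLibraryName(name));
        // System.loadLibrary(name); // left commented out because "mylib" does not exist
    }
}
```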

Do I need to understand the difference between 32-bit JVM and 64-bit JVM?
If you aren’t building a performance-critical application, you don’t have to understand the difference. The subtle difference between a 32-bit JVM and a 64-bit JVM wouldn’t make much difference to your application. You can skip reading further.
Does 64-bit JVM perform better than 32-bit JVM?
Most of us think 64-bit is bigger than 32-bit, thus 64-bit JVM performance will be better than 32-bit JVM performance. Unfortunately, that’s not the case. A 64-bit JVM can show a small performance degradation compared with a 32-bit JVM. Below is the excerpt from the Oracle JDK documentation regarding 64-bit JVM performance:
“Generally, the benefits of being able to address larger amounts of memory come with a small performance loss in 64-bit VMs versus running the same application on a 32-bit VM.
The performance difference comparing an application running on a 64-bit platform versus a 32-bit platform on SPARC is on the order of 10-20% degradation when you move to a 64-bit VM. On AMD64 and EM64T platforms this difference ranges from 0-15% depending on the amount of pointer accessing your application performs.”
Does 32-bit JVM or 64-bit JVM matter anymore?
What are the things to consider when migrating from 32-bit JVM to 64-bit JVM?
a. GC Pause times
The primary reason to migrate from a 32-bit JVM to a 64-bit JVM is to attain a large heap size (i.e. -Xmx). When you increase the heap size, your GC pause times will automatically start to go up, because there is now more garbage in memory to clear. You need to do proper GC tuning before the migration; otherwise your application can experience pauses of several seconds to a few minutes.
b. Native Library
If your application uses the Java Native Interface (JNI) to access native libraries, then you need to upgrade the native libraries as well, because a 32-bit JVM can use only 32-bit native libraries. Similarly, a 64-bit JVM can use only 64-bit native libraries.

Practical limitations of JVM memory and CPU usage?

Let's say money was not a limiting factor, and I wanted to write a Java program that ran on a single powerful machine.
The goal would be to make the Java program run as fast as possible without having to swap or go to disk for anything.
Let's say that this computer has:
1 TB of RAM (64 16GB DIMMs)
64 processor cores (8 8-core processors)
running 64-bit Ubuntu
Could a single instance of a java program running in a JVM take advantage of this much RAM and processors?
Are there any practical considerations that might limit the usage and efficiency?
OS process (memory & threads) limitations?
JVM memory/heap limitations?
JVM thread limitations?
Thanks,
Galen

A single instance can try to access all the memory; however, NUMA regions mean that things such as GC perform badly when accessing memory in another region. This is getting faster and the JVM has some NUMA support, but it needs to improve if you want scalability. Even so you can get 256 MB of heap and use 700 of native/direct memory without this issue. ;)
The biggest limitation if you have loads of memory is that arrays, collections and ByteBuffer (for memory mapped files) are all limited to a size of 2 billion. (2^31-1)
You can work around these problems with custom collections, but it's really something Java should support IMHO.
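A minimal sketch of such a custom collection: a `long`-indexed array built from several ordinary arrays. `BigLongArray` and its `chunkBits` parameter are made-up names for illustration, not an existing library class, and the sketch skips things a real implementation would want (bulk operations, overflow checks for extreme lengths).

```java
class BigLongArray {
    private final int chunkBits;   // each chunk holds 2^chunkBits elements
    private final int chunkMask;
    private final long length;
    private final long[][] chunks;

    BigLongArray(long length, int chunkBits) {
        if (length < 0) throw new IllegalArgumentException("negative length");
        if (chunkBits < 1 || chunkBits > 30) throw new IllegalArgumentException("chunkBits must be 1..30");
        this.length = length;
        this.chunkBits = chunkBits;
        long chunkSize = 1L << chunkBits;
        this.chunkMask = (int) (chunkSize - 1);
        int chunkCount = (int) ((length + chunkSize - 1) >>> chunkBits);
        chunks = new long[chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunkCount; i++) {
            // The last chunk may be shorter than the others.
            chunks[i] = new long[(int) Math.min(chunkSize, remaining)];
            remaining -= chunks[i].length;
        }
    }

    long length() {
        return length;
    }

    long get(long index) {
        checkIndex(index);
        return chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)];
    }

    void set(long index, long value) {
        checkIndex(index);
        chunks[(int) (index >>> chunkBits)][(int) (index & chunkMask)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    public static void main(String[] args) {
        // Small demo size; with enough heap the same code can exceed 2^31-1 elements.
        BigLongArray array = new BigLongArray(3_000_000L, 20);
        array.set(2_999_999L, 42L);
        System.out.println(array.get(2_999_999L));
    }
}
```

The high bits of the index select the chunk and the low bits select the slot inside it, so lookups stay O(1) at the cost of one extra array dereference.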
BTW: You can buy a Dell R910 with 1 TB of memory and 24 cores/48 threads with Ubuntu for £40K.
BTW: I only have experience of JVMs up to 40 GB in size.

First of all, the Java program itself: poorly designed code wouldn't use that much computing power. Badly implemented threading, for example, could slow your performance down.
The OS is a limiting factor, too: not every OS handles that amount of memory well.
I think the JVM can deal with that amount of memory, as long as the OS supports it.
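A quick way to see how much of the machine a given JVM instance actually sees is to ask `Runtime`. This is just a sketch; the values depend on the OS, on any container or affinity limits, and on flags such as -Xmx.

```java
class JvmResources {

    // Number of processors the JVM reports as available to it.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum heap the JVM will attempt to use, in bytes (governed by -Xmx or the default).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("available processors: " + processors());
        System.out.printf("max heap: %d MB%n", maxHeapBytes() / (1024 * 1024));
    }
}
```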

Java using too much memory on Linux?

I was testing the amount of memory Java uses on Linux. When just starting up an application that does absolutely NOTHING it already reports that 11 MB is in use. When doing the same on a Windows machine about 6 MB is in use. These were measured with the top command and the Windows task manager. The VM on Linux I use is the 1.6_0_11 one, and the HotSpot VM is Server 11.2. Starting the application using -client did not influence anything.
Why does java take this much memory? How can I reduce this?
EDIT: I measure memory using the windows task manager and in Linux I open the terminal and type top.
Also, I am only interested in how to reduce this or if I even CAN reduce this. I'll decide for myself whether a couple of megs is a lot or not. It's just that the difference of 5 MB between windows and Linux is strange, and I want to know if I am able to do this on Linux too.

If you think 11MB is "too much" memory... you'd better avoid using Java entirely. Seriously, the JVM needs to do quite a lot of stuff (bytecode verifier, GC, loading all the essential classes), and in an age where average desktop machines have 4GB of RAM, keeping the base JVM overhead (and memory use in general) very low is simply not a design priority.
If you need your app to run on an embedded system (pretty much the only case where 11 MB might legitimately be considered "too much"), then there are special JVMs designed for such systems that use less RAM - but at the cost of lacking many of the features and/or performance of mainstream JVMs.

You can control the heap size; otherwise default values will be used. Running java -X gives you an explanation of the meaning of these switches.
e.g. (Unix shell syntax)
JAVA_OPTS="-Xms6m -Xmx6m"
java $JAVA_OPTS MyClass

The question you might really be asking is, "Does windows task manager and Linux top report memory in the same way?" I'm sure there are others that can answer this question better than I, but I suspect that you may not be doing an apples to apples comparison.
Try using the jconsole application on each respective machine to do a more granular inspection. You'll find jconsole on your sdk under the bin directory.
There is also a very extensive discussion of java memory management at http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/
The short answer is that how memory is allocated is more complex than a single figure at the top of a simplified, user-facing system utility.
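If you would rather get the JVM's own view from inside the program than from top or the task manager, the standard `java.lang.management` API exposes the same kind of figures jconsole shows. A minimal sketch (the class name `MemoryReport` is just for illustration):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class MemoryReport {

    // "used" is memory holding objects/data; "committed" is what the JVM has obtained from the OS.
    static String describe(String label, MemoryUsage usage) {
        return String.format("%s: used=%d KB, committed=%d KB",
                label, usage.getUsed() / 1024, usage.getCommitted() / 1024);
    }

    static String heapReport() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return describe("heap", memory.getHeapMemoryUsage());
    }

    static String nonHeapReport() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return describe("non-heap", memory.getNonHeapMemoryUsage());
    }

    public static void main(String[] args) {
        System.out.println(heapReport());
        System.out.println(nonHeapReport());
    }
}
```

Comparing these numbers on Linux and Windows is closer to an apples-to-apples comparison than comparing the figures from two different OS tools, although they still exclude memory the JVM uses outside these pools.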

Both Top and TaskManager will report how much memory has been allocated to a process, not how much the process is actually using, so I would say it's not an apples to apples comparison. Regardless, in the age of Gigs of memory what's a couple megs here or there on startup?

Linux and Windows are radically different operating systems and use RAM very differently. Windows kind of allocates as you go, and Linux caches more at once, and prepares for the future, so that the next operations are smooth.
This explanation is not quite right, but it's close enough for you.

jvm design decision

Why does the JVM require around 10 MB of memory for a simple hello world when the CLR doesn't? What is the trade-off here, i.e. what does the JVM gain by doing this?
Let me clarify a bit because I'm not conveying the question that is in my head. There is clearly an architectural difference between the JVM and CLR runtimes. The JVM has a significantly higher memory footprint than the CLR. I'm assuming there is some benefit to this overhead; otherwise why would it exist? I'm asking what the trade-offs are in these two designs. What benefit does the JVM gain from its memory overhead?

I guess one reason is that Java has to do everything itself (another aspect of platform independence). For instance, Swing draws its own components from scratch; it doesn't rely on the OS to draw them. That's all got to take place in memory. Lots of stuff that Windows may do, but Linux does not (or does differently), has to be fully contained in Java so that it works the same on both.
Java also always insists that its entire library is "linked" and available. Since it doesn't use DLLs (they wouldn't be available on every platform), everything has to be loaded and tracked by Java.
Java even does a lot of its own floating point, since FPUs often give different results, which has been deemed unacceptable.
So if you think about all the stuff C# can delegate to the OS it's tied to vs all the stuff Java has to do for the OS to compensate for others, the difference should be expected.
I've run java apps on 2 embedded platforms now. One was a spectrum analyzer where it actually drew the traces, the other is set-top cable boxes.
In both cases, this minimum memory footprint hasn't been an issue--there HAVE been Java specific issues, that just hasn't been one. The number of objects instantiated and Swing painting speed were bigger issues in these cases.

I don't know if initial memory footprint or a footprint of a Hello World application is important. A difference might be due to the number and sizes of the libraries that are loaded by the JVM / CLR. There can also be an amount of memory that is preallocated for garbage collection pools.
Every application that I know of uses a lot more than Hello World functionality. It will load and free memory thousands of times throughout its execution. If you are interested in memory utilization differences between the JVM and the CLR, here are a couple of links with good information:
http://benpryor.com/blog/2006/05/04/jvm-vs-clr-memory-allocation/
Memory Management Case study (JVM & CLR)
Memory Management Case study is in Power Point. A very interesting presentation.

Seems like java is just using more virtual memory.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
amwise 20598 0.0 0.5 22052 5624 pts/3 Sl+ 14:59 0:00 mono Test.exe
amwise 20601 0.0 0.7 214312 7284 pts/2 Sl+ 15:00 0:00 java Program
I made a test program in C# and in Java that prints the string "test" and waits for input. I believe that the resident set size (RSS) value more accurately shows the memory usage. The virtual memory usage (VSZ) is less meaningful.
As I understand it applications can reserve a ton of virtual memory without actually using any real memory. For example you can ask the VirtualAlloc function on Windows to either reserve or commit virtual memory.
EDIT:
Here is a pretty picture from my windows box:
alt text http://awise.us/images/mem.png
Each app was a simple printf followed by a getchar.
Lots of virtual memory usage by Java and the CLR. The C version depends on just about nothing, so its memory usage is tiny by comparison.
I doubt it really matters either way. Just pick whichever platform you are more familiar with and then don't write terrible, memory-wasting code. I'm sure it will work out.
EDIT:
This VMMap tool from Microsoft might be useful in figuring out where memory is going.

The JVM counts all its shared libraries whether they use memory or not.
Task manager is rather unreliable when it comes to reporting the memory consumption of programs. You should take it as a guide.

The JVM loads lots of unnecessary core classes from rt.jar on each run. Unfortunately, the cross-dependencies between Java packages (java.lang <-> java.io) make it hard to do a partial runtime init. Not to mention that rt.jar itself is over 40MB and takes a lot of time to look up and decompress.
Java 6u10 and later seem to load things a bit smarter (there is a jqs.exe, the Java Quick Starter service, which keeps necessary data in memory for a faster startup), and Java 7 is said to be better still.
The Process Explorer in Windows reports the Private Bytes correctly (Private bytes are those memory regions, which are not shared by any dll).
A slightly bigger annoyance is that after 10 years, JVM still defaults to 64MB memory usage. It is really annoying to use -Xmx almost every time and cannot run demanding programs in jars with a simple double click (unless I alter the file extension assignment's command).
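To get a feel for how many classes are loaded before your own code does anything, you can ask the standard `ClassLoadingMXBean`. This is a minimal sketch; the count varies a lot between Java versions and launch modes, so no particular number should be expected.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

class LoadedClasses {

    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        System.out.println("currently loaded classes: " + loadedClassCount());
        System.out.println("total loaded since start: " + classLoading.getTotalLoadedClassCount());
    }
}
```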

The CLR is counted as part of the OS, so the task manager doesn't report its memory consumption under the application process.

Benefits/drawbacks to running 64-bit JVM on 64-bit Linux server?

We run the 32-bit Sun Java 5 JVM on 64-bit Linux 2.6 servers, but apparently this limits our maximum memory per process to 2GB. So it's been proposed that we upgrade to the 64-bit JVM's to remove the restriction. We currently run multiple JVM's (Tomcat instances) on a server in order to stay under the 2GB limit, but we'd like to consolidate them in the interest of simplifying deployment.
If you've done this, can you share your experiences, please? Are you running 64-bit JVM's in production? Would you recommend staying at Java 5, or would it be ok to move to both Java 6 and 64 bits simultaneously? Should we expect performance issues, either better or worse? Are there any particular areas we should focus our regression testing?
Thanks for any tips!

At the Kepler Science Operations Center we have about 50 machines with 32-64G each. The JVM heaps are typically 7-20G. We are using Java 6. The OS has a Linux 2.6 kernel.
When we migrated to 64-bit I expected there would be some issues with running the 64-bit JVM, but really there have not been. Out-of-memory conditions are more difficult to debug since the heap dumps are so much larger. The Java Service Wrapper needed some modifications to support larger heap sizes.
There are some sites on the web claiming GC does not scale well past 2G, but I've not seen any problems. Finally, we are doing throughput-intensive rather than interactive computing. I've never looked at latency differences; my guess is that worst-case GC latency will be longer with the larger heap sizes.

We use a 64-bit JVM with heaps of around 40 Gb. In our application, a lot of data is cached, resulting in a large "old" generation. The default garbage collection settings did not work well and needed some painful tuning in production. Lesson: make sure that you have adequate load-testing infrastructure before you scale up like this. That said, once we got the kinks worked out, GC performance has been great.
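For load tests like this, it helps to record GC activity from inside the application as well as from GC logs. Below is a small sketch using the standard `GarbageCollectorMXBean` API; the allocation loop in `main` is only there to provoke some collections and is not meant to resemble a real workload.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {

    // Sums the accumulated collection time (ms) over all collectors; -1 means "undefined" and is skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Churn through short-lived allocations so that a collection is likely to happen.
        long allocated = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[10_000];
            allocated += junk.length;
        }
        System.out.println("allocated bytes: " + allocated);

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total GC time: " + totalGcMillis() + " ms");
    }
}
```

Sampling these counters before and after a test run gives a rough figure for how much time went to GC, which can then be compared across heap sizes and collector settings.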

I can confirm Sean's experience. We are running pure-Java, computationally intensive web services (home-cooked Jetty integration, with nowadays more than 1k servlet threads, and >6Gb of loaded data in memory), and all our applications scaled very well to a 64 bit JVM when we migrated 2 years ago. I would advise using the latest Sun JVM, as substantial improvements in GC overhead have been made in the last few releases. I did not have any issue with Tanukisoftware's Wrapper either.

Any JNI code you have written that assumes it's running in 32 bits will need to be retested. For problems you may run into when porting C code from 32 to 64 bits, see this link. It's not JNI-specific but still applies. http://www.ibm.com/developerworks/library/l-port64.html

After migrating from JDK5 32-bit to JDK6 64-bit (Windows server), we got a leak in the "perm gen space" memory block. After playing with JDK parameters it was resolved. Hope you will be luckier than we were.

If you use numactl --show you can see the size of the memory banks in your server.
I have found the GC doesn't scale well when it uses more than one memory bank. This is more a hardware than a software issue IMHO, but it can affect your GC times all the same.
