From this IBM article:
A 32 bit Java process has a 4 GB process address space available shared by the Java Heap, Native Heap and the Operating System.
...
64 bit processes do not have this limit and the address ability is in terabytes. It is common for many enterprise applications to have large Java heaps (we have seen applications with Java heap requirements of over 100 GB). 64 bit Java allows massive Java heaps (benchmarks released with heaps up to 200 GB).
What's the explanation behind 64-bit processors having a very large address space while 32-bit processors do not? Basically, what's happening inside a 64-bit machine that isn't inside a 32-bit machine?
What's the explanation behind 64-bit processors having a very large address space while 32-bit processors do not? Basically, what's happening inside a 64-bit machine that isn't inside a 32-bit machine?
Quite simply, there's double the space to store the address, so the number of distinct values you can store in this space is squared.
It may be easier to see this with smaller values; for instance, if I had a 4-bit address, I could store up to 1111, giving me 16 distinct addresses (0 to 15). With an 8-bit address, I could store up to 11111111, giving me 256 (16^2) distinct addresses.
Note that this value just denotes the maximum amount of memory you can use, it doesn't actually give you this memory - but if you have more memory than you can address, you have no way of accessing it.
A 32-bit process usually has a 32-bit address space, which limits how much memory can be addressed. (See, for instance, "Why can't I get a larger heap with the 32-bit JVM?") A 64-bit process has a 64-bit address space, which essentially squares the number of addresses available.
With a 32-bit word, you can make about 4 billion different values.
That's 4 billion bytes' worth of memory addresses.
With 64 bits, you can represent about (4,000,000,000)^2 values, which ends up being roughly 18,000,000,000,000,000,000 (2^64 is exactly 18,446,744,073,709,551,616).
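A small sketch of the arithmetic above, using BigInteger so the 64-bit count doesn't overflow:

```java
// Sketch: number of distinct addresses for a given address width.
// With n address bits you can name 2^n locations; doubling n squares that count.
import java.math.BigInteger;

public class AddressSpace {
    static BigInteger addresses(int bits) {
        return BigInteger.ONE.shiftLeft(bits); // 2^bits
    }

    public static void main(String[] args) {
        System.out.println("4-bit:  " + addresses(4));   // 16
        System.out.println("8-bit:  " + addresses(8));   // 256
        System.out.println("32-bit: " + addresses(32));  // 4294967296 (~4 GB of byte addresses)
        System.out.println("64-bit: " + addresses(64));  // 18446744073709551616
    }
}
```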
Related
In every example I see of the -Xmx flag in java documentation (and in examples on the web), it is always a power of 2. For example '-Xmx512m', '-Xmx1024m', etc.
Is this a requirement, or is it a good idea for some reason?
I understand that memory in general comes in powers of 2; this question is specifically about the java memory flags.
It keeps things simple, but it is more for your benefit than anything else.
There is no particular reason to pick a power of 2, or a multiple of 50 MB (also common) e.g. -Xmx400m or -Xmx750m
Note: the JVM doesn't follow this strictly. It will use this to calculate the sizes of different regions, which, if you add them up, tend to be slightly less than the number you provide. In my experience the heap is 1% - 2% less, but if you consider all the other memory regions the JVM uses, this doesn't make much difference.
Note: memory sizes for hardware are typically a power of two (on some PCs it was 3x a power of two). This might have got people into the habit of thinking of memory sizes as powers of two.
BTW: AFAIK, in most JVMs the actual size of each region is a multiple of the page size i.e. 4 KB.
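You can see the "slightly less than -Xmx" effect for yourself; a minimal sketch (the exact shortfall depends on the JVM and GC in use):

```java
// Sketch: compare the -Xmx value you pass with what the JVM actually reports.
// Run with e.g.:  java -Xmx512m MaxHeap
public class MaxHeap {
    public static void main(String[] args) {
        long max = Runtime.getRuntime().maxMemory(); // upper bound the JVM will attempt to use for the heap
        System.out.printf("maxMemory() = %d bytes (%.1f MB)%n", max, max / 1024.0 / 1024.0);
        // Typically prints slightly less than the -Xmx value, because the JVM
        // derives region sizes from it (e.g. one survivor space is excluded).
    }
}
```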
I have read what-is-the-memory-consumption-of-an-object-in-java and what-is-the-memory-overhead-of-an-object-in-java.
But I am still confused.
What is memory overhead? Is it the padding?
What is a JVM with compressed pointers? Is it the reference?
If a 32-bit JVM is used, will the overhead be less? Of course yes. But is it because of padding?
So is it always better to use a 32-bit JVM, for memory efficiency or for performance?
The image below is from this link (page 26).
In this image they show a 16-byte JVM overhead right at the start. Why is that?
What is memory overhead?
When more memory is used than the fields you created.
Is it the padding?
Some is padding which can appear anywhere in the object, except the header which is always at the start. The header is typically 8-12 bytes long.
What is a JVM with compressed pointers?
A technique for using 32-bit pointers in a 64-bit JVM to save memory.
Is it the reference?
References can use this technique but so can pointers to the class information for an object.
If a 32-bit JVM is used, will the overhead be less?
Possibly, though this is the same as using compressed pointers for references and classes.
But is it because of padding?
It's because 64-bit pointers use more space than 32-bit pointers.
So is it always better to use a 32-bit JVM, for memory efficiency or for performance?
No. The 32-bit processor model has 32-bit registers, whereas the 64-bit model has twice as many registers, each double the size (64-bit), which means far more can be held in the fastest memory: the registers. 64-bit calculations also tend to be faster with a 64-bit processing model.
In general I would recommend you always use the 64-bit JVM unless you a) can't or b) have a very small amount of memory.
In this image they show a 16-byte JVM overhead right at the start. Why is that?
This is not strictly correct. It assumes you have a compressed class reference, so the header is 12 bytes; however, objects are 8-byte aligned by default, which means there will be 4 bytes of padding at the end (this totals 16 bytes, but it's not all at the start).
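The rounding described above is just "round up to the next multiple of 8". A minimal sketch, assuming the 12-byte header and 8-byte alignment defaults mentioned:

```java
// Sketch: 8-byte alignment rounding. An object with a 12-byte header and no
// fields is padded up to 16 bytes; the 12-byte header is an assumed default.
public class Align {
    static long align8(long size) {
        return (size + 7) & ~7L; // round up to the next multiple of 8
    }

    public static void main(String[] args) {
        System.out.println(align8(12)); // 16 -> 12-byte header + 4 bytes of padding
        System.out.println(align8(16)); // 16 -> already aligned
        System.out.println(align8(17)); // 24
    }
}
```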
FAQ: Why can 32-bit Compressed OOP address more than 4 GB
Objects have to be 8-byte aligned by default. This makes memory management easier but sometimes wastes some space on padding. A side effect is that the address of every object will have 000 for the lowest three bits (it has to be a multiple of 8), so those bits don't need to be stored. This allows a compressed oop to address 8 x 4 GB, or 32 GB.
With a 16-byte object alignment the JVM can address 64 GB with a 32-bit reference (however, the padding overhead is higher and might not be worth it).
IFAQ: Why is it slower at around 28 - 32 GB
While the reference can be multiplied by 8, the heap doesn't start at the beginning of memory. It typically starts around 4 GB in. This means that if you want the full 32 GB, you have to add this offset, which has a slight overhead.
Heap sizes:
< 4 GB - zero-extend the address
4 - 28 GB - multiply by 8, i.e. << 3 (note: x64 has an addressing mode for this, to support double[] and long[])
28 - 32 GB - multiply by 8 and add a register holding the offset. Slightly slower, but not usually a problem.
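The three decoding modes above can be sketched in plain Java. The method names and the 4 GB heap base are illustrative assumptions, not actual HotSpot code:

```java
// Sketch of how a 32-bit compressed oop is turned into a 64-bit address,
// one method per heap-size mode described above.
public class OopDecode {
    static final long HEAP_BASE = 4L << 30; // assumed heap start ~4 GB into the address space

    // < 4 GB heap: the 32-bit reference is the address, zero-extended
    static long decodeSmall(int ref) { return ref & 0xFFFFFFFFL; }

    // 4 - 28 GB heap: shift left by 3 (multiply by 8)
    static long decodeShifted(int ref) { return (ref & 0xFFFFFFFFL) << 3; }

    // 28 - 32 GB heap: shift, then add a base held in a register
    static long decodeShiftedWithBase(int ref) { return HEAP_BASE + ((ref & 0xFFFFFFFFL) << 3); }

    public static void main(String[] args) {
        int ref = 0x10;
        System.out.println(decodeShifted(ref)); // 128: an 8-byte-aligned address
    }
}
```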
I know that Java uses padding; objects have to be a multiple of 8 bytes. However, I don't see the purpose of it. What is it used for? What exactly is its main purpose?
Its purpose is alignment, which allows for faster memory access at the cost of some space. If data is unaligned, then the processor needs to do some shifts to access it after loading the memory.
Additionally, garbage collection is simplified (and sped up) the larger the size of the smallest allocation unit.
It's unlikely that Java has a requirement of 8 bytes (except on 64-bit systems), but since 32-bit architectures were the norm when Java was created it's possible that 4-byte alignment is required in the Java standard.
The accepted answer is speculation (but partially correct). Here is the real answer.
First off, to #U2EF1's credit, one of the benefits of 8-byte boundaries is that 8 bytes are the optimal access size on most processors. However, there was more to the decision than that.
If you have 32-bit references, you can address up to 2^32 bytes, or 4 GB, of memory (practically you get less, more like 3.5 GB). If you have 64-bit references, you can address 2^64 bytes, which is 16 exabytes. However, with 64-bit references, everything tends to slow down and take more space. This is due to the overhead of the larger pointers, and, on all processors, more GC cycles due to less effective space and more garbage collection.
So, the creators took a middle ground and decided on 35-bit references, which allow up to 2^35 or 32 GB of memory and take up less space so to have the same performance benefits of 32-bit references. This is done by taking a 32-bit reference and shifting it left 3 bits when reading, and shifting it right 3 bits when storing references. That means all objects must be aligned on 2^3 boundaries (8 bytes). These are called compressed ordinary object pointers or compressed oops.
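The arithmetic behind that 32 GB figure can be checked in a few lines:

```java
// Sketch: why 32-bit compressed references reach 32 GB with 8-byte alignment.
// 2^32 distinct reference values, each naming an 8-byte-aligned slot,
// cover 2^35 bytes.
public class OopRange {
    public static void main(String[] args) {
        long refs = 1L << 32;      // distinct 32-bit reference values
        long reachable = refs * 8; // each names an 8-byte-aligned address
        System.out.println(reachable);              // 34359738368
        System.out.println(reachable / (1L << 30)); // 32 (GB)
    }
}
```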
Why not 36-bit references for accessing 64 GB of memory? Well, it was a tradeoff. You'd require a lot of wasted space for 16-byte alignments, and as far as I know the vast majority of processors receive no speed benefit from 16-byte alignments as opposed to 8-byte alignments.
Note that the JVM doesn't bother using compressed oops unless the maximum memory is set above 4 GB, which it is not by default. You can explicitly enable them with the -XX:+UseCompressedOops flag.
This was back in the day of 32-bit VMs to provide the extra available memory on 64-bit systems. As far as I know, there is no limitation with 64-bit VMs.
Source: Java Performance: The Definitive Guide, Chapter 8
Data type sizes in Java are multiples of 8 bits (not bytes) because word sizes in most modern processors are multiples of 8-bits: 16-bits, 32-bits, 64-bits. In this way a field in an object can be made to fit ("aligned") in a word or words and waste as little space as possible, taking advantage of the underlying processor's instructions for operating on word-sized data.
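Java exposes these widths directly on the primitive wrapper classes, which makes the "multiples of 8 bits" point easy to verify:

```java
// Sketch: Java primitive widths are all multiples of 8 bits, matching the
// word sizes of the underlying processors.
public class Widths {
    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.SIZE);      // 8
        System.out.println("short: " + Short.SIZE);     // 16
        System.out.println("int:   " + Integer.SIZE);   // 32
        System.out.println("long:  " + Long.SIZE);      // 64
        System.out.println("char:  " + Character.SIZE); // 16
    }
}
```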
Is there a limit to increasing the max heap size in Java? I am wondering if a large heap size can be set as long as the physical memory is available.
For example, if a server has 100 GB of RAM, can I set the max heap at 90 GB? I know that GC will halt the app, but I am just curious.
Thanks.
With a 32-bit JVM, the hard limit would be 4 GB, but the actual limit is lower because, at least if you aren't running a 64-bit OS, some space must be left for non-heap memory: the JVM's own address space (non-Java), stacks for all threads, architecture/OS limitations, and the like. A 64-bit JVM has no such limitation, so you could set the limit to 90 GB, although I wouldn't recommend it for the reason you already pointed out.
In a 64-bit VM, will using longs instead of ints do any better in terms of performance, given that longs are 64 bits in Java and hence pulling and processing a 64-bit word may be faster than pulling a 32-bit word on a 64-bit system? (I am expecting a lot of NOs, but I was looking for a detailed explanation.)
EDIT: I am implying that "pulling and processing a 64-bit word may be faster than pulling a 32-bit word on a 64-bit system" because I am assuming that on a 64-bit system, pulling 32-bit data would require you to first get the 64-bit word and then mask the top 32 bits.
Using long instead of int will probably slow you down in general.
Your immediate concern is whether an int on a 64-bit CPU requires extra processing time. This is highly unlikely on a modern pipelined CPU. We can test this easily with a little program. The data it operates on should be small enough to fit in the L1 cache, so that we are testing this specific concern. On my machine (64-bit Intel Core2 Quad) there's basically no difference.
In a real app, most data can't reside in CPU caches. We must worry about loading data from main memory to cache which is relatively very slow and usually a bottleneck. Such loading works on the unit of "cache lines", which is 64 bytes or more, therefore loading a single long or int will take the same time.
However, using long will waste precious cache space, so cache misses will increase which are very expensive. Java's heap space is also stressed, so GC activity will increase.
We can demonstrate this by reading huge long[] and int[] arrays with the same number of elements. They are way bigger than the caches can contain. The long version takes 65% more time on my machine. The test is bound by memory-to-cache throughput, and the long[] memory volume is 100% bigger. (Why it doesn't take 100% more time is beyond me; obviously other factors are in play too.)
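A minimal sketch of that comparison. The timings will vary by machine, but the 2x size ratio will not:

```java
// Sketch of the test described above: identical element counts, but the
// long[] occupies twice the memory, so a memory-bound scan reads twice
// as many bytes.
public class LongVsInt {
    static final int N = 1 << 23; // 8M elements; large enough to exceed typical caches

    public static void main(String[] args) {
        int[] ints = new int[N];
        long[] longs = new long[N];
        // Approximate payload sizes (ignoring array headers):
        System.out.println("int[]  bytes: " + 4L * N); // 33554432
        System.out.println("long[] bytes: " + 8L * N); // 67108864

        long t0 = System.nanoTime();
        long sumI = 0;
        for (int v : ints) sumI += v;
        long t1 = System.nanoTime();
        long sumL = 0;
        for (long v : longs) sumL += v;
        long t2 = System.nanoTime();
        System.out.printf("int[] scan: %d us, long[] scan: %d us%n",
                (t1 - t0) / 1_000, (t2 - t1) / 1_000);
    }
}
```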
I always go with: use the right datatype based on your problem domain.
What I mean by this is, if you need a 64-bit long then use a 64-bit long, but if you don't need a 64-bit long then use an int.
Using 32 bits on a 64-bit platform is not expensive, and it makes no sense to do a comparison based on performance.
To me this doesn't look right:
for (long l = 0; l < 100; l++)
    //SendToTheMoon(l);
And SendToTheMoon has nothing to do with it.
Might be faster, might be slower. Depends on the specific Java implementation, and how you use those variables. But in general it's probably not enough difference to worry about.