jvm design decision


Why does the JVM require around 10 MB of memory for a simple hello world when the CLR doesn't? What is the trade-off here, i.e. what does the JVM gain by doing this?
Let me clarify a bit, because I'm not conveying the question that is in my head. There is clearly an architectural difference between the JVM and CLR runtimes. The JVM has a significantly higher memory footprint than the CLR. I'm assuming there is some benefit to this overhead; otherwise, why would it exist? I'm asking what the trade-offs are in these two designs. What benefit does the JVM gain from its memory overhead?

I guess one reason is that Java has to do everything itself (another aspect of platform independence). For instance, Swing draws its own components from scratch; it doesn't rely on the OS to draw them. That's all got to take place in memory. Lots of stuff that Windows may do but Linux does not (or does differently) has to be fully contained in Java so that it works the same on both.
Java also always insists that its entire library is "linked" and available. Since it doesn't use DLLs (they wouldn't be available on every platform), everything has to be loaded and tracked by Java.
Java even does a lot of its own floating point, since FPUs often give different results, which has been deemed unacceptable.
So if you think about all the stuff C# can delegate to the OS it's tied to versus all the stuff Java has to do itself to compensate for differences between OSes, the difference should be expected.
I've run java apps on 2 embedded platforms now. One was a spectrum analyzer where it actually drew the traces, the other is set-top cable boxes.
In both cases, this minimum memory footprint hasn't been an issue--there HAVE been Java specific issues, that just hasn't been one. The number of objects instantiated and Swing painting speed were bigger issues in these cases.

I don't know if initial memory footprint or a footprint of a Hello World application is important. A difference might be due to the number and sizes of the libraries that are loaded by the JVM / CLR. There can also be an amount of memory that is preallocated for garbage collection pools.
Every application that I know of uses a lot more than Hello World functionality. It will allocate and free memory thousands of times throughout the execution of the application. If you are interested in memory utilization differences between the JVM and the CLR, here are a couple of links with good information:
http://benpryor.com/blog/2006/05/04/jvm-vs-clr-memory-allocation/
Memory Management Case study (JVM & CLR)
Memory Management Case study is in Power Point. A very interesting presentation.

Seems like java is just using more virtual memory.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
amwise 20598 0.0 0.5 22052 5624 pts/3 Sl+ 14:59 0:00 mono Test.exe
amwise 20601 0.0 0.7 214312 7284 pts/2 Sl+ 15:00 0:00 java Program
I made a test program in C# and in Java that prints the string "test" and waits for input. I believe that the resident set size (RSS) value more accurately shows the memory usage. The virtual memory usage (VSZ) is less meaningful.
As I understand it applications can reserve a ton of virtual memory without actually using any real memory. For example you can ask the VirtualAlloc function on Windows to either reserve or commit virtual memory.
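For anyone who wants to repeat the comparison, the Java side of the test described above might look roughly like this. It is only a sketch: the original test source was not posted, so the helper name printAndWait is made up here, and only the class name Program comes from the ps output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;

// Hypothetical reconstruction of the "print and wait" test program.
class Program {
    // Prints "test", then blocks until one byte of input (or end of stream) arrives.
    static int printAndWait(PrintStream out, InputStream in) throws IOException {
        out.println("test");
        out.flush();
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        // While this is blocked on stdin, inspect the process from another terminal.
        printAndWait(System.out, System.in);
    }
}
```

While it is waiting for input, something like ps -o vsz,rss -p <pid> in a second terminal shows the two figures side by side.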
EDIT:
Here is a pretty picture from my windows box:
alt text http://awise.us/images/mem.png
Each app was a simple printf followed by a getchar.
Lots of virtual memory usage by Java and the CLR. The C version depends on just about nothing, so its memory usage is relatively tiny.
I doubt it really matters either way. Just pick whichever platform you are more familiar with and then don't write terrible, memory-wasting code. I'm sure it will work out.
EDIT:
This VMMap tool from Microsoft might be useful in figuring out where memory is going.

The JVM counts all its shared libraries whether they use memory or not.
Task manager is rather unreliable when it comes to reporting the memory consumption of programs. You should take it as a guide.

The JVM loads lots of unnecessary core classes from rt.jar on each run. Unfortunately, the cross-dependencies between Java packages (java.lang <-> java.io) make it hard to do a partial runtime init. Not to mention that rt.jar itself is over 40MB and takes a lot of time to look up and decompress.
Post Java 6u10 seems to load things a bit smarter (it has jqs.exe, the Java Quick Starter service, which keeps necessary data in memory for a faster startup), and Java 7 is said to be better still.
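To see how many classes a trivial program really pulls in, the standard java.lang.management API can report it. A minimal sketch (the class name LoadedClassCount is just for illustration, and the number will differ between JVM versions and vendors):

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: reports how many classes this JVM has loaded so far.
class LoadedClassCount {
    // Number of classes currently loaded in this JVM.
    static int loadedNow() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        // Even this near-empty program needs the core classes to be loaded first.
        System.out.println("Classes loaded: " + loadedNow());
    }
}
```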
The Process Explorer in Windows reports the Private Bytes correctly (Private bytes are those memory regions, which are not shared by any dll).
A slightly bigger annoyance is that after 10 years, the JVM still defaults to a 64MB memory limit. It is really annoying to have to use -Xmx almost every time, and I cannot run demanding programs in jars with a simple double click (unless I alter the command in the file extension assignment).

The CLR is counted as part of the OS, so the task manager doesn't report its memory consumption under the application process.

Related

JVM allocates way more than necessary?

My Java heap is allocating at around 123 MB. I need this to be less. I have a 1 GB limit and both programs running are servers. One runs at 953 MB. The server JAR I am trying to run should only take up 10 MB or less. How can I make Ubuntu behave the same as the other OSes I have tested the JAR on? My code can be found at GitHub.
Java Version: JDK/JRE-7

Out-of-the-box Java on *nix can look a little scary when you just look at it via top. The java executable often puts up huge numbers under the VIRT column, like 900m. Why is my small Java program using 900m of RAM?
Actually, it's probably not using 900m of RAM. The JVM has told the OS "I might use this much memory... be prepared". But it's probably not actually using anywhere near that much physical RAM -- and if it's a small program, it'll never come anywhere near that. Any physical RAM that java is not actually using is still freely available to other processes on the system.
For a more accurate picture of how much physical RAM the java process is using, look under top's RES column. Though, a full discussion of *nix memory management and profiling Java is probably outside the scope of this answer. I'd encourage you to try Googling the topic and developing specific questions based on the material you find.
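If you'd rather have the process report these numbers itself, on Linux it can read them from /proc/self/status. The following is a rough, Linux-only sketch; the class and method names are invented for this example, and VmSize and VmRSS correspond only approximately to top's VIRT and RES columns.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Illustrative only: reads this process's virtual and resident sizes on Linux.
class ProcStatus {
    // Returns the value in kB for a field such as "VmRSS" or "VmSize", or -1 if absent.
    static long fieldKb(List<String> lines, String field) {
        for (String line : lines) {
            if (line.startsWith(field + ":")) {
                // A line looks like "VmRSS:      7284 kB"
                String[] parts = line.substring(field.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // exists on Linux only
        if (!Files.isReadable(status)) {
            System.out.println("No /proc/self/status here; this sketch is Linux-only.");
            return;
        }
        List<String> lines = Files.readAllLines(status);
        System.out.println("VmSize (roughly VIRT): " + fieldKb(lines, "VmSize") + " kB");
        System.out.println("VmRSS  (roughly RES):  " + fieldKb(lines, "VmRSS") + " kB");
    }
}
```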
Most of the time your Java programs (and other programs running along side them) are going to do just fine using Java's default memory settings. Sometimes you need to limit (or increase) the maximum amount of heap memory that JVM is allowed to allocate. This is the most commonly tuned Java memory setting, and it is usually set with the -Xmx command-line argument. You can read more about it here and here.
Sometimes it can be a little bit tricky figuring out where to modify java's command-line options if your Java program is being magically started for you, e.g., as a system service, or part of some larger script. Googling Xmx will probably get you started on the conventional way of modifying java arguments for that product.
For example Google search: ubuntu tomcat Xmx
Gives links that point us in the direction of /etc/default/tomcat6.

Java RAM increases although Heap stays same? [duplicate]

Possible Duplicate:
Limit jvm process memory on ubuntu
In my application I'm uploading documents to a server, which does some analyzing on it.
Today I analyzed my application using jconsole.exe and heap dumps to find out whether I have memory issues or a memory leak. I thought I might suffer from one, since my application's RAM usage grows a great deal while it is running.
As I watched the heap / codecache / perm gen etc. memory with jconsole after some runs, I was surprised as I saw the following:
picture link: https://www7.pic-upload.de/13.06.12/murk9qrka8al.png
As you can see in jconsole on the right, the heap increases when I'm doing analysis-related work, but it decreases again to its normal size when the work is over. On the left you can see "htop" on the server the application is deployed on. And there it is: although the heap behaves normally and the garbage collector seems to be running correctly, RAM usage is incredibly high at almost 3.2 GB.
This is now really confusing me. I wondered whether the Java VM stack could have something to do with this. I did some research, and what I found described the VM stack as a small area of memory of only a few megabytes (or even only kilobytes).
My technical background:
The application is running on glassfish v.3.1.2
The database is running on MySQL
Hibernate is used as ORM framework
Java version is 1.7.0_04
It's implemented using VAADIN
MySQL database and glassfish are the only things running on this server
I'm constructing XML-DOM-style documents using JAXB during the analysis and save them in the database
Uploaded documents are either .txt or .pdf files
OS is linux
Solution?
Do you have any ideas why this happens and what I can do for fixing it? I'm really surprised at the moment, since I thought the memory problems came from a memory leak which causes the heap to explode. But now, the heap isn't the problem. It's the RAM that goes higher and higher while the heap stays on the same level. And I don't know what to do to resolve it.
Thanks for every thought you're sharing with me.
Edit: Maybe I should also state that this behaviour currently makes it impossible for me to let other people use my application. When the RAM is full and the server stops responding, I'm stuck.
Edit2: Maybe I should also add that RAM usage keeps increasing after every further successful analysis.

Many more things use memory in a JVM implementation than the heap settings cover.
The heap setting -Xmx only controls the Java heap; it doesn't control the JVM's consumption of native memory, which differs completely from one implementation to another.
From the article Thanks for the Memory (Understanding How the JVM uses Native Memory on Windows and Linux):
Maintaining the heap and the garbage collector uses native memory you can't control:
"More native memory is required to maintain the state of the memory-management system maintaining the Java heap. Data structures must be allocated to track free storage and record progress when collecting garbage. The exact size and nature of these data structures varies with implementation, but many are proportional to the size of the heap."
The JIT compiler uses native memory, just like javac would:
"Bytecode compilation uses native memory (in the same way that a static compiler such as gcc requires memory to run), but both the input (the bytecode) and the output (the executable code) from the JIT must also be stored in native memory. Java applications that contain many JIT-compiled methods use more native memory than smaller applications."
And then you have the classloader(s), which use native memory:
"Java applications are composed of classes that define object structure and method logic. They also use classes from the Java runtime class libraries (such as java.lang.String) and may use third-party libraries. These classes need to be stored in memory for as long as they are being used. How classes are stored varies by implementation."
I won't even start quoting the section on threads. I think you get the idea: the Java heap isn't the only thing that consumes memory in a JVM implementation, not everything goes in the JVM heap, and the heap takes up way more native memory than what you specify, for management and bookkeeping.
Native Code
App servers often have native code that runs outside the JVM but still shows up to the OS as memory associated with the process that controls the app server.
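Part of this split can be observed from inside the process with the standard MemoryMXBean. The sketch below is only illustrative (the class name HeapVsNonHeap is invented), and note the limitation: the JVM's "non-heap" figure covers areas it tracks itself, such as the code cache and class metadata. It does not include thread stacks, GC bookkeeping, or memory allocated by native libraries, so the RAM the OS reports can still be much higher.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Illustrative only: compares heap usage with the JVM-tracked non-heap usage.
class HeapVsNonHeap {
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Formats the used and committed sizes of one memory area.
    static String describe(String label, MemoryUsage u) {
        return label + ": used=" + toMiB(u.getUsed()) + " MiB, committed="
                + toMiB(u.getCommitted()) + " MiB";
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println(describe("heap", mem.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", mem.getNonHeapMemoryUsage()));
    }
}
```

If the heap figure stays flat while the process size in htop keeps climbing, the growth is happening outside both of these numbers, which points towards native allocations.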

Why does System.gc() seem to have no effect on some JVMs

I have been developing a small Java utility that uses two frameworks: Encog and Jetty to provide neural network functionality for a website.
The code is 'finished' in that it does everything it needs to do, but I have some problems with memory usage. When running on my development machine, memory usage fluctuates between about 4MB and 13MB while the application is doing things (training neural networks), and at most it uses about 18MB. This is very good, and I think it is due to the fact that I call System.gc() fairly regularly. I do this because the processing time doesn't matter for me, but the memory usage does.
So it all works fine on my machine, but as soon as I put it online on our server (shared unix hosting with memory limits) it uses about 19MB to start with and rises to hundreds of MB of memory usage when doing things. These are the same things that I have been doing in testing. The only way, I believe, to reduce the memory usage is to quit the application and restart it.
The only difference that I can tell is the Java Virtual Machine that it is being run on. I do not know much about this, and I have tried to find the reason why it is acting this way, but a lot of the documentation assumes a great knowledge of Java and virtual machines. Could someone please help me with some reasons why this may be happening, and perhaps some things to try to stop it?
I have looked at using GCJ to compile the application, but I don't know if this is something I should be putting a lot of time in to and whether it will actually help.
Thanks for the help!
UPDATE: Developing on Mac OS 10.6.3 and server is on a unix OS but I don't know what. (Server is from WebFaction)

"I think it is due to the fact that I call System.gc() fairly regularly"
You should not do that; it's almost never useful.
A garbage collector works most efficiently when it has lots of memory to play with, so it will tend to use a large part of what it can get. I think all you need to do is to set the max heap size to something like 32MB with an -Xmx32m command line parameter - the default depends on whether the JVM believes it's running on a "server class" system, in which case it assumes that you want the application to use as much memory as it can in order to give better throughput.
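A quick way to check which limit the JVM actually settled on, on either machine, is to ask the runtime. A small sketch (the class name MaxHeap is made up for this example):

```java
// Illustrative only: prints the maximum heap size this JVM will try to use.
class MaxHeap {
    static long bytesToMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Reflects -Xmx if given, otherwise the default the JVM picked for this machine.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + bytesToMiB(max) + " MiB");
    }
}
```

Running it as java -Xmx32m MaxHeap should report a figure close to 32 MiB (some collectors report slightly less), while running it without the flag on the server shows the default that the JVM chose there.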
BTW, if you're running on a 64 bit JVM on the server, it will legitimately need more memory (usually about 30%) than on a 32bit JVM due to larger references.

Two points you might consider:
Calls to System.gc() can be disabled with a command-line parameter (-XX:+DisableExplicitGC). I think the behaviour also depends on the GC algorithm the VM uses. Normally, invoking the GC should be left to the JVM.
As long as there is enough memory available for the jvm I don't see anything wrong in using this memory to increase application and gc performance. As Michael Borgwardt said you can restrict the amount of memory the vm uses at the command line.

Also, you may want to look at which mode the JVM is started in when you deploy it online. My guess is that it's a server VM.
Take a look at the differences between the two right here on Stack Overflow. Also, see which garbage collector is actually running on the actual deployment. See if you can tweak the GC behaviour or change the GC algorithm. See the -X options if it's a Sun JVM.

Basically the JVM takes the amount of memory it is allowed to as needed, in order to make the "new" operation as fast as possible (this is a science in itself).
So if you have a lot of objects being used, and then discarded, you will slowly and surely fill up the available memory. Then you can ask for garbage collection, but it is just a hint, and the JVM may choose not to listen.
So, you need another mechanism to keep memory usage down. The typical approach is to limit the amount of memory with the -X options, but be careful, since the JVM you use on your PC may be very different from the one you deploy on, and the memory needs may therefore be different.
Is there a deliberate requirement for low memory usage? If not, then just let it run and see how the JVM behaves. Use jvisualvm to attach and monitor.
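To see for yourself how much (or how little) an explicit request changes, you can measure used heap before and after the call. This is just a rough sketch with an invented class name; the numbers depend heavily on the JVM, the collector, and flags such as -XX:+DisableExplicitGC, so no particular result should be expected.

```java
// Illustrative only: measures used heap around an explicit GC request.
class GcHint {
    // Heap currently in use: committed heap minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
        }
        long before = usedHeapBytes();
        System.gc(); // a request only; the JVM is free to ignore it
        long after = usedHeapBytes();
        System.out.println("used before: " + before / 1024 + " KiB");
        System.out.println("used after:  " + after / 1024 + " KiB");
    }
}
```

Even when the used figure drops, the JVM does not necessarily hand the freed memory back to the OS, which is why the process can still look large from the outside.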

Perhaps the server uses more memory because there is a higher load on your app and so more threads are in use? Jetty will use a number of threads to spread out the load if there are a lot of requests. It's worth looking at the thread count on the server versus on your test machine.
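One way to compare the thread counts is to log them from inside the application on both machines using the standard ThreadMXBean. A minimal sketch, with a made-up class name; in a real Jetty app you would call liveThreads() from your own code rather than from main:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Illustrative only: reports how many threads are alive in this JVM.
class ThreadCount {
    // Current number of live threads, daemon and non-daemon.
    static int liveThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Live threads: " + liveThreads());
        System.out.println("Peak threads: " + threads.getPeakThreadCount());
    }
}
```

Each thread has its own stack allocated outside the Java heap, so a much higher count on the server would add memory that -Xmx does not limit.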

Java using too much memory on Linux?

I was testing the amount of memory Java uses on Linux. When just starting up an application that does absolutely NOTHING, it already reports that 11 MB is in use. When doing the same on a Windows machine, about 6 MB is in use. These were measured with the top command and the Windows task manager. The VM I use on Linux is the 1.6_0_11 one, and the HotSpot VM is Server 11.2. Starting the application using -client did not influence anything.
Why does java take this much memory? How can I reduce this?
EDIT: I measure memory using the windows task manager and in Linux I open the terminal and type top.
Also, I am only interested in how to reduce this or if I even CAN reduce this. I'll decide for myself whether a couple of megs is a lot or not. It's just that the difference of 5 MB between windows and Linux is strange, and I want to know if I am able to do this on Linux too.

If you think 11MB is "too much" memory... you'd better avoid using Java entirely. Seriously, the JVM needs to do quite a lot of stuff (bytecode verifier, GC, loading all the essential classes), and in an age where average desktop machines have 4GB of RAM, keeping the base JVM overhead (and memory use in general) very low is simply not a design priority.
If you need your app to run on an embedded system (pretty much the only case where 11 MB might legitimately be considered "too much"), then there are special JVMs designed for such systems that use less RAM, but at the cost of lacking many of the features and/or performance of mainstream JVMs.

You can control the heap size; otherwise default values will be used. java -X gives you an explanation of the meaning of these switches.
e.g.
JAVA_OPTS="-Xms6m -Xmx6m"
java $JAVA_OPTS MyClass

The question you might really be asking is, "Do Windows task manager and Linux top report memory in the same way?" I'm sure there are others who can answer this question better than I can, but I suspect that you may not be doing an apples-to-apples comparison.
Try using the jconsole application on each respective machine to do a more granular inspection. You'll find jconsole on your sdk under the bin directory.
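Much of what jconsole shows on its memory tab comes from the JVM's memory pool beans, and you can print the same breakdown from code if attaching a GUI to the Linux box is awkward. A small sketch (the class name MemoryPools is invented; pool names differ between JVMs and collectors):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: lists the JVM's memory pools and how much of each is in use.
class MemoryPools {
    static List<String> poolSummaries() {
        List<String> out = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage(); // null if the pool is no longer valid
            if (u == null) {
                continue;
            }
            out.add(pool.getName() + " [" + pool.getType() + "]: "
                    + u.getUsed() / 1024 + " KiB used");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : poolSummaries()) {
            System.out.println(line);
        }
    }
}
```

Running the same class on both machines gives a per-pool comparison that does not depend on how top or the task manager count memory.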
There is also a very extensive discussion of java memory management at http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/
The short answer is that how memory is being allocated is more complex than a single figure at the top of a simplified system utility can show.

Both Top and TaskManager will report how much memory has been allocated to a process, not how much the process is actually using, so I would say it's not an apples to apples comparison. Regardless, in the age of Gigs of memory what's a couple megs here or there on startup?

Linux and Windows are radically different operating systems and use RAM very differently. Windows kind of allocates as you go, and Linux caches more at once, and prepares for the future, so that the next operations are smooth.
This explanation is not quite right, but it's close enough for you.

RAM memory reallocation - Windows and Linux

I am working on a project involving optimizing energy consumption within a system. Part of that project consists of allocating RAM based on locality, that is, allocating memory segments for a program as close as possible to each other. Is there a way to know exactly where the memory I allocate is physically located (on which memory chips)? I was also wondering whether it is possible to force allocation in a deterministic manner. I am interested in both Windows and Linux. Also, the project will be implemented in Java and .NET, so I am interested in managed APIs to achieve this.
[I am aware that this might not translate into direct energy consumption reduction but the project is supposed to be a proof of concept.]

You're working at the wrong level of abstraction.
Java (and presumably .NET) refers to objects using handles, rather than raw pointers. The underlying Java VM can move objects around in virtual memory at any time; the Java application doesn't see any difference.
Win32 and Linux applications (such as the Java VM) refer to memory using virtual addresses. There is a mapping from virtual address to a physical address on a RAM chip. The kernel can change this mapping at any time (e.g. if the data gets paged to disk then read back into a different memory location) and applications don't see any difference.
So if you're using Java and .NET, I wouldn't change your Java/.NET application to achieve this. Instead, I would change the underlying Linux kernel, or possibly the Java VM.
For a prototype, one approach might be to boot Linux with the mem= parameter to restrict the kernel's memory usage to less than the amount of memory you have, then look at whether you can mmap the spare memory (maybe by mapping /dev/mem as root?). You could then change all calls to malloc() in the Java VM to use your own special memory allocator, which allocates from that free space.
For a real implementation of this, you should do it by changing the kernel and keeping userspace compatibility. Look at the work that's been done on memory hotplug in Linux, e.g. http://lhms.sourceforge.net/

If you want to try this in a language with a big runtime you'd have to tweak the implementation of that runtime or write a DLL/shared object to do all the memory management for your sample application. At which point the overall system behaviour is unlikely to be much like the usual operation of those runtimes.
The simplest, cleanest test environment to detect the (probably small) advantages of locality of reference would be in C++ using custom allocators. This environment will remove several potential causes of noise in the runtime data (mainly the garbage collection). You will also lose any power overhead associated with starting the CLR/JVM or maintaining its operating state - which would presumably also be welcome in a project to minimise power consumption. You will naturally want to give the test app a processor core to itself to eliminate thread switching noise.
Writing a custom allocator to give you one of the preallocated chunks on your current page shouldn't be too tough, but given that to accomplish locality of reference in C/C++ you would ordinarily just use the stack it seems unlikely there will be one you can just find, download and use.

In C/C++, if you coerce a pointer to an int, this tells you the address. However, under Windows and Linux, this is a virtual address -- the operating system determines the mapping to physical memory, and the memory management unit in the processor carries it out.
So, if you care where your data is in physical memory, you'll have to ask the OS. If you just care if your data is in the same MMU block, then check the OS documentation to see what size blocks it's using (4KB is usual for x86, but I hear kids these days are playing around with 16M giant blocks?).
Java and .NET add a third layer to the mix, though I'm afraid I can't help you there.

Is pre-allocating in bigger chunks (than needed) an option at all? Will it defeat the original purpose?

I think that if you want such tight control over memory allocation, you are better off using a compiled language such as C. The JVM isolates the actual implementation of the language from the hardware, chip selection for data storage included.

The approach requires specialized hardware. In ordinary memory sticks, chip and slot arrangements are designed to dissipate heat as evenly across chips as possible, for example by putting 1 bit of every bus word on each physical chip.

This is an interesting topic, although I think it is way beyond the capabilities of managed languages such as Java or .NET. One of the major principles of those languages is that you don't have to manage the memory, and consequently they abstract that away for you. C/C++ gives you better control in terms of actually allocating that memory, but even in that case, as referenced previously, the operating system can do some hand waving and indirection with memory allocation, making it difficult to determine how things are allocated together. Even then, you make reference to the actual chips; that's even harder, and I would imagine it would be hardware-dependent. I would seriously consider using a prototyping board where you can code at the assembly level and actually control every memory unit allocation explicitly, without any interference from compiler optimizations or operating system security practices. That would give you the most meaningful results, as it would give you the ability to control every aspect of the program and determine definitively that any power consumption improvements are due to your algorithm rather than some invisible optimization performed by the compiler or operating system. I imagine this is some sort of research project (very intriguing), so spending ~$100 on a prototyping board would definitely be worth it in my opinion.

In .NET there is a COM interface exposed for profiling .NET applications that can give you detailed address information. I think you will need to combine this with some calls to the OS to translate virtual addresses though.
As zztop alluded to, the .NET CLR compacts memory every time a garbage collection is done. Large objects, however, are not compacted. These are objects on the large object heap, which can consist of many segments scattered around from OS calls to VirtualAlloc.
Here are a couple links on the profiling APIs:
http://msdn.microsoft.com/en-us/magazine/cc300553.aspx
David Broman's CLR Profiling API Blog
