If an object X exists on the Java heap, and native code knows the address of X, is it possible for the native code to access the object directly in memory without involving JNI? And vice versa: if Java code knows the address of an object Y on the native heap, can Java access it without involving JNI?
To be more precise: are Java objects stored in memory the same way as native objects, or differently? If they are different, wouldn't byte-array objects in Java and in native code at least be stored the same way?
Please provide your suggestions and references.
EDIT: Maybe this is the right question: why do objects need to be transferred from the Java heap to the native heap through JNI? Why can't a Java heap object be accessed by native code directly?
Can Java code access native objects? No. Java code is managed by the JVM. (More precisely, it's bytecode, not Java code.) The specification of the JVM does not allow bytecode to access arbitrary memory. Bytecode can't even access arbitrary addresses on the JVM heap. For example, private fields can only be accessed by bytecode in the same class.
Can native code access JVM heap objects directly (without JNI)? Yes. Native code is running in the same process and address space as the JVM. As far as I know, on most operating systems and hardware platforms this means that native code can do whatever it wants in that address space.
Should native code access JVM heap objects directly? Definitely not.
First of all, the JVM specification does not specify the layout of objects on the JVM heap, not even of byte arrays. For example, the JVM may split the array into chunks and transparently translate addresses when bytecode uses the array. If you tried to write native code that accesses the array, you would have to re-implement that translation. Such code may work in one JVM implementation, but probably not in another, or maybe not even in a newer version of the same JVM, or in the same JVM when it runs with a different configuration. That's one reason why you have to use JNI: it gives native code a well-defined "view" of objects on the JVM heap.
Secondly, the JVM garbage collector can move objects around on the heap at any time. Native code should therefore access JVM heap objects through handles. The garbage collector knows about these handles and updates them if necessary. Native code that tries to bypass a handle can never be sure the object is still there.
A third problem is native code that directly modifies pointers between objects on the JVM heap. Depending on the garbage collector algorithm, this may cause all kinds of problems.
In a nutshell: You probably could access JVM heap objects from native code directly, but you almost certainly shouldn't.
Short answer: No.
Beyond being a Java/C++ issue, this runs up against basic OS concepts. Since each process has its own address space, one process cannot reach objects belonging to another.
This limitation can be circumvented only if the process (the one trying to reach another's memory) runs in kernel space and the underlying OS allows such operations, or if some facility like shared memory is involved. Even then, you face the virtual address space problem: the same physical portion of memory is addressed with different values in different processes. That's why, even if you think you know the address of an object, that address is virtual and useless in other processes.
EDIT: If they are not in different processes, then the answer is definitely yes. In theory, you could implement your own JNI :).
A possible answer is to use the APR (Apache Portable Runtime). Yes, I know it's JNI-based, but it has a concept of shared memory, so it's possible to attach to a shared memory segment created by another program (and vice versa).
https://apr.apache.org/docs/apr/1.5/group__apr__shm.html
Outside of JNI, this does not seem possible.
ChronicleMap, in OpenHFT's repository on GitHub, states in its documentation:
Chronicle Map implements the java.util.concurrent.ConcurrentMap, that stores
its data off the java heap.
I've built a compiler and contributed to a few off-shoot languages' compiler implementations. The ones I've worked on allocate everything on the stack (that's what's available during code generation). I've never worked on the JVM or the Java compiler, but I do know that typically only the heap and stack are available for allocating instances of classes, local variables, function parameters, etc.
Could someone please explain how we're able to write code where we can tell the compiler to instantiate data structures such as the ChronicleMap, have them be available to garbage collection by the JVM (and be kept track of by the JVM's general memory management features), but live off the heap?
I've read up on the simple construction documentation and the associated example. I see the how, but the reasoning underlying what exactly is going on in conjunction with the JVM is unclear.
An important thing to remember is that the javac compiler doesn't do much in the way of optimisation, nor does it give you any means of specifying where data is stored or how code should be optimised. (With a few obscure exceptions in Java 8, like @Contended.)
Java derives much of its extensibility from libraries, which generally operate at runtime. (There is often a build-time option as well.) A key thing to realise is that a Java program can generate and alter code while it is running, so in fact much of the smarts happen at runtime.
In the case of off-heap usage, you need a library which supports this functionality, and this will directly or indirectly use sun.misc.Unsafe (on most popular JVMs). This class allows you to do many things the language doesn't support, but it is still really useful to have if you are a low-level library builder.
While off-heap memory is not directly managed by the GC, you can have proxy objects, e.g. a ByteBuffer, which have a Cleaner, so that when these objects are GC-ed the off-heap memory associated with them is also cleaned up.
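A minimal sketch of that proxy-object idea, using only standard java.nio (class and variable names here are my own, for illustration):

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Allocate 1 KiB outside the Java heap; the ByteBuffer object itself
        // is a small on-heap proxy for the off-heap region.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);

        buf.putInt(0, 42);          // write directly into native memory
        int value = buf.getInt(0);  // read it back
        System.out.println(value);  // prints 42

        // No explicit free is needed: when 'buf' becomes unreachable and is
        // collected, its internal Cleaner releases the off-heap memory.
    }
}
```

The data itself never touches the Java heap; only the small proxy object does, which is what lets the Cleaner tie the native memory's lifetime to GC.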
Disclaimer, I wrote most of ChronicleMap.
The term off-heap refers to the ability to use "raw" memory buffers in Java. These may be regular memory buffers from the process address space, or memory-mapped files.
These buffers are "raw" - you manage their content yourself - they are not managed by the garbage collector.
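To illustrate the memory-mapped-file flavour of "raw" buffer, here is a hedged sketch (file name and sizes are arbitrary) using FileChannel.map from the standard library:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedDemo {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("raw", ".bin");
        f.deleteOnExit();

        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            // Map 4 KiB of the file into the process address space.
            // The mapped region lives outside the Java heap and is never
            // moved or compacted by the garbage collector.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

            map.putLong(0, 123456789L);  // you manage the layout yourself
            System.out.println(map.getLong(0));
        }
    }
}
```

Note that the "you manage the content yourself" point shows up immediately: there is no object layout here, only offsets you choose.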
I've been learning about Java and how it uses garbage collection versus manual deallocation of objects. I couldn't find an answer to whether Java objects get removed when a Java application closes. What exactly happens in the JVM when, say, a small console application with an object
public class Hello {
    public String name = "Y_Y";
}
exists in memory and the console application is closed?
Thanks,
Y_Y
When an application closes, the JVM stops running and all of its memory is returned to the operating system.
For all practical purposes, the heap and all objects allocated there cease to exist.
If you're concerned about security, any process with raised privileges would be able to scan that memory and read whatever's left around. It would have to do so before the memory gets allocated to another process. But that could also happen while the original program/jvm is running.
You can't know for sure. The behavior is not specified or guaranteed. But you should not care too much about that; what you should care about is that the memory is reclaimed.
If security is your concern, well, it shouldn't be here: security cases should be handled when encountered, and rewriting the entire memory with zeros or garbage would make exit really slow.
What happens is the memory occupied by the string is freed on exit.
If the object implements a finalize() method, it may be called, though this is not guaranteed (especially on JVM exit).
Also, you can request garbage collection manually using System.gc(); note that this is only a hint to the JVM.
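A small sketch of the "hint, not command" nature of System.gc(), using a WeakReference to observe collection (the class name is mine):

```java
import java.lang.ref.WeakReference;

public class GcHintDemo {
    public static void main(String[] args) {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);

        System.out.println(ref.get() != null);  // true: a strong reference still exists

        obj = null;   // drop the only strong reference
        System.gc();  // only a *request*; the JVM is free to ignore it

        // After a collection the weak reference is usually cleared on
        // common JVMs, but the specification does not guarantee when
        // (or even whether) that happens.
        System.out.println(ref.get());
    }
}
```

This is why code should never depend on System.gc() or finalize() for correctness.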
The following is an extract from the Sun specifications.
The specification for the Java platform makes very few promises about
how garbage collection actually works. Here is what the Java Virtual
Machine Specification (JVMS) has to say about memory management.
The heap is created on virtual machine start-up. Heap storage for
objects is reclaimed by an automatic storage management system (known
as a garbage collector); objects are never explicitly deallocated. The
Java virtual machine assumes no particular type of automatic storage
management system, and the storage management technique may be chosen
according to the implementor's system requirements. While it can seem
confusing, the fact that the garbage collection model is not rigidly
defined is actually important and useful: a rigidly defined garbage
collection model might be impossible to implement on all platforms.
Similarly, it might preclude useful optimizations and hurt the
performance of the platform in the long term.
Although there is no one place that contains a full definition of
required garbage collector behavior, much of the GC model is
implicitly specified through a number of sections in the Java Language
Specification and JVMS. While there are no guarantees about the exact
process followed, all compliant virtual machines share the basic
object lifecycle described in this chapter.
Is it possible to perform memory management yourself? E.g., we allocate a chunk of memory outside the heap space so that it is not subject to GC, and we ourselves take care of allocating/deallocating objects within this chunk of memory.
Some people have pointed to frameworks like jmalloc/Ehcache. Actually, I more want to understand how they actually do it.
I am fine with a direct approach or even an indirect one (e.g. first serializing the Java objects).
You cannot allocate Java objects in a foreign memory location, but you can map memory which is, e.g., allocated in a native library into a direct ByteBuffer and use it from Java code.
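From the Java side, the usual pattern is a direct ByteBuffer in native byte order; native code that receives such a buffer through JNI can get at the same memory via GetDirectBufferAddress without any copying. A hedged, Java-only sketch (no actual native library is attached here):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SharedBufferSketch {
    public static void main(String[] args) {
        // A direct buffer allocated from Java. If passed to a native method,
        // the C side could call GetDirectBufferAddress on it and read/write
        // the same off-heap memory at the same offsets.
        ByteBuffer shared = ByteBuffer.allocateDirect(64)
                                      .order(ByteOrder.nativeOrder());

        shared.putDouble(0, 3.14);  // written once, visible to both sides
        System.out.println(shared.getDouble(0));
    }
}
```

The reverse direction (wrapping natively allocated memory) is done from C with JNI's NewDirectByteBuffer, which produces exactly this kind of buffer.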
You can use the off-heap memory approach. Look, for example, at jmalloc.
This is also a useful link: Difference between on- and off-heap.
I have a library which does this sort of thing. You create excerpts which can be used as rewritable objects or queued events. It keeps track of where objects start and end. The downside is that the library assumes you will cycle all the objects once per day or week, i.e. there is no clean-up as such. On the plus side it's very efficient, can be used in an event-driven manner, persisted, and shared between processes. You can have hundreds of GB of data while using only a few MB of heap.
https://github.com/peter-lawrey/Java-Chronicle
BTW: It supports ByteBuffers or using Unsafe for extra performance.
If you mean Java objects, then no, this isn't possible with standard VMs. Although you can always modify the VM if you want to experiment (Jikes RVM for example was made for this very purpose), but bear in mind that the result won't really be Java any more.
As for memory allocation for non-Java data structures, that is possible and is done regularly by native libraries, and there is even some Java support for it (mentioned in the other answers), with the general caveat that you can very easily self-destruct with it.
http://courses.washington.edu/css342/zander/css332/arch.html
bottom of the page:
The C++ memory model differs from the Java memory model. In C++,
memory comes from two places, the run time stack and the memory heap.
This reads as if Java doesn't have a heap (or stack)?
I am trying to learn all the "under the bonnet" details for Java and C++
Java has a heap and a (per-thread) stack as well. The difference is that in Java, you cannot choose where to allocate a variable or object.
Basically, all objects and their instance variables are allocated on the heap, and all method parameters and local variables (just the references in the case of objects) are allocated on the stack.
However, some modern JVMs will allocate some objects on the stack as a performance optimization when they detect that the object is only used locally.
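The reference-on-the-stack versus object-on-the-heap split can be seen in a tiny example (class and field names are mine):

```java
public class HeapVsStack {
    static class Point { int x; }

    public static void main(String[] args) {
        // 'a' and 'b' are references (stack-allocated locals); the single
        // Point object they refer to lives on the heap.
        Point a = new Point();
        Point b = a;  // copies the reference, not the object

        b.x = 7;
        System.out.println(a.x);  // prints 7: both names see the same heap object
    }
}
```

In C++ you could choose between `Point a;` (stack) and `new Point` (heap); in Java, the object is always on the heap and only the reference lives on the stack.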
Java uses a heap memory model. All objects are created on the heap; references are used to refer to them.
It also puts method frames onto a stack when processing them.
I would say it has both.
Yes, Java has both a heap (common to the entire JVM) and stacks (one per thread).
And having stack & heap is more a property of implementations than of languages.
I would even say that most Linux programs have a heap (obtained through the mmap and sbrk system calls) and a stack (at the level of the operating system, this is not dependent on the language).
What Java has, but C++ usually does not, is a garbage collector. You don't need to release unused memory in Java, but in C++ you need to release it, by calling delete, for every C++ object allocated on the heap with new.
See however Boehm's garbage collector for a GC usable in C & C++. It works very well in practice (even if it can leak in theory, being a conservative, not a precise, GC).
Some restricted C++ or C environments (in particular freestanding implementations for embedded systems without an operating system kernel) don't have any heap at all.
I've written a library in C which consumes a lot of memory (millions of small blocks), and a C program which uses this library. I've also written a Java program which uses the same library. The Java program is a very thin layer around the library: basically there is only one native method, which is called, does all the work, and returns hours later. There is no further communication between Java and the native library using the Java invocation interface, nor are there Java objects which consume a noteworthy amount of memory.
So the C program and the Java program are very similar. The whole computation/memory allocation happens inside the native library. Still, when executed, the C program consumes 3 GB of memory, but the Java program consumes 4.3 GB! (the VIRT amount reported by top)
I checked the memory map of the Java process (using pmap). Only 40 MB are used by libraries, so additional libraries loaded by Java are not the cause.
Does anyone have an explanation for this behavior?
EDIT: Thanks for the answers so far. To make it a little clearer: the Java code does nothing but invoke the native library ONCE! The Java heap is standard size (perhaps 60 MB) and is not used (except for the one class containing the main method and the other class invoking the native library).
The native library method is long-running and does a lot of mallocs and frees. Fragmentation is one explanation I thought of myself, too. But since no Java code is active, the fragmentation behavior should be the same for the Java program and the C program. Since it is different, I also presume the malloc implementations used differ when run from the C program versus from Java.
Just guessing: you might be using a non-default malloc implementation when running inside the JVM that's tuned to the specific needs of the JVM and produces more overhead than the general-purpose malloc in your normal libc implementation.
Sorry guys. Wrong assumptions.
I was used to the 64 MB that Sun's Java implementations used to use as the default maximum heap size. But I used OpenJDK 1.6 for testing. OpenJDK uses a fraction of the physical memory if no maximum heap size is explicitly specified; in my case, one fourth. I used a 4 GB machine, and one fourth is thus 1 GB. There it is: the difference between C and Java.
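You can see the effective limit for yourself without reading the JVM source; this small check (class name is mine) reports what the running JVM will let the heap grow to:

```java
public class MaxHeapDemo {
    public static void main(String[] args) {
        // Reports the limit the JVM will grow the heap to. With no -Xmx
        // flag, modern OpenJDK/HotSpot defaults this to roughly 1/4 of
        // physical RAM; the exact fraction is implementation-dependent
        // and not part of any specification.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %d MB%n", maxBytes / (1024 * 1024));
    }
}
```

Running it once with and once without an explicit -Xmx makes the default sizing policy obvious.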
Sadly, this behavior isn't documented anywhere. I found it by looking at the OpenJDK source code (arguments.cpp):
// If the maximum heap size has not been set with -Xmx,
// then set it as fraction of the size of physical memory,
// respecting the maximum and minimum sizes of the heap.
Java needs contiguous memory for its heap, so it reserves the maximum heap size as virtual memory up front. However, this doesn't consume physical memory and might not even consume swap. I would check how much your resident memory increases by.
There are different factors you need to take into account, especially in a language like Java. Java runs on a virtual machine, and garbage collection is handled by the Java runtime. There is considerable effort (I would imagine) involved in using the Java Invocation Interface to execute the native method within the native library: there has to be a means to allocate space on the stack, switch to native code, execute the native method, and switch back to the Java virtual machine. Perhaps, somehow, the space on the stack was not freed up; that's what I would be inclined to think.
Hope this helps,
Best regards,
Tom.
It is hard to say, but I think at the heart of the problem is that there are two heaps in your application which need to be maintained: the standard Java heap for Java object allocations (maintained by the JVM), and the C heap, which is maintained by calls to malloc/free. It is hard to say what is going on exactly without seeing some code.
Here is a suggestion for combating it.
Make the C code stop using the standard malloc call, and use an alternate version of malloc that grabs memory by mmap-ing /dev/zero. You can either modify an implementation of malloc from a library, or roll your own if you feel competent enough to do that.
I strongly suspect you will discover that your problem goes away after you do that.