java - can we do our own memory management?

Is it possible to perform memory management yourself? E.g. could we allocate a chunk of memory outside the heap space, so that it is not subject to GC, and take care of allocation/deallocation of objects from that chunk ourselves?
Some people have pointed to frameworks like jmalloc/EHCache. What I really want to understand is how they actually do it.
I am fine with a direct approach or even an indirect one (e.g. serializing the Java objects first).

You cannot allocate Java objects in a foreign memory location, but you can map memory that is, e.g., allocated in a native library into a direct ByteBuffer and use it from Java code.
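For illustration, pure Java can at least allocate such an off-heap buffer itself with ByteBuffer.allocateDirect; memory handed over from a native library via JNI's NewDirectByteBuffer looks the same from the Java side. A minimal sketch:

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Allocate 1 KiB outside the Java heap; only the small
        // ByteBuffer wrapper object itself lives on the heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);

        buf.putLong(0, 0xCAFEBABEL);   // absolute put, does not move the position
        long value = buf.getLong(0);

        System.out.println(buf.isDirect());          // true
        System.out.println(Long.toHexString(value)); // cafebabe
    }
}
```

The buffer's backing memory is freed when the ByteBuffer object itself is garbage-collected, so this is only "manual" management for the contents, not for the buffer's lifetime.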

You can use the off-heap memory approach.
Look, for example, at jmalloc.
This is also a useful link: Difference between on and off the heap.

I have a library which does this sort of thing. You create excerpts which can be used as rewritable objects or queued events. It keeps track of where objects start and end. The downside is that the library assumes you will cycle all the objects once per day or week, i.e. there is no clean-up as such. On the plus side it's very efficient, can be used in an event-driven manner, persisted, and shared between processes. You can have hundreds of GB of data while using only a few MB of heap.
https://github.com/peter-lawrey/Java-Chronicle
BTW: It supports ByteBuffers or using Unsafe for extra performance.

If you mean Java objects, then no, this isn't possible with standard VMs. You can always modify the VM if you want to experiment (Jikes RVM, for example, was made for this very purpose), but bear in mind that the result won't really be Java any more.
As for memory allocation for non-Java data structures, that is possible and is done regularly by native libraries; there is even some Java support for it (mentioned in the other answers), with the general caveat that you can very easily self-destruct with it.

Memory regions of Java program?

I haven't dived deep into how Java treats memory while a program is running, as I have been working at the application level. I recently had an instance where I needed to know, owing to performance issues in an application.
I have been aware of the "stack" and "heap" regions of memory and thought that was the whole memory model of a Java program. However, it turns out there is much more beyond that.
For example, I came across terms like Eden, S0, S1, Old memory and so on. I was never aware of these terms before.
As Java keeps changing, maybe some of these terms are no longer relevant as of Java 8.
Can anyone point me to where to get this information, and under what circumstances we need to know it? Are these parts of main memory, i.e. RAM?
Eden, S0, S1, Old memory and the other memory areas exist only in the context of a specific garbage collector implementation: generational collectors like G1 divide the heap into the mentioned areas, whereas non-generational collectors like ZGC do not.
Start by reviewing the main garbage collectors in the JVM:
ParNew
CMS
G1
ZGC / Shenandoah / Azul C4
and then try to understand related concepts:
Thread-local allocation buffers (TLAB)
Escape analysis
String constant pools, string interning, string de-duplication
Permanent generation vs Metaspace
Object layout, e.g. why a boolean does not take 1 bit (word tearing)
Native memory e.g. JNI or off-heap memory access
I don't believe that there is a single website that will explain the full JVM memory management approach.
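To see which of these areas your own JVM actually uses, the standard management beans list the memory pools the current collector maintains. The names shown (e.g. "G1 Eden Space", "G1 Old Gen") depend on the collector selected; this is a sketch for inspection, not an exhaustive diagnostic:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MemoryPools {
    public static void main(String[] args) {
        // Each pool corresponds to one region the current GC maintains.
        // With G1 you would typically see "G1 Eden Space", "G1 Survivor Space"
        // and "G1 Old Gen"; other collectors report other names.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-25s type=%s used=%d bytes%n",
                    pool.getName(), pool.getType(), pool.getUsage().getUsed());
        }
    }
}
```

Running the same program with a different collector (e.g. -XX:+UseZGC) shows a different, flatter set of pools, which illustrates that these areas belong to the collector, not to the Java specification.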
Java, as defined by the Java Language Specification and the Java Virtual Machine Specification, talks about the stack and the heap (as well as the method area).
Those are the things that are needed to describe, conceptually, what makes a Java Virtual Machine.
If you wanted to implement a JVM you'd need to implement those in some way. They are just as valid in Java 13 as they were back in Java 1. Nothing has fundamentally changed about how those work.
The other terms you mentioned (as well as "old gen", "new gen", ...) are memory areas used in the implementation of specific garbage collection mechanisms, specifically those implemented in the Oracle JDK / OpenJDK.
All of those areas are basically specific parts of the heap. The exact way the heap is split into those areas is up to the garbage collector to decide and knowing about them shouldn't be necessary unless you want to tweak your garbage collector.
Since garbage collectors change between releases and new garbage collector approaches are implemented regularly (as this is one of the primary ways to speed up JVMs), the concrete terms used here will change over the years.

Can a java object be accessed from native code and vice versa?

If an object X exists in the Java heap, and I knew the address of object X on the Java heap, would it be possible for native code to access that object directly in memory without involving JNI? And vice versa: if Java code knows the address of an object Y on the native heap, can Java access it without involving JNI?
To be more precise: are Java objects stored in memory the same way as native objects, or differently? If they are stored differently, wouldn't byte-array objects at least be stored the same way in Java and in native code?
Please provide your suggestions and references.
EDIT: Maybe this is the right question: why do objects need to be transferred from the Java heap to the native heap through JNI? Why can't a Java heap object be accessed from the native side directly?
Can Java code access native objects? No. Java code is managed by the JVM. (More precisely, it's bytecode, not Java code.) The specification of the JVM does not allow bytecode to access arbitrary memory. Bytecode can't even access arbitrary addresses on the JVM heap. For example, private fields can only be accessed by bytecode in the same class.
Can native code access JVM heap objects directly (without JNI)? Yes. Native code is running in the same process and address space as the JVM. As far as I know, on most operating systems and hardware platforms this means that native code can do whatever it wants in that address space.
Should native code access JVM heap objects directly? Definitely not.
First of all, the JVM specification does not specify the layout of objects on the JVM heap, not even of byte arrays. For example, the JVM may split the array into chunks and transparently translate addresses when bytecode uses the array. If you tried to write native code that accesses the array, you would have to re-implement that translation. Such code may work in one JVM implementation, but probably not in another, or maybe not even in a newer version of the same JVM, or in the same JVM when it runs with a different configuration. That's one reason why you have to use JNI: it gives native code a well-defined "view" of objects on the JVM heap.
Secondly, the JVM garbage collector can move around objects on the heap anytime. Native code should access JVM heap objects through handles. The garbage collector knows about it and updates the handles if necessary. Native code that tries to bypass a handle can never be sure if the object is still there.
A third problem is native code that directly modifies pointers between objects on the JVM heap. Depending on the garbage collector algorithm, this may cause all kinds of problems.
In a nutshell: You probably could access JVM heap objects from native code directly, but you almost certainly shouldn't.
Short answer: No.
Other than being a Java/C++ issue, this contradicts basic OS concepts. Since each process has its own address space, one process cannot reach the objects of another.
This limitation can be mitigated only if the process (that tries to reach another's memory) runs in kernel space and the underlying OS allows the operation, or if some facility like shared memory is involved. Even then, you would face the virtual address space problem: the same physical portion of memory is addressed with different values in different processes. That's why, even if you think you know the address of an object, that address is virtual and useless in other processes.
EDIT: If they are not in different processes, then the answer is definitely yes. Theoretically, you can implement your own JNI :).
A possible answer is using the APR (Apache Portable Runtime). Yes, I know it's JNI-based, but it has the concept of shared memory, so it's possible to bind to a shared memory space created by another program (and vice versa).
https://apr.apache.org/docs/apr/1.5/group__apr__shm.html
Outside of the JNI route, this does not seem possible.
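For completeness, a pure-Java equivalent of that shared-memory idea is to memory-map the same file from both processes with FileChannel.map. A sketch, assuming both sides agree on a (hypothetical) file name:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedMemoryDemo {
    public static void main(String[] args) throws IOException {
        // Hypothetical file name; any process mapping the same file
        // sees the same bytes.
        Path file = Path.of("shared.dat");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // READ_WRITE mapping; the file is grown to the mapped size.
            MappedByteBuffer shared = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            shared.putInt(0, 12345);            // visible to other mappers of the file
            System.out.println(shared.getInt(0));
        }
    }
}
```

This shares raw bytes, not Java objects: each side still has to agree on a layout or serialization format for whatever it stores in the mapped region.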

ChronicleMap (and more general off-heap data structures) implementation?

ChronicleMap on OpenHFT's repository on Github states in their documentation:
Chronicle Map implements the java.util.concurrent.ConcurrentMap, that stores
its data off the java heap.
I've built a compiler and contributed to a few off-shoot languages' compiler implementations. The ones I've worked with allocate everything on the stack (that's what's available during code generation). I've never worked on the JVM or the Java compiler, but I do know that typically only the heap and stack are available for allocating instances of classes, local variables, function parameters, etc.
Could someone please explain how we're able to write code where we tell the compiler to instantiate data structures such as ChronicleMap, have them available for garbage collection by the JVM (and kept track of by the JVM's general memory management features), and yet live off the heap?
I've read the simple construction documentation and the associated example. I see the how, but what exactly is going on in conjunction with the JVM, and why, is unclear.
An important thing to remember is that the javac compiler doesn't do much in the way of optimisation, nor does it give you any means of specifying where data is stored or how code should be optimised. (With a few obscure exceptions in Java 8 like @Contended.)
Java derives much of its extensibility from libraries, which generally operate at runtime. (There is often a build-time option as well.) A key thing to realise is that a Java program can generate and alter code while it is running, so in fact much of the smarts happen at runtime.
In the case of off-heap usage, you need a library which supports this functionality, and it will directly or indirectly use sun.misc.Unsafe (on most popular JVMs). This class allows you to do many things the language doesn't support, and it is really useful to have if you are a low-level library builder.
While off-heap memory is not directly managed by the GC, you can have proxy objects, e.g. a ByteBuffer, which carry a Cleaner, so that when these objects are GC-ed the off-heap memory associated with them is also cleaned up.
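A minimal sketch of the sun.misc.Unsafe route mentioned above. This is HotSpot-specific, discouraged for application code, and increasingly restricted in recent JDKs, so treat it as illustration only:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class OffHeapDemo {
    public static void main(String[] args) throws Exception {
        // Unsafe is not meant for application code; libraries grab it reflectively.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long addr = unsafe.allocateMemory(8); // raw, malloc-style allocation
        try {
            unsafe.putLong(addr, 42L);        // the GC knows nothing about this memory
            System.out.println(unsafe.getLong(addr));
        } finally {
            unsafe.freeMemory(addr);          // no GC here: you must free it yourself
        }
    }
}
```

Off-heap libraries build object-like views (maps, queues, flyweights) on top of exactly this kind of raw addressable memory, with the try/finally discipline replaced by Cleaners or explicit close() methods.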
Disclaimer, I wrote most of ChronicleMap.
The term "off-heap" refers to the ability to use "raw" memory buffers in Java. These may be regular memory buffers from the process address space, or memory-mapped files.
These buffers are "raw": you manage their content yourself; they are not managed by the garbage collector.

Using memcached with Java and ScheduledFuture objects

I've been playing around with caching objects (first by creating my own cache, which turned out to be a stable but very inefficient implementation) and then trying my hand at using Memcached.
Although memcached works beautifully, I've run into a problem.
How I'm using my objects is as follows:
I read data from a database into an object, then store the object in memcached.
Every couple of minutes I retrieve the object from memcached, retrieve any additional data from either the database or other objects in memcached, update the object with any new / relevant data, then store the object back into memcached.
Objects that need to be viewed are pulled from memcached, packaged and sent onto a client-side application for display.
This works very well, except when the number of objects I'm creating, storing, updating and viewing in memcached becomes high. The Java/Tomcat JVM doesn't seem to garbage-collect the objects I pulled out of memcached "fast enough", and the VM runs out of memory.
I'm limited to 8GB of memory (and would preferably like to bring that down to 4 if I can, using memcached), so my question is: is there a solution for preventing the JVM's memory usage from expanding so fast (or for tuning the garbage collector)?
(PS: I have considered using Guava cache from Google, but this limits my options in concurrency, e.g. if I have to restart Tomcat, and using both Guava and memcached seems like a duplication of sorts which I'd like to avoid if possible.)
--
Hein.
The garbage collector can't be "too slow" and run out of memory. Before throwing an OutOfMemoryError, the garbage collector is guaranteed to run. Only if it cannot free enough memory will the error be thrown.
You should use a profiler to see whether you have memory leaks, or if you're just hanging on to too many objects.
Afterwards you may want to tune the GC to improve performance, see for example here: GC tuning
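Before reaching for GC flags, it can help to watch heap usage from inside the application; a small sketch using the standard MemoryMXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapWatch {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // If "used" keeps climbing toward "max" between collections,
        // objects are being retained (a leak, or an oversized cache).
        System.out.printf("used=%d MB committed=%d MB max=%d MB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
    }
}
```

Logging this periodically (or attaching a profiler) distinguishes a genuine leak, which GC tuning cannot fix, from GC pressure, which it can.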

RAM memory reallocation - Windows and Linux

I am working on a project involving optimizing energy consumption within a system. Part of that project consists of allocating RAM based on locality, that is, allocating memory segments for a program as close to each other as possible. Is there a way I can know where exactly the memory I allocate is located (which memory chips), and is it possible to force allocation in a deterministic manner? I am interested in both Windows and Linux. Also, the project will be implemented in Java and .NET, so I am interested in managed APIs to achieve this.
[I am aware that this might not translate into direct energy consumption reduction but the project is supposed to be a proof of concept.]
You're working at the wrong level of abstraction.
Java (and presumably .NET) refers to objects using handles, rather than raw pointers. The underlying Java VM can move objects around in virtual memory at any time; the Java application doesn't see any difference.
Win32 and Linux applications (such as the Java VM) refer to memory using virtual addresses. There is a mapping from virtual address to a physical address on a RAM chip. The kernel can change this mapping at any time (e.g. if the data gets paged to disk then read back into a different memory location) and applications don't see any difference.
So if you're using Java and .NET, I wouldn't change your Java/.NET application to achieve this. Instead, I would change the underlying Linux kernel, or possibly the Java VM.
For a prototype, one approach might be to boot Linux with the mem= parameter to restrict the kernel's memory usage to less than the amount of memory you have, then look at whether you can mmap the spare memory (maybe by mapping /dev/mem as root?). You could then change all calls to malloc() in the Java VM to use your own special memory allocator, which allocates from that free space.
For a real implementation of this, you should do it by changing the kernel and keeping userspace compatibility. Look at the work that's been done on memory hotplug in Linux, e.g. http://lhms.sourceforge.net/
If you want to try this in a language with a big runtime you'd have to tweak the implementation of that runtime or write a DLL/shared object to do all the memory management for your sample application. At which point the overall system behaviour is unlikely to be much like the usual operation of those runtimes.
The simplest, cleanest test environment to detect the (probably small) advantages of locality of reference would be in C++ using custom allocators. This environment will remove several potential causes of noise in the runtime data (mainly the garbage collection). You will also lose any power overhead associated with starting the CLR/JVM or maintaining its operating state - which would presumably also be welcome in a project to minimise power consumption. You will naturally want to give the test app a processor core to itself to eliminate thread switching noise.
Writing a custom allocator to give you one of the preallocated chunks on your current page shouldn't be too tough, but given that to accomplish locality of reference in C/C++ you would ordinarily just use the stack it seems unlikely there will be one you can just find, download and use.
In C/C++, if you coerce a pointer to an int, this tells you the address. However, under Windows and Linux, this is a virtual address -- the operating system determines the mapping to physical memory, and the memory management unit in the processor carries it out.
So, if you care where your data is in physical memory, you'll have to ask the OS. If you just care if your data is in the same MMU block, then check the OS documentation to see what size blocks it's using (4KB is usual for x86, but I hear kids these days are playing around with 16M giant blocks?).
Java and .NET add a third layer to the mix, though I'm afraid I can't help you there.
Is pre-allocating in bigger chunks (than needed) an option at all? Will it defeat the original purpose?
I think that if you want such tight control over memory allocation you are better off using a compiled language such as C; the JVM isolates the actual implementation of the language from the hardware, chip selection for data storage included.
The approach also requires specialized hardware. Ordinary memory sticks and slot arrangements are designed to dissipate heat as evenly per chip as possible, for example by putting 1 bit of every bus word on each physical chip.
This is an interesting topic, although I think it is waaaaaaay beyond the capabilities of managed languages such as Java or .NET. One of the major principles of those languages is that you don't have to manage the memory, so they abstract it away from you. C/C++ gives you better control in terms of actually allocating memory, but even then, as noted previously, the operating system can do some hand-waving and indirection with memory allocation, making it difficult to determine how things are allocated together. Referring to the actual chips is even harder, and I would imagine hardware-dependent.
I would seriously consider using a prototyping board where you can code at the assembly level and control every memory allocation explicitly, without any interference from compiler optimizations or operating system security practices. That would give you the most meaningful results, as it would let you control every aspect of the program and determine definitively that any power consumption improvements are due to your algorithm rather than some invisible optimization performed by the compiler or operating system. I imagine this is some sort of research project (very intriguing), so spending ~$100 on a prototyping board would definitely be worth it in my opinion.
In .NET there is a COM interface exposed for profiling .NET applications that can give you detailed address information. I think you will need to combine this with some calls to the OS to translate virtual addresses though.
As zztop alluded to, the .NET CLR compacts memory every time a garbage collection is done. Large objects, however, are not compacted; these live on the large object heap, which can consist of many segments scattered around from OS calls to VirtualAlloc.
Here are a couple links on the profiling APIs:
http://msdn.microsoft.com/en-us/magazine/cc300553.aspx
David Broman's CLR Profiling API Blog
