Using memcached with Java and ScheduledFuture objects


I've been playing around with caching objects, first by creating my own cache (which turned out to be stable but very inefficient) and then by trying my hand at Memcached.
Although memcached works beautifully, I've run into a problem.
How I'm using my objects is as follows:
I read data from a database into an object, then store the object in memcached.
Every couple of minutes I retrieve the object from memcached, retrieve any additional data from either the database or other objects in memcached, update the object with any new / relevant data, then store the object back into memcached.
Objects that need to be viewed are pulled from memcached, packaged and sent onto a client-side application for display.
This works very well, except when the number of objects I'm creating-storing-updating-viewing in memcached becomes high. Java/Tomcat-jvm doesn't seem to be garbage-collecting "fast enough" on the objects I pulled out of memcached, and the vm runs out of memory.
I'm limited to 8GB of memory (and would preferably like to bring that down to 4 if I can, using memcached), so my question is: is there a way to prevent the JVM's memory usage from growing so fast (or to tune the garbage collector)?
(PS I have considered using Guava cache from Google, but this limits my options in concurrency, e.g. if I have to restart Tomcat, and using both Guava and memcached seems like a duplication of sorts which I'd like to avoid if possible.)
--
Hein.

The garbage collector can't be "too slow" and cause you to run out of memory. Before an OutOfMemoryError is thrown, the garbage collector is guaranteed to run. Only if it cannot free enough memory will the error be thrown.
You should use a profiler to see whether you have memory leaks, or if you're just hanging on to too many objects.
Afterwards you may want to tune the GC to improve performance, see for example here: GC tuning
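Before tuning anything, it can help to check how much work the GC is actually doing. Here is a minimal sketch using the standard java.lang.management API; the class name GcStats and its helper methods are my own, invented for illustration, and are not part of any answer above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {
    /** Sums the collection counts of all collectors; undefined (-1) values are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums the approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

If the accumulated time keeps climbing while the heap stays nearly full, that points to objects being retained rather than to a slow collector.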

Related

Should I call System.gc() in my java persistence project?

I'm using the Java Persistence API to develop standalone software. Recently I saw that memory usage keeps rising when I create objects from entity classes as well as JPAController classes. It seems that the objects stay in memory, since the memory allocated to the project doesn't decrease (e.g. 400mb ---> Create Object ---> 450mb ---> Stays at 450mb). Will this badly affect performance? Should I call System.gc() to remove these objects?

Generally System.gc() is not guaranteed to perform a garbage collection. Ultimately it is up to the JVM to decide. See the javadoc.
Have you observed what happens when you approach the memory limits of the JVM? Does garbage collection happen then? If not, and you receive an OutOfMemoryError, you are either retaining something longer than you need to or you actually need more heap allocated to your VM.
In any case, I believe System.gc() shouldn't be used to solve such problems.

In my opinion, the approach to the problem should be different. Actually the call to System.gc() is not a guarantee that it will free any memory at all; please see When does System.gc() do anything
If you can measure a problem in your memory allocation, either via jconsole, or making a post mortem analysis on the jvm dump, or whatever, then this is another problem. By gathering this information you will know what remains where in your memory regions, and then take actions in order to contain it.

The only way that this would negatively affect performance throughout the life of your program is if you want to keep these entities around forever but the old generation in your heap is smaller than the 450MB you specified. Assuming that you want to keep between 1 and 2 times those 450MB around forever, then with the JVM's default ratios, setting a parameter such as -Xmx2g will probably be fine. There are many more parameters for fine-tuning performance, but that's probably all the complexity you're looking for for now. If you want more details on heap tuning and really want to get into performance, check out this doc on Garbage Collection Tuning by Oracle. Alternatively, something to watch over lunch is a great YouTube video on GC tuning by Gil Tene.
But calling System.gc() probably won't do anything useful.

Explanation of how heap memory works (Java)

I read something on the internet but I'm not sure I get it.
I created a little website that's hosted on a cheap server with 128mb of heap memory.
When I start the server and visit the first page I get these values (obtained with Runtime.getRuntime().totalMemory() and Runtime.getRuntime().freeMemory()):
total memory:128974848
used memory:42376200
free memory:86598648
After viewing some pages (sometimes few, sometimes many) the total memory decreases. After viewing other pages it increases.
I can't understand if this is the way the heap memory behaves or if there's something wrong in my code.

I can't understand if this is the way the heap memory behaves
Yes, this is the gist of how any garbage-collection-based system would behave.
The principle behind garbage collection is that objects are not explicitly deleted or cleared when they are no longer needed. Rather, they simply go out of scope (that is, no live objects continue to refer to them). And for some non-zero period, these objects will still exist in memory.
Periodically, the garbage collector will run, and it will find these unreachable objects and actually remove them from memory. The exact details of how often the garbage collector runs, and how hard it tries to find garbage, depend entirely on the exact algorithm and any tuning parameters (and can make a significant difference to the performance of an application). In general though, they tend to run more often when there's less free memory left, and vice versa.
The gist is that even if your "real" memory usage is more or less static, you'll probably see a sawtooth type pattern, as temporary objects are created, and hang around for a while until the garbage collector runs and gets rid of them. That's entirely normal.
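To watch that sawtooth yourself, a rough sketch like the following can sample the heap while creating short-lived objects. It uses the same Runtime calls as the question; the class name HeapSample and the allocation sizes are arbitrary choices of mine.

```java
final class HeapSample {
    /** Used heap = what the JVM has reserved minus what is currently free inside it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        for (int round = 0; round < 10; round++) {
            // Create temporary objects that become garbage immediately.
            for (int i = 0; i < 100_000; i++) {
                byte[] temp = new byte[1024];
                temp[0] = (byte) i; // touch the array so the allocation is less likely to be optimized away
            }
            System.out.printf("round %d: used=%d total=%d max=%d%n",
                    round, usedBytes(), rt.totalMemory(), rt.maxMemory());
        }
    }
}
```

The exact numbers depend on the JVM, the collector and the heap settings, so expect the used value to rise and fall rather than follow a fixed pattern.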

That is pretty normal. Different web pages will use different amounts of memory based on the content they have in them. Where I would be worried is if you keep opening pages and memory never gets freed. That is a memory leak and is cause for concern if you are developing web browsers.
It is also possible that total memory may decrease because the server keeps information about who requests web pages out of convenience, so it can serve them faster in the future. An example would be Amazon's way of storing what products you search so it can suggest other items while you're browsing.

JSR 107 - Caching (JCache) vs CPU caching

I read about JSR 107 Caching (JCache).
I'm confused:
From what I know, every CPU manages its own cache memory (without any help from the OS).
So why do we need a Java caching handler (if the CPU manages its own cache)?
What am I missing here?
Thanks

This is about caching Java objects, like objects that are expensive to create, or need to be shared between multiple Java VMs. See https://jcp.org/en/jsr/detail?id=107
A cache is generally used to temporarily keep data between uses because it takes too much time or is plain impossible to recreate if you just throw it away between uses.
The CPU cache keeps data and instructions in case it has to access it again, because reading it from memory takes more time.
The JSR 107 cache works on a completely different level.

There is a difference between CPU caching and memory caching. This JCache would cache things in memory so you don't have to get it from an expensive resource like disk or over the network.
So CPUs have caches built into them so that they can avoid going to memory. CPUs commonly have three levels of cache and store around 8MB. CPU caching is not something you have to worry about because it is taken care of for you. If something isn't in the CPU cache then it has to go fetch it out of memory.
Caching in memory is to avoid going to disk or even slower resources, as I mentioned earlier. Programs have control over this mechanism. So if you want to avoid continuously asking your DB for some object, you can store it in memory and keep returning the same object. This can save quite a bit of time. As Thomas mentioned, JCache adds the ability to provide caching across JVMs. From what I understand, this means that different Java programs can share the same cache.
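To make the "store it in memory and keep returning the same object" idea concrete, here is a minimal in-process sketch. It is NOT the JSR 107 (javax.cache) API; SimpleObjectCache is a hypothetical class built on ConcurrentHashMap, and the loader stands in for an expensive lookup such as a database query.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SimpleObjectCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the expensive lookup, e.g. a DB query

    SimpleObjectCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only if the key is missing. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops an entry so the next get() reloads it. */
    void evict(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        SimpleObjectCache<Integer, String> cache = new SimpleObjectCache<>(id -> {
            loads.incrementAndGet(); // count how often the "expensive" path runs
            return "user-" + id;
        });
        cache.get(7);
        cache.get(7); // served from memory, loader not called again
        System.out.println("loader calls: " + loads.get()); // prints "loader calls: 1"
    }
}
```

Note that this sketch never evicts anything on its own, so a real cache would also need a size limit or expiry policy.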

Problems with Java garbage collector and memory

I am having a really weird issue with a Java application.
Essentially it is a web page that uses magnolia (a cms system), there are 4 instances available on production environment. Sometimes the CPU goes to 100% in a java process.
So, first approach was to make a thread dump, and check the offending thread, what I found was weird:
"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000ce37800 nid=0x7dcb runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000000ce39000 nid=0x7dcc runnable
Ok, that is pretty weird, I have never had a problem with the garbage collector like that, so the next thing we did was to activate JMX and using jvisualvm inspect the machine: the heap memory usage was really high (95%).
Naive approach: increase memory, so the problem takes longer to appear. Result: on the restarted server with increased memory (6 GB!) the problem appeared 20 hours after restart, while on other servers with less memory (4GB!) that had been running for 10 days, the problem still took a few more days to reappear. I also tried to take the Apache access log from the failing server and use JMeter to replay the requests against a local server in an attempt to reproduce the error... that did not work either.
Then I investigated the logs a little more and found these errors:
info.magnolia.module.data.importer.ImportException: Error while importing with handler [brightcoveplaylist]:GC overhead limit exceeded
at info.magnolia.module.data.importer.ImportHandler.execute(ImportHandler.java:464)
at info.magnolia.module.data.commands.ImportCommand.execute(ImportCommand.java:83)
at info.magnolia.commands.MgnlCommand.executePooledOrSynchronized(MgnlCommand.java:174)
at info.magnolia.commands.MgnlCommand.execute(MgnlCommand.java:161)
at info.magnolia.module.scheduler.CommandJob.execute(CommandJob.java:91)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Another example
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2894)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:407)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at java.lang.StackTraceElement.toString(StackTraceElement.java:175)
at java.lang.String.valueOf(String.java:2838)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at java.lang.Throwable.printStackTrace(Throwable.java:529)
at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)
at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:576)
at info.magnolia.module.templatingkit.functions.STKTemplatingFunctions.getReferencedContent(STKTemplatingFunctions.java:417)
at info.magnolia.module.templatingkit.templates.components.InternalLinkModel.getLinkNode(InternalLinkModel.java:90)
at info.magnolia.module.templatingkit.templates.components.InternalLinkModel.getLink(InternalLinkModel.java:66)
at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at freemarker.ext.beans.BeansWrapper.invokeMethod(BeansWrapper.java:866)
at freemarker.ext.beans.BeanModel.invokeThroughDescriptor(BeanModel.java:277)
at freemarker.ext.beans.BeanModel.get(BeanModel.java:184)
at freemarker.core.Dot._getAsTemplateModel(Dot.java:76)
at freemarker.core.Expression.getAsTemplateModel(Expression.java:89)
at freemarker.core.BuiltIn$existsBI._getAsTemplateModel(BuiltIn.java:709)
at freemarker.core.BuiltIn$existsBI.isTrue(BuiltIn.java:720)
at freemarker.core.OrExpression.isTrue(OrExpression.java:68)
Then I found out that this problem is due to the garbage collector using a ton of CPU while not being able to free much memory.
OK, so it is a MEMORY problem that manifests itself in the CPU, so if the memory usage problem is solved, the CPU should be fine. So I took a heap dump; unfortunately it was just too big to open (the file was 10GB). Anyway, I ran the server locally, loaded it a little, and took a heap dump. After opening it, I found something interesting:
There are a TON of instances of
AbstractReferenceMap$WeakRef ==> Takes 21.6% of the memory, 9 million instances
AbstractReferenceMap$ReferenceEntry ==> Takes 9.6% of the memory, 3 million instances
In addition, I have found a Map which seems to be used as a "cache" (horrible but true). The problem is that this map is NOT synchronized and it is shared among threads (being static). The issue could be not only concurrent writes but also that, without synchronization, there is no guarantee that thread A will see the changes made to the map by thread B. However, I am unable to figure out how to link this suspicious map to the leak using the Eclipse Memory Analyzer, as it does not use the AbstractReferenceMap; it is just a normal HashMap.
Unfortunately, we do not use those classes directly (obviously the code uses them, but not directly), so I seem to have hit a dead end.
Problems for me are
I cannot reproduce the error
I cannot figure out where the hell the memory is leaking (if that is the case)
Any ideas at all?

The 'no-op' finalize() methods should definitely be removed as they are likely to make any GC performance problems worse. But I suspect that you have other memory leak issues as well.
Advice:
First get rid of the useless finalize() methods.
If you have other finalize() methods, consider getting rid of them. (Depending on finalization to do things is generally a bad idea ...)
Use memory profiler to try to identify the objects that are being leaked, and what is causing the leakage. There are lots of SO Questions ... and other resources on finding leaks in Java code. For example:
How to find a Java Memory Leak
Troubleshooting Guide for Java SE 6 with HotSpot VM, Chapter 3.
Now to your particular symptoms.
First of all, the place where the OutOfMemoryErrors were thrown is probably irrelevant.
However, the fact that you have huge numbers of AbstractReferenceMap$WeakRef and AbstractReferenceMap$ReferenceEntry objects is a strong indication that something in your application or the libraries it is using is doing a huge amount of caching ... and that that caching is implicated in the problem. (The AbstractReferenceMap class is part of the Apache Commons Collections library. It is the superclass of ReferenceMap and ReferenceIdentityMap.)
You need to track down the map object (or objects) that those WeakRef and ReferenceEntry objects belong to, and the (target) objects that they refer to. Then you need to figure out what is creating it / them and figure out why the entries are not being cleared in response to the high memory demand.
Do you have strong references to the target objects elsewhere (which would stop the WeakRefs from being broken)?
Is / are the map(s) being used incorrectly so as to cause a leak? (Read the javadocs carefully ...)
Are the maps being used by multiple threads without external synchronization? That could result in corruption, which potentially could manifest as a massive storage leak.
Unfortunately, these are only theories and there could be other things causing this. And indeed, it is conceivable that this is not a memory leak at all.
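To illustrate the first theory (a strong reference elsewhere stops a weak reference from being cleared), here is a small sketch using the standard java.lang.ref.WeakReference as a stand-in for the Commons Collections WeakRef class. The names WeakRefDemo and strongHolder are hypothetical.

```java
import java.lang.ref.WeakReference;

final class WeakRefDemo {
    // Stands in for "a strong reference held somewhere else", e.g. a static cache.
    static Object strongHolder;

    /** Requests a GC and reports whether the weakly referenced object is still there. */
    static boolean survivesGcHint(WeakReference<?> ref) {
        System.gc(); // only a hint; the JVM may ignore it
        return ref.get() != null;
    }

    public static void main(String[] args) {
        strongHolder = new Object();
        WeakReference<Object> ref = new WeakReference<>(strongHolder);

        // Always true: the static field keeps the target strongly reachable.
        System.out.println("strongly held, survives: " + survivesGcHint(ref));

        strongHolder = null;
        // Typically false on HotSpot, but not guaranteed, since System.gc() is only a request.
        System.out.println("strong reference dropped, survives: " + survivesGcHint(ref));
    }
}
```

In a heap dump, the equivalent question is which GC roots still reach the objects that the WeakRef instances point to.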
Finally, there is your observation that the problem is worse when the heap is bigger. To me, this is still consistent with a Reference / cache-related issue.
Reference objects are more work for the GC than regular references.
When the GC needs to "break" a Reference, that creates more work; e.g. processing the Reference queues.
Even when that happens, the resulting unreachable objects still can't be collected until the next GC cycle at the earliest.
So I can see how a 6Gb heap full of References would significantly increase the percentage of time spent in the GC ... compared to a 4Gb heap, and that could cause the "GC Overhead Limit" mechanism to kick in earlier.
But I reckon that this is an incidental symptom rather than the root cause.

With a difficult debugging problem, you need to find a way to reproduce it. Only then will you be able to test experimental changes and determine if they make the problem better or worse. In this case, I'd try writing loops that rapidly create & delete server connections, that create a server connection and rapidly send it memory-expensive requests, etc.
After you can reproduce it, try reducing the heap size to see if you can reproduce it faster. But do that second since a small heap might not hit the "GC overhead limit" which means the GC is spending excessive time (98% by some measure) trying to recover memory.
For a memory leak, you need to figure out where in the code it's accumulating references to objects. E.g. does it build a Map of all incoming network requests?
A web search https://www.google.com/search?q=how+to+debug+java+memory+leaks shows many helpful articles on how to debug Java memory leaks, including tips on using tools like the Eclipse Memory Analyzer that you're using. A search for the specific error message https://www.google.com/search?q=GC+overhead+limit+exceeded is also helpful.
The no-op finalize() methods shouldn't cause this problem but they may well exacerbate it. The doc on finalize() reveals that having a finalize() method forces the GC to twice determine that the instance is unreferenced (before and after calling finalize()).
So once you can reproduce the problem, try deleting those no-op finalize() methods and see if the problem takes longer to reproduce.
It's significant that there are many AbstractReferenceMap$WeakRef instances in memory. The point of a weak reference is to refer to an object without forcing it to stay in memory. AbstractReferenceMap is a Map that lets one make the keys and/or values be weak references or soft references. (The point of a soft reference is to try to keep an object in memory but let the GC free it when memory gets low.) Anyway, all those WeakRef instances in memory are probably exacerbating the problem but shouldn't keep the referenced Map keys/values in memory. What are they referring to? What else is referring to those objects?

Try a tool that locates the leaks in your source code such as plumbr

There are a number of possibilities, perhaps some of which you've explored.
It's definitely a memory leak of some sort.
If your server has user sessions, and your user sessions aren't expiring or being disposed of properly when the user is inactive for more than X minutes/hours, you will get a buildup of used memory.
If you have one or more maps of something that your program generates, and you don't clear the map of old/unneeded entries, you could again get a buildup of used memory. For example, I once considered adding a map to keep track of process threads so that a user could get info from each thread, until my boss pointed out that at no point were finished threads getting removed from the map, so if the user stayed logged in and active, they would hold onto those threads forever.
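One way to keep such a map from growing forever is to cap its size. The sketch below uses the removeEldestEntry hook of java.util.LinkedHashMap; the class name BoundedLruMap and the limit are illustrative, and the map is not thread-safe on its own (wrap it with Collections.synchronizedMap if threads share it).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruMap(int maxEntries) {
        super(16, 0.75f, true); // true = access order, so the eldest entry is the least recently used
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLruMap<String, Integer> map = new BoundedLruMap<>(2);
        map.put("a", 1);
        map.put("b", 2);
        map.get("a");    // "a" is now the most recently used entry
        map.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(map.keySet()); // prints [a, c]
    }
}
```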
You should try doing a load test on a non-production server where you simulate normal usage of your app by large numbers of users. Maybe even limit the server's memory even lower than usual.
Good luck, memory issues are a pain to track down.

You say that you have already tried jvisualvm, to inspect the machine. Maybe, try it again, like this:
This time look at the "Sampler -> Memory" tab.
It should tell you which (types of) objects occupy the most memory.
Then find out where such objects are usually created and removed.

A lot of times 'weird' errors can be caused by java agents plugged into the JVM. If you have any agents running (e.g. jrebel/liverebel, newrelic, jprofiler), try running without them first.
Weird things can also happen when running JVM with non-standard parameters (-XX); certain combinations are known to cause problems; which parameters are you using currently?
Memory leak can also be in Magnolia itself, have you tried googling "magnolia leak"? Are you using any 3rd-party magnolia modules? If possible, try disabling/removing them.
The problem might be connected to just one part of your You can try reproducing the problem by "replaying" your access logs on your staging/development server.
If nothing else works, if it were me, I would do the following:
- trying to replicate the problem on an "empty" Magnolia instance (without any of my code)
- trying to replicate the problem on an "empty" Magnolia instance (without 3rd party modules)
- trying to upgrade all software (magnolia, 3rd-party modules, JVM)
- finally try to run the production site with YourKit and try to find the leak

My guess is that you have an automated import running which invokes some instance of ImportHandler. That handler is configured to make a backup of all the nodes it's going to update (I think this is the default option), and since you probably have a lot of data in your data type, and since all of this is done in one session, you run out of memory. Try to find out which import job it is and disable backup for it.
HTH,
Jan

Your memory leak may be coming from your arrays. If you logically remove an object from an array (for example by decrementing a size counter) but leave its reference in the slot, the object stays reachable and the garbage collector cannot collect it. My advice: when you remove an object from an array, set its former slot to null so that the garbage collector can reclaim the object. I doubt this is your exact problem, but it is good to know about and worth checking.
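A minimal sketch of that advice, loosely modeled on the well-known stack example for obsolete references (the class name ObjectStack is hypothetical):

```java
import java.util.Arrays;

final class ObjectStack {
    private Object[] elements = new Object[16];
    private int size;

    void push(Object element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = element;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object result = elements[--size];
        elements[size] = null; // clear the slot so the array no longer keeps the object reachable
        return result;
    }

    int size() {
        return size;
    }
}
```

Without the null assignment in pop(), every popped object would stay reachable through the array until its slot happened to be overwritten.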
It also helps to set a reference to null once you no longer need the object. Don't rely on finalize() for cleanup: it is unreliable, and the garbage collector may never call it. The better approach is to call an explicit cleanup method yourself, so that you know the cleanup has actually run. As Joshua Bloch says in Effective Java, 2nd edition, Item 7, page 27 ("Avoid finalizers"): "Finalizers are unpredictable, often dangerous, and generally unnecessary." You can see the section here.
Because there is no code displayed, I cannot see if any of these methods can be useful, but it is still worth knowing these things. Hope these tips help you!

As recommended above, I'd get in touch with the devs of Magnolia, but meanwhile:
You are getting this error because the GC doesn't collect much on a run:
"The concurrent collector will throw an OutOfMemoryError if too much time is being spent in garbage collection: if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, an OutOfMemoryError will be thrown."
Since you can't change the implementation, I would recommend changing the GC configuration so that it runs less frequently and is less likely to fail in this way.
Here is an example config just to get you started on the parameters; you will have to find your own sweet spot. The GC logs will probably help with that.
My VM params are as follow:
-Xms6G
-Xmx6G
-XX:MaxPermSize=1G
-XX:NewSize=2G
-XX:MaxTenuringThreshold=8
-XX:SurvivorRatio=7
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled
-XX:CMSInitiatingOccupancyFraction=60
-XX:+HeapDumpOnOutOfMemoryError
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-Xloggc:logs/gc.log
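After changing flags like these, it is worth confirming that the running JVM actually picked them up. A small sketch, assuming nothing beyond the standard library (the class name JvmSettings is made up):

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JvmSettings {
    /** Maximum heap the JVM will try to use, in MiB (reflects -Xmx when it is set). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** The arguments passed to the JVM itself, such as -Xmx or -XX options. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        for (String argument : inputArguments()) {
            System.out.println("jvm arg: " + argument);
        }
    }
}
```

If an option is missing from the output, the JVM either rejected it at startup or the startup script never passed it through.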

java - can we do our own memory management?

Is it possible to do the memory management yourself? E.g. we allocate a chunk of memory outside the heap space so that it is not subject to GC, and we take care of allocating/deallocating objects in this chunk of memory ourselves.
Some people have pointed to frameworks like Jmalloc/EHcache. What I really want to understand is how they actually do it.
I am fine with a direct approach or even an indirect approach (e.g. first serializing the Java objects).

You can not allocate Java objects in a foreign memory location, but you can map memory which is e.g. allocated in a native library into a direct ByteBuffer to use it from Java code.
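As a rough illustration of working with memory outside the Java heap, the sketch below stores long values in a direct ByteBuffer. Note the assumption: it uses ByteBuffer.allocateDirect, so the JVM allocates the off-heap block itself rather than mapping memory from a native library, and the class name OffHeapLongs is hypothetical.

```java
import java.nio.ByteBuffer;

final class OffHeapLongs {
    private final ByteBuffer buffer;

    /** Reserves room for the given number of long values outside the Java heap. */
    OffHeapLongs(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    /** Writes a value at an absolute index; we manage the layout ourselves. */
    void set(int index, long value) {
        buffer.putLong(index * Long.BYTES, value);
    }

    /** Reads the value stored at an absolute index. */
    long get(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    boolean isDirect() {
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongs values = new OffHeapLongs(1024);
        values.set(3, 42L);
        System.out.println(values.get(3)); // prints 42
    }
}
```

Only the small ByteBuffer wrapper lives on the heap; the data does not add to the amount of heap the GC has to scan. Storing real objects this way means serializing them into bytes yourself.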

You can use the off-heap memory approach.
Look at jmalloc, for example.
This is also a useful link: Difference between on and off the heap

I have a library which does this sort of thing. You create excerpts which can be used as rewritable objects or queued events. It keeps track of where objects start and end. The downside is that the library assumes you will cycle all the objects once per day or week, i.e. there is no clean-up as such. On the plus side, it's very efficient, can be used in an event-driven manner, can be persisted, and can be shared between processes. You can have hundreds of GB of data while using only a few MB of heap.
https://github.com/peter-lawrey/Java-Chronicle
BTW: It supports ByteBuffers or using Unsafe for extra performance.

If you mean Java objects, then no, this isn't possible with standard VMs. You can always modify the VM if you want to experiment (Jikes RVM, for example, was made for this very purpose), but bear in mind that the result won't really be Java any more.
As for memory allocation for non-Java data structures, that is possible and is done regularly by native libraries, and there is even some Java support for it (mentioned in the other answers), with the general caveat that you can very easily shoot yourself in the foot with it.
