Memory leakage in cluster environment in Tomcat - Java

I am getting an OutOfMemoryError in my app. I took a heap dump and analyzed it with MAT. While analyzing my app's memory usage I found the following suspects, but I am unable to understand the main cause behind them.
Please help me understand these leak suspects and what the relevant solution for them is.
Suspect 1
The thread org.apache.tomcat.util.threads.TaskThread # 0x2bdf5ff8 "ajp-bio-9002"-exec-5 keeps local variables with total size 113,973,288 (50.72%) bytes.
The memory is accumulated in one instance of "org.apache.tomcat.util.threads.TaskThread" loaded by "org.apache.catalina.loader.StandardClassLoader # 0x293b4488".
Thread Stack
"ajp-bio-9002"-exec-5
at java.util.Arrays.copyOf([CI)[C (Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(I)V (AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(C)Ljava/lang/AbstractStringBuilder; (AbstractStringBuilder.java:572)
at java.lang.StringBuffer.append(C)Ljava/lang/StringBuffer; (StringBuffer.java:320)
at org.apache.myfaces.renderkit.html.util.ReducedHTMLParser.consumeString(C)Ljava/lang/String; (ReducedHTMLParser.java:303)
at org.apache.myfaces.renderkit.html.util.ReducedHTMLParser.consumeAttrValue()Ljava/lang/String; (ReducedHTMLParser.java:327)
at org.apache.myfaces.renderkit.html.util.ReducedHTMLParser.parse()V (ReducedHTMLParser.java:579)
at org.apache.myfaces.renderkit.html.util.ReducedHTMLParser.parse(Ljava/lang/CharSequence;Lorg/apache/myfaces/renderkit/html/util/CallbackListener;)V (ReducedHTMLParser.java:66)
at org.apache.myfaces.renderkit.html.util.DefaultAddResource.parseResponse(Ljavax/servlet/http/HttpServletRequest;Ljava/lang/String;Ljavax/servlet/http/HttpServletResponse;)V (DefaultAddResource.java:699)
at org.apache.myfaces.webapp.filter.ExtensionsFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (ExtensionsFilter.java:157)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (AuthenticatorBase.java:462)
at org.apache.catalina.core.StandardHostValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (StandardEngineValve.java:118)
at org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (JvmRouteBinderValve.java:218)
at org.apache.catalina.ha.tcp.ReplicationValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (ReplicationValve.java:333)
at org.apache.catalina.connector.CoyoteAdapter.service(Lorg/apache/coyote/Request;Lorg/apache/coyote/Response;)V (CoyoteAdapter.java:395)
at org.apache.coyote.ajp.AjpProcessor.process(Lorg/apache/tomcat/util/net/SocketWrapper;)Lorg/apache/tomcat/util/net/AbstractEndpoint$Handler$SocketState; (AjpProcessor.java:301)
at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(Lorg/apache/tomcat/util/net/SocketWrapper;Lorg/apache/tomcat/util/net/SocketStatus;)Lorg/apache/tomcat/util/net/AbstractEndpoint$Handler$SocketState; (AjpProtocol.java:183)
at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(Lorg/apache/tomcat/util/net/SocketWrapper;)Lorg/apache/tomcat/util/net/AbstractEndpoint$Handler$SocketState; (AjpProtocol.java:169)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run()V (JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava/lang/Runnable;)V (ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:908)
at java.lang.Thread.run()V (Thread.java:662)
Suspect 2
One instance of "java.lang.StringBuffer" loaded by "<system class loader>" occupies 59,216,088 (26.35%) bytes. The instance is referenced by org.apache.myfaces.renderkit.html.util.ReducedHTMLParser # 0x276990e8, loaded by "org.apache.catalina.loader.WebappClassLoader # 0x29592038". The memory is accumulated in one instance of "char[]" loaded by "<system class loader>".

You can go to the "dominator_tree" view in Memory Analyzer (MAT) and expand the TaskThread. This shows the retained heap of all the objects held by that TaskThread, which may help you trace the issue back to the code in your application that is causing it.

It looks like org.apache.myfaces.renderkit.html.util.ReducedHTMLParser is the culprit. The Javadoc for ReducedHTMLParser explains how it works: it buffers the entire HTML response in memory and then processes it. It looks like it is trying - and failing - to process a very large response.
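If that parser is the problem, one possible mitigation - assuming you are using MyFaces Tomahawk, whose AddResource implementation is pluggable via a context parameter - is to switch from the buffering DefaultAddResource to the streaming variant, so the response is no longer accumulated in a StringBuffer. A sketch for web.xml, not a verified fix for your setup:
<context-param>
    <!-- Assumption: Tomahawk is in use; StreamingAddResource streams the response
         instead of buffering the whole thing like DefaultAddResource does -->
    <param-name>org.apache.myfaces.ADD_RESOURCE_CLASS</param-name>
    <param-value>org.apache.myfaces.renderkit.html.util.StreamingAddResource</param-value>
</context-param>
Independently of that, it is worth checking why the response is so large in the first place - a buffered response of 50 MB+ suggests a page that should be paginated or streamed.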

Related

Java Native Memory 'Other' section consumes a lot of memory

Prerequisites
The application runs in a Docker container with Java openjdk version "13.0.1" and these options:
-Xmx6G -XX:MaxHeapFreeRatio=30 -XX:MinHeapFreeRatio=10 -XX:+AlwaysActAsServerClassMachine -XX:+UseContainerSupport -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -XX:HeapDumpPath=/.../crush.hprof -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics -Xlog:gc*:file=/var/log/.../log.gc.log:time::filecount=5,filesize=100000
When I run jcmd 1 VM.native_memory, I get this:
Total: reserved=9081562KB, committed=1900002KB
- Java Heap (reserved=6291456KB, committed=896000KB)
(mmap: reserved=6291456KB, committed=896000KB)
- Class (reserved=1221794KB, committed=197034KB)
(classes #34434)
( instance classes #32536, array classes #1898)
(malloc=7330KB #121979)
(mmap: reserved=1214464KB, committed=189704KB)
( Metadata: )
( reserved=165888KB, committed=165752KB)
( used=161911KB)
( free=3841KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=23952KB)
( used=21501KB)
( free=2451KB)
( waste=0KB =0.00%)
- Thread (reserved=456661KB, committed=50141KB)
(thread #442)
(stack: reserved=454236KB, committed=47716KB)
(malloc=1572KB #2654)
(arena=853KB #882)
- Code (reserved=255027KB, committed=100419KB)
(malloc=7343KB #26005)
(mmap: reserved=247684KB, committed=93076KB)
- GC (reserved=316675KB, committed=116459KB)
(malloc=47311KB #70516)
(mmap: reserved=269364KB, committed=69148KB)
- Compiler (reserved=1429KB, committed=1429KB)
(malloc=1634KB #2498)
(arena=18014398509481779KB #5)
- Internal (reserved=2998KB, committed=2998KB)
(malloc=2962KB #5480)
(mmap: reserved=36KB, committed=36KB)
- Other (reserved=446581KB, committed=446581KB)
(malloc=446581KB #368)
- Symbol (reserved=36418KB, committed=36418KB)
(malloc=34460KB #906917)
(arena=1958KB #1)
- Native Memory Tracking (reserved=18786KB, committed=18786KB)
(malloc=587KB #8291)
(tracking overhead=18199KB)
- Shared class space (reserved=11180KB, committed=11180KB)
(mmap: reserved=11180KB, committed=11180KB)
- Arena Chunk (reserved=19480KB, committed=19480KB)
(malloc=19480KB)
- Logging (reserved=7KB, committed=7KB)
(malloc=7KB #271)
- Arguments (reserved=17KB, committed=17KB)
(malloc=17KB #471)
- Module (reserved=1909KB, committed=1909KB)
(malloc=1909KB #11057)
- Safepoint (reserved=8KB, committed=8KB)
(mmap: reserved=8KB, committed=8KB)
- Synchronization (reserved=1136KB, committed=1136KB)
(malloc=1136KB #6628)
Here we can see that the 'Other' section consumes 446581 KB, whereas the total committed memory is 1900002 KB.
So the 'Other' section takes up 23% of all committed memory!
Also, this memory is not freed while the application is running.
Because of this I changed the flag -XX:NativeMemoryTracking=summary to -XX:NativeMemoryTracking=detail to check where the memory is allocated, and found these two strange blocks of memory:
[0x00007f8db4b32bae] Unsafe_AllocateMemory0+0x8e
[0x00007f8da416e7db]
(malloc=298470KB type=Other #286)
[0x00007f8db4b32bae] Unsafe_AllocateMemory0+0x8e
[0x00007f8d9b84bc90]
(malloc=148111KB type=Other #82)
Analysis
I tried to use async-profiler to check the Unsafe_AllocateMemory0 event.
I ran async-profiler as an agent like this:
java -agentpath:/async-profiler/build/libasyncProfiler.so=start,event=itimer,Unsafe_AllocateMemory0,file=/var/log/.../unsafe_allocate_memory.html
And got this flamegraph: https://i.stack.imgur.com/PbE5D.png
I also tried to profile the events malloc, mmap and mprotect. malloc showed the same flamegraph as Unsafe_AllocateMemory0, but the flamegraphs for mmap and mprotect were empty.
I thought the problem might be related to the C2 compiler, so I disabled it, but after a restart nothing changed - the 'Other' section still occupied a lot of memory. Moreover, this application is long-living and I'm not sure that disabling C2 is a good idea.
I tried to use jeprof to check which part of the code calls os.malloc.
I ran the Java application like this:
LD_PRELOAD=/usr/local/lib/libjemalloc.so MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17 exec java -jar /srv/app/myapp.jar
After 10+ minutes I used jeprof and got this: https://i.stack.imgur.com/45adD.gif
And again there are two blocks of memory that occupy a lot of native memory.
Result
I cannot find the place that allocates so much memory.
Can anyone recommend how to spot the root cause of this problem, and what steps to take to avoid it?
UPDATE 1
Thanks to apangin I have finally found the place where so much memory is occupied!
It's related to Redisson/Lettuce, which use Netty under the hood: flamegraph
I used the experimental nativemem mode and ran java like this:
java -agentpath:/async-profiler/build/libasyncProfiler.so=start,event=nativemem,file=/var/log/.../profile.jfr -jar /srv/app/myapp.jar
Your async-profiler arguments seem wrong.
Change event=itimer,Unsafe_AllocateMemory0 to event=Unsafe_AllocateMemory0.
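For example, keeping the rest of the original command as-is:
java -agentpath:/async-profiler/build/libasyncProfiler.so=start,event=Unsafe_AllocateMemory0,file=/var/log/.../unsafe_allocate_memory.html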
async-profiler also has an experimental nativemem mode specifically for finding native memory leaks; see https://github.com/jvm-profiling-tools/async-profiler/discussions/491 for the details.
The 'Other' section in NMT typically includes off-heap memory allocated with Unsafe.allocateMemory - in particular, direct ByteBuffers.
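To illustrate with a self-contained sketch (not code from the application above): allocations like the following are what end up counted under NMT's 'Other' section.
import java.nio.ByteBuffer;

public class OtherSectionDemo {
    public static void main(String[] args) throws InterruptedException {
        // ByteBuffer.allocateDirect goes through Unsafe.allocateMemory under the hood,
        // so NMT records these 256 MB under 'Other', not under 'Java Heap'.
        ByteBuffer buf = ByteBuffer.allocateDirect(256 * 1024 * 1024);
        buf.put(0, (byte) 1); // NMT counts the malloc'd size immediately; RSS only grows as pages are touched
        // Run with:  java -XX:NativeMemoryTracking=summary OtherSectionDemo
        // Inspect:   jcmd <pid> VM.native_memory summary   (see the 'Other' line)
        Thread.sleep(60_000); // keep the process alive long enough to attach jcmd
    }
}
Netty (used by Redisson/Lettuce, as found in UPDATE 1) makes exactly this kind of allocation for its pooled direct buffers; its footprint can typically be capped with the io.netty.maxDirectMemory system property.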

Memory leak in Micrometer - Java

I analyzed my heap dump by uploading it to an online website, and I am getting this:
Java Static io.micrometer.shaded.io.netty.util.internal.ObjectCleaner.LIVE_SET
Object might be causing memory leak.
115,297K (23.4%) of char[]
Java Static
io.micrometer.shaded.io.netty.util.internal.ObjectCleaner.LIVE_SET
io.micrometer.shaded.io.netty.util.internal.ConcurrentSet.map
{java.util.concurrent.ConcurrentHashMap}.keys
io.micrometer.shaded.io.netty.util.internal.ObjectCleaner$AutomaticCleanerReference.referent
org.apache.tomcat.util.threads.TaskThread.threadLocals
j.l.ThreadLocal$ThreadLocalMap.table
j.l.ThreadLocal$ThreadLocalMap$Entry[]
j.l.ThreadLocal$ThreadLocalMap$Entry.value
j.l.r.SoftReference.referent
com.fasterxml.jackson.core.util.BufferRecycler._charBuffers
char[][]
Any suggestions regarding this?

How does a JVM process allocate its memory?

I have a little gap in understanding how a JVM process allocates its own memory. As far as I know,
RSS = Heap size + MetaSpace + OffHeap size
where OffHeap consists of thread stacks, direct buffers, mapped files (libraries and JARs) and the JVM code itself.
At the moment I'm trying to analyze my Java application (Spring Boot + Infinispan), whose RSS is 779M (it runs in a Docker container, so pid 1 is ok):
[ root@daf5a5ae9bb7:/data ]$ ps -o rss,vsz,sz 1
RSS VSZ SZ
798324 6242160 1560540
According to jvisualvm, the committed Heap size is 374M
and the Metaspace size is 89M.
In other words, I want to explain 779M - (374M + 89M) = 316M of OffHeap memory.
My app has (on average) 36 live threads.
Each of these threads consumes 1M:
[ root@fac6d0dfbbb4:/data ]$ java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
intx CompilerThreadStackSize = 0
intx ThreadStackSize = 1024
intx VMThreadStackSize = 1024
So here we can add 36M.
The only place where the app uses DirectBuffers is NIO. As far as I can see from JMX, it doesn't consume a lot of resources - only 98K.
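(For reference, the same numbers can be read programmatically via the standard java.lang.management API - a minimal sketch:)
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class BufferPools {
    public static void main(String[] args) {
        // Prints the 'direct' and 'mapped' buffer pools that JMX exposes
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}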
The last step is mapped libs and jars. But according to pmap (full output)
[ root@daf5a5ae9bb7:/data ]$ pmap -x 1 | grep ".so.*" | awk '{ sum+=$3} END {print sum}'
12896K
plus
[ root@daf5a5ae9bb7:/data ]$ pmap -x 1 | grep ".jar" | awk '{ sum+=$3} END {print sum}'
9720K
we only have 20M here.
Hence, we still have to explain 316M - (36M + 20M) = 260M :(
Does anyone have any idea what I missed?
Approach:
You may want to use Java HotSpot Native Memory Tracking (NMT).
This may give you an exact list of the memory allocated by the JVM, split up into the different areas: heap, classes, threads, code, GC, compiler, internal, symbols, memory tracking, pooled free chunks, and unknown.
Usage:
You can start your application with -XX:NativeMemoryTracking=summary.
Observations of the current heap can be done with jcmd <pid> VM.native_memory summary.
Where to find jcmd / the pid:
On a default OpenJDK installation on Ubuntu, jcmd can be found at /usr/bin/jcmd.
By just running jcmd without any parameter, you get a list of running Java applications.
user@pc:~$ /usr/bin/jcmd
5169 Main <-- 5169 is the pid
Output:
You will then receive a complete overview of your heap, looking something like the following:
Total: reserved=664192KB, committed=253120KB <--- total memory tracked by Native Memory Tracking
Java Heap (reserved=516096KB, committed=204800KB) <--- Java Heap
(mmap: reserved=516096KB, committed=204800KB)
Class (reserved=6568KB, committed=4140KB) <--- class metadata
(classes #665) <--- number of loaded classes
(malloc=424KB, #1000) <--- malloc'd memory, #number of malloc
(mmap: reserved=6144KB, committed=3716KB)
Thread (reserved=6868KB, committed=6868KB)
(thread #15) <--- number of threads
(stack: reserved=6780KB, committed=6780KB) <--- memory used by thread stacks
(malloc=27KB, #66)
(arena=61KB, #30) <--- resource and handle areas
Code (reserved=102414KB, committed=6314KB)
(malloc=2574KB, #74316)
(mmap: reserved=99840KB, committed=3740KB)
GC (reserved=26154KB, committed=24938KB)
(malloc=486KB, #110)
(mmap: reserved=25668KB, committed=24452KB)
Compiler (reserved=106KB, committed=106KB)
(malloc=7KB, #90)
(arena=99KB, #3)
Internal (reserved=586KB, committed=554KB)
(malloc=554KB, #1677)
(mmap: reserved=32KB, committed=0KB)
Symbol (reserved=906KB, committed=906KB)
(malloc=514KB, #2736)
(arena=392KB, #1)
Memory Tracking (reserved=3184KB, committed=3184KB)
(malloc=3184KB, #300)
Pooled Free Chunks (reserved=1276KB, committed=1276KB)
(malloc=1276KB)
Unknown (reserved=33KB, committed=33KB)
(arena=33KB, #1)
This gives a detailed overview of the different memory areas used by the JVM, and also shows the reserved and committed memory.
I don't know of a technique that gives you a more detailed memory-consumption breakdown.
Further reading:
You can also use -XX:NativeMemoryTracking=detail in combination with further jcmd commands. A more detailed explanation can be found in the Java Platform, Standard Edition Troubleshooting Guide - 2.6 The jcmd Utility. You can check the possible commands via "jcmd <pid> help".
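For hunting growth over time specifically, the baseline/diff workflow is useful:
jcmd <pid> VM.native_memory baseline
(let the application run for a while, then)
jcmd <pid> VM.native_memory summary.diff
The diff output shows how each area changed relative to the baseline, which makes a slowly growing section stand out.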

Java heap error while running R code

I'm trying to do feature selection using the chi.squared function from the FSelector package in R.
My dataset is about 132 variables x 192,000 rows.
chisquared.fs <- chi.squared(fo,df)
where fo contains the class variable: class ~.
I'm getting this error while running the code:
Error in .jcall("weka/filters/Filter", "Lweka/core/Instances;", "useFilter", :
  java.lang.OutOfMemoryError: Java heap space
I know it is a Java heap-space error, and I have already tried this before calling any libraries:
options( java.parameters = "-Xmx6g")
Any pointers would be really welcome.
Update: I had done what @copeg suggested, but without restarting R. I restarted R, put the options statement at the beginning before loading the libraries, and it worked. Thanks for your suggestions.
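For anyone hitting the same issue, the working order of operations looks like this (a minimal sketch; the key point is that the JVM options must be set before any rJava-backed package starts the JVM):
# In a FRESH R session - the heap size is ignored once the JVM is already running
options(java.parameters = "-Xmx6g")
library(FSelector)                    # loads rJava and starts the JVM with -Xmx6g
chisquared.fs <- chi.squared(fo, df)  # now runs with the larger heap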

Hadoop - MultipleOutputs.write - OutOfMemory - Java heap space

I am writing a Hadoop job which processes many files and creates multiple files from each file. I am using "MultipleOutputs" to write them. It works fine for a smaller number of files, but I get the following error for a large number of files.
The exception is raised on MultipleOutputs.write(key, value, outputPath);
I have tried increasing the ulimit and -Xmx, but to no avail.
2013-01-15 13:44:05,154 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hdfs.DFSOutputStream$Packet.<init>(DFSOutputStream.java:201)
at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1423)
at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:161)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:136)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:125)
at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:90)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:99)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:386)
at com.demoapp.collector.MPReducer.reduce(MPReducer.java:298)
at com.demoapp.collector.MPReducer.reduce(MPReducer.java:28)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:595)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:433)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Any ideas?
If it doesn't work with a large number of files, it's probably because you've hit the maximum number of files that can be served by a datanode. This can be controlled with a property called dfs.datanode.max.xcievers in hdfs-site.xml.
As recommended here, you should bump its value to something that will allow your job to run correctly; they recommend 4096:
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
I increased the number of reduce tasks from 1 to 8 and increased the values of io.sort.mb and mapred.task.timeout.
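In the driver, those changes might look something like this (a sketch against the old MRv1 API that the stack trace above suggests; the property values are illustrative):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("io.sort.mb", 200);                // larger map-side sort buffer (example value)
        conf.setLong("mapred.task.timeout", 1200000L); // 20-minute task timeout (example value)
        Job job = new Job(conf, "multiple-outputs-job");
        job.setNumReduceTasks(8);                      // 8 reducers instead of 1
        // ... set mapper, reducer and input/output formats as in the original job ...
    }
}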
Details
This link was helpful: Cloudera blog
