I'm tuning our product for G1GC, and as part of that testing, I'm experiencing regular segfaults on my Spark Workers, which of course causes the JVM to crash. When this happens, the Spark Worker/Executor JVM automagically restarts itself, which then overwrites the GC logs that were written for the previous Executor JVM.
To be honest, I'm not quite sure of the mechanism by which the Executor JVM restarts itself, but I launch the Spark Driver service via init.d, which in turn calls off to a bash script. In that script I append a timestamp to the GC log filename:
today=$(date +%Y%m%dT%H%M%S%3N)
SPARK_HEAP_DUMP="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${SPARK_LOG_HOME}/heapdump_$$_${today}.hprof"
SPARK_GC_LOGS="-Xloggc:${SPARK_LOG_HOME}/gc_${today}.log -XX:LogFile=${SPARK_LOG_HOME}/safepoint_${today}.log"
GC_OPTS="-XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:+PrintFlagsFinal -XX:+PrintJNIGCStalls -XX:+PrintTLAB -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=15 -XX:GCLogFileSize=48M -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintAdaptiveSizePolicy -XX:+PrintHeapAtGC -XX:+PrintGCCause -XX:+PrintReferenceGC -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1"
I think the problem is that this script sends these options along to the Spark Driver, which then passes them off to the Spark Executors (via the -Dspark.executor.extraJavaOptions argument), which run on separate servers. When an Executor JVM crashes, it simply restarts using the command it was originally sent, which means the timestamp portion of the GC log filename is static:
SPARK_STANDALONE_OPTS=`property ${SPARK_APP_CONFIG}/spark.properties "spark-standalone.extra.args"`
SPARK_STANDALONE_OPTS="$SPARK_STANDALONE_OPTS $GC_OPTS $SPARK_GC_LOGS $SPARK_HEAP_DUMP"
exec java ${SPARK_HEAP_DUMP} ${GC_OPTS} ${SPARK_GC_LOGS} \
${DRIVER_JAVA_OPTIONS} \
-Dspark.executor.memory=${EXECUTOR_MEMORY} \
-Dspark.executor.extraJavaOptions="${SPARK_STANDALONE_OPTS}" \
-classpath ${CLASSPATH} \
com.company.spark.Main >> ${SPARK_APP_LOGDIR}/${SPARK_APP_LOGFILE} 2>&1 &
This is making it difficult for me to debug the cause of the segfaults, since I'm losing the activity and state of the Workers that led up to the JVM crash. Any ideas for how I can handle this situation and keep the GC logs on the Workers, even after a JVM crash/segfault?
If you are using Java 8 or above, you may consider adding %p to the log file name; it introduces the PID, which will be more or less unique per crash.
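A minimal sketch of what that could look like in the launch script above, reusing the variables from the question; the JVM replaces %p with its own PID in -Xloggc, but whether the same substitution is honoured by -XX:LogFile and -XX:HeapDumpPath may depend on the JVM version, so treat those parts as assumptions:
today=$(date +%Y%m%dT%H%M%S%3N)
# %p is expanded by the JVM to its PID, so a restarted Executor JVM writes new
# files instead of overwriting the previous JVM's logs
SPARK_GC_LOGS="-Xloggc:${SPARK_LOG_HOME}/gc_%p_${today}.log -XX:LogFile=${SPARK_LOG_HOME}/safepoint_%p_${today}.log"
SPARK_HEAP_DUMP="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${SPARK_LOG_HOME}/heapdump_%p_${today}.hprof"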
Related
I'm running a conversion project from svn to git. As the application is single-threaded, I'm moving the project to a faster PC.
So, without any options other than httpSpooling = true, it runs OK on a VM: 4 CPUs, 20 GB of RAM.
RAM usage with two separate instances is 8 GB, hitting a max of 9.8 GB.
Jobs were paused, zipped and SCP'd to the new machine, a bare-metal build of Debian 9 (same as the VM): i7 (8 effective CPUs), 16 GB of RAM.
However, when starting just one instance of SubGit, I get either a Java out-of-memory error or GC Overhead Limit Exceeded.
I've tried adding the following permutations to the [daemon] section of repo.git/subgit/config:
javaOptions = -noverify -client -Djava.awt.headless=true -Xmx8g -XX:+UseParallelGC -XX:-UseGCOverheadLimit (this gives a GC Overhead Limit Exceeded error)
#javaOptions = -noverify -client -Djava.awt.headless=true -Xmx8g -XX:+UseParallelGC -XX:-UseGCOverheadLimit (options commented out; this gives an out-of-memory error)
javaOptions = -noverify -client -Djava.awt.headless=true -Xmx12g -XX:-UseGCOverheadLimit (this gives out-of-memory errors)
I've tried other settings too, including swapping -client for -server, but that appears to be aimed more at two-way conversion, which is not something I'm trying to do.
There should be plenty of RAM, based on the application's usage on a system where it runs successfully, so unless SubGit is ignoring some values, I can't tell what's wrong.
The 'javaOptions' in the [daemon] section may indeed be ignored depending on the operation you run: those Java options affect the SubGit daemon, but not the 'subgit install' or 'subgit fetch' operations. Since you've mentioned that the repositories were moved to another machine, I believe you have invoked one of those two commands to restart the mirror, and that's why 'daemon.javaOptions' is ignored. To tune SubGit's Java options, edit them directly in the SubGit launching script (the EXTRA_JVM_ARGUMENTS line):
EXTRA_JVM_ARGUMENTS="-Dsun.io.useCanonCaches=false -Djava.awt.headless=true -Djna.nosys=true -Dsvnkit.http.methods=Digest,Basic,NTLM,Negotiate -Xmx512m"
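For example, a sketch of raising the heap limit there instead of in [daemon]; the rest of the line reflects a typical default launcher, and -Xmx8g is only an illustrative value:
# Raise the heap limit for the conversion by editing the launcher script itself;
# the exact script location depends on your SubGit installation
EXTRA_JVM_ARGUMENTS="-Dsun.io.useCanonCaches=false -Djava.awt.headless=true -Djna.nosys=true -Dsvnkit.http.methods=Digest,Basic,NTLM,Negotiate -Xmx8g"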
As for the memory consumption itself, it depends on which operations are being run. It's not completely clear how you paused the jobs on the virtual machine (by 'subgit shutdown' or in another way?), which operations were running at that time (the initial translation or regular fetches), and how you restarted the jobs on the new machine.
I use DCOS with Spring Boot applications inside Docker containers. I noticed that sometimes containers are killed, but there are no errors in the container logs, only:
Killed
W1114 19:27:59.663599 119266 logging.cpp:91] RAW: Received signal SIGTERM
from process 6484 of user 0; exiting
HealthCheck is enabled only for the SQL connection and disk space. The disk is fine on all nodes, and in case of SQL problems an error should appear in the logs. Another possible reason is memory, but that also looks fine.
From marathon.production.json:
"cpus": 0.1,
"mem": 1024,
"disk": 0
And docker-entrypoint.sh:
java -Xmx1024m -server -XX:MaxJavaStackTraceDepth=10 -XX:+UseNUMA \
  -XX:+UseCondCardMark -XX:-UseBiasedLocking -Xms1024M -Xss1M \
  -XX:MaxPermSize=128m -XX:+UseParallelGC -jar app.jar
What could be the reason for the container being killed, and are there any logs on DCOS about it?
Solved by adding -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap to the java command.
Or just use openjdk:11.0-jre-slim
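A sketch of how the entrypoint could look with that change, letting the heap be derived from the container's cgroup limit instead of a hard-coded -Xmx; the MaxRAMFraction value is an illustrative assumption rather than part of the answer above:
#!/bin/sh
# Size the heap from the cgroup memory limit so the JVM stays inside the
# "mem" value Marathon allocates to the container
exec java \
  -XX:+UnlockExperimentalVMOptions \
  -XX:+UseCGroupMemoryLimitForHeap \
  -XX:MaxRAMFraction=2 \
  -server \
  -jar app.jar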
My Spark driver runs out of memory after running for about 10 hours with the error Exception in thread "dispatcher-event-loop-17" java.lang.OutOfMemoryError: GC overhead limit exceeded. To further debug, I enabled G1GC mode and also the GC logs option using spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp
but it looks like it is not taking effect on the driver.
The job got stuck again on the driver after 10 hours and I don't see any GC logs under stdout on the driver node under /var/log/hadoop-yarn/userlogs/[application-id]/[container-id]/stdout, so I'm not sure where else to look. According to the Spark GC tuning docs, it looks like these settings only take effect on worker nodes (which I can see in this case as well, since the workers have GC logs in stdout after I used the same configs under spark.executor.extraJavaOptions). Is there any way to enable/acquire GC logs from the driver? Under Spark UI -> Environment, I see these options listed under spark.driver.extraJavaOptions, which is why I assumed they would be working.
Environment:
The cluster is running on Google Dataproc and I use /usr/bin/spark-submit --master yarn --deploy-mode cluster ... from the master to submit jobs.
EDIT
Setting the same options for the driver on the spark-submit command line works, and I am able to see the GC logs on stdout for the driver. It's just that setting the options programmatically via SparkConf does not seem to take effect for some reason.
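For reference, a minimal sketch of passing the options at submit time; the class name and jar are placeholders, and only a subset of the flags from above is shown:
# Driver JVM options passed on the spark-submit command line are applied
# when the driver JVM is launched, unlike options set later via SparkConf
/usr/bin/spark-submit --master yarn --deploy-mode cluster \
  --conf spark.driver.extraJavaOptions="-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --class com.example.MyJob my-job.jar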
I believe spark.driver.extraJavaOptions is handled by SparkSubmit.scala and needs to be passed at invocation. To do that with Dataproc, you can add it to the properties field (--properties in gcloud dataproc jobs submit spark).
Also, instead of -Dlog4j.configuration=log4j.properties, you can use this guide to configure detailed logging.
I could see GC driver logs with:
gcloud dataproc jobs submit spark --cluster CLUSTER_NAME --class org.apache.spark.examples.SparkPi --jars file:///usr/lib/spark/examples/jars/spark-examples.jar --driver-log-levels ROOT=DEBUG --properties=spark.driver.extraJavaOptions="-XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp" --
You probably don't need --driver-log-levels ROOT=DEBUG, and can instead copy your logging configuration from log4j.properties. If you really want to use log4j.properties, you can probably pass it with --files log4j.properties.
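A sketch of what that could look like, reusing the example job from above; the exact flag combination is an assumption, not a tested command:
# Ship log4j.properties with the job and point the driver JVM at it
gcloud dataproc jobs submit spark --cluster CLUSTER_NAME \
  --class org.apache.spark.examples.SparkPi \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar \
  --files log4j.properties \
  --properties=spark.driver.extraJavaOptions="-Dlog4j.configuration=log4j.properties -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"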
I had to run jmap in order to take a heap dump of my process, but the JVM returned:
Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
So I used the -F:
./jmap -F -dump:format=b,file=heap.bin 10330
Attaching to process ID 10331, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.51-b03
Dumping heap to heap.bin ...
Is using -F all right for taking a heap dump?
I have been waiting 20 minutes and it has not finished yet. Any ideas why?
jmap vs. jmap -F, as well as jstack vs. jstack -F, use completely different mechanisms to communicate with the target JVM.
jmap / jstack
When run without -F, these tools use the Dynamic Attach mechanism. This works as follows.
Before connecting to Java process 1234, jmap creates a file .attach_pid1234 in the working directory of the target process or in /tmp.
Then jmap sends SIGQUIT to the target process. When the JVM catches the signal and finds .attach_pid1234, it starts the AttachListener thread.
The AttachListener thread creates a UNIX domain socket /tmp/.java_pid1234 to listen for commands from external tools.
For security reasons, when a connection (from jmap) is accepted, the JVM verifies that the credentials of the socket peer are equal to the euid and egid of the JVM process. That's why jmap will not work if run by a different user (even by root).
jmap connects to the socket and sends the dumpheap command.
This command is read and executed by the AttachListener thread of the JVM. All output is sent back through the socket. Since the heap dump is made in-process, directly by the JVM, the operation is really fast. However, the JVM can do this only at safepoints. If a safepoint cannot be reached (e.g. the process is hung, not responding, or a long GC is in progress), jmap will time out and fail.
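As an illustration, jcmd uses the same dynamic attach mechanism, so an equivalent in-process dump request could look like this (the PID and output path are placeholders):
# The JVM itself writes the dump at a safepoint, just as with jmap's dumpheap command
jcmd 1234 GC.heap_dump /tmp/heap.hprof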
Let's summarize the benefits and the drawbacks of Dynamic Attach.
Pros.
Heap dumps and other operations are run collaboratively by the JVM at maximum speed.
You can use any version of jmap or jstack to connect to any other version of the JVM.
Cons.
The tool must be run by the same user (euid/egid) as the target JVM.
It can be used only on a live and healthy JVM.
Will not work if the target JVM is started with -XX:+DisableAttachMechanism.
jmap -F / jstack -F
When run with -F, the tools switch to a special mode that uses the HotSpot Serviceability Agent. In this mode the target process is frozen; the tools read its memory via OS debugging facilities, namely ptrace on Linux.
jmap -F invokes PTRACE_ATTACH on the target JVM. The target process is unconditionally suspended in response to the SIGSTOP signal.
The tool reads JVM memory using PTRACE_PEEKDATA. ptrace can read only one word at a time, so very many calls are required to read the large heap of the target process. This is very, very slow.
The tool reconstructs JVM internal structures based on its knowledge of the particular JVM version. Since different versions of the JVM have different memory layouts, -F mode works only if jmap comes from the same JDK as the target Java process.
The tool creates the heap dump itself and then resumes the target process.
Pros.
No cooperation from the target JVM is required. It can be used even on a hung process.
ptrace works whenever OS-level privileges are sufficient, e.g. root can dump the processes of all other users.
Cons.
Very slow for large heaps.
The tool and the target process must come from the same version of the JDK.
A safepoint is not guaranteed when the tool attaches in forced mode. Though jmap tries to handle all special cases, it may sometimes happen that the target JVM is not in a consistent state.
Note
There is a faster way to take heap dumps in forced mode. First, create a core dump with gcore, then run jmap over the generated core file. See the related question.
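A sketch of that approach; the PID, core file name, and path to the java binary are placeholders, and the java binary must belong to the same JDK that runs the target process:
# Freeze the process only long enough for the kernel to write a core file,
# then build the heap dump offline from that core
gcore -o core 1234                 # produces core.1234
jmap -dump:format=b,file=heap.bin /usr/lib/jvm/java-8-openjdk/bin/java core.1234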
I just found that jmap (and presumably jvisualvm, when using it to generate a heap dump) enforces that the user running jmap must be the same user running the process being dumped.
In my case the JVM I want a heap dump of is being run by the Linux user "jboss". So where sudo jmap -dump:file.bin <pid> was reporting "Unable to open socket:", I was able to grab my heap dump using:
sudo -u jboss jmap -dump:file.bin <pid>
If your application is running as a systemd service, open the service file under /usr/lib/systemd/system/ that is named after your service and check whether the PrivateTmp attribute is true.
If it is true, change it to false, then refresh the service with the following commands:
systemctl daemon-reload
systemctl restart [servicename]
If you want to run jmap/jcmd before restarting, you can make use of the ExecStop directive in the service file: just put the command there and execute systemctl stop [servicename].
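A sketch of those steps for a service hypothetically named myapp.service:
# Check whether PrivateTmp hides the /tmp/.java_pid<pid> attach socket,
# disable it, then reload and restart the service
grep PrivateTmp /usr/lib/systemd/system/myapp.service
sudo sed -i 's/PrivateTmp=true/PrivateTmp=false/' /usr/lib/systemd/system/myapp.service
sudo systemctl daemon-reload
sudo systemctl restart myapp.service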
Just like ben_wing said, you can run with:
sudo -u jboss-as jmap -dump:file.bin <pid>
(in my case the user is jboss-as, but yours could be jboss or some other.)
But it was not enough, because it asked me for a password ([sudo] password for ec2-user:), even though I could run other commands with sudo without being prompted for a password.
I found the solution here, and I just needed to add another sudo first:
sudo sudo -u jboss-as jmap -dump:file.bin <pid>
It works with other commands like jcmd and jinfo too.
It's usually solved with -F.
As stated in the message:
The -F option can be used when the target process is not responding
I encountered a situation where a full GC made it impossible to execute the command.
My Java application gets "killed" after running for some time.
The Java application is started from an SH script under Linux and runs for a while; then the PID is displayed and the word "Killed" is printed.
Like this:
runMyServer.sh: line 3: 3593 Killed java -Xmx2024m -cp ...
There is information about an out-of-memory event in the system log, so it looks like an out-of-memory error.
My question is: when can an OutOfMemoryError exception not be generated?
You probably have too little memory on your system, or you run processes that eat up all RAM and swap. When GNU/Linux runs out of memory, it kills processes that use a lot of memory. This is basically just a kill of the process, so it is not your Java process running out of memory but rather the OS.
To avoid your Java application being killed by the OOM killer, add enough swap to your system and disable memory overcommitting.
dd if=/dev/zero of=/swapfile bs=1M count=2048           # create a 2 GB swap file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile swap swap defaults 0 0" >> /etc/fstab   # keep it enabled across reboots
echo 2 > /proc/sys/vm/overcommit_memory                 # disable memory overcommitting
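To confirm it really was the kernel OOM killer, checking the kernel log is usually enough; a quick sketch:
# Look for OOM-killer entries in the kernel log
dmesg | grep -i -E 'killed process|out of memory'
# On systemd-based systems the kernel log is also available via journalctl
journalctl -k | grep -i oom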