Huge time gap between spark jobs - java

I create and persist a DataFrame df1, and then do the following:
df1.persist (the Storage tab in the Spark UI says it is 3 GB)
df2 = df1.groupby(col1).pivot(col2) (this is a DataFrame with 4,827 columns and 40,107 rows)
df2.collect
df3 = df1.groupby(col2).pivot(col1) (this is a DataFrame with 40,107 columns and 4,827 rows)
-----it hangs here for almost 2 hours-----
df4 = (..Imputer or na.fill on df3..)
df5 = (..VectorAssembler on df4..)
(..PCA on df5..)
df1.unpersist
I have a cluster with 16 nodes (each node has 1 worker with 1 executor with 4 cores and 24 GB RAM) and a master (with 15 GB of RAM). Also, spark.sql.shuffle.partitions is 192. It hangs for 2 hours and nothing is happening. Nothing is active in the Spark UI. Why does it hang for so long? Is it the DAGScheduler? How can I check it? Please let me know if you need any more information.
----Edited 1----
After waiting for almost two hours, it proceeds and then eventually fails. Below are the Stages and Executors tabs from the Spark UI:
Also, the stderr file on the worker nodes says:
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000003fe900000, 6434586624, 0) failed; error='Cannot allocate memory' (errno=12)
Moreover, a file named "hs_err_pid11877" is produced in the folder next to stderr and stdout, which says:
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 6434586624 bytes for committing reserved memory.
Possible reasons:
The system is out of physical RAM or swap space
The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
Possible solutions:
Reduce memory load on the system
Increase physical memory or swap space
Check if swap backing store is full
Decrease Java heap size (-Xmx/-Xms)
Decrease number of Java threads
Decrease Java thread stack sizes (-Xss)
Set larger code cache with -XX:ReservedCodeCacheSize=
JVM is running with Zero Based Compressed Oops mode in which the Java heap is
placed in the first 32GB address space. The Java Heap base address is the
maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
to set the Java Heap base and to place the Java Heap above 32GB virtual address.
This output file may be truncated or incomplete.
Out of Memory Error (os_linux.cpp:2792), pid=11877, tid=0x00007f237c1f8700
JRE version: OpenJDK Runtime Environment (8.0_265-b01) (build 1.8.0_265-8u265-b01-0ubuntu2~18.04-b01)
Java VM: OpenJDK 64-Bit Server VM (25.265-b01 mixed mode linux-amd64 compressed oops)
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
...and other information about the failing task, GC information, etc.
----Edited 2----
Here is the Tasks section of the last pivot (the stage with id 16 in the stages picture), just before the hang. It seems that all 192 partitions hold a fairly equal amount of data, from 15 to 20 MB each.

pivot in Spark generates an extra stage (a separate job) to collect the distinct pivot values; that happens under the hood, can take some time, and depends on how your resources are allocated, etc. If you know the pivot values in advance, you can pass them explicitly so Spark skips that job, as in the sketch below.
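A minimal sketch in the Java Dataset API (df, col1, col2 stand in for the asker's DataFrame and columns; first(col3) is a stand-in aggregate, since pivot must be followed by one):

import static org.apache.spark.sql.functions.first;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class PivotWithExplicitValues {
    static Dataset<Row> pivotExplicit(Dataset<Row> df) {
        // Supplying the values up front means Spark does not need to run
        // a separate job to find the distinct values of col2 first.
        List<Object> pivotValues = Arrays.asList("v1", "v2", "v3"); // stand-in values
        return df.groupBy("col1")
                 .pivot("col2", pivotValues) // explicit values: no extra distinct-values stage
                 .agg(first("col3"));        // pivot requires an aggregate expression
    }
}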

Related

"Insufficient memory" on Jenkins server startup

First time user of Jenkins here, and having a bit of trouble getting it started. From the Linux shell I run a command like:
java -Xms512m -Xmx512m -jar jenkins.war
and consistently get an error like:
# There is insufficient memory for the Java Runtime Environment to continue.
# pthread_getattr_np
# An error report file with more information is saved as:
# /home/twilliams/.jenkins/hs_err_pid36290.log
First, the basics:
Jenkins 1.631
Running via the jetty embedded in the war file
OpenJDK 1.7.0_51
Oracle Linux (3.8.13-55.1.5.el6uek.x86_64)
386 GB RAM
40 cores
I get the same problem with a number of other configurations as well: using Java HotSpot 1.8.0_60, running through Apache Tomcat, and using all sorts of different values for -Xms/-Xmx/-Xss and similar options.
I've done a fair bit of research and think I know what the problem is, but am at a loss as to how to solve it. I suspect that I'm running into the virtual memory overcommit issue mentioned here; the relevant bits from ulimit:
--($:)-- ulimit -a
...
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) 8388608
stack size (kbytes, -s) 8192
virtual memory (kbytes, -v) 8388608
...
If I double the virtual memory limit as root, I can start Jenkins, but I'd rather not run Jenkins as the root user.
Another workaround: a soon-to-be-decommissioned machine with 48 GB RAM and 24 cores can start Jenkins without issue, though (I suspect) just barely: according to htop, its virtual memory footprint is just over 8 GB. I suspect, as a result, that the memory overcommit issue scales with the number of processors on the machine, presumably because Jenkins starts a number of threads proportional to the number of processors on the host. I roughly captured the thread count via ps -eLf | grep jenkins | wc -l and found that it spikes at around 114 on the 40-core machine and 84 on the 24-core machine.
Does this explanation seem sound? Provided it does...
Is there any way to configure Jenkins to reduce the number of threads it spawns at startup? I tried the arguments discussed here but, as advertised, they didn't seem to have any effect.
Are there any VMs available that don't suffer from the overcommit issue, or some other configuration option to address it?
The sanest option at this point may be to just run Jenkins in a virtualized environment to limit the resources at its disposal to something reasonable, but at this point I'm interested in this problem on an intellectual level and want to know how to get this recalcitrant configuration to behave.
Edit
Here's a snippet from the hs_error.log file, which guided my initial investigation:
# There is insufficient memory for the Java Runtime Environment to continue.
# pthread_getattr_np
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
Here are a couple of command lines I tried, all with the same result.
Starting with a pitiful amount of heap space:
java -Xms2m -Xmx2m -Xss228k -jar jenkins.war
Starting with significantly more heap space:
java -Xms2048m -Xmx2048m -Xss1m -XX:ReservedCodeCacheSize=512m -jar jenkins.war
Room to grow:
java -Xms2m -Xmx1024m -Xss228k -jar jenkins.war
A number of other configurations were tried as well. Ultimately I don't think the problem is heap exhaustion here; it's that the JVM is trying to reserve more virtual memory for itself (in which to store the heap, thread stacks, etc.) than the ulimit settings allow. Presumably this is the result of the overcommit issue linked earlier, such that if Jenkins is spawning 120 threads, the JVM erroneously tries to reserve 120x as much VM space as the master process originally occupied.
Having done what I could with the other options suggested in that log, I'm trying to figure out how to reduce the thread count in Jenkins to test the thread VM overcommit theory.
Edit #2
Per Michał Grzejszczak, this is an issue with the glibc distributed with Red Hat Enterprise Linux 6, as discussed here. It can be worked around by explicitly setting the environment variable MALLOC_ARENA_MAX, in my case export MALLOC_ARENA_MAX=2. Without that variable set, glibc will apparently create up to (8 x CPU core count) malloc arenas, each reserving 64 MB of virtual memory. My 40-core case would have required north of 20 GB of virtual RAM, exceeding (by itself) the 8 GB ulimit on my machine. Setting the variable to 2 reduces virtual memory consumption to around 128 MB.
Jenkins' memory footprint is related more to the number and size of the projects it manages than to the number of CPUs or available memory. Jenkins should run fine on 1 GB of heap memory unless you have gigantic projects on it.
You may have misconfigured the JVM, though. The -Xmx and -Xms parameters govern the heap space the JVM can use: -Xmx is the limit for heap memory, -Xms is its starting value. The heap is a single memory area for the entire JVM. You can easily monitor it with tools like JConsole or VisualVM.
On the other hand, -Xss is not related to the heap. It is the size of the stack for each thread in the JVM process. As Java programs tend to create numerous threads, setting this parameter too big can prevent your program from launching. Typically this value is in the range of 512 KB; entering 512m here instead makes it impossible for the JVM to start. Make sure your settings do not contain any such mistakes (and post your memory config too).
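If you'd rather check from code than from JConsole, the standard java.lang.management API exposes the same heap and non-heap numbers. A minimal sketch (the class name is mine):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();       // governed by -Xms/-Xmx
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage(); // code cache, PermGen/Metaspace, etc.
        System.out.printf("heap:     used=%dM committed=%dM max=%dM%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
        System.out.printf("non-heap: used=%dM committed=%dM%n",
                nonHeap.getUsed() >> 20, nonHeap.getCommitted() >> 20);
    }
}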

unexpected error has been detected by Java Runtime Environment

I am generating a report (CSV) through Java, using Hibernate to fetch the data from the database.
Part of my code is below:
ScrollableResults items = null;
String sql = " from " + topBO.getClass().getName() + " where " + spec;
StringBuffer sqlQuery = new StringBuffer(sql);
Query query = sessionFactory.getCurrentSession().createQuery(sqlQuery.toString());
items = query.setFetchSize(1000).setCacheable(false).scroll(ScrollMode.FORWARD_ONLY);
List<TopBO> list = new ArrayList<TopBO>();
// The error occurs in this while loop, while fetching more data.
while (items.next()) {
    TopBO topBO2 = (TopBO) items.get(0);
    list.add(topBO2); // every fetched entity is retained here until the loop ends
    topBO2 = null;
}
sessionFactory.evict(topBO.getClass());
Environment info
JVM config: -Xms512M -Xmx1024M -XX:MaxPermSize=512M -XX:MaxHeapSize=1024M
JBoss: JBoss 5.1 Runtime Server
Oracle: 10g
JDK: jdk1.6.0_24 (32-bit/x86)
Operating System: Windows 7 (32-bit/x86)
RAM: 4 GB
Error: when I fetch up to 50k rows it works fine, but when I fetch more than that, it gives me this error:
#
# An unexpected error has been detected by Java Runtime Environment:
#
# java.lang.OutOfMemoryError: requested 4096000 bytes for GrET in C:\BUILD_AREA\jdk6_11\hotspot\src\share\vm\utilities\growableArray.cpp. Out of swap space?
#
# Internal Error (allocation.inline.hpp:42), pid=1408, tid=6060
# Error: GrET in C:\BUILD_AREA\jdk6_11\hotspot\src\share\vm\utilities\growableArray.cpp
#
# Java VM: Java HotSpot(TM) Client VM (11.0-b16 mixed mode windows-x86)
# An error report file with more information is saved as:
# D:\jboss-5.1.0.GA\bin\hs_err_pid1408.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#
When I set -Xms512M -Xmx768M -XX:MaxPermSize=512M -XX:MaxHeapSize=768M, it throws another exception:
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError is generally caused by a lack of the required heap space. What you can do is increase your JVM heap size using the flag -Xmx1548M, or more MB than 1548.
But you seem to be running out of system memory, so you should use a JVM that handles memory more efficiently, and I suggest a JVM upgrade. How about upgrading your JVM 1.6 to a more recent version?
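Independently of the JVM version, the code in the question holds every fetched entity in a list, so the whole result set must fit in the heap at once. A streaming alternative writes each row out as it arrives and clears the session periodically. A minimal sketch, assuming Hibernate 3's Session API and the question's sessionFactory, topBO, and spec; writeCsvRow is a hypothetical method standing in for the report logic:

Session session = sessionFactory.getCurrentSession();
ScrollableResults items = session
        .createQuery("from " + topBO.getClass().getName() + " where " + spec)
        .setFetchSize(1000)
        .setCacheable(false)
        .scroll(ScrollMode.FORWARD_ONLY);

int count = 0;
while (items.next()) {
    TopBO row = (TopBO) items.get(0);
    writeCsvRow(row); // hypothetical: write the row to the CSV immediately
    if (++count % 1000 == 0) {
        session.flush(); // push pending changes, if any
        session.clear(); // drop the first-level cache so processed rows can be garbage-collected
    }
}
items.close();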
The solution to this problem can be uncommon. Try the recommendations from the article OutOfMemoryError: Out of swap space - Problem Patterns.
There are multiple scenarios which can lead to a native OutOfMemoryError:
Native Heap (C-Heap) depletion due to too many Java EE applications deployed on a single 32-bit JVM (combined with a large Java Heap, e.g. 2 GB) - the most common problem
Native Heap (C-Heap) depletion due to a non-optimal Java Heap size, e.g. a Java Heap too large for the application's needs on a single 32-bit JVM
Native Heap (C-Heap) depletion due to too many created Java Threads, e.g. allowing the Java EE container to create too many Threads on a single 32-bit JVM
OS physical/virtual memory depletion preventing the HotSpot VM from allocating native memory to the C-Heap (32-bit or 64-bit VM)
OS physical/virtual memory depletion preventing the HotSpot VM from expanding its Java Heap or PermGen space at runtime (32-bit or 64-bit VM)
C-Heap / native memory leak (third-party monitoring agent/library, JVM bug, etc.)
I would begin troubleshooting with this recommendation:
Review your JVM memory settings. For a 32-bit VM, a Java Heap of 2 GB+ can really start to put pressure on the C-Heap, depending on how many applications you have deployed, how many Java Threads, etc. In that case, determine whether you can safely reduce your Java Heap by about 256 MB (as a starting point) and see if it helps improve your JVM memory "balance".
It is also possible (though more labor-intensive) to upgrade your environment to 64-bit versions of the OS and JVM, because 4 GB of physical RAM will be better utilized on an x64 OS.

Unable to set Java heap size larger than 1568

I am running a server with the following attributes:
Windows Server 2008 R2 Standard - 64-bit
4 GB RAM
I am trying to set the heap size to 3 GB for an application. I am using the flags -Xmx3G -Xms3G. Running with the flags results in the following error:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
I have been playing with the setting to see what my ceiling is, and found that 1568 MB is my ceiling. What am I missing?
How much physical memory is available on your system (out of the original 4 GB)? It sounds like your system doesn't have 3 GB of physical memory available when the VM starts up.
Remember that the JVM needs more memory than is allocated to the heap -- there are other data structures as well (thread stacks, etc.) that also need memory. So the settings you are providing attempt to use more than 3 GB of memory.
Also, are you using a 64-bit JVM? The practical limit for heap size on a 32-bit VM is 1.4 to 1.6 gigabytes, according to this document.
Java requires contiguous virtual memory on startup. On Windows, 32-bit applications run in a 32-bit emulated environment, so you don't get much more contiguous memory than you would on a 32-bit OS. (On Solaris, by contrast, you get over 3 GB of virtual memory for 32-bit Java.)
I suggest you use the 64-bit version of Java, as this will make use of all the memory you have. You still need to have free memory, but the larger address space doesn't suffer from fragmentation.
BTW: the heap space is only part of the memory used; you also need memory for shared libraries, direct memory, GUI components, etc.
It seems you don't have 3 GB of physical memory available. Here is an interesting article on Java heap size setting errors: Java heap size setting errors
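One quick sanity check before tuning flags further: confirm the JVM is actually 64-bit, since a 32-bit VM caps the heap well below 3 GB regardless of installed RAM. A minimal sketch (sun.arch.data.model is a HotSpot-specific property and may be absent on other VMs; the class name is mine):

public class JvmBitness {
    public static void main(String[] args) {
        // "32" or "64" on HotSpot; may be null on some other VMs
        System.out.println("data model: " + System.getProperty("sun.arch.data.model"));
        System.out.println("os.arch:    " + System.getProperty("os.arch"));
        // The maximum heap this JVM will attempt to use, in MB
        System.out.println("max heap:   " + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
    }
}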

"Error occurred during initialization of VM" in linux

I am trying to run the java command on a Linux server. It was running well, but today when I tried to run java I got this error:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
My memory space is:
root#vps [~]# free -m
total used free
Mem: 8192 226 7965
-/+ buf: 226 7965
Swap: 0 0 0
How can I solve this problem?
The machine did not have enough memory at that time to service the JVM's request for memory to start the program. I expect that you have 8 GB of memory in the machine and that you use a 64-bit JVM.
I would suggest you add some swap space to the system to let it handle spikes in memory usage, and then figure out where the spike came from.
Which VM are you using? What is the maximum memory size you are trying to use?
If you are using a 32-bit JVM on Windows and you are using close to the maximum it can access on your system, it can be impacted by memory fragmentation. You may have a similar problem.

Why am I able to set -Xmx to a value greater than physical and virtual memory on the machine on both Windows and Solaris?

On a 64-bit Windows machine with 12 GB of RAM and 33 GB of virtual memory (per Task Manager), I'm able to run Java (1.6.0_03-b05) with an impossible -Xmx setting of 3.5 TB, but it fails with 35 TB. What's the logic behind when it works and when it fails? The error at 35 TB seems to imply that it's trying to reserve space at startup. Why would it do that for -Xmx (as opposed to -Xms)?
C:\temp>java -Xmx3500g ostest
os.arch=amd64
13781729280 Bytes RAM
C:\temp>java -Xmx35000g ostest
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
On Solaris (4 GB RAM, Java 1.5.0_16), I pretty much gave up at 1 PB trying to find how high I can set -Xmx. I don't understand the logic for when it will error out on the -Xmx setting.
devsun1.mgo:/export/home/mgo> java -d64 -Xmx1000000g ostest
os.arch=sparcv9
4294967296 Bytes RAM
At least with the Sun 64-bit VM 1.6.0_17 for Windows, ObjectStartArray::initialize will allocate 1 byte for each 512 bytes of heap on VM startup. Starting the VM with a 35 TB heap therefore causes it to allocate 70 GB immediately, and hence fail on your system.
The 32-bit VM (and, I suppose, the 64-bit VM too) from Sun does not take available physical memory into account when calculating the maximum heap; it is only limited by the 2 GB of addressable memory on Windows and Linux (4 GB on Solaris), or by failing to allocate enough memory at startup for the management area.
If you think about it, checking the sanity of the max heap value against available physical memory does not make much sense. X GB of physical memory does not mean that X GB is available to the VM when required; it can just as well have been used by other processes, so the VM needs a way to cope with needing more heap than the OS can provide anyway. If the VM is not broken, OutOfMemoryErrors are thrown when memory cannot be allocated from the OS, just as when the max heap size has been reached.
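For concreteness, here is the arithmetic behind that 70 GB figure as a small worked example; the 1-byte-per-512-bytes ratio is as the answer states for that particular VM build, and -Xmx35000g is treated as roughly 35 TB:

public class StartArrayCost {
    public static void main(String[] args) {
        long heapBytes = 35L * 1024 * 1024 * 1024 * 1024; // ~-Xmx35000g, call it 35 TB
        long startArrayBytes = heapBytes / 512;           // 1 byte per 512 bytes of heap
        System.out.println(startArrayBytes / (1024 * 1024 * 1024) + " GB allocated at startup"); // prints 70
    }
}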
According to this thread on Sun's Java forums (the OP has 16 GB of physical memory):
You could specify -Xmx20g, but if the total of the memory needed by all the processes on your machine ever exceeds the physical memory on your machine, you are likely to end up paging. Some applications can survive running in paged memory, but the JVM isn't one of them. Your code might run okay, but, for example, garbage collections will be abysmally slow.
UPDATE: I googled a bit further; according to the Frequently Asked Questions About the Java HotSpot VM, and more precisely the entry "How large a heap can I create using a 64-bit VM?":
How large a heap can I create using a 64-bit VM?
On 64-bit VMs, you have 64 bits of addressability to work with, resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides.
See also Why can't I get a larger heap with the 32-bit JVM?
I don't know why you are able to start a JVM with a heap > 45 GB (the machine's 12 GB RAM plus 33 GB swap). This is a bit confusing...
Just to reinforce Pascal's answer: be very careful on Windows when specifying a high max memory size. I was working on a server project that required as much physical memory as possible, but once you are over physical RAM, "abysmal performance" is not a good description of what happens; "hung machine" might be better.
What happens (at least, this is my evaluation of it after days of examining logs and re-running tests) is: Windows runs out of RAM and asks all apps to free up what they can. When it asks Java, Java kicks off a GC. The GC touches all of memory (causing anything that has been swapped out to be swapped back in). This in turn causes Windows to run out of memory again, so Windows sends a message to all apps asking them to free up what they can... (recurse indefinitely).
This may not ACTUALLY be what is going on, but the fact that the Java GC touches very old memory at times makes it incompatible with paging.
