I am experimenting with loading data bigger than the available memory size in H2O.
The H2O blog mentions: "A note on Bigger Data and GC: We do a user-mode swap-to-disk when the Java heap gets too full, i.e., you're using more Big Data than physical DRAM. We won't die with a GC death-spiral, but we will degrade to out-of-core speeds. We'll go as fast as the disk will allow. I've personally tested loading a 12Gb dataset into a 2Gb (32bit) JVM; it took about 5 minutes to load the data, and another 5 minutes to run a Logistic Regression."
Here is the R code to connect to H2O 3.6.0.8:
h2o.init(max_mem_size = '60m') # allotting 60 MB for H2O; R is running on a machine with 8 GB of RAM
gives
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
.Successfully connected to http://127.0.0.1:54321/
R is connected to the H2O cluster:
H2O cluster uptime: 2 seconds 561 milliseconds
H2O cluster version: 3.6.0.8
H2O cluster name: H2O_started_from_R_RILITS-HWLTP_tkn816
H2O cluster total nodes: 1
H2O cluster total memory: 0.06 GB
H2O cluster total cores: 4
H2O cluster allowed cores: 2
H2O cluster healthy: TRUE
Note: As started, H2O is limited to the CRAN default of 2 CPUs.
Shut down and restart H2O as shown below to use all your CPUs.
> h2o.shutdown()
> h2o.init(nthreads = -1)
IP Address: 127.0.0.1
Port : 54321
Session ID: _sid_b2e0af0f0c62cd64a8fcdee65b244d75
Key Count : 3
I tried to load a 169 MB CSV file into H2O.
dat.hex <- h2o.importFile('dat.csv')
which threw an error,
Error in .h2o.__checkConnectionHealth() :
H2O connection has been severed. Cannot connect to instance at http://127.0.0.1:54321/
Failed to connect to 127.0.0.1 port 54321: Connection refused
which is indicative of an out-of-memory error.
Question: If H2O promises to load a data set larger than its memory capacity (via the swap-to-disk mechanism the blog quote above describes), is this the correct way to load the data?
Swap-to-disk was disabled by default a while ago, because performance was so bad. The bleeding-edge build (not the latest stable) has a flag to enable it: "--cleaner" (for "memory cleaner").
Note that your cluster has an EXTREMELY tiny memory:
H2O cluster total memory: 0.06 GB
That's 60 MB! Barely enough to start a JVM with, much less run H2O. I would be surprised if H2O could come up properly there at all, never mind do the swap-to-disk. Swapping is limited to swapping the data alone. If you're trying to do a swap test, up your JVM to 1 or 2 GB of RAM, and then load datasets that sum to more than that.
Cliff
I create and persist df1, and then I do the following:
df1.persist (from the Storage tab in the Spark UI it is 3 GB)
df2 = df1.groupby(col1).pivot(col2) (this is a df with 4,827 columns and 40,107 rows)
df2.collect
df3 = df1.groupby(col2).pivot(col1) (this is a df with 40,107 columns and 4,827 rows)
-----it hangs here for almost 2 hours-----
df4 = (..Imputer or na.fill on df3..)
df5 = (..VectorAssembler on df4..)
(..PCA on df5..)
df1.unpersist
I have a cluster with 16 nodes (each node has 1 worker with 1 executor with 4 cores and 24 GB RAM) and a master (with 15 GB of RAM). Also, spark.shuffle.partitions is 192. It hangs for 2 hours and nothing is happening. Nothing is active in the Spark UI. Why does it hang for so long? Is it the DagScheduler? How can I check it? Please let me know if you need any more information.
----Edited 1----
After waiting for almost two hours it proceeds and then eventually fails. Below are the Stages and Executors tabs from the Spark UI:
Also, in the stderr file in the worker nodes it says:
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000003fe900000, 6434586624, 0) failed; error='Cannot allocate memory' (errno=12)
Moreover, a file named "hs_err_pid11877" is produced in the same folder as stderr and stdout, which says:
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 6434586624 bytes for committing reserved memory.
Possible reasons:
The system is out of physical RAM or swap space
The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
Possible solutions:
Reduce memory load on the system
Increase physical memory or swap space
Check if swap backing store is full
Decrease Java heap size (-Xmx/-Xms)
Decrease number of Java threads
Decrease Java thread stack sizes (-Xss)
Set larger code cache with -XX:ReservedCodeCacheSize=
JVM is running with Zero Based Compressed Oops mode in which the Java heap is
placed in the first 32GB address space. The Java Heap base address is the
maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
to set the Java Heap base and to place the Java Heap above 32GB virtual address.
This output file may be truncated or incomplete.
Out of Memory Error (os_linux.cpp:2792), pid=11877, tid=0x00007f237c1f8700
JRE version: OpenJDK Runtime Environment (8.0_265-b01) (build 1.8.0_265-8u265-b01-0ubuntu2~18.04-b01)
Java VM: OpenJDK 64-Bit Server VM (25.265-b01 mixed mode linux-amd64 compressed oops)
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
...and other information about the task that fails, GC information, etc.
----Edited 2----
Here is the Tasks section of the last pivot (the stage with id 16 from the stages picture), just before the hanging. It seems that all 192 partitions have a pretty equal amount of data, from 15 to 20 MB.
pivot in Spark generates an extra stage to compute the distinct pivot values; that happens under the hood, can take some time, and depends on how your resources are allocated, etc.
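If the pivot values are known up front, you can pass them to pivot() explicitly and skip that extra job. Here is a minimal sketch in Spark Scala, assuming df1 from the question and placeholder column names col1, col2 and value:

import org.apache.spark.sql.functions.first

// Distinct values of col2, listed up front (assumed to be known in advance)
val pivotValues = Seq("a", "b", "c")

// Passing the values explicitly avoids the separate job Spark would
// otherwise run just to collect the distinct pivot values.
val df2 = df1
  .groupBy("col1")
  .pivot("col2", pivotValues)
  .agg(first("value"))

Even with explicit values, a pivot that produces tens of thousands of columns stays expensive, so the extra stage is only part of the cost.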
I launch our Spring Boot application in a Docker container on the AWS Fargate service, so once CPU consumption goes above 100% the container is stopped by the Docker OOM killer with the error
Reason: OutOfMemoryError: Container killed due to memory usage
In the metrics we can see that CPU goes above 100%. After some time of profiling we found the CPU-consuming code, but my question is: how can CPU be greater than 100%?
Is there some way to tell the JVM to use only 100%?
I remember we had a similar issue with memory consumption. I read a lot of articles about cgroups, and the solution found was to specify
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
So when you launch Docker with the option -m=512, the heap size will be 1/4 of the container's memory limit. The heap size can also be tuned with the option
-XX:MaxRAMFraction=2
which will allocate 1/2 of the Docker memory for the heap.
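To double-check what the JVM actually settled on inside the container, here is a tiny sketch using only the standard Runtime API (shown in Scala; nothing app-specific assumed):

// Prints the max heap the JVM picked, so you can confirm the effect
// of -m and -XX:MaxRAMFraction from inside the container.
object HeapCheck extends App {
  val maxHeapMb = Runtime.getRuntime.maxMemory / (1024 * 1024)
  println(s"Max heap: $maxHeapMb MB")
}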
Should I use something similar for CPU?
I read the article https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits, but it says that
As of Java SE 8u131, and in JDK 9, the JVM is Docker-aware with respect to Docker CPU limits transparently. That means if -XX:ParallelGCThreads or -XX:CICompilerCount are not specified as command line options, the JVM will apply the Docker CPU limit as the number of CPUs the JVM sees on the system. The JVM will then adjust the number of GC threads and JIT compiler threads just like it would as if it were running on a bare metal system with the number of CPUs set as the Docker CPU limit.
Docker command is used to start
docker run -d .... -e JAVA_OPTS='-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:+PrintFlagsFinal -XshowSettings:vm' -m=512 -c=256 ...
Java version is used
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-1~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
Some additional info on the app during startup:
VM settings:
Max. Heap Size (Estimated): 123.75M
Ergonomics Machine Class: client
Using VM: OpenJDK 64-Bit Server VM
ParallelGCThreads = 0
CICompilerCount := 2
CICompilerCountPerCPU = true
I found the answer to my question.
The behaviour of identifying the number of processors to use was fixed in https://bugs.openjdk.java.net/browse/JDK-8146115:
Number of CPUs
Use a combination of number_of_cpus() and cpu_sets() in order to determine how many processors are available to the process and adjust the JVM's os::active_processor_count appropriately. The number_of_cpus() will be calculated based on the cpu_quota() and cpu_period() using this formula: number_of_cpus() = cpu_quota() / cpu_period(). If cpu_shares has been setup for the container, the number_of_cpus() will be calculated based on cpu_shares()/1024. 1024 is the default and standard unit for calculating relative cpu usage in cloud based container management software.
Also add a new VM flag (-XX:ActiveProcessorCount=xx) that allows the number of CPUs to be overridden. This flag will be honored even if UseContainerSupport is not enabled.
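For illustration only, here is a rough sketch of that formula computed by hand from the cgroup v1 files; the paths and the rounding are assumptions based on the quote above, not code taken from the JDK:

import scala.io.Source

// Read a single integer from a cgroup v1 file (assumed standard Docker paths)
def readCgroup(path: String): Long =
  Source.fromFile(path).getLines().next().trim.toLong

val quota  = readCgroup("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")   // -1 means no quota set
val period = readCgroup("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
val shares = readCgroup("/sys/fs/cgroup/cpu/cpu.shares")

// Mirror the quoted rule: prefer cpu_quota/cpu_period, otherwise cpu_shares/1024
val cpus =
  if (quota > 0) math.max(1, math.ceil(quota.toDouble / period).toInt)
  else math.max(1, math.ceil(shares.toDouble / 1024).toInt)

println(s"Container CPUs (per the quoted formula): $cpus")

With -c=256 (cpu_shares = 256) this gives 1, which matches what the newer JVM reports below.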
So on AWS you generally set up cpu_shares at the task definition level.
Before the JVM fix it was calculated incorrectly.
On Java 8 version < 191: cpu_shares()/1024 = 256/1024 was identified as 2.
After migration to Java 8 version > 191: cpu_shares()/1024 = 256/1024 was identified as 1.
The code to test:
import org.slf4j.LoggerFactory

val logger = LoggerFactory.getLogger("JvmInfo")

// Log the running Java version and the number of CPUs the JVM sees
val version = System.getProperty("java.version")
val runtime = Runtime.getRuntime()
val processors = runtime.availableProcessors()
logger.info("========================== JVM Info ==========================")
logger.info("Java version is: {}", version)
logger.info("Available processors: {}", processors)
The sample output
"Java version is: 1.8.0_212"
"Available processors: 1"
I hope it will help someone, as I can't find the answer anywhere (spring-issues-tracker, AWS support, etc.)
I am running docker-swarm on Ubuntu servers (VMs) with 24 GB of RAM.
My cluster has 2 managers and 3 nodes
Here is the memory status (on all nodes):
              total        used        free      shared  buff/cache   available
Mem:             23           5          14           0           3          17
A Java application can't start, with:
There is insufficient memory for the Java Runtime Environment to continue.
Any advice? Thanks a lot.
I am trying to run the java command on a Linux server. It was running well, but today when I tried to run java I got this error:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
My memory space is:
root@vps [~]# free -m
             total       used       free
Mem:          8192        226       7965
-/+ buf:                  226       7965
Swap:            0          0          0
How can I solve this problem?
The machine did not have enough memory at that time to service the JVM's request for memory to start the program. I expect that you have 8 GB of memory in the machine and that you are using a 64-bit JVM.
I would suggest you add some swap space to the system to let it handle spikes in memory usage, and then figure out where the spike came from.
Which VM are you using? What is the maximum memory size you are trying to use?
If you are using a 32-bit JVM on Windows and you are asking for close to the maximum memory it can address on your system, the request can be affected by memory fragmentation. You may have a similar problem.
I saw java -server on http://shootout.alioth.debian.org/, the programming language benchmarks site.
I know that -server is a parameter for running the JVM. I want to know:
When do we use the -server parameter, and how does it work?
Can we use this parameter for a Java desktop application?
Thanks.
It just selects the Server HotSpot VM. See the documentation (Solaris/Linux) for java.
According to Wikipedia:
Sun's JRE features 2 virtual machines, one called Client and the other Server. The Client version is tuned for quick loading. It makes use of interpretation, compiling only often-run methods. The Server version loads more slowly, putting more effort into producing highly optimized JIT compilations, that yield higher performance.
See: http://en.wikipedia.org/wiki/HotSpot
The -server flag indicates to the launcher that the hardware is a server-class machine, which for Java 6 means at least 2 cores and at least 2 GB of physical memory (i.e., most machines these days). On server-class machines the default selection is:
The throughput GC.
An initial heap size of 1/64th of physical memory, up to 1 GB.
A maximum heap size of 1/4th of physical memory, up to a maximum of 1 GB.
The server runtime compiler.
Note that on 32-bit Windows there is no server VM, so the client VM is the default.
On the other 32-bit machines the server VM is chosen if the hardware is server class, otherwise it's client. On 64-bit machines there is no client VM, so the server VM is the default.
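If you want to see which VM and which default maximum heap a particular java invocation actually selected, a small hedged check (plain Scala over standard system properties, nothing else assumed) is:

// Prints the VM name (Client vs Server) and the max heap chosen by the
// launcher's ergonomics, so the effect of -server is visible directly.
object VmCheck extends App {
  val vmName    = System.getProperty("java.vm.name")   // e.g. "Java HotSpot(TM) 64-Bit Server VM"
  val maxHeapMb = Runtime.getRuntime.maxMemory / (1024 * 1024)
  println(s"VM: $vmName, max heap: $maxHeapMb MB")
}

Run it once with -server and once without to compare the two configurations.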
A link to the HotSpot FAQ: HotSpot
You can check this blog for additional info: http://victorpillac.wordpress.com/2011/09/11/notes-on-the-java-server-flag/
Basically, on most recent machines other than 32-bit Windows, the flag is turned on by default.
For 32-bit Windows you will need to download the JDK to get the server VM.
More info on server VMs: http://download.oracle.com/javase/1.3/docs/guide/performance/hotspot.html#server