Java job gives OOM error inconsistently

I have scheduled a jar file via cron on a Linux box. The jar connects to a Hive server over JDBC and runs a select query, after which I write the selected data to a CSV file. The daily data volume is around 150 million records and the CSV file is approximately 30 GB in size.
Now, this job does not complete every time it is invoked and ends up writing only part of the data. I checked the PID for errors with dmesg | grep -E 31866 and I can see:
[1208443.268977] Out of memory: Kill process 31866 (java) score 178 or sacrifice child
[1208443.270552] Killed process 31866 (java) total-vm:25522888kB, anon-rss:11498464kB, file-rss:104kB, shmem-rss:0kB
I am invoking my jar with memory options like:
java -Xms5g -Xmx20g -XX:+UseG1GC -cp jarFile
I want to know what exactly the error text means, and whether there is any solution I can apply to ensure my job will not run out of memory. The weird thing is that the job does not fail every time; its behaviour is inconsistent.

That message is actually from the Linux kernel, not your job. It means that your system ran out of memory and the kernel killed your job to resolve the problem (otherwise you'd probably get a kernel panic).
You could try modifying your app to lower its memory requirements (e.g. load your data incrementally, or write a distributed job that completes the needed transformations on the cluster rather than on one machine).
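As a rough sketch of the incremental approach (the JDBC URL, query, and output path below are placeholders, and how much the fetch size actually helps depends on the Hive driver): stream the ResultSet with a modest fetch size and write each row to the CSV as you go, so the full 150M-row result never has to sit in the heap at once.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CsvExport {

    // Quote a field only when it contains a delimiter, quote, or newline.
    static String toCsvField(String value) {
        if (value == null) return "";
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    static String toCsvLine(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(toCsvField(fields[i]));
        }
        return sb.toString();
    }

    // Stream rows to disk instead of materialising them in memory.
    // jdbcUrl, query, and outFile are hypothetical placeholders.
    static void export(String jdbcUrl, String query, String outFile)
            throws SQLException, IOException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             BufferedWriter out = Files.newBufferedWriter(Paths.get(outFile))) {
            stmt.setFetchSize(10_000);          // pull rows in batches, not all at once
            try (ResultSet rs = stmt.executeQuery(query)) {
                int cols = rs.getMetaData().getColumnCount();
                String[] row = new String[cols];
                while (rs.next()) {
                    for (int i = 0; i < cols; i++) row[i] = rs.getString(i + 1);
                    out.write(toCsvLine(row));
                    out.newLine();              // each row leaves the heap immediately
                }
            }
        }
    }
}
```

With streaming in place you can also lower -Xmx: a 20 GB heap on a box that cannot back it with physical memory is exactly what invites the kernel's OOM killer.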

Related

Got error of "Too many open file", what's the ulimit when running the program using ProcessBuilder()...start()?

I need to run a program after a Linux EC2 machine is provisioned on AWS. The following code gets a "Too many open files" error. my_program will open a lot of files, maybe around 5000.
String cmd = "my_program";
Process process = new ProcessBuilder()
.inheritIO()
.command(cmd)
.start();
However, running my_program in the console finishes without any error. What is the ulimit when running the program using ProcessBuilder()...start()?
ulimit -n outputs 65535 in a bash terminal.
First find out the limits your app has when running:
ps -ef | grep <<YOUR-APP-NAME>>
then:
cat /proc/<<PID-of-your-APP>>/limits
The problem here is that your app starts under user X or Y, and these users have a different ulimit setup.
Check:
cat /etc/security/limits.conf
... I think, and increase those values.
Just my 2 cents...
You need to ensure that you close() the files after use. They will eventually be closed by the garbage collector (I'm not completely sure about this, as it can differ between implementations), but if you process many files without closing them, you can run out of file descriptors before the garbage collector has any chance to execute.
Another thing you can do is use a try-with-resources statement, which ensures that every resource you declare in the parenthesized group is a Closeable resource that will be forcibly close()d on exit from the try.
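A minimal sketch of that try-with-resources pattern (the file names are made up for the example): every reader opened in the parentheses is closed when the block exits, even on an exception, so descriptors are returned promptly instead of waiting for the garbage collector.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class LineCounter {

    // Count lines in one file; the reader is closed on exit from the try,
    // whether we return normally or an IOException is thrown.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            long count = 0;
            while (reader.readLine() != null) count++;
            return count;
        }
    }

    // Process many files; at no point is more than one descriptor held open.
    static long countAll(List<Path> files) throws IOException {
        long total = 0;
        for (Path file : files) total += countLines(file);
        return total;
    }
}
```

Contrast this with keeping 5000 readers open in a collection: each one holds a file descriptor until it is explicitly closed or collected.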
Anyway, if you want to raise the maximum number of open files per process, look at your shell's man page (bash(1) most probably) and search it for the ulimit command. (There's no ulimit manual page, as it is a command internal to the shell; the ulimit values are per process, so you cannot start a child process to change its parent's maximum limits.)
Beware that the most common Linux distributions don't normally have a way to configure an enforced per-user value for this (there's a full set of options to do things like this on BSD systems, but Linux's login(8) program has not implemented the /etc/login.conf feature to do it), and raising this value arbitrarily can be a security problem if the machine runs as a multiuser system.

JMeter 5.3 java.lang.OutOfMemoryError during JMeter execution

I have configured a test plan using JMeter, shown below in the image, and have been using the CLI to run my parallel load tests (Mac user).
I have configured a connection with my AWS Redshift database; when I check my query monitoring, all of the queries get stuck in a Running state.
After some time, on my terminal, I get the following error: JMeter 5.3 java.lang.OutOfMemoryError.
I have gone into my bin/jmeter file and have made the memory changes, but I am still facing the same issue.
When I run the same queries from DBeaver, the queries are run and completed and can be seen on Redshift query monitoring.
How can I solve the memory problem in order for the queries to run without being stuck in a running state?
Below is the error I am getting even after increasing the heap size to 5 gigabytes.
WARNING: package sun.awt.X11 not in java.desktop
Creating summariser <summary>
Created the tree successfully using //Users/mbyousaf/Desktop/redshit-test/test-redhsift.jmx
Starting standalone test # Wed Dec 02 14:53:17 GMT 2020 (1606920797442)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
Warning: Nashorn engine is planned to be removed from a future JDK release
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid35596.hprof ...
Heap dump file created [3071802740 bytes in 3.747 secs]
Which exact OutOfMemoryError? There are several possible reasons:
Lack of heap space: if this is the case, you're looking at the right place; just make sure that your changes are applied (with the standard startup scripts the heap can also be overridden via the JVM_ARGS environment variable)
GC Overhead Limit Exceeded: occurs when the GC is executing almost 100% of the time, not leaving the program any chance to do its job
Requested array size exceeds VM limit: the program tried to create an array larger than the VM allows
Unable to create new native thread: the program cannot create a new thread because the operating system doesn't allow it
and so on
It's not possible to state what's wrong without seeing your full test plan (at least a screenshot), as it might be the case that you added tons of Listeners and each of them stores the large DB query response in memory. Also share your jmeter.log file (definitely not in the form of a screenshot), which in the majority of cases contains either the cause of the problem or at least a clue.

disk I/O of a command line java program

I have a simple question, I've read up online but couldn't find a simple solution:
I'm running a java program on the command line as follows which accesses a database:
java -jar myProgram.jar
I would like a simple mechanism to see the number of disk I/Os performed by this program (on OSX).
So far I've come across iotop but how do I get iotop to measure the disk I/O of myProgram.jar?
Do I need a profiler like JProfiler do get this information?
iotop is a utility which gives you the top n processes in descending order of IO consumption/utilization.
Most importantly, it is a live monitoring utility, which means its output changes every n seconds (or whatever interval you specify). Though you can redirect its output to a file, you would then need to parse that file and plot a graph to get meaningful data out of it.
I would recommend using sar instead; you can read more about it here.
It is the lowest-level monitoring utility on Linux/Unix and will give you much more data than iotop.
The best thing about sar is that you can collect the data with a daemon while your program is running and then analyze it later using ksar.
In my view, you can follow the approach below:
Start sar monitoring, collecting data every n seconds; the value of n depends on the approximate execution time of your program.
Example: if your program takes 10 seconds to execute, then monitoring every second is good, but if it takes an hour, then monitor every 30-60 seconds. This minimizes the overhead of the sar process while keeping the data meaningful.
Wait for some time (so that you get data from before your program starts) and then start your program.
At the end of your program's execution,
wait for some time again (so that you get data after your program finishes),
then stop sar.
Monitor/visualize the sar data using ksar. To start with, check disk utilization and then IOPS per disk.
You can use profilers for the same thing, but they have a few drawbacks:
They need their own agents (and agents have their own overhead).
Some of them are not free.
Some of them are not easy to set up.
They may or may not provide enough of the required data.
Besides this, IMHO, using built-in/system-level utilities is always beneficial.
I hope this was helpful.
Your Java program will ultimately just be another process to the host system, so you need to filter the monitoring tool's output for your own process ID. Refer to the Scripts section of this blog post.
Also, even though you have tagged the question with OSX, do mention in the question body that you are using OSX.
If you are looking for offline data, that is provided by the proc filesystem on Unix-based systems, but unfortunately it is missing on OSX; see "Where is the /proc folder on Mac OS X?" and
/proc on Mac OS X
You might choose to write a small script to dump data from disk- and process-monitoring tools for your process ID. You can get the process ID in the script by process name: put the script in a loop that looks for the process name, and start it before you execute your Java program. When the script finds the process, it keeps dumping the relevant data from the commands you chose at the intervals you decided. Once your program ends, the log-dumping script also terminates.
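The find-the-PID part of such a script can also be done from Java itself. Here is a rough sketch (the ps flags are the portable BSD/OSX-compatible ones, and parsePidLine is a made-up helper name for this example) that locates a process ID by command name so you can point your monitoring commands at it:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Optional;

public class PidFinder {

    // Parse one "PID COMMAND" line from `ps -Ao pid,comm`; returns the PID
    // if the command column matches the wanted name, empty otherwise.
    static Optional<Long> parsePidLine(String line, String name) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length == 2 && parts[1].endsWith(name)) {
            try {
                return Optional.of(Long.parseLong(parts[0]));
            } catch (NumberFormatException ignored) {
                // header line ("PID COMMAND") or malformed output
            }
        }
        return Optional.empty();
    }

    // Run ps once and return the first matching PID; works on both Linux
    // and OSX, which lacks /proc. Call this in a loop to poll.
    static Optional<Long> findPid(String name) throws IOException {
        Process ps = new ProcessBuilder("ps", "-Ao", "pid,comm").start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(ps.getInputStream()))) {
            return r.lines()
                    .map(l -> parsePidLine(l, name))
                    .filter(Optional::isPresent)
                    .map(Optional::get)
                    .findFirst();
        }
    }
}
```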

Hadoop MapReduce Out of Memory on Small Files

I'm running a MapReduce job against about 3 million small files on Hadoop (I know, I know, but there's nothing we can do about it - it's the nature of our source system).
Our code is nothing special - it uses CombineFileInputFormat to wrap a bunch of these files together, then parses the file name to add it into the contents of the file, and spits out some results. Easy peasy.
So, we have about 3 million ~7kb files in HDFS. If we run our task against a small subset of these files (one folder, maybe 10,000 files), we get no trouble. If we run it against the full list of files, we get an out of memory error.
The error comes out on STDOUT:
#
# java.lang.OutOfMemoryError: GC overhead limit exceeded
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 15690"...
I'm assuming what's happening is this - whatever JVM is running the process that defines the input splits is getting totally overwhelmed trying to handle 3 million files, it's using too much memory, and YARN is killing it. I'm willing to be corrected on this theory.
So what I need to know is how to increase the YARN memory limit for the container that's calculating the input splits, not for the mappers or reducers. Then, I need to know how to make this take effect. (I've Googled pretty extensively on this, but with all the iterations of Hadoop over the years, it's hard to find a solution that works with the most recent versions...)
This is Hadoop 2.6.0, using the MapReduce API, YARN framework, on AWS Elastic MapReduce 4.2.0.
I would spin up a new EMR cluster and throw a larger master instance at it to see if that is the issue.
--instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.4xlarge InstanceGroupType=CORE,InstanceCount=1,InstanceType=m3.xlarge
If the master is running out of memory while computing the input splits, you can modify the configuration:
EMR Configuration
Instead of running the MapReduce job on 3 million individual files, you can merge them into manageable bigger files using any of the following approaches.
1. Create Hadoop Archive (HAR) files from the small files.
2. Create a sequence file for every 10K-20K files using a MapReduce program.
3. Create a sequence file from your individual small files using the forqlift tool.
4. Merge your small files into bigger files using Hadoop-Crush.
Once you have the bigger files ready, you can run the MapReduce on your whole data set.
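Hadoop's SequenceFile API does the real work in options 2-4, but the underlying idea is simple enough to sketch in plain Java (this illustrates the key/value packing concept only, not Hadoop's actual on-disk format): each small file becomes a length-prefixed (name, contents) record appended to one big file, so the cluster deals with a handful of large files instead of millions of tiny ones.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SmallFileMerger {

    // Append each small file to `out` as a length-prefixed (name, bytes)
    // record, mimicking the key/value idea behind a SequenceFile.
    static void merge(List<Path> smallFiles, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (Path file : smallFiles) {
            byte[] name = file.getFileName().toString().getBytes(StandardCharsets.UTF_8);
            byte[] body = Files.readAllBytes(file);
            data.writeInt(name.length);   // key length
            data.write(name);             // key: original file name
            data.writeInt(body.length);   // value length
            data.write(body);             // value: file contents
        }
        data.flush();
    }
}
```

Keeping the original file name as the key preserves the information the question's job parses out of the file name before merging.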

hadoop windows (work ok) linux java heap space

Here is my problem:
First of all, I'm working with Hadoop in a single-node configuration.
I'm developing an application with just one map function; in this map function I call around 10 other functions.
The application reads from a CSV file and processes a certain column. I have already built the jar file, so when I run this app against a CSV with 4000 rows on Windows 7 (using Cygwin) on a 4 GB RAM machine, the application works fine. But when I run it on Linux (Ubuntu) on a 2 GB RAM machine, it processes some rows and then throws a "Java heap space" error, or sometimes the thread is killed.
For Linux:
I have already tried changing the Hadoop HEAP_SIZE export and also the Xmx and Xms parameters in the app; it made some difference, but not much, and the error still happens...
Do you know why this is happening? Is it because of the 4 GB vs 2 GB RAM difference between the machines?
One thing I ran into with a mapper: if you call/use functions or objects that start their own threads from within the map function, this can easily create enough threads to use all the heap space for that JVM.
Each mapper will have its setup and cleanup functions called once. In my situation I was able to process all my data and put it into an ArrayList, then do the additional processing I needed in the cleanup function.
