I have a Spring Boot application running on an Ubuntu 20 EC2 machine, where I am creating around 200,000 threads to write data into Kafka. However, it fails repeatedly with the following error:
[138.470s][warning][os,thread] Attempt to protect stack guard pages failed (0x00007f828d055000-0x00007f828d059000).
[138.470s][warning][os,thread] Attempt to deallocate stack guard pages failed.
OpenJDK 64-Bit Server VM warning: [138.472s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
INFO: os::commit_memory(0x00007f828cf54000, 16384, 0) failed; error='Not enough space' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 16384 bytes for committing reserved memory.
I have tried increasing the memory of my EC2 instance to 64 GB, which has been of no use. I am using docker stats and htop to monitor the memory footprint of the process, and when it touches around 10 GB it fails with the given error.
I have also tried increasing the heap size and max memory for the process.
docker run --rm --name test -e JAVA_OPTS=-Xmx64g -v /workspace/logs/test:/logs -t test:master
Below is my code:
final int LIMIT = 200000;
ExecutorService executorService = Executors.newFixedThreadPool(LIMIT);
final CountDownLatch latch = new CountDownLatch(LIMIT);
for (int i = 1; i <= LIMIT; i++) {
    final int counter = i;
    executorService.execute(() -> {
        try {
            kafkaTemplate.send("rf-data", Integer.toString(123), "asdsadsd");
            kafkaTemplate.send("rf-data", Integer.toString(123), "zczxczxczxc");
            latch.countDown();
        } catch (Exception e) {
            logger.error("Error sending data: ", e);
        }
    });
}
try {
    latch.await();
} catch (InterruptedException e) {
    logger.error("error awaiting latch", e);
}
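For comparison, a bounded pool reuses a small number of native threads instead of creating one per task, which is what the EAGAIN from pthread_create is pushing against. A minimal sketch under the same assumptions as the question (the kafkaTemplate bean and the rf-data topic come from the question; the pool size here is an illustrative guess, not a tuned value):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final int LIMIT = 200000;
// A small fixed pool: the 200,000 tasks queue up instead of each
// getting its own native thread and 1 MB stack.
ExecutorService executorService =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() * 2);
final CountDownLatch latch = new CountDownLatch(LIMIT);
for (int i = 1; i <= LIMIT; i++) {
    executorService.execute(() -> {
        try {
            kafkaTemplate.send("rf-data", Integer.toString(123), "asdsadsd");
            kafkaTemplate.send("rf-data", Integer.toString(123), "zczxczxczxc");
        } finally {
            latch.countDown(); // count down even on failure so await() can return
        }
    });
}
try {
    latch.await();
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
executorService.shutdown();

A few threads are typically enough here, since KafkaProducer is thread-safe and kafkaTemplate.send() is already asynchronous: it hands the record to the producer's internal buffer and returns.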
Related
I have a problem with my cluster.
The cluster has:
2 primary workers
2 secondary workers
30 GB of RAM
The cluster runs correctly and executes Hive jobs for at least about 10 hours. After 10 hours I get a Java heap space error:
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236) ~[?:1.8.0_292]
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191) ~[?:1.8.0_292]
at org.apache.hadoop.ipc.ResponseBuffer.toByteArray(ResponseBuffer.java:53) ~[hadoop-common-3.2.2.jar:?]
at org.apache.hadoop.ipc.Client$Connection$3.run(Client.java:1159) ~[hadoop-common-3.2.2.jar:?]
... 5 more
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
INFO : Completed executing command(queryId=hive_20210923102707_66b4cd11-7cfb-4910-87bc-7f062ce1b00e); Time taken: 75.101 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)
I tried to set this configuration, but it didn't help:
SET hive.execution.engine = tez;
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET mapreduce.job.reduces=1;
SET hive.auto.convert.join=false;
SET hive.stats.column.autogather=false;
SET hive.optimize.sort.dynamic.partition=true;
Is there any way to clean up the Java heap space, or have I got some configuration wrong?
The problem is solved by restarting the cluster.
It seems that the default Tez container and heap sizes set by Dataproc are too small for your job. You can update the following Hive properties to increase them:
hive.tez.container.size: The YARN container size in MB for Tez. If set to "-1" (default value), it picks the value of mapreduce.map.memory.mb. Consider increasing the value if the query / Tez app fails with something like "Container is running beyond physical memory limits. Current usage: 4.1 GB of 4 GB physical memory used; 6.0 GB of 20 GB virtual memory used. Killing container.". Example: SET hive.tez.container.size=8192 in Hive, or --properties hive:hive.tez.container.size=8192 when creating the cluster.
hive.tez.java.opts: The JVM options for the Tez YARN application. If not set, it picks the value of mapreduce.map.java.opts. This value should be less than or equal to the container size. Consider increasing the JVM heap size if the query / Tez app fails with an OOM exception. Example: SET hive.tez.java.opts=-Xmx8g, or --properties hive:hive.tez.java.opts=-Xmx8g when creating the cluster.
You can check /etc/hadoop/conf/mapred-site.xml to get the value of mapreduce.map.java.opts, and /etc/hive/conf/hive-site.xml for the 2 Hive properties mentioned above.
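Putting the two properties together in a Hive session might look like this (a sketch; the 8192 MB container size comes from the example above, and the -Xmx6g value is an assumption chosen only to stay safely below the container size):

SET hive.tez.container.size=8192;
SET hive.tez.java.opts=-Xmx6g;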
I have the following env:
$ ulimit -s
100000
$ ulimit -i
63645
$ cat /proc/sys/kernel/threads-max
127626
$ cat /proc/sys/vm/max_map_count
600000
$ cat /proc/sys/kernel/pid_max
200000
$ java -Xmx4G -Xss256k -cp . ThreadCreation
...
11542
11543
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:717)
at ThreadCreation.main(ThreadCreation.java:15)
The following test class:
public class ThreadCreation {
    public static void main(String[] args) {
        try {
            for (int i = 0; i < 100000; i++) {
                System.out.println(i);
                new Thread("Thread-" + i) {
                    @Override
                    public void run() {
                        try {
                            Thread.sleep(1000000000);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                }.start();
            }
        } catch (Throwable t) {
            t.printStackTrace();
            System.exit(0);
        }
    }
}
This is the ulimit status:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63645
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 100000
cpu time (seconds, -t) unlimited
max user processes (-u) 63645
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Can you please help me figure out what I'm doing wrong? Thanks! It's OpenJDK 1.8.0. I know I should use an ExecutorService, but this is just for testing/demo purposes.
There should be enough memory:
$ free
total used free shared buff/cache available
Mem: 16336132 5935372 3513992 1377844 6886768 8600916
Swap: 16678908 0 16678908
I tried lots of suggestions from other Stack Overflow questions, but none worked for me; that's why I'm opening a new question.
Ah, found the answer here: How to increase maximum number of JVM threads (Linux 64bit)
The culprit was systemd, which capped the number of tasks per user at around 12288; increasing UserTasksMax helped me break the 12k limit.
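To inspect and raise the cap (a sketch; it assumes systemd-logind enforces the limit through UserTasksMax in /etc/systemd/logind.conf, as in the linked answer):

$ systemctl show "user-$(id -u).slice" --property=TasksMax
$ sudo vim /etc/systemd/logind.conf    # set UserTasksMax=100000 under [Login]
$ sudo systemctl restart systemd-logind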
I am trying to continuously start new threads with Java for testing purposes. Here is my code.
import java.util.Timer;
import java.util.TimerTask;

class MakeNewThreads extends TimerTask {
    public static long threadNum = 0;

    public void run() {
        Thread thread = new Thread() {
            public void run() {
                try {
                    threadNum++;
                    System.out.println(threadNum);
                    Thread.sleep(10000000);
                } catch (InterruptedException e) { }
            }
        };
        thread.start();
    }
}

public class Main {
    public static void main(String[] args) {
        Timer timer = new Timer();
        timer.schedule(new MakeNewThreads(), 0, 100);
    }
}
However, I hit the following error before reaching the maximum user process limit:
Exception in thread "Timer-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at MakeNewThreads.run(Main.java:16)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
I have checked the user process limit with ulimit -u. The result is 81920, but I can only create about 11,000 threads before hitting java.lang.OutOfMemoryError. In addition, if I manually set the user process limit lower, like 1000, I still cannot even approach the limit (I can create around 400 threads in this case). What could be the possible reasons for this? Thanks in advance.
Update 1:
I have tried methods mentioned in How to increase maximum number of JVM threads (Linux 64bit)
I tried the sample program on that page. The result is around 11832.
I have tried to run the program with -Xss256K -Xmx1G. The result does not change.
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127051
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 81920
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
cat /proc/sys/kernel/threads-max
254103
cat /proc/sys/kernel/pid_max
4194303
cat /proc/sys/vm/max_map_count
262144
Update 2: The problem seems to be solved. When I set the user process limit to 7000, I can create around 7000 threads. I supposed previously that it was still a memory problem, but I cannot understand why I could only create up to 11,000 threads, since 11000 × 256 KB / 1024 / 1024 ≈ 2.7 GB, which is far below the 32 GB total memory of my desktop. I don't think other background threads can take up 30 GB. Could anyone explain this? Thanks for all the help.
I am getting java.lang.OutOfMemoryError errors, even when I still have enough free RAM. The memory dumps I took were between 200MB and 1GB, while my server has 24GB of RAM. I set -Xmx12288m -Xms12288m.
Also, when I try to log in to the server, I frequently get
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
I narrowed it down to the code snippet below:
import java.io.IOException;

import org.snmp4j.Snmp;
import org.snmp4j.transport.DefaultUdpTransportMapping;

long n = 0;
while (true) {
    DefaultUdpTransportMapping transport = null;
    try {
        transport = new DefaultUdpTransportMapping();
        transport.listen();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    // } finally {               // (*) I forgot this
    //     transport.close();    // (*) I forgot this
    }
    n++;
    double freeMemMB = Runtime.getRuntime().freeMemory() / 1024 / 1024;
    System.out.println("Created " + n
            + " DefaultUdpTransportMappings. Free Mem (mb): "
            + freeMemMB);
}
Output (on my developer machine, with mvn exec:java):
Created 2026 DefaultUdpTransportMappings. Free Mem (mb): 299.0
Created 2027 DefaultUdpTransportMappings. Free Mem (mb): 299.0
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.snmp4j.util.DefaultThreadFactory$WorkerThread.run(DefaultThreadFactory.java:91)
at org.snmp4j.transport.DefaultUdpTransportMapping.listen(DefaultUdpTransportMapping.java:168)
at App.main(App.java:19)
... 6 more
I found that I get the errors because I don't close the DefaultUdpTransportMapping. Enabling the finally { ... } block solves the problem. Now I'm wondering which limits (if not the amount of free memory) I reached. The ulimits on the server are:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 191968
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
On my developer Mac:
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 709
-n: file descriptors 2560
Which limit did I reach?
The java.lang.OutOfMemoryError: unable to create new native thread message is confusing, since it does not really have to do with running out of heap memory. Therefore the heap size settings (-Xmx and -Xms) have no influence on this case.
The exception is thrown when a new operating system thread cannot be created for your application, either because the maximum number of processes/open file handles has been reached or because there is no memory left on the system to create a new thread.
Regarding the ulimit settings, it can be either the number of file descriptors, the stack size, or the number of processes. The stack size is a per-thread number: the number of threads times the stack size gives the amount of memory used for thread stacks.
Usually, as in your case, getting this exception means your application is not closing its threads properly and keeps holding on to system resources. This is why closing the transport fixed the issue for you.
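For reference, a minimal sketch of the corrected loop from the question; the only substantive change is the finally block that closes each transport, which is what the asker confirmed fixed it (the catch around close() is a defensive assumption, since close() declares IOException in some snmp4j versions):

import java.io.IOException;
import org.snmp4j.transport.DefaultUdpTransportMapping;

while (true) {
    DefaultUdpTransportMapping transport = null;
    try {
        transport = new DefaultUdpTransportMapping();
        transport.listen();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (transport != null) {
            try {
                transport.close(); // releases the listener thread and the UDP socket
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}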
I get this error on my UNIX server when running my Java server:
Exception in thread "Thread-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at [... where ever I launch a new Thread ...]
It happens every time I have about 600 threads running.
I have set this limit on the server:
$> ulimit -s 128
What looks strange to me is the result of this command, which I ran when the bug last occurred:
$> free -m
total used free shared buffers cached
Mem: 2048 338 1709 0 0 0
-/+ buffers/cache: 338 1709
Swap: 0 0 0
I launch my java server like this:
$> /usr/bin/java -server -Xss128k -Xmx500m -jar /path/to/myJar.jar
My debian version:
$> cat /etc/debian_version
5.0.8
My java version:
$> java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
My question: I have read on the Internet that my program should handle something like 5000 threads or so. So what is going on, and how do I fix it?
Edit: this is the output of ulimit -a when I open a shell:
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 794624
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 100000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 794624
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I run the script as a daemon from init.d, and this is what I run:
DAEMON=/usr/bin/java
DAEMON_ARGS="-server -Xss128k -Xmx1024m -jar /path/to/myJar.jar"
ulimit -s 128 && ulimit -n 10240 && start-stop-daemon -b --start --quiet --chuid $USER -m -p $PIDFILE --exec $DAEMON -- $DAEMON_ARGS \
|| return 2
Edit 2: I have come across this Stack Overflow question with a Java test for threads: how-many-threads-can-a-java-vm-support
public class DieLikeADog {
    private static Object s = new Object();
    private static int count = 0;

    public static void main(String[] argv) {
        for (;;) {
            new Thread(new Runnable() {
                public void run() {
                    synchronized (s) {
                        count += 1;
                        System.err.println("New thread #" + count);
                    }
                    for (;;) {
                        try {
                            Thread.sleep(100);
                        } catch (Exception e) {
                            System.err.println(e);
                        }
                    }
                }
            }).start();
        }
    }
}
On my server, the program crashes after 613 threads. Now I'm certain this is not normal and is only related to my server configuration. Can anyone help, please?
Edit 3:
I have come across this article, and many others, explaining that Linux can't create 1000 threads, but you are telling me that you can do it on your systems. I don't understand.
I have also run this script on my server: threads_limits.c, and the limit is around 620 threads.
My website is now offline, and this is the worst thing that could have happened to my project.
I don't know how to recompile glibc and this stuff. It's too much work, IMO.
I guess I should switch to Windows Server, because none of the settings proposed on this page made any difference: the limit on my system is between 600 and 620 threads, no matter the program involved.
Just got the following information: this is a limitation imposed by my host provider. It has nothing to do with programming or Linux.
The underlying operating system (Debian Linux in this case) does not allow the process to create any more threads. See here for how to raise the maximum: Maximum number of threads per process in Linux?
I have read on Internet that my program should handle something like
5000 threads or so.
This depends on the limits set in the OS, the number of running processes, etc. With correct settings you can easily reach that many threads. I'm running Ubuntu on my own computer, and I can create around 32,000 threads in a single Java program before hitting the limit, with all my "normal stuff" running in the background (this was done with a test program that just created threads that went to sleep immediately in an infinite loop). Naturally, that many threads actually doing something would probably grind consumer hardware to a halt pretty fast.
Can you try the same command with a smaller stack size, "-Xss64k", and pass on the results?
Your JVM fails to allocate stack or some other per-thread memory. Lowering the stack size with -Xss will help increase the number of threads you can create before OOM occurs (but the JVM will not let you set an arbitrarily small stack size).
You can confirm this is the problem by seeing how the number of threads you can create changes as you tweak -Xss, or by running strace on your JVM (you'll almost certainly see an mmap() returning ENOMEM right before the exception is thrown).
Also check your ulimit on virtual size, i.e. ulimit -v. Increasing this limit should let you create more threads with the same stack size. Note that the resident set size limit (ulimit -m) is ineffective in the current Linux kernel.
Also, lowering -Xmx can help by leaving more memory for thread stacks.
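One way to run that strace check (a sketch; the java invocation is the one from the question, and the grep just filters strace's output for failed mappings):

$ strace -f -e trace=mmap /usr/bin/java -server -Xss128k -Xmx500m -jar /path/to/myJar.jar 2>&1 | grep ENOMEM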
I am starting to suspect that the "Native POSIX Thread Library" is missing.
>getconf GNU_LIBPTHREAD_VERSION
Should output something like:
NPTL 2.13
If not, the Debian installation is messed up. I am not sure how to fix that, but installing Ubuntu Server seems like a good move...
With ulimit -n 100000 (open file descriptors), the following program should be able to handle 32,000 threads or so.
Try it:
package test;

import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.concurrent.Semaphore;

public class Test {

    final static Semaphore ss = new Semaphore(0);

    static class TT implements Runnable {
        @Override
        public void run() {
            try {
                Socket t = new Socket("localhost", 47111);
                InputStream is = t.getInputStream();
                for (;;) {
                    is.read();
                }
            } catch (Throwable t) {
                System.err.println(Thread.currentThread().getName() + " : abort");
                t.printStackTrace();
                System.exit(2);
            }
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        try {
            Thread t = new Thread() {
                public void run() {
                    try {
                        ArrayList<Socket> sockets = new ArrayList<Socket>(50000);
                        ServerSocket s = new ServerSocket(47111, 1500);
                        ss.release();
                        for (;;) {
                            Socket t = s.accept();
                            sockets.add(t);
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                        System.exit(1);
                    }
                }
            };
            t.start();
            ss.acquire();
            for (int i = 0; i < 30000; i++) {
                Thread tt = new Thread(new TT(), "T" + i);
                tt.setDaemon(true);
                tt.start();
                System.out.println(tt.getName());
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    return;
                }
            }
            for (;;) {
                System.out.println();
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    return;
                }
            }
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
}
Related to the OP's self-answer, but I do not yet have the reputation to comment.
I had the identical issue when hosting Tomcat on a V-Server.
All standard means of system checks (process count/limits, available RAM, etc.) indicated a healthy system, while Tomcat crashed with variants of "out of memory / resources / GCThread exceptions".
It turns out some V-Servers have an extra configuration that limits the number of allowed threads per process.
In my case (Ubuntu V-Server with Strato, Germany) this was even documented by the hoster, and the restriction can be lifted manually.
Original documentation by Strato (German) here: https://www.strato.de/faq/server/prozesse-vs-threads-bei-linux-v-servern/
tl;dr: How to fix:
Inspect the thread limit per process:
systemctl show --property=DefaultTasksMax
In my case the default was 60, which was insufficient for Tomcat. I changed it to 256:
vim /etc/systemd/system.conf
Change the value for:
DefaultTasksMax=60
to something higher, e.g. 256. (The HTTPS connector of Tomcat has a default thread pool of 200, so it should be at least 200.)
Then reboot to make the changes take effect.
It's running out of memory.
You also need to change the ulimit. If your OS does not give your app enough memory, -Xmx, I suppose, will not make any difference. I guess the -Xmx500m is having no effect.
Try
ulimit -m 524288    # 512 MB; ulimit -m takes kilobytes
with -Xmx512m