We have a small test box with 512 MB of RAM. We wanted to see how many threads we could create in Java on this box. To our surprise, we couldn't create many. Essentially the minimum stack size you can set with -Xss is 64k. Simple math tells you that 64 KB × 7000 threads consumes about 437 MB, so we were only able to get up to around 7000 threads before we hit this error:
java.lang.OutOfMemoryError: unable to create new native thread.
Is this the true limit with Java? With 512 MB of RAM, can we really only squeeze in around 7k threads?
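For reference, here is a minimal sketch (not from the original poster) of the kind of test that reproduces this limit: it keeps starting parked threads until thread creation fails. The class name and the -Xss value in the comment are just illustrative.

// Keeps spawning parked daemon threads until the JVM refuses to create another.
// Run with e.g. java -Xss64k ThreadLimit (flag value is illustrative).
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class ThreadLimit {
    public static void main(String[] args) {
        AtomicInteger count = new AtomicInteger();
        try {
            while (true) {
                Thread t = new Thread(() -> LockSupport.park()); // thread does nothing but wait
                t.setDaemon(true);
                t.start();
                count.incrementAndGet();
            }
        } catch (OutOfMemoryError e) {
            System.out.println("Created " + count.get() + " threads before: " + e);
        }
    }
}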
Use asynchronous IO (java.nio) and you won't need 7k threads to support 7k clients; a few threads handling the IO (five, say) will be enough.
Take a look at Netty ;)
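To illustrate what the NIO suggestion means in practice, here is a rough sketch of a single-threaded selector loop; the port is arbitrary, and real servers (or Netty) add proper buffer management, write handling and worker pools on top of this.

// One selector thread accepting and reading from many client connections.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class SelectorServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));   // port is just an example
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                      // one thread waits on all channels
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) == -1) {
                        client.close();             // peer closed the connection
                    }
                    // ... handle the bytes in 'buffer' here ...
                }
            }
        }
    }
}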
One thread for each client is a really bad design.
Once you create your 7k threads, you're not going to have any memory left to do anything useful. Perhaps you should rethink the design of your application?
Anyway, isn't 512 MB quite small? Perhaps you could provide a bit more information about your application or its domain?
Keep in mind that you will never be able to dedicate 100% of the RAM to running Java threads. Some RAM is used by the OS and other running applications, so you will never have the full 512 MB available.
It's not a limit of the programming language; it's imposed at the operating system level.
More reading about it, for Windows:
Does Windows have a limit of 2000 threads per process?
Pushing the Limits of Windows: Processes and Threads (by Mark Russinovich)
You don't necessarily need one thread per client session. If you look at the way that a J2EE (or JavaEE) server handles multiple connections it uses a mixture of strategies including concurrency, queuing and swapping. Usually you can configure the maximum number of live concurrent instances and idle time-out values at deployment time to tune the performance of your application.
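As a rough illustration of the concurrency-plus-queuing idea (the pool sizes and queue length here are made up, not any particular server's defaults):

import java.util.concurrent.*;

public class BoundedPool {
    // Illustrative numbers: up to 50 concurrent workers, 500 queued requests.
    static final ExecutorService POOL = new ThreadPoolExecutor(
            10,                                         // core threads kept alive
            50,                                         // maximum concurrent worker threads
            60, TimeUnit.SECONDS,                       // idle time-out for surplus threads
            new ArrayBlockingQueue<>(500),              // pending requests wait here
            new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure when the queue is full

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            final int id = i;
            POOL.submit(() -> System.out.println("handled request " + id));
        }
        POOL.shutdown();
    }
}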
Try setting the maximum heap size (-Xmx) to a lower value and see whether the thread count can be increased. In a project at work I could allocate around 2.5k threads with -Xmx512m and about 4k threads with -Xmx96m.
The bigger your heap, the smaller your thread stack space (at least in my experience): thread stacks live outside the Java heap, so they compete with it for the remaining process address space.
Related
I have been working on a Spring Boot application where the JVM memory is limited to 2 GB.
At the controller, users send requests which are handled by my Executors thread pool.
It is a lightweight task where each thread needs only a few String variables' worth of memory while it is processing.
Considering I might deploy this application in PROD, is there a limit to the number of user requests I can cater to with this approach?
What is the maximum? If there is no fixed limit, is it simply reached when the JVM memory is full?
Thanks for the help.
You told us about your max memory, but we still have no clue how much memory one thread would consume. Similarly, you did not tell us how many CPU cores the application is allowed to use.
Depending on how your application uses compute resources, a different number of parallel threads may give the optimal throughput. If the application has to coexist with other applications, it may not even be the goal to use the compute resources to the maximum.
Therefore I advise making the size of your thread pool configurable, be it via a command-line parameter, an environment variable, a configuration file or otherwise. A sysadmin can then tune the application based on the available resources and the desired throughput.
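A minimal sketch of what that could look like, assuming the pool size comes from an environment variable (THREAD_POOL_SIZE is a made-up name) and defaults to the number of CPU cores:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConfigurablePool {
    public static void main(String[] args) {
        // Read the pool size from the environment; fall back to the core count.
        int size = Integer.parseInt(
                System.getenv().getOrDefault("THREAD_POOL_SIZE",
                        String.valueOf(Runtime.getRuntime().availableProcessors())));
        ExecutorService pool = Executors.newFixedThreadPool(size);
        System.out.println("Using a pool of " + size + " threads");
        pool.shutdown();
    }
}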
Considering I might deploy this application in PROD, Is there is a limit of user requests I can cater to with this approach.
It seems like you are trying to maximize concurrency. Many factors are missing from your description. First is the Spring configuration: for a Spring Boot application with Tomcat as the underlying servlet container, the default maximum thread count is 200, with 10,000 as the maximum number of connections. Second is the JVM heap size: more space for the heap means less space for thread stacks, which limits the thread count, since every thread also carries its own thread-local objects. The CPU core count of the deployment machine is missing as well.
To work out the exact concurrency you can support, it is better to run performance tests with mocked requests.
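For reference, these are the Tomcat knobs in a Spring Boot application.properties; the property names assume Spring Boot 2.3+ (older versions use server.tomcat.max-threads) and the values shown simply mirror the figures mentioned above:

server.tomcat.threads.max=200
server.tomcat.max-connections=10000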
Let's assume that you've set up your Spring Boot app server to accept incoming requests without any limits. If you assign 2 GB as your JVM heap size, then the number of concurrent requests you can handle depends on:
The memory taken up by each request
Other JVM overhead (app server, Spring and other libraries, etc.)
How fast your threads complete their requests and free up resources
Once your heap space is full, subsequent requests will start failing with OutOfMemoryError.
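A back-of-envelope sketch of how those factors combine (every number here except the 2 GB heap is an assumption you would have to measure with a profiler or heap dump):

public class CapacityEstimate {
    public static void main(String[] args) {
        long heapBytes         = 2L * 1024 * 1024 * 1024; // -Xmx2g
        long jvmAndAppOverhead = 512L * 1024 * 1024;      // server, Spring, caches (guess)
        long perRequestBytes   = 50L * 1024;              // a few Strings plus framework objects (guess)
        long roughConcurrentRequests = (heapBytes - jvmAndAppOverhead) / perRequestBytes;
        System.out.println("Rough in-flight request ceiling: " + roughConcurrentRequests);
    }
}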
I am developing a Netty-based server where the number of concurrent connections should be around 100,000. However, when I set that number to 100 I do not run out of memory, but when I increase it to 10,000 I get an out-of-memory buffer exception. Knowing that Netty can handle even more than I am expecting, I would like to know how to configure the ServerBootstrap to cater for such a large number.
Thank you.
How much memory are you running this program with? Since an OutOfMemoryError is being thrown, it looks like you can just change the -Xmx setting to something like -Xmx2048m and see if that works. This may not have anything to do with Netty per se.
If you are convinced that there is a leak, then use tools like VisualVM (which is free, by the way!) to analyse whether any Netty objects remain on the heap even after the requests and GCs are done.
On Linux, you generally get a 'too many open files' exception when running this many concurrent connections; there is a ton of documentation out there on how to resolve it.
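For what it's worth, here is a minimal ServerBootstrap sketch (assuming Netty 4; the port, backlog and thread counts are placeholders). If the error mentions direct buffers, also look at -XX:MaxDirectMemorySize, and raise the open-file limit (ulimit -n) as noted above.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class HighConnectionServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss   = new NioEventLoopGroup(1);  // accepts connections
        EventLoopGroup worker = new NioEventLoopGroup();   // handles IO for all connections
        try {
            ServerBootstrap b = new ServerBootstrap()
                    .group(boss, worker)
                    .channel(NioServerSocketChannel.class)
                    .option(ChannelOption.SO_BACKLOG, 1024)       // pending-accept queue
                    .childOption(ChannelOption.SO_KEEPALIVE, true)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // add your own handlers here
                        }
                    });
            b.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            worker.shutdownGracefully();
        }
    }
}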
Is there a way to limit the number of cores that Java uses?
And in the same vein, is it possible to limit how much of each core is used?
You can use taskset on Linux. You can also lower the priority of a process, but unless the CPU(s) are busy, a process will get as much CPU as it can use.
I have a library for dedicating thread to a core, called Java Thread Affinity, but it may have a different purpose to what you have in mind. Can you clarify why you want to do this?
I don't think there are built-in JVM options for these kinds of tweaks; however, you can limit CPU usage by setting the priority and/or CPU affinity of the JVM process.
If you are on Linux, take a look at cpulimit, which is an awesome tool for this kind of resource limitation.
https://github.com/haosdent/jcgroup
jcgroup is your best choice. You can use this library to limit CPU shares, disk I/O speed, network bandwidth, etc.
I am using Rackspace as a hosting provider, on their Cloud Server hosting with the 256 MB plan.
I am using Geronimo 2.2 to run my java application.
The server starts up with no problem and loads Geronimo quite fast. However, when I start to deploy my web application, it takes forever, and once it is deployed, it takes forever to navigate through the pages.
I've been monitoring the server activity; the CPU is not that busy, but 60% of the memory is being used up. Could this be the problem?
If so, what are my options? Should I consider upgrading this cloud server to something with more RAM, or changing a host provider to better suit my needs?
Edit:
I should note that even if I don't deploy my application, with just Geronimo loaded, I sometimes get a connection timeout when I try to shut Geronimo down.
Also, the database is on the same server as the application (although I wouldn't say it's query intensive).
Update:
After what @matiu suggested, I tried running free -m, and this is the output I get:
             total       used       free     shared    buffers     cached
Mem:           239        232          6          0          0          2
-/+ buffers/cache:        229          9
Swap:          509        403        106
This is a totally different result from running ps ux, which is how I got my previous 60% figure.
I also did an iostat check: about 25% iowait time, and the device is constantly reading and writing.
Update:
Upgraded my hosting to 512 MB and now it is up to speed! Something I should note: I had forgotten about Java's permanent generation memory, which Geronimo also uses. So it turns out I did need more RAM, and more RAM did solve my problem (as expected). Yay.
I'm guessing you're running into 'swapping'.
As you'll know, Linux swaps some memory out to disk. This is great for memory that isn't accessed very often.
When Java starts eating heaps and heaps of memory, Linux starts:
Swapping memory block A out to disk to make space to read in block B
Reading/writing block B
Swapping block B to disk to make space for some other block.
As disk is thousands of times slower than RAM, your machine grinds closer and closer to a halt as memory usage increases.
With 256 MB Cloud Servers you get 512 MB of Swap space.
Checking:
You can check if this is the case with free -m .. this page shows how to read the output:
Next I'd check with 'iostat 5' to see what the disk IO rate on the swap partition is. I would say a write rate of 300 or more means you're almost dead in the water. I'd say you'd want to keep the write rate of the swap partition down below 50 blocks a second and the read rate down below 500 blocks a second .. if possible both should be zero most of the time. Remember disk is 1000s of times slower than RAM.
You can check if it's Java eating the ram by running top and hitting shift+m to order the processes by memory consumption.
If you want .. you can disable the swap partition with swapoff -a .. then open up the web console, and hit the site a bit .. you'll soon see error messages in the console like 'OOM Killed process xxx' (OOM is for Out of Memory I think). If you see those that's linux trying to satisfy memory requests by killing processes. Once that happens, it's best to hard reboot.
Fixing:
If it's Java using the RAM .. this link might help.
I think the easy fix would be just to upgrade the size of the Cloud Server.
You may find that a different Java runtime works better.
If you run it in a 32 bit chroot it may use less RAM.
You should consider running a virtual dedicated Linux server from somewhere like Linode.
You'd have to worry about how to start a Java service and things like firewalls, etc., but once you get it right, you are in effect your own hosting provider, allowing you to do anything a standalone physical Linux box can do.
As for memory, I wouldn't upgrade until you have evidence that you do not have enough. 60% being used up is less than 100% used up...
Java normally assumes that it can take whatever is assigned to it. Meaning, if you give it a max of 200 MB, it thinks it's fine to take 200 MB even though it's using much less.
There is a way to make Java use less memory: the -Xincgc incremental garbage collector. It actually gives chunks of memory back to the system when it no longer needs them. This is a bit of a kept secret, really; you won't see anyone point it out...
In my experience, memory and CPU load on VPSes are closely related. Meaning, when the application server takes up all the available memory, CPU usage starts to skyrocket, finally making the application inaccessible.
This is just a side effect, though: you really need to investigate where your problems originate!
If the memory consumption is very high, then you can have multiple causes:
It's normal: maybe you have reached a point where all processes (application server, applications within it, background processes, daemons, the operating system, etc.) put together need that amount of memory. This is the least probable scenario.
A memory leak: this can happen due to a bug in a framework or some library (not likely) or in your own code (possible). This can be monitored and solved.
A huge number of requests: each request takes both CPU and memory to process. You can look at the correlation between requests per second and memory consumption; in other words, it can be monitored and resolved.
If you are interested in CPU usage:
Again, monitor the requests to your application. With a constant request rate, nothing extraordinary should happen.
One component may be exhausting most of the resources (maybe your database is installed on the same server and uses all the CPU power due to inefficient queries? A slow query log would help.)
As you can see, it's not a trivial task, but there are tools that can help you out. I personally use JavaMelody and Probe.
I have heard several people claim that you cannot scale the JVM heap size up. I've heard claims of the practical limit being 4 gigabytes (I heard an IBM consultant say that), 10 gigabytes, 32 gigabytes, and so on... I simply cannot believe any of those numbers and have been wondering about the issue for a while now.
So, I have a three-part question that I hope someone with experience can answer:
Given the following case how would you tune the heap and GC settings?
Would there be noticeable hiccups (pauses of the JVM, etc.) that end users would notice?
Should this really still work? I think it should.
The case:
64 bit platform
64 cores
64 gigabytes of memory
The application server is client facing (i.e. a JBoss/Tomcat web application server); complete pauses of the JVM would probably be noticed by end users
Sun JVM, probably 1.5
To prove I am not asking you guys to do my homework, this is what I came up with:
-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:-EliminateZeroing -Xmn768m -Xmx55000m
CMS should reduce the number of pauses, although it comes with overhead. The other CMS settings seem to default automatically based on the number of CPUs, so they look sane to me. The rest that I added are extras that might help or hurt performance in general, and they should probably be tested.
Definitely.
I think it's going to be difficult for anybody to give you anything more than general advice, without having further knowledge of your application.
What I would suggest is that you use VisualGC (or the VisualGC plugin for VisualVM) to actually look at what the garbage collection is doing when your app is running. Once you have a greater understanding of how the GC is working alongside your application, it'll be far easier to tune it.
#1. Given the following case how would you tune the heap and GC settings?
First, having 64 gigabytes of memory doesn't mean you have to use it all for one JVM. Actually, it rather means you can run many of them. Then, it is impossible to answer your question without access to your machine and application to measure and analyse things (knowing what your application is doing isn't enough). And no, I'm not asking for access to your environment :)
#2. Would there be noticeable hickups (pauses of JVM etc) that would be noticed by the end users?
The goal of tuning is to find a good compromise between the frequency and the duration of (major) GCs. With a ~55 GB heap, GC won't be frequent but will take a noticeable amount of time, for sure (the bigger the heap, the longer the major GC). Using a parallel or concurrent garbage collector will help on a multiprocessor system but won't entirely solve this issue. Why do you need ~55 GB (this is mega ultra huge for a webapp, IMO)? That's my question. I'd rather run many clustered JVMs to handle the load if required (at some point the database will become the bottleneck anyway with a data-oriented application).
#3. Should this really still work? I think it should.
Hmm... not sure I get the question. What is "this"? Instantiating a JVM with a big heap? Yes, it should. Is it equivalent to running several JVMs? No, certainly not.
PS: 4G is the maximum theoretical heap limit for the 32-bit JVM running on a 64-bit operating system (see Why can't I get a larger heap with the 32-bit JVM?)
PPS: On 64-bit VMs, you have 64 bits of addressability to work with resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides. (see How large a heap can I create using a 64-bit VM?)
Obviously the heap size is not unlimited, and the larger the heap, the more time your JVM will eventually spend on GC. Although it is possible to set the heap size quite high on a 64-bit JVM, I still think it's not really practical. The advice here is that it's better to have several JVMs running with the same parameters, i.e. a cluster of JBoss/Tomcat nodes running on the same physical machine; you will get better throughput.
EDIT: Also, your GC behaviour depends on the taxonomy of your heap. If you have a lot of short-lived objects and each request to the server creates a lot of them, then your GC will collect a lot of garbage very often, and with a large heap this will result in longer pauses. If you have very many long-lived objects (e.g. you cache most of your data in memory) and the number of short-lived objects is not that big, then a bigger heap size is fine.
As Chris Rice already wrote, I wouldn't expect any obvious problems with the GC for heap sizes up to 32-64GB, although there may of course be some point of your application logic, which can cause problems.
Not directly related to GC, but I would still recommend you to perform a realistic load test on your production system. I used to work on a project, where we had a similar setup (relatively large, clustered JBoss/Tomcat setup to serve a public web application) and without exaggeration, JBoss is not behaving very well under high load or with a high number of concurrent calls if you are using EJBs. JBoss is spending a lot of time in synchronized blocks when accessing and managing the EJB instance pools and if you opt for a cluster, it will even wait for intra-cluster network communication within these synchronized blocks. Be especially aware of poorly performing state replication, if you are using SFSBs.
Just to add some more switches I would use by default: -Xms55g can help reduce the ramp-up time because it frees Java from the need to check whether it can fall back to the initial size, and it also allows better initial internal sizing of the memory areas.
Additionally, we have had good experiences with NewSize, giving you a large young generation to get rid of short-term garbage: -XX:NewSize=1g. Most webapps create a lot of short-lived garbage that never survives request processing, so you can even make it bigger. With -Xms55g the VM already reserves a large chunk; maybe downsizing can help.
-Xincgc helps clean the young generation incrementally and returns the CPU to the user threads more often.
-XX:CMSInitiatingOccupancyFraction=70 If you really fill all that memory, try to start CMS garbage collection earlier.
-XX:+CMSIncrementalMode puts the CMS into incremental mode to return the cpu to the user threads more often.
Attach to the process with jstat -gc -h 10 <pid> 1s and watch the GC working.
Will you really fill up the memory? I assume that 64 CPUs for request processing might even be able to work with less memory. What do you store in there?
Depending on your GC pause analysis, you may wish to implement Incremental mode whereby the long pause may be broken out over a period of time.
I have found that memory architecture plays a part with large memory sizes. Applications in general don't perform as well if they use more than one memory bank. The JVM appears to suffer as well, especially the GC, which has to sweep the whole memory.
If you have an application which doesn't fit into one memory bank, it has to pull in memory which is not local to one processor and use memory local to another processor.
On Linux you can run numactl --hardware to see the layout of processors and memory banks.