In my application I run some threads with untrusted code, so I have to prevent memory from being exhausted. I have a WatchDog which analyses the running time of the current thread (the threads are run serially).
But how can I determine the memory usage?
I only know how to get the memory usage of the whole VM with Runtime.totalMemory().
If there were a way to find out the usage of a single thread, or of a single process, that would be great. With the memory usage of the process I could calculate the usage of the thread anyway.
Since a JVM executing a Java program is a single process, you don't have to worry about that: all threads share the same memory space in the JVM process.
Hence it is sufficient to rely on
Runtime.totalMemory()
Runtime.freeMemory()
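For example, a watchdog thread could poll these two methods and react when whole-JVM usage crosses a threshold. A minimal sketch, with a placeholder limit and polling interval:

// Minimal sketch: samples whole-JVM heap usage and reacts when it crosses a limit.
public class MemoryWatchdog implements Runnable {
    private static final long LIMIT_BYTES = 512L * 1024 * 1024; // assumed limit

    @Override
    public void run() {
        Runtime rt = Runtime.getRuntime();
        while (!Thread.currentThread().isInterrupted()) {
            long used = rt.totalMemory() - rt.freeMemory();
            if (used > LIMIT_BYTES) {
                // React here: log, cancel the current task, etc.
                System.err.println("JVM heap usage above limit: " + used + " bytes");
            }
            try {
                Thread.sleep(1000); // poll once per second
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}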
A Java application cannot control the amount of memory (or CPU) used by its threads,
irrespective of whether the threads are running trusted or untrusted code. There are no APIs for doing
this in current generation JVMs. And there are certainly no APIs for monitoring a thread's usage of memory. (It is not even clear that this is a meaningful concept ... )
The only way you can guarantee to control the resource usage of untrusted Java code is to run the code in a separate JVM, and use operating system level resource controls (such as ulimit, nice, sigstop, etc) and "-Xmx" to limit that JVM's resource usage.
Some time back, Sun produced JSR 121, which aimed to address this issue. It would allow an application to be split into parts (called "isolates") that communicate via message passing, and offered the ability for one isolate to monitor and control another. Unfortunately, the Isolate APIs have yet to be implemented in any mainstream JVM.
What you need to do is to run the untrusted code in its own process/JVM. This is possible using the JNI interfaces (if your operating system permits it).
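For instance, you could start the child JVM with a capped heap using ProcessBuilder; the jar and main-class names below are placeholders, and any OS-level limits (ulimit, nice, etc.) still have to be applied outside Java. A rough sketch:

import java.io.File;
import java.io.IOException;

// Sketch: launch untrusted code in a child JVM whose heap is capped with -Xmx.
// "untrusted.jar" and "untrusted.Main" are placeholders for your own artifacts.
public class SandboxLauncher {
    public static Process launch() throws IOException {
        String javaBin = System.getProperty("java.home") + File.separator + "bin"
                + File.separator + "java";
        ProcessBuilder pb = new ProcessBuilder(
                javaBin, "-Xmx64m", "-cp", "untrusted.jar", "untrusted.Main");
        pb.inheritIO(); // forward the child's output to this process
        return pb.start();
    }
}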
Related
Is there a way to share core library between Java processes (or other way to minimize JVM initial memory impact)
So here's my case. I'm playing with microservices, and I'm running quite a lot of them. I'm setting their heap to 128M, as that's enough for them. But I've noticed that the Linux process is consuming much more.
If I understand correctly from here
Max memory = [-Xmx] + [-XX:MaxPermSize] + number_of_threads * [-Xss]
although I am using Java 8, so PermGen size is probably no longer an issue? Or is it?
There is an initial "core" JVM memory footprint... and I was wondering if you have heard of a way to somehow share that "core" memory between processes (as it's really the same), or any way to deal with that extra cost when running many Java processes.
Conceptually you're asking if you can fork a JVM - since forking (generally) uses copy-on-write memory semantics this can be an effective space-saving measure. Unfortunately as discussed in this answer forking the JVM is not supported, and generally not practical. Non-Unix systems cannot fork efficiently, and there are numerous other side-effects that a forked JVM would have to resolve in messy ways. Theoretically you could probably fork a JVM process, but you'd be walking squarely into "undefined behavior" territory.
The "right" way to avoid JVM startup costs is to reduce the number of JVMs you need to start up in the first place. Java is a highly-concurrent language that supports shared access to common memory out of the box via its threading model. If you can refactor your code to run concurrently in the same JVM you'll see much better performance.
Each Java application runs in a Java Virtual Machine instance. I am really confused about the aspects below, and Googling has confused me even more; different articles on different sites say different things.
1) If I have a web service written in Java, it will need a JVM instance to run. So can the JVM be made a daemon process?
2) If yes, when we run any other Java application, will it use this instance of the JVM or create a new one?
3) The main memory available on any machine is constant. When we start n Java processes simultaneously without providing any initial heap size, how is the heap size distributed among the processes?
4) Is there any process that manages these n JVM instances, or is it managed by the OS itself?
5) When stop-the-world happens during a GC, are other JVM instances (different threads, I assume) affected?
1) If I have a web service written in Java, it will need a JVM instance to run. So can the JVM be made a daemon process?
Yes it can. How it is done depends on the O/S and on the web server container itself.
2) If yes, when we run any other Java application, will it use this instance of the JVM or create a new one?
No. Each Java application uses an independent JVM.
Each JVM is a separate process, and that means there is no sharing of stacks, heaps, etcetera. (Generally, the only things that might be shared are the read only segments that hold the code of the core JVM and native libraries ... in the same way that normal processes might share code segments.)
3) The main memory available on any machine is constant. When we start n Java processes simultaneously without providing any initial heap size, how is the heap size distributed among the processes?
The mechanism for deciding how big to make the heap if you don't specify a size depends on the JVM / platform / version you are using, and on whether you are using the "client" or "server" model (for HotSpot JVMs). The heuristic doesn't take account of the number or size of other JVMs.
Reference: https://stackoverflow.com/a/4667635/139985
In practice, you would probably be better off specifying the heap size directly.
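If you want to see what the heuristic actually chose on a particular machine, you can ask the running JVM; a trivial check:

// Prints the maximum heap the JVM settled on (or was given via -Xmx).
public class HeapCheck {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}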
4) Is there any process that manages these n JVM instances, or is it managed by the OS itself?
Neither. The number of JVM instances is determined by the actions of the various things that can start processes; e.g. daemon scripts, command scripts, users typing commands at the command line, etcetera. Ultimately, the OS may refuse to start any more processes if it runs out of resources, but JVMs are not treated any differently from other processes.
5) When stop-the-world happens during a GC, are other JVM instances (different threads, I assume) affected?
No. The JVMs are independent processes. They don't share any mutable state. Garbage collection operates on each JVM independently.
1) See How to Daemonize a Java Program?
2) A new instance of the JVM will be created.
3) The same way as memory is shared between all other processes.
4) It is managed by the O/S.
5) Other instances are not affected.
If your instances have to coordinate their work, you can create a single main instance which runs/stops the other instances.
You did not explain why you need multiple JVM instances. Probably a single instance would work better.
I have a Tomcat webapp which does some pretty memory- and CPU-intensive tasks on behalf of clients. This is normal and is the desired functionality. However, when I run Tomcat, memory usage skyrockets over time to upwards of 4.0GB, at which point I usually kill the process as it's messing with everything else running on my development machine.
I thought I had inadvertently introduced a memory leak with my code, but after checking into it with VisualVM, I'm seeing a different story.
VisualVM is showing the heap as taking up approximately a GB of RAM, which is what I set it to do with CATALINA_OPTS="-Xms256m -Xmx1024m".
Why is my system seeing this process as taking up a ton of memory when according to VisualVM, it's taking up hardly any at all?
After a bit of further sniffing around, I'm noticing that if multiple jobs are running simultaneously in the applications, memory does not get freed. However, if I wait for each job to complete before submitting another to my BlockingQueue serviced by an ExecutorService, then memory is recycled effectively. How can I debug this? Why would garbage collection/memory reuse differ?
You can't control what you want to control. -Xmx only controls the Java heap; it doesn't control the JVM's consumption of native memory, which is consumed completely differently depending on the implementation. VisualVM only shows you what the heap is consuming; it doesn't show what the entire JVM is consuming as native memory as an OS process. You will have to use OS-level tools to see that, and they will report radically different numbers, usually much, much larger than anything VisualVM reports, because the JVM uses native memory in an entirely different way.
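If you want to see, from inside the JVM itself, that there is more than just the heap, the standard MemoryMXBean reports heap and non-heap usage separately. A small sketch (even these two figures together still understate the OS-level number, since thread stacks and some native allocations are not included):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Sketch: compare heap usage (what VisualVM's heap graph shows) with the
// JVM-reported non-heap usage.
public class MemoryReport {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        System.out.println("Heap used:     " + heap.getUsed() / (1024 * 1024) + " MB");
        System.out.println("Non-heap used: " + nonHeap.getUsed() / (1024 * 1024) + " MB");
    }
}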
From the following article, Thanks for the Memory (Understanding How the JVM Uses Native Memory on Windows and Linux):
Maintaining the heap and the garbage collector uses native memory you can't control.
More native memory is required to maintain the state of the memory-management system maintaining the Java heap. Data structures must be allocated to track free storage and record progress when collecting garbage. The exact size and nature of these data structures varies with implementation, but many are proportional to the size of the heap.
and the JIT compiler uses native memory just like javac would
Bytecode compilation uses native memory (in the same way that a static compiler such as gcc requires memory to run), but both the input (the bytecode) and the output (the executable code) from the JIT must also be stored in native memory. Java applications that contain many JIT-compiled methods use more native memory than smaller applications.
and then you have the classloader(s) which use native memory
Java applications are composed of classes that define object structure and method logic. They also use classes from the Java runtime class libraries (such as java.lang.String) and may use third-party libraries. These classes need to be stored in memory for as long as they are being used. How classes are stored varies by implementation.
I won't even start quoting the section on threads; I think you get the idea. -Xmx doesn't control what you think it controls: it controls the JVM heap, not everything goes in the JVM heap, and the heap itself takes up more native memory than what you specify, for management and bookkeeping.
Plain and simple: the JVM uses more memory than what is supplied in -Xms, -Xmx, and the other command-line parameters.
Here is a very detailed article on how the JVM allocates and manages memory; it isn't as simple as what you expected based on the assumptions in your question, and it is well worth a comprehensive read.
Thread stack size in many implementations has minimum limits that vary by operating system and sometimes by JVM version; the thread stack setting is ignored if you set the limit below the native minimum for the JVM or the OS (ulimit on *nix sometimes has to be set instead). Other command-line options work the same way, silently defaulting to higher values when values that are too small are supplied. Don't assume that all the values you pass in represent what is actually used.
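The same caveat applies when you request a stack size programmatically; a small sketch (the 64 KB request is just an illustration, and the JVM is free to round it up or ignore it):

// Sketch: requesting a per-thread stack size. As with -Xss, the value is only
// a hint; some JVMs/platforms silently enforce a minimum or ignore it entirely.
public class StackSizeDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println("running with a requested 64 KB stack");
        Thread t = new Thread(null, task, "small-stack-worker", 64 * 1024);
        t.start();
        t.join();
    }
}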
The classloaders (and Tomcat has more than one) eat up lots of memory that isn't easily documented. The JIT eats up a lot of memory, trading space for time, which is a good trade-off most of the time.
You should also check CPU usage and the garbage collector. It is possible that garbage collection pauses, and the CPU the GC consumes, are further slowing down your machine.
Is it possible to make some subset of threads (e.g. from a specific ThreadPool) allocate memory from their own heap? E.g. most of the threads allocate from the regular shared heap, and a few worker threads allocate from individual heaps (1:1 per thread).
The intent is to ensure safe execution of the code in a shared environment: a typical worker is stateless and runs on a separate thread, and processing of one request should not consume more than 4MB of heap.
Update #1
Re: But why are you worried about "safe execution" and unpredictable increasing of heap consumption?
The point is about safe hosting of arbitrary 3rd-party Java code within my process. One of the points is not to get an "Out of Memory" error for my entire process because of bugs in the 3rd-party code.
Update #2
Re: As for limiting memory usage per thread, in Java the language it's impossible
According to the investigation I did before posting this question, my opinion is the same; I'm just hoping I'm missing something.
The only possible alternative solutions for my use case, as I see them right now, are:
1) How much memory does my java thread take? - track thread memory usage from some governor thread and terminate misbehaving threads (a sketch of one way to do this follows after this list)
2) Run Java code on my own JVM - Yes, it is possible. You can download an open-source JVM implementation, modify it ... :)
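Regarding option 1, one way to track per-thread allocation is the HotSpot-specific com.sun.management.ThreadMXBean. Note that this reports bytes allocated by a thread, not bytes it is still holding live, and it is not available on every JVM. A sketch:

import java.lang.management.ManagementFactory;

// Sketch: a governor could poll per-thread allocation via the HotSpot-specific
// com.sun.management.ThreadMXBean and act on threads that allocate too much.
public class AllocationGovernor {
    public static long allocatedBytes(long threadId) {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        if (!bean.isThreadAllocatedMemorySupported()) {
            return -1; // not supported on this JVM
        }
        return bean.getThreadAllocatedBytes(threadId);
    }

    public static void main(String[] args) {
        long self = Thread.currentThread().getId();
        System.out.println("Allocated so far: " + allocatedBytes(self) + " bytes");
    }
}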
Check out Java non-blocking memory allocation: threads are usually allocating memory from their own allocation blocks already. So if speed is the concern, Sun has done it for you.
As for limiting memory usage per thread, in Java the language it's impossible. Whether it is possible (or makes sense) in the JVM and Java the platform is an interesting question. You can of course do it the same way as any memory profiler does, but I'm afraid the management system would outgrow the application itself pretty soon.
No. There is no concept of this in Java. There is one 'heap' that new allocates from. Java allocation is thread-safe. And why do you think that making more heaps would cause threads to consume less memory?
If you want to control memory usage in a thread, don't allocate things.
You could, in theory, create pools of reusable objects for a purpose like this, but the performance would almost certainly be worse than the obvious alternative.
Threads by design share all the heap and other regions of memory. Only the stack is truly thread local, and this space can be limited.
If you have tasks which you want to run in their own memory space and/or be able to stop, you have to run them as separate processes.
I have been developing a small Java utility that uses two frameworks: Encog and Jetty to provide neural network functionality for a website.
The code is 'finished' in that it does everything it needs to do, but I have some problems with memory usage. When running on my development machine the memory usage seems to fluctuate between about 4MB and 13MB when the application is doing things (training neural networks), and at most it uses about 18MB. This is very good usage, and I think it is due to the fact that I call System.gc() fairly regularly. I do this because the processing time doesn't matter to me, but the memory usage does.
So it all works fine on my machine, but as soon as I put it online on our server (shared unix hosting with memory limits) it uses about 19MB to start with and rises to hundreds of MB of memory usage when doing things. These are the same things that I have been doing in testing. The only way, I believe, to reduce the memory usage, is to quit the application and restart it.
The only difference that I can tell is the Java Virtual Machine that it is being run on. I do not know much about this, and I have tried to find the reason why it is acting this way, but a lot of the documentation assumes a great knowledge of Java and virtual machines. Could someone please help me with some reasons why this may be happening and perhaps some things to try to stop it?
I have looked at using GCJ to compile the application, but I don't know if this is something I should be putting a lot of time in to and whether it will actually help.
Thanks for the help!
UPDATE: Developing on Mac OS 10.6.3; the server is on a Unix OS but I don't know which. (Server is from WebFaction.)
I think it is due to the fact that I call System.gc() fairly regularly
You should not do that; it's almost never useful.
A garbage collector works most efficiently when it has lots of memory to play with, so it will tend to use a large part of what it can get. I think all you need to do is to set the max heap size to something like 32MB with an -Xmx32m command line parameter - the default depends on whether the JVM believes it's running on a "server class" system, in which case it assumes that you want the application to use as much memory as it can in order to give better throughput.
BTW, if you're running on a 64 bit JVM on the server, it will legitimately need more memory (usually about 30%) than on a 32bit JVM due to larger references.
Two points you might consider:
Calls to System.gc() can be disabled by a command-line parameter (-XX:+DisableExplicitGC), and I think the behaviour also depends on the GC algorithm the VM uses. Normally, invoking the GC should be left to the JVM.
As long as there is enough memory available for the JVM, I don't see anything wrong in using this memory to increase application and GC performance. As Michael Borgwardt said, you can restrict the amount of memory the VM uses at the command line.
Also, you may want to look at what mode the JVM was started in when you deploy it online. My guess is it's a server VM.
Take a look at the differences between the two right here on Stack Overflow. Also, see which garbage collector is actually running in the actual deployment. See if you can tweak the GC behaviour, or change the GC algorithm. See the -X options if it's a Sun JVM.
Basically the JVM takes the amount of memory it is allowed to as needed, in order to make the "new" operation as fast as possible (this is a science in itself).
So if you have a lot of objects being used, and then discarded, you will slowly and surely fill up the available memory. Then you can ask for garbage collection, but it is just a hint, and the JVM may choose not to listen.
So, you need another mechanism to keep memory usage down. The typical approach is to limit the amount of memory with the -X options, but be careful, since the JVM you use on your PC may be very different from the one you deploy on, and the memory needs may therefore be different.
Is there a deliberate requirement for low memory usage? If not, then just let it run and see how the JVM behaves. Use jvisualvm to attach and monitor.
Perhaps the server uses more memory because there is a higher load on your app and so more threads are in use? Jetty will use a number of threads to spread out the load if there are a lot of requests. It's worth a look at the thread count on the server versus on your test machine.
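A quick way to compare is to print the live and peak thread counts from inside the webapp (or check them in VisualVM); a small sketch:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Sketch: print live and peak thread counts so the server can be compared
// against the test machine.
public class ThreadCountCheck {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Live threads: " + threads.getThreadCount());
        System.out.println("Peak threads: " + threads.getPeakThreadCount());
    }
}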