What are some examples of non-critical resources? - java

A quote from Effective Java states that:
A second legitimate use of finalizers concerns objects with native peers. A
native peer is a native object to which a normal object delegates via native methods. Because a native peer is not a normal object, the garbage collector doesn’t
know about it and can’t reclaim it when its Java peer is reclaimed. A finalizer is an
appropriate vehicle for performing this task, assuming the native peer holds no
critical resources.
I've not done C++ before, though I'm vaguely aware that file handles and database connections are critical resources. But what exactly does it mean for a resource to be non-critical?
Or rather, what are some examples of non-critical resources?

“Non-critical resources” don’t exist. The quote isn’t talking about non-critical resources, it’s merely talking about the absence of critical resources.
In a way, you could say that memory is a non-critical resource in a garbage-collected system. However, I’m not convinced that this would be correct (quite the opposite, in fact: managed resources can still be critical if they run out), and I’ve never heard this being said.

I don't think it's really the resource that's critical, despite the phrase used. I think it's recovering the resource that may or may not be critical, and the quote could be rephrased, "assuming it is not critical that the resource is freed".
If it's critical that the resource is freed by a particular point in program execution, after the object is unreachable but before the resource is needed for some other purpose, then a finalizer is inadequate. Instead you need some program logic to make sure it happens.
So, file handles or db connections are critical if you're worried that you might run out, they're not critical otherwise. If you've reached some limit of open DB connections, because the finalizers that would close your old ones haven't been run yet, and you try to open another DB connection, chances are it'll fail. The situation with memory is rather better, since if you've run out of memory because of unreachable objects, and try to create a new object, then the GC will at least make an effort to find something to finalize and free.
Thus, file handles and db connections should have a close() function that the user can call to free all resources in cases where the program logic is able to determine that the object will not be used again. Expecting the GC to close the connection via a finalizer isn't reliable enough. It also doesn't deal well with the possibility of a flush or commit failing, although that's a separate issue.
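A minimal sketch of that point, assuming a hypothetical `PooledConnection` class with an assumed limit of two open connections (neither the class nor the limit comes from a real API). Once the limit is reached, dropping references does not help; only an explicit close() frees a slot right away.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a scarce resource such as a DB connection.
final class PooledConnection implements AutoCloseable {
    // Assumed limit for illustration: only two connections may be open at once.
    static final Semaphore PERMITS = new Semaphore(2);
    private boolean closed;

    private PooledConnection() {}

    // Returns null when the limit is reached instead of waiting for the GC.
    static PooledConnection tryOpen() {
        return PERMITS.tryAcquire() ? new PooledConnection() : null;
    }

    @Override
    public void close() {
        if (!closed) {          // idempotent: a second close() is a no-op
            closed = true;
            PERMITS.release();
        }
    }
}

final class PooledConnectionDemo {
    public static void main(String[] args) {
        PooledConnection a = PooledConnection.tryOpen();
        PooledConnection b = PooledConnection.tryOpen();
        // Limit reached: a third open fails until something is closed.
        System.out.println("third open failed: " + (PooledConnection.tryOpen() == null));
        a.close();
        // try-with-resources calls close() for us when the block ends.
        try (PooledConnection c = PooledConnection.tryOpen()) {
            System.out.println("open after close succeeded: " + (c != null));
        }
        b.close();
    }
}
```

Making close() idempotent is a design choice here, so that a caller who closes twice does not hand back more permits than were taken.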

Related

Java - How is memory leak in Java harmful? How can it possibly get used for a bad cause?

Creating a memory leak with Java
I was going through the above "interview" question. After reading its answers I ended up having a few questions myself.
Let's guess there is already a memory leak in the code.
How is that harmful? How can the data go in wrong hands?
I am pretty sure that System.read(); (or something like that) is not going to read the data from the memory leak. Is that even possible?
Please help with some reference/code/documents.
Memory leaks are a really broad topic. To be honest, I've voted to close your question as too broad, but I'll still try to give you a sense of what lies behind the problem.
Consider that you create an in-memory session for every user connected to your web service but never discard the session after some time, either because you forgot or because of a bad application design. This causes a memory leak.
Or consider that you don't close your open files or sockets.
Or consider that somewhere you keep a reference to all the intermediate data structures produced by your process. In this case there is no way for the garbage collector to free the allocated memory.
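The session case can be sketched as follows. `SessionStore` and its methods are hypothetical names used only for illustration, not part of any framework. The point is that entries in the map stay reachable until something removes them, so without an eviction step the map only grows.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session store; the names are illustrative only.
final class SessionStore {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionStore(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Records activity for a session at the given time.
    void touch(String sessionId, long nowMillis) {
        lastSeenMillis.put(sessionId, nowMillis);
    }

    // Without a call like this the map only ever grows: every entry stays
    // reachable, so the garbage collector can never reclaim it.
    int evictExpired(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastSeenMillis.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return lastSeenMillis.size();
    }
}
```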
Memory leaks mostly show up in long-running applications, because in the short run a leak has little chance of causing an OutOfMemoryError. In the long run things change: there are applications that run for months or even years.
There are many situations in which a memory leak can occur. Many frameworks, libraries, and even languages try to protect programmers from these "bad" situations, but I personally think it is the programmer's experience that makes the difference.
For example, Java's try-with-resources statement is a language feature designed to help programmers in such situations (it helps you not to forget).
So when designing your own objects that must release some resource at the end of their life, implement the java.lang.AutoCloseable interface and add the appropriate methods. Have a look at how many classes now implement AutoCloseable; this also shows how important memory (leak) and resource handling is.
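As a small sketch of that advice, here is a hypothetical `TracedResource` class (the name and the log are assumptions made for illustration) that implements AutoCloseable and records its lifecycle. It shows that try-with-resources closes resources in reverse order of declaration, and does so before the catch block runs, even when the body throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that records its lifecycle so the order is visible.
final class TracedResource implements AutoCloseable {
    static final List<String> LOG = new ArrayList<>();
    private final String name;

    TracedResource(String name) {
        this.name = name;
        LOG.add("open " + name);
    }

    @Override
    public void close() {
        LOG.add("close " + name);
    }

    static void useBoth() {
        // Both resources are closed automatically, b first and then a,
        // even though the body throws.
        try (TracedResource a = new TracedResource("a");
             TracedResource b = new TracedResource("b")) {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            LOG.add("caught " + e.getMessage());
        }
    }
}
```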
I would also suggest to study the difference between Java stack and heap memory management.
I once dealt with a Tomcat instance that hung a server every three months. After a while the server had to be restarted every three weeks, and eventually every day.
It turned out that "someone" had written a for loop instead of adding a WHERE clause to a SQL query.
So there are programmers who do this as a full-time job, who are experts in this kind of investigation and are able to find and fix memory leaks.

What is a resource in Java? Why do we have to close it after it is used?

What is the meaning of the word "resource" in Java?
Why do we have to close it after use, even though the garbage collector runs in the JVM?
Why do we have to write resource cleanup code in a finally block?
A resource is something that is available only in a limited amount, such as DB connections and file descriptors. The GC frees memory, but you still have to release resources such as DB connections, open files, etc., so that other threads can use them.
By the way, it's better to free the resources immediately after you finish using them, and not just in the finalize method, which might take a long time until it is called by the GC.
A database connection, a thread, a file handle, a socket - all are finite resources.
The operating system you run on only allows so many threads - 1 MB overhead per thread. You're limited by the available RAM. Same for file handles and sockets.
Database connections are interesting because they involve client and server. If the client gc's the connection, what tells the server to close the connection? If you fail to close in finally blocks, you'll soon find out that your database server will run out of connections under heavy load.
Finalize is not the right way to go. Don't depend on the VM to call it. Write a close() method and call it in a finally block when your method is done with the resource. Close in the narrowest scope possible.
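A minimal sketch of "close in a finally block, in the narrowest scope possible". `FirstByteReader` is a hypothetical helper written for this illustration; the stream is opened, used, and closed inside a single method.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class FirstByteReader {
    // Reads the first byte of a file (or -1 if the file is empty)
    // and always closes the stream before returning.
    static int firstByte(Path file) throws IOException {
        InputStream in = Files.newInputStream(file);
        try {
            return in.read();
        } finally {
            in.close(); // runs on normal return and when read() throws
        }
    }
}
```

On Java 7 and later the same effect can be had more compactly with a try-with-resources statement.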
Say you have a file: you can write to it and not close the resource, and eventually it will be closed by the GC. The problem is that while the file is open you can't delete it on Windows, and on Linux you can delete it but doing so doesn't free any space. If you want to delete a file, you don't want to be waiting until the GC feels like running, perhaps hours later.
What is the meaning of the word "resource" in Java?
The typical Java application manipulates several types of resources such as files, streams, sockets, and database connections.
Why do we have to write resource cleanup code in a finally block?
The Oracle article presents the Java 7 answer to the automatic resource management problem.
Such resources must be handled with great care, because they acquire system resources for their operations. Thus, you need to ensure that they get freed even in case of errors.
Indeed, incorrect resource management is a common source of failures in production applications, with the usual pitfalls being database connections and file descriptors remaining opened after an exception has occurred somewhere else in the code.
This leads to application servers being frequently restarted when resource exhaustion occurs, because operating systems and server applications generally have an upper-bound limit for resources.
Use the Java 7 try-with-resources statement.

Why do some webservers complain about memory leaks they create?

The title might be a bit strong, but let me explain how I understand what happens. I guess this happened with Tomcat (and the message cited comes from Tomcat), but I'm not sure anymore.
TL;DR At the bottom there's a summary why I'm claiming that it is the web servers' fault.
I might be wrong (but without the possibility of being wrong there would be no reason to ask):
An application uses a library
the library uses a ThreadLocal
the ThreadLocal refers to an object from the library
each object refers to its ClassLoader
The webserver
pools its worker threads for efficiency
lends an arbitrary thread to an application
does nothing special (w.r.t. the thread pool) when an application stops or redeploys
If I understand it correctly, after a redeploy the old "dirty" threads continue to be reused. Their ThreadLocals refer to the old classes, which refer to their ClassLoader, which refers to the whole old class hierarchy. So a lot of stuff stays in the PermGen space, which over time leads to an OutOfMemoryError. Is this right so far?
I'm assuming two things:
the redeploy frequency is a few times per hour
the thread creation overhead is a fraction of a millisecond
So a complete thread pool renewal upon each redeploy costs a fraction of a millisecond a few times per hour, i.e., there's a time overhead of 0.0001 * 12/3600 * 100% i.e. 0.000033%.
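The arithmetic can be checked with a small sketch. It plugs in the figures from the formula above, 0.0001 s per pool renewal and 12 redeploys per hour; `RenewalOverhead` is a hypothetical helper name.

```java
final class RenewalOverhead {
    // Share of each hour spent renewing the pool, in percent.
    static double overheadPercent(double secondsPerRenewal, double renewalsPerHour) {
        return secondsPerRenewal * renewalsPerHour / 3600.0 * 100.0;
    }

    public static void main(String[] args) {
        // Uses the question's figures: 0.1 ms per renewal, 12 redeploys per hour.
        System.out.printf("%.6f%%%n", overheadPercent(0.0001, 12));
    }
}
```

This only restates the question's estimate and says nothing about whether the assumed renewal cost is realistic.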
But instead of accepting this tiny overhead, there are countless problems. Is my calculation wrong or what am I overlooking?
As a warning we get the message
The web application ... created a ThreadLocal with key of type ... and a value of type ... but failed to remove it when the web application was stopped.
which should be better stated as
The web server ... uses a thread pool but failed to renew it after stopping (or redeploying) an application.
Or am I wrong? The time overhead is negligible even when all threads get recreated from time to time. But clearing their ThreadLocals before they are provided to the applications would suffice and be even faster.
Summary
There are some real problems (recently this one) and the user can do nothing about it. The library writers sometimes can and sometimes can not. IMHO the web servers could solve it pretty easily. The thing happens and has a cause. So I'm blaming the only one party which could do anything about it.
Proposal for what the web server should exactly do
The title of this question is more provocative than correct, but it has its point. And so does the answer by raphw. This linked question has another open bounty.
I think the web servers could solve it as follows:
ensure that each thread gets reused (or killed) sometime
store a LastCleanupTimestamp in a ThreadLocal (for new threads it's the creation time)
when re-using a thread, check if the cleanup timestamp is below some threshold (e.g., now minus some delta, e.g., 1 hour)
if so, clean all ThreadLocals and set a new LastCleanupTimestamp
This would assure that no such leak exists longer than delta plus the duration of the longest request plus the thread turnaround time. The cost would compose as follows:
checking a single ThreadLocal (i.e., some nanoseconds) per request
cleaning all ThreadLocals reflectively (i.e., some more nanoseconds once each delta per thread)
the cost from removing the data possibly useful for the application which stored them. This can't break an application as no application can assume to see a thread containing the thread locals it has set (since it can't even assume to see the thread itself anymore), but it may cost time needed to recreate the data (e.g., a cached DateFormat instance if someone still uses such a terrible thing).
It could be switched off by simply setting the threshold accordingly if no app has been undeployed or redeployed recently.
TL;DR It's not web servers that create memory leaks. It's you.
Let me first state the problem more explicitly: ThreadLocal variables often refer to an instance of a Class that was loaded by a ClassLoader that was meant to be exclusively used by a container's application. When this application gets undeployed, the ThreadLocal reference gets orphaned. Since each instance keeps a reference to its Class and since each Class keeps a reference to its ClassLoader and since each ClassLoader keeps a reference to all classes it ever loaded, the entire class tree of the undeployed application cannot get garbage collected and the JVM instance suffers a memory leak.
Looking at this problem, you can optimize for either:
Allow as many requests per second as possible even throughout a redeploy (thus keep response time short and reuse threads from a thread pool)
Make sure that threads stay clean by discarding threads once they were used when a redeploy occurred (thus patch forgotten manual cleaning)
Most developers of web applications would argue that the first is more important, since the second can be achieved by writing good code. And what would happen if a redeploy occurred concurrently with long-lasting requests? You cannot shut down the old thread pool since this would interrupt running requests. (There is no globally defined maximum for how long a request cycle can take.) In the end, you would need a quite complex protocol for that, and that would bring its own problems.
The ThreadLocal induced leak can however be avoided by always writing:
myThreadLocal.set( ... );
try {
    // Do something here.
} finally {
    myThreadLocal.remove();
}
That way, your thread will always turn out clean. (On a side note, this is almost like creating global variables: It is almost always a terrible idea. There are some web frameworks like for example Wicket that make a lot of use of this. Web frameworks like this are terrible to use when you need to do things concurrently and get very unintuitive for others to use. There is a trend away from the typical Java one thread per request model such as demonstrated with Play and Netty. Do not get stuck with this anti-pattern. Do use ThreadLocal sparingly! It is almost always a sign of bad design.)
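To make the effect of the set/try/finally/remove pattern visible, here is a sketch that runs two tasks on the same pooled thread. `ThreadLocalReuseDemo` and its `CONTEXT` variable are hypothetical names; a single-thread executor stands in for a web server's worker pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalReuseDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns
    // the value the second task finds in the ThreadLocal.
    static String valueSeenBySecondTask(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CONTEXT.set("request-1");
                try {
                    // Do something here.
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove();
                    }
                }
            };
            pool.submit(first).get();

            Callable<String> second = CONTEXT::get;
            return pool.submit(second).get();
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup the second task sees the first task's value, because the worker thread and its thread-local map are reused; with the cleanup it sees null.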
You should further be aware that memory leaks that are induced by ThreadLocal are not always detected. Memory leaks are detected by scanning the web server's worker thread pool for ThreadLocal variables. If a ThreadLocal variable was found the variable's Class reveals its ClassLoader. If this ClassLoader or one of its parents is that of the web application that just got undeployed, the web server can safely assume a memory leak.
However, imagine that you stored some large array of Strings in a ThreadLocal variable. How can the web server assume that this array belongs to your application? The String.class was of course loaded with the JVM's bootstrap ClassLoader instance and cannot be associated with a particular web application. By removing the array, the web server might break some other application that is running in the same container. By not removing it, the web server might leak a large amount of memory. (This time, it is not a ClassLoader and its Classes that are leaked. Depending on the size of the array, this leak might however even be worse.)
And it gets worse. This time, imagine that you stored an ArrayList in your ThreadLocal variable. The ArrayList is part of the Java standard library and is therefore loaded by the bootstrap ClassLoader. Again, there is no way of telling that the instance belongs to a particular web application. However, this time, if the list holds instances of your application's classes, your ClassLoader and all its Classes will leak as well as all those instances stored in the thread-local ArrayList. This time, the web server cannot even determine with certainty that a memory leak occurred when it finds that the ClassLoader was not garbage collected, since garbage collection can only be recommended to a JVM (via System#gc()) but not enforced.
Renewing the thread pool is not as cheap as you might assume.
A web server cannot just go and throw away all threads in a thread pool whenever an application is undeployed. What if you stored some values in those threads? When a web server recycles a thread, it should (I am not sure if all web servers do this) find all non-leaking thread local variables and reregister them in the replacement Thread. The numbers you stated about efficiency would therefore no longer hold.
At the same time, the web server needs to implement some logic that manages the replacement of all the thread pool's Threads, which does not work in favor of your proposed time calculation either. (You might have to deal with long lasting requests - think of running an FTP server in a servlet container -- such that this thread pool transition logic might be active for quite a long time.)
Furthermore, ThreadLocal is not the only possibility of creating a memory leak in a servlet container.
Setting a shut down hook is another example. (And it is unfortunately a common one. Here, you should manually remove the shut down hook when your application is undeployed. This problem would not be solved by discarding threads.) Shut down hooks are furthermore instances of custom subclasses of Thread that were always loaded by an application's class loader.
In general, any application that keeps a reference to an object that was loaded by a child class loader might create a memory leak. (This is generally possible via Thread#getContextClassLoader().) In the end, it is the developer's responsibility not to cause memory leaks, even in Java, where many developers misinterpret automatic garbage collection as meaning that there are no memory leaks. (Think of Joshua Bloch's famous stack implementation example.)
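For readers who don't know the stack example, here is a sketch in its spirit. It is not a verbatim copy of the book's code, and `SimpleStack` and `slot` are names chosen for this illustration. The leak comes from the backing array still referencing popped objects unless the slot is cleared.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Sketch in the spirit of the stack example from Effective Java.
final class SimpleStack {
    private Object[] elements = new Object[16];
    private int size;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size + 1);
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        // Without this line the array keeps the popped object reachable,
        // so the garbage collector cannot reclaim it.
        elements[size] = null;
        return result;
    }

    // Exposed only so the sketch can inspect the backing array.
    Object slot(int index) {
        return elements[index];
    }
}
```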
After this general statement, I want to comment on Tomcat's memory leak protection:
Tomcat does not promise you to detect all memory leaks but covers specific types of such leaks as they are listed in their wiki. What Tomcat actually does:
Each Thread in the JVM is examined, and the internal structures of the
Thread and ThreadLocal classes are introspected to see if either the
ThreadLocal instance or the value bound to it were loaded by the
WebAppClassLoader of the application being stopped.
Some versions of Tomcat even try to compensate for the leak:
Tomcat 6.0.24 to 6.0.26 modify internal structures of the JDK
(ThreadLocalMap) to remove the reference to the ThreadLocal instance,
but this is unsafe (see #48895) so that it became optional and
disabled by default from 6.0.27. Starting with Tomcat 7.0.6, the
threads of the pool are renewed so that the leak is safely fixed.
However, you have to properly configure Tomcat to do so. The wiki entry on its memory leak protection even warns you how you can break other applications when TimerThreads are involved, or how you might leak memory when starting your own Threads or ThreadPoolExecutors or when using common dependencies for several web applications.
All the clean up work offered by Tomcat is a last resort! It's nothing you want to have in your production code.
Summarized: It is not Tomcat that creates a memory leak, it is your code. Some versions of Tomcat try to compensate for such leaks which are detectable if it is configured to do so. However, it is your responsibility to take care of memory leaks and you should see Tomcat's warnings as an invitation to fix your code rather than to reconfigure Tomcat to clean up your mess. If Tomcat detects memory leaks in your application, there might even be more. So take a heap and thread dump out of your application and find out where your code is leaking.

Practice of exit(0) in C and System.exit(0) in Java

To use exit(0) in C is not a good practice, if there are alternatives, since it does not free resources for example. But to use System.exit(0) in Java - how is it here? Could one trust the garbage collector in this context?
C language:
exit(0);
Java:
System.exit(0);
But to use System.exit(0) in Java - how is it here? Could one trust the garbage collector in this context?
When you call System.exit in Java, the garbage collector is not normally run [1]. However, in any JVM that I've ever heard of, there is something else that reclaims all of the objects that were allocated. (Typically it is handled at the operating system level.)
The fact that the GC doesn't run is only significant if you are relying on object finalizers to do something important before the JVM terminates.
Hypothetically, if your Java application used JNI (etc) to call native methods, then those methods could access system resources that might be problematic. However:
As a general rule the operating system does take care of such things. At least it does for modern versions of Linux and UNIX, AFAIK.
The garbage collector has no knowledge of those resources anyway. If the OS can't reclaim them, then the Java garbage collector won't help.
If you did need to clean up such resources acquired by a Java program (via native code) then the best approach would be to implement the cleanup in native code methods, and use a "shutdown hook" to run them. The shutdown hooks will be run if you call System.exit.
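A sketch of registering such a hook. `NativeCleanupHook` and `releaseNativeResources` are hypothetical placeholders; in a real program the latter would be the native cleanup method. `Runtime.addShutdownHook` and `Runtime.removeShutdownHook` are the standard JDK calls.

```java
final class NativeCleanupHook {
    // Hypothetical placeholder for a native cleanup routine.
    static void releaseNativeResources() {
        System.out.println("releasing native resources");
    }

    // Registers the hook; it runs on normal termination and on System.exit.
    static Thread install() {
        Thread hook = new Thread(NativeCleanupHook::releaseNativeResources, "native-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    // Returns true if the hook was registered and has now been removed.
    static boolean uninstall(Thread hook) {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

Shutdown hooks do not run if the JVM is stopped with Runtime.halt or killed forcibly by the operating system, so they are a best-effort mechanism.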
1 - A garbage collection will be performed on JVM exit if you have previously called runFinalizersOnExit(true). However, this is a deprecated method. The Oracle site explains it like this:
Q: Why is Runtime.runFinalizersOnExit deprecated?
A: Because it is inherently unsafe. It may result in finalizers being called on live objects while other threads are concurrently manipulating those objects, resulting in erratic behavior or deadlock. While this problem could be prevented if the class whose objects are being finalized were coded to "defend against" this call, most programmers do not defend against it. They assume that an object is dead at the time that its finalizer is called.
Further, the call is not "thread-safe" in the sense that it sets a VM-global flag. This forces every class with a finalizer to defend against the finalization of live objects!
In short, this is a dangerous approach, and it won't directly deal with the kind of resources that the OP is worried about.
Think of it like this. In C, you build your source code into a binary that executes on its own, conforming only to the rules of program logic and the rules set by your OS. The OS, however, does not manage your memory for you while the program runs. It handles events and sends information to the hardware that tells it how to run, nothing more, nothing less. In Java, all code is compiled into Java's own bytecode. Upon execution it does not communicate with the OS directly; the virtual machine designed to run that bytecode is what does the talking. When you call System.exit(0), you are telling the virtual machine that the app you are running is coming to a halt. From there the machine handles ITS OWN MEMORY, which happens to include anything you did not already remove via the garbage collector, but only if the VM is exiting as well. Hope that helps.

Good uses of the finalize() method [duplicate]

This question already has answers here:
Why would you ever implement finalize()?
(21 answers)
Closed 5 years ago.
This is mostly out of curiosity.
I was wondering if anyone has encountered any good usage for Object.finalize() except for debugging/logging/profiling purposes?
If you haven't encountered any, what would you say a good usage would be?
If your Java object uses JNI to instruct native code to allocate native memory, you need to use finalize to make sure it gets freed.
Late to the party here but thought I would still chime in:
One of the best uses I have found for finalizers is to call explicit termination methods which, for what ever reason, were not called. When this occurs, we also log the issue because it is a BUG!
Because:
There is no guarantee that finalizers will be executed promptly (or technically at all), per the language specification
Execution is largely dependent on the JVM implementation
Execution can sometimes be delayed if the GC has a lower thread priority
This leaves only a handful of tasks that they can address without much risk.
close external connections (db, socket etc)
close open files, maybe even try to write some additional information.
logging
if this class runs external processes that should exist only while object exists you can try to kill them here.
But it is just a fallback that is used if the "normal" mechanism did not work. The normal mechanism should be initiated explicitly.
Release resources that should be released manually in normal circumstances, but were not released for some reason. Perhaps also write a warning to the log.
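The "fallback plus warning" idea can be sketched as below. Note the swap: this uses java.lang.ref.Cleaner (available since Java 9) rather than finalize(), which is deprecated in current JDKs. `GuardedResource` and its `State` class are hypothetical names for illustration.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

final class GuardedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Must not reference the GuardedResource itself,
    // or the object would never become unreachable.
    private static final class State implements Runnable {
        final AtomicBoolean closedExplicitly = new AtomicBoolean();
        final AtomicBoolean released = new AtomicBoolean();

        @Override
        public void run() {
            if (!closedExplicitly.get()) {
                // Reaching this point through the GC means a caller forgot close().
                System.err.println("BUG: resource was never closed explicitly");
            }
            released.set(true); // a real class would release the underlying resource here
        }
    }

    private final State state = new State();
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, state);

    // The normal, explicit path.
    @Override
    public void close() {
        state.closedExplicitly.set(true);
        cleanable.clean(); // runs State.run() at most once
    }

    boolean isReleased() {
        return state.released.get();
    }
}
```

As with finalizers, there is no guarantee about when, or whether, the fallback runs, so it complements explicit close() rather than replacing it.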
I use it to write back data to a database when using soft references for caching database-backed objects.
I see one good use for finalize(): freeing resources that are available in large amounts and are not exclusive.
For example, by default there are 1024 file handles available for a Linux process and about 10000 for Windows. That is quite a lot, so for most applications, if you open a file you don't have to call .close() (and use the ugly try...finally blocks), and you'll be OK: finalize() will free it for you some time later. However, for some pieces of code (like intensive server applications), releasing resources with .close() is a must; otherwise finalize() may be called too late for you and you may run out of file handles.
The same technique is used by Swing - operating system resources for displaying windows and drawing aren't released by any .close() method, but just by finalize(), so you don't have to worry about all .close() or .dispose() methods like in SWT for example.
However, when there is very limited number of resources, or you must 'lock' resource to use it, also remember to 'unlock' it. For example if you create a file lock on a file, remember also to remove this lock, otherwise nobody else will be able to read or write this file and this can lead to deadlocks - then you can't rely on finalize() to remove this lock for you - you must do it manually at the right place.
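For the file lock case, a sketch of releasing the lock "manually at the right place" using the standard java.nio API. `LockedWrite` and `withLock` are hypothetical names; FileChannel, FileLock, and tryLock() are real JDK classes and methods.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class LockedWrite {
    // Returns true if the lock was obtained and the action ran.
    static boolean withLock(Path file, Runnable action) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.tryLock()) {
            if (lock == null) {
                return false; // another program currently holds the lock
            }
            action.run();
            return true;
        } // the lock is released first, then the channel is closed, even if action throws
    }
}
```

Because FileLock is AutoCloseable, the try-with-resources statement unlocks the file as soon as the block ends instead of leaving that to the garbage collector.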
