I have followed a few examples from various sources, and have the following snippet:
private void registerForMemUsageChanges() {
List<GarbageCollectorMXBean> garbageCollectorMXBeans = ManagementFactory.getGarbageCollectorMXBeans();
for (GarbageCollectorMXBean garbageCollectorMXBean : garbageCollectorMXBeans) {
listenForGarbageCollectionOn(garbageCollectorMXBean);
}
}
private void listenForGarbageCollectionOn(GarbageCollectorMXBean garbageCollectorMXBean) {
NotificationEmitter notificationEmitter = (NotificationEmitter) garbageCollectorMXBean;
GarbageListener listener = new GarbageListener();
notificationEmitter.addNotificationListener(listener, null, null);
}
public class GarbageListener implements NotificationListener {
@Override
public void handleNotification(Notification notification, Object handback) {
if (notification.getType().equals(GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION)) {
doSomething(); // irrelevant
}
}
}
I have added a test that does the following (again, based on examples I found) and it seems to work:
private void triggerGc() throws InterruptedException {
Object obj = new Object();
WeakReference<Object> ref = new WeakReference<>(obj);
obj = null;
while(ref.get() != null) {
System.gc();
}
}
While running in debug mode, I see that the listener is registered on PS MarkSweep and PS Scavenge. The while loop finishes (which I take as a sign that GC was performed), but no notification is delivered: neither a GC notification nor one of any other type.
Is the problem that the listener is registered incorrectly, or was GC not really performed? The weak reference does seem to be cleared after the while loop.
I am using openjdk 1.8.0_144.
The problem actually was:
As Holger mentioned in a comment, I was only getting notifications for GCs that affected the old generation, and
the debugger did not stop at my breakpoint even though the code was executed, so although I couldn't tell, doSomething actually did something.
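For anyone debugging the same thing, here is a minimal sketch of a listener that logs what each notification contains instead of relying on a breakpoint. The class name GcNotificationProbe and the SEEN counter are made up for illustration; GarbageCollectionNotificationInfo comes from com.sun.management, so this assumes a HotSpot/OpenJDK runtime.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

public class GcNotificationProbe {
    /** Number of GC notifications handled so far (hypothetical helper for this sketch). */
    public static final AtomicInteger SEEN = new AtomicInteger();

    /** Registers one listener on every collector bean; returns how many beans accepted it. */
    public static int register() {
        NotificationListener listener = (notification, handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(notification.getType())) {
                return; // not a GC notification
            }
            // The payload is a CompositeData that can be decoded into a typed info object.
            GarbageCollectionNotificationInfo info = GarbageCollectionNotificationInfo
                    .from((CompositeData) notification.getUserData());
            SEEN.incrementAndGet();
            System.out.println(info.getGcName() + ": " + info.getGcAction()
                    + " (cause: " + info.getGcCause() + ")");
        };
        int registered = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (bean instanceof NotificationEmitter) {
                ((NotificationEmitter) bean).addNotificationListener(listener, null, null);
                registered++;
            }
        }
        return registered;
    }
}
```

Printing the collector name and action makes it visible which collector actually reported, and a log line does not depend on the debugger stopping in the notification thread.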
This is what the System.gc() Javadoc says:
Calling the gc method suggests that the Java Virtual Machine expend
effort toward recycling unused objects in order to make the memory
they currently occupy available for quick reuse. When control returns
from the method call, the Java Virtual Machine has made a best effort
to reclaim space from all discarded objects.
Calling the garbage collector does not mean that garbage will be collected immediately. By calling gc() you can only suggest that the JVM collect garbage; you cannot force it.
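Since System.gc() is only a request, a test helper probably should not spin forever on it. A minimal sketch with a bounded number of attempts follows; the names GcProbe and awaitCleared, and the 50 ms pause, are assumptions made for this example.

```java
import java.lang.ref.WeakReference;

public class GcProbe {
    /**
     * Requests GC up to maxAttempts times and reports whether the weak
     * reference was cleared. Returns false if the referent is still
     * reachable or the JVM simply did not collect it in time.
     */
    public static boolean awaitCleared(WeakReference<?> ref, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts && ref.get() != null; attempt++) {
            System.gc(); // a request, not a command
            try {
                Thread.sleep(50); // give the collector a moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return ref.get() == null;
    }
}
```

A caller can then treat a false result as "no collection observed" rather than hanging the test.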
I am implementing a feature that reports an error when instances of my Java class are discarded before being "used" (for simplicity, we can define being "used" as having a particular method called).
My first idea is to use phantom references, which are often used as an improvement on finalize() methods. I would have a phantom reference class that would point to my main object (the one I want to detect whether it is discarded before being used) as the referent, something like this:
class MainObject {
static class MyPhantom extends PhantomReference<MainObject> {
static Set<MyPhantom> phantomSet = new HashSet<>();
MyPhantom(MainObject obj, ReferenceQueue<MainObject> queue) {
super(obj, queue);
phantomSet.add(this);
}
@Override
public void clear() { // must be public: Reference.clear() is public
super.clear();
phantomSet.remove(this);
}
}
MyPhantom myPhantom = new MyPhantom(this, referenceQueue);
static ReferenceQueue<MainObject> referenceQueue = new ReferenceQueue<>();
void markUsed() {
myPhantom.clear();
myPhantom = null;
}
static void checkDiscarded() { // run periodically
MyPhantom aPhantom;
while ((aPhantom = (MyPhantom) referenceQueue.poll()) != null) {
aPhantom.clear();
// do stuff with aPhantom
}
}
}
However, I am using Java 8, and in Java 8, phantom references are not automatically cleared when they are enqueued into the reference queue. (I know that this is fixed in Java 9, but unfortunately, I must use Java 8.) This means that, once the GC determines that the main object is not strongly reachable, and enqueues the phantom reference, it still cannot reclaim the memory of the main object, until I manually clear the phantom reference after I dequeue it in checkDiscarded(). I am concerned that, during the period of time between the GC enqueuing the phantom reference and me dequeueing it from the queue, the main object will remain in memory when it's unnecessary. My main object references many other objects which take a lot of memory, so I would not want it staying in memory for longer than without this feature.
To avoid this problem of the phantom reference preventing the main object from being reclaimed, I came up with the idea of using a dummy object as the referent of the phantom reference instead of my main object. This dummy object will be referenced from my main object, so it will become unreachable at the same time as my main object. Since the dummy object will be small, I don't mind it not being reclaimed for a longer period of time, as long as my main object is reclaimed as soon as it's not reachable. Does this seem like a good idea, and is it really better than using the main object as the referent?
class MainObject {
static class MyPhantom extends PhantomReference<Object> {
static Set<MyPhantom> phantomSet = new HashSet<>();
MyPhantom(Object obj, ReferenceQueue<Object> queue) {
super(obj, queue);
phantomSet.add(this);
}
@Override
public void clear() { // must be public: Reference.clear() is public
super.clear();
phantomSet.remove(this);
}
}
Object dummyObject = new Object();
MyPhantom myPhantom = new MyPhantom(dummyObject, referenceQueue);
static ReferenceQueue<Object> referenceQueue = new ReferenceQueue<>();
void markUsed() {
myPhantom.clear();
myPhantom = null;
}
static void checkDiscarded() { // run periodically
MyPhantom aPhantom;
while ((aPhantom = (MyPhantom) referenceQueue.poll()) != null) {
aPhantom.clear();
// do stuff with aPhantom
}
}
}
Another idea I am considering is to use weak references instead of phantom references. Unlike phantom references in Java 8, weak references are cleared when they are enqueued, so they do not prevent the referent from being reclaimed. I understand that the reason why phantom references are usually used for resource cleanup is that phantom references are only enqueued after the referent is finalized and guaranteed to not be used anymore, whereas weak references are enqueued before finalization, so resources cannot be freed yet, and the finalizer might also resurrect the object. However, that's not a concern in my case, as I am not "cleaning up" any resources, but just reporting that my main object was discarded before being used, which can be done while the object is still in memory. My main objects also do not have a finalize() method, so there is no concern of resurrecting the object. So do you think weak references would be a better match for my case?
Weak and phantom references are indeed equivalent when no finalization is involved. However, a common misconception is to assume that an object is only subject to finalizer reachability and potential resurrection when its own class has a finalize() method.
To demonstrate the behavior, we may use
Object o = new Object();
ReferenceQueue<Object> q = new ReferenceQueue<>();
Reference<?> weak = new WeakReference<>(o, q), phantom = new PhantomReference<>(o, q), r;
// ...
o = null;
for(int cycles = 0, got = 0; got < 2; ) {
while((r = q.remove(100)) == null) {
System.gc();
cycles++;
}
got++;
System.out.println(
(r == weak? "weak": r == phantom? "phantom": "magic unicorn")
+ " ref queued after " + cycles + " cycles");
}
This typically prints either
phantom ref queued after 1 cycles
weak ref queued after 1 cycles
or
weak ref queued after 1 cycles
phantom ref queued after 1 cycles
as both references are truly treated the same in this case and there’s no preferred order when both are enqueued in the same garbage collection.
But when we replace the // ... line with
class Legacy {
private Object finalizerReachable;
Legacy(Object o) {
finalizerReachable = o;
}
@Override
protected void finalize() throws Throwable {
System.out.println("Legacy.finalize()");
}
}
new Legacy(o);
The output changes to something like
Legacy.finalize()
weak ref queued after 1 cycles
phantom ref queued after 2 cycles
as Legacy’s finalizer is enough to make the object finalizer reachable and open the possibility to resurrect the object during finalization.
This doesn’t have to stop you from using this approach. You may decide that there’s no such finalizer in your entire application or accept this scenario as a known limitation that only applies if someone intentionally adds such a finalizer. JDK 18 has deprecated the finalize() method for removal, so this issue will disappear in the future without requiring you to take any action.
Still, your other approach using a dummy object with a PhantomReference will work as intended, having the phantom reference only enqueued when the dummy object and hence, also the outer object, is not even finalizer reachable anymore. The drawback is the (very) slightly higher memory consumption due to the additional dummy object.
Mind that the markUsed() method may set dummyObject to null too.
Another possible point of view is that when your feature is intended to log a wrong usage of your class, which should normally not happen, it doesn’t matter if it temporarily consumes more memory when that happens. When markUsed() has been called, the phantom reference is cleared and left to garbage collection without getting enqueued, so in the case of a correct usage, the memory is not held longer than necessary.
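To make the weak-reference variant concrete, here is a minimal sketch under the assumption that no finalizer can reach the tracked objects. UsageTracker and its method names are hypothetical and not part of the question's code.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class UsageTracker {
    private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Holds the reference objects themselves; a reference that is itself
    // unreachable would never be enqueued.
    private static final Set<Reference<Object>> PENDING = ConcurrentHashMap.newKeySet();

    /** Starts tracking target and returns the handle to pass to markUsed(). */
    public static Reference<Object> track(Object target) {
        Reference<Object> ref = new WeakReference<>(target, QUEUE);
        PENDING.add(ref);
        return ref;
    }

    /** Correct usage: stop tracking; a cleared reference is not enqueued later. */
    public static void markUsed(Reference<Object> handle) {
        PENDING.remove(handle);
        handle.clear();
    }

    public static int pendingCount() {
        return PENDING.size();
    }

    /** Run periodically: counts objects that were discarded before being used. */
    public static int drainDiscarded() {
        int discarded = 0;
        Reference<?> ref;
        while ((ref = QUEUE.poll()) != null) {
            if (PENDING.remove(ref)) {
                discarded++; // report the misuse here
            }
        }
        return discarded;
    }
}
```

The main object would keep the handle returned by track() and pass it to markUsed() from the method that counts as "use".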
As you can see, in the first block SAVE_HOOK = null; is executed before the if branch, so I expected SAVE_HOOK == null and that the code would not reach SAVE_HOOK.isAlive();. But that is not what happens when I test it in Eclipse:
Eclipse IDE for Java Developers
Version: Neon.3 Release (4.6.3)
Build id: 20170314-1500
Why does this happen?
public class FinalizeEscapeGC {
public static FinalizeEscapeGC SAVE_HOOK = null;
public void isAlive() {
System.out.println("yes, i am still alive :)");
}
@Override
protected void finalize() throws Throwable {
super.finalize();
System.out.println("finalize mehtod executed!");
FinalizeEscapeGC.SAVE_HOOK = this;
}
public static void main(String[] args) throws Throwable {
SAVE_HOOK = new FinalizeEscapeGC();
// ---------------block 1----------------//
SAVE_HOOK = null;
System.gc();
Thread.sleep(500);
if (SAVE_HOOK != null) { // "SAVE_HOOK = null;"
SAVE_HOOK.isAlive(); // why does execution enter this if branch the first time?
} else {
System.out.println("no, i am dead :(");
}
// ---------------block 1----------------//
// the same as the above block
// ---------------block 2----------------//
SAVE_HOOK = null;
System.gc();
Thread.sleep(500);
if (SAVE_HOOK != null) {
SAVE_HOOK.isAlive();
} else {
System.out.println("no, i am dead :(");
}
// ---------------block 2----------------//
}
}
Result:
finalize mehtod executed!
yes, i am still alive :)
no, i am dead :(
It sounds like you're trying to ask two questions:
Why is SAVE_HOOK not null after the first block?
Why is it null after the second, identical block?
The simple answer to the first question is: Because System.gc() triggered the garbage collector, which in turn ran the finalizer, which explicitly reassigns the reference. But I assume you know that already, and what you mean to ask is: Why wasn't the object reclaimed after the finalizer completed?
The answer is in the documentation to finalize():
After the finalize method has been invoked for an object, no further action is taken until the Java virtual machine has again determined that there is no longer any means by which this object can be accessed [...] at which point the object may be discarded.
So as long as you still have a strong reference to the object, it will never be garbage collected.
The answer to the second question is: You set it to null and it's never reassigned because the finalizer for a particular instance will only ever run once. This is also in the documentation, right after the previous quote:
The finalize method is never invoked more than once by a Java virtual machine for any given object.
It boils down to this: an object can only be finalized once in its lifetime. The finalize method can cause the object to escape deletion the first time, but next time the object is detected as unreachable, it will simply be deleted.
In an object's lifecycle, it goes through the phases reachable -> finalizable -> finalized -> reclaimed. After the finalize() method has been executed, the object enters the finalized phase and will be reclaimed at the next GC in which it is found unreachable. So within the lifecycle, the finalize() method is executed only once. Your SAVE_HOOK is a static variable, and it is assigned again in the finalize() method. That's why it is not null in the first if block, but the second time, the object is dead.
My application is running out of memory while performing an operation on a large data set. The data is a Java List of around 100K elements.
PersistData is the class which implements the operation and PersistDataIntoDB is the class which does the actual work. Because the operation is time consuming, the caller of PersistData gets a response saying the operation has started, and there are additional APIs to get the status of the operation.
Also, the entire operation is concurrent and there are multiple callers of the operation.
Here is what the code looks like (I hope it's readable).
public class PersistData {
public Boolean persistData(List<ClassA> dataRecs) {
//some checks (smaller operation)
persistDataInDifferentThread(dataRecs);
//if no errors in checks return true
return true;
}
private void persistDataInDifferentThread(List<ClassA> dataRecs) {
Thread runnerThread = new Thread(new Runnable() {
public void run() {
try {
List<ClassB> convertedList = constructClassBUsingClassA(dataRecs);
PersistDataIntoDB dbPersist = new PersistDataIntoDB();
dbPersist.persistDataInDB(convertedList);
}
catch (Exception e) {
}
}
});
runnerThread.start(); // without this call the Runnable never runs
}
private List<ClassB> constructClassBUsingClassA(List<ClassA> dataRecs) {
List<ClassB> tempList = new ArrayList<ClassB>();
for (int i = 0; i < dataRecs.size(); i++) {
ClassA tempRec = dataRecs.get(i);
ClassB tempRecB = new ClassB();
//put stuff from tempRec to tempRecB
tempList.add(tempRecB);
}
return tempList;
}
}
Class which does the persistence.
public class PersistDataIntoDB {
public Boolean persistDataInDB(List<ClassB> dataRecs){
//if all goes well return true
return true;
}
}
My question is whether my method persistDataInDifferentThread can be refactored, because while it is running there are two large Lists in memory, the call to persistDataInDB takes a long time to finish, and the garbage collector may not be reclaiming the List<ClassA> even though I don’t need it after calling persistDataInDB.
Is my above analysis wrong? I just have to increase the max heap because I am dealing with large data?
Is my above analysis wrong? I just have to increase the max heap because I am dealing with large data?
Yes, and yes.
1) Using multiple threads does not increase or reduce the amount of heap space used.
2) If the heap fills up, then the JVM will make every effort to reclaim space before throwing an OOME.
The only thing that might make a difference is if one thread creates the list and passes it to the second instance to be persisted ... and also hangs onto a reference to the list. That might cause the list to remain reachable longer than it needs to be.
I guess you could also get into trouble if you have multiple runner threads persisting multiple lists, and the work is arriving faster than you can process it. If that is the problem, then you need to do something to control the rate at which you accept the requests.
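As one possible way to control the rate at which requests are accepted, here is a minimal sketch using a fixed number of workers and a bounded queue, so that excess work is rejected instead of piling up on the heap. BoundedPersistExecutor and trySubmit are hypothetical names; the caller decides what to do when a request is refused.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPersistExecutor {
    private final ThreadPoolExecutor pool;

    /** workers threads process jobs; at most queueCapacity further jobs may wait. */
    public BoundedPersistExecutor(int workers, int queueCapacity) {
        pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // reject when the queue is full
    }

    /** Returns false when the backlog is full, so the caller can report "busy, retry later". */
    public boolean trySubmit(Runnable job) {
        try {
            pool.execute(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

With this shape, the number of large lists alive at once is bounded by workers plus queueCapacity, which makes the required heap size predictable.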
Good afternoon all,
I was taught that when a function returns, the variables (within the scope of that function) automatically go out of scope so we do not have to set them to null.
However, this doesn't seem to be true.
I have a test code that creates a java.lang.ref.PhantomReference pointing to an instance of a java.lang.Object. The only strong reference to that object is within the scope of a function F.
In other words, when that function returns, there should no longer be any strong reference to that object, and the object should now be collectible by the GC.
However, no matter how hard I try to starve the JVM of memory, the GC simply refuses to collect the object. What is surprising is that if I set the variable to null (obj = null;), the GC now collects the object.
What is the explanation behind this oddity?
public class Test {
public static void main(String args[]) {
// currently testing on a 64-bit HotSpot Server VM, but the other JVMs should probably have the same behavior for this use case
Test test = new Test();
test.F(new Object());
}
public <T> void F(T obj) {
java.lang.ref.ReferenceQueue<T> ref_queue = new java.lang.ref.ReferenceQueue<T>();
java.lang.ref.PhantomReference<T> ref = new java.lang.ref.PhantomReference<T>(obj, ref_queue); // if this line isn't an assignment, the GC wouldn't collect the object no matter how hard I force it to
obj = null; // if this line is removed, the GC wouldn't collect the object no matter how hard I force it to
StartPollingRef(ref_queue);
GoOom();
}
private <T> void StartPollingRef(final java.lang.ref.ReferenceQueue<T> ref_queue) {
new java.lang.Thread(new java.lang.Runnable() {
@Override
public void run() {
System.out.println("Removing..");
boolean removed = false;
while (!removed) {
try {
ref_queue.remove();
removed = true;
System.out.println("Removed.");
} catch (InterruptedException e) { // ignore
}
}
}
}).start();
}
private void GoOom() {
try {
int len = (int) java.lang.Math.min(java.lang.Integer.MAX_VALUE, Runtime.getRuntime().maxMemory());
Object[] arr = new Object[len];
} catch (Throwable e) {
// System.out.println(e);
}
}
}
A standards-compliant JVM is never obligated to collect memory. That is to say, you cannot write a program whose correctness depends on a particular bit of memory being collected at a certain time: you can neither force the JVM to collect (even via System.gc()!) nor rely on it doing so.
So, the behavior you're observing cannot, definitionally, be wrong: you're purposefully trying to make the environment do something it is under no onus to do.
That all said, your issue is that your object has not gone out of scope. It is created in main, then passed - in the normal Java referential manner - to F. Until F returns, the T obj name is still a reference to your object.
Make GoOom static and put a call to it in main, and you should see the object get collected. But, then again, you might still not, and that wouldn't be wrong...
Consider an object declared in a method:
public void foo() {
final Object obj = new Object();
// A long run job that consumes tons of memory and
// triggers garbage collection
}
Will obj be subject to garbage collection before foo() returns?
UPDATE:
Previously I thought obj is not subject to garbage collection until foo() returns.
However, today I find myself wrong.
I have spent several hours fixing a bug and finally found that the problem is caused by obj being garbage collected!
Can anyone explain why this happens? And if I want obj to be pinned, how can I achieve that?
Here is the code that has the problem.
public class Program
{
public static void main(String[] args) throws Exception {
String connectionString = "jdbc:mysql://<whatever>";
// I find wrap is gc-ed somewhere
SqlConnection wrap = new SqlConnection(connectionString);
Connection con = wrap.currentConnection();
Statement stmt = con.createStatement(ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
ResultSet rs = stmt.executeQuery("select instance_id, doc_id from crawler_archive.documents");
while (rs.next()) {
int instanceID = rs.getInt(1);
int docID = rs.getInt(2);
if (docID % 1000 == 0) {
System.out.println(docID);
}
}
rs.close();
//wrap.close();
}
}
After running the Java program, it will print the following message before it crashes:
161000
161000
********************************
Finalizer CALLED!!
********************************
********************************
Close CALLED!!
********************************
162000
Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
And here is the code of class SqlConnection:
class SqlConnection
{
private final String connectionString;
private Connection connection;
public SqlConnection(String connectionString) {
this.connectionString = connectionString;
}
public synchronized Connection currentConnection() throws SQLException {
if (this.connection == null || this.connection.isClosed()) {
this.closeConnection();
this.connection = DriverManager.getConnection(connectionString);
}
return this.connection;
}
protected void finalize() throws Throwable {
try {
System.out.println("********************************");
System.out.println("Finalizer CALLED!!");
System.out.println("********************************");
this.close();
} finally {
super.finalize();
}
}
public void close() {
System.out.println("********************************");
System.out.println("Close CALLED!!");
System.out.println("********************************");
this.closeConnection();
}
protected void closeConnection() {
if (this.connection != null) {
try {
connection.close();
} catch (Throwable e) {
} finally {
this.connection = null;
}
}
}
}
I'm genuinely astonished by this, but you're right. It's easily reproducible, you don't need to muck about with database connections and the like:
public class GcTest {
public static void main(String[] args) {
System.out.println("Starting");
Object dummy = new GcTest(); // gets GC'd before method exits
// gets bigger and bigger until heap explodes
Collection<String> collection = new ArrayList<String>();
// method never exits normally because of while loop
while (true) {
collection.add(new String("test"));
}
}
@Override
protected void finalize() throws Throwable {
System.out.println("Finalizing instance of GcTest");
}
}
Runs with:
Starting
Finalizing instance of GcTest
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at test.GcTest.main(GcTest.java:22)
Like I said, I can hardly believe it, but there's no denying the evidence.
It does make a perverse kind of sense, though: the VM will have figured out that the object is never used, and so gets rid of it. This must be permitted by the spec.
Going back to the question's code, you should never rely on finalize() to clean up your connections; you should always do it explicitly.
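A minimal sketch of what explicit cleanup could look like with try-with-resources. TrackedResource is a hypothetical stand-in for the question's SqlConnection, which would have to implement AutoCloseable itself for this to apply.

```java
public class TrackedResource implements AutoCloseable {
    private final String name;
    private boolean open = true;

    public TrackedResource(String name) {
        this.name = name;
    }

    public synchronized boolean isOpen() {
        return open;
    }

    /** Idempotent explicit cleanup; no finalizer involved. */
    @Override
    public synchronized void close() {
        if (open) {
            open = false;
            System.out.println("closed " + name);
        }
    }

    public static void main(String[] args) {
        // close() is called on 'res' when the block exits, so the object is
        // still in use, and therefore reachable, for the whole block.
        try (TrackedResource res = new TrackedResource("demo")) {
            System.out.println("working, open = " + res.isOpen());
        }
    }
}
```

Because the implicit close() at the end of the block is a use of the variable, the wrapper cannot become unreachable while the loop inside the block is still reading rows.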
As your code is written the object pointed to by "wrap" shouldn't be eligible for garbage collection until "wrap" pops off the stack at the end of the method.
The fact that it is being collected suggests to me your code as compiled doesn't match the original source and that the compiler has done some optimisation such as changing this:
SqlConnection wrap = new SqlConnection(connectionString);
Connection con = wrap.currentConnection();
to this:
Connection con = new SqlConnection(connectionString).currentConnection();
(Or even inlining the whole thing) because "wrap" isn't used beyond this point. The anonymous object created would be eligible for GC immediately.
The only way to be sure is to decompile the code and see what's been done to it.
There are really two different things happening here. obj is a stack variable being set to a reference to the Object, and the Object is allocated on the heap. The stack will just be cleared (by stack pointer arithmetic).
But yes, the Object itself will be cleared by garbage collection. All heap-allocated objects are subject to GC.
EDIT: To answer your more specific question, the Java spec does not guarantee collection by any particular time (see the JVM spec) (of course it will be collected after its last use). So it's only possible to answer for specific implementations.
EDIT: Clarification, per comments
As I'm sure you're aware, in Java, garbage collection and finalization are non-deterministic. All you can determine in this case is when wrap is eligible for garbage collection. I'm assuming you are asking if wrap only becomes eligible for GC when the method returns (and wrap goes out of scope). I think that some JVMs (e.g. HotSpot with -server) won't wait for the object reference to be popped from the stack; they will make it eligible for GC as soon as nothing else references it. It looks like this is what you are seeing.
To summarise, you are relying on finalization being slow enough to not finalize the instance of SqlConnection before the method exits. Your finalizer is closing a resource that the SqlConnection is no longer responsible for. Instead, you should let the Connection object be responsible for its own finalization.
Will obj be subject to garbage collection before foo() returns?
You cannot be sure obj will be collected before foo returns but it is certainly eligible for collection before foo returns.
Can anyone explain why this happens?
GCs collect unreachable objects. Your object is likely to become unreachable before foo returns.
Scope is irrelevant. The idea that obj stays on the stack until foo returns is an overly-simplistic mental model. Real systems don't work like that.
Here, obj is a local variable in the method and it is popped off the stack as soon as the method returns or exits. This leaves no way to reach the Object object on the heap and hence it will be garbage collected. And the Object object on the heap will be GC'd only after its reference obj is popped off the stack, i.e., only after the method finishes or returns.
EDIT:
To answer your update,
UPDATE: Let me make the question more clear.
Will obj be subject to garbage collection before foo() returns?
obj is just a reference to the actual object on the heap. Here obj is declared inside the method foo(). So your question "Will obj be subject to garbage collection before foo() returns?" does not apply, as obj goes inside the stack frame when the method foo() is running and is gone when the method finishes.
According to the current spec, there isn't even a happens-before ordering from finalisation to normal use. So, to impose order, you actually need to use a lock, a volatile or, if you are desperate, stashing a reference reachable from a static. There is certainly nothing special about scope.
It should be rare that you actually need to write a finaliser.
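For the "reference reachable from a static" option mentioned above, a minimal sketch might look like the following. KeepAlive and its methods are hypothetical names; the object stays strongly reachable from pin() until unpin(), whatever the local variables are doing.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class KeepAlive {
    // Identity-based so that equals()/hashCode() of pinned objects do not matter.
    private static final Set<Object> PINNED = Collections.synchronizedSet(
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()));

    /** Makes obj strongly reachable from a static field until unpin() is called. */
    public static <T> T pin(T obj) {
        PINNED.add(obj);
        return obj;
    }

    /** Releases the pin; returns true if the object was pinned. */
    public static boolean unpin(Object obj) {
        return PINNED.remove(obj);
    }

    public static boolean isPinned(Object obj) {
        return PINNED.contains(obj);
    }
}
```

Forgetting to call unpin() turns this into a memory leak, so it is best paired with a try/finally block.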