Concurrent cache using WeakReference's throws an NPE - java

I need a concurrent cache of objects where each instance wraps a unique id (and maybe some extra information, omitted for simplicity in the code fragment below), and no more objects can be created than there are corresponding ids.
I also need the objects to be GC'ed as soon as no other object references them (i.e. keep the memory footprint as low as possible), so I want to use WeakReferences, not SoftReferences.
In the below example of a factory method, T is not a generic type -- instead, it can be thought of as some arbitrary class with an id field of type String, where all ids are unique. Each value (of type Reference<T>) is mapped to the corresponding id:
static final ConcurrentMap<String, WeakReference<T>> INSTANCES = new ConcurrentHashMap<>();

@NotNull
public static T from(@NotNull final String id) {
    final AtomicReference<T> instanceRef = new AtomicReference<>();
    final T newInstance = new T(id);
    INSTANCES.putIfAbsent(id, new WeakReference<>(newInstance));
    /*
     * At this point, the mapping is guaranteed to exist.
     */
    INSTANCES.computeIfPresent(id, (k, ref) -> {
        final T oldInstance = ref.get();
        if (oldInstance == null) {
            /*
             * The object referenced by ref has been GC'ed.
             */
            instanceRef.set(newInstance);
            return new WeakReference<>(newInstance);
        }
        instanceRef.set(oldInstance);
        return ref;
    });
    return instanceRef.get();
}
The question of the WeakReferences themselves needing to be removed from the map once they are cleared (i.e. once the referent object has been GC'ed) is out of scope of this question; in the production code, this is implemented using reference queues.
AtomicReference is used solely for the purpose of passing a value out of the lambda (which is executed in the same thread as the factory method itself).
Now, the question.
After a couple of weeks of the code running successfully, I've received an NPE which originates from the extra null checks IntelliJ IDEA added thanks to @NotNull annotations:
java.lang.IllegalStateException: @NotNull method com/example/T.from must not return null
In practice, this means that the instanceRef value wasn't set in either of the branches, or the lambda passed to computeIfPresent(...) wasn't called at all.
The only possibility for a race condition I see is the map entry being removed (from a separate thread processing reference queues for GC'ed instances) somewhere between the putIfAbsent(...) and computeIfPresent(...) calls.
Is there any extra room for a race condition I am missing?

Remember that it is not only other threads that can run concurrently, but also the GC. Consider this fragment:
        instanceRef.set(oldInstance);
        return ref;
    });
    // Here!!!!!
    return instanceRef.get();
What do you think would be the effect if a GC kicked in at the Here point?
I suspect your fault is in the @NotNull because this method can return null.
Added - Logic
If the final instanceRef.get() is returning null (as is implied), then exactly one of the following cases applied when computeIfPresent was called.
The key was present and the oldInstance had been GC'd. A certainly non-null newInstance is recorded.
    // This line MUST be executed.
    instanceRef.set(newInstance);
The key was present and the oldInstance had not been GC'd. A certainly non-null oldInstance is recorded.
    // This line MUST be executed.
    instanceRef.set(oldInstance);
The key was NOT present. In that case the lambda is never invoked and instanceRef is never set.
Therefore the problem can only occur when the mapping is present after putIfAbsent is called but gone by the time computeIfPresent is executed, i.e. when an entry is deleted between the putIfAbsent and the computeIfPresent. Finding a route that returns null when no deletion is occurring is difficult.
Possible Solution
You could, perhaps, ensure that the item being referenced is always recorded in the reference.
@NotNull
public static Thing fromMe(@NotNull final String id) {
    // Keep track of the thing I've created (if any)
    // Use AtomicReference as a mutable final.
    // NB: Also delays GC as a hard reference is held.
    final AtomicReference<Thing> thing = new AtomicReference<>();
    // Make the map entry if not exists.
    INSTANCES.computeIfAbsent(id,
            // New one only made if not present.
            r -> new WeakReference<>(newThing(thing, id)));
    // Grab it - whatever its contents.
    // NB: Parallel deletions will cause a NPE here.
    trackThing(thing, INSTANCES.get(id).get());
    // Has it been GC'd
    if (thing.get() == null) {
        // Make it again!
        INSTANCES.put(id, new WeakReference<>(newThing(thing, id)));
    }
    return thing.get();
}

// Makes a new Thing - keeping track of the new one in the reference.
static Thing newThing(AtomicReference<Thing> thing, String id) {
    // Make the new Thing.
    return trackThing(thing, new Thing(id));
}

// Tracks the Thing in the Atomic.
static Thing trackThing(AtomicReference<Thing> thing, Thing it) {
    // Keep track of it.
    thing.set(it);
    return it;
}
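As an alternative, the window between the two map calls can be closed by doing the lookup, the liveness check and the replacement in one compute(...) call, whose function is invoked whether or not a mapping exists. The following is only a sketch under assumptions: Thing is a stand-in for the question's T, ThingCache is a made-up holder class, and a one-element array replaces the AtomicReference as the mutable holder.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the question's T: a class wrapping a unique id.
final class Thing {
    final String id;

    Thing(String id) {
        this.id = id;
    }
}

// Hypothetical holder class for the factory method.
final class ThingCache {
    static final ConcurrentMap<String, WeakReference<Thing>> INSTANCES = new ConcurrentHashMap<>();

    static Thing from(final String id) {
        // Mutable holder; it keeps a strong reference, so the instance
        // cannot be collected before it is returned.
        final Thing[] holder = new Thing[1];
        // compute() is atomic per key and always invokes the function,
        // with ref == null when no mapping exists.
        INSTANCES.compute(id, (key, ref) -> {
            final Thing existing = (ref == null) ? null : ref.get();
            if (existing != null) {
                holder[0] = existing;
                return ref; // keep the live mapping
            }
            // No mapping, or the referent has been GC'ed: create a fresh one.
            holder[0] = new Thing(key);
            return new WeakReference<>(holder[0]);
        });
        return holder[0];
    }
}
```

With this shape every path through the lambda sets the holder, so a concurrent removal can at worst cause a new instance to be created. A cleanup thread would presumably also want the two-argument INSTANCES.remove(id, staleRef), so that it never removes a mapping that has already been replaced.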

Related

Java - Check if reference to object in Map exists

A few weeks back I wrote a Java class with the following behavior:
Each object contains a single final integer field
The class contains a static Map (Key: Integer, Content: MyClass)
Whenever an object of the class is instantiated a look-up is done, if an object with the wanted integer field already exists in the static map: return it, otherwise create one and put it in the map.
As code:
public class MyClass
{
    private static Map<Integer, MyClass> map;
    private final int field;

    static
    {
        map = new HashMap<>();
    }

    private MyClass(int field)
    {
        this.field = field;
    }

    public static MyClass get(int field)
    {
        synchronized (map)
        {
            return map.computeIfAbsent(field, MyClass::new);
        }
    }
}
This way I can be sure that only one object exists for each integer (as field). I'm currently concerned that this will prevent the GC from collecting objects which I no longer need, since the objects are always stored in the map (a reference exists)...
If I wrote a loop-like function like this:
public void myFunction() {
    for (int i = 0; i < Integer.MAX_VALUE; i++) {
        MyClass c = MyClass.get(i);
        // DO STUFF
    }
}
I would end up with Integer.MAX_VALUE objects in memory after calling the method. Is there a way I can check, whether references to objects in the map exists and otherwise remove them?
This looks like a typical case of the multiton pattern: You want to have at most one instance of MyClass for a given key. However, you also seem to want to limit the amount of instances created. This is very easy to do by lazily instantiating your MyClass instances as you need them. Additionally, you want to clean up unused instances:
Is there a way I can check, whether references to objects in the map exists and otherwise remove them?
This is exactly what the JVM's garbage collector is for; there is no reason to try to implement your own form of "garbage collection" when the Java core library already provides tools for marking certain references as "not strong", i.e. references that keep pointing to a given object only as long as there is a strong reference (i.e. in Java, a "normal" reference) somewhere referring to it.
Implementation using Reference objects
Instead of a Map<Integer, MyClass>, you should use a Map<Integer, WeakReference<MyClass>> or a Map<Integer, SoftReference<MyClass>>: Both WeakReference and SoftReference allow the MyClass instances they refer to to be garbage-collected if there are no strong (read: "normal") references to the object. The difference between the two is that the former releases the reference on the next garbage collection action after all strong references are gone, while the latter one only releases the reference when it "has to", i.e. at some point which is convenient for the JVM (see related SO question).
Plus, there is no need to synchronize your entire Map: You can simply use a ConcurrentHashMap (which implements ConcurrentMap), which handles multi-threading in a way much better than by locking all access to the entire map. Therefore, your MyClass.get(int) could look like this:
private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = new ConcurrentHashMap<>();

public static MyClass get(final int field) {
    // ConcurrentHashMap.compute(...) is atomic <https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#compute-K-java.util.function.BiFunction->
    final Reference<MyClass> ref = INSTANCES.compute(field, (key, oldValue) -> {
        final Reference<MyClass> newValue;
        if (oldValue == null) {
            // No instance has yet been created; Create one
            newValue = new SoftReference<>(new MyClass(key));
        } else if (oldValue.get() == null) {
            // The old instance has already been deleted; Replace it with a
            // new reference to a new instance
            newValue = new SoftReference<>(new MyClass(key));
        } else {
            // The existing instance has not yet been deleted; Re-use it
            newValue = oldValue;
        }
        return newValue;
    });
    return ref.get();
}
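Note that a map like the one above still keeps an entry (the key plus a cleared reference object) for every key that was ever requested, until that key is requested again. One way to drop such stale entries is a ReferenceQueue. The following is a rough sketch, not part of the answer's code: KeyedRef and PurgingCache are hypothetical names, the values are plain Objects, and purge() would have to be called periodically or from a dedicated thread.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: a weak reference that remembers its own map key,
// so the entry can be found again once the referent is gone.
final class KeyedRef<V> extends WeakReference<V> {
    final Integer key;

    KeyedRef(Integer key, V value, ReferenceQueue<V> queue) {
        super(value, queue);
        this.key = key;
    }
}

// Hypothetical cache that removes entries whose referent was collected.
final class PurgingCache {
    static final ConcurrentMap<Integer, KeyedRef<Object>> INSTANCES = new ConcurrentHashMap<>();
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

    static void put(Integer key, Object value) {
        INSTANCES.put(key, new KeyedRef<>(key, value, QUEUE));
    }

    // Drains the queue and removes the matching entries; returns how many were removed.
    static int purge() {
        int removed = 0;
        Reference<?> ref;
        while ((ref = QUEUE.poll()) != null) {
            final KeyedRef<?> keyed = (KeyedRef<?>) ref;
            // Two-argument remove: only removes if the key still maps to
            // this exact reference, so a replaced mapping is left alone.
            if (INSTANCES.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }
}
```

The garbage collector enqueues a reference at some point after its referent becomes unreachable, so the timing of the cleanup is not deterministic; only the guarantee that a live or replaced mapping is never removed comes from the code itself.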
Finally, in a comment above, you mentioned that you would "prefer to cache maybe up to say 1000 objects and after that only cache, what is currently required/referenced". Although I personally see little (good) reason for it, it is possible to perform eager instantiation on the "first"† 1000 objects by adding them to the INSTANCES map on creation:
private static final ConcurrentMap<Integer, Reference<MyClass>> INSTANCES = createInstanceMap();

private static ConcurrentMap<Integer, Reference<MyClass>> createInstanceMap() {
    // The set of keys to eagerly initialize instances for
    final Stream<Integer> keys = IntStream.range(0, 1000).boxed();
    final Collector<Integer, ?, ConcurrentMap<Integer, Reference<MyClass>>> mapFactory = Collectors
            .toConcurrentMap(Function.identity(), key -> new SoftReference<>(new MyClass(key)));
    return keys.collect(mapFactory);
}
†How you define which objects are the "first" ones is up to you; Here, I'm just using the natural order of the integer keys because it's suitable for a simple example.
Your function for examining your cache is cringe worthy. First, as you said, it creates all the cache objects. Second, it iterates Integer.MAX_VALUE times.
Better would be:
public void myFunction() {
    for (MyClass c : map.values()) {
        // DO STUFF
    }
}
To the issue at hand: Is it possible to find out whether an Object has references to it?
Yes. It is possible. But you won't like it.
http://docs.oracle.com/javase/1.5.0/docs/guide/jvmti/jvmti.html
jvmtiError
IterateOverReachableObjects(jvmtiEnv* env,
        jvmtiHeapRootCallback heap_root_callback,
        jvmtiStackReferenceCallback stack_ref_callback,
        jvmtiObjectReferenceCallback object_ref_callback,
        void* user_data)
Loop over all reachable objects in the heap. If a MyClass object is reachable, then, well, it is reachable.
Of course, by storing the object in your cache, you are making it reachable, so you'd have to change your cache to WeakReferences, and see if you can exclude those from the iteration.
And you're no longer using pure Java, and JVMTI may not be supported by all VMs.
As I said, you won't like it.

Shared thread access to array in servlet. Which implementation to use?

Let's say I have code like this in my servlet:
private static final String RESOURCE_URL_PATTERN = "resourceUrlPattern";
private static final String PARAM_SEPARATOR = "|";
private List<String> resourcePatterns;

@Override
public void init() throws ServletException {
    String resourcePatterns = getInitParameter(RESOURCE_URL_PATTERN);
    this.resourcePatterns = com.google.common.base.Splitter.on(PARAM_SEPARATOR).trimResults().splitToList(resourcePatterns);
}
Is it thread-safe to use 'resourcePatterns' if it will never be modified?
Let's say like this:
private boolean isValidRequest(String servletPath) {
    for (String resourcePattern : resourcePatterns) {
        if (servletPath.matches(resourcePattern)) {
            return true;
        }
    }
    return false;
}
Should I use CopyOnWriteArrayList, or is ArrayList OK in this case?
Yes, List is fine to read from multiple threads concurrently, so long as nothing's writing.
For more detailed information on this, please see this answer that explains this further. There are some important gotchas.
From Java Concurrency in Practice we have:
To publish an object safely, both the reference to the object and the object's state must be made visible to other threads at the same time. A properly constructed object can be safely published by:
Initializing an object reference from a static initializer.
Storing a reference to it into a volatile field.
Storing a reference to it into a final field.
Storing a reference to it into a field that is properly guarded by a (synchronized) lock.
Your list is none of these. I suggest making it final, as this will make your object effectively immutable, which in this case would be enough. If init() is called several times you should make it volatile instead. With this I of course assume that NO changes to the elements of the list occur and that you don't expose any elements of the list either (as in a getElementAtPosition(int pos) method or the like).
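The volatile variant suggested above could look roughly like the sketch below. ResourceFilter is a made-up plain class standing in for the servlet, init(String) takes the raw parameter value directly instead of reading it from the servlet config, and String.split replaces the Guava Splitter so that the example has no dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the servlet; only the field handling matters here.
final class ResourceFilter {
    // volatile: a thread that reads the field after the write in init()
    // also sees the fully populated list.
    private volatile List<String> resourcePatterns = Collections.emptyList();

    void init(String rawPatterns) {
        final List<String> parsed = new ArrayList<>();
        // "|" is a regex metacharacter, so it has to be quoted for split().
        for (String pattern : rawPatterns.split(Pattern.quote("|"))) {
            parsed.add(pattern.trim());
        }
        // Publish only after the list is complete, and make it read-only.
        this.resourcePatterns = Collections.unmodifiableList(parsed);
    }

    boolean isValidRequest(String servletPath) {
        for (String resourcePattern : resourcePatterns) {
            if (servletPath.matches(resourcePattern)) {
                return true;
            }
        }
        return false;
    }
}
```

Building the list in a local variable and assigning the field once means a reader sees either the old list or the complete new one, never a half-filled list.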

How can we set all references to null associated with an object in java?

I have N references to a Java object. I want to set all references pointing to this object to null. I don't know how many references are pointing to this object.
This is something which is specifically not possible in Java and most other modern languages. If it were, it would be a very dangerous thing to do which could break the invariants of other objects, like collections containing your object (for example, if a hash table contained it as a key).
However, if you want to ensure that an expensive object goes away promptly when you want it to, there is a simple pattern you can use to ensure this.
// This could also be some built-in class rather than one you write
class ExpensiveObject {
    // ... fields holding giant arrays, external resources, etc etc go here ...
    void someOperation() {
        // ... code ...
    }
}

class ExpensiveObjectHolder {
    ExpensiveObject target;

    ExpensiveObjectHolder(ExpensiveObject target) {
        this.target = target;
    }

    void someOperation() {
        if (target == null) {
            throw new IllegalStateException(
                    "This ExpensiveObject has been deleted.");
        } else {
            target.someOperation();
        }
        // You can also omit this if/else and just let target.someOperation()
        // throw NullPointerException, but that might be annoying to debug.
    }

    void delete() {
        target = null;
    }
}
Then you create the wrapper around the object:
ExpensiveObjectHolder h = new ExpensiveObjectHolder(new ExpensiveObject());
and when you want it to go away, you do:
h.delete();
and since target was the only reference to the ExpensiveObject, which is now gone, we know that it is now garbage and will be discarded when the garbage collector notices.
Further notes:
It might be useful to make the ExpensiveObject and ExpensiveObjectHolder implement the same interface, but this can also make it easier to forget to use the holder where it should be.
If the ExpensiveObject has methods which do something like return this;, make sure that the method in the holder returns the holder instead.

Are there any cases where a private boolean (primitive) field variable defaults to something other than false?

Say I have a class, DisplaySpecificType.java which extends the abstract class DisplayBase.java. In my class I have a few fields:
private boolean noData;
private List<String> selections;
private List<Integer> intSelections;
And in the overridden method, the first thing I do is retrieve some data:
@Override
protected void initResponse(DataHolder holder) {
    String dataString = holder.getDataString();
    intSelections = prepareSelections(dataString);
    ...
}
Here's the prepare method:
protected List<Integer> prepareSelections(String dataString) {
    String trimmedData = StringUtils.trimToNull(dataString); // Empty strings -> null
    selections = new ArrayList<String>();
    if (trimmedData == null) {
        noData = true;
    } else {
        // Do some operations on trimmedData, convert to integers
        // (in String format) and add to selections array
    }
    intSelections = new ArrayList<Integer>();
    if (!noData) {
        for (String index : selections) {
            int num = Integer.parseInt(index) - 1;
            intSelections.add(num);
        }
    }
    return intSelections;
}
I can't really run the application from Eclipse as it interacts with several others, but I have several unit tests (Junit) that test the functionality. Creating a DataHolder object and setting the dataString in the test causes no issues and everything comes out dandy.
In practice, however, running the application that invokes this class caused intSelections to turn up empty, even when dataString was not. I logged the value of selections right after it was populated with the data from trimmedData (outside of the if/else statement), and it contained the correct data when it should have. I then logged the value of intSelections right after the for loop that populates it, and to my surprise, it was empty.
So I'd have something like this:
logger: selections = [1,2,3]
logger: intSelections = []
The noData boolean can't be set to true if selections has data. But clearly the if statement that populates intSelections is being passed over, leading me to think that noData has somehow been set to true.
Initializing noData to false at the beginning of the prepare method fixed the problem, but I'm still stumped as to why.
We're using spring, maven, and MOM queues, but I'm not terribly familiar with how it all executes. I do know that no properties are being set in the spring configs, it just initializes the DisplaySpecificType class via spring bean in the service context.
Sorry for the long-winded question, does anyone know what's going on here?
EDIT:
As I mentioned in the comments, here's what I think happened:
I was assuming that every command issued to the main application (commands go in and are eventually parsed by my project, which is where DataHolder and dataString come from) would create a new instance of this class, but this might not be the case.
If input was passed that caused dataString to be null or empty, noData would be set to true, and if the instance was not destroyed it would remain true when the next input was passed through. It's not very clear to me how the application interacts with all the other programs, so this is just my best guess.
Are there any cases where a private boolean (primitive) field variable
defaults to something other than false?
No. As others have mentioned, something else must be going on with your code. Per the Java Language Specification
4.12.5. Initial Values of Variables
Every variable in a program must have a value before its value is
used:
Each class variable, instance variable, or array component is
initialized with a default value when it is created (§15.9, §15.10):
...
For type boolean, the default value is false.
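The default value and the reuse scenario guessed at in the question's edit can both be seen in a small sketch. Display is a made-up reduction of the question's class, and prepare(...) simply returns the flag so that its state is visible; the flag is set but never reset, as in the original prepareSelections.

```java
// Hypothetical reduction of the question's class: the flag is an instance
// field, so it starts as false for every new instance and then keeps
// whatever value earlier calls on the same instance left behind.
final class Display {
    private boolean noData; // default value: false

    boolean prepare(String dataString) {
        if (dataString == null || dataString.trim().isEmpty()) {
            noData = true; // set here, but never reset for later calls
        }
        return noData;
    }
}
```

On a fresh instance, prepare("1,2,3") returns false. After one call with an empty string on the same instance, a later prepare("1,2,3") returns true, which would match the observed behaviour if the bean is reused between commands. Resetting the flag at the start of the method, or making it a local variable, removes the carried-over state.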

What does AtomicReference.compareAndSet() use for determination?

Say you have the following class
public class AccessStatistics {
    private final int noPages, noErrors;

    public AccessStatistics(int noPages, int noErrors) {
        this.noPages = noPages;
        this.noErrors = noErrors;
    }

    public int getNoPages() { return noPages; }
    public int getNoErrors() { return noErrors; }
}
and you execute the following code
private AtomicReference<AccessStatistics> stats =
        new AtomicReference<AccessStatistics>(new AccessStatistics(0, 0));

public void incrementPageCount(boolean wasError) {
    AccessStatistics prev, newValue;
    do {
        prev = stats.get();
        int noPages = prev.getNoPages() + 1;
        int noErrors = prev.getNoErrors();
        if (wasError) {
            noErrors++;
        }
        newValue = new AccessStatistics(noPages, noErrors);
    } while (!stats.compareAndSet(prev, newValue));
}
In the last line while (!stats.compareAndSet(prev, newValue)) how does the compareAndSet method determine equality between prev and newValue? Is the AccessStatistics class required to implement an equals() method? If not, why? The javadoc states the following for AtomicReference.compareAndSet
Atomically sets the value to the given updated value if the current value == the expected value.
... but this assertion seems very general and the tutorials I've read on AtomicReference never suggest implementing an equals() for a class wrapped in an AtomicReference.
If classes wrapped in AtomicReference are required to implement equals() then for objects more complex than AccessStatistics I'm thinking it may be faster to synchronize methods that update the object and not use AtomicReference.
It compares the references exactly as if you had used the == operator. That means that the references must be pointing to the same instance. Object.equals() is not used.
Actually, it does not compare prev and newValue!
Instead it compares the value stored within stats to prev, and only when those are the same does it update the value stored within stats to newValue. As said above, it uses the equality operator (==) to do so. This means that only when prev is pointing to the same object as is stored in stats will stats be updated.
It simply checks object reference equality (aka ==), so if the object reference held by the AtomicReference has changed after you got it, compareAndSet won't change the reference, and you'll have to start over.
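The identity comparison can be checked with two objects that are equal according to equals() but are distinct instances. The class and method names in this small sketch are made up for illustration:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: shows that compareAndSet compares with ==, not equals().
final class CasIdentityDemo {
    // Returns { result with an equal-but-distinct expected value,
    //           result with the very same instance as expected value }.
    static boolean[] run() {
        final String stored = new String("stats");
        final String equalCopy = new String("stats"); // equals() is true, == is false
        final AtomicReference<String> ref = new AtomicReference<>(stored);

        final boolean withCopy = ref.compareAndSet(equalCopy, "updated"); // identity differs
        final boolean withSame = ref.compareAndSet(stored, "updated");    // same instance
        return new boolean[] { withCopy, withSame };
    }
}
```

The first call fails although equalCopy.equals(stored) is true, and the second succeeds, so a class wrapped in an AtomicReference does not need an equals() implementation for compareAndSet to work.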
Following are some of the source code of AtomicReference. AtomicReference refers to an object reference. This reference is a volatile member variable in the AtomicReference instance as below.
private volatile V value;
get() simply returns the latest value of the variable (as volatiles do in a "happens before" manner).
public final V get()
Following is the most important method of AtomicReference.
public final boolean compareAndSet(V expect, V update) {
    return unsafe.compareAndSwapObject(this, valueOffset, expect, update);
}
The compareAndSet(expect, update) method calls the compareAndSwapObject() method of Java's Unsafe class. That call is native and maps to a single processor instruction. "expect" and "update" each reference an object.
If and only if the AtomicReference instance member variable "value" refers to the same object as "expect", "update" is assigned to this instance variable and true is returned. Otherwise, false is returned. The whole thing is done atomically; no other thread can interleave with it. As this is a single processor operation (the magic of modern computer architecture), it's often faster than using a synchronized block. But remember that when multiple variables need to be updated atomically, AtomicReference won't help.
I would like to add full-fledged running code, which can be run in Eclipse. It should clear up a lot of confusion. Here 22 users (MyTh threads) are trying to book 20 seats. Following is the code snippet, followed by a link to the full code.
Code snippet where 22 users are trying to book 20 seats.
for (int i = 0; i < 20; i++) { // 20 seats
    seats.add(new AtomicReference<Integer>());
}
Thread[] ths = new Thread[22]; // 22 users
for (int i = 0; i < ths.length; i++) {
    ths[i] = new MyTh(seats, i);
    ths[i].start();
}
Following is the GitHub link for those who want to see the full running code, which is small and concise.
https://github.com/sankar4git/atomicReference/blob/master/Solution.java
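For readers who do not want to follow the link, the idea can be sketched in a self-contained form. This is not the linked code: SeatBooking and simulate are made-up names, a lambda replaces the MyTh thread class, and a null value in an AtomicReference is assumed to mean "seat is free".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical re-creation of the seat booking idea; not the linked code.
final class SeatBooking {
    // Starts userCount threads that each try to book one seat,
    // and returns how many seats ended up booked.
    static int simulate(int seatCount, int userCount) {
        final List<AtomicReference<Integer>> seats = new ArrayList<>();
        for (int i = 0; i < seatCount; i++) {
            seats.add(new AtomicReference<>()); // null means the seat is free
        }
        final Thread[] users = new Thread[userCount];
        for (int i = 0; i < users.length; i++) {
            final Integer userId = i;
            users[i] = new Thread(() -> {
                for (AtomicReference<Integer> seat : seats) {
                    // Succeeds for exactly one thread per seat.
                    if (seat.compareAndSet(null, userId)) {
                        return; // booked, stop looking
                    }
                }
            });
            users[i].start();
        }
        for (Thread user : users) {
            try {
                user.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int booked = 0;
        for (AtomicReference<Integer> seat : seats) {
            if (seat.get() != null) {
                booked++;
            }
        }
        return booked;
    }
}
```

Calling SeatBooking.simulate(20, 22) books all 20 seats and leaves two users without one, because compareAndSet(null, userId) can succeed only once per seat regardless of how the threads interleave.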
