Queue isEmpty and poll inconsistency? - java

Please see the following code segment:
private static Queue<Message> m_Queue;

public boolean isQueueEmpty()
{
    if (m_Queue.isEmpty())
        return true;
    else
        return false;
}

public WgwConferenceMessage dequeue() {
    try {
        if (!isQueueEmpty())
        {
            Message message = m_Queue.poll();
            if (message != null)
            {
                if (!message.getMessage().equals(""))
                    Log4jWrapper.writeLog("Retrieved " + message.getMessage() + " from queue");
                else
                    Log4jWrapper.writeLog(LogLevelEnum.ERROR, "<Queue> dequeue", "Message empty");
                return message;
            }
            else
            {
                Log4jWrapper.writeLog(LogLevelEnum.TRACE, "<Queue> dequeue", " Q is empty!");
                return null;
            }
        }
        else
            return null;
    }
    catch (Exception e)
    {
        ExceptionHandler.printException(e, "<Queue>", "dequeue");
        return null;
    }
}

public void enqueue(Message a_Message) throws Exception
{
    try
    {
        if (m_Queue.offer(a_Message))
            Log4jWrapper.writeLog(LogLevelEnum.TRACE, "<Queue> enqueue", "Pushed " + a_Message.getMessage() + " to queue");
        else
            throw new Exception("Queue - Could not push message to queue");
    }
    catch (Exception e)
    {
        ExceptionHandler.printException(e, "Queue", "enqueue");
    }
}
My problem is that eventually I get the " Q is empty!" log line, and I can't understand how that can be: isQueueEmpty() says the queue is not empty, and poll() says it is!
Can you advise, please?
Thank you.

Assuming this code is accessed by multiple threads, the reason is that the check for emptiness and the subsequent polling are not done atomically: they are two separate actions. This means that it is possible for a different thread to call poll on the queue in between the first thread checking whether it is empty and calling poll itself; if there happened to be only one element in the queue, one of those threads is going to get null back from its call to poll.
Quoting the Javadoc for Queue:
Queue implementations generally do not allow insertion of null elements, although some implementations, such as LinkedList, do not prohibit insertion of null. Even in the implementations that permit it, null should not be inserted into a Queue, as null is also used as a special return value by the poll method to indicate that the queue contains no elements.
This means that you should use the fact that null is returned by poll as an indication that the queue was empty - you don't need to do the calls separately.
poll may be atomic - depending on the implementation of Queue you are actually using:
If you're using a non-synchronized implementation like LinkedList, you should be synchronizing it anyway if multiple threads are modifying the list, making poll atomic;
Concurrent implementations such as those of BlockingQueue implement poll atomically, so you don't need to do anything explicitly.
TL;DR:
Remove the !isQueueEmpty() check
Ensure that your poll method is atomic either by choosing a concurrent implementation, or by synchronizing mutations of the queue.
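A minimal sketch of that advice, assuming a ConcurrentLinkedQueue and the Message type from the question (this is not the poster's actual class):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class MessageQueue {
    private final Queue<Message> m_Queue = new ConcurrentLinkedQueue<Message>();

    public void enqueue(Message a_Message) {
        m_Queue.offer(a_Message); // never offer null
    }

    public Message dequeue() {
        // poll() is atomic: null means the queue was empty at that instant,
        // so there is no separate isEmpty() check to race against.
        return m_Queue.poll();
    }
}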

Your initialization of the Queue is as follows:
m_Queue = new LinkedList<Message>();
LinkedList is an implementation of Queue which allows null to be added, so you may effectively be adding null values into your m_Queue. And as @Andy mentioned, such an implementation should not be used with the poll() method.
There are two ways to avoid that:
Before adding a Message to m_Queue, you can check whether it is null.
Change new LinkedList<Message>(); to new ArrayDeque<Message>();, which throws a NullPointerException if you add null to your Queue.
I prefer the second one, as it makes the queue behave as a Queue really should.
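For illustration, a hypothetical snippet (not from the answer) showing why ArrayDeque is safer here: it rejects null at insertion time, so the mistake surfaces where it is made rather than later as a confusing null from poll():

Queue<Message> q = new ArrayDeque<Message>();
q.offer(null); // throws NullPointerException immediately: ArrayDeque forbids null elements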

Related

Caching values in ConcurrentHashMap to avoid database read

Is my code below a correct use of a Map as a simple thread-safe cache to avoid reading from the database? I just want to know about the correctness of the code below, rather than getting suggestions to use framework X instead.
public class Foo {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    public void doWork(String key) {
        String value = CACHE.get(key);
        if (value == null) {
            synchronized (CACHE) {
                value = CACHE.get(key);
                if (value == null) {
                    value = database.getValue();
                    CACHE.put(key, value);
                }
            }
        }
        // do work with value
    }
}
Other Questions:
Instead of using CACHE in synchronized(), would it be better if I had an Object lock in my class and synchronized on that instead?
Would using a HashMap for CACHE work instead?
There is a fairly standard "pattern" for using ConcurrentHashMap in this way (in this case, you do not want to use a synchronized block or other locking mechanism):
String value = CACHE.get(key);
if (value == null) {
    /* 3 */ String newValue = calculateValueForKey(key);
    /* 4 */ value = CACHE.putIfAbsent(key, newValue);
    if (value == null) {
        value = newValue;
    }
}
/* Work with 'value' */
This approach works well when calculateValueForKey() runs quickly and doesn't have any side effects, since it could be invoked multiple times for the same key depending on timing. The downside is that if calculateValueForKey() takes a long time and is I/O bound (as it is in your case), you could have multiple threads all running calculateValueForKey() for the same key at the same time. If there are 3 threads executing line 3 for the same key, 2 of them will "lose" at line 4 and have their results thrown away, which is not very efficient. For these situations I would recommend something along the following lines, which is mostly lifted from the Memoizer example in Java Concurrency in Practice (Goetz, B., 2006), a book I highly recommend:
private static final ConcurrentMap<String, Future<String>> CACHE
        = new ConcurrentHashMap<>();

public void doWork(String key)
{
    String value;
    try {
        value = calculateValueForKey(key);
    } catch (InterruptedException e) {
        // Restore interrupted status and return
        Thread.currentThread().interrupt();
        return;
    }
    // do work with value
}

private String calculateValueForKey(final String key)
        throws InterruptedException
{
    while (true) {
        Future<String> f = CACHE.get(key);
        if (f == null) {
            FutureTask<String> newCalc = new FutureTask<>(new Callable<String>() {
                @Override
                public String call()
                {
                    return database.getValue(key);
                }
            });
            f = CACHE.putIfAbsent(key, newCalc);
            if (f == null) {
                f = newCalc;
                newCalc.run();
            }
        }
        try {
            return f.get();
        } catch (CancellationException e) {
            CACHE.remove(key, f);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            } else if (cause instanceof Error) {
                throw (Error) cause;
            } else {
                throw new IllegalStateException("Not unchecked", cause);
            }
        }
    }
}
Obviously this code is more complex, which is why I've extracted the meat of it into another method, but it is very powerful. Rather than putting the value into the map, you put a Future that represents the calculation of that value into the map. Calling get() on that future will block until the computation is complete. This means that if 3 threads were simultaneously trying to retrieve the value for a given key, only a single computation would run while all 3 threads wait on the same result. Subsequent requests for the same key return immediately with the calculated result.
To answer your specific questions:
Is my code below a correct use of a Map as a simple thread-safe cache to avoid reading from the database? I'm going to say no. Your use of a synchronized block here is unnecessary. Furthermore, if multiple threads simultaneously try to access the values for different keys that are not yet in the Map, they will block each other during their respective database queries, meaning that the queries run in serial rather than in parallel.
Instead of using CACHE in synchronized(), would it be better if I had an Object lock in my class and synchronized on that instead? No. You would typically use a surrogate object for synchronization when you want to read/write multiple mutable fields and you don't want consumers of your class to be able to affect the synchronization semantics of your object "from the outside."
Would using a HashMap for CACHE work instead? I guess you could? But then you would need to adjust your synchronization policy so that CACHE (or a surrogate lock object) is always synchronized on when the Map is read from or written to. I'm not sure why you would want to do that, given the better alternatives.
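As an editorial aside, not part of the original answer: on Java 8 and later, ConcurrentHashMap.computeIfAbsent gives much of the Memoizer's behaviour in a single atomic call. The mapping function runs at most once per absent key, and concurrent callers for that key wait for its result. A minimal sketch, reusing the hypothetical database.getValue(key) call from the answer above:

private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

public void doWork(String key) {
    // Atomic per key: the lambda runs at most once for an absent key,
    // and concurrent callers for the same key block until it completes.
    String value = CACHE.computeIfAbsent(key, k -> database.getValue(k));
    // do work with value
}

One caveat: computeIfAbsent holds an internal lock while the function runs, so a slow database call can also delay unrelated operations that hash to the same bin; for long I/O the Future-based approach above still has an edge.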
CACHE.get(key) will throw a NullPointerException if the key is null. Read the manual:
Like Hashtable but unlike HashMap, this class does not allow null to be used as a key or value.
Furthermore, it doesn't really make sense to synchronize on your map and then try to retrieve the value again. The method should rather report that it cannot get a value for that key, and that's it!
Also, there is no need to synchronize on a ConcurrentHashMap, hence the name.
Create an additional method which retrieves the value from the database if the value is not in the map!
I strongly suggest testing your methods with unit tests!
Be careful with custom caches. Sometimes they only make things worse, since they can be a great source of reference leaks, e.g. when the last reference to an object comes from the cache. WeakReferences or PhantomReferences can solve this problem. Check this post for further details.
Another issue is the synchronization cost that comes with the ConcurrentHashMap. Sometimes it's worth it, sometimes not.
You might want to limit the cache size and remove the least used references - but that will cause some overhead too.
So, you'll have to measure performance carefully.
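A minimal sketch of the weak-reference idea mentioned above, with hypothetical Key and Value types: a synchronized WeakHashMap drops an entry once its key is no longer strongly referenced anywhere else, so the cache can never be the last owner of an object:

import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Entries vanish when their keys become weakly reachable,
// so the cache cannot cause a reference leak by itself.
Map<Key, Value> cache = Collections.synchronizedMap(new WeakHashMap<Key, Value>());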

Null pointer exception for a multithreaded java code

I am having a problem with Java code that uses the SimJava simulation library. The library basically helps with creating independent entities that can run as Java threads. The problem I am having is that I have a code segment that is used as the body of each of the entities running as a thread. These entities/threads share an event (that transfers a ConcurrentLinkedQueue) between them. I used the ConcurrentLinkedQueue because I had problems with LinkedList concurrency control.
The problem is that if I run the model for 100 repetitions or fewer, it works fine. If I go above 100, I get a null pointer exception on the concurrent queue. Here is the code segment of the body that has the problem.
The null pointer exception happens at the line where I am trying to pull from the queue, even though the line before checks whether the queue is empty or null. The exception is thrown at the line
nextNode = dcPath.poll().intValue();
For some reason the poll call is returning null and intValue() is being applied to a null object. My question is: how is this possible when the if statement before it already checks the queue's content? How can I control this race condition?
public void body() {
    synchronized (this) {
        ConcurrentLinkedQueue<Integer> dcPath = new ConcurrentLinkedQueue<Integer>();
        int nextNode;
        int distance = 0;
        while (Sim_system.running()) {
            Sim_event e = new Sim_event();
            sim_get_next(e); // Get the next event
            dcPath = (ConcurrentLinkedQueue<Integer>) e.get_data();
            if ((dcPath != null) && (!dcPath.isEmpty())) {
                nextNode = dcPath.poll().intValue(); // THIS LINE IS THROWING NPE Exception
                if ((dcPath != null) && (!dcPath.isEmpty())) {
                    int outPort = findMatchingOutPort(dcPath.peek().intValue());
                    if (outPort != -1) {
                        sim_schedule(out[outPort], 0.0, 0, dcPath);
                        distance = this.calculateSensorToSensorDistance(out[outPort].get_dest());
                    }
                }
            }
        }
    }
}
I think the problem is that when you retrieve dcPath from e.get_data(), another thread is concurrently reading from and writing to that queue. So you have your code with:
if ((dcPath != null) && (!dcPath.isEmpty())) {
At that moment, dcPath is not empty. But while the next line executes, another thread polls the remaining element and makes the queue empty, and that's why dcPath.poll() gives you a null value.
To prevent this, you need to synchronize on your dcPath reference instead of this. Like the following:
if (dcPath != null) {
    synchronized (dcPath) {
        // do something
    }
}
And also, any other thread that reads or writes this object needs to synchronize on it too, to make sure it behaves as you expect.
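Alternatively, the advice from the first question above applies here as well: let poll() itself report emptiness instead of checking isEmpty() first. A sketch against the question's code (findMatchingOutPort is the poster's method):

Integer head = dcPath.poll(); // atomic: returns null iff the queue was empty at that instant
if (head != null) {
    nextNode = head.intValue();
    Integer next = dcPath.peek(); // may also be null, since another thread can drain the queue meanwhile
    if (next != null) {
        int outPort = findMatchingOutPort(next.intValue());
        // ... schedule as before
    }
}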
I figured out how to resolve the issue. I wouldn't say this is the most efficient way (especially in terms of memory management), but it will do the job for my simulation model and the number of instances I need to run.
I basically created a copy of the ConcurrentLinkedQueue dcPath object in each thread instance and worked on that copy. This eliminated the race condition on the shared object. I can now run more than a thousand iterations without any exceptions being thrown, with extended threads that can exceed 500 instances.
if (e.get_data() != null) {
    ConcurrentLinkedQueue<Integer> dcPath = new ConcurrentLinkedQueue<Integer>(
            (ConcurrentLinkedQueue<Integer>) e.get_data());
    ...
}

Will the use of a Bounded Buffer (Producer/Consumer) avoid the pain of having sync methods/deadlocks?

I'm coding a simple bank simulator where users log in from different locations at once, using sockets. In the bank server I keep a bounded buffer to store every incoming request, e.g. transfer funds, get account balance etc., and there's a background thread running at the server end (the Buffer Reader) that pulls each request out of this request queue (assume it works like a thread scheduler in an OS), on an FCFS basis.
I have made the buffer's put() and get() methods use conditional synchronization.
For example:
// put method
while (total_buffer_size == current_total_requests) {
    System.out.println("Buffer is full");
    wait();
}
So my question is: do we have to synchronize methods like get-balance or transfer-funds to avoid corruption of data? I believe it is not necessary, since the Buffer Reader takes each request one by one and performs the relevant action. Have I avoided any deadlock situations through this? What do you think? Thanks
EDIT2:
public synchronized boolean put(Messenger msg, Thread t, Socket s) throws InterruptedException {
    while (total_buffer_size == current_total_requests) {
        System.out.println("Buffer is full");
        wait();
    }
    current_total_requests++;
    requests[cur_req_in] = new Request(msg, s); // insert into queue
    cur_req_in = (cur_req_in + 1) % total_buffer_size;
    notifyAll();
    return true;
}

// take each incoming message in queue. FIFO rule followed
public synchronized Request get() throws InterruptedException {
    while (current_total_requests == 0) wait();
    Request out = requests[cur_req_out];
    requests[cur_req_out] = null;
    cur_req_out = (cur_req_out + 1) % total_buffer_size;
    current_total_requests--;
    notifyAll(); // wake all waiting threads to continue put()
    return out;
}
If there is only one consumer (i.e. one thread that consumes the requests from the "buffer"), then you don't need to use any synchronization on the methods relating to the bank account. However, I don't believe that your current implementation of a "bounded buffer" is valid. To be more specific:
while (total_buffer_size == current_total_requests) {
    System.out.println("Buffer is full");
    wait();
}
There is absolutely no guarantee of how many threads will get past the while loop, context-switch just before current_total_requests is incremented, and queue more requests than the buffer size allows. Unless your put method is synchronized, this approach will be extremely unreliable and prone to race conditions.
If you want a bounded buffer, then just use one of Java's already existing "bounded buffers" or more specifically: the BlockingQueue. The BlockingQueue blocks on put(...):
Inserts the specified element into this queue, waiting if necessary for space to become available.
It also blocks on take() if there is no data in the queue. I don't know if you can use one of the items in the concurrency library, but if you can't then you have to fix your BoundedBuffer.
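A minimal sketch of that suggestion; handle(), running, msg and s are placeholders, and Request is the class from the question:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

BlockingQueue<Request> requests = new ArrayBlockingQueue<Request>(64); // bounded capacity

// producer side (one per client connection):
requests.put(new Request(msg, s)); // blocks while the buffer is full

// single consumer (the "Buffer Reader" thread):
while (running) {
    Request r = requests.take(); // blocks while the buffer is empty
    handle(r); // FCFS: ArrayBlockingQueue hands requests out in FIFO order
}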

Observer - BlockingQueue

I'm using the observer pattern and a BlockingQueue to add some instances. Now in another method I'm consuming from the queue, but it seems take() waits forever, even though I'm doing it like this:
/** {@inheritDoc} */
@Override
public void diffListener(final EDiff paramDiff, final IStructuralItem paramNewNode,
        final IStructuralItem paramOldNode, final DiffDepth paramDepth) {
    final Diff diff =
            new Diff(paramDiff, paramNewNode.getNodeKey(), paramOldNode.getNodeKey(), paramDepth);
    mDiffs.add(diff);
    try {
        mDiffQueue.put(diff);
    } catch (final InterruptedException e) {
        LOGWRAPPER.error(e.getMessage(), e);
    }
    mEntries++;
    if (mEntries == AFTER_COUNT_DIFFS) {
        try {
            mRunner.run(new PopulateDatabase(mDiffDatabase, mDiffs));
        } catch (final Exception e) {
            LOGWRAPPER.error(e.getMessage(), e);
        }
        mEntries = 0;
        mDiffs = new LinkedList<>();
    }
}
/** {@inheritDoc} */
@Override
public void diffDone() {
    try {
        mRunner.run(new PopulateDatabase(mDiffDatabase, mDiffs));
    } catch (final Exception e) {
        LOGWRAPPER.error(e.getMessage(), e);
    }
    mDone = true;
}
whereas mDiffQueue is a LinkedBlockingQueue and I'm using it like this:
while (!(mDiffQueue.isEmpty() && mDone) || mDiffQueue.take().getDiff() == EDiff.INSERTED) {}
But I think the first expression is checked while mDone isn't true yet; then mDone may be set to true (is an observer always multithreaded?), but the loop is already blocked in mDiffQueue.take()? :-/
Edit: I really don't get it right now. I've recently changed it to:
synchronized (mDiffQueue) {
    while (!(mDiffQueue.isEmpty() && mDone)) {
        if (mDiffQueue.take().getDiff() != EDiff.INSERTED) {
            break;
        }
    }
}
If I wait a little in the debugger it works, but it should also work in "real time", since mDone is initialized to false and therefore the while-condition should be true and the body should be executed.
If the mDiffQueue is empty and mDone is true it should skip the body of the while-loop (which means the queue isn't filled anymore).
Edit: It seems this works:
synchronized (mDiffQueue) {
    while (!(mDiffQueue.isEmpty() && mDone)) {
        if (mDiffQueue.peek() != null) {
            if (mDiffQueue.take().getDiff() != EDiff.INSERTED) {
                break;
            }
        }
    }
}
Even though I don't get why the peek() is mandatory.
Edit:
What I'm doing is iterating over a tree and I want to skip all INSERTED nodes:
for (final AbsAxis axis = new DescendantAxis(paramRtx, true); axis.hasNext(); axis.next()) {
    skipInserts();
    final IStructuralItem node = paramRtx.getStructuralNode();
    if (node.hasFirstChild()) {
        depth++;
        skipInserts();
        ...
Basically I'm computing the maximum depth or level in the tree without considering nodes which have been deleted in another revision of the tree (for a comparison Sunburst visualization), but OK, that's maybe out of scope. Just to illustrate that I'm doing something with nodes which haven't been inserted, even if it's just adjusting the maximum depth.
regards,
Johannes
take() is a "blocking call". That means it will block (wait forever) until something is on the queue, then it will return what was added. Of course, if something is already on the queue, it will return immediately.
You can use peek() to return what would be returned by take() - that is, peek() returns the next item without removing it from the queue, or returns null if there's nothing on the queue. Try using peek() instead in your test (but check for null too).
First advice: don't synchronize on mDiffQueue. You would get a deadlock if LinkedBlockingQueue had some synchronized methods; that's not the case here, but it's a practice you should avoid. Anyway, I don't see why you are synchronizing at that point.
You have to "wake up" periodically while waiting to check if mDone has been set:
while (!(mDiffQueue.isEmpty() && mDone)) {
    // poll returns null if nothing is added to the queue within 100 ms.
    Diff diff = mDiffQueue.poll(100, TimeUnit.MILLISECONDS);
    if (diff != null)
        process(diff);
}
This is about the same as using peek, except that peek returns immediately instead of waiting. Using peek this way is called "busy waiting" (your thread runs the while loop non-stop) and using poll with a timeout is called "semi-busy waiting" (you let the thread sleep at intervals).
I guess in your case process(diff) would be to get out of the loop if diff is not of type EDiff.INSERTED. I'm not sure if that is what you are trying to accomplish. This seems odd since you are basically just stalling the consumer thread until you get a single element of the right type, and then you do nothing with it. And you cannot receive the future incoming elements since you are out of the while loop.

Producer-consumer problem with a twist

The producer is finite, as should be the consumer.
The problem is when to stop, not how to run.
Communication can happen over any type of BlockingQueue.
Can't rely on poisoning the queue (PriorityBlockingQueue)
Can't rely on locking the queue (SynchronousQueue)
Can't rely on offer/poll exclusively (SynchronousQueue)
Probably even more exotic queues in existence.
Creates a queued seq on another (presumably lazy) seq s. The queued seq will produce a concrete seq in the background, and can get up to n items ahead of the consumer. n-or-q can be an integer n buffer size, or an instance of java.util.concurrent BlockingQueue. Note that reading from a seque can block if the reader gets ahead of the producer.
http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/seque
My attempts so far + some tests: https://gist.github.com/934781
Solutions in Java or Clojure appreciated.
class Reader {
    private final ExecutorService ex = Executors.newSingleThreadExecutor();
    private final List<Object> completed = new ArrayList<Object>();
    private final BlockingQueue<Object> doneQueue = new LinkedBlockingQueue<Object>();
    private int pending = 0;

    public synchronized Object take() {
        removeDone();
        queue();
        Object rVal;
        if (completed.isEmpty()) {
            try {
                rVal = doneQueue.take();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            pending--;
        } else {
            rVal = completed.remove(0);
        }
        queue();
        return rVal;
    }

    private void removeDone() {
        Object current = doneQueue.poll();
        while (current != null) {
            completed.add(current);
            pending--;
            current = doneQueue.poll();
        }
    }

    private void queue() {
        while (pending < 10) {
            pending++;
            ex.submit(new Runnable() {
                @Override
                public void run() {
                    doneQueue.add(compute());
                }

                private Object compute() {
                    // do actual computation here
                    return new Object();
                }
            });
        }
    }
}
Not exactly an answer, I'm afraid, but a few remarks and more questions. My first answer would be: use clojure.core/seque. The producer needs to communicate end-of-seq somehow for the consumer to know when to stop, and I assume the number of produced elements is not known in advance. Why can't you use an EOS marker (if that's what you mean by queue poisoning)?
If I understand your alternative seque implementation correctly, it will break when elements are taken off the queue outside your function, since channel and q will be out of step in that case: channel will hold more #(.take q) elements than there are elements in q, causing it to block. There might be ways to ensure channel and q are always in step, but that would probably require implementing your own Queue class, and it adds so much complexity that I doubt it's worth it.
Also, your implementation doesn't distinguish between normal EOS and abnormal queue termination due to thread interruption; depending on what you're using it for, you might want to know which is which. Personally, I don't like using exceptions in this way: use exceptions for exceptional situations, not for normal flow control.
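For reference, a minimal Java sketch of the EOS-marker ("poison pill") idea; produceAll() and consume() are placeholders. Note that the question rules this out for PriorityBlockingQueue, which may reorder the sentinel, but it works for FIFO queues:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final BlockingQueue<Object> queue = new LinkedBlockingQueue<Object>();
final Object EOS = new Object(); // sentinel, never produced as a real item

// producer:
for (Object item : produceAll()) {
    queue.put(item);
}
queue.put(EOS); // signal end of stream

// consumer:
while (true) {
    Object item = queue.take();
    if (item == EOS) {
        break; // producer finished and the queue is drained
    }
    consume(item);
}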
