Remove from a collection during iteration - java

I have a set of connection objects (library code I cannot change) that have a send method. If sending fails, they call back a generic onClosed listener that I implement; it calls removeConnection() in my code, which removes the connection from the collection.
The onClosed callback is generic and can be called at any time. It is called when the peer closes the connection, for example, and not just when a write fails.
However, if I have some code that loops over my connections and sends, then the onClosed callback will attempt to modify a collection during iteration.
My current code creates a copy of the connections list before each iteration over it; however, profiling has shown this to be very expensive.
Set<Connection> connections = new ....;
public void addConnection(Connection conn) {
connections.add(conn);
conn.addClosedListener(this);
}
@Override void onClosed(Connection conn) {
connections.remove(conn);
}
void send(Message msg) {
// how to make this so that the onClosed callback can be safely invoked, and efficient?
for(Connection conn: connections)
conn.send(msg);
}
How can I efficiently cope with modifying collections during iteration?

To modify a collection while iterating over it without getting an exception, use a ListIterator and make the changes through the iterator itself.
http://www.mkyong.com/java/how-do-loop-iterate-a-list-in-java/ - example
If you remove elements through the collection itself inside a for-each loop, you will typically get a ConcurrentModificationException, so be careful.
In addition, you could write your own iterator with whatever logic you need; just implement the java.util.Iterator interface.

A ConcurrentSkipListSet is probably what you want.
You could also use a CopyOnWriteArraySet. This will still make a copy, but only when the set is modified. So as long as Connection objects are not added or removed frequently, this would be more efficient.
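A minimal sketch of the CopyOnWriteArraySet suggestion. The Conn class is a hypothetical stand-in for the library's Connection type (which is not shown in the question), and its send method simply simulates a failed write that calls back into the removal code.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical stand-in for the library's Connection type.
final class Conn {
    private final boolean healthy;
    private final Set<Conn> owner;

    Conn(boolean healthy, Set<Conn> owner) {
        this.healthy = healthy;
        this.owner = owner;
    }

    // Simulates a failed send calling back into removal code mid-iteration.
    void send(String msg) {
        if (!healthy) {
            owner.remove(this);
        }
    }
}

final class CopyOnWriteDemo {
    // Sends to every connection and returns how many remain afterwards.
    static int sendToAll() {
        Set<Conn> connections = new CopyOnWriteArraySet<>();
        connections.add(new Conn(true, connections));
        connections.add(new Conn(false, connections));
        connections.add(new Conn(true, connections));

        // The loop walks a snapshot of the backing array, so the removal
        // triggered inside send() cannot throw ConcurrentModificationException.
        for (Conn conn : connections) {
            conn.send("hello");
        }
        return connections.size();
    }

    public static void main(String[] args) {
        System.out.println(sendToAll()); // prints 2
    }
}
```

The trade-off is that every add or remove copies the whole backing array, so this only pays off when sends greatly outnumber connection changes.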

You can also use ConcurrentHashMap.
ConcurrentHashMap is thread-safe and its iterators are weakly consistent: they never throw ConcurrentModificationException, so you don't need to make a copy in order to iterate.
Take a look at this implementation: http://www.java2s.com/Tutorial/Java/0140__Collections/Concurrentset.htm
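As a rough illustration of the ConcurrentHashMap idea, the sketch below uses ConcurrentHashMap.newKeySet() (available since Java 8) to get a Set view backed by a ConcurrentHashMap. The string elements and the class name are placeholders chosen for the example, not part of the question's code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrentSetDemo {
    // Removes an element from the set while a for-each loop is walking it.
    static Set<String> removeWhileIterating() {
        Set<String> connections = ConcurrentHashMap.newKeySet();
        connections.add("alpha");
        connections.add("beta");
        connections.add("gamma");

        // The iterator is weakly consistent, so removing through the set
        // itself during the loop does not throw.
        for (String conn : connections) {
            if (conn.startsWith("b")) {
                connections.remove(conn);
            }
        }
        return connections;
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIterating().size()); // prints 2
    }
}
```

Unlike the copy-on-write set, removal here does not copy anything, but iteration order is unspecified.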

I would write a collection wrapper that:
Keeps a set of objects that are to be removed. If the iteration across the underlying collection comes across one of these it is skipped.
On completion of iteration, takes a second pass across the list to remove all of the gathered objects.
Perhaps something like this:
class ModifiableIterator<T> implements Iterator<T> {
// My iterable.
final Iterable<T> it;
// The Iterator we are walking.
final Iterator<T> i;
// The removed objects.
Set<T> removed = new HashSet<T>();
// The next actual one to return.
T next = null;
public ModifiableIterator(Iterable<T> it) {
this.it = it;
i = it.iterator();
}
@Override
public boolean hasNext() {
while ( next == null && i.hasNext() ) {
// Pull a new one.
next = i.next();
if ( removed.contains(next)) {
// Not that one.
next = null;
}
}
if ( next == null ) {
// Finished! Close.
close();
}
return next != null;
}
@Override
public T next() {
T n = next;
next = null;
return n;
}
// Close down - remove all removed.
public void close () {
if ( !removed.isEmpty() ) {
Iterator<T> i = it.iterator();
while ( i.hasNext() ) {
if ( removed.contains(i.next())) {
i.remove();
}
}
// Clear down.
removed.clear();
}
}
@Override
public void remove() {
throw new UnsupportedOperationException("Not supported.");
}
public void remove(T t) {
removed.add(t);
}
}
public void test() {
List<String> test = new ArrayList<>(Arrays.asList("A","B","C","D","E"));
ModifiableIterator<String> i = new ModifiableIterator<>(test);
i.remove("A");
i.remove("E");
System.out.println(test);
while ( i.hasNext() ) {
System.out.println(i.next());
}
System.out.println(test);
}
You may need to consider whether your list could contain null values, in which case you will need to tweak it somewhat.
Please remember to close the iterator if you abandon the iteration before it completes.

Related

Java DeepCopy Iterator without consuming it

How can I copy my iterator to another one without consuming it? Or, failing that, can I reset the index back to the first element after I consume it?
I am looking for something like the code below, where it should still print the values after the copy:
Iterator iter2=copy(iter1);
while(iter1.hasNext())
{
System.out.println(iter1.next()); // Should print this, even after copy
}
The contract of an Iterator is to be an "only forward" way of iterating through a series of objects.
As mentioned in your comment, you are trying to log the values of an Iterator, yet still use the Iterator elsewhere.
You could, though, do something tricky by wrapping the Iterator in a custom class that calls the wrapped Iterator and logs each value as the next method is called.
A bit hacky. Not recommended in general but could be useful in a debugging situation.
You would construct this WrappedIterator using the original Iterator as parameter and then pass the WrappedIterator to the code which consumes it.
public class WrappedIterator<T> implements Iterator<T> {
private Iterator<T> iterator;
public WrappedIterator(Iterator<T> iterator) {
this.iterator = iterator;
}
@Override
public void remove() {
this.iterator.remove();
}
@Override
public boolean hasNext() {
return this.iterator.hasNext();
}
@Override
public T next() {
T next = iterator.next();
System.out.println(next);
return next;
}
}
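If the goal is simply to walk the same elements more than once, a different technique from the wrapper above is to drain the iterator into a list once and then ask the list for fresh iterators. This is only a sketch under the assumption that the sequence is finite and fits in memory; the IteratorBuffer name and drain helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IteratorBuffer {
    // Drains the source iterator once into a list; the list can then
    // produce any number of independent iterators over the same elements.
    static <T> List<T> drain(Iterator<T> source) {
        List<T> buffer = new ArrayList<>();
        while (source.hasNext()) {
            buffer.add(source.next());
        }
        return buffer;
    }

    public static void main(String[] args) {
        Iterator<String> iter1 = Arrays.asList("x", "y", "z").iterator();
        List<String> buffer = drain(iter1); // iter1 is exhausted after this call

        Iterator<String> first = buffer.iterator();
        Iterator<String> second = buffer.iterator();
        while (first.hasNext()) {
            System.out.println(first.next());
        }
        // Walking 'first' does not advance 'second'.
        System.out.println(second.next());
    }
}
```

The original iterator is still consumed; what survives is the buffered list, which the rest of the code would have to use instead.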

ArrayList iterator throwing ConcurrentModificationException

I have an ArrayList with two accessor methods and a notifier. My list:
private final List<WeakReference<LockListener>> listeners = new ArrayList<>();
All subscribe operations use this:
public void subscribe(@NonNull LockListener listener) {
for (Iterator<WeakReference<LockListener>> it = listeners.iterator(); it.hasNext(); ) {
// has this one already subscribed?
if (listener.equals(it.next().get())) {
return;
}
}
listeners.add(new WeakReference<>(listener));
}
All unsubscribe operations use this:
public void unsubscribe(@NonNull LockListener listener) {
if (listeners.isEmpty()) {
return;
}
for (Iterator<WeakReference<LockListener>> it = listeners.iterator(); it.hasNext(); ) {
WeakReference<LockListener> ref = it.next();
if (ref == null || ref.get() == null || listener.equals(ref.get())) {
it.remove();
}
}
}
And the notifier:
private void notifyListeners() {
if (listeners.isEmpty()) {
return;
}
Iterator<WeakReference<LockListener>> it = listeners.iterator();
while (it.hasNext()) {
WeakReference<LockListener> ref = it.next();
if (ref == null || ref.get() == null) {
it.remove();
} else {
ref.get().onLocked();
}
}
}
What I'm seeing in my testing is that it.next() in notifyListeners() occasionally throws a ConcurrentModificationException. My guess is that this is due to listeners.add() in the subscriber method.
I guess I had a misunderstanding of the iterator here. I was under the assumption that iterating over the list protected me from concurrency issues caused by add/remove operations.
Apparently I'm wrong here. Is it that the iterator is only a protection from ConcurrentModificationException while changing the collection you're iterating? For example, calling remove() on your list while iterating would throw an error, but calling it.remove() is safe.
In my case, subscribing calls add() on the same list as it is being iterated. Is my understanding here correct?
If I read your last sentence correctly, the three methods in your example are called concurrently from several threads. If this is indeed the case, then this is your problem.
ArrayList is not thread-safe. Modifying it concurrently without additional synchronization results in undefined behavior, no matter if you modify it directly or using an iterator.
You could either synchronize access to the list (e.g. by making the three methods synchronized), or use a thread-safe collection class like ConcurrentLinkedDeque. In the latter case, make sure to read the JavaDoc (especially the part about iterators being weakly consistent) to understand what is guaranteed and what is not.
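To make the ConcurrentLinkedDeque option concrete, here is a small single-threaded sketch showing that adding to the deque while an iterator is active does not throw. The string elements and class name are placeholders; a real version would hold the WeakReference listeners from the question.

```java
import java.util.Deque;
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedDeque;

final class WeaklyConsistentDemo {
    // Adds to the deque while an iterator over it is active and
    // returns the final size.
    static int addWhileIterating() {
        Deque<String> listeners = new ConcurrentLinkedDeque<>();
        listeners.add("first");
        listeners.add("second");

        Iterator<String> it = listeners.iterator();
        while (it.hasNext()) {
            String current = it.next();
            if (current.equals("first")) {
                // With an ArrayList, the following it.next() would throw
                // ConcurrentModificationException after this add.
                listeners.add("third");
            }
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        System.out.println(addWhileIterating()); // prints 3
    }
}
```

Weak consistency means the running iterator may or may not visit the newly added element, so code must not rely on either outcome.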

Is it safe to clear a Set in a loop if it finds the correct value?

I'm in this situation: if I find a specific value in a HashSet, I have to update a field, clear the set and return the field.
Here is an example:
static Set<Integer> testSet = new HashSet<>();
static Integer myField = null; // the field could be already != null
public static int testClearSet()
{
for (int i = 0; i < 100; i++) { // this is just for the test
testSet.add(i);
}
for (Integer n : testSet) {
if (n == 50) {
myField = n;
testSet.clear();
return myField;
}
}
return -1;
}
I'm wondering whether doing this to the set is safe, considering that I need to reuse the set later on.
I'm asking because I know that changing a Collection while iterating over it is not considered good practice, but I think this case is a little different.
A possible solution would be:
boolean clear = false;
for (Integer n : testSet) {
if (n == 50) {
myField = n;
clear = true;
break;
}
}
if (clear) {
testSet.clear();
return myField;
}
So, which one is the right way?
It should be safe to remove elements from a set when using an explicit iterator. Hence the following should be safe:
Iterator<Integer> iterator = testSet.iterator();
while (iterator.hasNext()) {
Integer element = iterator.next();
if (element.intValue() == 50) {
testSet.clear();
break;
}
}
A ConcurrentModificationException is only thrown if you continue iterating after changing it manually.
What you do is change it and abort iterating, so it should be 100% safe (regardless of the for-each implementation).
The real issue is the readability of the code. A piece of code should ideally do one job, and if that job is complicated, it should be split up. In particular, your code has two parts, a condition and an action:
if (some condition) do some action
So:
public static int testClearSet() {
if (setContains(50)) {
myField = 50;
testSet.clear();
return myField;
}
return -1;
}
private static boolean setContains(int searchFor) {
for (Integer n : testSet) {
if (n == searchFor) {
return true;
}
}
return false;
}
The latter method can be replaced with a single API call, for you to figure out.
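The answer leaves that API call unnamed; one candidate (my assumption) is Set.contains, which looks the value up directly instead of looping. A sketch of the question's method rewritten that way, with the searched value turned into a parameter for illustration:

```java
import java.util.HashSet;
import java.util.Set;

final class ClearOnMatch {
    static final Set<Integer> testSet = new HashSet<>();
    static Integer myField = null;

    // Looks the value up directly, then updates the field and clears the set.
    // No iterator is involved, so no ConcurrentModificationException can occur.
    static int testClearSet(int searchFor) {
        if (testSet.contains(searchFor)) {
            myField = searchFor;
            testSet.clear();
            return myField;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            testSet.add(i);
        }
        System.out.println(testClearSet(50)); // prints 50
        System.out.println(testSet.isEmpty()); // prints true
    }
}
```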
If you know that your Set is only modified from one thread, you can clear it as in your first example.
The clear() method itself does not throw ConcurrentModificationException.
Both versions of your code will work.
There is indeed a restriction on modifying a collection while you iterate over it with a fail-fast iterator. Using a fail-fast iterator will fail if the collection has been modified, other than through the iterator itself, after the iterator was created. The iterators returned by the standard non-concurrent java.util collection classes (ArrayList, HashSet, HashMap and so on) are fail-fast.
private void removeDataTest (Collection<String> c, String item) {
Iterator<String> iter = c.iterator(); //Iterator is created.
while (iter.hasNext()) {
String data = iter.next();
if (data.equals(item)) {
//c.remove(data); // Problem: modifies the collection directly after the iterator was created; the next call to iter.next() will typically throw ConcurrentModificationException.
iter.remove(); // Fine: modifies the collection through the iterator.
//c.clear(); break; // Also okay: modifies the collection directly, but breaks out immediately and never uses the iterator again.
}
}
}
In your code, you don't continue iterating after the set is modified, so it should be fine.
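The fail-fast behaviour described above can be reproduced in a few lines. The sketch below is illustrative only (the class and method names are invented): the first method removes through the collection and keeps iterating, the second uses Collection.removeIf (Java 8+) as one safe alternative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class FailFastDemo {
    // Returns true if removing through the collection inside a for-each
    // loop, and then continuing to iterate, raises the exception.
    static boolean directRemoveFails() {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        try {
            for (String item : items) {
                if (item.equals("a")) {
                    items.remove(item); // bypasses the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Performs the same removal without an explicit loop.
    static List<String> removeWithRemoveIf() {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        items.removeIf(item -> item.equals("a"));
        return items;
    }

    public static void main(String[] args) {
        System.out.println(directRemoveFails());   // prints true
        System.out.println(removeWithRemoveIf());  // prints [b, c, d]
    }
}
```

The check is best-effort: removing the second-to-last element of an ArrayList in a for-each loop happens to end the loop without an exception, which is why the behaviour should never be relied on.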

Java: Implement stack with one queue, any problems?

So I tried to implement a stack with just one queue and it appears to work, but I'm not sure if there's something wrong with it, since most of the solutions I've seen online use two queues. Can anyone tell me if there are problems with my implementation?
public class MyStack<T> {
/**
* @param args
*/
private Queue<T> q = new LinkedList<T>();
public MyStack(){
}
public static void main(String[] args) {
// TODO Auto-generated method stub
MyStack<String> s = new MyStack<String>();
s.push("1");
s.push("2");
s.push("3");
s.push("4");
System.out.println(s.pop());
System.out.println(s.pop());
System.out.println(s.pop());
System.out.println(s.pop());
System.out.println(s.pop());
}
public void push(T s){
q.offer(s);
}
public T pop(){
int n = q.size();
for(int i = 0; i < n-1; i++){
q.offer(q.poll());
}
return q.poll();
}
}
Output:
4
3
2
1
null
Your solution is inefficient because you have to loop through the whole stack every time you pop something from it. (Effectively you have to traverse the whole linked list, before removing the element that was at the end.)
Edit: Java's linked list is doubly linked anyway, so this is entirely pointless.
You should use either a Stack or a Deque or even a LinkedList.
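For comparison with the queue-based version, here is a short sketch of the Deque route. ArrayDeque is my choice of implementation for the example; the answer only names the Deque interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DequeStackDemo {
    // Pushes "1".."4" and pops everything, concatenating in pop order.
    static String pushThenPop() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("1");
        stack.push("2");
        stack.push("3");
        stack.push("4");

        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop()); // constant time, no rotation of elements
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(pushThenPop()); // prints 4321
    }
}
```

Note that pop() on an empty ArrayDeque throws NoSuchElementException rather than returning null as the question's version does; poll() returns null instead.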
Implementing your own is just ... pointless. Unless of course (as @bas suggests) you are doing a course on data structures, in which case you should go Commando and implement your own structure from scratch. Using another structure because it is nearly like the one you are trying to make is like using a hammer with screws.
If you really need to implement something yourself something like this should work:
import java.util.EmptyStackException;
public class Stack<T> {
private Entry top = null;
private class Entry {
final Entry up;
final T it;
public Entry(Entry up, T it) {
this.up = up;
this.it = it;
}
}
public void push ( T it ) {
top = new Entry(top, it);
}
public T pop () {
if ( top == null ) {
throw new EmptyStackException();
}
T it = top.it;
top = top.up;
return it;
}
}
NB: This may not be thread safe.
There is absolutely no reason a stack should use two queues. As a matter of fact, it only needs to keep track of one top-node that references the nodes below it.
The code seems to work, but as nachokk said, this is not the site for code review. This site is meant for when you run into errors and require assistance.
You must use two queues ONLY when you have basic queue operations, like enqueue and dequeue. When you can use other methods, especially iterating over the queue, you can do it with only one queue, like you did.

Advice for efficient blocking queries

I would like to store tuple objects in a concurrent Java collection and then have an efficient, blocking query method that returns the first element matching a pattern. If no such element is available, it would block until one is present.
For instance if I have a class:
public class Pair {
public final String first;
public final String second;
public Pair( String first, String second ) {
this.first = first;
this.second = second;
}
}
And a collection like:
public class FunkyCollection {
public void add( Pair p ) { /* ... */ }
public Pair get( Pair p ) { /* ... */ }
}
I would like to query it like:
myFunkyCollection.get( new Pair( null, "foo" ) );
which returns the first available pair whose second field equals "foo", or blocks until such an element is added. Another query example:
myFunkyCollection.get( new Pair( null, null ) );
should return the first available pair whatever its values.
Does a solution already exist? If not, how would you suggest implementing the get( Pair p ) method?
Clarification: The method get( Pair p) must also remove the element. The name choice was not very smart. A better name would be take( ... ).
Here's some source code. It is basically the same as what cb160 said, but having the source code might help to clear up any questions you may still have. In particular, the methods on the FunkyCollection must be synchronized.
As meriton pointed out, the get method performs an O(n) scan for every blocked get every time a new object is added. It also performs an O(n) operation to remove objects. This could be improved by using a data structure similar to a linked list where you can keep an iterator to the last item checked. I haven't provided source code for this optimization, but it shouldn't be too difficult to implement if you need the extra performance.
import java.util.*;
public class BlockingQueries
{
public class Pair
{
public final String first;
public final String second;
public Pair(String first, String second)
{
this.first = first;
this.second = second;
}
}
public class FunkyCollection
{
final ArrayList<Pair> pairs = new ArrayList<Pair>();
public synchronized void add( Pair p )
{
pairs.add(p);
notifyAll();
}
public synchronized Pair get( Pair p ) throws InterruptedException
{
while (true)
{
for (Iterator<Pair> i = pairs.iterator(); i.hasNext(); )
{
Pair pair = i.next();
boolean firstOk = p.first == null || p.first.equals(pair.first);
boolean secondOk = p.second == null || p.second.equals(pair.second);
if (firstOk && secondOk)
{
i.remove();
return pair;
}
}
wait();
}
}
}
class Producer implements Runnable
{
private FunkyCollection funkyCollection;
public Producer(FunkyCollection funkyCollection)
{
this.funkyCollection = funkyCollection;
}
public void run()
{
try
{
for (int i = 0; i < 10; ++i)
{
System.out.println("Adding item " + i);
funkyCollection.add(new Pair("foo" + i, "bar" + i));
Thread.sleep(1000);
}
}
catch (InterruptedException e)
{
Thread.currentThread().interrupt();
}
}
}
public void go() throws InterruptedException
{
FunkyCollection funkyCollection = new FunkyCollection();
new Thread(new Producer(funkyCollection)).start();
System.out.println("Fetching bar5.");
funkyCollection.get(new Pair(null, "bar5"));
System.out.println("Fetching foo2.");
funkyCollection.get(new Pair("foo2", null));
System.out.println("Fetching foo8, bar8");
funkyCollection.get(new Pair("foo8", "bar8"));
System.out.println("Finished.");
}
public static void main(String[] args) throws InterruptedException
{
new BlockingQueries().go();
}
}
Output:
Fetching bar5.
Adding item 0
Adding item 1
Adding item 2
Adding item 3
Adding item 4
Adding item 5
Fetching foo2.
Fetching foo8, bar8
Adding item 6
Adding item 7
Adding item 8
Finished.
Adding item 9
Note that I put everything into one source file to make it easier to run.
I know of no existing container that will provide this behavior. One problem you face is the case where no existing entry matches the query. In that case, you'll have to wait for new entries to arrive, and those new entries are supposed to arrive at the tail of the sequence. Given that you're blocking, you don't want to have to examine all the entries that precede the latest addition, because you've already inspected them and determined that they don't match. Hence, you need some way to record your current position, and be able to search forward from there whenever a new entry arrives.
This waiting is a job for a Condition. As suggested in cb160's answer, you should allocate a Condition instance inside your collection, and block on it via Condition#await(). You should also expose a companion overload to your get() method to allow timed waiting:
public Pair get(Pair p) throws InterruptedException;
public Pair get(Pair p, long time, TimeUnit unit) throws InterruptedException;
Upon each call to add(), call on Condition#signalAll() to unblock the threads waiting on unsatisfied get() queries, allowing them to scan the recent additions.
You haven't mentioned how or if items are ever removed from this container. If the container only grows, that simplifies how threads can scan its contents without worrying about contention from other threads mutating the container. Each thread can begin its query with confidence as to the minimum number of entries available to inspect. However, if you allow removal of items, there are many more challenges to confront.
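A rough sketch of the Condition-based design described above, including the timed overload. It uses take(...) as the method name, following the question's clarification that the element is removed. The class name, the added field and the removeMatch helper are all hypothetical, and for brevity it rescans the whole list on every wake-up rather than remembering the last inspected position.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class ConditionCollection {
    // Mirrors the Pair class from the question.
    static final class Pair {
        final String first;
        final String second;
        Pair(String first, String second) {
            this.first = first;
            this.second = second;
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition added = lock.newCondition();
    private final List<Pair> pairs = new LinkedList<>();

    void add(Pair p) {
        lock.lock();
        try {
            pairs.add(p);
            added.signalAll(); // wake every waiting query so it can rescan
        } finally {
            lock.unlock();
        }
    }

    // Blocks until a match is available, then removes and returns it.
    Pair take(Pair pattern) throws InterruptedException {
        lock.lock();
        try {
            Pair match;
            while ((match = removeMatch(pattern)) == null) {
                added.await();
            }
            return match;
        } finally {
            lock.unlock();
        }
    }

    // Timed variant: returns null if no match arrives within the timeout.
    Pair take(Pair pattern, long time, TimeUnit unit) throws InterruptedException {
        long nanos = unit.toNanos(time);
        lock.lock();
        try {
            Pair match;
            while ((match = removeMatch(pattern)) == null) {
                if (nanos <= 0L) {
                    return null;
                }
                nanos = added.awaitNanos(nanos);
            }
            return match;
        } finally {
            lock.unlock();
        }
    }

    // Caller must hold the lock. A null field in the pattern matches anything.
    private Pair removeMatch(Pair pattern) {
        for (Iterator<Pair> i = pairs.iterator(); i.hasNext(); ) {
            Pair pair = i.next();
            boolean firstOk = pattern.first == null || pattern.first.equals(pair.first);
            boolean secondOk = pattern.second == null || pattern.second.equals(pair.second);
            if (firstOk && secondOk) {
                i.remove();
                return pair;
            }
        }
        return null;
    }
}
```

Both loops re-check the list after every wake-up, which also guards against spurious wake-ups from await().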
In your FunkyCollection add method you could call notifyAll on the collection itself every time you add an element.
In the get method, if the underlying container (any suitable container is fine) doesn't contain the value you need, wait on the FunkyCollection. When the wait is notified, check to see if the underlying container contains the result you need. If it does, return the value; otherwise, wait again.
It appears you are looking for an implementation of Tuple Spaces. The Wikipedia article about them lists a few implementations for Java, perhaps you can use one of those. Failing that, you might find an open source implementation to imitate, or relevant research papers.
