Random beginning index iterator for HashSet

I use HashSet for its add(), remove(), clear() and iterator() methods. So far everything has worked like a charm. However, now I need to fulfill a different requirement.
I'd like to be able to start iterating from a certain index. For example, I'd like the following two programs to produce the same output.
Program 1
Iterator it=map.iterator();
for(int i=0;i<100;i++)
{
it.next();
}
while (it.hasNext())
{
doSomethingWith(it.next());
}
Program 2
Iterator it=map.iterator(100);
while (it.hasNext())
{
doSomethingWith(it.next());
}
The reason I don't want to use Program 1 is that it creates unnecessary overhead. From my research, I could not find a practical way of creating an iterator with a beginning index.
So, my question is, what would be a good way to achieve my goal while minimizing the overhead?
Thank you.

There is a reason why add() and remove() are fast in a HashSet: you give up the ability to treat the elements as a random-access list in exchange for speed and lower memory cost.
I'm afraid you can't really do that unless you convert your Set into a List first. This is simple to do, but it involves processing every element in the Set. If you want to start the iterator from a certain place more than once from the same state, it might make sense. If not, you will probably be better off with your current approach.
And now for the code, assuming that Set<Integer> set = new HashSet<Integer>(); is your declared data structure:
List<Integer> list = new ArrayList<Integer>(set);
list.subList(100, list.size()).iterator(); // this will get your iterator.
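A minimal sketch of the copy-then-index idea above, wrapped in a reusable helper. The class name IndexedIteration and the method iteratorFrom are made up for illustration. The copy costs O(n) time and memory, and because a HashSet has no defined order, "index 100" only means position 100 in whatever order the snapshot happened to capture.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class IndexedIteration {

    // Hypothetical helper: snapshots the set into a list and returns an
    // iterator positioned at the given index of that snapshot.
    public static <T> Iterator<T> iteratorFrom(Set<T> set, int index) {
        List<T> snapshot = new ArrayList<>(set); // O(n) copy of the current contents
        // listIterator(int) throws IndexOutOfBoundsException if index > size
        return snapshot.listIterator(index);
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 250; i++) {
            set.add(i);
        }
        int visited = 0;
        Iterator<Integer> it = iteratorFrom(set, 100);
        while (it.hasNext()) {
            it.next();
            visited++;
        }
        System.out.println(visited); // 150 elements remain after skipping 100
    }
}
```

The returned iterator walks the snapshot, not the live set, so later changes to the set are not reflected and calling remove() on it does not touch the set.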

A HashSet has no order, so you could copy the set into a List, which supports access by index.
Example:
HashSet set = new HashSet();
//......
ArrayList list = new ArrayList(set);

Since a HashSet's iterator produces items in no particular order, it doesn't really make any difference whether you drop 100 items from the beginning or from the end.
Dropping items from the end would be faster.
Iterator it = map.iterator();
int n = map.size() - 100;
for (int i = 0; i < n; i++)
doSomethingWith(it.next());

You can make use of a NavigableMap. If you can rely on keys (and start from a certain key), that's out of the box.
Map<K,V> submap = navigableMap.tailMap(fromKey);
Then you'll use the resulting submap to simply get the iterator() and do your stuff.
Otherwise, if you must start at some index, you may need to make use of a temporary list.
K fromKey = new ArrayList<K>( navigableMap.keySet() ).get(index);
and then get the submap as above.

Following @toader's suggestion.
for(Integer i : new ArrayList<Integer>(set).subList(100, set.size())) {
// from the 100'th value.
}
Note: the nth value has no meaning with a HashSet. Perhaps you need a SortedSet, in which case the 100th element would be the 100th smallest value (a SortedSet iterates in ascending order).
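If starting from a value rather than a position is acceptable, a sorted set avoids the copy entirely. The sketch below assumes a TreeSet of Integer; the class name SortedStart and the method iteratorFromValue are hypothetical. Note that add() and remove() on a TreeSet are O(log n) rather than the expected O(1) of a HashSet.

```java
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.TreeSet;

class SortedStart {

    // Returns an iterator over the elements >= from, in ascending order.
    // tailSet returns a view of the set, so no elements are copied.
    public static Iterator<Integer> iteratorFromValue(NavigableSet<Integer> set, int from) {
        return set.tailSet(from, true).iterator();
    }

    public static void main(String[] args) {
        NavigableSet<Integer> set = new TreeSet<>();
        for (int i = 0; i < 10; i++) {
            set.add(i * 10); // 0, 10, 20, ... 90
        }
        Iterator<Integer> it = iteratorFromValue(set, 35);
        while (it.hasNext()) {
            System.out.println(it.next()); // 40, 50, 60, 70, 80, 90
        }
    }
}
```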

Related

ConcurrentModificationException, how to avoid? [duplicate]

AFAIK, there are two approaches:
Iterate over a copy of the collection
Use the iterator of the actual collection
For instance,
List<Foo> fooListCopy = new ArrayList<Foo>(fooList);
for(Foo foo : fooListCopy){
// modify actual fooList
}
and
Iterator<Foo> itr = fooList.iterator();
while(itr.hasNext()){
// modify actual fooList using itr.remove()
}
Are there any reasons to prefer one approach over the other (e.g. preferring the first approach for the simple reason of readability)?

Let me give a few examples with some alternatives to avoid a ConcurrentModificationException.
Suppose we have the following collection of books
List<Book> books = new ArrayList<Book>();
books.add(new Book(new ISBN("0-201-63361-2")));
books.add(new Book(new ISBN("0-201-63361-3")));
books.add(new Book(new ISBN("0-201-63361-4")));
Collect and Remove
The first technique consists in collecting all the objects that we want to delete (e.g. using an enhanced for loop) and after we finish iterating, we remove all found objects.
ISBN isbn = new ISBN("0-201-63361-2");
List<Book> found = new ArrayList<Book>();
for(Book book : books){
if(book.getIsbn().equals(isbn)){
found.add(book);
}
}
books.removeAll(found);
This is supposing that the operation you want to do is "delete".
If you want to "add" this approach would also work, but I would assume you would iterate over a different collection to determine what elements you want to add to a second collection and then issue an addAll method at the end.
Using ListIterator
If you are working with lists, another technique consists in using a ListIterator which has support for removal and addition of items during the iteration itself.
ListIterator<Book> iter = books.listIterator();
while(iter.hasNext()){
if(iter.next().getIsbn().equals(isbn)){
iter.remove();
}
}
Again, I used the "remove" method in the example above which is what your question seemed to imply, but you may also use its add method to add new elements during iteration.
Using JDK >= 8
For those working with Java 8 or later, there are a couple of other techniques you could use.
You could use the new removeIf method of the Collection interface:
ISBN other = new ISBN("0-201-63361-2");
books.removeIf(b -> b.getIsbn().equals(other));
Or use the new stream API:
ISBN other = new ISBN("0-201-63361-2");
List<Book> filtered = books.stream()
.filter(b -> b.getIsbn().equals(other))
.collect(Collectors.toList());
In this last case the filter keeps the books that match. To filter elements out of a collection, either negate the predicate and reassign the original reference to the filtered collection (i.e. books = filtered), or use the filtered collection to removeAll the found elements from the original collection (i.e. books.removeAll(filtered)).
Use Sublist or Subset
There are other alternatives as well. If the list is sorted, and you want to remove consecutive elements you can create a sublist and then clear it:
books.subList(0,5).clear();
Since the sublist is backed by the original list this would be an efficient way of removing this subcollection of elements.
Something similar could be achieved with sorted sets using NavigableSet.subSet method, or any of the slicing methods offered there.
Considerations:
Which method you use depends on what you intend to do.
The collect and removeAll technique works with any Collection (Collection, List, Set, etc.).
The ListIterator technique only works with lists, provided that the given ListIterator implementation supports the add and remove operations.
The Iterator approach works with any type of collection, but it only supports remove operations.
With the ListIterator/Iterator approach the obvious advantage is not having to copy anything, since we remove as we iterate. So no extra memory is needed, although each removal from an ArrayList still shifts the elements after it.
The JDK 8 streams example doesn't actually remove anything; it looks for the desired elements, after which we replace the original collection reference with the new one and let the old one be garbage collected. So we iterate over the collection only once, which is efficient.
In the collect and removeAll approach the disadvantage is that we have to iterate twice. First we iterate in the for loop looking for objects that match our removal criteria, and once we have found them, we ask to remove them from the original collection, which implies a second iteration to look for those items in order to remove them.
I think it is worth mentioning that the remove method of the Iterator interface is marked as "optional" in the Javadocs, which means that there could be Iterator implementations that throw UnsupportedOperationException if we invoke the remove method. As such, I'd say this approach is less safe than the others if we cannot guarantee that the iterator supports removal of elements.
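To make the difference concrete, here is a small sketch contrasting removal through the list with removal through the iterator. The class and method names (CmeDemo, removeViaList, removeViaIterator) are invented for this example, and it assumes an ArrayList; the Javadoc describes fail-fast detection as best-effort, so the exception should not be relied on for program logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class CmeDemo {

    // Removes through the list itself inside a for-each loop.
    // Returns true if a ConcurrentModificationException was thrown.
    public static boolean removeViaList(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object): changes the list behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator, which keeps its own bookkeeping in sync.
    public static void removeViaIterator(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() == 1) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> first = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(removeViaList(first)); // true: the next call to next() detects the change

        List<Integer> second = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        removeViaIterator(second);
        System.out.println(second); // [2, 3, 4]
    }
}
```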

Old Timer Favorite (it still works):
List<String> list;
for(int i = list.size() - 1; i >= 0; --i)
{
if(list.get(i).contains("bad"))
{
list.remove(i);
}
}
Benefits:
It only iterates over the list once
No extra objects created, or other unneeded complexity
No problems with trying to use the index of a removed item, because... well, think about it!
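The last benefit is easier to see side by side. The sketch below (class and method names are hypothetical) runs the same removal forwards without index correction and backwards. On an ArrayList each remove(i) still shifts the tail, so both loops are O(n²) in the worst case; only their correctness differs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexRemoval {

    // Forward loop without index correction: after remove(i) the next element
    // slides into slot i, and i++ then steps over it.
    public static List<String> removeForward(List<String> input) {
        List<String> list = new ArrayList<>(input);
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).contains("bad")) {
                list.remove(i);
            }
        }
        return list;
    }

    // Backward loop: a removal only shifts elements that were already visited.
    public static List<String> removeBackward(List<String> input) {
        List<String> list = new ArrayList<>(input);
        for (int i = list.size() - 1; i >= 0; --i) {
            if (list.get(i).contains("bad")) {
                list.remove(i);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("bad1", "bad2", "ok");
        System.out.println(removeForward(data));  // [bad2, ok]  (bad2 was skipped)
        System.out.println(removeBackward(data)); // [ok]
    }
}
```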

In Java 8, there is another approach. Collection#removeIf
eg:
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
list.removeIf(i -> i > 2);

Are there any reasons to prefer one approach over the other
The first approach will work, but has the obvious overhead of copying the list.
The second approach will not work because many containers don't permit modification during iteration. This includes ArrayList.
If the only modification is to remove the current element, you can make the second approach work by using itr.remove() (that is, use the iterator's remove() method, not the container's). This would be my preferred method for iterators that support remove().

Only the second approach will work. You can modify a collection during iteration using iterator.remove() only. All other attempts will cause a ConcurrentModificationException.

You can't do the second, because even if you use the remove() method on Iterator, you'll get an Exception thrown.
Personally, I would prefer the first for all Collection instances. Despite the additional overhead of creating the new Collection, I find it less prone to error during edits by other developers. On some Collection implementations the Iterator remove() is supported, on others it isn't. You can read more in the docs for Iterator.
The third alternative is to create a new Collection, iterate over the original, and add to the second Collection all the members of the first that are not up for deletion. Depending on the size of the Collection and the number of deletes, this could save significantly on memory compared to the first approach.

I would choose the second as you don't have to do a copy of the memory and the Iterator works faster. So you save memory and time.

You can see this sample. Suppose we want to remove the odd values from a list:
public static void main(String[] args) {
// The filter keeps the even values, which is what removes the odd ones.
Predicate<Integer> isEven = v -> v % 2 == 0;
List<Integer> listArr = Arrays.asList(5, 7, 90, 11, 55, 60);
listArr = listArr.stream().filter(isEven).collect(Collectors.toList());
listArr.forEach(System.out::println);
}

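Why not this?
for( int i = 0; i < Foo.size(); i++ )
{
if( Foo.get(i).equals( someTest ) ) // someTest stands for the value to match
{
Foo.remove(i);
i--; // the next element has moved into slot i, so check this index again
}
}
And if it's a map, not a list, you can use keySet().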

Java get element from List while the size is changing [duplicate]

How do I remove all the same integers from an arraylist of objects? [duplicate]

How to iterate over Set and add elements to it in java?

I want to iterate over a Set and, if some condition is met, add elements to it.
While doing this, I am getting a ConcurrentModificationException.
When I looked for an answer, I found that ListIterator has add() and remove() methods, but I can't use a list as I also have to take care of duplicates.
Please suggest a way to achieve this.
Edit:
int[] A = {1,2,3,4,5,10,6,7,9};
Set<Integer> s = new HashSet<>();
s.add(1);
Iterator i = s.iterator();
while(i.hasNext()){
int temp = i.next();
int x = next element of array A;
if(x%2==0){
s.add(temp*x);
}
}
But it is throwing ConcurrentModificationException.

How to iterate over Set and add elements to it in java?
It cannot be done. Certainly, not with a HashSet or TreeSet. You will probably need to find an alternative way of coding your algorithm that doesn't rely on doing that.
The normal solution is to create a temporary list, add elements to that list, then when you have finished iterating use addAll to add the list elements to the set. But that won't work here because you appear to want your iterator to see the new elements that you have added.
A second approach would be to use a ConcurrentHashMap and Collections::newSetFromMap instead of a HashSet. Iterating a concurrent collection won't give a ConcurrentModificationException. However, the flipside is that there are no guarantees that the iterator will see all of the elements that were added during the iteration. So this probably wouldn't work (reliably) for your example.
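One alternative way of coding such an algorithm is a worklist: keep the set for duplicate detection, and drive the loop from a separate queue so the set is never iterated while it changes. This is only a sketch under assumptions; the class WorklistDemo, the method expand, and the doubling rule are all invented for illustration and would need to be replaced by the real condition.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class WorklistDemo {

    // Starting from seed, keeps deriving new values while they stay below limit.
    // Newly added values are queued, so they get processed too, but the set
    // itself is never iterated while it is being modified.
    public static Set<Integer> expand(int seed, int limit) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        seen.add(seed);
        pending.add(seed);
        while (!pending.isEmpty()) {
            int current = pending.poll();
            int next = current * 2; // hypothetical rule for deriving a new element
            if (next < limit && seen.add(next)) { // add returns false for duplicates
                pending.add(next);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(expand(1, 20).size()); // 5 elements: 1, 2, 4, 8, 16
    }
}
```

Unlike the concurrent-set approach, every element that gets added is guaranteed to be processed exactly once, because it passes through the queue.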

Remove list elements - my approach for best performance in Java

If I need to remove elements in a list, will the following be better than using LinkedList:
int j = 0;
List list = new ArrayList(1000000);
...
// fill in the list code here
...
for (Iterator i = list.listIterator(); i.hasNext(); j++) {
if (checkCondition) {
i.remove();
i = list.listIterator(j);
}
}
?
LinkedList removes and adds elements more efficiently than ArrayList, but as a doubly-linked list it needs more memory, since each element is wrapped in an Entry object. I only need a one-directional List, because I iterate in ascending order of index.

The answer is: it depends on the frequency and distribution of your adds and removes. If you only have to do a single remove infrequently, then you might use a linked list. However, the main advantage of an ArrayList over a LinkedList is constant-time random access. You can't really get that with a normal linked list (though look at a skip list for some inspiration). If, instead, you're removing elements relative to other elements (for example, you need to remove the next element), then you should use a linked list.

There is no simple answer to this:
It depends on what you are optimizing for. Do you care more about the time taken to perform the operations, or the space used by the lists?
It depends on how long the lists are.
It depends on the proportion of elements that you are removing from the lists.
It depends on the other things that you do to the list.
The chances are that one or more of these determining factors is not predictable up-front; i.e. you don't really know. So my advice would be to put this off for now; i.e. just pick one or the other based on gut feeling (or a coin toss). You can revisit the decision later, if you have a quantifiable performance problem in this area ... as demonstrated by cpu or memory usage profiling.
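If profiling does point at removal from a large ArrayList, one option worth measuring is a single-pass compaction, which avoids shifting the tail once per removed element. The sketch below is an illustration, not a benchmark result; the class Compaction and the method removeMatching are hypothetical, and it assumes a list with cheap get and set such as ArrayList. In practice list.removeIf(...) is usually the simplest way to express the same intent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Compaction {

    // Removes every element matching shouldRemove in one pass over the list.
    public static <T> void removeMatching(List<T> list, Predicate<? super T> shouldRemove) {
        int write = 0;
        for (int read = 0; read < list.size(); read++) {
            T item = list.get(read);
            if (!shouldRemove.test(item)) {
                list.set(write++, item); // keep: move it down to the next free slot
            }
        }
        list.subList(write, list.size()).clear(); // drop the leftover tail in one step
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add(i);
        }
        removeMatching(list, v -> v % 2 == 0);
        System.out.println(list); // [1, 3, 5, 7, 9]
    }
}
```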
