Filter java.util.Collection in Java - java

I wrote a util class to filter elements in java.util.Collection as follows:
public class Util{
public static <T> void filter(Collection<T> l, Filter<T> filter) {
Iterator<T> it= l.iterator();
while(it.hasNext()) {
if(!filter.match(it.next())) {
it.remove();
}
}
}
}
public interface Filter<T> {
public boolean match(T o);
}
Questions:
Do you think it's necessary to write the method?
Any improvement about the method?

You should allow any Filter<? super T> not just Filter<T>.
Clients might also want to have a method that returns a new Collection instead:
public static <T> Collection<T> filter(Collection<T> unfiltered,
Filter<? super T> filter)
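A minimal sketch of that non-mutating variant, assuming the Filter interface from the question. The holder class name Filters and the choice of ArrayList for the result are illustrative assumptions, not part of the original code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Filter interface as declared in the question.
interface Filter<T> {
    boolean match(T o);
}

// Hypothetical holder class for this sketch.
final class Filters {
    private Filters() {}

    // Returns a new collection with the matching elements; the input is left untouched.
    // Filter<? super T> lets a Filter<Object> be applied to a Collection<String>.
    static <T> Collection<T> filter(Collection<T> unfiltered, Filter<? super T> filter) {
        List<T> result = new ArrayList<T>();
        for (T element : unfiltered) {
            if (filter.match(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Filter<Object> nonNull = new Filter<Object>() {
            public boolean match(Object o) { return o != null; }
        };
        List<String> names = Arrays.asList("a", null, "b");
        // The Filter<Object> is accepted for a list of String.
        System.out.println(filter(names, nonNull));
    }
}
```

Because the result is a fresh list, this also works for inputs whose iterators do not support remove().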

No. The Guava libraries already have this functionality. See Iterables.filter(iterableCollection, predicate), where the Predicate implements the filtering.

Whether it's necessary depends on what you want to achieve. If you can use other third party libs like Google Collections, then no. If it's planned to be a one-off, then probably not. If you plan on creating different filters, then yep, looks like a good approach to keep things modular and cohesive.
One suggestion - you might want to return a Collection - that way, you have the option of returning a new filtered Collection rather than mutating the original Collection. That could be handy if you need to use it in a concurrent context.
You might also look at the responses to this similar question.

Regarding question 1, there are already a lot of collection libraries. Filtering is offered, for instance, by Apache Commons Collections (CollectionUtils) and Google Collections (Iterables).

Looks nice - but we can't decide, if it's 'necessary' to write it (OK, you actually wrote it ;) )
The remove() method is not always implemented; it is documented as an optional operation. Some Iterators just throw an UnsupportedOperationException. You should catch it or convert it to a custom exception saying that this collection can't be filtered.
And then you could change the method signature to
public static <T> void filter(Iterable<T> i, Filter<T> filter)
because iterators are not limited to Collections. With this utility method you could filter every 'container' that provides an iterator which allows remove operations.
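A rough sketch of how both suggestions could be combined, assuming the Filter interface from the question. The class name IterableFilters is made up, and IllegalArgumentException merely stands in for whatever custom exception you prefer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Filter interface as declared in the question.
interface Filter<T> {
    boolean match(T o);
}

// Hypothetical holder class for this sketch.
final class IterableFilters {
    private IterableFilters() {}

    // Removes every non-matching element in place, for any Iterable whose
    // iterator supports remove().
    static <T> void filter(Iterable<T> items, Filter<? super T> filter) {
        Iterator<T> it = items.iterator();
        while (it.hasNext()) {
            if (!filter.match(it.next())) {
                try {
                    it.remove();
                } catch (UnsupportedOperationException e) {
                    // Assumption: translate into an exception that names the real problem.
                    throw new IllegalArgumentException(
                            "This container cannot be filtered in place", e);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4));
        filter(numbers, new Filter<Integer>() {
            public boolean match(Integer n) { return n % 2 == 0; }
        });
        System.out.println(numbers); // only the even numbers remain
    }
}
```

Note that elements removed before the failure stay removed, so a caller that needs all-or-nothing behavior is better served by a variant that returns a new collection.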

Do you think it's necessary to write the method?
If you don't mind using a third party library then no.
Some suggestions for third party libraries that provide this functionality:
You might want to look at Functional Java which provides filter plus many other higher order functions found in true-blue functional languages.
Example:
List<Person> adults = filter(people, new F1<Person, Boolean>() {
public Boolean f(Person p) {
return p.getAge() > 18;
}
});
Another alternative is using lambdaj - a library with similar goals but is much more concise than Functional Java. lambdaj doesn't cover as much ground as Functional Java though.
Example:
List<Person> adults = filter(having(on(Person.class).getAge(), greaterThan(18)), people);

I think it would be cool to have a visit(T o) method defined in your Filter<T> interface. That way the filter implementation can decide what action to take on the visited object when there is a match.

Related

any use case of declaring a List<?> [duplicate]

I am refreshing my knowledge on Java generics. So I turned to the excellent tutorial from Oracle ... and started to put together a presentation for my coworkers. I came across the section on wildcards in the tutorial that says:
Consider the following method, printList:
public static void printList(List<Object> list) {
...
The goal of printList is to print a list of any type, but it fails to achieve that goal — it prints only a list of Object instances; it cannot print List<Integer>, List<String>, List<Double>, and so on, because they are not subtypes of List<Object>. To write a generic printList method, use List<?>:
public static void printList(List<?> list) {
I understand that List<Object> will not work; but I changed the code to
static <E> void printObjects(List<E> list) {
for (E e : list) {
System.out.println(e.toString());
}
}
...
List<Object> objects = Arrays.<Object>asList("1", "two");
printObjects(objects);
List<Integer> integers = Arrays.asList(3, 4);
printObjects(integers);
And guess what; using List<E> I can print different types of Lists without any problem.
Long story short: at least the tutorial indicates that one needs the wildcard to solve this problem; but as shown, it can be solved this way too. So, what am I missing?!
(side note: tested with Java7; so maybe this was a problem with Java5, Java6; but on the other hand, Oracle seems to do a good job regarding updates of their tutorials)
Your approach of using a generic method is strictly more powerful than a version with wildcards, so yes, your approach is possible, too. However, the tutorial does not state that using a wildcard is the only possible solution, so the tutorial is also correct.
What you gain with the wildcard in comparison to the generic method: You have to write less and the interface is "cleaner" since a non generic method is easier to grasp.
Why the generic method is more powerful than the wildcard method: You give the parameter a name which you can reference. For example, consider a method that removes the first element of a list and adds it to the back of the list. With generic parameters, we can do the following:
static <T> boolean rotateOneElement(List<T> l){
return l.add(l.remove(0));
}
with a wildcard, this is not possible since l.remove(0) would return capture-1-of-?, but l.add would require capture-2-of-?. I.e., the compiler is not able to deduce that the result of remove is the same type that add expects. This is in contrast to the first example, where the compiler can deduce that both are the same type T. This code would not compile:
static boolean rotateOneElement(List<?> l){
return l.add(l.remove(0)); //ERROR!
}
So, what can you do if you want to have a rotateOneElement method with a wildcard, since it is easier to use than the generic solution? The answer is simple: Let the wildcard method call the generic one, then it works:
// Private implementation
private static <T> boolean rotateOneElementImpl(List<T> l){
return l.add(l.remove(0));
}
//Public interface
static void rotateOneElement(List<?> l){
rotateOneElementImpl(l);
}
The standard library uses this trick in a number of places. One of them is, IIRC, Collections.java
Technically, there is no difference between
<E> void printObjects(List<E> list) {
and
void printList(List<?> list) {
When you are declaring a type parameter, and using it only once, it essentially becomes a wildcard parameter.
On the other hand, if you use it more than once, the difference becomes significant. e.g.
<E> void printObjectsExceptOne(List<E> list, E object) {
is completely different than
void printObjects(List<?> list, Object object) {
You can see that the first case forces both types to be the same, while there is no such restriction in the second case.
As a result, if you are going to use a type parameter only once, it does not even make sense to name it. That is (most probably) why the Java architects invented so-called wildcard arguments.
Wildcard parameters avoid unnecessary code bloat and make code more readable. If you need two, you have to fall back to regular syntax for type parameters.
Hope this helps.
Both solutions are effectively the same; it's just that in the second one you are naming the wildcard. This can come in handy when you want to use the wildcard several times in the signature, but want to make sure that both refer to the same type:
static <E> void printObjects(List<E> list, PrintFormat<E> format) {

Are there any Java standard classes that implement Iterable without implementing Collection?

I have a conundrum that's caused me to ponder whether there are any standard java classes that implement Iterable<T> without also implementing Collection<T>. I'm implementing one interface that requires me to define a method that accepts an Iterable<T>, but the object I'm using to back this method requires a Collection<T>.
This has me writing some really kludgy-feeling code that gives unchecked warnings when compiled.
public ImmutableMap<Integer, Optional<Site>> loadAll(
Iterable<? extends Integer> keys
) throws Exception {
Collection<Integer> _keys;
if (keys instanceof Collection) {
_keys = (Collection<Integer>) keys;
} else {
_keys = Lists.newArrayList(keys);
}
final List<Site> sitesById = siteDBDao.getSitesById(_keys);
// snip: convert the list to a map
Changing my resulting collection to use the more generified Collection<? extends Integer> type doesn't eliminate the unchecked warning for that line. Also, I can't change the method signature to accept a Collection instead of an Iterable because then it's no longer overriding the super method and won't get called when needed.
There doesn't seem to be a way around this cast-or-copy problem: other questions have been asked here and elsewhere, and it seems deeply rooted in Java's generics and type erasure. But I'm asking instead whether there are any classes that implement Iterable<T> without also implementing Collection<T>. I've taken a look through the Iterable JavaDoc and certainly everything I expect to be passed to my interface will actually be a collection. I'd like to use an in-the-wild, pre-written class instead as that seems much more likely to actually be passed as a parameter and would make the unit test that much more valuable.
I'm certain the cast-or-copy bit I've written works with the types I'm using it for in my project due to some unit tests I'm writing. But I'd like to write a unit test for some input that is an iterable yet isn't a collection and so far all I've been able to come up with is implementing a dummy-test class implementation myself.
For the curious, the method I'm implementing is Guava's CacheLoader<K, V>.loadAll(Iterable<? extends K> keys) and the backing method is a JDBI instantiated data-access object, which requires a collection to be used as the parameter type for the @BindIn interface. I think I'm correct in thinking this is tangential to the question, but just in case anyone wants to try lateral thinking on my problem. I'm aware I could just fork the JDBI project and rewrite the @BindIn annotation to accept an iterable...
Although there is no class that would immediately suit your needs and be intuitive to the readers of your test code, you can easily create your own anonymous class that is easy to understand:
static Iterable<Integer> range(final int from, final int to) {
return new Iterable<Integer>() {
public Iterator<Integer> iterator() {
return new Iterator<Integer>() {
int current = from;
public boolean hasNext() { return current < to; }
public Integer next() {
if (!hasNext()) { throw new NoSuchElementException(); }
return current++;
}
public void remove() { throw new UnsupportedOperationException(); } // optional operation; not supported
};
}
};
}
This implementation is anonymous, and it does not implement Collection<Integer>. On the other hand, it produces a non-empty enumerable sequence of integers, which you can fully control.
To answer the question as per title:
Are there any Java standard classes that implement Iterable without implementing Collection?
From text:
If there ever are any classes that can implement Iterable<T> that don't also implement Collection<T>?
Answer:
Yes
See the following javadoc page: https://docs.oracle.com/javase/8/docs/api/java/lang/class-use/Iterable.html
Any section that says Classes in XXX that implement Iterable, will list Java standard classes implementing the interface. Many of those don't implement Collection.
Kludgy, yes, but I think the code
Collection<Integer> _keys;
if (keys instanceof Collection) {
_keys = (Collection<Integer>) keys;
} else {
_keys = Lists.newArrayList(keys);
}
is perfectly sound. The interface Collection<T> extends Iterable<T> and you are not allowed to implement the same interface with 2 different type parameters, so there is no way a class could implement Collection<String> and Iterable<Integer>, for example.
The class Integer is final, so the difference between Iterable<? extends Integer> and Iterable<Integer> is largely academic.
Taken together, the last 2 paragraphs prove that if something is both an Iterable<? extends Integer> and a Collection, it must be a Collection<Integer>. Therefore your code is guaranteed to be safe. The compiler can't be sure of this so you can suppress the warning by writing
@SuppressWarnings("unchecked")
above the statement. You should also include a comment by the annotation to explain why the code is safe.
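As a sketch of what that could look like: the annotation can be placed on a local variable declaration, which keeps the suppression as narrow as possible. The class and method names here are hypothetical, and a plain ArrayList copy stands in for Guava's Lists.newArrayList to keep the example free of dependencies:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper class for this sketch.
final class Keys {
    private Keys() {}

    static Collection<Integer> asCollection(Iterable<? extends Integer> keys) {
        if (keys instanceof Collection) {
            // Safe: a class cannot implement Collection<X> and Iterable<Y> with
            // different type arguments, and Integer is final, so anything that is
            // both an Iterable<? extends Integer> and a Collection must be a
            // Collection<Integer>.
            @SuppressWarnings("unchecked")
            Collection<Integer> cast = (Collection<Integer>) keys;
            return cast;
        }
        // Fallback: copy element by element.
        List<Integer> copy = new ArrayList<Integer>();
        for (Integer key : keys) {
            copy.add(key);
        }
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(asCollection(Arrays.asList(1, 2, 3)));
    }
}
```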
As for the question of whether there are any classes that implement Iterable but not Collection, as others have pointed out the answer is yes. However I think what you are really asking is whether there is any point in having two interfaces. Many others have asked this. Often when a method has a Collection argument (e.g. addAll()), it could, and probably should, be an Iterable.
Edit
@Andreas has pointed out in the comments that Iterable was only introduced in Java 5, whereas Collection was introduced in Java 1.2, and most existing methods taking a Collection could not be retrofitted to take an Iterable for compatibility reasons.
In core APIs, the only types that are Iterable but not Collection --
interface java.nio.file.Path
interface java.nio.file.DirectoryStream
interface java.nio.file.SecureDirectoryStream
class java.util.ServiceLoader
class java.sql.SQLException (and subclasses)
Arguably these are all bad designs.
As mentioned in @bayou.io's answer, one such implementation of Iterable is the new Path interface for filesystem paths introduced in Java 7.
If you happen to be on Java 8, Iterable has been retrofitted with (i.e. given a default method) spliterator() (pay attention to its Implementation Note), which lets you use it in conjunction with StreamSupport:
public static <T> Collection<T> convert(Iterable<T> iterable) {
// using Collectors.toList() for illustration,
// there are other collectors available
return StreamSupport.stream(iterable.spliterator(), false)
.collect(Collectors.toList());
}
This comes at the slight expense that any argument which is already a Collection implementation goes through an unnecessary stream-and-collect operation. You probably should only use it if the desire for a standardized JDK-only approach outweighs the potential performance hit, compared to your original casting or Guava-based methods, which is likely moot since you're already using Guava's CacheLoader.
To test this out, consider this snippet and sample output:
// Snippet
System.out.println(convert(Paths.get(System.getProperty("java.io.tmpdir"))));
// Sample output on Windows
[Users, MyUserName, AppData, Local, Temp]
After reading the excellent answers and provided docs, I poked around in a few more classes and found what looks to be the winner, both in terms of straightforwardness for test code and for a direct question title. Java's main ArrayList implementation contains this gem:
public Iterator<E> iterator() {
return new Itr();
}
Where Itr is a private inner class with a highly optimized, customized implementation of Iterator<E>. Unfortunately, Iterator doesn't itself implement Iterable, so if I want to shoehorn it into my helper method to test the code path that doesn't do the cast, I have to wrap it in my own junk class that implements Iterable (and not Collection) and returns the Itr. This is a handy way to turn a collection into an Iterable without having to write the iteration code yourself.
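On Java 8 the wrapper class may not even be needed: Iterable has a single abstract method, so a method reference to the collection's iterator() can serve as the Iterable. A small sketch, with IterableOnly being a made-up helper name:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical test helper for this sketch.
final class IterableOnly {
    private IterableOnly() {}

    // Exposes a collection as an Iterable that is NOT a Collection.
    // Iteration is delegated to the collection's own iterator.
    static <T> Iterable<T> of(Collection<T> source) {
        return source::iterator;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3);
        Iterable<Integer> view = of(keys);
        System.out.println(view instanceof Collection); // the view is not a Collection
        for (Integer key : view) {
            System.out.println(key);
        }
    }
}
```

Passing such a view to the method under test exercises the branch that copies instead of casting.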
On a final note, my final version of the code doesn't even do the cast itself, because Guava's Lists.newArrayList does pretty much exactly what I was doing with the runtime type detection in the question.
@GwtCompatible(serializable = true)
public static <E> ArrayList<E> newArrayList(Iterable<? extends E> elements) {
checkNotNull(elements); // for GWT
// Let ArrayList's sizing logic work, if possible
if (elements instanceof Collection) {
@SuppressWarnings("unchecked")
Collection<? extends E> collection = (Collection<? extends E>) elements;
return new ArrayList<E>(collection);
} else {
return newArrayList(elements.iterator());
}
}

How to subclass Guava's ImmutableList?

When I try to implement my own ImmutableList (actually a wrapper that delegates to the underlying list) I get the following compiler error:
ImmutableListWrapper is not abstract and does not override abstract method isPartialView() in com.google.common.collect.ImmutableCollection
But in fact, it seems to be impossible to override isPartialView() because it is package protected and I'd like to declare the wrapper in my own package.
Why don't I simply extend ImmutableCollection? Because I want ImmutableList.copyOf() to return my instance without making a defensive copy.
The only approach I can think of is declaring a subclass in guava's package which changes isPartialView() from package-protected to public, and then having my wrapper extend that. Is there a cleaner way?
What I am trying to do
I am attempting to fix https://github.com/google/guava/issues/2029 by creating a wrapper that would delegate to the underlying ImmutableList for all methods except spliterator(), which it would override.
I am working under the assumption that users may define variables of type ImmutableList and expect the wrapper to be a drop-in replacement (i.e. it isn't enough to implement List, they are expecting an ImmutableList).
If you want your own immutable list but don't want to implement it, just use a ForwardingList. Also, to actually make a copy, use Iterator as parameter for the copyOf. Here's a solution that should fulfill all your requirements described in the question and your answer.
public final class MyVeryOwnImmutableList<T> extends ForwardingList<T> {
public static <T> MyVeryOwnImmutableList<T> copyOf(List<T> list) {
// Iterator forces a real copy. List or Iterable doesn't.
return new MyVeryOwnImmutableList<T>(list.iterator());
}
private final ImmutableList<T> delegate;
private MyVeryOwnImmutableList(Iterator<T> it) {
this.delegate = ImmutableList.copyOf(it);
}
@Override
protected List<T> delegate()
{
return delegate;
}
}
If you want different behavior than ImmutableList.copyOf() provides, simply define a different method, e.g.
public class MyList {
public static <E> List<E> copyOf(Iterable<E> iter) {
if (iter instanceof MyList) {
// Already one of ours: return it without copying.
return (List<E>) iter;
}
return ImmutableList.copyOf(iter);
}
}
Guava's immutable classes provide a number of guarantees and make a number of assumptions about how their implementations work. These would be violated if other authors could implement their own classes that extend Guava's immutable types. Even if you correctly implemented your class to work with these guarantees and assumptions, there's nothing stopping these implementation details from changing in a future release, at which point your code could break in strange or undetectable ways.
Please do not attempt to implement anything in Guava's Immutable* hierarchy; you're only shooting yourself in the foot.
If you have a legitimate use case, file a feature request and describe what you need, maybe it'll get incorporated. Otherwise, just write your wrappers in a different package and provide your own methods and guarantees. There's nothing forcing you, for instance, to use ImmutableList.copyOf(). If you need different behavior, just write your own method.
Upon digging further, it looks like this limitation is by design:
Quoting
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/ImmutableList.html:
Note: Although this class is not final, it cannot be subclassed as it has no public or protected constructors. Thus, instances of this type are guaranteed to be immutable.
So it seems I need to create my wrapper in the guava package.

Interface for a modifiable list

So I have an interface:
public interface
{
List<String> getList();
}
However, I believe that the implementer can return an unmodifiable list like:
List<String> getList()
{
return Collections.unmodifiableList(new ArrayList<String>());
}
Is there any interface, or a way to ensure that a modifiable list is always returned? I don't want to have to make a copy.
You cannot guarantee that, because the declared return type is only an interface. What you can do is document the requirement properly.
In terms of API design, you should just make your Javadoc clear! People who use a method without reading its documentation deserve the surprise.
Is it acceptable to return unmodifiableList or should I return array?
I think what you are looking for is not an interface but a specific implementation, because an interface defines "how things are supposed to be", not "how things are". In other words, you are trying to establish a specific behavior, so the best way here is to force the implementers to use a specific version of List, like this:
public interface MyInterface {
MyModifiableList getList();
}
And then you create your own modifiable list class that will be used by your implementers. For example:
public final class MyModifiableList {
// Implementation...
}

Generic writer/outputter. What Reader is to Iterator, what is Writer to X?

In Java the abstract version of a Reader that works with pulling Objects (instead of characters) is an Iterator.
The question is: is there an abstract version of Appendable or Writer into which I can push objects (i.e. an interface)?
In the past I just made my own interface like:
public interface Pusher<T> {
public void push(T o);
}
Is there a generic interface that is available in most environments that someone knows about that makes sense so I don't have to keep creating the above interface?
Update:
Here is an example of where it would be useful:
public void findBadCategories(final Appendable a) {
String q = sql.getSql("product-category-bad");
jdbcTemplate.query(q, new RowCallbackHandler() {
@Override
public void processRow(ResultSet rs) throws SQLException {
String id = rs.getString("product_category_id");
String name = rs.getString("category_name");
if (! categoryMap.containsKey(id)) {
try {
a.append(id + "\t" + name + "\n");
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
});
}
I'm using an Appendable here, but I would much rather have my Pusher callback. Believe me, once Java 8 comes out I will just use a closure, but that closure still needs an interface.
Finally, the other option I have chosen before is to deliberately violate the contract of Guava's Predicate or Function (although that seems even worse). It's a violation of the contract because these are meant to be idempotent (although I suppose if you return true all the time...).
What Guava does provide, though, is something analogous to Python's generators, thanks to its AbstractIterator.
I added an enhancement issue to Guava, but I agree with them that it's not really their job to add something fundamental like that.
On several projects now, I've defined for this purpose what I call a sink:
interface Sink<T> {
void put(T contribution);
}
With that, methods that produce objects of type T would demand a parameter of type Sink<? super T>.
Several design questions arise:
As declared, Sink#put() throws no checked exceptions. That doesn't play well with I/O operations that usually throw IOException. To address this, you can add a type parameter that extends Exception and advertise that put() throws this type, but at that point, if you know that much about the nature of value consumption, you're probably better off defining a custom interface for it.
As declared, Sink#put() does not return a value. It's not possible to indicate to the caller whether the value was accepted or not.
With a generic interface like this, you're forced to box contributions of primitive types like int and char, which also means they can be null. Consider annotating the contribution parameter with @NonNull.
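To illustrate the first point, here is one way the exception type parameter could look. This is only a sketch: ThrowingSink, Sinks, lines, into and produce are all names invented for the example, not part of any library:

```java
import java.io.IOException;
import java.util.List;

// Hypothetical variant of Sink that advertises the exception type it may throw.
interface ThrowingSink<T, X extends Exception> {
    void put(T contribution) throws X;
}

// Hypothetical factory and producer methods for this sketch.
final class Sinks {
    private Sinks() {}

    // An I/O-backed sink: put() may throw IOException.
    static ThrowingSink<Object, IOException> lines(final Appendable out) {
        return new ThrowingSink<Object, IOException>() {
            public void put(Object contribution) throws IOException {
                out.append(String.valueOf(contribution)).append('\n');
            }
        };
    }

    // An in-memory sink: X is RuntimeException, so callers need no try/catch.
    static <T> ThrowingSink<T, RuntimeException> into(final List<T> target) {
        return new ThrowingSink<T, RuntimeException>() {
            public void put(T contribution) {
                target.add(contribution);
            }
        };
    }

    // A producer propagates whatever exception type its sink declares.
    static <X extends Exception> void produce(ThrowingSink<? super String, X> sink) throws X {
        sink.put("a");
        sink.put("b");
    }
}
```

The cost is visible in the signatures: every method that handles a sink has to carry the extra type parameter, which is part of why a custom interface is often the better trade.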
To go along with this type, related to the generator concept that Petr Pudlák mentions in his answer, I've defined a source interface:
interface Source<T> {
T get();
}
Methods looking to draw items of type T from such a source demand a parameter of type Source<? extends T>.
For coordination with channels among concurrent processes, I've defined both Sink#put() and Source#get() to throw InterruptedException:
interface Sink<T> {
void put(T contribution) throws InterruptedException;
}
interface Source<T> {
T get() throws InterruptedException;
}
These are analogous to Doug Lea's original Puttable and Takable interfaces that didn't make it into the java.util.concurrent package, though lacking in an equivalent to the timed wait Puttable#offer() and Takable#poll() methods.
All sorts of implementations then arise that can be composed easily, such as exchangers, filters, and transformers.
Beyond my own library, I've seen the Guava library provide the PrimitiveSink and Funnel types for hashing-related purposes. You may find those to be useful abstractions as well.
There can be several views on the subject:
The dual of an iterator is a generator. Iterators "consume" values from a collection, generators "provide" them. But generators are a bit different from writers. For a writer, you decide when you push an element into it. On the other hand, generators provide you with a sequence of values, one by one. Java doesn't have any specific language support for generators. See also What is the difference between an Iterator and a Generator?
The opposite of an iterator is something you could push values into. I don't think Java has any abstraction for that. The closest I have seen is Scala's Growable (neglecting the clear() method).
The closest is Observable but it isn't used so much.
public void update(Observable o, Object arg)
I would not use Iterable instead of Reader and I would create a consumer of your choice.
A common pattern is to not use an interface but rather an annotation.
e.g.
@Subscriber
public void onUpdate(Update update) { }
@Subscriber
public void onInsert(Insert insert) { }
@Subscriber
public void onDelete(Delete delete) { }
When this class is added as a listener it subscribes to Update, Insert and Delete objects, and ignores any others. This allows one object to subscribe to different type of message in a Type safe way.
Here is what I decided to do (and I think it's the best option out of what others gave :P ).
I'm going to backport Java 8's Lambda classes (java.util.functions.*). Particularly this one:
/**
* Performs operations upon an input object which may modify that object and/or
* external state (other objects).
*
* <p>All block implementations are expected to:
* <ul>
* <li>When used for aggregate operations upon many elements blocks
* should not assume that the {@code apply} operation will be called upon
* elements in any specific order.</li>
* </ul>
*
* @param <T> The type of input objects to {@code apply}.
*/
public interface Block<T> {
/**
* Performs operations upon the provided object which may modify that object
* and/or external state.
*
* @param t an input object
*/
void apply(T t);
// Some extension methods that I'll have to do with below.
}
Basically I'll make a new namespace like com.snaphop.backport.java.util.functions.* and move over the interfaces and make them work with Guava. Obviously I won't have the lambda syntax or the extension methods, but those I can work around. Then in theory, when Java 8 comes out, all I would have to do is a namespace switch.
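For readers on a released Java 8: the interface shipped as java.util.function.Consumer<T> with an accept(T) method rather than Block<T>.apply(T). A sketch of the push-style callback from the question written against it, where BadCategories, findUnknown and the sample ids are made up and a plain loop stands in for the JDBC query:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical class for this sketch.
final class BadCategories {
    private BadCategories() {}

    // Pushes every id that is not in knownIds into the sink.
    static void findUnknown(Iterable<String> ids, Set<String> knownIds,
                            Consumer<? super String> sink) {
        for (String id : ids) {
            if (!knownIds.contains(id)) {
                sink.accept(id);
            }
        }
    }

    public static void main(String[] args) {
        List<String> bad = new ArrayList<>();
        Set<String> known = new HashSet<>(Arrays.asList("2"));
        // A method reference serves as the sink; no custom Pusher interface is needed.
        findUnknown(Arrays.asList("1", "2", "3"), known, bad::add);
        System.out.println(bad);
    }
}
```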
