Compare two sets of different types - java

I'm looking for a way to tell whether two sets of different element types are identical, given that I can state a one-to-one relation between those element types. Is there a standard way of doing this in Java, or maybe in Guava or Apache Commons?
Here is my own implementation of this task. For example, I have two element classes which I know how to compare. For simplicity, I compare them by id field:
class ValueObject {
public int id;
public ValueObject(int id) { this.id=id; }
public static ValueObject of(int id) { return new ValueObject(id); }
}
class DTO {
public int id;
public DTO(int id) { this.id=id; }
public static DTO of(int id) { return new DTO(id); }
}
Then I define an interface which does the comparison
interface TwoTypesComparator<L,R> {
boolean areIdentical(L left, R right);
}
And the actual method for comparing sets looks like this
public static <L,R> boolean areIdentical(Set<L> left, Set<R> right, TwoTypesComparator<L,R> comparator) {
if (left.size() != right.size()) return false;
boolean found;
for (L l : left) {
found = false;
for (R r : right) {
if (comparator.areIdentical(l, r)) {
found = true; break;
}
}
if (!found) return false;
}
return true;
}
Example of a client code
HashSet<ValueObject> valueObjects = new HashSet<ValueObject>();
valueObjects.add(ValueObject.of(1));
valueObjects.add(ValueObject.of(2));
valueObjects.add(ValueObject.of(3));
HashSet<DTO> dtos = new HashSet<DTO>();
dtos.add(DTO.of(1));
dtos.add(DTO.of(2));
dtos.add(DTO.of(34));
System.out.println(areIdentical(valueObjects, dtos, new TwoTypesComparator<ValueObject, DTO>() {
@Override
public boolean areIdentical(ValueObject left, DTO right) {
return left.id == right.id;
}
}));
I'm looking for the standard solution to this task. Any suggestions on how to improve this code are also welcome.

This is what I would do in your case. You have sets. Sets are hard to compare, and on top of that, you want to compare their elements by id.
I see only one proper solution: normalize the wanted values (extract their ids), sort those ids, then compare them in order. If you don't sort before comparing, you can skip over duplicates or miss values.
Keep in mind that Java 8 lets you be lazy with streams, so don't assume that extracting, then sorting, then copying is slow. Laziness allows it to be rather fast compared to iterative solutions.
HashSet<ValueObject> valueObjects = new HashSet<>();
valueObjects.add(ValueObject.of(1));
valueObjects.add(ValueObject.of(2));
valueObjects.add(ValueObject.of(3));
HashSet<DTO> dtos = new HashSet<>();
dtos.add(DTO.of(1));
dtos.add(DTO.of(2));
dtos.add(DTO.of(34));
boolean areIdentical = Arrays.equals(
valueObjects.stream()
.mapToInt((v) -> v.id)
.sorted()
.toArray(),
dtos.stream()
.mapToInt((d) -> d.id)
.sorted()
.toArray()
);
You want to generalize the solution? No problem.
public static <T extends Comparable<?>> boolean areIdentical(Collection<ValueObject> vos, Function<ValueObject, T> voKeyExtractor, Collection<DTO> dtos, Function<DTO, T> dtoKeyExtractor) {
return Arrays.equals(
vos.stream()
.map(voKeyExtractor)
.sorted()
.toArray(),
dtos.stream()
.map(dtoKeyExtractor)
.sorted()
.toArray()
);
}
And for a T that is not comparable:
public static <T> boolean areIdentical(Collection<ValueObject> vos, Function<ValueObject, T> voKeyExtractor, Collection<DTO> dtos, Function<DTO, T> dtoKeyExtractor, Comparator<T> comparator) {
return Arrays.equals(
vos.stream()
.map(voKeyExtractor)
.sorted(comparator)
.toArray(),
dtos.stream()
.map(dtoKeyExtractor)
.sorted(comparator)
.toArray()
);
}
You mention Guava and if you don't have Java 8, you can do the following, using the same algorithm:
List<Integer> voIds = FluentIterable.from(valueObjects)
.transform(valueObjectIdGetter())
.toSortedList(intComparator());
List<Integer> dtoIds = FluentIterable.from(dtos)
.transform(dtoIdGetter())
.toSortedList(intComparator());
return voIds.equals(dtoIds);

Another solution would be to use List instead of Set (if you are allowed to do so). List has a method called get(int index) that retrieves the element at the specified index and you can compare them one by one when both your lists have the same size. More on lists: http://docs.oracle.com/javase/7/docs/api/java/util/List.html
Also, avoid using public variables in your classes. A good practice is to make your variables private and use getter and setter methods.
Instantiate lists and add values
List<ValueObject> list = new ArrayList<>();
List<DTO> list2 = new ArrayList<>();
list.add(ValueObject.of(1));
list.add(ValueObject.of(2));
list.add(ValueObject.of(3));
list2.add(DTO.of(1));
list2.add(DTO.of(2));
list2.add(DTO.of(34));
Method that compares lists
public boolean compareLists(List<ValueObject> list, List<DTO> list2) {
if(list.size() != list2.size()) {
return false;
}
for(int i = 0; i < list.size(); i++) {
if(list.get(i).id == list2.get(i).id) {
continue;
} else {
return false;
}
}
return true;
}

Your current method is incorrect or at least inconsistent for general sets.
Imagine the following:
L contains the Pairs (1,1), (1,2), (2,1).
R contains the Pairs (1,1), (2,1), (2,2).
Now if your id is the first value, your comparison would return true, but are those sets really equal? The problem is that you have no guarantee that there is at most one element with a given id in each set, because you don't know how L and R implement equals. So my advice would be to not compare sets of different types.
If you really need to compare two sets the way you described, I would copy all elements from L into a List, then go through R and, every time you find a matching element from L, remove it from the List. Just make sure you use LinkedList instead of ArrayList.
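The list-removal idea described above can be sketched roughly as follows. This is only an illustration: the class name OneToOneMatch, the helper matchesOneToOne and the use of BiPredicate in place of the question's TwoTypesComparator are assumptions, not part of the original posts.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.function.BiPredicate;

final class OneToOneMatch {
    // Hypothetical helper: true only if every right element can be paired
    // with its own, not yet used, left element.
    static <L, R> boolean matchesOneToOne(Collection<L> left, Collection<R> right,
                                          BiPredicate<L, R> same) {
        if (left.size() != right.size()) return false;
        List<L> remaining = new LinkedList<>(left); // removal through the iterator is cheap
        for (R r : right) {
            boolean matched = false;
            for (Iterator<L> it = remaining.iterator(); it.hasNext(); ) {
                if (same.test(it.next(), r)) {
                    it.remove(); // each left element may be consumed only once
                    matched = true;
                    break;
                }
            }
            if (!matched) return false;
        }
        return true; // sizes were equal and every right element consumed one left element
    }

    public static void main(String[] args) {
        // The pairs from the example above, compared by their first value only.
        List<int[]> l = Arrays.asList(new int[]{1, 1}, new int[]{1, 2}, new int[]{2, 1});
        List<int[]> r = Arrays.asList(new int[]{1, 1}, new int[]{2, 1}, new int[]{2, 2});
        System.out.println(matchesOneToOne(l, r, (a, b) -> a[0] == b[0])); // false
    }
}
```

With this greedy pairing, the (1,1), (1,2), (2,1) versus (1,1), (2,1), (2,2) case is rejected, whereas the nested-loop version from the question accepts it. The cost is still quadratic in the worst case.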

You could override equals and hashCode on the DTO/value object and then do: leftSet.containsAll(rightSet) && leftSet.size() == rightSet.size()
If you can't alter the element classes, make a decorator and have the sets be of the decorator type.
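A minimal sketch of what such a decorator might look like, assuming the id is the only thing that matters for equality. The names DecoratorDemo, ById and sameIds are hypothetical, and ValueObject and DTO are trimmed copies of the question's classes so that the example compiles on its own.

```java
import java.util.Set;
import java.util.stream.Collectors;

final class DecoratorDemo {
    // Trimmed copies of the question's classes.
    static final class ValueObject { final int id; ValueObject(int id) { this.id = id; } }
    static final class DTO { final int id; DTO(int id) { this.id = id; } }

    // Hypothetical decorator: equality and hash code depend on the shared key only.
    static final class ById {
        final int id;
        final Object wrapped;
        private ById(int id, Object wrapped) { this.id = id; this.wrapped = wrapped; }
        static ById of(ValueObject v) { return new ById(v.id, v); }
        static ById of(DTO d) { return new ById(d.id, d); }
        @Override public boolean equals(Object o) { return o instanceof ById && ((ById) o).id == id; }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    // Assumes ids are unique within each input set; wrapping would otherwise
    // silently collapse elements that share an id.
    static boolean sameIds(Set<ValueObject> vos, Set<DTO> dtos) {
        Set<ById> left = vos.stream().map(ById::of).collect(Collectors.toSet());
        Set<ById> right = dtos.stream().map(ById::of).collect(Collectors.toSet());
        return left.equals(right);
    }
}
```

If the sets hold the decorator type from the start, as suggested above, the wrapping step disappears and Set.equals can be used directly.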

Related

How to remove all elements that match a certain condition except for N greatest of them with Stream API

My question is: is there a better way to implement this task?
I have a list of orderable elements (in this example by age, the youngest first).
And I want to delete all elements that fulfill a condition (in this example red elements) but keep the first 2 of them.
// A stream can be consumed only once, so each pipeline gets its own stream.
Stream<ElementsVO> redStream = allElements.stream().filter(elem -> elem.getColor() == RED).sorted((c1, c2) -> c1.getAge() - c2.getAge()).limit(2);
Stream<ElementsVO> nonRedStream = allElements.stream().filter(elem -> elem.getColor() != RED);
List<ElementsVO> resultList = Stream.concat(redStream, nonRedStream).sorted((c1, c2) -> c1.getAge() - c2.getAge()).collect(Collectors.toList());
Any idea to improve this? Any way to implement an accumulator function or something like that with streams?
You can technically do this with a stateful predicate:
Predicate<ElementsVO> statefulPredicate = new Predicate<ElementsVO>() {
private int reds = 0;
@Override public boolean test(ElementsVO e) {
if (e.getColor() == RED) {
reds++;
return reds <= 2; // keep only the first two red elements
}
return true;
}
};
Then:
List<ElementsVO> resultList =
allElements.stream()
.sorted(comparingInt(ElementsVO::getAge))
.filter(statefulPredicate)
.collect(toList());
This might work, but it is a violation of the Stream API: the documentation for Stream.filter says that the predicate should be stateless, which in general allows the stream implementation to apply the filter in any order. For small input lists, streamed sequentially, this will almost certainly be the appearance order in the list, but it's not guaranteed.
Caveat emptor. Your current way works, although you could do the partitioning of the list more efficiently using Collectors.partitioningBy to avoid iterating it twice.
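The Collectors.partitioningBy suggestion might look roughly like this. It is a sketch under assumptions: Item is a hypothetical stand-in for ElementsVO with the color held as a String, and keepYoungestReds is an invented name. It follows the question's wording and keeps the two youngest red elements.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PartitionDemo {
    // Hypothetical stand-in for ElementsVO.
    static final class Item {
        final String color;
        final int age;
        Item(String color, int age) { this.color = color; this.age = age; }
        @Override public String toString() { return color + ":" + age; }
    }

    // Keeps every non-red item plus the `keep` youngest red ones, sorted by age.
    static List<Item> keepYoungestReds(List<Item> all, int keep) {
        Comparator<Item> byAge = Comparator.comparingInt(i -> i.age);
        // One pass over the input splits it into red (true) and non-red (false).
        Map<Boolean, List<Item>> parts = all.stream()
                .collect(Collectors.partitioningBy(i -> "RED".equals(i.color)));
        return Stream.concat(
                        parts.get(true).stream().sorted(byAge).limit(keep),
                        parts.get(false).stream())
                .sorted(byAge)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> all = Arrays.asList(
                new Item("RED", 25), new Item("RED", 23), new Item("RED", 27),
                new Item("BLACK", 19), new Item("GREEN", 23), new Item("GREEN", 29));
        System.out.println(keepYoungestReds(all, 2));
    }
}
```

No stateful predicate is involved here, so the result does not depend on the order in which the stream implementation applies the filter.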
You can implement a custom collector that will maintain two separate collections of RED and non-RED element.
And since you need only the two red elements with the greatest age, you can improve performance by introducing a partial sort: the collection of red elements needs to maintain an order and must never hold more than 2 elements. The overhead of that is far less significant than fully sorting all red elements just to pick two of them.
In order to create a custom collector, you might make use of the static method Collector.of() which expects the following arguments:
Supplier: Supplier<A> provides a mutable container that stores the elements of the stream. Because we need to separate the elements by color into two groups, the container can be a map with only 2 keys (true and false), denoting whether the elements mapped to that key are red. To store the red elements and perform a partial sort, we need a collection that is capable of maintaining an order; PriorityQueue is a good choice for that purpose. To store all other elements, I've used ArrayDeque, which doesn't maintain a sorted order and is as fast as ArrayList.
Accumulator: BiConsumer<A,T> defines how to add elements to the mutable container provided by the supplier. For this task, the accumulator needs to guarantee that the queue containing red elements never exceeds the given size. It does so by rejecting values that are smaller than the lowest value previously added to the queue, and by removing the lowest value if the size has reached the limit and a new value needs to be added. This functionality is extracted into a separate method, tryAdd().
Combiner: BinaryOperator<A> establishes a rule for merging two containers obtained while executing the stream in parallel. Here, the combiner relies on the same logic that was described for the accumulator.
Finisher: Function<A,R> produces the final result by transforming the mutable container. In the code below, the finisher dumps the contents of both queues into a stream, sorts them and collects them into an immutable list.
Characteristics: these allow fine-tuning the collector by providing additional information on how it should function. Here the characteristic Collector.Characteristics.UNORDERED is applied, which indicates that the order in which partial results of the reduction are produced in parallel is not significant. That can improve the performance of this collector with parallel streams.
The code might look like this:
public static void main(String[] args) {
List<ElementsVO> allElements =
List.of(new ElementsVO(Color.RED, 25), new ElementsVO(Color.RED, 23), new ElementsVO(Color.RED, 27),
new ElementsVO(Color.BLACK, 19), new ElementsVO(Color.GREEN, 23), new ElementsVO(Color.GREEN, 29));
Comparator<ElementsVO> byAge = Comparator.comparing(ElementsVO::getAge);
List<ElementsVO> resultList = allElements.stream()
.collect(getNFiltered(byAge, element -> element.getColor() != Color.RED, 2));
resultList.forEach(System.out::println);
}
The method below is responsible for creating a collector that partitions the elements based on the given predicate and sorts them in accordance with the provided comparator.
public static <T> Collector<T, ?, List<T>> getNFiltered(Comparator<T> comparator,
Predicate<T> condition,
int limit) {
return Collector.of(
() -> Map.of(true, new PriorityQueue<>(comparator),
false, new ArrayDeque<>()),
(Map<Boolean, Queue<T>> isRed, T next) -> {
if (condition.test(next)) isRed.get(false).add(next);
else tryAdd(isRed.get(true), next, comparator, limit);
},
(Map<Boolean, Queue<T>> left, Map<Boolean, Queue<T>> right) -> {
left.get(false).addAll(right.get(false));
right.get(true).forEach(next -> tryAdd(left.get(true), next, comparator, limit));
return left;
},
(Map<Boolean, Queue<T>> isRed) -> isRed.values().stream()
.flatMap(Queue::stream).sorted(comparator).toList(),
Collector.Characteristics.UNORDERED
);
}
This method is responsible for adding the next red-element into the priority queue. It expects a comparator in order to be able to determine whether the next element should be added or discarded, and a value of the maximum size of the queue (2), to check if it was exceeded.
public static <T> void tryAdd(Queue<T> queue, T next, Comparator<T> comparator, int size) {
if (queue.size() == size && comparator.compare(queue.element(), next) < 0)
queue.remove(); // the queue is full and the next element is greater than its smallest element, so the smallest element is removed to make room
if (queue.size() < size) queue.add(next);
}
Output
ElementsVO{color=BLACK, age=19}
ElementsVO{color=GREEN, age=23}
ElementsVO{color=RED, age=25}
ElementsVO{color=RED, age=27}
ElementsVO{color=GREEN, age=29}
I wrote a generic Collector with a predicate and a limit of elements to add which match the predicate:
public class LimitedMatchCollector<T> implements Collector<T, List<T>, List<T>> {
private Predicate<T> filter;
private int limit;
public LimitedMatchCollector(Predicate<T> filter, int limit)
{
super();
this.filter = filter;
this.limit = limit;
}
private int count = 0;
@Override
public Supplier<List<T>> supplier() {
return () -> new ArrayList<T>();
}
@Override
public BiConsumer<List<T>, T> accumulator() {
return this::accumulator;
}
@Override
public BinaryOperator<List<T>> combiner() {
return this::combiner;
}
@Override
public Set<Characteristics> characteristics() {
return Stream.of(Characteristics.IDENTITY_FINISH)
.collect(Collectors.toCollection(HashSet::new));
}
public List<T> accumulator(List<T> list , T e) {
if (filter.test(e)) {
if (count >= limit) {
return list;
}
count++;
}
list.add(e);
return list;
}
public List<T> combiner(List<T> left , List<T> right) {
right.forEach( e -> {
if (filter.test(e)) {
if (count < limit) {
left.add(e);
count++;
}
}
});
return left;
}
@Override
public Function<List<T>, List<T>> finisher()
{
return Function.identity();
}
}
Usage:
List<ElementsVO> list = Arrays.asList(new ElementsVO("BLUE", 1)
,new ElementsVO("BLUE", 2) // made color a String
,new ElementsVO("RED", 3)
,new ElementsVO("RED", 4)
,new ElementsVO("GREEN", 5)
,new ElementsVO("RED", 6)
,new ElementsVO("YELLOW", 7)
);
System.out.println(list.stream().collect(new LimitedMatchCollector<ElementsVO>( (e) -> "RED".equals(e.getColor()),2)));

Java Lambda Stream Distinct() on arbitrary key? [duplicate]

This question already has answers here:
Java 8 Distinct by property
(34 answers)
Closed 3 years ago.
I frequently ran into a problem with Java lambda expressions: I wanted to distinct() a stream on an arbitrary property or method of an object, but keep the object itself rather than map it to that property or method. I started to create containers as discussed here, but I did it often enough that it became annoying and produced a lot of boilerplate classes.
I threw together this Pairing class, which holds two objects of two types and allows you to specify keying off the left, right, or both objects. My question is... is there really no built-in lambda stream function to distinct() on a key supplier of some sorts? That would really surprise me. If not, will this class fulfill that function reliably?
Here is how it would be called
BigDecimal totalShare = orders.stream().map(c -> Pairing.keyLeft(c.getCompany().getId(), c.getShare())).distinct().map(Pairing::getRightItem).reduce(BigDecimal.ZERO, (x,y) -> x.add(y));
Here is the Pairing class
public final class Pairing<X,Y> {
private final X item1;
private final Y item2;
private final KeySetup keySetup;
private static enum KeySetup {LEFT,RIGHT,BOTH};
private Pairing(X item1, Y item2, KeySetup keySetup) {
this.item1 = item1;
this.item2 = item2;
this.keySetup = keySetup;
}
public X getLeftItem() {
return item1;
}
public Y getRightItem() {
return item2;
}
public static <X,Y> Pairing<X,Y> keyLeft(X item1, Y item2) {
return new Pairing<X,Y>(item1, item2, KeySetup.LEFT);
}
public static <X,Y> Pairing<X,Y> keyRight(X item1, Y item2) {
return new Pairing<X,Y>(item1, item2, KeySetup.RIGHT);
}
public static <X,Y> Pairing<X,Y> keyBoth(X item1, Y item2) {
return new Pairing<X,Y>(item1, item2, KeySetup.BOTH);
}
public static <X,Y> Pairing<X,Y> forItems(X item1, Y item2) {
return keyBoth(item1, item2);
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
if (keySetup.equals(KeySetup.LEFT) || keySetup.equals(KeySetup.BOTH)) {
result = prime * result + ((item1 == null) ? 0 : item1.hashCode());
}
if (keySetup.equals(KeySetup.RIGHT) || keySetup.equals(KeySetup.BOTH)) {
result = prime * result + ((item2 == null) ? 0 : item2.hashCode());
}
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Pairing<?,?> other = (Pairing<?,?>) obj;
if (keySetup.equals(KeySetup.LEFT) || keySetup.equals(KeySetup.BOTH)) {
if (item1 == null) {
if (other.item1 != null)
return false;
} else if (!item1.equals(other.item1))
return false;
}
if (keySetup.equals(KeySetup.RIGHT) || keySetup.equals(KeySetup.BOTH)) {
if (item2 == null) {
if (other.item2 != null)
return false;
} else if (!item2.equals(other.item2))
return false;
}
return true;
}
}
UPDATE:
Tested Stuart's function below and it seems to work great. The operation below distincts on the first letter of each string. The only part I'm trying to figure out is how the ConcurrentHashMap maintains only one instance for the entire stream
public class DistinctByKey {
public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
Map<Object,Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
public static void main(String[] args) {
final ImmutableList<String> arpts = ImmutableList.of("ABQ","ALB","CHI","CUN","PHX","PUJ","BWI");
arpts.stream().filter(distinctByKey(f -> f.substring(0,1))).forEach(s -> System.out.println(s));
}
}
Output is...
ABQ
CHI
PHX
BWI
The distinct operation is a stateful pipeline operation; in this case it's a stateful filter. It's a bit inconvenient to create these yourself, as there's nothing built-in, but a small helper class should do the trick:
/**
* Stateful filter. T is type of stream element, K is type of extracted key.
*/
static class DistinctByKey<T,K> {
Map<K,Boolean> seen = new ConcurrentHashMap<>();
Function<T,K> keyExtractor;
public DistinctByKey(Function<T,K> ke) {
this.keyExtractor = ke;
}
public boolean filter(T t) {
return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
}
I don't know your domain classes, but I think that, with this helper class, you could do what you want like this:
BigDecimal totalShare = orders.stream()
.filter(new DistinctByKey<Order,CompanyId>(o -> o.getCompany().getId())::filter)
.map(Order::getShare)
.reduce(BigDecimal.ZERO, BigDecimal::add);
Unfortunately the type inference couldn't get far enough inside the expression, so I had to specify explicitly the type arguments for the DistinctByKey class.
This involves more setup than the collectors approach described by Louis Wasserman, but this has the advantage that distinct items pass through immediately instead of being buffered up until the collection completes. Space should be the same, as (unavoidably) both approaches end up accumulating all distinct keys extracted from the stream elements.
UPDATE
It's possible to get rid of the K type parameter since it's not actually used for anything other than being stored in a map. So Object is sufficient.
/**
* Stateful filter. T is type of stream element.
*/
static class DistinctByKey<T> {
Map<Object,Boolean> seen = new ConcurrentHashMap<>();
Function<T,Object> keyExtractor;
public DistinctByKey(Function<T,Object> ke) {
this.keyExtractor = ke;
}
public boolean filter(T t) {
return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
}
BigDecimal totalShare = orders.stream()
.filter(new DistinctByKey<Order>(o -> o.getCompany().getId())::filter)
.map(Order::getShare)
.reduce(BigDecimal.ZERO, BigDecimal::add);
This simplifies things a bit, but I still had to specify the type argument to the constructor. Trying to use diamond or a static factory method doesn't seem to improve things. I think the difficulty is that the compiler can't infer generic type parameters -- for a constructor or a static method call -- when either is in the instance expression of a method reference. Oh well.
(Another variation on this that would probably simplify it is to make DistinctByKey<T> implement Predicate<T> and rename the method to test. This would remove the need to use a method reference and would probably improve type inference. However, it's unlikely to be as nice as the solution below.)
UPDATE 2
Can't stop thinking about this. Instead of a helper class, use a higher-order function. We can use captured locals to maintain state, so we don't even need a separate class! Bonus, things are simplified so type inference works!
public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
Map<Object,Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
BigDecimal totalShare = orders.stream()
.filter(distinctByKey(o -> o.getCompany().getId()))
.map(Order::getShare)
.reduce(BigDecimal.ZERO, BigDecimal::add);
You more or less have to do something like
elements.stream()
.collect(Collectors.toMap(
obj -> extractKey(obj),
obj -> obj,
(first, second) -> first
// pick the first if multiple values have the same key
)).values().stream();
Another way of finding distinct elements
List<String> uniqueObjects = ImmutableList.of("ABQ","ALB","CHI","CUN","PHX","PUJ","BWI")
.stream()
.collect(Collectors.groupingBy((p)->p.substring(0,1))) //expression
.values()
.stream()
.flatMap(e->e.stream().limit(1))
.collect(Collectors.toList());
A variation on Stuart Marks's second update, using a Set.
public static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
Set<Object> seen = Collections.newSetFromMap(new ConcurrentHashMap<>());
return t -> seen.add(keyExtractor.apply(t));
}
We can also use RxJava (very powerful reactive extension library)
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())
To answer your question in your second update:
The only part I'm trying to figure out is how the ConcurrentHashMap maintains only one instance for the entire stream:
public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
Map<Object,Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
In your code sample, distinctByKey is only invoked one time, so the ConcurrentHashMap is created just once. Here's an explanation:
The distinctByKey function is just a plain-old function that returns an object, and that object happens to be a Predicate. Keep in mind that a predicate is basically a piece of code that can be evaluated later. To manually evaluate a predicate, you must call a method in the Predicate interface such as test. So, the predicate
t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null
is merely a declaration that is not actually evaluated inside distinctByKey.
The predicate is passed around just like any other object. It is returned and passed into the filter operation, which basically evaluates the predicate repeatedly against each element of the stream by calling test.
I'm sure filter is more complicated than I made it out to be, but the point is, the predicate is evaluated many times outside of distinctByKey. There's nothing special* about distinctByKey; it's just a function that you've called one time, so the ConcurrentHashMap is only created one time.
*Apart from being well made, @stuart-marks :)
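The point about the map being created once per call can be checked with a small experiment. This is a sketch: distinctByKey is copied from the post above, while the class name CapturedStateDemo and the helper filter are invented for the demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class CapturedStateDemo {
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>(); // one map per call
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Hypothetical helper: applies the given predicate to a sequential stream.
    static List<String> filter(List<String> in, Predicate<String> p) {
        return in.stream().filter(p).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Predicate<String> byInitial = distinctByKey(s -> s.substring(0, 1));
        System.out.println(filter(Arrays.asList("ABQ", "ALB", "CHI"), byInitial)); // [ABQ, CHI]
        // Same predicate object, same captured map: "A" and "C" are already recorded.
        System.out.println(filter(Arrays.asList("ALB", "CUN", "PHX"), byInitial)); // [PHX]
        // A fresh call creates a fresh map, so nothing has been seen yet.
        System.out.println(filter(Arrays.asList("ALB", "CUN", "PHX"),
                distinctByKey(s -> s.substring(0, 1)))); // [ALB, CUN, PHX]
    }
}
```

Reusing one predicate across streams therefore carries its state along, which is usually a reason to call distinctByKey inline for each pipeline.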
You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<String> list = Lists.mutable.with("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
ListIterate.distinct(list, HashingStrategies.fromFunction(s -> s.substring(0, 1)))
.each(System.out::println);
If you can refactor list to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<String> list = Lists.mutable.with("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
list.distinct(HashingStrategies.fromFunction(s -> s.substring(0, 1)))
.each(System.out::println);
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.
Set.add(element) returns true if the set did not already contain element, otherwise false.
So you can do it like this:
Set<String> set = new HashSet<>();
BigDecimal totalShare = orders.stream()
.filter(c -> set.add(c.getCompany().getId()))
.map(c -> c.getShare())
.reduce(BigDecimal.ZERO, BigDecimal::add);
If you want to do this in parallel, you must use a concurrent map instead.
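A possible parallel variant, sketched with a concurrent set obtained from ConcurrentHashMap.newKeySet(). The Order class here is a hypothetical stand-in with a String company id, since the question's domain classes are not shown. The filter is still stateful, so which order is kept for each company is not deterministic; the total is only stable when all orders of a company carry the same share.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ParallelDistinctDemo {
    // Hypothetical stand-in for the question's Order: a company id and a share.
    static final class Order {
        final String companyId;
        final BigDecimal share;
        Order(String companyId, BigDecimal share) { this.companyId = companyId; this.share = share; }
    }

    static BigDecimal totalShare(List<Order> orders) {
        Set<String> seen = ConcurrentHashMap.newKeySet(); // thread-safe set backed by a ConcurrentHashMap
        return orders.parallelStream()
                .filter(o -> seen.add(o.companyId)) // true only for the first order observed per company
                .map(o -> o.share)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```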
It can be done with something like:
Set<String> distinctCompany = orders.stream()
.map(Order::getCompany)
.collect(Collectors.toSet());

Iterator over multiple SortedSet objects

In Java, I have several SortedSet instances. I would like to iterate over the elements from all these sets. One simple option is to create a new SortedSet, such as TreeSet x, deep-copy the contents of all the individual sets y_1, ..., y_n into it using x.addAll(y_i), and then iterate over x.
But is there a way to avoid deep copy? Couldn't I just create a view of type SortedSet which would somehow encapsulate the iterators of all the inner sets, but behave as a single set?
I'd prefer an existing, tested solution, rather than writing my own.
I'm not aware of any existing solution to accomplish this task, so I took the time to write one for you. I'm sure there's room for improvement on it, so take it as a guideline and nothing else.
As Sandor points out in his answer, there are some limitations that must be imposed or assumed. One such limitation is that every SortedSet must be sorted relative to the same order, otherwise there's no point in comparing their elements without creating a new set (representing the union of every individual set).
Here follows my code example which, as you'll notice, is relatively more complex than just creating a new set and adding all elements to it.
import java.util.*;
final class MultiSortedSetView<E> implements Iterable<E> {
private final List<SortedSet<E>> sets = new ArrayList<>();
private final Comparator<? super E> comparator;
MultiSortedSetView() {
comparator = null;
}
MultiSortedSetView(final Comparator<? super E> comp) {
comparator = comp;
}
@Override
public Iterator<E> iterator() {
return new MultiSortedSetIterator<E>(sets, comparator);
}
MultiSortedSetView<E> add(final SortedSet<E> set) {
// You may remove this `if` if you already know
// every set uses the same comparator.
if (comparator != set.comparator()) {
throw new IllegalArgumentException("Different Comparator!");
}
sets.add(set);
return this;
}
@Override
public boolean equals(final Object o) {
if (this == o) { return true; }
if (!(o instanceof MultiSortedSetView)) { return false; }
final MultiSortedSetView<?> n = (MultiSortedSetView<?>) o;
return sets.equals(n.sets) &&
(comparator == n.comparator ||
(comparator != null ? comparator.equals(n.comparator) :
n.comparator.equals(comparator)));
}
@Override
public int hashCode() {
int hash = comparator == null ? 0 : comparator.hashCode();
return 37 * hash + sets.hashCode();
}
@Override
public String toString() {
return sets.toString();
}
private final static class MultiSortedSetIterator<E>
implements Iterator<E> {
private final List<Iterator<E>> iterators;
private final PriorityQueue<Element<E>> queue;
private MultiSortedSetIterator(final List<SortedSet<E>> sets,
final Comparator<? super E> comparator) {
final int n = sets.size();
queue = new PriorityQueue<Element<E>>(n,
new ElementComparator<E>(comparator));
iterators = new ArrayList<Iterator<E>>(n);
for (final SortedSet<E> s: sets) {
iterators.add(s.iterator());
}
prepareQueue();
}
@Override
public E next() {
final Element<E> e = queue.poll();
if (e == null) {
throw new NoSuchElementException();
}
if (!insertFromIterator(e.iterator)) {
iterators.remove(e.iterator);
}
return e.element;
}
@Override
public boolean hasNext() {
return !queue.isEmpty();
}
private void prepareQueue() {
final Iterator<Iterator<E>> iterator = iterators.iterator();
while (iterator.hasNext()) {
if (!insertFromIterator(iterator.next())) {
iterator.remove();
}
}
}
private boolean insertFromIterator(final Iterator<E> i) {
while (i.hasNext()) {
final Element<E> e = new Element<>(i.next(), i);
if (!queue.contains(e)) {
queue.add(e);
return true;
}
}
return false;
}
private static final class Element<E> {
final E element;
final Iterator<E> iterator;
Element(final E e, final Iterator<E> i) {
element = e;
iterator = i;
}
@Override
public boolean equals(final Object o) {
if (o == this) { return true; }
if (!(o instanceof Element)) { return false; }
final Element<?> e = (Element<?>) o;
return element.equals(e.element);
}
}
private static final class ElementComparator<E>
implements Comparator<Element<E>> {
final Comparator<? super E> comparator;
ElementComparator(final Comparator<? super E> comp) {
comparator = comp;
}
@Override
@SuppressWarnings("unchecked")
public int compare(final Element<E> e1, final Element<E> e2) {
if (comparator != null) {
return comparator.compare(e1.element, e2.element);
}
return ((Comparable<? super E>) e1.element)
.compareTo(e2.element);
}
}
}
}
The inner workings of this class are simple to grasp. The view keeps a list of sorted sets, the ones you want to iterate over. It also needs the comparator that will be used to compare elements (null to use their natural ordering). You can only add (distinct) sets to the view.
The rest of the magic happens in the Iterator of this view. This iterator keeps a PriorityQueue of the elements that will be returned from next() and a list of iterators from the individual sets.
This queue will have, at all times, at most one element per set, and it discards repeating elements. The iterator also discards empty and used up iterators. In short, it guarantees that you will traverse every element exactly once (as in a set).
Here's an example on how to use this class.
SortedSet<Integer> s1 = new TreeSet<>();
SortedSet<Integer> s2 = new TreeSet<>();
SortedSet<Integer> s3 = new TreeSet<>();
SortedSet<Integer> s4 = new TreeSet<>();
// ...
MultiSortedSetView<Integer> v =
new MultiSortedSetView<Integer>()
.add(s1)
.add(s2)
.add(s3)
.add(s4);
for (final Integer i: v) {
System.out.println(i);
}
I do not think that is possible unless it is some special case, which would require custom implementation.
For example take the following two comparators:
public class Comparator1 implements Comparator<Long> {
@Override
public int compare(Long o1, Long o2) {
return o1.compareTo(o2);
}
}
public class Comparator2 implements Comparator<Long> {
@Override
public int compare(Long o1, Long o2) {
return -o1.compareTo(o2);
}
}
and the following code:
TreeSet<Long> set1 = new TreeSet<Long>(new Comparator1());
TreeSet<Long> set2 = new TreeSet<Long>(new Comparator2());
set1.addAll(Arrays.asList(new Long[] {1L, 3L, 5L}));
set2.addAll(Arrays.asList(new Long[] {2L, 4L, 6L}));
System.out.println(Joiner.on(",").join(set1.descendingIterator()));
System.out.println(Joiner.on(",").join(set2.descendingIterator()));
This will result in:
5,3,1
2,4,6
and is useless for any Comparator operating on the head element of the given Iterators.
This makes it impossible to create such a general solution. It is only possible if all sets are sorted using the same Comparator; however, that cannot be guaranteed or enforced by any implementation that accepts SortedSet objects (e.g. anything that accepts SortedSet<Long> instances would accept both TreeSet objects above).
A little bit more formal approach:
Given y_1,..,y_n are all sorted sets, if:
the intersection of these sets is the empty set
and there is an ordering of the sets such that for every pair y_i, y_(i+1) it holds that y_i[x] <= y_(i+1)[1], where x is the index of the last element of the sorted set y_i and <= is the comparison function
then the sets y_1,..,y_n can be read after each other as a SortedSet.
Now if any of the following conditions are not met:
if the first condition is not met, then the definition of a Set is not fulfilled, so it can not be a Set until a deep copy merge is completed and the duplicated elements are removed (See Set javadoc, first paragraph:
sets contain no pair of elements e1 and e2 such that e1.equals(e2)
the second condition can only be ensured using exactly the same comparator <= function
The first condition is the more important, because being a SortedSet implies being a Set, and if the definition of being a Set cannot be fulfilled, then the stronger conditions of a SortedSet definitely cannot be fulfilled.
It is possible that an implementation exists which mimics the behavior of a SortedSet, but it will definitely not be a SortedSet.
com.google.common.collect.Sets#union from Guava will do the trick. It returns an unmodifiable view of the union of two sets, which you can iterate over. The returned set is not sorted. You can then create a new sorted set from it (new TreeSet() or com.google.common.collect.ImmutableSortedSet). I see no API to create a view of a given set as a sorted set.
If your concern is a deep-copy on the objects passed to the TreeSet#addAll method, you shouldn't be. The javadoc does not indicate it's a deep-copy (and it certainly would say so if it was)...and the OpenJDK implementation doesn't show this either. No copies - simply additional references to the existing object.
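One way to convince yourself that addAll stores references rather than copies is an identity check after adding. This is only a minimal sketch: the Box class and the sameInstanceAfterAddAll helper are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical element type, used only for this illustration.
class Box implements Comparable<Box> {
    final int key;

    Box(int key) {
        this.key = key;
    }

    @Override
    public int compareTo(Box other) {
        return Integer.compare(key, other.key);
    }
}

class AddAllReferenceDemo {
    // Returns true if the element held by the set is the very same object
    // that was in the source collection (identity, not equals()).
    static boolean sameInstanceAfterAddAll() {
        Box original = new Box(1);
        List<Box> source = new ArrayList<Box>();
        source.add(original);

        TreeSet<Box> set = new TreeSet<Box>();
        set.addAll(source);

        return set.first() == original;
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterAddAll());
    }
}
```

If addAll made copies, the == comparison would fail, because a copy is a different object even when it compares as equal.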
Since the deep-copy isn't an issue, I think worrying about this, unless you've identified this as a specific performance problem, falls into the premature optimization category.

How to combine two Collections.sort functions

I have a program with a list of names and how many people have each name. I want to sort the list by count, from greatest to least, and put names with the same count in alphabetical order. I figured out how to put the names in alphabetical order and how to put the counts in greatest-to-least order, but I can't figure out how to combine the two so that the list is sorted from greatest to least, with equal counts in alphabetical order.
Collections.sort(oneName, new OneNameCompare());
for(OneName a: oneName)
{
System.out.println(a.toString());
}
Collections.sort(oneName, new OneNameCountCompare());
for(OneName a: oneName)
{
System.out.println(a.toString());
}
You can make another Comparator that combines the effects of the two other Comparators. If one comparator compares equal, then you can call the second comparator and use its value.
public class CountNameComparator implements Comparator<OneName>
{
    // Primary order: count (greatest to least). Tie-breaker: name (alphabetical).
    private final OneNameCountCompare byCount = new OneNameCountCompare();
    private final OneNameCompare byName = new OneNameCompare();
    @Override
    public int compare(OneName n1, OneName n2)
    {
        int comp = byCount.compare(n1, n2);
        if (comp != 0) return comp;
        return byName.compare(n1, n2);
    }
}
Then you can call Collections.sort just once.
Collections.sort(oneName, new CountNameComparator());
This can be generalized for any number of comparators.
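A minimal sketch of that generalization, assuming nothing beyond the JDK: the ComparatorChain class and its chain method are hypothetical names, and the string comparators exist only to demonstrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparatorChain {
    // Returns a comparator that consults each given comparator in order
    // and answers with the first non-zero result.
    @SafeVarargs
    static <T> Comparator<T> chain(final Comparator<? super T>... comparators) {
        // Defensive copy so later changes to the caller's array have no effect.
        final List<Comparator<? super T>> copy =
                new ArrayList<Comparator<? super T>>(Arrays.asList(comparators));
        return new Comparator<T>() {
            @Override
            public int compare(T a, T b) {
                for (Comparator<? super T> c : copy) {
                    int result = c.compare(a, b);
                    if (result != 0) {
                        return result;
                    }
                }
                return 0; // every comparator considered the two elements equal
            }
        };
    }

    // Example use: shortest words first, ties broken alphabetically.
    static List<String> sortByLengthThenAlphabet(List<String> words) {
        Comparator<String> byLength = new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        };
        Comparator<String> alphabetical = new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareTo(b);
            }
        };
        List<String> sorted = new ArrayList<String>(words);
        Collections.sort(sorted, ComparatorChain.<String>chain(byLength, alphabetical));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortByLengthThenAlphabet(Arrays.asList("pear", "fig", "apple", "kiwi")));
    }
}
```

The order of the arguments is the order of precedence: a later comparator is consulted only when all earlier ones return 0.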
You can combine comparators like this
public static <T> Comparator<T> combine(final Comparator<T> c1, final Comparator<T> c2) {
return new Comparator<T>() {
public int compare(T t1, T t2) {
int cmp = c1.compare(t1, t2);
if (cmp == 0)
cmp = c2.compare(t1, t2);
return cmp;
}
};
}
BTW Comparators are a good example of when to use a stateless singleton. All comparators of a type are the same so you only ever need one of them.
public enum OneNameCompare implements Comparator<OneName> {
INSTANCE;
public int compare(OneName o1, OneName o2) {
int cmp = // compare the two objects
return cmp;
}
}
This avoids creating new instances or caching copies. You only ever need one of each type.
Assuming you're using the Apache Commons Collections API, you might want to check out ComparatorUtils.chainedComparator:
Collections.sort(oneName, ComparatorUtils.chainedComparator(new OneNameCountCompare(), new OneNameCompare()));
Using lambdas from Java 8:
Collections.sort(oneName,
    // count from greatest to least first, then name alphabetically
    (e1, e2) -> e2.getCount().compareTo(e1.getCount()) != 0 ?
        e2.getCount().compareTo(e1.getCount()) :
        e1.getName().compareTo(e2.getName()));
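The ordering the question asks for (count from greatest to least, then name alphabetically) can also be written with the Comparator factory methods added in Java 8. This is a sketch only: the OneName class below is a hypothetical stand-in, since the asker's real class and its getters are not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the asker's OneName class; the real one may differ.
class OneName {
    private final String name;
    private final int count;

    OneName(String name, int count) {
        this.name = name;
        this.count = count;
    }

    String getName() {
        return name;
    }

    int getCount() {
        return count;
    }

    @Override
    public String toString() {
        return name + " " + count;
    }
}

class CountThenNameDemo {
    // Highest count first; names in alphabetical order break ties.
    static final Comparator<OneName> COUNT_DESC_THEN_NAME =
            Comparator.comparingInt(OneName::getCount)
                      .reversed()
                      .thenComparing(OneName::getName);

    public static void main(String[] args) {
        List<OneName> names = new ArrayList<>(Arrays.asList(
                new OneName("Bob", 3),
                new OneName("Alice", 3),
                new OneName("Carol", 5),
                new OneName("Dave", 1)));
        names.sort(COUNT_DESC_THEN_NAME);
        System.out.println(names);
    }
}
```

Note that reversed() is applied before thenComparing, so it flips only the count comparison and leaves the name tie-breaker in ascending order.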

Java, searching within a list of objects?

I'm a bit lost on the fastest way to make this happen. I have a large list of objects that have basic variable attributes (with getters / setters), and I need to search this list to find the objects that match a given parameter.
I have found how to do a regular list search, but I need to, for example, call getName() on each object in the list and get the objects whose result matches my input.
Something like below where the third argument is the result of the method call and the second is what I am trying to find.
int index = Collections.binarySearch(myList, "value", getName());
Any advice is appreciated
If you only need to find the object(s) whose getName() is a particular value as a one-off operation, then there's probably not much magic possible: cycle through the list, call getName() on each object, and add those that match to your list of results.
If getName() is an expensive operation and there's some other way of a-priori working out if a given object definitely won't return a matching value, then obviously you can build in this 'filtering' as you cycle through.
If you frequently need to fetch objects for a given getName(), then keep an index (e.g. in a HashMap) of [result of getName() -> list of matching objects]. You'll need to decide whether and how to keep this "index" in sync with the actual list.
See also the other suggestion to use binarySearch() and keep the list sorted. This way, inserts are more expensive than with a map and an unsorted list, but if inserts are infrequent compared to lookups, it has the advantage of only needing to maintain one structure.
Take a look at the binarySearch that takes a comparator:
public static <T> int binarySearch(List<? extends T> list,
                                   T key,
                                   Comparator<? super T> c)
So you would do something like:
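class FooComparator
    implements Comparator<Foo>
{
    public int compare(Foo a, Foo b)
    {
        return a.getName().compareTo(b.getName());
    }
}
// The key must be a Foo, not a String: here "key" is assumed to be
// a Foo whose getName() returns "value".
int index = Collections.binarySearch(myList, key, new FooComparator());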
You will need to sort the list first, of course (Collections.sort takes a Comparator as well...).
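Put together, the sort-then-search approach might look like the sketch below. The Foo class, its constructor, and the indexOfName helper are assumptions made for the illustration; in particular, it assumes a Foo can be built from just a name to serve as the search key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical element type with a name getter.
class Foo {
    private final String name;

    Foo(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class FooNameComparator implements Comparator<Foo> {
    @Override
    public int compare(Foo a, Foo b) {
        return a.getName().compareTo(b.getName());
    }
}

class BinarySearchByNameDemo {
    // Sorts the list by name (in place), then returns the index of some element
    // with the given name, or a negative value if there is none.
    static int indexOfName(List<Foo> list, String name) {
        Comparator<Foo> byName = new FooNameComparator();
        Collections.sort(list, byName);   // binarySearch requires this exact ordering
        Foo probe = new Foo(name);        // the key has to be a Foo, not a String
        return Collections.binarySearch(list, probe, byName);
    }

    public static void main(String[] args) {
        List<Foo> list = new ArrayList<Foo>(Arrays.asList(
                new Foo("delta"), new Foo("alpha"), new Foo("charlie"), new Foo("bravo")));
        System.out.println(indexOfName(list, "charlie"));
    }
}
```

Keep in mind that binarySearch returns a single index; if several elements share the name, there is no guarantee which one is found, so neighbouring positions have to be checked to collect all matches.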
I know anonymous inner classes are not fashionable anymore, but until Java 8 arrives, you can create something like this:
1.- Create a search method that iterates over the collection and takes an object that tells it whether each element should be returned or not.
2.- Invoke that method and create an anonymous inner class with the criteria
3.- Get the new list in separate variable.
Something like this:
result = search( aList, new Matcher(){ public boolean matches( Some some ) {
    return some.name().equals("a");
}});
Here's a working demo:
import java.util.*;
class LinearSearchDemo {
public static void main( String ... args ) {
List<Person> list = Arrays.asList(
Person.create("Oscar", 0x20),
Person.create("Reyes", 0x30),
Person.create("Java", 0x10)
);
List<Person> result = searchIn( list,
new Matcher<Person>() {
public boolean matches( Person p ) {
return p.getName().equals("Java");
}});
System.out.println( result );
result = searchIn( list,
new Matcher<Person>() {
public boolean matches( Person p ) {
return p.getAge() > 16;
}});
System.out.println( result );
}
public static <T> List<T> searchIn( List<T> list , Matcher<T> m ) {
List<T> r = new ArrayList<T>();
for( T t : list ) {
if( m.matches( t ) ) {
r.add( t );
}
}
return r;
}
}
class Person {
String name;
int age;
String getName(){
return name;
}
int getAge() {
return age;
}
static Person create( String name, int age ) {
Person p = new Person();
p.name = name;
p.age = age;
return p;
}
public String toString() {
return String.format("Person(%s,%s)", name, age );
}
}
interface Matcher<T> {
public boolean matches( T t );
}
Output:
[Person(Java,16)]
[Person(Oscar,32), Person(Reyes,48)]
To do this in a more scalable way, without simply iterating/filtering objects, see this answer to a similar question: How do you query object collections in Java (Criteria/SQL-like)?
If the objects are immutable (or you at least know their names won't change) you could create an index using a HashMap.
You would have to fill the Map and keep it updated.
Map map = new HashMap();
map.put(myObject.getName(), myObject);
// ... repeat for each object ...
Then you can use map.get("Some name"); to do lookup using your index.
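Note that a plain put keeps only one object per name. If several objects can share a name, the index can map each name to a list instead. The following is a minimal sketch under that assumption; Item and NameIndex are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical element type; only getName() matters for the index.
class Item {
    private final String name;
    private final int id;

    Item(String name, int id) {
        this.name = name;
        this.id = id;
    }

    String getName() {
        return name;
    }

    int getId() {
        return id;
    }
}

class NameIndex {
    private final Map<String, List<Item>> byName = new HashMap<String, List<Item>>();

    // Registers the item under its current name. The name must not change afterwards,
    // otherwise the index goes stale.
    void add(Item item) {
        List<Item> bucket = byName.get(item.getName());
        if (bucket == null) {
            bucket = new ArrayList<Item>();
            byName.put(item.getName(), bucket);
        }
        bucket.add(item);
    }

    // Returns all items registered under the name, in insertion order (empty if none).
    List<Item> find(String name) {
        List<Item> bucket = byName.get(name);
        if (bucket == null) {
            return Collections.<Item>emptyList();
        }
        return Collections.unmodifiableList(bucket);
    }
}
```

Lookups are then a single hash access on average instead of a pass over the whole list, at the cost of keeping the index updated on every insert and removal.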
One library I'm familiar with is Guava -- you can compose its Predicate to pull out items from an Iterable. There's no need for the collection to be pre-sorted. (This means, in turn, that it's O(N), but it's convenient.)
