Is there any way to add metadata to Java Collections?

Let's say I have a collection of objects which can be sorted using a number of different comparators based on the different fields of the object.
It would be nice to be able to know later on in the code which comparator was used to sort the Collection and whether the order was ascending or descending. Is there any way to do this elegantly instead of using a bunch of Booleans to keep track of things?

Not for the Collection interface, but if you use a SortedSet there's a comparator() method where you can ask for its comparator.
Otherwise you'll have to subclass the collection class you're using to add the accessors you need.
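As a minimal sketch of the comparator() lookup mentioned above (the class name, the BY_LENGTH comparator and the sample strings are made up for illustration):

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class ComparatorLookupDemo {
    // Hypothetical comparator: shorter strings first, ties broken alphabetically.
    static final Comparator<String> BY_LENGTH = (a, b) ->
            a.length() != b.length() ? Integer.compare(a.length(), b.length()) : a.compareTo(b);

    static SortedSet<String> sortedByLength() {
        SortedSet<String> set = new TreeSet<>(BY_LENGTH);
        set.add("pear");
        set.add("fig");
        set.add("banana");
        return set;
    }

    public static void main(String[] args) {
        SortedSet<String> set = sortedByLength();
        // comparator() hands back the instance given to the constructor.
        System.out.println(set.comparator() == BY_LENGTH);
        // A set built without a comparator uses natural ordering and reports null.
        System.out.println(new TreeSet<String>().comparator() == null);
    }
}
```

Note that this only tells you which comparator is in use; the ascending/descending direction is not exposed separately, so a reversed comparator is simply a different comparator instance.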

No, there's nothing in the standard implementations that does this. You would need to track it yourself. You could subclass a Collection implementation to add fields which hold this information.
You could also map the implementations to metadata as you like with a Map -- in particular it seems like you want IdentityHashMap to do this, since you don't want two different collections to be compared for equality as keys with equals().
I would store a boolean (ascending/descending), and a reference to the Comparator used to sort, if that's what completely determines the sort. Or if it's sorted on field, store a String naming the field perhaps.
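The IdentityHashMap idea could look roughly like the sketch below. SortInfo, SORT_INFO and sortAndRecord are hypothetical names, and the static map is only for brevity; entries stay in it until they are removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class SortMetadataDemo {
    // Hypothetical metadata holder: the comparator used and the direction.
    static final class SortInfo {
        final Comparator<?> comparator;
        final boolean ascending;
        SortInfo(Comparator<?> comparator, boolean ascending) {
            this.comparator = comparator;
            this.ascending = ascending;
        }
    }

    // Keys are compared with ==, so two lists with equal contents keep separate entries.
    static final Map<Collection<?>, SortInfo> SORT_INFO = new IdentityHashMap<>();

    static <T> void sortAndRecord(List<T> list, Comparator<T> comparator, boolean ascending) {
        Collections.sort(list, ascending ? comparator : comparator.reversed());
        SORT_INFO.put(list, new SortInfo(comparator, ascending));
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("bob", "alice", "carol"));
        Comparator<String> byName = Comparator.naturalOrder();
        sortAndRecord(names, byName, false);
        System.out.println(names);                            // [carol, bob, alice]
        System.out.println(SORT_INFO.get(names).ascending);   // false
        // A copy is equal to the original list but is a different object, so it has no entry.
        System.out.println(SORT_INFO.containsKey(new ArrayList<>(names))); // false
    }
}
```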

sure:
define methods for your decorated Collection<Foo>
public List<Comparator<Foo>> getComparators() { ... }
and
public int whichComparator() { ... }
that returns which Comparator is currently in use from the List. You could make it fancier with a Map and some sensible keys (say, enums - perhaps even enums which implement the comparators) if you're modifying which comparators might be used over the life of the object, but I think the above is a good enough start.
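A rough sketch of the "enums which implement the comparators" variant follows. Person, PersonOrder and TrackedList are invented for the example, and the wrapper exposes only the few methods needed to show the idea rather than the whole Collection interface.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortedPeopleDemo {
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // Each constant is itself a Comparator, so the enum value doubles as the metadata.
    enum PersonOrder implements Comparator<Person> {
        BY_NAME {
            @Override public int compare(Person a, Person b) { return a.name.compareTo(b.name); }
        },
        BY_AGE {
            @Override public int compare(Person a, Person b) { return Integer.compare(a.age, b.age); }
        }
    }

    // Hypothetical wrapper that remembers how its contents were last sorted.
    static final class TrackedList {
        private final List<Person> items = new ArrayList<>();
        private PersonOrder order;        // null means "not sorted"
        private boolean ascending = true;

        void add(Person p) {
            items.add(p);
            order = null;                 // appending may break the previous ordering
        }

        void sort(PersonOrder newOrder, boolean asc) {
            Comparator<Person> cmp = asc ? newOrder : newOrder.reversed();
            items.sort(cmp);
            order = newOrder;
            ascending = asc;
        }

        Person get(int index) { return items.get(index); }
        PersonOrder currentOrder() { return order; }
        boolean isAscending() { return ascending; }
    }
}
```

Callers can then ask `currentOrder()` and `isAscending()` instead of juggling separate flags.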

Related

Map's equals() for keys that are arrays

I'm using a TreeMap (SortedMap) whose keys are Object[] with elements of varying types.
TreeMap's equals() doesn't work on Object[] like Arrays' equals() would do -- which means it won't work when using its methods like containsKey() and get() unless I work around it.
Is there somewhere a solution for this that doesn't involve creating a whole new Class?
EDIT :
Just to make it clear, I made a mistaken assumption. Creating a new Comparator(){} also does affect every method that uses equality, such as equals(), not only the tree sorter.

Is there somewhere a solution for this that doesn't involve creating a whole new Class?
No. In fact, you shouldn't be using mutable values for map keys at all.

While I agree with Matt Ball that you generally shouldn't use mutable (changeable) types as your keys, it is possible to use a TreeMap in this manner as long as you are not planning on modifying the arrays once they are in the tree.
This solution does involve the creation of a class, but not a new Map class, which is what it seems you are asking. Instead, you would need to create your own class which implements Comparator<Object[]> that can compare arrays. The class could use the Arrays.equals() method to determine if they are equal, but would need to also have a consistent rule to determine which array comes before another array when the arrays are not equal.
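Here is one possible shape for such a comparator, as a sketch only. Instead of calling Arrays.equals() it orders arrays by length and then element by element using each element's class name and string form, which is an assumption: it agrees with Arrays.equals() only when unequal elements differ in class or in toString(). ArrayKeyDemo, ARRAY_ORDER and describe are made-up names.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class ArrayKeyDemo {
    // Shorter arrays sort first; equal-length arrays are compared element by element.
    static final Comparator<Object[]> ARRAY_ORDER = new Comparator<Object[]>() {
        @Override
        public int compare(Object[] a, Object[] b) {
            if (a.length != b.length) {
                return Integer.compare(a.length, b.length);
            }
            for (int i = 0; i < a.length; i++) {
                int c = describe(a[i]).compareTo(describe(b[i]));
                if (c != 0) {
                    return c;
                }
            }
            return 0; // every position matched, so the keys are treated as the same key
        }
    };

    // Assumption: class name plus toString() is enough to tell unequal elements apart.
    private static String describe(Object o) {
        return o == null ? "" : o.getClass().getName() + ":" + o;
    }

    static Map<Object[], String> sample() {
        Map<Object[], String> map = new TreeMap<>(ARRAY_ORDER);
        map.put(new Object[] {1, "a"}, "first");
        map.put(new Object[] {2, "b", 3.0}, "second");
        return map;
    }

    public static void main(String[] args) {
        // A fresh array with the same contents finds the entry.
        System.out.println(sample().get(new Object[] {1, "a"})); // first
    }
}
```

As the answer says, this only stays reliable if the arrays are not modified after they have been used as keys.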

How to make class usable in different HashMaps in Java

I have a class Attribute which has 2 variables say int a,b;
I want to use class Attribute in two different HashSet.
The first hash set considers objects as equal when the value of a is same.
But the second hash set considers objects as equal when the value of b is same.
I know that if I override the equals method, the HashSet will use the overridden version of equals to compare two objects, but in this case I would need two different implementations of equals().
One way is to create two subclasses of Attribute and give them different equals methods, but I want to know if there is a better way to do it so that I don't have to create subclasses of Attribute.
Thanks.

One possible solution is to not use HashSet, but use TreeSet instead. It's the same Set interface, but there is a TreeSet constructor that lets you pass in a Comparator. That way you could leave the Attribute class unchanged- just create two different comparators and use it like
Set<Attribute> setA = new TreeSet<Attribute>(comparatorForA);
Set<Attribute> setB = new TreeSet<Attribute>(comparatorForB);
The comparator takes care of the equality check (e.g. if compare returns 0, the objects are equal)
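Filled out a little, the two-TreeSet idea might look like this. The Attribute class here is a stand-in with public final fields, and BY_A, BY_B, collect and sample are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AttributeSetsDemo {
    static final class Attribute {
        final int a;
        final int b;
        Attribute(int a, int b) { this.a = a; this.b = b; }
    }

    // Two notions of "same element": one looks only at a, the other only at b.
    static final Comparator<Attribute> BY_A = Comparator.comparingInt(x -> x.a);
    static final Comparator<Attribute> BY_B = Comparator.comparingInt(x -> x.b);

    static Set<Attribute> collect(List<Attribute> input, Comparator<Attribute> cmp) {
        Set<Attribute> set = new TreeSet<>(cmp);
        set.addAll(input); // elements for which compare() returns 0 are dropped as duplicates
        return set;
    }

    static List<Attribute> sample() {
        return Arrays.asList(
                new Attribute(1, 10), new Attribute(1, 20),
                new Attribute(2, 20), new Attribute(3, 20));
    }

    public static void main(String[] args) {
        System.out.println(collect(sample(), BY_A).size()); // 3 distinct values of a
        System.out.println(collect(sample(), BY_B).size()); // 2 distinct values of b
    }
}
```

Such sets are deliberately inconsistent with equals(), so they should not be compared against ordinary Set implementations.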

Unfortunately there's no "Equalizer" class that can override the equals logic. There is such a thing for sorting, where you can either use natural sorting based on the Comparable implementation or provide your own Comparator. I've actually wondered why there's no such thing for equality checks.
Since the semantics of equality are defined by a class and could be considered a trait of that class, the two subclasses approach seems the most natural. Maybe someone knows a useful pattern for doing this in a more simple manner, but I've never encountered it.
EDIT: just thought of something... you could use two Map instances, like HashMap, with the first one using a as key and the second using b as key. It'd let you detect collisions. You could then simply link the attribute to the associated instance.

I did something different. Instead of using HashSet, I used HashMap: the first HashMap uses int a as the key and stores the object as the value.
The other HashMap uses int b as the key and the object as the value.
This lets me hash on both variables a and b, so I don't have to make any subclasses.
I also get O(1) time instead of O(log n). I know I am paying for it with some more memory, but my main concern was time, so I chose HashMap over TreeSet.
Thank you all for your comments and suggestions.

It would be very easy to modify HashMap and HashSet to accept hashing and equality-testing strategies.
public interface Hasher {
    int hashCode(Object o);
}
public interface Equalizer {
    // an equality test answers yes or no, so it returns boolean rather than int
    boolean areEqual(Object o1, Object o2);
}

A simple solution is to bypass HashSet and use HashMap directly. For the first, store each Attribute using its a property as the key, and for the other use b.
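A small sketch of the two-map approach, assuming an Attribute class with fields a and b; AttributeIndexDemo and its add method are invented names, and keeping the first element seen for each key is a design choice of this example.

```java
import java.util.HashMap;
import java.util.Map;

class AttributeIndexDemo {
    static final class Attribute {
        final int a;
        final int b;
        Attribute(int a, int b) { this.a = a; this.b = b; }
    }

    // The first map treats equal a as "same element", the second does so for b.
    final Map<Integer, Attribute> byA = new HashMap<>();
    final Map<Integer, Attribute> byB = new HashMap<>();

    void add(Attribute attr) {
        // putIfAbsent keeps the first element seen for each key, like Set.add would.
        byA.putIfAbsent(attr.a, attr);
        byB.putIfAbsent(attr.b, attr);
    }

    public static void main(String[] args) {
        AttributeIndexDemo index = new AttributeIndexDemo();
        index.add(new Attribute(1, 10));
        index.add(new Attribute(1, 20));
        index.add(new Attribute(2, 20));
        System.out.println(index.byA.size()); // 2
        System.out.println(index.byB.size()); // 2
    }
}
```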

I can propose a slightly hacky but lower-effort solution :)
Swap the values of a and b when storing in the second HashSet, so that uniqueness is defined by the value of b, and swap a and b back when reading the object from the HashSet to restore the original state. That way the same equals/hashCode methods serve both purposes.

Removing duplicates without overriding hash method

I have a List of objects and I want to remove from it all the elements which have the same values in two of their attributes. I had thought about doing something like this:
List<Class1> myList;
....
Set<Class1> mySet = new HashSet<Class1>();
mySet.addAll(myList);
and overriding the hash method in Class1 so it returns a number which depends only on the attributes I want to consider.
The problem is that I need to do a different filtering in another part of the application so I can't override hash method in this way (I would need two different hash methods).
What's the most efficient way of doing this filtering without overriding hash method?
Thanks

Overriding hashCode and equals in Class1 (just to do this) is problematic. You end up with your class having an unnatural definition of equality, which may turn out to be wrong for other current and future uses of the class.
Review the Comparator interface and write a Comparator<Class1> implementation to compare instances of your Class1 based on your criteria; e.g. based on those two attributes. Then instantiate a TreeSet<Class1> for duplicate detection using the TreeSet(Comparator) constructor.
EDIT
Comparing this approach with @Tom Hawtin's approach:
The two approaches use roughly comparable space overall. The treeset's internal nodes roughly balance the hashset's array and the wrappers that support the custom equals / hash methods.
The wrapper + hashset approach is O(N) in time (assuming good hashing) versus O(NlogN) for the treeset approach. So that is the way to go if the input list is likely to be large.
The treeset approach wins in terms of the lines of code that need to be written.
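For illustration, the TreeSet(Comparator) approach could be sketched as below. Item stands in for Class1, and its fields name, code and note are invented; only name and code are treated as significant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {
    // Stand-in for Class1.
    static final class Item {
        final String name;
        final int code;
        final String note;
        Item(String name, int code, String note) {
            this.name = name;
            this.code = code;
            this.note = note;
        }
    }

    static List<Item> dedupe(List<Item> items) {
        Comparator<Item> byNameAndCode =
                Comparator.comparing((Item i) -> i.name).thenComparingInt(i -> i.code);
        Set<Item> unique = new TreeSet<>(byNameAndCode);
        unique.addAll(items); // the first item seen for each (name, code) pair is kept
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("x", 1, "first"), new Item("x", 1, "second"), new Item("y", 1, "third"));
        System.out.println(dedupe(items).size()); // 2
    }
}
```

One side effect to be aware of: the result comes back in comparator order, not in the order of the original list.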

Let your Class1 implement Comparable. Then use a TreeSet as in your example (i.e. use the addAll method).

As an alternative to what Roman said you can have a look at this SO question about filtering using Predicates. If you use Google Collections anyway this might be a good fit.

I would suggest introducing a class for the concept of the parts of Class1 that you want to consider significant in this context. Then use a HashSet or HashMap.
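A quick way to try this out without writing a dedicated key class is to use a List of the significant values as the key, since List already defines equals() and hashCode() element-wise. This is a stand-in for the key class suggested above, not the same thing; Item and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeyDedupDemo {
    // Stand-in for Class1; only name and code count for duplicate detection.
    static final class Item {
        final String name;
        final int code;
        final String note;
        Item(String name, int code, String note) {
            this.name = name;
            this.code = code;
            this.note = note;
        }
    }

    static List<Item> dedupe(List<Item> items) {
        // LinkedHashMap preserves the order in which keys were first seen.
        Map<List<Object>, Item> firstByKey = new LinkedHashMap<>();
        for (Item item : items) {
            List<Object> key = Arrays.<Object>asList(item.name, item.code);
            firstByKey.putIfAbsent(key, item);
        }
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("y", 1, "first"), new Item("x", 1, "second"), new Item("y", 1, "third"));
        System.out.println(dedupe(items).size()); // 2
    }
}
```

A purpose-built key class reads better in real code, but the hashing behaviour is the same.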

Sometimes programmers make things too complicated trying to use all the nice features of a language, and the answers to this question are an example. Overriding anything on the class is overkill. What you need is this:
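import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Key holding only the two attributes that define a duplicate.
class MyClass {
    Object attr1;
    Object attr2;
    // HashSet.contains() relies on equals() and hashCode(), so the key class must define both.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MyClass)) return false;
        MyClass other = (MyClass) o;
        return Objects.equals(attr1, other.attr1) && Objects.equals(attr2, other.attr2);
    }
    @Override
    public int hashCode() {
        return Objects.hash(attr1, attr2);
    }
}
List<Class1> list;
Set<Class1> set=....
Set<MyClass> tempset = new HashSet<MyClass>();
for (Class1 c : list) {
    MyClass myc = new MyClass();
    myc.attr1 = c.attr1;
    myc.attr2 = c.attr2;
    if (!tempset.contains(myc)) {
        tempset.add(myc);
        set.add(c);
    }
}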
Feel free to fix up minor irregularities. There will be some issues depending on what you mean by equality for the attributes (and obvious changes if the attributes are primitive). Sometimes we need to write code, not just use the built-in libraries.

Duplicate values in the Set collection?

Is it possible to allow duplicate values in the Set collection?
Is there any way to make the elements unique and have some copies of them?
Are there any functions in the Set collection for having duplicate values in it?

Ever considered using a java.util.List instead?
Otherwise I would recommend a Multiset from Google Guava (the successor to Google Collections, which this answer originally recommended -ed.).

The very definition of a Set disallows duplicates. I think perhaps you want to use another data structure, like a List, which will allow dups.
Is there any way to make the elements unique and have some copies of them?
If for some reason you really do need to store duplicates in a set, you'll either need to wrap them in some kind of holder object, or else override equals() and hashCode() of your model objects so that they do not evaluate as equivalent (and even that will fail if you are trying to store references to the same physical object multiple times).
I think you need to re-evaluate what you are trying to accomplish here, or at least explain it more clearly to us.

From the javadocs:
"sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element"
So if your objects were to override .equals() so that it would return different values for whatever objects you intend on storing, then you could store them separately in a Set (you should also override hashCode() as well).
However, the very definition of a Set in Java is,
"A collection that contains no duplicate elements."
So you're really better off using a List or something else here. Perhaps a Map, if you'd like to store duplicate values based on different keys.

Sun's view on "bags" (AKA multisets):
We are extremely sympathetic to the desire for type-safe collections. Rather than adding a "band-aid" to the framework that enforces type-safety in an ad hoc fashion, the framework has been designed to mesh with all of the parameterized-types proposals currently being discussed. In the event that parameterized types are added to the language, the entire collections framework will support compile-time type-safe usage, with no need for explicit casts. Unfortunately, this won't happen in the 1.2 release. In the meantime, people who desire runtime type safety can implement their own gating functions in "wrapper" collections surrounding JDK collections.
(source; note it is old and possibly obsolete -ed.)
Apart from Google's collections API, you can use Apache Commons Collections.
Apache Commons Collections:
http://commons.apache.org/collections/
Javadoc for Bag

I don't believe that you can have duplicate values within a set. A set is defined as a collection of unique values. You may be better off using an ArrayList.

These sound like interview questions, so I'll answer them like interview questions...
Is it possible to allow duplicate values in the Set collection?
Yes, but it requires that the person implementing the Set violate the design contract upon which Set is built. Basically, I could write a class that implements Set and doesn't enforce Set's promises.
In addition, other violations are possible. I could use a Set implementation that relies upon Java's hashCode() contract. Then if I provided an object that violates that contract, I might be able to place two objects into the set which are equal but yield different hash codes (because they might never be checked for equality against each other, being in different hash bucket chains).
Is there any way to make the elements unique and have some copies of them?
It basically depends on how you define uniqueness. If an object's uniqueness is determined by its value, then one can have multiple copies of the same unique object; however, if the object's uniqueness is determined by its instance, then by definition it would not be possible to have multiple copies of the same object. You could however have multiple references to them.
Are there any functions in the Set collection for having duplicate values in it?
The Set interface doesn't have any dedicated functions for detecting or reporting duplicates, although add() returns false when the element is already present. It is based on the Collection interface, which also has to support List, so it is possible to pass duplicates into a Set; a properly implemented Set will just ignore the duplicates and present one copy of every element determined to be unique.
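A minimal illustration of a Set ignoring duplicates, using HashSet and the boolean returned by add() (the class and method names are made up):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetDuplicateDemo {
    // Counts how many elements of the input the set refused because they were already present.
    static int countRejected(List<String> input) {
        Set<String> seen = new HashSet<>();
        int rejected = 0;
        for (String s : input) {
            if (!seen.add(s)) { // add() returns false when an equal element is already in the set
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "a");
        System.out.println(countRejected(input));         // 2
        System.out.println(new HashSet<>(input).size());  // 2
    }
}
```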

I don't think so. The only way would be to use a List. You can also play tricks with equals(), hashCode() or compareTo(), but it is going to be awkward.

No chance. You cannot have duplicate values in the Set interface.
If you want duplicates, you can try ArrayList.

As mentioned choose the right collection for the task and likely a List will be what you need. Messing with the equals(), hashcode() or compareTo() to break identity is generally a bad idea simply to wedge an instance into the wrong collection to start with. Worse yet it may break code in other areas of the application that depend on these methods producing valid comparison results and be very difficult to debug or track down such errors.

I was also asked this question in an interview. I think the answer is that a Set of course will not allow duplicate elements, and an ArrayList or another collection should be used instead; however, overriding equals() for the type of object being stored in the set lets you manipulate the comparison logic, and hence you may be able to store duplicate elements in the Set. It's more of a hack, which would allow non-unique elements in the Set, and of course it is not recommended in production-level code.

You can do so by overriding hashcode as given below:
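import java.util.HashSet;
import java.util.Set;

public class Test
{
    static int a = 0;

    // Returns a different value on every call, which breaks the hashCode() contract.
    @Override
    public int hashCode()
    {
        a++;
        return a;
    }

    public static void main(String[] args)
    {
        Set<Test> s = new HashSet<Test>();
        Test t1 = new Test();
        Test t2 = t1; // same object, second reference
        s.add(t1);
        s.add(t2);
        System.out.println(s);
        System.out.println("--Done--");
    }
}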

Well, in this case we are trying to defeat the purpose of a specific collection. If we want to allow duplicate records, simply use a List or a multimap.

A Set stores unique values; if you want to store duplicate values, go for a List. If you still want duplicate values in a Set, create a Set of ArrayLists so that each list can hold duplicate elements.
Set<ArrayList<String>> s = new HashSet<ArrayList<String>>();
ArrayList<String> arr = new ArrayList<String>();
arr.add("First");
arr.add("Second");
arr.add("Third");
arr.add("Fourth");
arr.add("First");
s.add(arr);

You can use a TreeMap instead:
The key is the element you wish to store
and the value is the frequency of that element.
Insertion and removal require custom handling.
Insertion: check whether the map already contains the element; if yes, increment its frequency. O(log N)
Removal: if the element's frequency is 1, remove it; otherwise decrease the frequency by 1. O(log N)
More details can be found in the Javadoc for TreeMap.
Overall time complexity remains the same as TreeSet, O(log N), but worse than HashSet, O(1).
firstEntry() -> provides the smallest element entry, time complexity: O(log N)
lastEntry() -> provides the greatest element entry, time complexity: O(log N)
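The insertion and removal rules above can be sketched as a small wrapper around TreeMap. TreeMapBag and its method names are hypothetical, and elements are assumed to have a natural ordering.

```java
import java.util.TreeMap;

class TreeMapBag<E extends Comparable<E>> {
    // Maps each element to the number of copies currently stored.
    private final TreeMap<E, Integer> counts = new TreeMap<>();

    // Insertion: start at 1 or increment the existing frequency.
    void add(E element) {
        counts.merge(element, 1, Integer::sum);
    }

    // Removal: drop the entry when the last copy goes, otherwise decrement.
    boolean remove(E element) {
        Integer current = counts.get(element);
        if (current == null) {
            return false;
        }
        if (current == 1) {
            counts.remove(element);
        } else {
            counts.put(element, current - 1);
        }
        return true;
    }

    int count(E element) {
        return counts.getOrDefault(element, 0);
    }

    // Smallest stored element, or null when the bag is empty.
    E smallest() {
        return counts.isEmpty() ? null : counts.firstKey();
    }
}
```

For example, adding "b", "a", "b" gives `count("b") == 2` and `smallest()` returning "a"; two calls to `remove("b")` then take the count back to 0.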

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class SET {
    public static void main(String[] args) {
        // AB does not override equals() or hashCode(), so all three instances are kept.
        Set<AB> set = new HashSet<AB>();
        set.add(new AB(10, "pawan@email"));
        set.add(new AB(10, "pawan@email"));
        set.add(new AB(10, "pawan@email"));
        Iterator<AB> it = set.iterator();
        while (it.hasNext()) {
            AB o = it.next();
            System.out.println(o);
        }
    }
}

class AB {
    int id;
    String email;

    public AB() {
        System.out.println("DC");
    }

    AB(int id, String email) {
        this.id = id;
        this.email = email;
    }

    @Override
    public String toString() {
        return "" + id + "\t" + email;
    }
}

Java: SortedMap, TreeMap, Comparable? How to use?

I have a list of objects I need to sort according to properties of one of their fields. I've heard that SortedMap and Comparators are the best way to do this.
Do I implement Comparable with the class I'm sorting, or do I create a new class?
How do I instantiate the SortedMap and pass in the Comparator?
How does the sorting work? Will it automatically sort everything as new objects are inserted?
EDIT:
This code is giving me an error:
private TreeMap<Ktr> collection = new TreeMap<Ktr>();
(Ktr implements Comparator<Ktr>). Eclipse says it is expecting something like TreeMap<K, V>, so the number of parameters I'm supplying is incorrect.

The simpler way is to implement Comparable with your existing objects, although you could instead create a Comparator and pass it to the SortedMap.
Note that Comparable and Comparator are two different things; a class implementing Comparable compares this to another object, while a class implementing Comparator compares two other objects.
If you implement Comparable, you don't need to pass anything special into the constructor. Just call new TreeMap<MyObject>(). (Edit: Except that of course Maps need two generic parameters, not one. Silly me!)
If you instead create another class implementing Comparator, pass an instance of that class into the constructor.
Yes, according to the TreeMap Javadocs.
Edit: On re-reading the question, none of this makes sense. If you already have a list, the sensible thing to do is implement Comparable and then call Collections.sort on it. No maps are necessary.
A little code:
public class MyObject implements Comparable<MyObject> {
    // ... your existing code here ...
    @Override
    public int compareTo(MyObject other) {
        // do smart things here
    }
}
// Elsewhere:
List<MyObject> list = ...;
Collections.sort(list);
As with the SortedMap, you could instead create a Comparator<MyObject> and pass it to Collections.sort(List, Comparator).
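To make both options concrete, here is a small sketch with an invented Task class: its compareTo gives the default ordering by priority, and a separate Comparator sorts by title.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortListDemo {
    // Hypothetical element type whose natural ordering is by priority.
    static final class Task implements Comparable<Task> {
        final String title;
        final int priority;
        Task(String title, int priority) { this.title = title; this.priority = priority; }

        @Override
        public int compareTo(Task other) { return Integer.compare(priority, other.priority); }

        @Override
        public String toString() { return title; }
    }

    // Uses the natural ordering defined by compareTo.
    static List<Task> byPriority(List<Task> tasks) {
        List<Task> copy = new ArrayList<>(tasks);
        Collections.sort(copy);
        return copy;
    }

    // Uses an external Comparator, leaving the natural ordering untouched.
    static List<Task> byTitle(List<Task> tasks) {
        List<Task> copy = new ArrayList<>(tasks);
        Collections.sort(copy, Comparator.comparing((Task t) -> t.title));
        return copy;
    }

    static List<Task> sample() {
        return Arrays.asList(new Task("write", 2), new Task("deploy", 3), new Task("plan", 1));
    }

    public static void main(String[] args) {
        System.out.println(byPriority(sample())); // [plan, write, deploy]
        System.out.println(byTitle(sample()));    // [deploy, plan, write]
    }
}
```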

1.
That depends on the situation. Let's say the object A should sort before the object B in your set. If it generally makes sense to consider A less than B, then implementing Comparable would make sense. If the order only makes sense in the context in which you use the set, then you should probably create a Comparator.
2.
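new TreeMap<MyClass, V>(new MyComparator());
Or without creating a MyComparator class:
new TreeMap<MyClass, V>(new Comparator<MyClass>() {
    // compare must be public because it implements an interface method; V stands for your value type
    public int compare(MyClass o1, MyClass o2) { ... }
});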
3. Yes.

Since you have a list and get an error because you have one argument on the map I suppose you want a sorted set:
SortedSet<Ktr> set = new TreeSet<Ktr>(comparator);
This will keep the set sorted, i.e. an iterator will return the elements in their sort order. There are also methods specific to SortedSet which you might want to use. If you also want to go backwards you can use NavigableSet.
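A short sketch of iterating a sorted set forwards and backwards, using Integer elements and natural ordering for brevity (the class and method names are invented):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class NavigableSetDemo {
    // Iterating a TreeSet yields the elements in sort order, with duplicates removed.
    static List<Integer> ascending(Collection<Integer> values) {
        return new ArrayList<>(new TreeSet<Integer>(values));
    }

    // descendingSet() is a reverse-order view of the same set.
    static List<Integer> descending(Collection<Integer> values) {
        NavigableSet<Integer> set = new TreeSet<Integer>(values);
        return new ArrayList<>(set.descendingSet());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2, 3);
        System.out.println(ascending(input));  // [1, 2, 3]
        System.out.println(descending(input)); // [3, 2, 1]
    }
}
```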

My answer assumes you are using the TreeMap implementation of SortedMap.
1.) If using TreeMap, you have a choice. You can either implement Comparable directly on your class or pass a separate Comparator to the constructor.
2.) Example:
Comparator<A> cmp = new MyComparator();
Map<A,B> map = new TreeMap<A,B>(cmp);
3.) Yes that's correct. Internally TreeMap uses a red-black tree to store elements in order as they are inserted; the time cost of performing an insert (or retrieval) is O(log N).
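A runnable version of the same idea, using the JDK's String.CASE_INSENSITIVE_ORDER as the comparator so that no custom class is needed (the demo class and its sample data are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TreeMapOrderDemo {
    static Map<String, Integer> sample() {
        // The comparator decides both the iteration order and which keys count as the same key.
        Map<String, Integer> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("pear", 3);
        map.put("Apple", 1);
        map.put("banana", 2);
        return map;
    }

    static List<String> keysInOrder() {
        return new ArrayList<>(sample().keySet());
    }

    public static void main(String[] args) {
        // Entries come back sorted no matter what order they were inserted in.
        System.out.println(keysInOrder()); // [Apple, banana, pear]
        // Lookups go through the comparator too, so case is ignored here.
        System.out.println(sample().get("APPLE")); // 1
    }
}
```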

You make a Comparator<ClassYouWantToSort>. Then the Comparator compares the field that you want to sort on.
When you create the TreeMap, you create a TreeMap<ClassYouWantToSort, V> (a map needs a value type V as well as the key type), and you pass in the Comparator as an argument. Then, as you insert objects of type ClassYouWantToSort as keys, the TreeMap uses your Comparator to sort them properly.
EDIT: As Adamski notes, you can also make ClassYouWantToSort itself Comparable. The advantage is that you have fewer classes to deal with, the code is simpler, and ClassYouWantToSort gets a convenient default ordering. The disadvantage is that ClassYouWantToSort may not have a single obvious ordering, and so you'll have to implement Comparators for other situations anyway. You also may not be able to change ClassYouWantToSort.
EDIT2: If you only have a bunch of objects that you're throwing into the collection, and it's not a Map (i.e. it's not a mapping from one set of objects to another) then you want a TreeSet, not a TreeMap.
