Difference between these declarations of TreeSet - java

So I have to use a TreeSet in my code.
As TreeSet<E> extends AbstractSet<E> implements NavigableSet<E>, Cloneable, java.io.Serializable
and interface NavigableSet<E> extends SortedSet<E> which extends Set<E>
I can use any of these three declarations:
NavigableSet<String> myTreeSet= new TreeSet<>();
SortedSet<String> myTreeSet= new TreeSet<>();
Set<String> myTreeSet= new TreeSet<>();
I know I will have access only to the methods exposed by the interface I use in the declaration. Is there any other reason to prefer a particular declaration for a TreeSet?

It is basically a question of what you allow others (or yourself) to use, as you stated. Other methods you want to use with your TreeSet might also depend on the declared type: a method may require a SortedSet, and if you declared your TreeSet as a Set, you cannot pass it to that method without a cast.
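A minimal sketch of that situation, assuming a hypothetical helper named smallest that accepts only a SortedSet (the class and method names are illustrative, not from the question):

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class DeclaredTypeDemo {
    // Hypothetical helper: it needs first(), so it asks for a SortedSet.
    static String smallest(SortedSet<String> values) {
        return values.first();
    }

    public static void main(String[] args) {
        SortedSet<String> sorted = new TreeSet<>();
        sorted.add("pear");
        sorted.add("apple");
        System.out.println(smallest(sorted)); // compiles: declared type is SortedSet

        Set<String> plain = new TreeSet<>(sorted);
        // smallest(plain);  // does not compile: Set<String> is not a SortedSet<String>
        // A cast compiles, but it only succeeds at runtime because the object is a TreeSet.
        System.out.println(smallest((SortedSet<String>) plain));
    }
}
```

The object is a TreeSet in both cases; only the declared type decides which calls the compiler accepts.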

Please don't make premature decisions. If you don't need methods from NavigableSet, don't use it. Just use Set<String>.

You can use whichever interface you like. What makes the difference is which interface you are programming to: you program to an interface (the declared type), not to the actual TreeSet collection.
Here is the best answer which explains Programming to an interface.

As a guiding rule, always choose the most general type possible, because this decouples the client code from the concrete implementation. Most of the time the calling code does not need to know which implementation is used; you just want to provide the behaviour, not expose it. That said, sometimes you will need a little more control; then you should move down the interface hierarchy until you find the necessary level of control, but this should be the exception, not the rule.

Related

Why LinkedList implements List interface again in java [duplicate]

Why do many Collection classes in Java extend the Abstract class and also implement the interface (which is also implemented by the given abstract class)?
For example, class HashSet extends AbstractSet and also implements Set, but AbstractSet already implements Set.
It's a way to remember that this class really implements that interface.
It won't have any bad effect and it can help to understand the code without going through the complete hierarchy of the given class.
From the perspective of the type system the classes wouldn't be any different if they didn't implement the interface again, since the abstract base classes already implement them.
That much is true.
The reason they do implement it anyways is (probably) mostly documentation: a HashSet is-a Set. And that is made explicit by adding implements Set to the end, although it's not strictly necessary.
Note that the difference is actually observable using reflection, but I'd be hard-pressed to produce some code that would break if HashSet didn't implement Set directly.
This may not matter much in practice, but I wanted to clarify that explicitly implementing an interface is not exactly the same as implementing it by inheritance. The difference is present in compiled class files and visible via reflection. E.g.,
for (Class<?> c : ArrayList.class.getInterfaces())
    System.out.println(c);
The output shows only the interfaces explicitly implemented by ArrayList, in the order they were written in the source, which [on my Java version] is:
interface java.util.List
interface java.util.RandomAccess
interface java.lang.Cloneable
interface java.io.Serializable
The output does not include interfaces implemented by superclasses, or interfaces that are superinterfaces of those which are included. In particular, Iterable and Collection are missing from the above, even though ArrayList implements them implicitly. To find them you have to recursively iterate the class hierarchy.
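The recursive walk mentioned here could look roughly like the following. This is only a sketch; the class name AllInterfaces and its helper methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

public class AllInterfaces {
    // Collects every interface a class implements, whether directly,
    // through a superclass, or through a superinterface.
    static Set<Class<?>> allInterfaces(Class<?> type) {
        Set<Class<?>> found = new LinkedHashSet<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Class<?> iface : c.getInterfaces()) {
                collect(iface, found);
            }
        }
        return found;
    }

    // Adds an interface and, recursively, the interfaces it extends.
    private static void collect(Class<?> iface, Set<Class<?>> found) {
        if (found.add(iface)) {
            for (Class<?> parent : iface.getInterfaces()) {
                collect(parent, found);
            }
        }
    }

    public static void main(String[] args) {
        Set<Class<?>> all = allInterfaces(ArrayList.class);
        System.out.println(all.contains(Iterable.class));
        System.out.println(all.contains(Collection.class));
    }
}
```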
It would be unfortunate if some code out there uses reflection and depends on interfaces being explicitly implemented, but it is possible, so the maintainers of the collections library may be reluctant to change it now, even if they wanted to. (There is an observation termed Hyrum's Law: "With a sufficient number of users of an API, it does not matter what you promise in the contract; all observable behaviors of your system will be depended on by somebody".)
Fortunately this difference does not affect the type system. The expressions new ArrayList<>() instanceof Iterable and Iterable.class.isAssignableFrom(ArrayList.class) still evaluate to true.
Unlike Colin Hebert, I don't buy that the people who wrote that cared about readability. (Everyone who thinks the standard Java libraries were written by impeccable gods should take a look at their sources. The first time I did this I was horrified by the code formatting and the numerous copy-pasted blocks.)
My bet is it was late, they were tired and didn't care either way.
From the "Effective Java" by Joshua Bloch:
You can combine the advantages of interfaces and abstract classes by adding an abstract skeletal implementation class to go with an interface.
The interface defines the type, perhaps providing some default methods, while the skeletal class implements the remaining non-primitive interface methods atop the primitive interface methods. Extending a skeletal implementation takes most of the work out of implementing an interface. This is the Template Method pattern.
By convention, skeletal implementation classes are called AbstractInterface where Interface is the name of the interface they implement. For example:
AbstractCollection
AbstractSet
AbstractList
AbstractMap
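As a rough illustration of what a skeletal class saves you, the sketch below extends AbstractList and writes only get and size. The range method and the class name SkeletalDemo are hypothetical, not taken from the book.

```java
import java.util.AbstractList;
import java.util.List;

public class SkeletalDemo {
    // Hypothetical read-only list of the integers from..to-1. Only the two
    // primitive methods are written here; AbstractList supplies the rest.
    static List<Integer> range(int from, int to) {
        return new AbstractList<Integer>() {
            @Override
            public Integer get(int index) {
                if (index < 0 || index >= size()) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                return from + index;
            }

            @Override
            public int size() {
                return Math.max(0, to - from);
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> r = range(3, 6);
        System.out.println(r);             // toString() is inherited
        System.out.println(r.contains(4)); // contains() is inherited
        System.out.println(r.indexOf(5));  // indexOf() is inherited
    }
}
```

Iteration, equals, hashCode and the other non-primitive operations all come from the skeletal class.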
I also believe it is for clarity. The Java Collections framework has quite a hierarchy of interfaces that defines the different types of collection. It starts with the Collection interface then extended by three main subinterfaces Set, List and Queue. There is also SortedSet extending Set and BlockingQueue extending Queue.
Now, concrete classes implementing them are more understandable if they explicitly state which interface in the hierarchy they implement, even though it may look redundant at times. As you mentioned, a class like HashSet implements Set, but a class like TreeSet, though it also extends AbstractSet, implements SortedSet instead, which is more specific than just Set. HashSet may look redundant but TreeSet is not, because it is required to implement SortedSet. Still, both classes are concrete implementations and would be more understandable if both follow a certain convention in their declaration.
There are even classes that implement more than one collection type like LinkedList which implements both List and Queue. However, there is one class at least that is a bit 'unconventional', the PriorityQueue. It extends AbstractQueue but doesn't explicitly implement Queue. Don't ask me why. :)
(reference is from Java 5 API)
Too late for answer?
I am taking a guess here. Assume the following code:
HashMap extends AbstractMap (does not implement Map)
AbstractMap implements Map
Now imagine someone changes AbstractMap from implementing Map to implementing some java.util.Map1 with exactly the same set of methods as Map.
In this situation there won't be any compilation error in HashMap itself and the JDK compiles (of course, tests will fail and catch this).
Now any client using HashMap as Map m = new HashMap() will start failing. This is much further downstream.
Since AbstractMap, Map and so on all come from the same product, this argument may appear childish (which in all probability it is, or maybe not), but think of a project where the base class comes from a different jar or a third-party library. Then the third party or a different team can change their base implementation.
By implementing the interface in the child class as well, developers try to make the class self-sufficient and proof against API breakage.
In my view, when a class implements an interface, it has to implement all the methods declared in it (by default they are public and abstract).
If we don't want to implement all the methods of the interface, the class must be abstract.
So if some methods are already implemented in an abstract class that implements a particular interface, and we have to supply the methods that were left unimplemented, we implement the original interface in our class again to pick up the remaining methods. It helps in maintaining the contract laid down by the interface.
It would mean rework if we were to implement only the interface and override all of its methods in our class.
I suppose there might be a different way to handle members of the set, the interface, even when supplying the default operation implementation does not serve as a one-size-fits-all. A circular Queue vs. LIFO Queue might both implement the same interface, but their specific operations will be implemented differently, right?
If you only had an abstract class you couldn't make a class of your own which inherits from another class too.

TreeSet in Java

I understand Java best practice suggests that while declaring a variable the generic set interface is used to declare on the left and the specific implementation on the right.
Thus if I have to declare the Set interfaces, the right way is,
Set<String> set = new TreeSet<>();
However with this declaration I'm not able to access the set.last() method.
However when I declare it this way,
TreeSet<String> set = new TreeSet<>();
I can access, last() and first().
Can someone help me understand why?
The last() and first() methods are not declared by the generic Set interface; TreeSet gets them from SortedSet. When you call a method on a variable, the compiler looks at the declared type and not the type of the object assigned to it, so if you store a TreeSet in a Set variable, it can only be treated as a Set. This essentially hides the additional functionality.
As an example, every class in Java extends Object. Therefore this is a totally valid allocation:
final Object mMyList = new ArrayList<String>();
However, we'll never be able to use ArrayList style functionality when referring directly to mMyList without applying type-casting, since all Java can tell is that it's dealing with a generic Object; nothing more.
You can use the SortedSet interface as your declaration (which has the first() and last() methods).
The interface of Set does not provide last() and first() because it does not always make sense for a Set.
Java is a statically typed language. When the compiler sees set.last(), it only knows that set is a Set. It cannot tell whether set is a TreeSet, which provides last(), or a HashSet, which does not. That's why it complains.
Hold on, though. You do not need to declare the variable using the concrete class TreeSet here. TreeSet implements the SortedSet interface, which provides these methods. In other words: because you need to work with a Set that is sorted (for which it makes sense to provide first() and last()), the interface you should work against is SortedSet instead of Set.
Hence what you should be doing is
SortedSet<String> set = new TreeSet<>();
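A short sketch of what that declaration gives you, with NavigableSet shown for comparison. The element values and the class name SortedSetDemo are arbitrary.

```java
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDemo {
    static SortedSet<String> sample() {
        SortedSet<String> set = new TreeSet<>();
        set.add("delta");
        set.add("alpha");
        set.add("charlie");
        return set;
    }

    public static void main(String[] args) {
        SortedSet<String> set = sample();
        System.out.println(set.first());            // smallest element
        System.out.println(set.last());             // largest element
        System.out.println(set.headSet("charlie")); // elements strictly less than "charlie"

        // NavigableSet adds closest-match lookups on top of SortedSet.
        NavigableSet<String> nav = new TreeSet<>(set);
        System.out.println(nav.floor("bravo"));     // greatest element <= "bravo"
        System.out.println(nav.ceiling("bravo"));   // least element >= "bravo"
    }
}
```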
To add to the existing answers here, the reason is that TreeSet implements the interface NavigableSet, which inherits the methods comparator, first, last and spliterator from the interface java.util.SortedSet.
A plain Set, on the other hand, doesn't have the SortedSet methods because it does not inherit from SortedSet.

About the Java interface and polymorphism

I just came across a strange case when reading the Java docs. Here is the link to Oracle's Java doc on the Arrays.asList method, http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#asList(T...)
There is an example in the doc
List<String> stooges = Arrays.asList("Larry", "Moe", "Curly");
My question is, as List is an interface, why can we declare stooges as a 'List', rather than a concrete subclass implementing List(e.g. ArrayList or LinkedList)?
So does it mean that we can have a reference variable of an interface type? It looks quite weird to me, as I always thought that interfaces stand only for polymorphism and that we should never really use an interface-typed variable.
Could anyone please give me some clue on this?
Think of the List interface as a guarantee. Any class that implements List will be guaranteed to have the methods of the interface. When Arrays.asList() returns a List you're not actually getting an interface, you're getting a concrete class that is guaranteed to implement the methods listed in the List interface.
As to your "we should never really use a interface type variable": you're actually supposed to do that. It's called "programming to the interface". It's much more flexible if you can return a List as opposed to something like a LinkedList. The caller of your method isn't coupled to your internal implementation, which might use, and return, a LinkedList. If at some point you wanted to return an ArrayList instead of the LinkedList, the caller would not have to change any code, because they only care about the interface.
What does it mean to "program to an interface"?
Just a word of note, Serializable is a marker interface and a little odd because of that. It doesn't guarantee that methods are there, but instead guarantees that the creator of the class that implements serializable has thought about the many issues associated with serializing a class (overriding readObject/writeObject, compatiblity with other serialized forms, and other issues http://www.javapractices.com/topic/TopicAction.do?Id=45). So Serializable is still offering a guarantee, like List is, but it isn't about method signatures, it's about an extralinguistic feature of the language.
http://en.wikipedia.org/wiki/Marker_interface_pattern
Using an interface as a reference type is a perfectly valid practice in Java. For example, a method can declare a parameter of type Serializable so that any serializable object can be passed to it.
This is also how Java provides something that resembles Multiple Inheritance. For example:
public interface A { }
public class B implements A { }
public class Program {
    B bObject = new B();
    A aObject = bObject; // no cast needed: every B is an A
}
That way the same object can be referenced with different reference types, and all without messing up an inheritance chain!
The interface defines a contract or a specification for an implementation. Which is the methods and their signature. So a class that implements an interface has to respect that contract. This way you can change implementation without affecting the code that uses interfaces for declaring variables.
In the example you mentioned:
You don't know what implementation of the List interface Arrays.asList is using unless you look into the code. So how would you know which one to use? (see javadoc for list interface to see what implementations it has)
The implementation is subject to change. What if Arrays.asList decides to use another implementation? Your code would break.
The signature of the method Arrays.asList is that it returns List<T> so if you want to have a concrete implementation as variable you'll have to cast that return value which is bad practice or to create new - let's say ArrayList - and copy all the elements into it, which is just an unnecessary overhead.
Effective Java by Bloch is a great book on Java best practices. In particular, item #52 talks about this: "If the appropriate interface types exist ... declared using the interface types."
The general notion is that, for greatest flexibility and understandability, you should use the type that best reflects the context, which is usually the interface. In the example you provided, does the exact implementation matter, or just that it is a List? Of course, if the code requires an ArrayList-specific method or relies on ArrayList-specific behavior, then use the concrete class.
There are occasional exceptions, such as when using GWT-RPC, but this is for implementation reasons.
This is a really good example of the power of polymorphism. If you look at the source code of Arrays.asList(T... a), you will find that it takes variable-length input and defines its own private static concrete class ArrayList that implements the List interface, rather than using the well-known java.util.ArrayList or another Java collection type.
This may be to make it more efficient or something similar: you can implement your own class and return it to the user without overwhelming them with implementation details, since there is an interface through which they can deal with your private class.
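You can observe this from the outside without reading the source. The sketch below uses illustrative helper names, and the printed class name is an implementation detail that may differ between JDKs. The documented behaviour is that the returned list is fixed-size, so add is expected to throw UnsupportedOperationException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListDemo {
    static boolean isJavaUtilArrayList(List<?> list) {
        return list.getClass() == ArrayList.class;
    }

    // Returns false when the list rejects structural modification.
    static boolean supportsAdd(List<String> list) {
        try {
            list.add("Shemp");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> stooges = Arrays.asList("Larry", "Moe", "Curly");
        System.out.println(stooges.getClass().getName());
        System.out.println(isJavaUtilArrayList(stooges));
        System.out.println(supportsAdd(stooges));

        // Copy into a java.util.ArrayList when a resizable list is needed.
        List<String> copy = new ArrayList<>(stooges);
        System.out.println(supportsAdd(copy));
    }
}
```

Code that only relies on the List contract works with either object, which is the point of returning the interface type.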

Why do we first declare subtypes as their supertype before we instantiate them?

Reading other people's code, I've seen a lot of:
List<E> ints = new ArrayList<E>();
Map<K, V> map = new HashMap<K, V>();
My question is: what is the point/advantage of instantiating them that way as opposed to:
ArrayList<E> ints = new ArrayList<E>();
HashMap<K, V> map = new HashMap<K, V>();
What also makes it odd is that I've never seen anything like:
CharSequence s = new String("String");
or
OutputStream out = new PrintStream(OutputStream);
Duplicates (of the first part of the question):
When/why to use/define an interface
Use interface or type for variable definition in java?
When should I use an interface in java?
why are interfaces created instead of their implementations for every class
What's the difference between these two java variable declarations?
Quick answer? Using interfaces and superclasses increases the portability and maintainability of your code, principally by hiding implementation detail. Take the following hypothetical example:
class Account {
    private Collection<Transaction> transactions;
    public Account() {
        super();
        transactions = new ArrayList<Transaction>(4);
    }
    public Collection<Transaction> getTransactions() {
        return transactions;
    }
}
I've declared a contract for an Account that states that the transactions posted to the account can be retrieved as a Collection. The callers of my code don't have to care what kind of collection my method actually returns, and shouldn't. And that frees me to change up the internal implementation if I need to, without impacting (aka breaking) unknown number of clients. So to wit, if I discover that I need to impose some kind of uniqueness on my transactions, I can change the implementation shown above from an ArrayList to a HashSet, with no negative impact on anyone using my class.
public Account() {
    super();
    transactions = new HashSet<Transaction>(4);
}
As for your second question, you apply the principles of portability and encapsulation wherever they make sense. There are not a lot of CharSequence implementations out there, and String is by far the most commonly used. So you just won't see a lot of developers declaring CharSequence variables in their code.
Using interfaces has the main advantage that you can later change the implementation (the class) without the need to change more than the single line where you create the instance and do the assignment.
For
List<E> ints = new ArrayList<E>();
Map<K, V> map = new HashMap<K, V>();
List and Map are the interfaces, so any class implementing those interfaces can be assigned to these references.
ArrayList is one of the several classes (another is LinkedList) which implement List interface.
Same with Map. HashMap, LinkedHashMap, TreeMap all implement Map.
It is a general principle to program to interfaces and not to implementations. This makes the programming task easier, because you can change the implementation behind a reference without touching the code that uses it.
If you write
ArrayList<E> ints = new ArrayList<E>();
HashMap<K, V> map = new HashMap<K, V>();
ints and map will be ArrayList and HashMap only, forever.
It is a design principle that you program to the interface and not to the implementation.
That way you may provide later a new implementation to the same interface.
From the linked source, Erich Gamma explains:
This principle is really about dependency relationships which have to be carefully managed in a large app. It's easy to add a dependency on a class. It's almost too easy; just add an import statement and modern Java development tools like Eclipse even write this statement for you. Interestingly the inverse isn't that easy and getting rid of an unwanted dependency can be real refactoring work or even worse, block you from reusing the code in another context. For this reason you have to develop with open eyes when it comes to introducing dependencies. This principle tells us that depending on an interface is often beneficial.
Here, the term interface refers not only to the Java artifact, but to the public interface a given object has, which is basically composed of the methods it has. So it could be a Java interface (like List in your example) or a concrete superclass.
So in your example, if you ever want to use a LinkedList instead, it would be harder because the type is already declared as ArrayList when just List would've been enough.
Of course, if you need specific methods from a given implementation, you have to declare it of that type.
I hope this helps.
@Bhushan answered why.
To answer your confusion Why nobody uses
CharSequence s = new String("String");
or
OutputStream out = new PrintStream(OutputStream);
CharSequence contains only a few common methods. The other classes that implement this interface are mostly buffers, and only String is immutable. CharSequence defines a common API for classes backed by a char array, and the interface does not refine the general contracts of the equals and hashCode methods (see the javadoc).
OutputStream is a low-level API for writing data. Because PrintStream adds convenient methods for writing (a higher level of abstraction), it is used in preference to OutputStream.
You do this to make sure later when working with the variable you (or anyone using your classes) won't rely on methods specific for the implementation chosen (ArrayList, HashMap, etc.)
The reason behind this is not technical but the stuff you have to read between the lines of code: The List and Map examples says: "I'm only interested in basic list/map stuff, basically you can use anything here." An extreme example of that would be
Iterable<Foo> items = new ArrayList<Foo>();
when you really only want to do some stuff for each thing.
As an added bonus this makes it a little easier to refactor the code later into common utility classes/methods where the concrete type is not required. Or do you want to code your algorithm multiple times for each kind of collection?
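A sketch of such a utility method, assuming a made-up task (summing the lengths of some character sequences). It asks only for an Iterable, so one implementation serves lists, sets and anything else that can be iterated.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IterableDemo {
    // Hypothetical utility: it only iterates, so Iterable is all it asks for.
    static int totalLength(Iterable<? extends CharSequence> items) {
        int total = 0;
        for (CharSequence item : items) {
            total += item.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("foo");
        list.add("quux");

        Set<StringBuilder> set = new HashSet<>();
        set.add(new StringBuilder("ab"));

        System.out.println(totalLength(list)); // a List of String works
        System.out.println(totalLength(set));  // so does a Set of StringBuilder
    }
}
```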
The String example, on the other hand, is not seen widely, because a) String is a special class in Java: each "foo" literal is automatically a String, and sooner or later you have to pass the characters to some method which only accepts String, and b) CharSequence is really minimal. It does not even support Unicode beyond the BMP properly, and it lacks most of the query/manipulation methods of String.
This (good) style of declaring the type as the Interface the class implements is important because it forces us to use methods only defined in the Interface.
As a result, when we need to change our class implementations (i.e. we find our ArraySet is better than the standard HashSet) we are guaranteed that if we change the class our code will work because both classes implement the strictly-enforced Interface.
It is just easier to think of String as of String. As well as it's easier (and more beneficial) to think of WhateverList as of List.
The bonuses are discussed many times, but in brief you simply separate the concerns: when you need a CharSequence, you use it. It's highly unlikely that you need ArrayList only: usually, any List will do.
When you at some point decide to use a different implementation, say:
List<E> ints = new LinkedList<E>();
instead of
List<E> ints = new ArrayList<E>();
this change needs to be done only at a single place.
There is the right balance to strike:
usually you use the type which gives you the most appropriate guarantees. Obviously, a List is also a Collection which is also something Iterable. But a collection does not give you an order, and an iterable does not have an "add" method.
Using ArrayList for the variable type is also reasonable when you want to be a bit more explicit about the need for fast random access by position; in a LinkedList, a get(100) is a lot slower. (Java does have the RandomAccess marker interface for this, but it declares no methods and does not extend List, so it cannot serve as the declared type on its own. By using ArrayList, you disallow passing an array wrapped as a list.)
List<E> ints = new ArrayList<E>();
If you write some code that deals only with List then it will work for any class that implements List (e.g. LinkedList, etc). But, if your code directly deals with ArrayList then it's limited to ArrayList.
CharSequence s = new String("String");
Manually instantiating a String object is not good. You should use a string literal instead. I am just guessing, but the reason you don't see CharSequence might be that it's quite new; also, strings are immutable.
This is programming to the interface not the implementation, as per the Gang of Four. This will help to stop the code becoming dependent on methods that are added to particular implementations only, and make it easier to change to use a different implementation if that becomes necessary for whatever reason, e.g. performance.

Why should the interface for a Java class be preferred?

PMD would report a violation for:
ArrayList<Object> list = new ArrayList<Object>();
The violation was "Avoid using implementation types like 'ArrayList'; use the interface instead".
The following line would correct the violation:
List<Object> list = new ArrayList<Object>();
Why should the latter with List be used instead of ArrayList?
Using interfaces over concrete types is the key for good encapsulation and for loose coupling your code.
It's even a good idea to follow this practice when writing your own APIs. If you do, you'll find later that it's easier to add unit tests to your code (using Mocking techniques), and to change the underlying implementation if needed in the future.
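To make the testing point concrete, here is a small sketch in which the names RateSource and Converter are invented for illustration. Because Converter depends only on an interface, a test can hand it a fixed stub instead of a real implementation.

```java
public class StubDemo {
    // Hypothetical dependency, expressed as an interface.
    interface RateSource {
        double rateFor(String currency);
    }

    // Hypothetical class under test; it depends only on the interface.
    static class Converter {
        private final RateSource rates;

        Converter(RateSource rates) {
            this.rates = rates;
        }

        double convert(double amount, String currency) {
            return amount * rates.rateFor(currency);
        }
    }

    public static void main(String[] args) {
        // A hand-written stub stands in for a real (perhaps remote) rate service.
        RateSource fixed = currency -> 2.0;
        Converter converter = new Converter(fixed);
        System.out.println(converter.convert(10.0, "EUR"));
    }
}
```

A mocking library would generate the stub for you, but the design requirement is the same: the dependency has to be declared by its interface.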
Here's a good article on the subject.
Hope it helps!
This is preferred because you decouple your code from the implementation of the list. Using the interface lets you easily change the implementation, ArrayList in this case, to another list implementation without changing any of the rest of the code as long as it only uses methods defined in List.
In general I agree that decoupling interface from implementation is a good thing and will make your code easier to maintain.
There are, however, exceptions that you must consider. Accessing objects through interfaces adds an additional layer of indirection that will make your code slower.
For interest I ran an experiment that generated ten billion sequential accesses to a 1 million length ArrayList. On my 2.4Ghz MacBook, accessing the ArrayList through a List interface took 2.10 seconds on average, when declaring it of type ArrayList it took on average 1.67 seconds.
If you are working with large lists, deep inside an inner loop or frequently called function, then this is something to consider.
ArrayList and LinkedList are two implementations of a List, which is an ordered collection of items. Logic-wise it doesn't matter if you use an ArrayList or a LinkedList, so you shouldn't constrain the type to be that.
This contrasts with, say, Collection and List, which are different things (List implies ordering, Collection does not).
Why should the latter with List be used instead of ArrayList?
It's a good practice : Program to interface rather than implementation
By replacing ArrayList with List, you can change List implementation in future as below depending on your business use case.
List<Object> list = new LinkedList<Object>();
/* Doubly-linked list implementation of the List and Deque interfaces.
Implements all optional list operations, and permits all elements (including null).*/
OR
List<Object> list = new CopyOnWriteArrayList<Object>();
/* A thread-safe variant of ArrayList in which all mutative operations
(add, set, and so on) are implemented by making a fresh copy of the underlying array.*/
OR
List<Object> list = new Stack<Object>();
/* The Stack class represents a last-in-first-out (LIFO) stack of objects.*/
OR
some other List specific implementation.
List interface defines contract and specific implementation of List can be changed. In this way, interface and implementation are loosely coupled.
Related SE question:
What does it mean to "program to an interface"?
Even for local variables, using the interface over the concrete class helps. You may end up calling a method that is outside the interface and then it is difficult to change the implementation of the List if necessary.
Also, it is best to use the least specific class or interface in a declaration. If element order does not matter, use a Collection instead of a List. That gives your code the maximum flexibility.
Properties of your classes/interfaces should be exposed through interfaces because it gives your classes a contract of behavior to use, regardless of the implementation.
However...
In local variable declarations, it makes little sense to do this:
public void someMethod() {
    List theList = new ArrayList();
    //do stuff with the list
}
If it's a local variable, just use the type. It is still implicitly upcastable to its appropriate interface, and your methods should hopefully accept the interface types for their arguments, but for local variables it makes total sense to use the implementation type as a container, just in case you do need the implementation-specific functionality.
In general, for your line of code it does not make sense to bother with interfaces. But if we are talking about APIs, there is a really good reason. Say I have a small class:
class Counter {
    static int sizeOf(List<?> items) {
        return items.size();
    }
}
In this case the use of the interface is required, because I want to count the size of every possible implementation, including my own custom one: class MyList extends AbstractList<String>....
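A runnable version of that idea might look like this. The body of MyList is an assumption, since the answer only names the class; everything else follows the Counter snippet.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class CounterDemo {
    static int sizeOf(List<?> items) {
        return items.size();
    }

    // Hypothetical custom list; only its name comes from the answer.
    static class MyList extends AbstractList<String> {
        private final String[] data;

        MyList(String... data) {
            this.data = data.clone();
        }

        @Override
        public String get(int index) {
            return data[index];
        }

        @Override
        public int size() {
            return data.length;
        }
    }

    public static void main(String[] args) {
        // The same method accepts every List implementation.
        System.out.println(sizeOf(new ArrayList<>(Arrays.asList(1, 2, 3))));
        System.out.println(sizeOf(new LinkedList<>(Arrays.asList("a", "b"))));
        System.out.println(sizeOf(new MyList("x")));
    }
}
```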
An interface is what is exposed to the end user. One class can implement multiple interfaces. A user who is exposed to a specific interface has access to the specific behavior defined in that particular interface.
One interface can also have multiple implementations. Depending on the scenario, the system will work with a different implementation of the interface.
let me know if you need more explanation.
The interface often has better representation in the debugger view than the concrete class.
