I understand that Java best practice suggests declaring a variable with the general interface type on the left and the specific implementation on the right.
So if I have to declare a Set, the right way is:
Set<String> set = new TreeSet<>();
However with this declaration I'm not able to access the set.last() method.
However when I declare it this way,
TreeSet<String> set = new TreeSet<>();
I can access last() and first().
Can someone help me understand why?
The last() and first() methods are not part of the general Set interface; they are declared by SortedSet, which TreeSet implements (through NavigableSet). When you call a method on a variable, the compiler looks at the variable's declared type, not at the type of the object assigned to it, so a TreeSet stored as a Set may only be treated as a Set. This essentially hides the additional functionality.
As an example, every class in Java extends Object. Therefore this is a totally valid allocation:
final Object mMyList = new ArrayList<String>();
However, we'll never be able to use ArrayList style functionality when referring directly to mMyList without applying type-casting, since all Java can tell is that it's dealing with a generic Object; nothing more.
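As a minimal sketch of that point (the class and method names here are made up for illustration), the following shows that a list held in an Object reference has to be cast back before any list method can be called on it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question.
class DeclaredTypeDemo {
    // Returns the size of a list hidden behind an Object reference, or -1 if it is not a list.
    static int sizeViaCast(Object ref) {
        // ref.size() would not compile here: Object declares no size() method.
        if (ref instanceof List) {
            List<?> list = (List<?>) ref; // the explicit cast restores the List view
            return list.size();
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("a");
        Object ref = names; // valid: every class extends Object
        System.out.println(sizeViaCast(ref)); // prints 1
    }
}
```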
You can use the SortedSet interface as your declaration (which has the first() and last() methods).
The Set interface does not provide last() and first() because they do not make sense for every Set.
Java is a statically typed language. When the compiler sees set.last(), it knows only that set is a Set. It cannot tell whether set refers to a TreeSet, which provides last(), or to a HashSet, which does not. That's why it complains.
Hold on, though. You do not need to declare it using the concrete class TreeSet here. TreeSet implements the SortedSet interface, which provides these methods. In other words: because you need to work with a Set that is sorted (for which it makes sense to provide first() and last()), the interface you should work against is SortedSet instead of Set.
Hence what you should be doing is
SortedSet<String> set = new TreeSet<>();
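A short sketch of that declaration in use; the class name, the helper method and the sample strings are assumptions added for illustration:

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class.
class SortedSetDemo {
    // Works with any SortedSet implementation, not only TreeSet.
    static String range(SortedSet<String> set) {
        return set.first() + ".." + set.last();
    }

    public static void main(String[] args) {
        SortedSet<String> set = new TreeSet<>();
        set.add("pear");
        set.add("apple");
        set.add("mango");
        System.out.println(range(set)); // prints apple..pear
    }
}
```

The variable is still declared against an interface, so the implementation can be swapped for another SortedSet later.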
To add to the existing answers here, the reason is that TreeSet implements the interface NavigableSet, which inherits the methods comparator, first, last and spliterator from the interface java.util.SortedSet.
Plain Set, on the other hand, doesn't have the SortedSet methods because it does not inherit from SortedSet.
PMD would report a violation for:
ArrayList<Object> list = new ArrayList<Object>();
The violation was "Avoid using implementation types like 'ArrayList'; use the interface instead".
The following line would correct the violation:
List<Object> list = new ArrayList<Object>();
Why should the latter with List be used instead of ArrayList?
Using interfaces over concrete types is the key to good encapsulation and to loosely coupling your code.
It's even a good idea to follow this practice when writing your own APIs. If you do, you'll find later that it's easier to add unit tests to your code (using Mocking techniques), and to change the underlying implementation if needed in the future.
Here's a good article on the subject.
Hope it helps!
This is preferred because you decouple your code from the implementation of the list. Using the interface lets you easily change the implementation, ArrayList in this case, to another list implementation without changing any of the rest of the code as long as it only uses methods defined in List.
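To illustrate, here is a small sketch (class and method names are hypothetical) in which only the right-hand side of the assignment changes while the code that uses the list stays the same:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class.
class SwapListDemo {
    // Uses only methods reachable through List, so any implementation is accepted.
    static String joinAll(List<String> items) {
        return String.join(",", items);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        System.out.println(joinAll(list)); // prints a,b

        // Swapping the implementation touches this one line only.
        list = new LinkedList<>(list);
        System.out.println(joinAll(list)); // prints a,b
    }
}
```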
In general I agree that decoupling interface from implementation is a good thing and will make your code easier to maintain.
There are, however, exceptions that you must consider. Accessing objects through interfaces can add a layer of indirection that may make your code slower.
Out of interest I ran an experiment that generated ten billion sequential accesses to an ArrayList of 1 million elements. On my 2.4 GHz MacBook, accessing the ArrayList through a List reference took 2.10 seconds on average; declaring it as ArrayList took 1.67 seconds on average.
If you are working with large lists, deep inside an inner loop or frequently called function, then this is something to consider.
ArrayList and LinkedList are two implementations of a List, which is an ordered collection of items. Logic-wise it doesn't matter if you use an ArrayList or a LinkedList, so you shouldn't constrain the type to be that.
This contrasts with, say, Collection and List, which are different things (List implies an ordered sequence, Collection does not).
Why should the latter with List be used instead of ArrayList?
It's good practice: program to an interface rather than an implementation.
By replacing ArrayList with List, you can change the List implementation in the future, as below, depending on your business use case.
List<Object> list = new LinkedList<Object>();
/* Doubly-linked list implementation of the List and Deque interfaces.
Implements all optional list operations, and permits all elements (including null).*/
OR
List<Object> list = new CopyOnWriteArrayList<Object>();
/* A thread-safe variant of ArrayList in which all mutative operations
(add, set, and so on) are implemented by making a fresh copy of the underlying array.*/
OR
List<Object> list = new Stack<Object>();
/* The Stack class represents a last-in-first-out (LIFO) stack of objects.*/
OR
some other List specific implementation.
The List interface defines the contract, and the specific implementation of List can be changed. In this way, interface and implementation are loosely coupled.
Related SE question:
What does it mean to "program to an interface"?
Even for local variables, using the interface over the concrete class helps. Otherwise you may end up calling a method that is outside the interface, and then it is difficult to change the implementation of the List if necessary.
Also, it is best to use the least specific class or interface in a declaration. If element order does not matter, use a Collection instead of a List. That gives your code the maximum flexibility.
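As a rough sketch of the "least specific type" advice (the names below are assumptions), a method that only iterates can take a Collection and then accepts lists and sets alike:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class.
class LeastSpecificDemo {
    // Element order does not matter for a sum, so Collection is enough.
    static int totalLength(Collection<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("ab", "cde");
        Set<String> set = new HashSet<>(list);
        System.out.println(totalLength(list)); // prints 5
        System.out.println(totalLength(set));  // prints 5
    }
}
```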
Properties of your classes/interfaces should be exposed through interfaces because it gives your classes a contract of behavior to use, regardless of the implementation.
However...
In local variable declarations, it makes little sense to do this:
public void someMethod() {
List theList = new ArrayList();
//do stuff with the list
}
If it's a local variable, just use the concrete type. It can still be implicitly upcast to the appropriate interface, and your methods should hopefully accept interface types for their arguments, but for local variables it makes total sense to use the implementation type as the container, just in case you do need the implementation-specific functionality.
In general, for your single line of code it does not make sense to bother with interfaces. But if we are talking about APIs, there is a really good reason. Take this small class:
class Counter {
static int sizeOf(List<?> items) {
return items.size();
}
}
In this case the interface is required, because I want to count the size of every possible implementation, including my own custom one, e.g. class MyList extends AbstractList<String>....
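A possible completion of that idea, as a sketch only: the body of MyList below is invented for illustration (a fixed, read-only list of three strings), since the answer elides it.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

class Counter {
    static int sizeOf(List<?> items) {
        return items.size();
    }
}

// Assumed implementation: AbstractList only requires get() and size().
class MyList extends AbstractList<String> {
    private final String[] data = {"x", "y", "z"};

    @Override
    public String get(int index) {
        return data[index];
    }

    @Override
    public int size() {
        return data.length;
    }
}

// Hypothetical driver class.
class CounterDemo {
    public static void main(String[] args) {
        System.out.println(Counter.sizeOf(new MyList()));            // prints 3
        System.out.println(Counter.sizeOf(new ArrayList<String>())); // prints 0
    }
}
```

Had sizeOf been declared to take an ArrayList, the custom MyList could not be passed to it.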
An interface is what is exposed to the end user. One class can implement multiple interfaces. A user who is exposed to a specific interface has access only to the behavior defined in that particular interface.
One interface can also have multiple implementations. Depending on the scenario, the system will work with a different implementation of the interface.
Let me know if you need more explanation.
The interface often has better representation in the debugger view than the concrete class.
So I have to use a TreeSet in my code.
As TreeSet<E> extends AbstractSet<E> implements NavigableSet<E>, Cloneable, java.io.Serializable
and interface NavigableSet<E> extends SortedSet<E> which extends Set<E>
I can use any of these three declaration:
NavigableSet<String> myTreeSet= new TreeSet<>();
SortedSet<String> myTreeSet= new TreeSet<>();
Set<String> myTreeSet= new TreeSet<>();
I know I will have access only to those methods exposed by the interface I use in the declaration. Is there any other reason to consider when selecting a particular declaration for a TreeSet?
It's basically about what you allow others (or yourself) to use, as you stated. Other methods you would like to use with your TreeSet might depend on the declared type. For example, there might be a method requiring a SortedSet; if you declare your TreeSet as a Set, you will not be able to pass it to that method without a cast.
Please, don't make premature decisions. If you don't need methods from NavigableSet, don't use it. Just use Set<String>.
You can use whichever interface you like. What makes the difference is which interface you are programming to. I mean programming to an interface (the declared type) and not to the actual TreeSet collection.
Here is the best answer which explains Programming to an interface.
As a guiding rule, always choose the most general type possible, because this decouples the client code from the concrete implementation. Most of the time the calling code does not need to know which implementation is used; you just want to provide the behaviour, not expose how it is implemented. That said, sometimes you will need a little more control; then you should walk down the interface hierarchy until you find the necessary level of control. But this should be the exception, not the rule.
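The sketch below (hypothetical class and method names) shows what each level of the hierarchy buys you: each helper asks for the most general interface that still declares the method it needs.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class.
class TreeSetViewsDemo {
    // Needs NavigableSet: floor() is not declared by SortedSet or Set.
    static Integer closestAtOrBelow(NavigableSet<Integer> values, int limit) {
        return values.floor(limit);
    }

    // Needs only SortedSet: first() is declared there.
    static Integer smallest(SortedSet<Integer> values) {
        return values.first();
    }

    // Needs only Set: contains() is enough.
    static boolean has(Set<Integer> values, int candidate) {
        return values.contains(candidate);
    }

    public static void main(String[] args) {
        TreeSet<Integer> tree = new TreeSet<>(Arrays.asList(10, 20, 30));
        System.out.println(closestAtOrBelow(tree, 25)); // prints 20
        System.out.println(smallest(tree));             // prints 10
        System.out.println(has(tree, 30));              // prints true
    }
}
```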
It's always said that it's better to declare a collection object as below:
1) List st = new LinkedList();
2) Map mp = new HashMap();
Than
3) LinkedList st = new LinkedList();
4) HashMap mp = new HashMap();
I agree that by declaring as above (1, 2) I can reassign the same variable (st, mp) to other objects of the List or Map interface.
But here I can't use the methods that are defined only in LinkedList or HashMap, which is correct, as those are not visible through List or Map. (Please correct me if I am wrong.)
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality from these.
Then why is it said that the best way to create a collection object is as done in (1, 2)?
Because most of the time you don't need the special methods. If you need the special methods, then obviously you need to reference the specific type.
Lesson for today: Don't blindly apply programming principles without using your own brain.
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality from these.
In that case, you should absolutely declare the variable using the concrete class. That's fine.
The point of using the interface instead is to indicate that you only need the functionality exposed by that interface, leaving you open to potentially change implementation later. (Although you'd need to be careful of the performance and even behavioural implications of which concrete implementation you choose.)
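One case where the concrete type can be a reasonable choice, sketched with made-up names: LinkedList implements both List and Deque, so declaring the variable as LinkedList keeps positional access and deque-style operations available on the same reference.

```java
import java.util.LinkedList;

// Hypothetical demo class.
class ConcreteTypeDemo {
    static String describe(LinkedList<String> tasks) {
        tasks.addFirst("urgent");     // deque-style operation, not declared by List
        String second = tasks.get(1); // positional access, not declared by Deque
        return tasks.getFirst() + "," + second;
    }

    public static void main(String[] args) {
        LinkedList<String> tasks = new LinkedList<>();
        tasks.add("routine");
        System.out.println(describe(tasks)); // prints urgent,routine
    }
}
```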
I agree that by declaring as above (1, 2) I can reassign the same variable (st, mp) to other objects of the List or Map interface.
Yes, it's a general practice called programming against interfaces.
But here I can't use the methods that are defined only in LinkedList or HashMap, which is correct, as those are not visible through List or Map. (Please correct me if I am wrong.)
No, you are right.
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality from these.
Then why is it said that the best way to create a collection object is as done in (1, 2)?
This isn't the best way. If you need to use specific methods of those classes, you need a reference of the concrete type. If those collections are used from a client class that is not supposed to know the internal implementation, then it's better to expose only the interface.
Through interfaces you define service contracts. As you say, should you change the underlying implementation of a given interface, you can do it seamlessly, without any impact on your current code.
If you need any particular behaviour of a particular class, it's absolutely right to use it. Maps usually extend the AbstractMap class, which itself implements Map, so the subclasses inherit those methods.
Of course, some classes throw UnsupportedOperationException from certain methods defined in the Map interface, so changing the implementation type is not always seamless (but in most cases it is, because each map has a particular strength that makes it the most appropriate choice for a given context).
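A small sketch of that caveat, using hypothetical names: the JDK's read-only map wrappers throw UnsupportedOperationException from optional operations such as put(), so code written against Map can behave differently once the implementation changes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class.
class OptionalOperationDemo {
    // Returns true if the map accepted the new entry, false if it refused it.
    static boolean tryPut(Map<String, Integer> map) {
        try {
            map.put("answer", 42);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> mutable = new HashMap<>();
        Map<String, Integer> readOnly = Collections.unmodifiableMap(new HashMap<String, Integer>());
        System.out.println(tryPut(mutable));  // prints true
        System.out.println(tryPut(readOnly)); // prints false
    }
}
```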
Use the type that suits you, not the one that someone says it's the correct one. Every rule has exceptions.
Because if you use the interface to access the collections, you are free to change the implementation, e.g. use an ArrayList instead of a LinkedList, or a synchronized version of it.
This mostly applies to cases where you have a Collection in the public interface of a class; internally I wouldn't bother, just use what you need.
With respect to the following two different definitions of sets, what are the differences:
Set<Integer> intset = new HashSet<Integer>();
Set<Integer> intset = new Set<Integer>();
Thanks.
Since Set is an interface, the second won't compile.
You can't write new Set because it's an interface. All of those (Set, Map, List) are interfaces in the java.util package. They are not directly instantiable; an implementation (HashSet, ArrayList, HashMap) must be supplied on the right-hand side of the assignment.
The second one won't even compile. Often people ask what's the difference between these:
HashSet<Integer> intset = new HashSet<Integer>();
Set<Integer> intset = new HashSet<Integer>();
and perhaps that's what you meant to ask. The difference here is that code written using the first definition is dependent on the particular choice of Set implementation (HashSet vs. TreeSet or something else) whereas the second declaration would let you trivially change to a different implementation without modifying any other code. It's a good practice in general -- keeps you flexible.
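As an illustrative sketch (class and method names are assumptions), declaring the variable as Set lets the implementation change on one line. Note that the swap can still change observable behaviour such as iteration order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class.
class SetSwapDemo {
    // Copies the elements in whatever order the given Set iterates them.
    static List<Integer> inIterationOrder(Set<Integer> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        Set<Integer> intset = new HashSet<>(Arrays.asList(3, 1, 2));
        System.out.println(inIterationOrder(intset)); // HashSet makes no ordering guarantee

        intset = new TreeSet<>(intset);               // only this line changes
        System.out.println(inIterationOrder(intset)); // prints [1, 2, 3]
    }
}
```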
java.util.Set is an interface while java.util.HashSet is an actual implementation.
You cannot instantiate a Set the way you are in your second definition.
Set is an interface and cannot be instantiated; it simply defines a contract that concrete implementations follow.
You can, however, instantiate an anonymous inner class that would follow Set's interface:
Set<Integer> intSet = new Set<Integer>() {
//need to define all set methods here...
};
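Implementing every Set method by hand is tedious. As a sketch under the assumption that a read-only set is enough, an anonymous subclass of java.util.AbstractSet only has to supply iterator() and size(); the class and method names below are hypothetical.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;

// Hypothetical demo class.
class AnonymousSetDemo {
    // Builds a read-only Set holding exactly one element.
    static Set<Integer> singleElementSet(final Integer value) {
        return new AbstractSet<Integer>() {
            @Override
            public Iterator<Integer> iterator() {
                return Collections.singletonList(value).iterator();
            }

            @Override
            public int size() {
                return 1;
            }
        };
    }

    public static void main(String[] args) {
        Set<Integer> intSet = singleElementSet(42);
        System.out.println(intSet.contains(42)); // prints true
        System.out.println(intSet.size());       // prints 1
    }
}
```

Methods such as contains() and equals() are inherited from the abstract base classes and work through the iterator.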
In Java, Set is an interface and thus cannot be instantiated, so the second one is wrong.
But the actual difference in terminology is that a Set is a mathematical concept conforming to some rules (e.g. uniqueness, order not mattering). A HashSet is one technique for implementing the Set concept, using a hash table, which makes it computationally very fast: expected constant-time insertion, deletion and lookup.
Set is an interface that declares abstract methods for specific Set implementations to provide. You can't instantiate a Set directly, the same way you can't instantiate a List. An interface is similar to a class, but a class can implement multiple interfaces while extending only a single class. Also, a class can contain both abstract and concrete methods, whereas before Java 8 everything in an interface was abstract (since Java 8, interfaces may also contain default and static methods). Interfaces are sort of a way of dealing with multiple inheritance.
Anyway:
http://download.oracle.com/javase/6/docs/api/java/util/Set.html
http://download.oracle.com/javase/tutorial/java/concepts/interface.html
Compiler error on line:
Set<Integer> intset = new Set<Integer>();