Why should the interface for a Java class be preferred? - java

PMD would report a violation for:
ArrayList<Object> list = new ArrayList<Object>();
The violation was "Avoid using implementation types like 'ArrayList'; use the interface instead".
The following line would correct the violation:
List<Object> list = new ArrayList<Object>();
Why should the latter with List be used instead of ArrayList?

Using interfaces over concrete types is the key to good encapsulation and to loosely coupling your code.
It's even a good idea to follow this practice when writing your own APIs. If you do, you'll find later that it's easier to add unit tests to your code (using Mocking techniques), and to change the underlying implementation if needed in the future.
Here's a good article on the subject.
Hope it helps!

This is preferred because you decouple your code from the implementation of the list. Using the interface lets you easily change the implementation, ArrayList in this case, to another list implementation without changing any of the rest of the code as long as it only uses methods defined in List.
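A minimal sketch of that point, assuming a hypothetical helper named joinAll (not part of any library) that relies only on methods declared in List. The implementation is chosen on a single line, and the helper does not need to change when that choice changes:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SwapImplementationDemo {
    // Hypothetical helper: it uses only size() and get(), both declared in List.
    static String joinAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(items.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The concrete class appears only where the list is created.
        List<String> names = new ArrayList<>();
        names.add("a");
        names.add("b");
        System.out.println(joinAll(names));                    // prints "a, b"

        // Switching implementation touches only the construction site.
        List<String> linked = new LinkedList<>(names);
        System.out.println(joinAll(linked));                   // prints "a, b"
    }
}
```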

In general I agree that decoupling interface from implementation is a good thing and will make your code easier to maintain.
There are, however, exceptions that you must consider. Accessing objects through interfaces adds a layer of indirection that can make your code slower.
Out of interest I ran an experiment that generated ten billion sequential accesses to an ArrayList of length 1 million. On my 2.4 GHz MacBook, accessing the ArrayList through a List reference took 2.10 seconds on average; with the variable declared as ArrayList it took 1.67 seconds on average.
If you are working with large lists, deep inside an inner loop or frequently called function, then this is something to consider.
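The benchmark code itself is not shown in the answer, so the following is only a rough sketch of how such a comparison could be set up. The class name, method names, list size, and round count are assumptions, and the workload is far smaller than the ten billion accesses described. Naive System.nanoTime timing is easily distorted by JIT warm-up and other JVM effects, so treat any numbers it prints as indicative at best; a dedicated harness would be more trustworthy.

```java
import java.util.ArrayList;
import java.util.List;

class AccessTimingSketch {
    // Reads every element through a List-typed reference.
    static long sumViaInterface(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Same loop, but the parameter is declared with the concrete type.
    static long sumViaConcrete(ArrayList<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            data.add(i);
        }
        // Repeat a few rounds so later rounds run after some JIT warm-up.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = sumViaInterface(data);
            long t1 = System.nanoTime();
            long b = sumViaConcrete(data);
            long t2 = System.nanoTime();
            System.out.printf("List: %d us, ArrayList: %d us (sums %d, %d)%n",
                    (t1 - t0) / 1_000, (t2 - t1) / 1_000, a, b);
        }
    }
}
```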

ArrayList and LinkedList are two implementations of List, which is an ordered collection of items. Logic-wise it doesn't matter whether you use an ArrayList or a LinkedList, so you shouldn't constrain the declared type to either one.
This contrasts with, say, Collection and List, which are different things (List implies a defined element order, Collection does not).

Why should the latter with List be used instead of ArrayList?
It's good practice: program to an interface rather than to an implementation.
By replacing ArrayList with List, you can change the List implementation in the future, as below, depending on your business use case.
List<Object> list = new LinkedList<Object>();
/* Doubly-linked list implementation of the List and Deque interfaces.
Implements all optional list operations, and permits all elements (including null).*/
OR
List<Object> list = new CopyOnWriteArrayList<Object>();
/* A thread-safe variant of ArrayList in which all mutative operations
(add, set, and so on) are implemented by making a fresh copy of the underlying array.*/
OR
List<Object> list = new Stack<Object>();
/* The Stack class represents a last-in-first-out (LIFO) stack of objects.*/
OR
some other List specific implementation.
The List interface defines the contract, and the specific implementation of List can be changed. In this way, interface and implementation are loosely coupled.
Related SE question:
What does it mean to "program to an interface"?

Even for local variables, using the interface over the concrete class helps. Otherwise you may end up calling a method that is outside the interface, and then it is difficult to change the implementation of the List if necessary.
Also, it is best to use the least specific class or interface in a declaration. If element order does not matter, use a Collection instead of a List. That gives your code the maximum flexibility.
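As an illustration of the "least specific type" advice, here is a small sketch with a hypothetical method, totalLength, that only needs to iterate. Declaring its parameter as Collection lets callers pass either a List or a Set:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LeastSpecificTypeDemo {
    // Hypothetical method: it only iterates, so Collection is sufficient.
    static int totalLength(Collection<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> asList = new ArrayList<>();
        asList.add("ab");
        asList.add("cde");

        Set<String> asSet = new HashSet<>(asList);

        System.out.println(totalLength(asList)); // prints 5
        System.out.println(totalLength(asSet));  // prints 5
    }
}
```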

Properties of your classes/interfaces should be exposed through interfaces because it gives your classes a contract of behavior to use, regardless of the implementation.
However...
In local variable declarations, it makes little sense to do this:
public void someMethod() {
List theList = new ArrayList();
//do stuff with the list
}
If it's a local variable, just use the concrete type. It is still implicitly upcastable to its appropriate interface, and your methods should hopefully accept the interface types for their arguments, but for local variables it makes total sense to use the implementation type as a container, just in case you do need the implementation-specific functionality.
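A short sketch of what this answer describes, using a hypothetical method named squares: the local variable is declared as ArrayList so that the ArrayList-only methods ensureCapacity and trimToSize are available, while the method signature still exposes only List.

```java
import java.util.ArrayList;
import java.util.List;

class LocalVariableDemo {
    // Hypothetical example method; callers see only the List interface.
    static List<Integer> squares(int n) {
        ArrayList<Integer> result = new ArrayList<>();
        result.ensureCapacity(n);      // declared in ArrayList, not in List
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        result.trimToSize();           // declared in ArrayList, not in List
        return result;                 // implicit upcast to List<Integer>
    }

    public static void main(String[] args) {
        System.out.println(squares(4)); // prints [0, 1, 4, 9]
    }
}
```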

In general, for your line of code it does not make sense to bother with interfaces. But if we are talking about APIs, there is a really good reason. Take this small class:
class Counter {
static int sizeOf(List<?> items) {
return items.size();
}
}
In this case, using the interface is required, because I want to count the size of every possible implementation, including my own custom one: class MyList extends AbstractList<String>....
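The answer elides the body of MyList, so the version below is only one possible way to fill it in (a fixed-size list backed by an array, which is an assumption). It shows Counter.sizeOf accepting both a standard and a custom List implementation:

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

class Counter {
    static int sizeOf(List<?> items) {
        return items.size();
    }
}

// Hypothetical custom implementation; AbstractList needs only get() and size().
class MyList extends AbstractList<String> {
    private final String[] data;

    MyList(String... data) {
        this.data = data.clone();
    }

    @Override
    public String get(int index) {
        return data[index];
    }

    @Override
    public int size() {
        return data.length;
    }
}

class CounterDemo {
    public static void main(String[] args) {
        List<Integer> standard = new ArrayList<>();
        standard.add(42);
        System.out.println(Counter.sizeOf(standard));             // prints 1
        System.out.println(Counter.sizeOf(new MyList("x", "y"))); // prints 2
    }
}
```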

The interface is what is exposed to the end user. One class can implement multiple interfaces. A user who is exposed to a specific interface has access only to the behavior defined in that particular interface.
One interface can also have multiple implementations. Depending on the scenario, the system will work with a different implementation of the interface.
Let me know if you need more explanation.

The interface often has better representation in the debugger view than the concrete class.

Related

What are the differences between these two object declarations which uses an Interface and a Class in Java? [duplicate]


Instantiating an interface with a concrete class

Suppose for a program that I needed a List of some kind. Any kind of List will do, so I create one.
List<Integer> example = new LinkedList<Integer>();
I've heard that it's good practice, when instantiating your objects, to define them as interfaces, if you are doing something that requires the use of that interface, but not necessarily that specific concrete class. For example, I could have made that "example" list an ArrayList instead.
However, declaring my LinkedList as a List in this way limits me. I can't use any of the LinkedList-specific methods, for example. I can only use methods that are in the List interface itself. The only way I've found to use the LinkedList-specific methods is to cast example to a LinkedList, like so:
((LinkedList<Integer>) example).addLast(1);
...which seems to defeat the purpose, since by casting "example" to be a LinkedList, I may as well have created it and defined it to be a LinkedList in the first place, instead of a List.
So why is it good practice to create concrete classes and define them via interface? Is there something that I am missing?
LinkedList implements several interfaces.
The method addLast() comes from the Deque interface. It could be that's the interface you want to use.
Alternatively you might just need a List, in which case the add() method will append to the list.
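A minimal sketch of the Deque suggestion (the class and method names here are made up for illustration). Declaring the variable as Deque makes addLast and addFirst available without any cast:

```java
import java.util.Deque;
import java.util.LinkedList;

class DequeDemo {
    static Deque<Integer> build() {
        // LinkedList implements Deque, so no cast is needed for addLast/addFirst.
        Deque<Integer> example = new LinkedList<>();
        example.addLast(1);
        example.addLast(2);
        example.addFirst(0);
        return example;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [0, 1, 2]
    }
}
```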

Why we should use Interface, instead of concrete types?

When using collections in Java, we are advised to use Interface instead of concrete types.
Like: List<Object> list = new ArrayList<Object>();
But using ArrayList<Object> list = new ArrayList<Object>(); will also do the same job, right?
Yes, but if you later change your mind and use a LinkedList, you have to change much more of your code.
That is polymorphism, which is a core concept of OOP.
It means ‘a state of having many shapes’ or ‘the capacity to take on different forms’. When applied to OOP, it describes a language’s ability to process objects of various types and classes through a single, uniform interface.
List is a uniform interface, and its different implementations include ArrayList, LinkedList, etc.
Recommended reading: What does it mean to program to an interface?
When you define your list as:
List myList = new ArrayList();
you can only call methods and reference members that belong to the List interface. If you define it as:
ArrayList myList = new ArrayList();
you'll be able to invoke ArrayList specific methods and use ArrayList specific members in addition to those inherited from List.
Nevertheless, when you call a List method in the first example that ArrayList implements or overrides, the ArrayList version is the one that runs, not one from List.
Also, the first has the advantage that the implementation of the List can change (to a LinkedList, for example) without affecting the rest of the code. This will be difficult to do with an ArrayList, not only because you will need to change ArrayList to LinkedList everywhere, but also because you may have used ArrayList-specific methods.
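The dispatch behaviour described above can be checked with a small sketch. LoggingList is a hypothetical subclass introduced only for this demonstration; it counts calls to its overridden add, which shows that the override runs even when the call goes through a List-typed reference:

```java
import java.util.ArrayList;
import java.util.List;

class DispatchDemo {
    // Hypothetical subclass that records when its own add() runs.
    static class LoggingList extends ArrayList<String> {
        int addCalls = 0;

        @Override
        public boolean add(String s) {
            addCalls++;
            return super.add(s);
        }
    }

    public static void main(String[] args) {
        LoggingList concrete = new LoggingList();
        List<String> viaInterface = concrete;   // same object, interface-typed reference

        viaInterface.add("x");                  // dispatches to LoggingList.add

        System.out.println(concrete.addCalls);  // prints 1
    }
}
```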
There's a useful principle: for declared types, use the loosest (vaguest) interface possible (and List is 'looser' than ArrayList).
In practice, this means if you only need to access methods declared in List<Object> on your list instance (which is actually an ArrayList), then declare it as List<Object>. This means you can change your mind on the exact type of list later and you only need to change the line that actually instantiates the ArrayList (or LinkedList or whatever you choose).
This has implications for method signature too: if you were passing around an ArrayList instead of a List, and then changed your mind about it being an ArrayList, you have to go and edit lots of method signatures.
Please read up on Polymorphism if you'd like to know more.
Tangentially related is the Liskov Substitution Principle:
What is the Liskov Substitution Principle?
Interfaces (or, should I say, base classes) are used to generalize the things and problems at hand. When you implement an interface, you can always get to the specific objects.
For example:
From an Animal interface or superclass you can always derive specific interfaces or classes like Lion, but not the other way around, because it's true that a Lion is an animal, but other animals cannot be derived from Lion. That's why it is advised to keep things general and hence use interfaces.
The same applies in your case. You can always get ArrayList and other implementations from a List.
Say you have a class with the following method
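public <T> ArrayList<T> foo(ArrayList<T> someInput) {
// Do some operations on someInput here...
ArrayList<T> someOutput = new ArrayList<T>(someInput); // placeholder result so the method compiles
return someOutput;
}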
Now, what happens if you change the program so that it uses LinkedList objects instead of ArrayList objects? You will get a compiler error wherever this method is called, and you would have to go through and refactor your code so that it accepts LinkedList objects.
If you had programmed to an interface and used a List instead:
public <T> List<T> foo(List<T> someInput) {
// Do some operations on someInput here...
List<T> someOutput = new ArrayList<T>(someInput); // placeholder result so the method compiles
return someOutput;
}
If this was the case, no refactoring would be necessary as both the LinkedList and ArrayList classes implement List so there would be no compiler errors. This makes it incredibly flexible. It does not matter to the method what it takes in and what it returns, as long as the objects implement the List interface. This allows you to define behaviour without exposing any of the underlying implementation.
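To make this concrete, here is a sketch with a hypothetical method, reversed, standing in for foo. Because its signature uses List, the same method accepts an ArrayList and a LinkedList unchanged:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class SignatureDemo {
    // Hypothetical stand-in for foo: returns a reversed copy of the input.
    static <T> List<T> reversed(List<T> input) {
        List<T> out = new ArrayList<>(input);
        Collections.reverse(out);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<>(arrayBacked);

        // Both calls compile against the same signature.
        System.out.println(reversed(arrayBacked)); // prints [3, 2, 1]
        System.out.println(reversed(linked));      // prints [3, 2, 1]
    }
}
```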

Collection List and Subclass Initialization

It's always said that it's better to declare a collection object as below
1) List st = new LinkedList();
2) Map mp = new HashMap();
than
3) LinkedList st = new LinkedList();
4) HashMap mp = new HashMap();
I agree that by defining them as above (1, 2) I can reassign the same variable (st, mp) to other objects implementing the List or Map interface.
But here I can't use the methods that are defined only in LinkedList or HashMap, which is correct, as those are not visible through List or Map. (Please correct me if I am wrong.)
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality of these classes.
Then why is it said that the best way to create a collection object is as done in (1, 2)?
Because most of the time you don't need the special methods. If you need the special methods, then obviously you need to reference the specific type.
Lesson for today: Don't blindly apply programming principles without using your own brain.
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality of these classes.
In that case, you should absolutely declare the variable using the concrete class. That's fine.
The point of using the interface instead is to indicate that you only need the functionality exposed by that interface, leaving you open to potentially change implementation later. (Although you'd need to be careful of the performance and even behavioural implications of which concrete implementation you choose.)
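The caveat about behavioural implications can be illustrated with a small sketch. The helper name canAddWhileIterating is hypothetical. The same code, written purely against List, completes normally with a CopyOnWriteArrayList (which iterates over a snapshot) but fails with an ArrayList:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class BehaviourDemo {
    // Hypothetical probe: appends to the list while iterating over it.
    static boolean canAddWhileIterating(List<String> list) {
        try {
            for (String s : list) {
                list.add(s + "!");
            }
            return true;
        } catch (ConcurrentModificationException e) {
            // ArrayList's iterator detects the structural change and fails fast.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> cow = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));

        System.out.println(canAddWhileIterating(plain)); // prints false
        System.out.println(canAddWhileIterating(cow));   // prints true
    }
}
```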
I agree that by defining them as above (1, 2) I can reassign the same variable (st, mp) to other objects implementing the List or Map interface.
Yes, it's a general practice called programming against interfaces.
But here I can't use the methods that are defined only in LinkedList or HashMap, which is correct, as those are not visible through List or Map. (Please correct me if I am wrong.)
No, you are right.
But if I am defining an object of HashMap or LinkedList, I want to use it for some special functionality of these classes. Then why is it said that the best way to create a collection object is as done in (1, 2)?
This isn't the best way. If you need to use specific methods of those classes, you need a reference of the concrete type. If you need to use those collections from a client class that is not supposed to know the internal implementation, then it's better to expose only the interface.
Through interfaces you define service contracts. As you say, should you change the underlying implementation of a given interface, you can do it flawlessly without any impact on your current code.
If you need any particular behaviour of the particular classes, it's absolutely right to use them. Maps usually extend the AbstractMap class, which itself implements Map, so the subclasses inherit those methods.
Of course, many classes throw UnsupportedOperationException from some methods defined in the Map interface, so changing the implementation type is not always flawless (but in most cases it is, because each map has a particular strength that makes it the most appropriate choice for a given context).
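In the standard library, the exception thrown by optional operations that an implementation does not support is UnsupportedOperationException. A small sketch of the pitfall, using a hypothetical helper named canPut: both objects satisfy the Map interface, yet only one of them accepts put.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class OptionalOperationDemo {
    // Hypothetical probe: reports whether the map accepts put().
    static boolean canPut(Map<String, Integer> map) {
        try {
            map.put("k", 1);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> mutable = new HashMap<>();
        Map<String, Integer> readOnly =
                Collections.unmodifiableMap(new HashMap<String, Integer>());

        System.out.println(canPut(mutable));  // prints true
        System.out.println(canPut(readOnly)); // prints false
    }
}
```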
Use the type that suits you, not the one that someone says it's the correct one. Every rule has exceptions.
Because if you use the interface to access the collections, you are free to change the implementation, e.g. use an ArrayList instead of a LinkedList, or a synchronized version of it.
This mostly applies to cases where you have a Collection in the public interface of a class. Internally I wouldn't bother; just use what you need.

Why do we first declare subtypes as their supertype before we instantiate them?

Reading other people's code, I've seen a lot of:
List<E> ints = new ArrayList<E>();
Map<K, V> map = new HashMap<K, V>();
My question is: what is the point/advantage of instantiating them that way as opposed to:
ArrayList<E> ints = new ArrayList<E>();
HashMap<K, V> map = new HashMap<K, V>();
What also makes it odd is that I've never seen anything like:
CharSequence s = new String("String");
or
OutputStream out = new PrintStream(OutputStream);
Duplicates (of the first part of the question):
When/why to use/define an interface
Use interface or type for variable definition in java?
When should I use an interface in java?
why are interfaces created instead of their implementations for every class
What's the difference between these two java variable declarations?
Quick answer? Using interfaces and superclasses increases the portability and maintainability of your code, principally by hiding implementation detail. Take the following hypothetical example:
class Account {
private Collection<Transaction> transactions;
public Account() {
super();
transactions = new ArrayList<Transaction>(4);
}
public Collection<Transaction> getTransactions() {
return transactions;
}
}
I've declared a contract for an Account that states that the transactions posted to the account can be retrieved as a Collection. The callers of my code don't have to care what kind of collection my method actually returns, and shouldn't. And that frees me to change up the internal implementation if I need to, without impacting (aka breaking) unknown number of clients. So to wit, if I discover that I need to impose some kind of uniqueness on my transactions, I can change the implementation shown above from an ArrayList to a HashSet, with no negative impact on anyone using my class.
public Account() {
super();
transactions = new HashSet<Transaction>(4);
}
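Put together, the two variants can be sketched as follows. Transaction is not defined in the answer, so the class below is a hypothetical placeholder, and the boolean constructor flag exists only so that both variants fit in one runnable example. The client method count compiles against Collection and is unaffected by the switch:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

class AccountDemo {
    // Hypothetical placeholder; the answer does not define Transaction.
    static class Transaction {
        final String id;

        Transaction(String id) {
            this.id = id;
        }
    }

    static class Account {
        private final Collection<Transaction> transactions;

        // The flag is an assumption used to show both implementations side by side.
        Account(boolean unique) {
            if (unique) {
                transactions = new HashSet<Transaction>();
            } else {
                transactions = new ArrayList<Transaction>();
            }
        }

        void post(Transaction t) {
            transactions.add(t);
        }

        Collection<Transaction> getTransactions() {
            return transactions;
        }
    }

    // Client code: depends only on the Collection contract.
    static int count(Account account) {
        return account.getTransactions().size();
    }

    public static void main(String[] args) {
        Transaction t = new Transaction("t-1");

        Account listBacked = new Account(false);
        listBacked.post(t);
        listBacked.post(t);

        Account setBacked = new Account(true);
        setBacked.post(t);
        setBacked.post(t);

        System.out.println(count(listBacked)); // prints 2
        System.out.println(count(setBacked));  // prints 1
    }
}
```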
As for your second question, I can say that you apply the principles of portability and encapsulation wherever they make sense. There are not a great many CharSequence implementations out there, and String is by far the most commonly used. So you just won't see a lot of developers declaring CharSequence variables in their code.
Using interfaces has the main advantage that you can later change the implementation (the class) without the need to change more than the single line where you create the instance and do the assignment.
For
List<E> ints = new ArrayList<E>();
Map<K, V> map = new HashMap<K, V>();
List and Map are the interfaces, so any class implementing those interfaces can be assigned to these references.
ArrayList is one of the several classes (another is LinkedList) which implement List interface.
Same with Map. HashMap, LinkedHashMap, TreeMap all implement Map.
It is a general principle to program to interfaces and not to implementations. This makes the programming task easier: you can change which implementation a reference points to at run time.
If you write
ArrayList<E> ints = new ArrayList<E>();
HashMap<K, V> map = new HashMap<K, V>();
ints and map will be ArrayList and HashMap only, forever.
It is a design principle that you program to the interface and not to the implementation.
That way you may later provide a new implementation of the same interface.
From the above link, Erich Gamma explains:
This principle is really about dependency relationships which have to be carefully managed in a large app. It's easy to add a dependency on a class. It's almost too easy; just add an import statement and modern Java development tools like Eclipse even write this statement for you. Interestingly the inverse isn't that easy and getting rid of an unwanted dependency can be real refactoring work or even worse, block you from reusing the code in another context. For this reason you have to develop with open eyes when it comes to introducing dependencies. This principle tells us that depending on an interface is often beneficial.
Here, the term interface refers not only to the Java artifact, but to the public interface a given object has, which is basically composed of the methods it has. So it could be a Java interface (like List in your example) or a concrete superclass.
So in your example, if you ever want to use a LinkedList instead, it would be harder because the type is already declared as ArrayList when just List would have been enough.
Of course, if you need specific methods from a given implementation, you have to declare it of that type.
I hope this helps.
@Bhushan answered why.
To address your confusion about why nobody uses
CharSequence s = new String("String");
or
OutputStream out = new PrintStream(OutputStream);
CharSequence contains only a few common methods. Other classes that implement this interface are mostly buffers, and only String is immutable. CharSequence defines a common API for classes backed by a char array, and this interface does not refine the general contracts of the equals and hashCode methods (see the Javadoc).
OutputStream is a low-level API for writing data. Because PrintStream adds extra convenience methods for writing (a higher level of abstraction), it's used in preference to OutputStream.
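Where CharSequence does earn its place is in method parameters. A brief sketch, with a hypothetical method named countVowels that needs only length() and charAt(), and therefore accepts a String and a StringBuilder alike:

```java
class CharSequenceDemo {
    // Hypothetical method: uses only length() and charAt() from CharSequence.
    static int countVowels(CharSequence text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if ("aeiouAEIOU".indexOf(text.charAt(i)) >= 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countVowels("hello"));                   // prints 2
        System.out.println(countVowels(new StringBuilder("java"))); // prints 2
    }
}
```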
You do this to make sure later when working with the variable you (or anyone using your classes) won't rely on methods specific for the implementation chosen (ArrayList, HashMap, etc.)
The reason behind this is not technical; it is what you have to read between the lines of code. The List and Map examples say: "I'm only interested in basic list/map stuff, basically you can use anything here." An extreme example of that would be
Iterable<Foo> items = new ArrayList<Foo>();
when you really only want to do some stuff for each thing.
As an added bonus this makes it a little easier to refactor the code later into common utility classes/methods where the concrete type is not required. Or do you want to code your algorithm multiple times for each kind of collection?
The String example, on the other hand, is not widely seen, because a) String is a special class in Java: each "foo" literal is automatically a String, and sooner or later you have to give the characters to some method which only accepts String; and b) CharSequence is really, ahh, minimal. It does not even support Unicode beyond the BMP properly, and it lacks most of String's query/manipulation methods.
This (good) style of declaring the type as the Interface the class implements is important because it forces us to use methods only defined in the Interface.
As a result, when we need to change our class implementations (e.g. we find our ArraySet is better than the standard HashSet), we are guaranteed that our code will still compile after the change, because both classes implement the strictly enforced interface.
It is just easier to think of a String as a String, just as it's easier (and more beneficial) to think of WhateverList as a List.
The bonuses are discussed many times, but in brief you simply separate the concerns: when you need a CharSequence, you use it. It's highly unlikely that you need ArrayList only: usually, any List will do.
When you at some point decide to use a different implementation, say:
List<E> ints = new LinkedList<E>();
instead of
List<E> ints = new ArrayList<E>();
this change needs to be done only at a single place.
There is the right balance to strike:
usually you use the type which gives you the most appropriate guarantees. Obviously, a List is also a Collection which is also something Iterable. But a collection does not give you an order, and an iterable does not have an "add" method.
Using ArrayList for the variable type is also reasonable when you want to be a bit more explicit about the need for fast random access by position: in a LinkedList, a "get(100)" is a lot slower. (It would be nice if Java had an interface for this, but I don't think there is one. By using ArrayList, you also rule out passing an array wrapped as a list, such as the result of Arrays.asList.)
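As a side note to the parenthetical above: the standard library does include a marker interface, java.util.RandomAccess, which ArrayList implements and LinkedList does not. It declares no methods, so it cannot replace List as a declared type, but code can test for it. A small sketch, with a hypothetical helper named hasFastIndexing:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    // Hypothetical helper: true if the list advertises fast positional access.
    static boolean hasFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(hasFastIndexing(new ArrayList<String>()));  // prints true
        System.out.println(hasFastIndexing(new LinkedList<String>())); // prints false
    }
}
```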
List<E> ints = new ArrayList<E>();
If you write some code that deals only with List then it will work for any class that implements List (e.g. LinkedList, etc). But, if your code directly deals with ArrayList then it's limited to ArrayList.
CharSequence s = new String("String");
Manually instantiating a String object is not good practice; you should use a string literal instead. I am just guessing that the reason you don't see CharSequence much might be that it's relatively new and that strings are immutable.
This is programming to the interface not the implementation, as per the Gang of Four. This will help to stop the code becoming dependent on methods that are added to particular implementations only, and make it easier to change to use a different implementation if that becomes necessary for whatever reason, e.g. performance.
