Why do people create arraylist like this? - java

Occasionally I see somebody create an ArrayList like this. Why?
List numbers = new ArrayList( );
Instead of:
ArrayList<something> numbers = new ArrayList<something>();

If you are asking about declaring the variable with the interface type instead of the concrete class, then it is good practice. Imagine you switch to LinkedList tomorrow: in the first case you won't need to change the variable declaration.
If the question is about omitting generics, then it is bad practice. Generics give you compile-time type safety and remove the need for casts, so use them unless you are stuck on pre-1.5 code.
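To make the type-safety point concrete, here is a minimal sketch. The class and method names are made up for illustration. With a raw list the compiler accepts any element and the mistake only surfaces as a ClassCastException when the element is read back; with a type argument the same mistake is rejected at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question.
class RawVersusGeneric {

    // Raw type: the compiler cannot stop us from mixing element types.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List numbers = new ArrayList();
        numbers.add(1);
        numbers.add("two"); // compiles, only an unchecked warning
        try {
            int sum = 0;
            for (Object o : numbers) {
                sum += (Integer) o; // the cast fails on "two"
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Parameterized type: only Integers can be added, no casts needed.
    static int sumTyped() {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(1);
        numbers.add(2);
        // numbers.add("two"); // would not compile
        int sum = 0;
        for (Integer n : numbers) {
            sum += n;
        }
        return sum;
    }
}
```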

What's good:
1. List is a general case for many implementations.
List trololo = new ListImpl();
Hides real implementation for the user:
public List giveMeTheList() {
    List trololo = new SomeCoolListImpl();
    return trololo;
}
By design this is good: the user shouldn't have to care about the implementation. They just get access to it through the interface. The implementation should already have all the necessary properties: fast appends, fast inserts, being unmodifiable, etc.
What's bad:
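I've read that raw types may be disallowed in future Java versions, so such code is better written this way:
List<?> trololo = new ListImpl<Object>();
Note that the wildcard can only appear in the declared type; new ListImpl<?>() does not compile. In general the wildcard carries the same meaning as the raw type: you don't know for sure whether your collection will be heterogeneous or homogeneous.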
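As a rough illustration of what a wildcard-typed reference allows, the sketch below uses a hypothetical helper, countNonNull, that accepts a list of any element type. Elements can be read back only as Object, and nothing except null can be added, which is what keeps it safer than a raw List. A wildcard may appear only in the declared type; the object itself still has to be created with a concrete type argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class for illustration only.
class WildcardDemo {

    // Works for List<String>, List<Integer>, and so on.
    static int countNonNull(List<?> items) {
        int count = 0;
        for (Object item : items) { // elements are only known to be Objects
            if (item != null) {
                count++;
            }
        }
        // items.add("x"); // would not compile: the element type is unknown
        return count;
    }

    public static void main(String[] args) {
        List<?> names = new ArrayList<String>(Arrays.asList("a", null, "b"));
        List<?> ids = Arrays.asList(1, 2, 3);
        System.out.println(countNonNull(names));
        System.out.println(countNonNull(ids));
    }
}
```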

Someday you could do:
List<something> numbers = new LinkedList<something>();
without changing the client code that uses numbers.

Declaring the interface instead of the implementation is indeed a good and widespread practice, but it is not always the best way. Use it every time unless both of the following conditions are true:
You are completely sure that the chosen implementation will satisfy your needs.
You need some implementation-specific feature that is not available through the interface, e.g. ArrayList.trimToSize().
Of course, you may use casting, but then using the interface makes no sense at all.
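A small sketch of the exception described above, with made-up names: the variable is declared as ArrayList only because trimToSize() is not part of List, while code that does not need that method can still see the same object through the interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class TrimDemo {

    static int buildAndTrim() {
        // Declared as ArrayList on purpose: trimToSize() is not declared on List.
        ArrayList<Integer> numbers = new ArrayList<Integer>(1000);
        numbers.add(1);
        numbers.add(2);
        numbers.trimToSize(); // shrink the backing array to the current size

        // Everything else can keep using the interface type.
        List<Integer> view = numbers;
        return view.size();
    }
}
```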

The first line is old-style Java; we had to write it that way before Java 1.5 introduced generics. But a lot of brilliant software engineers are still forced to use Java 1.4 (or earlier), because their companies fear the risk and effort of upgrading their applications...
OK, that was off the record. A lot of legacy code was written for Java 1.4 or earlier and has not been refactored.
The second line includes generics (so it's clearly 1.5+) and the variable is declared as an ArrayList. There's actually no big problem with that. Still, it is better to code against interfaces, so in my (and others') opinion, don't declare a variable as ArrayList unless you really need the ArrayList-specific methods.

Most of the time, when you don't care about the implementation, it's better to program to the interface. So, something like:
List<something> numbers = new ArrayList<something>();
would be preferred over:
ArrayList<something> numbers = new ArrayList<something>();
The reason is that you can later swap the implementation, for example for performance reasons.
But you have to be careful not to just choose the most generic interface available. For example, if you want a sorted set, you should program to SortedSet instead of Set, like this:
SortedSet<something> s = new TreeSet<something>();
If you just blindly use the interface like this:
Set<something> s = new TreeSet<something>();
someone can change the implementation to HashSet and your program will break.
Lastly, programming to the interface is even more useful when you define a public API.
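The SortedSet point can be sketched as follows (class and method names are hypothetical). Because the parameter type is SortedSet, the ordering requirement is part of the signature, and a HashSet cannot be passed in by accident.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class for illustration only.
class SortedSetDemo {

    // first() is declared on SortedSet, not on Set, so the compiler
    // enforces that callers supply a sorted implementation.
    static String smallest(SortedSet<String> words) {
        return words.first();
    }

    static String demo() {
        SortedSet<String> s = new TreeSet<String>(Arrays.asList("pear", "apple", "fig"));
        // smallest(new java.util.HashSet<String>(s)); // would not compile
        return smallest(s);
    }
}
```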

There are two differences. First, numbers in the first line is of type List, not ArrayList. This is possible because ArrayList implements List; that is, it has everything that List has, so it can stand in for a List object. (This doesn't work the other way around.)
Second, the second line's ArrayList is typed. This means that the second numbers list can only hold objects of type something.

Related

Whether or not to code to an interface when only certain implementations provide correct behavior

So, I know that coding to an interface (using an interface as a variable's declared type instead of its concrete type) is a good practice in OO code, for a bunch of reasons. This is seen a lot, for example, with Java collections. Well, is referring to an interface in your program still a good thing to do when only certain implementations of that interface provide correct behavior?
For example, I have a Java program. In that program, I have multiple sets of objects. I chose to use a Set, because I didn't want duplicate elements. However, I wanted a list's ordering property (i.e. maintain insertion order). Therefore, I am using a LinkedHashSet as the concrete Set type. One thing these sets are used for is computing a dot product involving the primitive fields of the objects contained in the sets, such as in (simplifying a bit):
double dot(LinkedHashSet<E> set, double[] array) {
    double sum = 0.0;
    int i = 0;
    for (E element : set) {
        // pair the i-th element (in iteration order) with array[i]
        sum += element.getValue() * array[i++];
    }
    return sum;
}
This method's result is dependent on the set's iteration order, and so certain Set implementations, mainly HashSet, will give incorrect/unexpected results. Currently, I am using LinkedHashSet throughout my program as the declared type, instead of Set, to ensure correct behavior. However, that feels bad stylistically. What's the right thing to do here? Is it okay to use the concrete type in this case? Or maybe should I use Set as the type, but then state in the documentation which implementations will/won't produce correct behavior? I'm looking more for general input than anything specific to the scenario above. In particular, this should apply to really any scenario where you're using the ordering properties of a LinkedHashSet or TreeSet. How do you prevent unintended implementations from being used? Do you force it in the code (by ditching the interface), or do you specify it in the documentation? Or perhaps some other approach?
It is true that you should code to interfaces, but only if the guarantees they make fit your needs. In your case, if you used only Set you would be saying: I don't want duplicates, but I don't care about the order. You could also use a List and mean: I care about insertion order, but not about duplicates. There is even a SortedSet, but it does not have the ordering you want. So in your case you can't replace LinkedHashSet with one of its interfaces without violating the Liskov substitution principle.
So I would argue that in your case you should stick with the implementation until you really need to switch to another one. With modern IDEs refactoring is not that hard anymore, so I would refrain from doing any premature optimizations -- YAGNI and KISS.
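A minimal sketch of keeping the concrete type where the ordering matters. It assumes, purely for illustration, that the set holds Double values directly instead of objects with a getValue() method. Declaring the parameter as LinkedHashSet means a HashSet, whose iteration order is unspecified, cannot be passed in.

```java
import java.util.LinkedHashSet;

// Hypothetical demo class for illustration only.
class DotProductDemo {

    // The result depends on insertion order, so the signature asks for it.
    static double dot(LinkedHashSet<Double> set, double[] array) {
        double sum = 0.0;
        int i = 0;
        for (double value : set) {
            sum += value * array[i++];
        }
        return sum;
    }

    static double demo() {
        LinkedHashSet<Double> weights = new LinkedHashSet<Double>();
        weights.add(3.0);
        weights.add(1.0);
        weights.add(2.0);
        // Iteration follows insertion order: 3*1 + 1*10 + 2*100
        return dot(weights, new double[] {1.0, 10.0, 100.0});
    }
}
```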
Great question. One solution is: make another interface! Say, one that extends SortedMap but has a getInsertionOrderIterator() method, or an interface that extends Map and has getOrderIterator() and getInsertionOrderIterator() methods.
You can write a quick adapter class that contains a LinkedHashMap and a TreeMap as the backing data structures.
You can make arguments for either way. As long as you and others maintaining this code know that particular implementations of Set might break the rest of the app or library, then coding to the interface is fine. However, if that is not true, then you should use the specific implementation.
The purpose of coding to an interface is to give you flexibility that will not break your app. Take JDBC for instance. If you use the wrong driver it will break your program similar to how you are describing here. However, if let's say Oracle decided to put behavior in their JDBC driver that subtly broke code written to the JDBC spec instead of the specific Oracle driver code then you'd have to choose.
There is no cut-and-dried, "this is always right" type of answer.

Is a Collection better than a LinkedList?

Collection list = new LinkedList(); // Good?
LinkedList list = new LinkedList(); // Bad?
First variant gives more flexibility, but is that all? Are there any other reasons to prefer it? What about performance?
These are design decisions, and one size usually doesn't fit all. Also the choice of what is used internally for the member variable can (and usually should be) different from what is exposed to the outside world.
At its heart, Java's collections framework does not provide a complete set of interfaces that describe the performance characteristics without exposing the implementation details. The one interface that describes performance, RandomAccess, is a marker interface, and doesn't even extend Collection or re-expose the get(index) API. So I don't think there is a good answer.
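To show what the RandomAccess marker does and does not give you, here is a small sketch with hypothetical names. The marker can only be tested with instanceof; it adds no methods, so code that wants to adapt to it still has to work through List.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class for illustration only.
class RandomAccessDemo {

    static boolean supportsFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Picks an access strategy based on the marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i); // cheap for array-backed lists
            }
        } else {
            for (int value : list) { // the iterator avoids repeated traversal
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<Integer>();
        List<Integer> linked = new LinkedList<Integer>();
        for (int i = 1; i <= 3; i++) {
            arrayBacked.add(i);
            linked.add(i);
        }
        System.out.println(supportsFastIndexing(arrayBacked) + " " + sum(arrayBacked));
        System.out.println(supportsFastIndexing(linked) + " " + sum(linked));
    }
}
```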
As a rule of thumb, I keep the type as unspecific as possible until I recognize (and document) some characteristic that is important. For example, as soon as I want methods to know that insertion order is retained, I would change from Collection to List, and document why that restriction is important. Similarly, move from List to LinkedList if say efficient removal from front becomes important.
When it comes to exposing the collection in public APIs, I always try to start exposing just the few APIs that are expected to get used; for example add(...) and iterator().
Collection list = new LinkedList(); // bad
This is bad because you don't want this reference to be able to refer to, say, a HashSet (HashSet also implements Collection, as do many other classes in the collections framework).
LinkedList list = new LinkedList(); // bad?
This is bad because good practice is to always code to the interface.
List list = new LinkedList(); // good
This is good because point 2 says so (always program to an interface).
Use the most specific type information on non-public objects. They are implementation details, and we want our implementation details as specific and precise as possible.
Sure. If, for example, Java gains a more efficient implementation of List, but you already have an API that accepts only LinkedList, you won't be able to switch implementations once you have clients for that API. If you use the interface, you can easily replace the implementation without breaking the API.
They're absolutely equivalent. The only reason to use one over the other is that if you later want to use a function of list that only exists in the class LinkedList, you need to use the second.
My general rule is to only be as specific as you need to be at the time (or will need to be in the near future, within reason). Granted, this is somewhat subjective.
In your example I would usually declare it as a List, just because the methods available on Collection aren't very powerful, and the distinction between a List and another kind of Collection (Set, Queue, etc.) is often logically significant.
Also, in Java 1.5+ don't use raw types -- if you don't know the type that your list will contain, at least use List<?>.

Polymorphism: Why use "List list = new ArrayList" instead of "ArrayList list = new ArrayList"? [duplicate]

Possible Duplicate:
Why should the interface for a Java class be prefered?
When should I use
List<Object> list = new ArrayList<Object>();
ArrayList implements List, so if some features of ArrayList aren't in List, then I will have lost access to those features, right? And the compiler will report an error when I try to call those methods?
The main reason you'd do this is to decouple your code from a specific implementation of the interface. When you write your code like this:
List list = new ArrayList();
the rest of your code only knows that data is of type List, which is preferable because it allows you to switch between different implementations of the List interface with ease.
For instance, say you were writing a fairly large 3rd party library, and say that you decided to implement the core of your library with a LinkedList. If your library relies heavily on accessing elements in these lists, then eventually you'll find that you've made a poor design decision; you'll realize that you should have used an ArrayList (which gives O(1) access time) instead of a LinkedList (which gives O(n) access time). Assuming you have been programming to an interface, making such a change is easy. You would simply change the instance of List from,
List list = new LinkedList();
to
List list = new ArrayList();
and you know that this will work because you have written your code to follow the contract provided by the List interface.
On the other hand, if you had implemented the core of your library using LinkedList list = new LinkedList(), making such a change wouldn't be as easy, as there is no guarantee that the rest of your code doesn't make use of methods specific to the LinkedList class.
All in all, the choice is simply a matter of design... but this kind of design is very important (especially when working on large projects), as it will allow you to make implementation-specific changes later without breaking existing code.
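The swap described above can be sketched like this (all names are hypothetical). The client method is written only against List, so the one line that names the concrete class is the only one that changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class SwapImplementationDemo {

    // Client code: depends on the List contract and nothing else.
    static int lastElement(List<Integer> data) {
        return data.get(data.size() - 1);
    }

    static List<Integer> createData() {
        // The only place that names a concrete class.
        // Previously: new java.util.LinkedList<Integer>()
        List<Integer> data = new ArrayList<Integer>();
        for (int i = 1; i <= 5; i++) {
            data.add(i * i);
        }
        return data;
    }
}
```

Calling lastElement(createData()) gives the same result with either implementation; only the cost of get(index) differs.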
This is called programming to an interface. It is helpful if you wish to move to some other implementation of List in the future. If you need methods that exist only on ArrayList, then you have to program to the implementation, that is, ArrayList a = new ArrayList().
This is also helpful when exposing a public interface. If you have a method like this,
public ArrayList getList();
Then you decide to change it to,
public LinkedList getList();
Anyone who was doing ArrayList list = yourClass.getList() will need to change their code. On the other hand, if you do,
public List getList();
Changing the implementation doesn't change anything for the users of your API.
I think @tsatiz's answer is mostly right (programming to an interface rather than an implementation). However, by programming to the interface you won't lose any functionality. Let me explain.
If you declare your variable as List<type> list = new ArrayList<type>(), you do not actually lose any functionality of the ArrayList. All you need to do is cast your list down to an ArrayList. Here's an example:
List<String> list = new ArrayList<String>();
((ArrayList<String>) list).ensureCapacity(19);
Ultimately I think tsatiz is correct as once you cast to an ArrayList you're no longer coding to an interface. However, it's still a good practice to initially code to an interface and, if it later becomes necessary, code to an implementation if you must.
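If such a downcast is needed on a list whose origin you don't control, one possible precaution is to guard it with instanceof, since the cast throws ClassCastException for any other implementation. The helper below is a hypothetical sketch, not something from the answers above.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class for illustration only.
class GuardedCastDemo {

    // Returns true if capacity could be reserved, false for other List types.
    static boolean reserve(List<String> list, int capacity) {
        if (list instanceof ArrayList) {
            ((ArrayList<String>) list).ensureCapacity(capacity);
            return true;
        }
        return false; // e.g. LinkedList has no capacity to reserve
    }

    public static void main(String[] args) {
        System.out.println(reserve(new ArrayList<String>(), 19));
        System.out.println(reserve(new LinkedList<String>(), 19));
    }
}
```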
Hope that helps!
This enables you to write something like:
void doSomething() {
    List<String> list = new ArrayList<String>();
    // do something
}
Later on, you might want to change it to:
void doSomething() {
    List<String> list = new LinkedList<String>();
    // do something
}
without having to change the rest of the method.
However, if you want to use a CopyOnWriteArrayList for example, you would need to declare it as such, and not as a List if you wanted to use its extra methods (addIfAbsent for example):
void doSomething() {
    CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<String>();
    // do something, for example:
    list.addIfAbsent("abc");
}
I guess the core of your question is why to program to an interface, not to an implementation.
Simply because an interface gives you more abstraction and makes the code more flexible and resilient to change: you can use different implementations of the same interface (in this case you may want to change your List implementation to a LinkedList instead of an ArrayList) without changing its clients.
I use that construction whenever I don't want to add complexity to the problem. It's just a list, no need to say what kind of List it is, as it doesn't matter to the problem. I often use Collection for most of my solutions, as, in the end, most of the times, for the rest of the software, what really matters is the content it holds, and I don't want to add new objects to the Collection.
Furthermore, you use that construction when you think that you may want to change the implementation of the list you are using. Let's say you were using an ArrayList, and your solution wasn't thread-safe. Now you want to make it thread-safe, and for part of your solution you switch to a Vector, for example. Since the other uses of that list don't care whether it's an ArrayList or a Vector, just a List, no further modifications are needed.
In general you want to program against an interface. This allows you to exchange the implementation at any time.
This is very useful especially when you get passed an implementation you don't know.
However, there are certain situations where you prefer to use the concrete implementation.
For example, when serializing in GWT.

List versus ArrayList as reference type?

Ok so I know that Set, List and Map are interfaces but what makes the first line of code any better than the second line?
List myArr = new ArrayList();
ArrayList myArr = new ArrayList();
If you use the first form, you are saying all you are ever going to use is the functionality of the List interface - nothing else, especially nothing extra added by any implementation of it. This means you can easily change the implementation used (e.g. just substitute LinkedList for ArrayList in the instantiation), and not worry about it breaking the rest of the code because you might have used something specific to ArrayList.
A useful general principle about types in programming (sometimes referred to as the robustness principle) is as follows:
Be liberal about what you accept
Be conservative about what you emit
List is more liberal than ArrayList, since List can be any kind of List implementation e.g. an ArrayList, a LinkedList or FrancosSpecialList. Hence it is a good idea to be liberal and accept any kind of list since you may want to change the implementation later.
The main reason to use ArrayList explicitly as a type (your second case) is if you need to use methods that are specific to ArrayList that are not available through the List interface. In this case a generic List won't work (unless you want to do lots of ugly and confusing casting), so you might as well be explicit and use an ArrayList directly. This has the added bonus of hinting to a reader that specific features of ArrayList are needed.
As you can see from the source of ArrayList, most of the methods it implements are annotated with @Override because they are defined by the List interface. So if you are only going to use the basic functionality (which is what you will do most of the time), there won't be any practical difference.
The difference shows up if someday you decide that ArrayList no longer suits your problem and you need something different (a LinkedList, for example). If you declared everything as List but instantiated ArrayList, you can easily switch to the new implementation by changing the instantiations to new LinkedList(), while in the other case you would also have to change all the variable declarations.
Using List list = new ArrayList() is more OOP style since you declare that you don't care about the specific implementation of the list, and that you want to discard the static information about the type since you will rely on the interface provided by this kind of collection abstracting from its implementation.

Java: ArrayList for List, HashMap for Map, and HashSet for Set?

I usually always find it sufficient to use the concrete classes for the interfaces listed in the title. Usually when I use other types (such as LinkedList or TreeSet), the reason is for functionality and not performance - for example, a LinkedList for a queue.
I do sometimes construct an ArrayList with an initial capacity larger than the default of 10 and a HashMap with more than the default 16 buckets, but I usually (especially for business CRUD) never see myself thinking "hmmm... should I use a LinkedList instead of an ArrayList if I am just going to insert and iterate through the whole List?"
I am just wondering what everyone else here uses (and why) and what type of applications they develop.
Those are definitely my default, although often a LinkedList would in fact be the better choice for lists, as the vast majority of lists seem to just iterate in order, or get converted to an array via Arrays.asList anyway.
But in terms of keeping consistent maintainable code, it makes sense to standardize on those and use alternatives for a reason, that way when someone reads the code and sees an alternative, they immediately start thinking that the code is doing something special.
I always type the parameters and variables as Collection, Map and List unless I have a special reason to refer to the sub type, that way switching is one line of code when you need it.
I could see explicitly requiring an ArrayList sometimes if you need the random access, but in practice that really doesn't happen.
For some kind of lists (e.g. listeners) it makes sense to use a CopyOnWriteArrayList instead of a normal ArrayList. For almost everything else the basic implementations you mentioned are sufficient.
Yep, I use those as defaults. I generally have a rule that on public class methods, I always return the interface type (ie. Map, Set, List, etc.), since other classes (usually) don't need to know what the specific concrete class is. Inside class methods, I'll use the concrete type only if I need access to any extra methods it may have (or if it makes understanding the code easier), otherwise the interface is used.
It's good to be pretty flexible with any rules you do use, though, as a dependency on concrete class visibility is something that can change over time (especially as your code gets more complex).
Indeed, always use the base interfaces Collection, List, Map instead of their implementations. To make things even more flexible you could hide your implementations behind static factory methods, which allow you to switch to a different implementation in case you find something better (I doubt there will be big changes in this field, but you never know). Another benefit is that the syntax is shorter thanks to generics.
Map<String, LongObjectClasName> map = CollectionUtils.newMap();
instead of
Map<String, LongObjectClasName> map = new HashMap<String, LongObjectClasName>();
public class CollectionUtils {
    .....
    // Factory methods must be static to be callable as CollectionUtils.newList()
    public static <T> List<T> newList() {
        return new ArrayList<T>();
    }
    public static <T> List<T> newList(int initialCapacity) {
        return new ArrayList<T>(initialCapacity);
    }
    public static <T> List<T> newSynchronizedList() {
        return new Vector<T>();
    }
    public static <T> List<T> newConcurrentList() {
        return new CopyOnWriteArrayList<T>();
    }
    public static <T> List<T> newSynchronizedList(int initialCapacity) {
        return new Vector<T>(initialCapacity);
    }
    ...
}
Having just come out of a class about data structure performance, I'll usually look at the kind of algorithm I'm developing or the purpose of the structure before I choose an implementation.
For example, if I'm building a list that has a lot of random accesses into it, I'll use an ArrayList because its random access performance is good, but if I'm inserting things into the list a lot, I might choose a LinkedList instead. (I know modern implementations remove a lot of performance barriers, but this was the first example that came to mind.)
You might want to look at some of the Wikipedia pages for data structures (especially those dealing with sorting algorithms, where performance is especially important) for more information about performance, and the article about Big O notation for a general discussion of measuring the performance of various functions on data structures.
I don't really have a "default", though I suppose I use the implementations listed in the question more often than not. I think about what would be appropriate for whatever particular problem I'm working on, and use it. I don't just blindly default to using ArrayList, I put in 30 seconds of thought along the lines of "well, I'm going to be doing a lot of iterating and removing elements in the middle of this list so I should use a LinkedList".
And I almost always use the interface type for my reference, rather than the implementation. Remember that List is not the only interface that LinkedList implements. I see this a lot:
LinkedList<Item> queue = new LinkedList<Item>();
when what the programmer meant was:
Queue<Item> queue = new LinkedList<Item>();
I also use the Iterable interface a fair amount.
If you are using LinkedList for a queue, you might consider using the Deque interface and the ArrayDeque implementing class (introduced in Java 6) instead. To quote the Javadoc for ArrayDeque:
This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.
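A short sketch of that suggestion, with made-up class and method names: the same ArrayDeque class serves as a FIFO queue through the Queue interface and as a LIFO stack through the Deque interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

// Hypothetical demo class for illustration only.
class DequeDemo {

    // Queue usage: elements come out in the order they went in.
    static String fifoOrder() {
        Queue<String> queue = new ArrayDeque<String>();
        queue.offer("a");
        queue.offer("b");
        queue.offer("c");
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.poll());
        }
        return out.toString();
    }

    // Stack usage: push and pop work on the head of the deque.
    static String lifoOrder() {
        Deque<String> stack = new ArrayDeque<String>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());
        }
        return out.toString();
    }
}
```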
I tend to use one of *Queue classes for queues. However LinkedList is a good choice if you don't need thread safety.
Using the interface type (List, Map) instead of the implementation type (ArrayList, HashMap) is irrelevant within methods - it's mainly important in public APIs, i.e. method signatures (and "public" doesn't necessarily mean "intended to be published outside your team").
When a method takes an ArrayList as a parameter, and you have something else, you're screwed and have to copy your data pointlessly. If the parameter type is List, callers are much more flexible and can, e.g. use Collections.EMPTY_LIST or Collections.singletonList().
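A minimal sketch of that flexibility, using a hypothetical totalLength method: because the parameter is declared as List, callers can pass an empty list, a singleton list, or an array-backed list without copying anything into an ArrayList first.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class for illustration only.
class ParameterTypeDemo {

    // Accepts any List implementation; it only reads the elements.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Collections.<String>emptyList()));
        System.out.println(totalLength(Collections.singletonList("abc")));
        System.out.println(totalLength(Arrays.asList("ab", "cde")));
    }
}
```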
I too typically use ArrayList, but I will use TreeSet or HashSet depending on the circumstances. When writing tests, however, Arrays.asList and Collections.singletonList are also frequently used. I've mostly been writing thread-local code, but I could also see using the various concurrent classes as well.
Also, there were times I used ArrayList when what I really wanted was a LinkedHashSet (before it was available).
