What exactly does the Collection keyword mean in java? [duplicate] - java

This question already has answers here:
What does it mean to "program to an interface"?
(33 answers)
Closed 8 months ago.
I saw some code in Java that used code similar to
Collection<Cars> carsCollection = new ArrayList<>(); But I am a bit confused about the word Collection.
I have a basic understanding of Collections in general, and how a list or queue is part of the collections, but I am having a hard time understanding why they would use Collection<Cars> instead of ArrayList<Cars>. All the information I find on the internet about Collections is about how lists and queues use them, but I haven't seen much other code that uses the Collection keyword itself; most of it just uses arrays or lists or something else that is part of the Collections framework. How do you use it, and why use it? I tried converting it to an ArrayList like ArrayList<Cars> aList = new ArrayList<>(carsCollection) and it said there was an issue with casting it to an ArrayList.

As people have mentioned, Collection is an interface, and an interface in Java is a tool for abstraction. The way to think about it is in terms of generality, where Collection is the most general term. Then you have, for example, List, Set or Queue, which are more specific abstractions of the idea of a Collection (Map belongs to the collections framework too, but it does not extend Collection). Finally you have the implementations, for example ArrayList, HashSet and HashMap.
Generally you want to be as abstract as possible, i.e. use the most general abstraction that still fulfills the requirements of your code.
In your example with Collection<Car> and ArrayList<Car> (which probably ought to be List<Car>), my guess is that the code using Collection<Car> doesn't care about the order of the elements, because ordering isn't a requirement of the Collection abstraction, whereas it is a requirement of the List abstraction.
I'd recommend that you read the javadoc for Collection and the other interfaces and implementations.
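To make the generality argument concrete, here is a minimal sketch. The class and method names (CollectionDemo, totalNameLength, toList) are made up for illustration, and plain strings stand in for the Cars type from the question.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class; not taken from the question.
class CollectionDemo {
    // Only needs iteration, so it accepts any Collection (list, set, queue, ...).
    static int totalNameLength(Collection<String> cars) {
        int total = 0;
        for (String car : cars) {
            total += car.length();
        }
        return total;
    }

    // Copies (does not cast) a Collection into a List when index access is needed.
    static List<String> toList(Collection<String> cars) {
        return new ArrayList<>(cars);
    }

    public static void main(String[] args) {
        Collection<String> cars = new ArrayList<>();
        cars.add("Volvo");
        cars.add("Saab");
        System.out.println(totalNameLength(cars));

        // Swapping the implementation only touches the constructor call.
        Collection<String> uniqueCars = new LinkedHashSet<>(cars);
        uniqueCars.add("Volvo"); // duplicate, ignored by the set
        System.out.println(totalNameLength(uniqueCars));

        // uniqueCars.get(0) would not compile: get(int) belongs to List, not Collection.
        List<String> asList = toList(uniqueCars);
        System.out.println(asList.get(0));
    }
}
```

Note that new ArrayList<>(someCollection) is a copy constructor, so it should work for any Collection. A direct cast such as (ArrayList<Cars>) carsCollection is different: it only succeeds when the object behind the reference really is an ArrayList.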

It is just an interface for collection-like structures such as Lists, Sets and Queues. It denotes the fact that this is a "collection" of objects and exposes some common methods.

Doing this allows you to easily change the type of collection later. Maybe a LinkedHashSet yields better performance? This will be possible because your code is oblivious to the real type of collection and thus cannot call methods that aren’t available in all Collection types.
You might want to read up on the Collections Framework.

Why no mutable sorted map in Scala

I have been looking for options to implement a mutable sorted map in Scala. I know I can store my data in a mutable map and then transform it into a sorted map when needed, or wrap the TreeMap from Java. However, does anyone know why this is not implemented in Scala? Is it against functional programming style?
Regards
There is no reason other than that nobody had written one. In fact, a mutable sorted map was added in Scala 2.12.x.
There's some discussion in this old answer about possible reasons there isn't an implementation.
With regards to your second question, there are other mutable collections in Scala so I don't see any hard reason there couldn't be a mutable sorted map (see the older question as well). In a more general sense, functional programming can be taken to imply that mutable data is not used, and in this case a mutable sorted map would be avoided. However mutable collections may well be used "behind the scenes" in a library to improve performance, as long as they won't be visible to users of the library.

Why doesn't Collection<T> Implement Stream<T>? [duplicate]

This question already has an answer here:
Why doesn't java.util.Collection implement the new Stream interface?
(1 answer)
Closed 8 years ago.
This is a question about API design. When extension methods were added in C#, IEnumerable got all the methods that enabled using lambda expressions directly on all collections.
With the advent of lambdas and default methods in Java, I would expect that Collection would implement Stream and provide default implementations for all its methods. This way, we would not need to call stream() in order to leverage the power it provides.
What is the reason the library architects opted for the less convenient approach?
From Maurice Naftalin's Lambda FAQ:
Why are Stream operations not defined directly on Collection?
Early drafts of the API exposed methods like filter, map, and reduce on Collection or Iterable. However, user experience with this design led to a more formal separation of the “stream” methods into their own abstraction. Reasons included:
Methods on Collection such as removeAll make in-place modifications, in contrast to the new methods which are more functional in nature. Mixing two different kinds of methods on the same abstraction forces the user to keep track of which are which. For example, given the declaration
Collection<String> strings;
the two very similar-looking method calls
strings.removeAll(s -> s.length() == 0);
strings.filter(s -> s.length() == 0); // not supported in the current API
would have surprisingly different results; the first would remove all empty String objects from the collection, whereas the second would return a stream containing all the non-empty Strings, while having no effect on the collection.
Instead, the current design ensures that only an explicitly-obtained stream can be filtered:
strings.stream().filter(s -> s.length() == 0)...;
where the ellipsis represents further stream operations, ending with a terminating operation. This gives the reader a much clearer intuition about the action of filter;
With lazy methods added to Collection, users were confused by a perceived—but erroneous—need to reason about whether the collection was in “lazy mode” or “eager mode”. Rather than burdening Collection with new and different functionality, it is cleaner to provide a Stream view with the new functionality;
The more methods added to Collection, the greater the chance of name collisions with existing third-party implementations. By only adding a few methods (stream, parallel) the chance for conflict is greatly reduced;
A view transformation is still needed to access a parallel view; the asymmetry between the sequential and the parallel stream views was unnatural. Compare, for example
coll.filter(...).map(...).reduce(...);
with
coll.parallel().filter(...).map(...).reduce(...);
This asymmetry would be particularly obvious in the API documentation, where Collection would have many new methods to produce sequential streams, but only one to produce parallel streams, which would then have all the same methods as Collection. Factoring these into a separate interface, StreamOps say, would not help; that would still, counterintuitively, need to be implemented by both Stream and Collection;
A uniform treatment of views also leaves room for other additional views in the future.
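The first point in the list above can be tried out with the API as it was eventually released. This is only a sketch with made-up class and method names. It assumes Java 8 or later, where the in-place method that takes a predicate is Collection.removeIf, not the removeAll overload shown in the quoted draft.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class contrasting in-place and functional operations.
class FilterVsRemoveDemo {
    // In-place: removes the empty strings from the collection that was passed in.
    static void dropEmptyInPlace(Collection<String> strings) {
        strings.removeIf(s -> s.isEmpty());
    }

    // Functional: leaves the source untouched and returns a new list.
    static List<String> nonEmptyCopy(Collection<String> strings) {
        return strings.stream()
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Collection<String> strings = new ArrayList<>(Arrays.asList("a", "", "b"));

        List<String> copy = nonEmptyCopy(strings);
        // The copy has 2 elements, the source still has 3.
        System.out.println(copy.size() + " " + strings.size());

        dropEmptyInPlace(strings);
        // Now the source itself has shrunk to 2 elements.
        System.out.println(strings.size());
    }
}
```

The explicit stream() call is the visual cue that the second method does not modify its source.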
A Collection is an object model
A Stream is a subject model
Collection definition in doc :
A collection represents a group of objects, known as its elements.
Stream definition in doc :
A sequence of elements supporting sequential and parallel aggregate operations
Seen this way, a stream is a specific kind of collection, not the other way around. Thus Collection should not implement Stream, regardless of backward compatibility.
So why doesn't Stream<T> implement Collection<T>? Because it is another way of looking at a bunch of objects: not as a group of elements, but in terms of the operations you can perform on them. This is why I say a Collection is an object model while a Stream is a subject model.
First, from the documentation of Stream:
Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source.
So you want to keep the concepts of stream and collection apart. If Collection implemented Stream, every collection would be a stream, which conceptually it is not. The way it is done now, every collection can give you a stream that works on that collection, which is something different if you think about it.
Another factor that comes to mind is cohesion/coupling as well as encapsulation. If every class that implements Collection had to implement the operations of Stream as well, it would have two (kind of) different purposes and might become too long.
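One observable consequence of keeping the two concepts apart is that a Stream object is a one-shot view, while the collection behind it can be traversed as often as needed. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; only standard java.util and java.util.stream calls are used.
class StreamViewDemo {
    // Returns true if running a second terminal operation on the same Stream is rejected.
    static boolean secondTraversalFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.forEach(s -> { });      // first terminal operation consumes the stream
        try {
            stream.forEach(s -> { });  // reusing the same Stream object
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "c");
        System.out.println(secondTraversalFails(names));

        // The collection itself can hand out as many fresh streams as needed.
        long first = names.stream().count();
        long second = names.stream().count();
        System.out.println(first == second);
    }
}
```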
My guess would be that it was made that way to avoid breakage with existing code that implements Collection. It would be hard to provide a default implementation that worked correctly with all existing implementations.

Why to keep interface as reference? [duplicate]

This question already has answers here:
Type List vs type ArrayList in Java [duplicate]
(15 answers)
Closed 9 years ago.
I have observed that in Java we write code like the following:
List mylist = new ArrayList();
Why should we not use the following instead?
ArrayList mylist = new ArrayList();
While the second option is viable, the first is preferable in most cases. Typically you want to code to interfaces to make your code less coupled and more cohesive. This is a type of data abstraction, where the user of mylist (I would suggest myList) does not care about its actual implementation, only that it is a list.
We may want to change the underlying data structure at some point, and by keeping references typed as the interface, we only need to change the declaration.
The separation of abstract data type and specific implementation is one of the key aspects of object-oriented programming.
See Interface Instantiation
Just to avoid tight coupling. In theory you should never tie yourself to implementation details, because they might change, unlike the interface contract, which is supposed to be stable. It also really simplifies testing.
You can view an interface as an overall contract that all implementing classes must obey. Implementation-specific details, by contrast, may vary (how data is represented internally, how it is accessed, etc.); that is information you'd never want to rely on.
If you use ArrayList, you are saying it has to be an ArrayList, not any other kind of List, and to replace it you would have to change every reference to the type. If you use List you are making it clear there is nothing special about the List and it is used as a plain list. It can be changed to another List by changing just one line.
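The "one line" claim can be sketched as follows. ListSwapDemo and fill are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class showing that callers depend only on the List contract.
class ListSwapDemo {
    // Works with any List implementation that supports add.
    static List<Integer> fill(List<Integer> target, int n) {
        for (int i = 0; i < n; i++) {
            target.add(i * i);
        }
        return target;
    }

    public static void main(String[] args) {
        // Changing the implementation means editing only the right-hand side.
        List<Integer> myList = new ArrayList<>();
        // List<Integer> myList = new LinkedList<>();
        fill(myList, 4);
        System.out.println(myList);

        // Lists with the same elements in the same order are equal,
        // whichever implementation holds them.
        List<Integer> linked = fill(new LinkedList<>(), 4);
        System.out.println(myList.equals(linked));
    }
}
```

Had myList been declared as ArrayList<Integer>, every method signature that received it would tend to name ArrayList as well, and the swap would spread through the code base.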

Is a Collection better than a LinkedList?

Collection list = new LinkedList(); // Good?
LinkedList list = new LinkedList(); // Bad?
First variant gives more flexibility, but is that all? Are there any other reasons to prefer it? What about performance?
These are design decisions, and one size usually doesn't fit all. Also the choice of what is used internally for the member variable can (and usually should be) different from what is exposed to the outside world.
At its heart, Java's collections framework does not provide a complete set of interfaces that describe the performance characteristics without exposing the implementation details. The one interface that describes performance, RandomAccess, is a marker interface and doesn't even extend Collection or re-expose the get(index) API. So I don't think there is a good answer.
As a rule of thumb, I keep the type as unspecific as possible until I recognize (and document) some characteristic that is important. For example, as soon as I want methods to know that insertion order is retained, I would change from Collection to List, and document why that restriction is important. Similarly, I would move from List to LinkedList if, say, efficient removal from the front becomes important.
When it comes to exposing the collection in public APIs, I always try to start exposing just the few APIs that are expected to get used; for example add(...) and iterator().
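Because RandomAccess carries no methods, the usual way to use it is an instanceof check that selects a traversal strategy. The sketch below follows that idiom. The class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class for the RandomAccess marker interface.
class RandomAccessDemo {
    // True when the implementation advertises cheap get(int).
    static boolean hasFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Sums the elements, using indexed access only when it is advertised as cheap.
    static long sum(List<Integer> list) {
        long total = 0;
        if (hasFastIndexing(list)) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            for (int value : list) { // iterator-based traversal
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<>(array);
        System.out.println(hasFastIndexing(array) + " " + hasFastIndexing(linked));
        System.out.println(sum(array) == sum(linked));
    }
}
```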
Collection list = new LinkedList(); //bad
This is bad because you don't want this reference to be able to refer to, say, a HashSet (HashSet also implements Collection, as do many other classes in the collections framework).
LinkedList list = new LinkedList(); //bad?
This is bad because good practice is to always code to the interface.
List list = new LinkedList();//good
This is good because point 2 says so (always program to an interface).
Use the most specific type information on non-public objects. They are implementation details, and we want our implementation details as specific and precise as possible.
Sure. If, for example, Java gains a more efficient implementation of the List interface, but your API accepts only LinkedList, you won't be able to switch implementations once clients depend on that API. If you use the interface, you can easily replace the implementation without breaking the API.
At runtime they're equivalent, since both lines create the same LinkedList object. The only reason to use one over the other is that if you later want to use a method that only exists in the class LinkedList, you need to use the second.
My general rule is to only be as specific as you need to be at the time (or will need to be in the near future, within reason). Granted, this is somewhat subjective.
In your example I would usually declare it as a List just because the methods available on Collection aren't very powerful, and the distinction between a List and another Collection (Set, Queue, etc.) is often logically significant.
Also, in Java 1.5+ don't use raw types -- if you don't know the type that your list will contain, at least use List<?>.
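As a rough illustration of the List<?> advice, the sketch below uses hypothetical names (WildcardDemo, countNonNull). A method that only reads elements can take List<?> and still accept lists of any element type, while the compiler keeps rejecting unsafe writes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class for an unbounded wildcard parameter.
class WildcardDemo {
    // Reads elements as Object; items.add("x") would not compile here,
    // which is the protection a raw List gives up.
    static int countNonNull(List<?> items) {
        int count = 0;
        for (Object item : items) {
            if (item != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", null, "b");
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(countNonNull(names));
        System.out.println(countNonNull(numbers));
    }
}
```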

what's a good persistent collections framework for use in java?

By persistent collections I mean collections like those in clojure.
For example, I have a list with the elements (a,b,c).
With a normal list, if I add d, my original list will have (a,b,c,d) as its elements.
With a persistent list, when I call list.add(d), I get back a new list, holding (a,b,c,d).
However, the implementation attempts to share elements between the list wherever possible, so it's much more memory efficient than simply returning a copy of the original list.
It also has the advantage of being immutable (if I hold a reference to the original list, then it will always return the original 3 elements).
This is all explained much better elsewhere (e.g. http://en.wikipedia.org/wiki/Persistent_data_structure).
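For readers who want to see the sharing idea in code, here is a deliberately tiny sketch of a persistent singly linked list. It is not how Clojure or any of the libraries discussed here implement it. PList and its methods are hypothetical, and it adds at the front (prepend) rather than at the end, because that is the simplest case in which the old list can be reused unchanged.

```java
import java.util.NoSuchElementException;

// Hypothetical minimal persistent list: immutable nodes, shared tails.
final class PList<T> {
    private static final PList<?> EMPTY = new PList<>(null, null, 0);

    private final T head;
    private final PList<T> tail; // the list this one was built from, shared, never copied
    private final int size;

    private PList(T head, PList<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    @SuppressWarnings("unchecked")
    static <T> PList<T> empty() {
        return (PList<T>) EMPTY; // safe: the empty list holds no elements
    }

    // Returns a NEW list with value in front; this list is not modified.
    PList<T> prepend(T value) {
        return new PList<>(value, this, size + 1);
    }

    T head() {
        if (size == 0) {
            throw new NoSuchElementException("empty list");
        }
        return head;
    }

    PList<T> tail() {
        return tail;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        PList<String> abc = PList.<String>empty().prepend("c").prepend("b").prepend("a");
        PList<String> dabc = abc.prepend("d");
        System.out.println(abc.size() + " " + dabc.size()); // original keeps its 3 elements
        System.out.println(dabc.tail() == abc);             // the old list is reused as the tail
    }
}
```

Real persistent vectors and maps use tree structures so that sharing also works for updates in the middle or at the end.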
Anyway, my question is... what's the best library for providing this functionality for use in java? Can I use the clojure collections somehow (other than by directly using clojure)?
Just use the ones in Clojure directly. While obviously you might not want to use the language itself, you can still use the persistent collections directly as they are all just Java classes.
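import clojure.lang.IPersistentMap;
import clojure.lang.PersistentHashMap;
// create(Object...) takes alternating keys and values
IPersistentMap map = PersistentHashMap.create("key1", "value1");
// lookups on the IPersistentMap interface go through valAt
assert map.valAt("key1").equals("value1");
// assoc returns a new map; the original is left unchanged
IPersistentMap map2 = map.assoc("key2", "value2");
assert map2 != map;
assert map.valAt("key2") == null;
assert map2.valAt("key2").equals("value2");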
(disclaimer: I haven't actually compiled that code :)
The downside is that the collections aren't typed, i.e. they don't use generics.
What about pcollections?
You can also check out Clojure's implementation of persistent collections (PersistentHashMap, for instance).
I was looking for a slim, Java "friendly" persistent collection framework and took TotallyLazy and PCollections mentioned in this thread for a testdrive, because they sounded most promising to me.
Both provide reasonably simple interfaces to manipulate persistent lists:
// TotallyLazy
PersistentList<String> original = PersistentList.constructors.empty(String.class);
PersistentList<String> modified = original.append("Mars").append("Raider").delete("Raider");
// PCollections
PVector<String> original = TreePVector.<String>empty();
PVector<String> modified = original.plus("Mars").plus("Raider").minus("Raider");
Both PersistentList and PVector extend java.util.List, so both libraries should integrate well into an existing environment.
It turns out, however, that TotallyLazy runs into performance problems when dealing with larger lists (as already mentioned in a comment above by #levantpied). On my MacBook Pro (Late 2013), inserting 100,000 elements and returning the immutable list took TotallyLazy ~2000ms, whereas PCollections finished within ~120ms.
My (simple) test cases are available on Bitbucket, if someone wants to take a more thorough look.
[UPDATE]: I recently had a look at Cyclops X, which is a high performing and more complete lib targeted for functional programming. Cyclops also contains a module for persistent collections.
https://github.com/andrewoma/dexx is a port of Scala's persistent collections to Java. It includes:
Set, SortedSet, Map, SortedMap and Vector
Adapters to view the persistent collections as java.util equivalents
Helpers for easy construction
Paguro provides type-safe versions of the actual Clojure collections for use in Java 8+. It includes: List (Vector), HashMap, TreeMap, HashSet, and TreeSet. They behave exactly the way you specify in your question and have been painstakingly fit into the existing java.util collections interfaces for maximum type-safe Java compatibility. They are also a little faster than PCollections.
Coding your example in Paguro looks like this:
// List with the elements (a,b,c)
ImList<T> list = vec(a,b,c);
// With a persistent list, when I call list.add(d),
// I get back a new list, holding (a,b,c,d)
ImList<T> newList = list.append(d);
list.size(); // still returns 3
newList.size(); // returns 4
You said,
The implementation attempts to share elements between the list
wherever possible, so it's much more memory efficient and fast than
simply returning a copy of the original list. It also has the
advantage of being immutable (if I hold a reference to the original
list, then it will always return the original 3 elements).
Yes, that's exactly how it behaves. Daniel Spiewak explains the speed and efficiency of these collections much better than I could.
You may want to check out clj-ds. I haven't used it, but it seems promising. According to the project's README, it extracts the data structures from Clojure 1.2.0.
Functional Java implements a persistent List, lazy List, Set, Map, and Tree. There may be others, but I'm just going by the information on the front page of the site.
I am also interested to know what the best persistent data structure library for Java is. My attention was directed to Functional Java because it is mentioned in the book, Functional Programming for Java Developers.
There's pcollections (Persistent Collections) library you can use:
http://code.google.com/p/pcollections/
The top voted answer suggests using the clojure collections directly, which I think is a very good idea. Unfortunately, the fact that clojure is a dynamically typed language and Java is not makes the clojure libraries very uncomfortable to use in Java.
Because of this and the lack of light-weight, easy-to-use wrappers for the clojure collections types I have written my own library of Java wrappers using generics for the clojure collection types with a focus on ease of use and clarity when it comes to interfaces.
https://github.com/cornim/ClojureCollections
Maybe this will be of use to somebody.
P.S.: At the moment only PersistentVector, PersistentMap and PersistentList have been implemented.
In the same vein as Cornelius Mund, Pure4J ports the Clojure collections into Java and adds Generics support.
However, Pure4J is aimed at introducing pure programming semantics to the JVM through compile time code checking, so it goes further to introduce immutability constraints to your classes, so that the elements of the collection cannot be mutated while the collection exists.
This may or may not be what you want to achieve: if you are just after using the Clojure collections on the JVM I would go with Cornelius' approach, otherwise, if you are interested in pursuing a pure programming approach within Java then you could give Pure4J a try.
Disclosure: I am the developer of this
totallylazy is a very good FP library which has implementations of:
PersistentList<T>: the concrete implementations are LinkedList<T> and TreeList<T> (for random access)
PersistentMap<K, V>: the concrete implementations are HashTreeMap<K, V> and ListMap<K, V>
PersistentSortedMap<K, V>
PersistentSet<T>: the concrete implementation is TreeSet<T>
Example of usage:
import static com.googlecode.totallylazy.collections.PersistentList.constructors.*;
import com.googlecode.totallylazy.collections.PersistentList;
import com.googlecode.totallylazy.numbers.Numbers;
...
PersistentList<Integer> list = list(1, 2, 3);
// Create a new list with 0 prepended
list = list.cons(0);
// Prints 0::1::2::3
System.out.println(list);
// Do some actions on this list (e.g. remove all even numbers)
list = list.filter(Numbers.odd);
// Prints 1::3
System.out.println(list);
totallylazy is constantly being maintained. The main disadvantage is the total absence of Javadoc.
I'm surprised nobody mentioned vavr. I have been using it for a long time now.
http://www.vavr.io
Description from their site:
Vavr core is a functional library for Java. It helps to reduce the amount of code and to increase the robustness. A first step towards functional programming is to start thinking in immutable values. Vavr provides immutable collections and the necessary functions and control structures to operate on these values. The results are beautiful and just work.
https://github.com/arnohaase/a-foundation is another port of Scala's libraries.
It is also available from Maven Central: com.ajjpj.a-foundation:a-foundation
