Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there? Are there any performance drawbacks?
Can a list be initialised with elements in a single statement like an array (list of all elements separated by commas) ?
Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ?
In some (probably most) circumstances, yes, it is advisable to use collections anyway; in some circumstances it is not.
On the pro side:
If you use a List instead of an array, your code can use methods like contains, add, remove and so on.
A lot of library classes expect collection-typed arguments.
You don't need to worry that the next version of the code may require a more dynamically sized array ... which would make an initial array-based approach a liability.
On the con side:
Collections are a bit slower, and more so if the base type of your array is a primitive type.
Collections do take more memory, especially if the base type of your array is a primitive type.
But performance is rarely a critical issue, and in many cases the performance difference is not relevant to the big picture.
And in practice, there is often a cost in performance and/or code complexity involved in working out what the array's size should be. (Consider the hypothetical case where you used a char[] to hold the concatenation of a series of strings. You can work out how big the array needs to be, e.g. by adding up the component string sizes. But it is messy!)
Collections/lists are more flexible and provide more utility methods. For most situations, any performance overhead is negligible.
And for this single statement initialization, use:
Arrays.asList(yourArray);
From the docs:
Returns a fixed-size list backed by the specified array. (Changes to the returned list "write through" to the array.) This method acts as bridge between array-based and collection-based APIs, in combination with Collection.toArray. The returned list is serializable and implements RandomAccess.
My guess is that this is the most performant way to convert to a list, but I may be wrong.
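The "fixed-size" and "write through" wording in the quoted documentation can be illustrated with a small sketch. The class name AsListDemo and the helper rejectsAdd are made up for this illustration; only Arrays.asList and the List methods are real API.

```java
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // Returns true if the list refuses a structural modification (add).
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] array = {"one", "two", "three"};
        List<String> view = Arrays.asList(array);

        // set() is allowed and writes through to the backing array.
        view.set(0, "uno");
        System.out.println(array[0]);         // prints "uno"

        // Changes made to the array are visible through the list as well.
        array[1] = "dos";
        System.out.println(view.get(1));      // prints "dos"

        // add() would change the size, so the fixed-size list rejects it.
        System.out.println(rejectsAdd(view)); // prints "true"
    }
}
```

Because no elements are copied, the conversion itself is cheap, but the list and the array stay tied together for as long as both are in use.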
1) A Collection is the most basic type and only implies that there is a collection of objects. If there is no ordering and no duplication, use java.util.Set; if there is ordering and possible duplication, use java.util.List; if there is ordering but no duplication, use java.util.SortedSet.
2) Curly brackets to instantiate an Array, Arrays.asList() plus generics for the type inference
List<String> myStrings = Arrays.asList(new String[]{"one", "two", "three"});
There is also a trick using anonymous types but personally I'm not a big fan:
List<String> myStrings = new ArrayList<String>(){
// this is the inside of an anonymous class
{
// this is the inside of an instance block in the anonymous class
this.add("one");
this.add("two");
this.add("three");
}};
Yes, it is advisable.
Some of the various list constructors (like ArrayList) even take arguments so you can "pre-allocate" sufficient backing storage, alleviating the need for the list to "grow" to the proper size as you add elements.
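A minimal sketch of pre-allocating with the ArrayList(int initialCapacity) constructor follows. The class PresizedListDemo and the method squares are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedListDemo {
    // Builds the first n squares, reserving room for n elements up front
    // so the backing array does not have to grow while adding.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>(n); // initial capacity, not size
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list = squares(5);
        System.out.println(list);        // prints "[0, 1, 4, 9, 16]"
        System.out.println(list.size()); // prints "5"
    }
}
```

Note that the argument is only a capacity: a freshly constructed new ArrayList<>(n) still has size 0, so elements must be added rather than set by index.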
There are different things to consider: Is the type of the array known? Who accesses the array?
There are several issues with arrays, e.g.:
you can not create generic arrays
arrays are covariant: if A extends B -> A[] extends B[], which can lead to ArrayStoreExceptions
you cannot make the elements of an array immutable
...
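The covariance issue in the list above can be shown with a short sketch. The class CovarianceDemo and its method are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {
    // Stores an Integer into an array that is really a String[].
    // This compiles because String[] is assignable to Object[],
    // but the store is checked and rejected at run time.
    static boolean storeFailsAtRuntime() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFailsAtRuntime()); // prints "true"

        // The list equivalent is rejected by the compiler instead:
        // List<Object> objects = new ArrayList<String>(); // does not compile
        List<String> strings = new ArrayList<>();
        strings.add("safe");
        System.out.println(strings); // prints "[safe]"
    }
}
```

With generics the same mistake is caught at compile time, which is one of the arguments for preferring lists.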
Also see, item 25 "Prefer lists to arrays" of the Effective Java book.
That said, sometimes arrays are convenient, e.g. the new Object... parameter syntax.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
Arrays.asList(): http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html#asList%28T...%29
Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ? Performance drawbacks ???
If an array is enough, then use an array. Just to keep things simple. You may even get a slightly better performance out of it. Keep in mind that if you...
ever need to pass the resulting array to a method that takes a Collection, or
if you ever need to work with List-methods such as .contains, .lastIndexOf, or what not, or
if you need to use Collections methods, such as reverse...
then you may just as well go for the Collection/List classes from the beginning.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
You can do
List<String> list = Arrays.asList("foo", "bar");
or
List<String> arrayList = new ArrayList<String>(Arrays.asList("foo", "bar"));
or
List<String> list = new ArrayList<String>() {{ add("foo"); add("bar"); }};
Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ? Performance drawbacks ?
It can be perfectly acceptable to use a List instead of an array, even if you know the size before hand.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
See Arrays.asList().
Related
Is it possible to find out if a list is fixed size or not?
I mean, for example this code:
String[] arr = {"a", "b"};
List<String> list = Arrays.asList(arr);
returns fixed size List backed by an array. But is it possible to understand programmatically if List is fixed-size or not without trying to add/remove elements and catching the exception? For example:
try {
list.add("c");
}
catch(UnsupportedOperationException e) {
// Fixed-size?
}
A list created from a String[] by
List<String> list = Arrays.asList(array);
will have Arrays as its enclosing class, while one created by, for example, new ArrayList() won't have an enclosing class. So the following should work to check if the List was produced as a result of calling Arrays.asList():
static <T> boolean wasListProducedAsAResultOfCallingTheFunctionArrays_asList(List<T> l) {
return Arrays.class.equals(l.getClass().getEnclosingClass());
}
Beware that this method relies on undocumented behavior. It will break if another nested List subclass is ever added to the Arrays class.
Is it possible to find out if some list is fixed size or not?
In theory - No. Fixed sizedness is an emergent property of the implementation of a list class. You can only determine if a list has that property by trying to add an element.
And note that a simple behavioral test would not reliably distinguish between a fixed sized list and a bounded list or a list that was permanently or temporarily read-only.
In practice, a fixed sized list will typically have a different class to an ordinary one. You can test the class of an object to see if it is or isn't a specific class. So if you understand what classes would be used to implement fixed sized lists in your code-base, then you can test if a specific list is fixed sized.
For example the Arrays.asList(...) method returns a List object whose actual class is java.util.Arrays.ArrayList. That is a private nested class, but you could use reflection to find it, and then use Object.getClass().equals(...) to test for it.
However, this approach is fragile. Your code could break if the implementation of Arrays was modified, or if you started using other forms of fixed sized list as well.
No.
The List API is identical regardless of whether a List is expandable or not, something that was deliberate.
There is also nothing in the List API that allows you to query it to determine this feature.
You can't completely reliably determine this information by reflection, because you will be depending on internal details of the implementation, and because there is an unbounded number of classes that are potentially fixed-size. For example, in addition to Arrays.asList, there is also Arrays.asList().subList, which happens to return a different class. There can also be wrappers around the base list like Collections.checkedList, Collections.synchronizedList and Collections.unmodifiableList. There are also other fixed-size lists: Collections.emptyList, Collections.singletonList, and Collections.nCopies. Outside the standard library, there are things like Guava's ImmutableList. It's also pretty trivial to hand-roll a list for something by extending AbstractList (for a fixed-size list you need only implement the size() and get(int) methods).
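To make the last point concrete, here is a sketch of a hand-rolled fixed-size list built on AbstractList. The class RangeListDemo and the method range are invented for this example; the only assumption about the library is that AbstractList's default add throws UnsupportedOperationException, as its documentation states.

```java
import java.util.AbstractList;
import java.util.List;

class RangeListDemo {
    // Hypothetical fixed-size list of start, start + 1, ..., start + count - 1.
    // Only get(int) and size() are implemented; every mutator is inherited.
    static List<Integer> range(final int start, final int count) {
        return new AbstractList<Integer>() {
            @Override
            public Integer get(int index) {
                if (index < 0 || index >= count) {
                    throw new IndexOutOfBoundsException("index: " + index);
                }
                return start + index;
            }

            @Override
            public int size() {
                return count;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> list = range(10, 3);
        System.out.println(list); // prints "[10, 11, 12]"
        try {
            list.add(13); // inherited add() is unsupported
        } catch (UnsupportedOperationException e) {
            System.out.println("fixed-size"); // prints "fixed-size"
        }
    }
}
```

The list's runtime class is an anonymous class that no class-based check written in advance could know about, which is why such checks cannot be complete.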
Even if you detect that your list is not fixed-size, the specification of List.add allows it to refuse elements for other reasons. For example, Collections.checkedList wrappers throw a ClassCastException for elements of unwanted type.
And even if you know your list is expandable, and allows arbitrary elements, that doesn't mean you want to use it. Perhaps it's synchronized, or not synchronized, or isn't serializable, or it's a slow linked list, or has some other quality that you don't want.
If you want control over the type, mutability, serializability, or thread-safety of the list, or you want to be sure that no other code has kept a reference to it, the practice is that you create a new one yourself. It's not expensive to do so when unnecessary (memcopies are blazing fast), and it lets you reason more definitely about what your code will actually do at runtime. If you'd really like to avoid creating unnecessary copies, try whitelisting instead of blacklisting list classes. For example:
if (list.getClass() != ArrayList.class) {
list = new ArrayList<>(list);
}
(Note: That uses getClass instead of instanceof, because instanceof would also be true for any weird subclasses of ArrayList.)
There are immutable collections in java-9, but there is still no common @Immutable annotation, for example, or a common marker interface that we could query to get this information.
The simplest way I can think of would be simply to get the name of the class of such an instance:
String nameList = List.of(1, 2, 3).getClass().getName();
System.out.println(nameList.contains("Immutable"));
but that still relies on internal details, since it queries the name of the common class ImmutableCollections, that is not public and obviously can change without notice.
I am writing a program that will be heavily reliant on ... something ... that stores data like an array where I am able to access any point of the data at any given time as I can in an array.
I know that the java library has an Array class that I could use or I could use a raw array[].
I expect that using the Array type is a bit easier to code, but I expect that it is slightly less efficient as well.
My question is, which is better to use between these two, and is there a better way to accomplish the same result?
Actually Array would be of no help -- it's not what you think it is. The class java.util.ArrayList, on the other hand, is. In general, if you can program with collection classes like ArrayList, do so -- you'll more easily arrive at correct, flexible software that's easier to read, too. And that "if" applies almost all the time; raw arrays are something you use as a last resort or, more often, when a method you want to call requires one as an argument.
The Array class is used for Java reflection and is very, very, rarely used.
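For context, this is roughly what the reflection-oriented class looks like in use, assuming the class meant here is java.lang.reflect.Array. The wrapper class ReflectArrayDemo and the helper newArrayOf are illustrative names.

```java
import java.lang.reflect.Array;

class ReflectArrayDemo {
    // Creates an array whose component type is only known at run time.
    static Object newArrayOf(Class<?> componentType, int length) {
        return Array.newInstance(componentType, length);
    }

    public static void main(String[] args) {
        Object array = newArrayOf(int.class, 3);
        Array.setInt(array, 0, 7);                  // reflective form of array[0] = 7
        System.out.println(Array.getLength(array)); // prints "3"
        System.out.println(Array.getInt(array, 0)); // prints "7"
        System.out.println(array instanceof int[]); // prints "true"
    }
}
```

It is a set of static helpers for manipulating ordinary arrays reflectively, not a container type, which is why it does not help with the question being asked.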
If you want to store data in an array, use plain old arrays, indicated with [], or as Gabe's comment on the question suggests, java.util.ArrayList. ArrayList is, as your comment suggests easier to code (when it comes to adding and removing elements!!) but yes, is slightly less efficient. For variable-size collections, ArrayList is all but required.
My question is, which is better to use between these two, and is there a better way to accomplish the same result?
It depends on what you are trying to achieve:
If the number of elements in the array is known ahead of time, then an array type is a good fit. If not, a List type is (at least) more convenient to use.
The List interface offers a number of methods such as contains, add, remove and so on that can save you coding ... if you need to do that sort of thing.
If properly used, an array type will use less space. The difference is particularly significant for arrays of primitive types where using a List means that the elements need to be represented using wrapper types (e.g. byte becomes Byte).
The Array class is not useful in this context, and neither is the Arrays class. The choice is between ArrayList (or some other List implementation class) and plain arrays.
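The wrapper-type point can be sketched as follows. BoxingDemo and its two methods are hypothetical names; the sketch shows where boxing and unboxing happen, and does not measure anything.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Sums an int[]: the elements are stored and read as primitives.
    static long sumArray(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Sums a List<Integer>: every element is an Integer object
    // that is unboxed on each read.
    static long sumList(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] array = {1, 2, 3};
        List<Integer> list = new ArrayList<>();
        for (int v : array) {
            list.add(v); // autoboxing: int becomes Integer
        }
        System.out.println(sumArray(array)); // prints "6"
        System.out.println(sumList(list));   // prints "6"
    }
}
```

Both versions compute the same result; the difference is that the list holds references to Integer objects rather than the int values themselves.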
In terms of ease of use, a list class such as ArrayList is a lot easier to code with.
A plain array[] is a problem in that you need to know the number of objects beforehand.
Instead, you could use a HashMap. It is very efficient for lookups, since each lookup is done by key; note that a HashMap does not keep its entries sorted.
You could declare a HashMap as:
HashMap<String, Object> map = new HashMap<String, Object>();
For the Object you can use your class, and for key use the value which needs to be unique.
When is it better to use a vector than an array and vice versa in java? and why?
Vector: never, unless an API requires it, because it's a class, not an interface.
List: this should be your default array-like collection. It's an interface so anything can be a List if it needs to. (and there are lots of List implementations out there e.g. ArrayList, LinkedList, CopyOnWriteArrayList, ImmutableList, for various feature sets)
Vector is threadsafe, but so is the Collections.synchronizedList() wrapper.
array: rarely, if required by an API. The one other major advantage of arrays is when you need a fixed-length array of primitives, the memory space required is fairly compact, as compared to a List<Integer> where the integers need to be boxed into Integer objects.
A Vector (or List) when you don't know before hand how many elements are going to be inserted.
An array when you absolutely know what's the maximum number of elements on that vector's whole life.
Since there are no high performance penalties when using List or Vector (any collection for that matter), I would always choose to use them. Their flexibility is too important to not be considered.
Nowadays I only use arrays when I absolutely need to. Example: when using an API that requires them.
First off ArrayList is a faster implementation than Vector (but not thread safe though).
Arrays are handy when you know the length beforehand and will not change much (or at all).
When declaring a method, use List.
Do not use Vector, it's an early part of the JDK and was retrofitted to work with Collections.
If there's a very performance-sensitive algorithm, a private array member can be helpful. If there's a need to return its contents or pass them to a method, it's generally best to construct an object around it, perhaps as simple as Arrays.asList(thePrivateArray). For a thread-safe list: Collections.synchronizedList(Arrays.asList(thePrivateArray)). In order to prevent modification of the array contents, I typically use Collections.unmodifiableList(Arrays.asList(thePrivateArray)).
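A minimal sketch of that pattern, using a made-up class Roster with a private String[] field:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Roster {
    private final String[] names; // private backing array

    Roster(String... names) {
        this.names = names.clone(); // defensive copy of the caller's array
    }

    // Read-only view over the private array; no elements are copied.
    List<String> names() {
        return Collections.unmodifiableList(Arrays.asList(names));
    }

    // Internal code can still update the array directly.
    void rename(int index, String newName) {
        names[index] = newName;
    }

    public static void main(String[] args) {
        Roster roster = new Roster("ann", "bob");
        List<String> view = roster.names();
        roster.rename(0, "amy");
        System.out.println(view); // prints "[amy, bob]"
        try {
            view.set(0, "eve"); // callers cannot write through the view
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only"); // prints "read-only"
        }
    }
}
```

The view stays live: later changes to the array show up in a list handed out earlier, so a caller that needs a snapshot should copy it.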
In Java, when would it be preferential to use a List rather than an Array?
I see the question as being the opposite-
When should you use an Array over a List?
Only when you have a specific reason to do so (e.g. project constraints, memory concerns (not really a good reason), etc.).
Lists are much easier to use (imo), and have much more functionality.
Note: You should also consider whether or not something like a Set, or another datastructure is a better fit than a List for what you are trying to do.
Each data structure, and implementation, has different pros/cons. Pick the ones that excel at the things that you need to do.
Need get() to be O(1) for any item? Likely use an ArrayList. Need O(1) insert()? Possibly a LinkedList. Need O(1) contains()? Possibly a HashSet.
TLDR: Each data structure is good at some things, and bad at others. Look at your objectives and choose the data structure that best fits the given problem.
Edit:
One thing not noted is that you're better off declaring the variable as its interface (i.e. List or Queue) rather than its implementing class. This way, you can change the implementation at some later date without changing anything else in the code.
As an example:
List<String> myList = new ArrayList<String>();
vs
List<String> myList = new LinkedList<String>();
Note that myList is a List in both examples.
--R. Bemrose
Rules of thumb:
Use a List for reference types.
Use arrays for primitives.
If you have to deal with an API that is using arrays, it might be useful to use arrays. OTOH, it may be useful to enforce defensive copying with the type system by using Lists.
If you are doing a lot of List type operations on the sequence and it is not in a performance/memory critical section, then use List.
Low-level optimisations may use arrays. Expect nastiness with low-level optimisations.
Most people have answered it already.
There is almost no good reason to use an array instead of a List. The main exception is the primitive array (like int[]). You cannot create a primitive list (you must use List<Integer>).
The most important difference is that when using a List you can decide what implementation will be used. The most obvious choice is between LinkedList and ArrayList.
I would like to point out in this answer that choosing the implementation gives you very fine grained control over the data that is simply not available to array:
You can prevent clients from modifying your list by wrapping it with Collections.unmodifiableList
You can synchronize a list for multithreading using Collections.synchronizedList
You can create a fixed-capacity queue with an implementation such as LinkedBlockingQueue
... etc
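As one illustration of the options listed above, here is a sketch of a capacity-bounded queue using LinkedBlockingQueue's int constructor. The class BoundedQueueDemo and the helper fill are names invented for the example.

```java
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

class BoundedQueueDemo {
    // Offers each item in turn and returns how many the queue accepted.
    static int fill(Queue<String> queue, String... items) {
        int accepted = 0;
        for (String item : items) {
            if (queue.offer(item)) { // offer() returns false when the queue is full
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        Queue<String> queue = new LinkedBlockingQueue<>(2); // capacity of 2
        System.out.println(fill(queue, "a", "b", "c"));     // prints "2"
        System.out.println(queue);                          // prints "[a, b]"
    }
}
```

An array offers no equivalent of this kind of behaviour switch: the caller's code stays the same while the chosen implementation enforces the bound.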
In any case, even if you don't want any extra feature of the list right now, just use an ArrayList and size it with the size of the array you would have created. It will use an array in the back-end, and the performance difference with a real array will be negligible (except for primitive arrays).
Pretty much always prefer a list. Lists have much more functionality, particularly iterator support. You can convert a list to an array at any time with the toArray() method.
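The toArray() conversion mentioned above can be sketched like this; ToArrayDemo and toStringArray are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArrayDemo {
    // Copies the list into a new String[]; passing a typed array tells
    // toArray which runtime array type to produce.
    static String[] toStringArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        String[] array = toStringArray(list);
        System.out.println(Arrays.toString(array)); // prints "[a, b]"
    }
}
```

The no-argument toArray() returns Object[], so the typed overload is usually the one wanted when an API expects a specific array type.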
Always prefer lists.
Arrays when
Varargs for a method ( I guess you are forced to use Arrays here ).
When you want your collections to be covariant ( arrays of reference types are covariant ).
Performance critical code.
If you know how many things you'll be holding, you'll want an array. My screen is 1024x768, and a buffer of pixels for that isn't going to change in size ever during runtime.
If you know you'll need to access specific indexes (go get item #763!), use an array or array-backed list.
If you need to add or remove items from the group regularly, use a linked list.
In general: when dealing with hardware, use arrays; when dealing with users, use lists.
It depends on what kind of List.
It's better to use a LinkedList if you know you'll be inserting many elements in positions other than the end. LinkedList is not suitable for random access (getting the i'th element).
It's better to use an ArrayList if you don't know, in advance, how many elements there are going to be. The ArrayList correctly amortizes the cost of growing the backing array as you add more elements to it, and is suitable for random access once the elements are in place. An ArrayList can be efficiently sorted.
If you want the array of items to expand (i.e. if you don't know what the size of the list will be beforehand), a List will be beneficial. However, if you want performance, you would generally use an array.
In many cases the type of collection used is an implementation detail which shouldn't be exposed to the outside world. The more generic your return type is, the more flexibility you have to change the implementation afterwards.
Arrays (primitive type, i.e. new int[10]) are not generic; you won't be able to change your implementation without an internal conversion or altering the client code. You might want to consider Iterable as a return type.
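A sketch of the Iterable idea, assuming a hypothetical class Temperatures that stores its data in an int[] internally:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Temperatures {
    private final int[] readings; // internal representation, free to change later

    Temperatures(int... readings) {
        this.readings = readings.clone();
    }

    // Callers only learn that they can iterate; the int[] stays hidden.
    Iterable<Integer> readings() {
        List<Integer> copy = new ArrayList<>(readings.length);
        for (int r : readings) {
            copy.add(r);
        }
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        Temperatures t = new Temperatures(18, 21, 19);
        int max = Integer.MIN_VALUE;
        for (int r : t.readings()) {
            max = Math.max(max, r);
        }
        System.out.println(max); // prints "21"
    }
}
```

The copy made on each call is a trade-off chosen for simplicity here; the point is only that the internal storage could change without touching client code.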
This is a two-part question:
First, I am interested to know what the best way to remove repeating elements from a collection is. The way I have been doing it up until now is to simply convert the collection into a set. I know sets cannot have repeating elements so it just handles it for me.
Is this an efficient solution? Would it be better/more idiomatic/faster to loop and remove repeats? Does it matter?
My second (related) question is: What is the best way to convert an array to a Set? Assuming an array arr, the way I have been doing it is the following:
Set x = new HashSet(Arrays.asList(arr));
This converts the array into a list, and then into a set. Seems to be kinda roundabout. Is there a better/more idiomatic/more efficient way to do this than the double conversion way?
Thanks!
Do you have any information about the collection, like say it is already sorted, or it contains mostly duplicates or mostly unique items? With just an arbitrary collection I think converting it to a Set is fine.
Arrays.asList() doesn't create a brand new list. It actually just returns a List which uses the array as its backing store, so it's a cheap operation. So your way of making a Set from an array is how I'd do it, too.
Use HashSet's standard Collection conversion constructor. According to The Java Tutorials:
Here's a simple but useful Set idiom. Suppose you have a Collection, c, and you want to create another Collection containing the same elements but with all duplicates eliminated. The following one-liner does the trick.
Collection<Type> noDups = new HashSet<Type>(c);
It works by creating a Set (which, by definition, cannot contain a duplicate), initially containing all the elements in c. It uses the standard conversion constructor described in The Collection Interface section.
Here is a minor variant of this idiom that preserves the order of the original collection while removing duplicate elements.
Collection<Type> noDups = new LinkedHashSet<Type>(c);
The following is a generic method that encapsulates the preceding idiom, returning a Set of the same generic type as the one passed.
public static <E> Set<E> removeDups(Collection<E> c) {
return new LinkedHashSet<E>(c);
}
Assuming you really want set semantics, creating a new Set from the duplicate-containing collection is a great approach. It's very clear what the intent is, it's more compact than doing the loop yourself, and it leaves the source collection intact.
For creating a Set from an array, creating an intermediate List is a common approach. The wrapper returned by Arrays.asList() is lightweight and efficient. There's not a more direct API in core Java to do this, unfortunately.
I think your approach of putting items into a set to produce the collection of unique items is the best one. It's clear, efficient, and correct.
If you're uncomfortable using Arrays.asList() on the way into the set, you could simply run a foreach loop over the array to add items to the set, but I don't see any harm (for non-primitive arrays) in your approach. Arrays.asList() returns a list that is "backed by" the source array, so it doesn't have significant cost in time or space.
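The "for non-primitive arrays" caveat deserves a quick illustration. In the sketch below, the class PrimitiveAsListDemo and the helper toSet are made-up names; the behaviour shown follows from Arrays.asList being a generic varargs method.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class PrimitiveAsListDemo {
    // For an int[], a plain loop is a straightforward way to build the set.
    static Set<Integer> toSet(int[] values) {
        Set<Integer> set = new LinkedHashSet<>();
        for (int v : values) {
            set.add(v); // autoboxes each int
        }
        return set;
    }

    public static void main(String[] args) {
        int[] primitives = {1, 2, 2, 3};
        // The whole int[] becomes the single element of a List<int[]>.
        System.out.println(Arrays.asList(primitives).size());         // prints "1"

        Integer[] boxed = {1, 2, 2, 3};
        System.out.println(new HashSet<>(Arrays.asList(boxed)).size()); // prints "3"

        System.out.println(toSet(primitives));                        // prints "[1, 2, 3]"
    }
}
```

So the Arrays.asList route works as expected for arrays of objects, while arrays of primitives need a loop or some other conversion.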
1.
Duplicates
Concurring with other answers: Using Set should be the most efficient way to remove duplicates. HashSet should run in O(n) time on average. Looping and removing repeats would run in the order of O(n^2). So using Set is recommended in most cases. There are some cases (e.g. limited memory) where iterating might make sense.
2.
Arrays.asList() is a cheap operation that doesn't copy the array, with minimal memory overhead. You can manually add elements by iterating through the array.
public static <T> Set<T> arrayToSet(T[] array) {
    // The initial capacity is only a sizing hint; the set grows as needed.
    Set<T> set = new HashSet<>(array.length / 2);
    for (T item : array) {
        set.add(item);
    }
    return set;
}
Barring any specific performance bottlenecks that you know of (say a collection of tens of thousands of items) converting to a set is a perfectly reasonable solution and should be (IMO) the first way you solve this problem, and only look for something fancier if there is a specific problem to solve.