How is ArrayList represented internally in the Java Collections Framework?

I was going through the Algorithms lectures on Coursera by Robert Sedgewick. I was a bit confused when he pointed out that one cannot use generics with arrays, as generic array creation is not allowed.
But ArrayList in the Collections Framework uses arrays internally, and generic data types are allowed. I mean to say that we can do the following:
ArrayList<Integer> list = new ArrayList<Integer>();
One hack he pointed out was this:
public class FixedCapacityStack<Item> {
    private Item[] s;
    private int N = 0;

    public FixedCapacityStack(int capacity) {
        s = (Item[]) new Object[capacity]; // this hack
    }
}
He also mentioned that this is an ugly hack that must be avoided, and that it produces a warning during compilation.
My Question is:
1.) How does ArrayList then internally represent various Generics Types?
2.) If (assumed) they use the hack mentioned above why it doesn't produce a warning when we compile a program with ArrayList?
3.) Is there any better way apart from that cast above?

Per the source:
1 - ArrayList stores items in an Object[], and casts the value when retrieving individual elements. There's actually a @SuppressWarnings("unchecked") where the cast happens.
2 - Two answers here - the first is that you're not (typically) compiling ArrayList, but just including it on your classpath from rt.jar in the JRE/JDK. The second is that ArrayList uses a @SuppressWarnings on its unchecked conversion from Object to the generic type.
3 - Your other alternative ("better" is quite subjective) would be to require the Class for your generic type, and use Array.newInstance(Class clazz, int capacity) to create your array, as described in this question.
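The third option can be sketched roughly as follows. This is only an illustration, not how ArrayList itself is written: TypedStack, its methods and the backingArray accessor are hypothetical names, and the class assumes the caller can supply a Class token for the element type.

```java
import java.lang.reflect.Array;

// Hypothetical variant of the question's FixedCapacityStack that asks the
// caller for a Class token, so the backing array has the real component type.
class TypedStack<Item> {
    private final Item[] s;
    private int n = 0;

    @SuppressWarnings("unchecked") // Array.newInstance returns Object, so a cast is still needed
    TypedStack(Class<Item> type, int capacity) {
        s = (Item[]) Array.newInstance(type, capacity);
    }

    void push(Item item) { s[n++] = item; }

    Item pop() {
        Item item = s[--n];
        s[n] = null; // drop the reference so the element can be garbage collected
        return item;
    }

    // Safe to expose here: the runtime type really is Item[] (for example String[]).
    Item[] backingArray() { return s; }
}
```

Note that the cast and the unchecked warning do not go away; what changes is that the array's runtime type now matches the declared type, so handing it out does not fail.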

1.) How does ArrayList then internally represent various Generics Types?
What do you mean "internally"? Generics only exist at compile time. ArrayList has already been compiled by someone else for you and you are just using the class file. So there are no generics there.
Different Java library implementations could write the source differently, but that is of no concern to you. What it does "internally" is an internal implementation detail that a user of the class should not care about.
If you were to write your own class like FixedCapacityStack, then you could do it in different ways:
You could do the thing where s is of type Item[] as you have shown above, and you create an Object[] and cast to Item[]
Or you can make s of type Object[] and cast to type Item when you get elements out of it
Note that both approaches are the same after erasure, so both will compile to the exact same bytecode. The difference is just style at compile-time.
The advantage of the first approach over the second is that when you get elements out of it, it's already the right type, so you don't have all these ugly casts everywhere. The disadvantage of the first approach is that the initial cast from Object[] to Item[] is basically a "lie", and it will only work if you make absolutely sure not to expose s to the outside of the class (e.g. do not have a method that returns s as type Item[]); otherwise you will have class cast exceptions in unexpected places.
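A minimal sketch of the trade-off described above, using hypothetical class names (ObjectBackedStack, LeakyStack, LeakDemo) that are not part of the question. The first class uses the second approach; the second class uses the first approach and deliberately makes the mistake of exposing the array.

```java
// Second approach: keep the field as Object[] and cast on the way out.
class ObjectBackedStack<Item> {
    private final Object[] s;
    private int n = 0;

    ObjectBackedStack(int capacity) { s = new Object[capacity]; }

    void push(Item item) { s[n++] = item; }

    @SuppressWarnings("unchecked") // safe: only push(Item) ever writes into s
    Item pop() { return (Item) s[--n]; }
}

// First approach, with the mistake the answer warns about: leaking the array.
class LeakyStack<Item> {
    private final Item[] s;
    private int n = 0;

    @SuppressWarnings("unchecked")
    LeakyStack(int capacity) { s = (Item[]) new Object[capacity]; }

    void push(Item item) { s[n++] = item; }

    Item[] leak() { return s; } // the runtime type is Object[], not Item[]
}

class LeakDemo {
    // Returns true if using the leaked array as String[] fails at runtime.
    static boolean leakFails() {
        LeakyStack<String> stack = new LeakyStack<>(2);
        stack.push("a");
        try {
            String[] exposed = stack.leak(); // the compiler inserts a cast to String[] here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The exception is raised in the caller, far from the constructor that performed the unchecked cast, which is what "class cast exceptions in unexpected places" means in practice.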
2.) If (assumed) they use the hack mentioned above why it doesn't produce a warning when we compile a program with ArrayList?
There would only be a warning when you actually compile this class. But not if it was already compiled and you are just using the class file. In fact, you don't usually even have the source of ArrayList.
3.) Is there any better way apart from that cast above?
Depends on what you mean by "better". I have shown the two approaches and the advantages and disadvantages of each.

Related

How to instance ArrayList<String>[] in Java [duplicate]

I'm having trouble figuring out what type parameter is expected at RHS of the following
ArrayList<Pair<ParseNode,ParseNode>>[] nodes = new ArrayList[indexes.length];
Why a copy of <Pair<ParseNode,ParseNode>> is not legitimate?
Arrays of concrete parameterized types are inherently broken. Remember arrays are covariant and the array type check is a runtime operation. At runtime all the generics have been type erased, so the array store check can't tell <Pair<ParseNode, ParseNode>> from <Pair<BigInteger,IOException>>.
The fundamental contract of a generic is "I, the compiler, promise that if you write code that generates no warnings, you will never get a class cast exception at runtime."
Neither can the compiler guarantee to you that it will be able to give you a compile time error if something that is not an ArrayList<Pair<ParseNode,ParseNode>> is put into that array. Nor can the runtime system guarantee you will get an ArrayStoreException (like the Language Specification says it will) if you add the wrong type, rather than a ClassCastException later when you take it back out. (The second part is really why it's actually illegal rather than just a warning, it would result in an array that doesn't obey the language specification.)
So it doesn't let you declare them that way and forces you to acknowledge the 'unsafe' warning. That way it has said "I told you I can't guarantee there will not be any class cast exceptions as a result of using this array, it's on you to make sure you only put the right things in here."
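The difference between the two failure modes can be shown with a small sketch. ArrayStoreDemo and its method names are made up for illustration, and String and Integer stand in for the Pair types from the question.

```java
import java.util.ArrayList;

class ArrayStoreDemo {
    // A plain array remembers its component type, so a bad store fails at once.
    static boolean plainArrayRejectsBadStore() {
        Object[] objects = new String[1]; // legal: arrays are covariant
        try {
            objects[0] = Integer.valueOf(1);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // An array of a parameterized type only remembers "ArrayList", so the bad
    // store succeeds and the failure surfaces later, when an element is read.
    @SuppressWarnings("unchecked")
    static boolean genericArrayFailsLate() {
        ArrayList<String>[] lists = new ArrayList[1];
        Object[] alias = lists;
        ArrayList<Integer> numbers = new ArrayList<>();
        numbers.add(42);
        alias[0] = numbers; // no ArrayStoreException: it is an ArrayList
        try {
            String s = lists[0].get(0); // the hidden cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```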
Java does not support generic arrays. Arrays are covariant, generics are not. This means that if class A extends class B, then an A[] is also a B[]. And the code
A[] a = new A[10];
B[] b = a;
is legal.
But it is not the same for generics. You cannot assign a Foo<T> to a Foo<X> even if T extends X. And so the elements of a Foo<T>[] can't be guaranteed to be type safe.
EDIT
Excuse me for just linking out, but I've found the article "Java theory and practice: Generics gotchas", which explains everything about array covariance better than I could ever hope to.
Don't use an array. Use another ArrayList.
ArrayList<List<Pair<ParseNode,ParseNode>>> listOfLists = new ArrayList<List<Pair<ParseNode,ParseNode>>>();
listOfLists.add(new ArrayList<Pair<ParseNode,ParseNode>>());

Array of Lists, and workarounds

I want to create an array of ArrayLists, similar to that in this thread: How to do an array of hashmaps?. However, Java gives the error
"Cannot create a generic array of ArrayList<String>"
when I try to do the following
ArrayList<String>[] arrayOfLists = new ArrayList<String>[size];
I have sort of understood the problems and the workarounds provided.
I have my own approach which unfortunately does not fix the problem either.
I tried creating a list of ArrayLists and then used toArray().
ArrayList<ArrayList<String>> listOfLists = new ArrayList<ArrayList<String>>();
ArrayList<String>[] arrayOfLists = (ArrayList<String>[])listOfLists.toArray();
Worked fine, but got the warning :
Type safety: Unchecked cast from Object[] to ArrayList<String>[]
When I tried to check for type safety, using
if(listOfLists.toArray() instanceof ArrayList<String>[])
I get the error:
Cannot perform instanceof check against parameterized type ArrayList<String>[]. Use the form ArrayList<?>[] instead since further generic type information will be erased at runtime
Why can't I use this method? Why does toArray() return Object[] instead of ArrayList<String>[], since the instance was initialised with the ArrayList<String> type?
Any other workarounds/suggestions on how I can get this done? A 2D array will not work since different lists can vary greatly in size.
The currently accepted answer has a major error in describing Java's generics, so I felt I should answer to make sure there aren't any misconceptions.
Generics in Java are an entirely compile-time feature and for the most part don't exist at runtime due to erasure (you can get the runtime to cough up generic type information in some cases, but that's far from the general case). This provides the basis for the answers to your questions.
Why can't I use this method?
Because generics are erased, an ArrayList<String>[] (as well as all other parameterized ArrayList<>[] instances) at runtime is really an ArrayList[]. Thus, it is impossible for the runtime to check if something is instanceof ArrayList<String>[], as the runtime doesn't actually know that String is your type parameter -- it just sees ArrayList[].
Why does toArray() return Object[] instead of ArrayList<String>[], since the instance was initialised with the ArrayList<String> type?
Again, erasure. The type parameter is erased to Object, so at runtime what you effectively have is an ArrayList<Object>. Because of this erasure, the runtime doesn't have the information necessary to return an array of the proper type; it only knows that the ArrayList holds Objects, so it returns an Object[]. This is why the toArray(T[]) overload exists -- arrays retain their type information, so an array could be used to provide the requisite type information to return an array of the right type.
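As a rough illustration of the two toArray overloads on a java.util.ArrayList (ToArrayDemo is a made-up class name, and String is used as the element type to keep the sketch short):

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayDemo {
    static String describe() {
        List<String> names = new ArrayList<>();
        names.add("a");
        names.add("b");

        Object[] untyped = names.toArray();            // runtime type Object[]
        String[] typed = names.toArray(new String[0]); // runtime type String[]

        return untyped.getClass().getSimpleName() + " " + typed.getClass().getSimpleName();
    }
}
```

The array passed to toArray(T[]) is used mainly as a carrier of the component type; when it is too small, as here, the list allocates a new array of the same runtime type.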
Any other workarounds/suggestions on how I can get this done?
As you can see, mixing generic stuff and arrays doesn't work too well, so ideally, you wouldn't mix Lists and arrays together. Therefore, if possible, you should use List<List<String>> or something of the sort instead of List<String>[]. If you want to keep a ArrayList<String>[], though, you could do this:
@SuppressWarnings("unchecked")
ArrayList<String>[] array = new ArrayList[size];
You'll still get the unchecked type warning, but you can be reasonably sure that you won't encounter heap pollution as the only reference to the object is through array. You can also use this as the parameter to toArray():
@SuppressWarnings("unchecked")
ArrayList<String>[] temp = new ArrayList[0];
ArrayList<String>[] arrayOfLists = listOfLists.toArray(temp);
or
@SuppressWarnings("unchecked")
ArrayList<String>[] arrayOfLists = listOfLists.toArray((ArrayList<String>[]) new ArrayList[0]);
For more reading on why you can't parameterize an array, see this SO question. In short, such a thing isn't safe because arrays are covariant, while generics are invariant.
The problem is that Generics are created during runtime, but type conversions and array sizes must be checkable at compile time. The compiler cannot tell what class ArrayList<String> will be during compile time (as it will be generated later), it can only tell that it will be at least an Object, because every class in Java is at least an Object. You can do type conversion and suppress the warning and it might even work, but you run into a pitfall to accidentally confuse types somewhere and mess up your code.
Java is a type-safe language by choice to prevent you from doing one of the most recurring mistakes programmers do in their daily work: confusing variable types. So while it is possible to do the type conversion, you - as an upcoming good Java programmer - should not do that. Use the ArrayList<ArrayList<String>> if you need such a construct, and use arrays only when they are necessary.
The main reason to use arrays is speed of execution, as obviously using an object will keep the runtime busy with some overhead. The main reason to not use arrays is the fact that this overhead will allow you more flexibility in coding and reduce the amount of errors you make. So as a general advice: unless you know (as in measured and determined to be a bottleneck) that you need the speed, go with Lists. Java even does some internal optimizations beyond what you would expect to speed up Lists to a point where they come very close to the execution speed of arrays.

Adding values to Arraylist

Code 1:
ArrayList arr = new ArrayList();
arr.add(3);
arr.add("ss");
Code 2:
ArrayList<Object> arr = new ArrayList<Object>();
arr.add(3);
arr.add("ss");
Code 3:
ArrayList<Object> arr = new ArrayList<Object>();
arr.add(new Integer(3));
arr.add(new String("ss"));
All three of the above snippets work fine. Can someone tell me which is preferred and why? And why does the Eclipse compiler always give a warning when the type arguments are not given for the ArrayList? Thanks in advance.
First simple rule: never use the String(String) constructor, it is absolutely useless (*).
So arr.add("ss") is just fine.
With 3 it's slightly different: 3 is an int literal, which is not an object. Only objects can be put into a List. So the int will need to be converted into an Integer object. In most cases that will be done automagically for you (that process is called autoboxing). It effectively does the same thing as Integer.valueOf(3) which can (and will) avoid creating a new Integer instance in some cases.
So actually writing arr.add(3) is usually a better idea than using arr.add(new Integer(3)), because it can avoid creating a new Integer object and instead reuse an existing one.
Disclaimer: I am focusing on the difference between the second and third code blocks here and pretty much ignoring the generics part. For more information on the generics, please check out the other answers.
(*) there are some obscure corner cases where it is useful, but once you approach those you'll know never to take absolute statements as absolutes ;-)
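The object-reuse point can be checked with a short sketch. BoxingDemo is a hypothetical name; the comparison with == is used here only to observe object identity and is not something to do in real code.

```java
class BoxingDemo {
    static boolean smallIntsAreShared() {
        Integer a = 3;                  // autoboxing compiles to Integer.valueOf(3)
        Integer b = Integer.valueOf(3);
        return a == b;                  // same cached object for values in -128..127
    }

    static boolean newStringIsDistinct() {
        String literal = "ss";
        String copy = new String("ss"); // always allocates a new object
        return literal != copy && literal.equals(copy);
    }
}
```

Outside the range -128 to 127 the language does not promise that boxed values are shared, so equals() remains the right way to compare them.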
The second one would be preferred:
it avoids unnecessary/inefficient constructor calls
it makes you specify the element type for the list (if that is missing, you get a warning)
However, having two different types of object in the same list has a bit of a bad design smell. We need more context to speak on that.
The second form is preferred:
ArrayList<Object> arr = new ArrayList<Object>();
arr.add(3);
arr.add("ss");
Always specify generic arguments when using generic types (such as ArrayList<T>). This rules out the first form.
As to the last form, it is more verbose and does extra work for no benefit.
Actually, a third is preferred:
ArrayList<Object> array = new ArrayList<Object>();
array.add(Integer.valueOf(3));
array.add("ss");
This avoids autoboxing (Integer.valueOf(3) versus 3) and doesn't create an unnecessary String object.
Eclipse complains when you don't use type arguments with a generic type like ArrayList, because you are using something called a raw type, which is discouraged. If a class is generic (that is, it has type parameters), then you should always use type arguments with that class.
Autoboxing, on the other hand, is a personal preference. Some people are okay with it, and some not. I don't like it, and I turn on the warning for autoboxing/autounboxing.
You are getting the warning because ArrayList is part of Java generics. Essentially, it's a way to catch your type errors at compile time. For example, if you declare your array list with type Integer (ArrayList<Integer>) and then try to add Strings to it, you'll get an error at compile time - avoiding nasty crashes at runtime.
The first syntax is there for backward compatibility and should be avoided whenever possible (note that generics were not there in older versions of java).
The second and third examples are pretty much equivalent. As you need to pass an object and not a primitive type to the add method, your 3 is internally converted to an Integer with value 3. By writing a string in double quotes you are effectively creating a String object. When calling new String("ss") you are creating a new String object whose value is the same as the parameter ("ss").
Unless you really do need to store different types in your List, I would suggest actually using a proper type declaration, e.g. ArrayList<Integer> list = new ArrayList<Integer>() - it'll save you a lot of headache in the long run.
If you do need multiple datatypes in the list, then the second example is better.
The last two variants are the same: an int is wrapped into an Integer automatically wherever an Object is needed. If you do not write any class in <>, the elements are treated as Object by default. So there is no difference, but it is easier to understand if you write Object.
Well, by doing the above you open yourself up to run time errors, unless you are happy to accept that your ArrayLists can contain both strings and integers and elephants.
Eclipse gives a warning because it does not want you to be unaware of the fact that by specifying no type for the generic parameter you are opening yourself up to run time errors. At least with the other two examples you know that you can have objects in your ArrayList, and since Integers and Strings are both objects Eclipse doesn't warn you.
Either code 2 or 3 is ok. But if you know you will have either only ints or only strings in your ArrayList then I would do
ArrayList<Integer> arr = new ArrayList<Integer>();
or
ArrayList<String> arr = new ArrayList<String>();
respectively.
There's a faster and easier way in Java 9 that doesn't involve much code: the collection factory methods.
List<String> list = List.of("first", "second", "third");
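One caveat worth sketching, given that this thread is about adding values: the list returned by List.of is unmodifiable, so add() on it throws. ListOfDemo and its method names are hypothetical; copying into an ArrayList is one way to get a list that can grow.

```java
import java.util.ArrayList;
import java.util.List;

class ListOfDemo {
    static boolean factoryListIsUnmodifiable() {
        List<String> fixed = List.of("first", "second", "third");
        try {
            fixed.add("fourth"); // List.of returns an unmodifiable list
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    static List<String> growableCopy() {
        // Copy the fixed contents into an ArrayList when further adds are needed.
        List<String> list = new ArrayList<>(List.of("first", "second", "third"));
        list.add("fourth");
        return list;
    }
}
```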
In the first, you don't define the type that will be held in your ArrayList.
The second is the preferred way to do it: you define the element type of the list and the IDE will handle the rest.
In the third, you would do better to declare the variable as List, for shorter code.

What are the implications of casting a generic List to a non-generic List?

I'm refactoring a home-grown DAO container, hoping to make the class generic. It internally uses an ArrayList to store the retrieved objects.
One usage of this class puts the container's list into a request scope, and due to a limitation of Websphere, I can't pass the generic List<Foo> to the request scope (Websphere doesn't handle generics out-of-the-box).
If I go ahead with my refactorings, I will need to convert/cast the List<Foo> into a non-generic List object.
// Boils down to this...
List<Foo> listFoo = new FooListing().findAllFoo();
List listThings = listFoo;
request.setAttribute("listThings", listThings);
What are the implications of reversing a generification like this? Should I avoid doing this kind of manipulation?
EDIT: The code snippet is verbose to explicitly demonstrate what I'm describing..
If the component type of the List does match the expected type, there is no problem.
Generics in Java are only used for type checks by the compiler; they have no effect at runtime. If you are using an older library that does not support generics, you have no choice but to ignore the generic type.
Things should continue to work, as this system has been designed with backwards compatibility in mind.
So all you are losing is the compile-time type checking (it puts you back to where Java was at 1.4, which means, if the types match, everything will work, if not, you'll get ClassCastExceptions or other unwanted behaviour at runtime).
However, I think you can just write
request.setAttribute("listThings", listFoo);
This method takes any kind of Object. Even if it wanted a List, you could still pass a List<Foo> (which is still a List).
Java uses "type erasure" for generics -- essentially that means that the compiler checks the generics, but the runtime forgets all about it and just treats it as a list of objects.*
Whenever you treat a List<Foo> as just a List, you won't get compiler checks to make sure you don't put a Bla into your list. So you could get a ClassCastException if you call List<Foo>.get() and it turns out to be a Bla hiding in the list. But that can only happen if some code puts a Bla in your list.
If you want to be cautious, then if you pass the List<Foo> as a List to anything that might add a non-Foo to the list, don't treat it as a List<Foo> whenever you access it, but treat it as a list of Objects and add instanceof checks.
*Some of the information is accessible at runtime, but let's not complicate matters.
A "non-generic" version of a generic type is called a "raw type".
Passing a generic type where the raw equivalent is requested is generally ok. This is actually the main reason generics in Java work the way they do (with erasure): to enable interoperability between "generified" code and pre-generics code.
The main thing you need to be careful about is that if you pass a List<Foo> to something that asks for a List, it may put non-Foo objects into the List. You won't get any compile time checking to help you here. You do get some runtime checks: a ClassCastException will be thrown when you use a method that returns a Foo on your List<Foo> and it has to return a non-Foo.
If you want more fail-fast behavior you can wrap your List<Foo> with Collections.checkedList() to get a List that'll check the type of elements on insertion.
Things get more complicated if Foo itself is a generic type. Runtime checks are only done on reified types (ie: the type with generic type parameters removed) so if you give them a List<Set<Bar>> and they insert a Set<Baz> or just a Set, you won't know since the runtime/reified type of the element is Set either way.
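The late failure and the fail-fast alternative with Collections.checkedList can be compared in a small sketch. CheckedListDemo and legacyAdd are made-up names, with legacyAdd standing in for pre-generics code, and String stands in for Foo so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Stands in for legacy code that only knows the raw type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void legacyAdd(List raw) {
        raw.add(Integer.valueOf(1));
    }

    static boolean plainListFailsOnRead() {
        List<String> names = new ArrayList<String>();
        legacyAdd(names); // accepted silently
        try {
            String first = names.get(0); // ClassCastException only here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    static boolean checkedListFailsOnInsert() {
        List<String> names = Collections.checkedList(new ArrayList<String>(), String.class);
        try {
            legacyAdd(names); // rejected at the point of insertion
            return false;
        } catch (ClassCastException e) {
            return names.isEmpty();
        }
    }
}
```

As the paragraph above notes, the check uses the Class token only, so a checked List<Set<Bar>> would still accept any Set.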
First, you can't cast a generic to a non-generic list so yeah you'd have to convert it.
Second, the two main advantages to a generic list are 1) it ensures that all objects are of the specified type and 2) it allows you to directly access methods of the objects in the collection without needing to recast them. This allows you to write cleaner code and saves some processing cycles from having to cast back and forth.
Neither one of these advantages is a dire need however. If you can't use them you won't notice a difference in performance. Your code may look a little messier though.
I have similar problems with Weblogic Portal. Just use the non-generic type for this case.

What does <E> mean in Collection<E>?

What does <E> mean in the code Collection<E>?
It means that you're dealing with a collection of items with type E. Imagine you've got a cup of tea. Instead of tea, it could also hold coffee so it makes sense to describe the cup as a generic entity:
class Cup<T> { … }
now you could fill it, either with coffee or tea (or something else):
Cup<Tea> cuppa = new Cup<Tea>();
Cup<Coffee> foamee = new Cup<Coffee>();
In order for this to work, both Tea and Coffee would need to be types defined in your program as well.
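Filled in, the cup example might look like the sketch below. The bodies of Tea, Coffee and Cup, and the fill and contents methods, are assumptions made for illustration; the answer itself only gives the class header.

```java
class Tea { }
class Coffee { }

// T is a placeholder that is fixed when a Cup is declared.
class Cup<T> {
    private T content;

    void fill(T drink) { content = drink; }

    T contents() { return content; }
}

class CupDemo {
    static boolean demo() {
        Cup<Tea> cuppa = new Cup<Tea>();
        cuppa.fill(new Tea());
        // cuppa.fill(new Coffee()); // would not compile: a Coffee is not a Tea
        Tea tea = cuppa.contents();  // no cast needed
        return tea != null;
    }
}
```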
This is a compile-time constraint on your code. Coming back from the (rather useless) cup example, collections (arrays, lists …) usually contain items of one type, e.g. integers or strings. Generics help you to express this in Java:
Collection<String> strList = new ArrayList<String>();
strList.add("Foobar"); // Works.
strList.add(42); // Compile error!
Notice the compile error above? You only get this when using generics. The following code also works, but would not give the nice error message:
Collection strList = new ArrayList();
strList.add("Foobar"); // Works.
strList.add(42); // Works now. Do we really want this?!
It's the use of generics. Check this intro out. And then don't forget to read this tutorial.
An excerpt follows (which compares the use of a cast versus the use of generics):
When you see the code <Type>, read it as “of Type”; the declaration above reads as “Collection of String c.” The code using generics is clearer and safer. We have eliminated an unsafe cast and a number of extra parentheses. More importantly, we have moved part of the specification of the method from a comment to its signature, so the compiler can verify at compile time that the type constraints are not violated at run time. Because the program compiles without warnings, we can state with certainty that it will not throw a ClassCastException at run time. The net effect of using generics, especially in large programs, is improved readability and robustness.
For example, the interface of a List is
public interface List<E> {
void add(E x);
Iterator<E> iterator();
}
This means you can build a list whose contents are all of the same explicit type (not only of type Object), even if you have defined the type yourself. So, if you create a Name class you can write
List<Name> nameList = new ArrayList<>();
and then fill it with Name instances and directly retrieve Name instances from it without having to cast or otherwise worry about it because you'll always get either a Name instance or null back, never an instance of a different type.
More importantly, you cannot insert anything different from a Name instance in such a List, because it will fail at compile time.
nameList.add(false); //Fails!
nameList.add(new Name("John","Smith")); //Succeeds supposing Name has a
//firstName, lastName constructor
