Sorry if this is a really stupid question, but seeing as "Java arrays are literally just Objects", it makes no sense to me that they need to have a pre-defined length.
I understand why primitive types do, for example int myInt = 15; allocates 32 bits of memory to store an integer and that makes sense to me. But if I had the following code:
class Integer{
int myValue;
public Integer(int myValue){
this.myValue = myValue;
}
}
Followed by Integer myInteger = new Integer(15); myInteger.myValue = 5; then there's no limit on the amount of data I can store in myInteger. It's not limited to 32 bits, but rather it's a pointer to an Object which can store any amount of ints, doubles, Strings, or really anything. It allocated 32 bits of memory to store the pointer, but the object itself can store any amount of data, and it doesn't need to be specified beforehand.
So why can't an array do that? Why do I need to tell an array how much memory to allocate beforehand? If an array is "literally just an object" then why can't I simply say String[] myStrings = new String[];myStrings[0] = "Something";?
I'm super new to Java so there's a 100% chance that this is a stupid question and that there's a very simple and clear answer, but I am curious.
Also, to give another example, I can say ArrayList<String> myStrings = new ArrayList<String>();myStrings.add("Something"); without any problem... So what makes an ArrayList different from an array? Why does an array NEED to be told how much memory to allocate when an ArrayList doesn't?
Thanks in advance to anybody who takes the time to fill me in. :)
EDIT: Okay, so far everybody in the comments has misunderstood my post and I feel like it's my fault for wording it wrong.
My question is not "how do I define an array?", or "does changing the value of a variable change its memory usage?", or "do pointers store the data of the object they point to?", or "are arrays objects?", nor is it "how do ArrayLists work?"
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront? (With ArrayLists being an example of the difference)
I hope this makes more sense now... I'm not sure why everybody misunderstood? (Did I word something wrong? If so, let me know and I'll change it for others' convenience)
My question is why does a pointer to an array need to know how big the array is beforehand, when a pointer to any other object doesn't?
It doesn't. Here, this runs perfectly fine:
String[] x = new String[10];
x = new String[15];
The whole 'needs to know in advance how large it is' refers only to the ARRAY OBJECT. As in, new int[10] goes to the heap, which is like a giant beach, and creates a new treasure chest out of thin air, big enough to hold exactly 10 ints (Which, being primitives, are like coins in this example). It then buries it in the sand, lost forever. Hence why new int[10]; all by its lonesome is quite useless.
When you write int[] arr = new int[10];, you still do that, but you now also make a treasure map. X marks the spot. 'arr' is this map. It is NOT AN INT ARRAY. It is a map to an int array. In java, both [] and . are 'follow the map, dig down, and open er up'.
arr[5] = 10; means: Follow your arr map, dig down, open up the chest you find there, and you'll see it has room for precisely 10 little pouches, each pouch large enough to hold one coin. Take the 6th pouch. Remove whatever was there, put a 10ct coin in.
It's not the map that needs to know how large the chest is that the map leads to. It's the chest itself. And this is true for objects as well, it is not possible in java to make a treasure chest that can arbitrarily resize itself.
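To make the map versus chest distinction concrete, here is a small sketch. The class and variable names are made up for illustration. Two variables can hold a map to the same chest, and pointing one variable at a new array leaves the old array and its length untouched.

```java
public class MapVersusChest {
    // Returns {first.length, second.length, second[5]} after the steps below.
    static int[] demo() {
        int[] first = new int[10];   // a chest with 10 slots; 'first' is the map to it
        int[] second = first;        // a second copy of the same map
        first[5] = 10;               // follow the map, put a coin in the 6th pouch
        first = new int[15];         // draw a new map to a brand-new chest
        // 'second' still leads to the original 10-slot chest, coin included.
        return new int[] { first.length, second.length, second[5] };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Neither array ever changed size; only the variable 'first' was redirected.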
So how does ArrayList work?
Maps-in-boxes.
ArrayList has, internally, a field of type Object[]. That field doesn't hold an object array. It can't. It holds a map to an object array: It's a reference.
So, what happens when you make a new arraylist? It is a treasure chest, fixed size, with room for exactly 2 things:
A map to an 'object array' treasure chest (which it will also make, with room for 10 maps; it buries that chest in the sand and stores the map to this chest-of-maps inside itself).
A coinpouch. The coin inside represents how many objects the list actually contains. The map to the treasure it has leads to a treasure with room for 10 maps, but this coin (value: 0) says that so far, none of those maps go anywhere.
If you then run list.add("foo"), what that does is complicated:
"foo" is an object (i.e. treasure), so "foo" as an expression resolves to be a map to "foo". It then takes your list treasuremap, follows it, digs down, opens the box, and you yell 'oi! ADD THIS!', handing it a copy of your treasuremap to the "foo" treasure. What the box then does with this is opaque to you - that's the point of OO.
But let's dig into the sources of arraylist: What it will do, is query its treasuremap to the object array (which is private, you can't get to it, it's in a hidden compartment that only the djinn that lives in the treasure chest can open), follows it, digs down, and goes to the first slot (why? Because the 'size coin' in the coinpouch is currently at 0). It takes the map-to-nowhere that is there, tosses it out, makes a copy of your map to the "foo" treasure, and puts the copy in there. It then replaces its coin in the coin pouch with a penny, to indicate it is now size 1.
If you add an 11th element, the ArrayList djinn goes out to the other treasure, notices there is no room, and goes: Well, dang. Okay. It then conjures up an entirely new treasure chest that can hold 15 treasure maps, it copies over the 10 maps in the old treasure, moves them to the new treasurechest, adds the copy of the map of the thing you added as 11th, then goes back to its own chest, rips out the map to the real treasure and replaces it to a map of the newly made treasure (With 15 slots), and puts an 11ct coin in the pouch.
The old treasure chest remains exactly where it is. If nobody has any maps to this (and nobody does), eventually, the beachcomber finds it, gets rid of it (that'd be the garbage collector).
Thus, ALL treasure chests are fixed size, but by replacing maps with new maps and conjuring up new treasure chests, you can nevertheless make it look like ArrayList is capable of shrinking and growing.
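The djinn's routine can be sketched in a few lines. This is a simplified illustration, not the actual java.util.ArrayList source: the class name TinyList, the starting capacity of 10 and the 1.5x growth step are assumptions chosen to match the story above.

```java
import java.util.Arrays;

public class TinyList {
    private Object[] slots = new Object[10]; // the map to the chest-of-maps
    private int size = 0;                    // the coin in the pouch

    public void add(Object item) {
        if (size == slots.length) {
            // No room left: conjure a bigger chest (1.5x here), copy the old maps
            // over, and replace our own map so it leads to the new chest.
            slots = Arrays.copyOf(slots, slots.length + (slots.length >> 1));
        }
        slots[size++] = item;
    }

    public Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return slots[index];
    }

    public int size() {
        return size;
    }

    // Exposed only so the sketch can show the chest being swapped.
    public int capacity() {
        return slots.length;
    }
}
```

Adding an 11th element makes capacity() jump from 10 to 15, while the TinyList object itself stays the same fixed-size chest with two compartments.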
So why don't arrays allow it? Because that shrinking and growing stuff is complicated and arrays expose low-level functionality. Don't use arrays, use Lists.
You seem to misunderstand what "storage" means. You say "there's no limit on the amount of data I can store", but if you run myInteger.myValue = 5, you overwrite the value of 15 that you put there originally. You still can't store any more than 32 bits; it's simply that you can change which 32 bits you put in that variable.
If you want to see how ArrayList works, you can read the source code; it can expand because if it runs out of space it creates a new larger array and switches its single array variable elementData to it.
Based on your update, it seems like you may be wondering about the ability to add lots of different fields to your object definition. In this case, those fields and their types are fixed when the class is compiled, and from that point on the class has a fixed size. You can't just pile in extra properties at runtime like you can in JavaScript. You are telling it up front about the scale it needs.
I'm going to ignore most of the details you've given, and answer the question in your edit.
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront?
It's worth starting by dealing with "when I make any other object it scales on its own", because this isn't true. If you create a class like this:
class MyInteger {
    public int value;
    public MyInteger(int value) {
        this.value = value;
    }
}
Then that class has a statically defined size. Once you've compiled this class, the amount of memory for an instance of MyInteger is already determined. In this case, it's the object header size (JVM dependent), and the size of an integer (at least 4 bytes).
Once an object has been allocated by the JVM, its size cannot change. It is treated as a block of bytes by the JVM (and importantly, the garbage collector) until it is reclaimed. Classes like ArrayList give the illusion of growing, but they actually work by allocating other objects, which they store references to.
class MyArrayList {
public int[] values;
public MyArrayList(int[] values) {
this.values = values;
}
}
In this case, the MyArrayList instance will always take the same amount of memory (object header size + reference size), but the array that is referenced may change. We could do something like this:
MyArrayList list = new MyArrayList(new int[50]);
This allocates a block of memory for list, and a block of memory for list.values. If we then do (as ArrayList effectively does internally):
list.values = new int[500];
then the memory allocated for list is still the same size, but we have allocated a new block which we then reference in list.values. This leaves our old int[50] with no references (so it can be garbage collected). Importantly, though, no allocation has changed size. We have reallocated a new, bigger, block for our list to use, and have referenced it from our MyArrayList instance.
Why do arrays in Java need to have a pre-defined length when Objects don't?
In order to understand this, we need to establish that "size" is a complicated concept in Java. There are a variety of meanings:
Each object is stored in the heap as one or more heap nodes, where one of these is the primary node, and the rest are component objects that can be reached from the primary node.
The primary heap node is represented by a fixed and unchanging number of bytes of heap memory. I will call this [1] the native size of the object.
An array has an explicit length field. This field is not declared. It has a type int and cannot be assigned to. There is actually a 32 bit field in the header of each array instance that holds the length.
The length of an array directly maps to its native size. The JVM can compute the native size from the length.
An object that is not an array instance also has a native size. This is determined by the number and types of the object's fields. Since fields cannot be added or removed at runtime, the native size does not change. But it doesn't need to be stored since it can be determined (when needed) at runtime from the object's class.
Some objects support a class specific size concept. For example, a String has a size returned by its length() method, and an ArrayList has a size returned by its size() method.
NB:
The meaning of the class specific size is ... class specific.
The class specific size does not correlate to the native size of an instance. (Except in degenerate cases ...)
In fact, all objects have a fixed native size.
[1] - This term is solely for the purposes of this answer. I claim no authority for this term ...
Examples:
A String[] has a native size that depends on its length. On a typical JVM it will be 12 + length * (<reference size>) rounded up to a multiple of 16 bytes.
Your Integer class has a fixed native size. On a typical JVM each instance will be 16 bytes long.
An ArrayList object has 2 private int fields and a private Object[] field. That gives it a fixed native size of either 16 or 24 bytes. One of the int fields is called size, and it contains the value returned by size().
The size of an ArrayList may change, but this is implemented by the code of the class. In order to do this, it may need to reallocate its internal Object[] to make it large enough to hold more elements. If you examine the source code for the ArrayList class, you can see how this happens. (Look for the ensureCapacity and grow methods.)
So, the differences between the size(s) of regular object and the length of an array are:
The native size of a regular object is determined solely by the type of the object; it never changes. It is rarely relevant to the application and it is not exposed via a field.
The length of an array depends on the value supplied when you instantiate it. It never changes. The native size can be determined from the length.
The class specific size of an object (if relevant) is managed by the class.
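As a rough illustration of these different notions of size, the sketch below (class name invented for this answer) puts an array length, a class specific size() and a String length() side by side. ArrayList.ensureCapacity is a real method; it can reallocate the backing array but leaves size() alone.

```java
import java.util.ArrayList;

public class SizeConcepts {
    // Returns {array length, list size, string length}.
    static int[] demo() {
        String[] array = new String[8];              // length is fixed at creation
        ArrayList<String> list = new ArrayList<>();  // logical size starts at 0
        list.add("a");
        list.add("b");
        list.ensureCapacity(1000);                   // backing array may be replaced; size() is unaffected
        return new int[] { array.length, list.size(), "hello".length() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

None of these three numbers is the native size of the object in bytes; the JVM does not expose that through a field.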
To your revised question:
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront? (With ArrayLists being an example of the difference)
The point is that at the JVM level, NOTHING scales automatically. The native size of a Java object CANNOT change.
Why? Because increasing the size of the object's heap node would entail moving the heap node, and a heap node cannot be moved without updating all references for the object. That cannot be done efficiently.
(It has been pointed out that the GC can efficiently move heap nodes. However, that is not a viable solution. Running the GC is expensive. It would be highly inefficient to perform a GC in order to (say) grow a single Java array. If Java had been specified so that arrays could "grow", it would need to be implemented using an underlying non-growable array type.)
The ArrayList case is handled by the ArrayList class itself, and it does it by (if necessary) creating a new, larger backing array, copying the elements from the old to the new, and then discarding the old backing array. It also adjusts the size field that holds the logical size of the list.
Object arrays allocate space for object pointers, and not entire objects in memory.
So new String[10] doesn't allocate space for 10 strings, but for 10 object references that will point to whatever strings are stored in the array.
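A quick way to see this for yourself (the class name is just for illustration): a freshly created String[] contains only null references, and two slots can refer to one and the same String object.

```java
public class ReferenceSlots {
    // Returns {slot starts out null, two slots refer to the same object}.
    static boolean[] demo() {
        String[] names = new String[10];   // 10 reference slots, no String objects yet
        boolean startsNull = (names[0] == null);

        String shared = "hi";
        names[0] = shared;
        names[1] = shared;                 // both slots hold a reference to the same object
        boolean sameObject = (names[0] == names[1]);

        return new boolean[] { startsNull, sameObject };
    }
}
```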
As far as I know, we can pass multiple dimensional array to a method in Java without size info, like this:
void foo(int arr[][][])
But in C++, you can only exclude the size of the outer-most dimension, like this:
void foo(int arr[][y_size][z_size])
Now I understand that in C++, 'arr' will decay to a pointer, so the compiler needs to know how many elements to skip between two pointers.
My question is, what happens underneath in Java that makes it smarter than C++ on this, so it can distinguish the bounds between elements without knowing the size of each dimension?
C is passing an address to one contiguous area of memory. The recipient needs all but one of the dimensions in order to compute the locations of array elements in memory.
Java is passing a reference to an array object that knows its contents and size. A multidimensional array is not one contiguous area of memory. The computation made by C does not occur. Instead, the multidimensional array is comprised of multiple 1D arrays. Each level but the last is an array of references to arrays.
Array length is still used by Java during array access. Every array access is checked at runtime. If the requested index is greater than or equal to the length (or less than zero), an ArrayIndexOutOfBoundsException is thrown.
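The runtime check is easy to observe. In this small sketch (the helper name isRejected is made up), an index equal to the length or below zero is rejected, while the last valid index is accepted.

```java
public class BoundsCheck {
    // Returns true if reading data[index] is rejected by the runtime bounds check.
    static boolean isRejected(int[] data, int index) {
        try {
            int unused = data[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] data = new int[3];
        System.out.println(isRejected(data, 3));
        System.out.println(isRejected(data, -1));
        System.out.println(isRejected(data, 2));
    }
}
```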
Two things:
Every array, which is a run-time object, in Java has a length property associated with it.
In C, excluding C99 VLAs, arrays are only a compile-time type describing how to access objects therein per a particular layout.
Multi-dimensional arrays in Java are always jagged arrays.
This means that the length per/in the type is not even particularly relevant in Java - every "multi-dimensional array" access goes one array at a time so
r = a[x][y][z]
is merely
a_ = a[x]
b_ = a_[y]
r = b_[z]
and there is no dimension-to-linear calculation involved.
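Here is a minimal sketch of that, with invented names: the inner arrays are created separately, can have different lengths, and an access like grid[1][3] really is two single-array lookups.

```java
public class JaggedDemo {
    // Returns {outer length, row 0 length, row 1 length, value read from grid[1][3]}.
    static int[] demo() {
        int[][] grid = new int[3][];   // outer array of 3 references; no rows exist yet
        grid[0] = new int[1];
        grid[1] = new int[4];          // rows may have different lengths
        grid[2] = new int[2];
        grid[1][3] = 7;

        int[] row = grid[1];           // step one: fetch the reference to the inner array
        int value = row[3];            // step two: index into that inner array
        return new int[] { grid.length, grid[0].length, grid[1].length, value };
    }
}
```

Each row carries its own length, which is why the method signature never needs the dimensions.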
In java arrays are objects which effectively consist of a pointer, and the size of the array.
So in java the extra information is included in the array object.
I want to know if there is a difference in performance if I use a primitive array and then rebuild it to add new elements like this:
AnyClass[] elements = new AnyClass[0];
public void addElement(AnyClass e) {
AnyClass[] temp = new AnyClass[elements.length + 1];
for (int i = 0; i < elements.length; i++) {
temp[i] = elements[i];
}
temp[elements.length] = e;
elements = temp;
}
or if I just use an ArrayList and add the elements.
I am not certain, which is why I ask: is it the same speed because an ArrayList is built the same way as my array version, or is there really a difference, and is a plain array always faster even if I rebuild it every time I add an element?
ArrayLists work in a similar way, but instead of rebuilding on every add they grow their capacity by a constant factor (1.5x in OpenJDK, rather than strictly doubling) each time the limit is reached. So if you are constantly adding to it, an ArrayList will be faster, because recreating the array is fairly slow.
So your implementation could use less memory if you are not adding to it often, but as far as speed goes it will be slower most of the time.
In a nutshell, stick with ArrayList. It is:
widely understood;
well tested;
will probably be more performant than your own implementation (for example, ArrayList.add() is documented to run in amortised constant time, which your method does not).
When an ArrayList resizes, it grows its backing array by a constant factor (about 1.5x in OpenJDK) rather than by one slot, so you are not wasting time resizing on every add. Amortized, that means each add costs constant time on average. That's why you shouldn't waste time reinventing the wheel. The people who created the first one already learned how to make one more efficient and know more about the platform than you do.
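To put rough numbers on the difference, the sketch below only counts how many element copies each strategy performs; it does not measure time. The starting capacity of 10 and the 1.5x growth step are assumptions that mirror OpenJDK's ArrayList, and the class and method names are invented.

```java
public class GrowthCost {
    // Copies made when the array is rebuilt with exactly one extra slot on every add.
    static long copiesGrowByOne(int adds) {
        long copies = 0;
        int capacity = 0;
        for (int size = 0; size < adds; size++) {
            if (size == capacity) {
                copies += size;          // every existing element is copied to the new array
                capacity = size + 1;
            }
        }
        return copies;
    }

    // Copies made when the capacity starts at 10 and grows by half whenever it is full.
    static long copiesGrowByHalf(int adds) {
        long copies = 0;
        int capacity = 10;
        for (int size = 0; size < adds; size++) {
            if (size == capacity) {
                copies += size;
                capacity = capacity + (capacity >> 1);
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesGrowByOne(1000));
        System.out.println(copiesGrowByHalf(1000));
    }
}
```

For 1000 adds, growing by one copies n(n-1)/2 = 499500 elements in total, while the geometric strategy stays within a small multiple of n. That is the difference between quadratic and amortized constant cost per add.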
There is no performance issue with either arrays or ArrayList.
Arrays and ArrayLists are both index based, so element access works the same way in both.
If you require a dynamic array, use an ArrayList.
If the array size is static, go with an array.
Your implementation is likely to lose clearly to Java's ArrayList in terms of speed. One particularly expensive thing you're doing is reallocating the array every time you want to add elements, while Java's ArrayList tries to optimize by having some "buffer" before having to reallocate.
ArrayList uses an array internally, so it is true that a plain array will be faster than an ArrayList. When writing high-performance code, use arrays. For the same reason, arrays are the backbone of most of the collections.
You should read through the JDK implementation of the collections.
We use ArrayList when we are developing an application and are not concerned about such minor performance issues; we accept the trade-off because we get an already written API to put, get, resize, etc.
Context is very important: if you are constantly inserting new elements, an ArrayList will certainly be faster than rebuilding an array. On the other hand, if you just want to access an element at a known position, say arrayItems[8], an array is faster than ArrayList.get(8), since there is the overhead of the get() call and its other steps and checks.
In C++ I can insert an item into an arbitrary position in a vector, just like the code below:
std::vector<int> vec(10);
vec.insert(vec.begin()+2,2);
vec.insert(vec.begin()+4,3);
In Java I cannot do the same; I get a java.lang.ArrayIndexOutOfBoundsException from the code below:
Vector l5 = new Vector(10);
l5.add(0, 1);
l5.add(1, "Test");
l5.add(3, "test");
Does this mean that C++ is better designed, or is this just a Java design decision?
Why does Java use this approach?
In the C++ code:
std::vector<int> vec(10);
You are creating a vector of size 10. So all indexes from 0 to 9 are valid afterwards.
In the Java code:
Vector l5 = new Vector(10);
You are creating an empty vector with an initial capacity of 10. It means the underlying array is of size 10 but the vector itself has the size 0.
It does not mean one is better designed than the other. The API is just different and this is not a difference that makes one better than the other.
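The capacity versus size distinction can be checked directly, since Vector exposes both capacity() and size(). The sketch below (class name invented) replays the calls from the question and records what happens.

```java
import java.util.Vector;

public class CapacityVersusSize {
    // Returns "<capacity> <size before adds> <size after adds> <third add rejected>".
    static String demo() {
        Vector<Object> v = new Vector<>(10);  // capacity 10, size 0
        int capacity = v.capacity();
        int sizeBefore = v.size();

        v.add(0, 1);        // index 0 equals the size, so this is allowed
        v.add(1, "Test");   // index 1 equals the size, so this is allowed

        boolean rejected;
        try {
            v.add(3, "test");   // index 3 is greater than the size (2)
            rejected = false;
        } catch (ArrayIndexOutOfBoundsException e) {
            rejected = true;
        }
        return capacity + " " + sizeBefore + " " + v.size() + " " + rejected;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```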
Note that in Java it is now preferred to use ArrayList, which has almost the same API, but is not synchronized. If you want to find a bad design decision in Java's Vector, then this synchronization on every operation was probably one.
Therefore the best way to write an equivalent of the C++ initialization code in Java is:
List<Integer> list = new ArrayList<Integer>();
for (int i = 0; i < 10; i++) {
    // Integer has no no-argument constructor; 0 matches the value-initialized ints in C++.
    list.add(0);
}
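As an alternative to the loop, here is a sketch using Collections.nCopies to pre-fill the list, followed by the Java counterparts of the two C++ insert calls from the question. The class name is made up; the library calls are standard java.util.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PrefilledList {
    static List<Integer> demo() {
        // Ten zeros, comparable to std::vector<int> vec(10).
        List<Integer> list = new ArrayList<>(Collections.nCopies(10, 0));
        list.add(2, 2);  // comparable to vec.insert(vec.begin() + 2, 2)
        list.add(4, 3);  // comparable to vec.insert(vec.begin() + 4, 3)
        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the list already has ten elements, both index-based inserts are within range and the list ends up with twelve elements.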
The Javadoc for Vector.add(int, Object) pretty clearly states that an ArrayIndexOutOfBoundsException will be thrown if the index is less than zero or greater than the size. In your code, index 3 is greater than the current size of 2. The Vector type grows as needed, and the constructor you've used sets the initial capacity, which is different from the size. Please read the Javadoc to better understand how to use the Vector type. Also, we Java developers typically use a List type, such as ArrayList, in situations where you would generally use a std::vector in C++.
Differences? You cannot directly compare how two languages do this. Both std::vector and java.util.Vector are backed by a growable array, not by a stack or a linked list (in Java, Stack is in fact a subclass of Vector). Appending at the end is the cheap operation, so in C++ it is better to use the push_back() method.
In C++, std::vector<int> vec(10) creates its ten elements automatically. In Java that is not the case; you have to fill the Vector yourself. I disagree with filling it using l5.add(1, "Test");. Use l5.add("test").
Since you asked differences, you can define your object in this way as well
Vector a = new Vector();
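That is without a type parameter; in Java this is called a raw type (no generics). Raw types have always been allowed, and generics were added in Java 5.
Vector is no longer widely used in Java. Its synchronization on every operation adds overhead. We now use ArrayList, which implements the List interface.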
Edit
variable names such as l5 are widely used in C++. But Java community expects more meaningful variable names :)
I wanted to know if the Java arrays are fixed after declaration. When we do:
int a[10];
and then can we do:
a = new int [100];
I am unsure if the first statement already allocates some memory and the second statement allocates a new chunk of memory and reassigns and overwrites the previous reference.
Yes it is:
The length of an array is established when the array is created. After
creation, its length is fixed.
Taken from here.
Also, in your question the first scenario: int a[10] is syntactically incorrect.
The second statement allocates a new chunk of memory, and the previous array, once nothing references it any more, will eventually be garbage collected.
You can see it for yourself using a Java debugger. You will notice that a points to a different location after the second statement executes.
Good luck with your H.W.
Arrays have a fixed length that is determined when you create them. If you want a data structure with variable length, take a look at the ArrayList or LinkedList classes.
An array has a fixed length, but suppose you want the array size to be increased after this:
private Object[] myStore=new Object[10];
Normally you have to create another array with a different size and insert all the elements again by looping through the first array, but the Arrays class provides a built-in method which might be useful:
myStore = Arrays.copyOf(myStore, myStore.length*2);
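Note that Arrays.copyOf does not resize the existing array; it returns a new one and the variable is reassigned. A small sketch (class name invented) that keeps a reference to the original array shows this:

```java
import java.util.Arrays;

public class GrowStore {
    static boolean[] demo() {
        Object[] myStore = new Object[10];
        myStore[0] = "first";
        Object[] old = myStore;                               // keep a reference to the original array
        myStore = Arrays.copyOf(myStore, myStore.length * 2); // a new array of length 20

        return new boolean[] {
            old.length == 10,        // the original array did not change length
            myStore.length == 20,    // the variable now refers to a longer array
            myStore != old,          // and it is a different object
            myStore[0] == old[0],    // element references were copied over
            myStore[15] == null      // the extra slots are padded with null
        };
    }
}
```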