I know what arrays are and how to use them. However, I don't know how they are implemented. I was trying to figure out if I can try to implement an array-like data structure using Java but I couldn't.
I've searched online but didn't find anything useful.
Is it even possible to implement an array-like data structure in Java? Is it possible in other languages? If so, how (without using arrays, of course)?
EDIT: what I want to know is how to implement an array data structure without using arrays?
Arrays are contiguous sections of memory, so to create an array you need to reserve a chunk of memory of size n * sizeof(type), where n is the number of items you would like to store and sizeof(type) is the size, in bytes, that the JVM needs to represent that type.
You would then store a reference (pointer) to the first location of your memory segment, say 0x00, and then you use that as a base to know how much you need to move to access the elements, so a[n] would be equal to doing 0x00 + (n * sizeof(type)).
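The offset arithmetic above can be imitated in Java without raw pointers. What follows is only a sketch, assuming java.nio.ByteBuffer stands in for the reserved chunk of memory; the class name IntBlock and its methods are made up for illustration and are not how the JVM implements arrays.

```java
import java.nio.ByteBuffer;

// Hypothetical IntBlock: an int "array" on top of one contiguous block of bytes.
final class IntBlock {
    private final ByteBuffer memory; // the reserved chunk: n * sizeof(int) bytes
    private final int length;

    IntBlock(int length) {
        this.length = length;
        this.memory = ByteBuffer.allocateDirect(length * Integer.BYTES);
    }

    int get(int index) {
        checkIndex(index);
        // Byte offset = index * sizeof(int), measured from the start of the block.
        return memory.getInt(index * Integer.BYTES);
    }

    void set(int index, int value) {
        checkIndex(index);
        memory.putInt(index * Integer.BYTES, value);
    }

    int length() {
        return length;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    public static void main(String[] args) {
        IntBlock a = new IntBlock(4);
        a.set(2, 42);
        System.out.println(a.get(2)); // prints 42
    }
}
```

Reading or writing any element costs one multiplication and one memory access, regardless of how long the block is.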
The problem with trying to implement this in Java is that Java does not allow pointer manipulation, so I do not think that building your own array type would be possible since you cannot go down to that level.
That being said, you should be able to create a linked data structure, where the nth element points to the (n + 1)th element.
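A linked structure of this kind might look roughly like the sketch below. The names LinkedArray, getElementAt and setElementAt are hypothetical; note that indexing walks the links, so it takes time proportional to the index rather than constant time as with a real array.

```java
// Hypothetical LinkedArray: fixed length, addressable by index, built only from linked nodes.
final class LinkedArray<T> {
    private static final class Node<E> {
        E value;
        Node<E> next; // the nth node points to the (n + 1)th node
    }

    private final Node<T> head;
    private final int length;

    LinkedArray(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        this.length = length;
        this.head = new Node<>();
        Node<T> current = head;
        for (int i = 1; i < length; i++) {
            current.next = new Node<>();
            current = current.next;
        }
    }

    T getElementAt(int index) {
        return nodeAt(index).value;
    }

    void setElementAt(int index, T value) {
        nodeAt(index).value = value;
    }

    int length() {
        return length;
    }

    // Follows `index` links starting from the head node.
    private Node<T> nodeAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
        Node<T> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current;
    }
}
```

Usage would be along the lines of `LinkedArray<String> a = new LinkedArray<>(3); a.setElementAt(0, "x");`, since Java has no way to give this class the `a[0]` syntax.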
There are other reasons why you might try another language, such as C# (check unsafe operations), C++ or C:
To my knowledge, Java does not have a sizeof function (see this).
Java does not allow operator overloading. So you cannot define your own indexing operators such as [index]. You would probably need to do something like array.getElementAt(0) to get the first element.
As @ug_ recommended, you could take a look at the Unsafe class. But, as he also pointed out, I do not think that you should do pointer arithmetic in a language which has pointer abstraction as one of its core ideas.
If what you want is something like this:
MyArray ma = new MyArray(length);
ma[0] = value;
Then you can't do this in Java but you can in other languages. Look for "operator overloading".
I'm wondering if you're thinking of structs, vectors or linked lists. These are all similar to arrays but are different.
Java does not really have structs, but you can implement something equivalent.
Read up on Structs here:
www.cplusplus.com/doc/tutorial/structures/
An example of struct-like data in Java:
Creating struct like data structure in Java
I think what you are really looking for, though, are vectors. They are very similar to an array, but they're not one.
Vectors info:
www.cplusplus.com/reference/vector/vector/
Array compared to vector:
https://softwareengineering.stackexchange.com/questions/207308/java-why-do-we-call-an-array-a-vector
I recommend a linked list. It's much the same idea as an array, but you don't need to know the exact size in advance. It is easier to implement.
Linked lists:
en.wikipedia.org/wiki/Linked_list
All of these come down to what you need them for. Saying "what I want to know is how to implement an array data structure without using arrays?" is rather open-ended.
In an interview, the interviewer asked me how ArrayList is so fast. I said that it implements RandomAccess, but he then asked how random access helps with locating an object in memory.
Do you mean that the objects are stored in a row in memory, so it can go straight to the 10th index, for example?
An array is just the starting point of a chunk of memory along with a data type (int, boolean, String, etc.). The data type is used to determine how far apart the elements are spaced.
A Java ArrayList is similar to an array, but with additional features.
When using an array (or any array-related data structure), individual read/write operations are fast and completely unrelated to the total size of the array. If you want the one millionth array element, it's a single calculation to determine where that element is (one million * <size of each element>) – no scanning or searching involved.
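As a small illustration, the RandomAccess marker interface mentioned in the question can be checked at runtime. This is only a sketch; the class and method names are made up, while the interface itself is the standard java.util.RandomAccess.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class RandomAccessDemo {
    // True when the list advertises fast (generally constant-time) indexed access.
    static boolean supportsFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(supportsFastIndexing(new ArrayList<String>()));  // prints true
        System.out.println(supportsFastIndexing(new LinkedList<String>())); // prints false
    }
}
```

ArrayList carries the marker because it is backed by an array; LinkedList does not, because get(index) has to follow links.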
There are lots of great resources online where you can read more about arrays. It's worth building a solid understanding of arrays, not just for job interview purposes but also for general understanding of how computers work.
Because ArrayList is a resizable array implementation of the List interface.
Please refer below link for understanding it well:
ArrayList Internal Working
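To give a rough idea of what "resizable array" means, here is a minimal sketch. It is not the real ArrayList source: the class name GrowableIntList, the initial capacity of 4 and the doubling policy are all assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical GrowableIntList: a backing array that is replaced by a larger one when full.
final class GrowableIntList {
    private int[] elements = new int[4]; // assumed initial capacity
    private int size = 0;

    void add(int value) {
        if (size == elements.length) {
            // Full: copy everything into an array twice as large (assumed growth policy).
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index]; // plain array indexing, no searching
    }

    int size() {
        return size;
    }

    int capacity() {
        return elements.length;
    }
}
```

Reads stay as fast as array reads; the occasional copy on add is the price of not having to know the size up front.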
Java has the concept of data structures. Each data structure has its own advantages and disadvantages. In SAS, all data always goes into a 'data' step?
What java Collection does the 'data' in SAS compare to? Is it an Array, a List? Seems more like a Hashmap?
Is it even fair to draw this comparison?
SAS data can be likened to a table in any typical RDBMS. So a two-dimensional array, i.e. a table structure, would probably be the fairest comparison.
These structures can be operated on by all sorts of procedures (e.g. proc sort, proc sql, etc.) or the data step.
It's definitely not a hashmap as data in SAS does not require a unique key (as implemented by hashmaps).
If you wanted a different data structure such as a graph structure containing nodes, and edges, etc. then SAS does not really provide a mechanism to represent them.
http://www.ats.ucla.edu/stat/sas/library/SASRead_os.htm says
You can think of a data set as a two-dimensional table
which IMO means that nothing directly comparable comes with standard Java.
You can build similar (functionally equivalent) data structures with lists of lists or other collection of collection compositions.
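For instance, a table can be approximated as a list of rows, where each row is itself a list of cell values. This is just a sketch with made-up names and sample data, not a faithful model of a SAS data set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TableSketch {
    // Builds a tiny two-column table: each inner list is one row (observation).
    static List<List<Object>> sampleTable() {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(Arrays.<Object>asList("Alice", 34));
        rows.add(Arrays.<Object>asList("Bob", 27));
        return rows;
    }

    // Looks up one cell by row and column position.
    static Object cell(List<List<Object>> table, int row, int column) {
        return table.get(row).get(column);
    }

    public static void main(String[] args) {
        System.out.println(cell(sampleTable(), 1, 0)); // prints Bob
    }
}
```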
I don't think there's a direct comparison to a Java collection type, but articles I've seen loosely correlate it to an array.
https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a003252712.htm
Only one-dimensional array parameters are supported. However, it is
possible to pass multidimensional array arguments by taking advantage
of the fact that the arrays are passed in row-major order. You must
handle the dimensional indexing manually in the Java code--that is,
you must declare a one-dimensional array parameter and index to the
subarrays accordingly.
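The manual indexing the quote refers to might look like the following on the Java side. This is a sketch under the assumption of a simple rows-by-columns matrix; the helper names are hypothetical.

```java
final class RowMajor {
    // In row-major order, the rows are laid out one after another in the flat array.
    static int flatIndex(int row, int column, int columns) {
        return row * columns + column;
    }

    static double get(double[] flat, int columns, int row, int column) {
        return flat[flatIndex(row, column, columns)];
    }

    public static void main(String[] args) {
        // A 2 x 3 matrix {{1, 2, 3}, {4, 5, 6}} passed as a one-dimensional array.
        double[] flat = {1, 2, 3, 4, 5, 6};
        System.out.println(get(flat, 3, 1, 2)); // prints 6.0
    }
}
```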
I know Java and also recently started learning Python. At one point I understood that I need to take a pause and clarify all questions related to Data Structures, especially Lists, Arrays and Tuples. Could you please correct me if I am wrong in any of the following:
Originally, according to Data Structures standards, Lists do not
support any kind of indexation. The only way to get access to the
element is through iterations (next method).
In Java there is actually a way to get access to elements by index (i.e. get(index) method), but even if you use these index-related methods it is still iterating from the first element (or more specifically its reference)
There is a way in Python to access to Lists elements as we work with arrays in Java, using list[index] syntax, but in reality, even though this data type is called "lists", we do have an array of references in the background and when we refer to the third element, for example, we are referring directly to the 3 element in array to get reference without iteration from the first one (I am pretty sure that I am wrong over here)
Tuples are implemented in the same way as Lists in Python. The only difference is that they are immutable. But it is still something closer to lists than arrays, because elements are not located contiguously in memory.
There are no arrays as such in Python
In Data Structure theory, when we are creating an array, it uses only a reference to the first cell of memory, and then iterates to the # of element that we specified as index. The main difference between Lists and Arrays is that all elements are located contiguously in memory, that's why we are winning in performance aspect.
I am pretty sure that I am wrong somewhere. Could you correct me please?
Thanks
Most of that is wrong.
The list abstract data type is an ordered sequence of elements, allowing duplicates. There are many ways to implement this data type, particularly as a linked list, but most programming languages use dynamically resized arrays.
Even linked lists may support indexing. There is no way for the implementation to skip directly to the n'th element, but it can just follow links to get there.
Java's List type does not specify an implementation, only an interface. The ArrayList type is a List implemented with a dynamic array; the LinkedList is exactly what the name says.
Python's lists are implemented with dynamically resized arrays. Python's tuples are implemented with fixed-size arrays.
There are actually two Python types commonly referred to as arrays, not counting the common newbie usage of "array" to refer to Python lists. There are the arrays provided by the array module, and there are NumPy's ndarrays.
When you index an array, the implementation does not iterate from the location of the first element to the n'th. It adds an offset to the address of the array to skip to the element directly, without iterating.
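On the Java side of the question, a short sketch may make the point about indexing concrete: get(index) is part of the List interface, so it works on both implementations, and only the cost differs. The class and method names here are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class ListIndexing {
    // Works for any List; ArrayList jumps to the slot, LinkedList follows links to reach it.
    static String third(List<String> list) {
        return list.get(2);
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");
        System.out.println(third(new ArrayList<String>(source)));  // prints c
        System.out.println(third(new LinkedList<String>(source))); // prints c
    }
}
```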
I am writing a program that will be heavily reliant on ... something ... that stores data like an array where I am able to access any point of the data at any given time as I can in an array.
I know that the java library has an Array class that I could use or I could use a raw array[].
I expect that using the Array type is a bit easier to code, but I expect that it is slightly less efficient as well.
My question is, which is better to use between these two, and is there a better way to accomplish the same result?
Actually Array would be of no help -- it's not what you think it is. The class java.util.ArrayList, on the other hand, is. In general, if you can program with collection classes like ArrayList, do so -- you'll more easily arrive at correct, flexible software that's easier to read, too. And that "if" applies almost all the time; raw arrays are something you use as a last resort or, more often, when a method you want to call requires one as an argument.
The Array class is used for Java reflection and is very, very, rarely used.
If you want to store data in an array, use plain old arrays, indicated with [], or, as Gabe's comment on the question suggests, java.util.ArrayList. ArrayList is, as you suggest, easier to code with (especially when it comes to adding and removing elements) but yes, it is slightly less efficient. For variable-size collections, ArrayList is all but required.
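To make the three options concrete, here is a small sketch that puts them side by side. The class ArrayChoices and its method names are hypothetical; java.lang.reflect.Array, plain arrays and java.util.ArrayList are the standard pieces being compared.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

final class ArrayChoices {
    // Plain array: fixed size, primitive elements, [] syntax.
    static int[] plainArray() {
        int[] values = new int[3];
        values[0] = 7;
        return values;
    }

    // ArrayList: grows on demand, but stores boxed Integer objects.
    static List<Integer> arrayList() {
        List<Integer> values = new ArrayList<>();
        values.add(7);
        return values;
    }

    // java.lang.reflect.Array: creates and accesses arrays whose type is only known at runtime.
    static Object reflectiveArray() {
        Object values = Array.newInstance(int.class, 3);
        Array.setInt(values, 0, 7);
        return values; // this is an ordinary int[] underneath
    }
}
```

The reflective version gives you exactly the same int[] as the first method, only with more ceremony, which is why it is rarely what you want.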
My question is, which is better to use between these two, and is there a better way to accomplish the same result?
It depends on what you are trying to achieve:
If the number of elements in the array is known ahead of time, then an array type is a good fit. If not, a List type is (at least) more convenient to use.
The List interface offers a number of methods such as contains, add, remove and so on that can save you coding ... if you need to do that sort of thing.
If properly used, an array type will use less space. The difference is particularly significant for arrays of primitive types where using a List means that the elements need to be represented using wrapper types (e.g. byte becomes Byte).
The Array class is not useful in this context, and neither is the Arrays class. The choice is between ArrayList (or some other List implementation class) and primitive arrays.
In terms of ease of use, the ArrayList class is a lot easier to code with.
A raw array[] is awkward because you need to know the number of objects beforehand.
Alternatively, you could use a HashMap. It is very efficient for lookups, since everything is done in terms of key values, although it keeps no ordering, so it does not help with sorting or positional access.
You could declare a HashMap as:
HashMap<String, Object> map = new HashMap<String, Object>();
For the Object you can use your class, and for key use the value which needs to be unique.
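A minimal sketch of that usage, with made-up keys and values, might look like this. Note that putting the same key twice replaces the earlier value, which is why the key needs to be unique.

```java
import java.util.HashMap;
import java.util.Map;

final class HashMapSketch {
    static Map<String, Object> sample() {
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("id-1", "first");
        map.put("id-2", "second");
        map.put("id-1", "replaced"); // same key: the earlier value is overwritten
        return map;
    }

    public static void main(String[] args) {
        Map<String, Object> map = sample();
        System.out.println(map.get("id-1"));         // prints replaced
        System.out.println(map.containsKey("id-3")); // prints false
    }
}
```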
Some time back our architect told me that arrays are more serializable and perform better than ArrayLists. I couldn't talk to him further to get the details at the time, and I don't understand how arrays could be more serializable or perform better than ArrayLists.
Update: This is in the web services code if it is important and it can be that he might mean performance instead of serializability.
Update: There is no problem with XML serialization for ArrayLists.
<sample-array-list>reddy1</sample-array-list>
<sample-array-list>reddy2</sample-array-list>
<sample-array-list>reddy3</sample-array-list>
Could there be a problem in a distributed application?
There's no such thing as "more serializable". Either a class is serializable, or it is not. Both arrays and ArrayList are serializable.
As for performance, that's an entirely different topic. Arrays, especially of primitives, use quite a bit less memory than ArrayLists, but the serialization format is actually equally compact for both.
In the end, the only person who can really explain this vague and misleading statement is the person who made it. I suggest you ask your architect what exactly he meant.
I'm assuming that you are talking about Java object serialization.
It turns out that an array (of objects) and ArrayList have similar but not identical contents. In the array case, the serialization will consist of the object header, the array length and its elements. In the ArrayList case, the serialization consists of the list size, the array length and the first 'size' elements of the array. So one extra 32 bit int is serialized. There may also be differences in the respective object headers.
So, yes, there is a small (probably 4 byte) difference in the size of the serial representations. And it is possible that an array can be serialized / deserialized
slightly more quickly. But the differences are likely to be down in the noise, and not worth worrying about ... unless profiling, etc tells you this is a bottleneck.
EDIT
Based on @Tom Hawtin's comment, the object header difference is significant, especially if the serialization only contains a small number of ArrayList instances.
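If you want to measure the difference yourself rather than take my word for it, a sketch along these lines should do. The class and helper names are made up, and I am deliberately not quoting byte counts, since they can vary between Java versions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

final class SerializationSketch {
    // Serializes any Serializable object with standard Java object serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Integer[] array = {42, 83};
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(array));
        System.out.println("array: " + serialize(array).length + " bytes");
        System.out.println("list:  " + serialize(list).length + " bytes");
    }
}
```

Both values survive a round trip through serialize and deserialize; only the sizes and timings are worth comparing.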
Maybe he was referring to the XML serialization used in web services?
Having used those a few years ago, I remember that a Webservice returning a List object was difficult to connect to (at least I could not figure it out, probably because of the inner structure of ArrayLists and LinkedLists), although this was trivially done when a native array was returned.
To address Reddy's comment,
But in any case (array or ArrayList)
will get converted to XML, right?
Yes they will, but XML serialization basically translates into XML all the data contained in the serialized object.
For an array, that is a series of values.
For instance, if you declare and serialize
int[] array = {42, 83};
You will probably get an XML result looking like :
<array>42</array>
<array>83</array>
For an ArrayList, that is :
an array (obviously), which may have a size bigger than the actual number of elements
several other members such as integer indexes (firstIndex and lastIndex), counts, etc
(you can find all that stuff in the source for ArrayList.java)
So all of those will get translated to XML, which makes it more difficult for the Webservice client to read the actual values : it has to read the index values, find the actual array, and read the values contained between the two indexes.
The serialization of :
ArrayList<Integer> list = new ArrayList<Integer>();
list.add(42);
list.add(83);
might end up looking like :
<firstIndex>0</firstIndex>
<lastIndex>2</lastIndex>
<E>42</E>
<E>83</E>
<E>0</E>
<E>0</E>
<E>0</E>
<E>0</E>
<E>0</E>
<E>0</E>
<E>0</E>
<E>0</E>
So basically, when using XML serialization in web services, you are better off using arrays (such as int[]) than collections (such as ArrayList<Integer>). For that you might find it useful to convert collections to arrays using Collection#toArray().
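A sketch of that conversion, with hypothetical helper names: Collection#toArray gives you an Integer[], and going all the way to int[] needs an extra unboxing step (done with a stream here, which assumes Java 8 or later).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class ToArraySketch {
    // Collection#toArray with a typed array argument returns Integer[], not int[].
    static Integer[] toBoxedArray(Collection<Integer> values) {
        return values.toArray(new Integer[0]);
    }

    // Unboxes each element to obtain a primitive int[].
    static int[] toPrimitiveArray(Collection<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<Integer>();
        list.add(42);
        list.add(83);
        System.out.println(toPrimitiveArray(list).length); // prints 2
    }
}
```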
They both serialize the same data. So I wouldn't say one is significantly better than the other.
As far as I know, both are Serializable, but using arrays is better because ArrayList is mainly meant for convenient internal manipulation, not for exposing to the outside world. It is a little heavier to use, so when it is serialized in web services it might create problems with the namespace and headers. If those are set automatically, you may not be able to send or receive data properly. So it is better to use primitive arrays.
Only in Java does this make a difference, and even then it's hard to notice it.
If he didn't mean Java then yes, your best bet would most likely be asking him exactly what he meant by that.
Just a related thought: The List interface is not Serializable so if you want to include a List in a Serializable API you are forced to either expose a Serializable implementation such as ArrayList or convert the List to an array. Good design practices discourage exposing your implementation, which might be why your architect was encouraging you to convert the List to an array. You do pay a little time penalty converting the List to an array, but on the other end you can wrap the array with a list interface with java.util.Arrays.asList(), which is fast.
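Roughly, the round trip described above could look like the sketch below (the class and method names are made up). Arrays.asList does not copy anything: it returns a fixed-size List view backed by the array, which is why it is cheap.

```java
import java.util.Arrays;
import java.util.List;

final class AsListSketch {
    // Sender side: copy the List into an array before it crosses the API boundary.
    static String[] toArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    // Receiver side: wrap the array in a List view without copying.
    static List<String> wrap(String[] array) {
        return Arrays.asList(array);
    }

    public static void main(String[] args) {
        String[] array = toArray(Arrays.asList("a", "b"));
        List<String> view = wrap(array);
        array[0] = "changed";            // writes to the array show through the view
        System.out.println(view.get(0)); // prints changed
    }
}
```

The view cannot grow or shrink, so wrap it in a new ArrayList if the receiving code needs to add or remove elements.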