Stingy Linked Structure and nodes? - java

I am fairly new to programming and am unfamiliar with certain terminology and references within Java. Although I believe I have used Google effectively, I find that asking you guys to dumb it down for me will help me learn more efficiently.
My question is, what specifically are nodes? And what are they used for? Additionally, what are Stingy Linked Structures used for?

A linked structure is a data structure that consists of a bunch of smaller elements (called cells or nodes) that are linked together to form a larger structure. This is similar to how molecules are formed - you have a bunch of smaller atoms that are then connected together to form a molecule. Many important data structures, such as linked lists or binary search trees - are linked structures.
Linked structures are usually contrasted with array-based structures. Arrays have a fixed size and are "rigid" - you can't efficiently break them into smaller pieces - so typically growing or shrinking an array-based structure takes time. Linked structures, being made of smaller pieces, can easily be divided up into smaller pieces or built up out of new pieces. For example, to append an element to an array, you may have to allocate a giant new array, copy over all the old elements, then append the new element. With a linked list or linked structure, you could just add another piece onto the end, which can be a lot more efficient. Similarly, if you have a sorted array and need to insert an element, you may have to shuffle down all of the other elements in the array, since there is no way to "splice" something into the array. If the sorted sequence is stored in a binary search tree, the new element can be added in the proper place without moving any other elements around, which makes insertions more efficient.
I don't believe there is anything called a "stingy linked list." I think you mean singly-linked list, which is a linked list in which each cell (piece) stores only one link, usually to the next element in the sequence. This makes it easy to scan forward in the list from one element to the next, but makes it difficult to back up one position in the list.
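To make that concrete, here is a minimal sketch of a singly-linked list cell in Java. The class and field names (Node, value, next) are just illustrative; this is not a standard library class:

    // A minimal singly-linked list: each node stores one value and one link
    // to the next node. The last node's "next" link is null.
    public class Node {
        int value;     // the data stored in this cell
        Node next;     // link to the next cell, or null at the end of the list

        Node(int value) {
            this.value = value;
        }

        public static void main(String[] args) {
            // Build the list 1 -> 2 -> 3 by hand.
            Node head = new Node(1);
            head.next = new Node(2);
            head.next.next = new Node(3);

            // Walk forward from the head by following the links.
            for (Node current = head; current != null; current = current.next) {
                System.out.println(current.value);
            }
        }
    }

Notice that you can only walk forward; to step backwards you would need a second link per node, which is what a doubly-linked list adds.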
Quite honestly, there is no simple way to enumerate all the cases where you'd want to use a linked structure because so many structures are linked structures. I would suggest picking up a book on fundamental data types (lists, stacks, queues, trees, etc.) to learn more about this. I just finished teaching a quarter-long programming class dedicated to this topic, and I doubt it's possible to condense into a single SO answer. :-)
Hope this helps!

Related

Unknown Data Structure?

I had a question about whether or not Java has its own data structure for what I am looking for. It is something like a combination of an array, linked list and a tree.
If it is not in Java, but exists already as a concept in computer science/other languages, that is also an acceptable answer so I can research it more and find out how to implement it myself.
Here is a picture to better illustrate what I am looking for. Excuse the lack of professionalism; I made it as best as I could:
I am looking for something that starts with several indexed starting elements, that eventually link to other elements and end in a convergence of sorts (one final element). In the end, each index has its corresponding starting element, which is linked all the way to the final converged element.
It should be the case that asking for unknownStructure[i] or something should grab an object that is a representation of the ith starting element linked all the way to the final converged element. (This thing to be grabbed is outlined in various bright colors in the picture).
It seems to me that you are looking for a directed graph data structure.
If needed, you can use a list of graphs.
See this page for algorithms and this for implementation.
There is no "name" for this that I know of, but an array of linked list nodes would work quite well for this.
Traditionally, linked lists are separate structures, simply a row of items each pointing to the next. However, there is no reason why certain linked list nodes cannot point to the same child node. After all, trees and linked lists are essentially created the same way in Java.
The only foreseeable problem would be if you want to traverse this "tree" back to the starting node in the array. (Which could still be achieved with multiple parent support.)
To implement your linked-list array, simply create a Node class as you would for a linked list, and then create an array of these nodes:
    Node[] myTreeArray = new Node[size];   // size = how many starting elements you have
Then simply fill this array with your "base" nodes and link them to their appropriate children (eventually leading to the "end" node, which has a child of null).
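As a rough sketch of that idea (the class and field names here are made up for illustration), the array holds the indexed starting nodes and every chain eventually reaches one shared end node:

    // Hypothetical sketch: an array of starting nodes whose chains all
    // converge on a single shared "end" node (whose child is null).
    class Node {
        String label;
        Node child;   // the next node in the chain; null only for the end node

        Node(String label, Node child) {
            this.label = label;
            this.child = child;
        }
    }

    public class ConvergingChains {
        public static void main(String[] args) {
            Node end = new Node("end", null);        // the single final element
            Node shared = new Node("shared", end);   // several chains pass through here

            Node[] starts = new Node[3];             // the indexed starting elements
            starts[0] = new Node("a", shared);
            starts[1] = new Node("b", new Node("b2", shared));
            starts[2] = new Node("c", end);

            // starts[i] is the ith starting element; following the child links
            // walks that chain all the way down to the converged end node.
            for (Node n = starts[1]; n != null; n = n.child) {
                System.out.print(n.label + " ");
            }
            System.out.println();                    // prints: b b2 shared end
        }
    }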

Performance difference between arrays, stacks and queues

What is the search performance of arrays, stacks and queues?
I think that arrays are the quickest and most straightforward, because I can access any element immediately by calling it using its index. Is this correct? What about the performance of stacks and queues? How do they compare?
Arrays (and collections based on arrays, e.g. ArrayList) are bad for search performance, because in the worst case you end up comparing every item (O(n)).
But if you don't mind modifying the order of your elements, you could sort the array (Arrays.sort(yourArray)) and then use Arrays.binarySearch(yourArray, element) on it, which gives an O(log n) performance profile (much better than O(n)).
Stacks are in O(n), so nope.
Queues are not even meant to be iterated on, so looking for an object in here would mean consuming the whole queue, which is 1. not performant (O(n)) and 2. probably not what you want.
So among the data structures you suggested, I would go for a sorted array.
Now, if you don't mind considering other data structures, you really should take a look at those that use hash functions (HashSet, HashMap...).
Hash functions are really good at searching for elements, with a performance profile of O(1) (given a good hashCode() method in your objects).
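Here is a small sketch of the two approaches described above, using only standard java.util classes (the sample values are arbitrary):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public class SearchDemo {
        public static void main(String[] args) {
            int[] numbers = {42, 7, 19, 3, 88};

            // Sorted-array approach: sort once, then search in O(log n).
            Arrays.sort(numbers);                         // {3, 7, 19, 42, 88}
            int index = Arrays.binarySearch(numbers, 19);
            System.out.println("found 19 at index " + index);

            // Hash-based approach: contains() is O(1) on average.
            Set<Integer> set = new HashSet<>(Arrays.asList(42, 7, 19, 3, 88));
            System.out.println("set contains 19? " + set.contains(19));
        }
    }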
I'll try to answer in a very simple way.
Stacks and queues are for storing data temporarily so that you can process the contents one by one. Just like a queue for buying movie tickets, or a stack of pancakes, you process one element at a time.
Arrays are for storing data as well as for accessing elements from the beginning, the end or in between. For searching, arrays would be a better choice.
Can you search elements inside stacks and queues? Possibly. But that's not what they are used for.
It depends on how your search (or which search algorithm) is implemented. A stack or queue may also be helpful for some searching applications, like BFS or DFS. But in the normal case, when you are using a linear search, you should consider an array or ArrayList.
In Java you have ArrayList (built on an array), Stack (built on an array), and ArrayBlockingQueue and ArrayDeque (also built on arrays). As they all use the same underlying data structure, their access speeds are basically the same.
For a brute-force search, the time to scan or iterate over them (all of them support iteration) is O(n). By the way, even a HashMap uses an array to store its entries, which is why iterating over its elements to find a value (e.g. containsValue) is O(n) as well.
While you could have a sorted array which would more naturally sit in an ArrayList, you could equally argue that a PriorityQueue will find and remove the next element the most efficiently. A Stack is ideal for finding the most recently added element.
To answer the question you have to determine what assumptions the person asking it is making. Without any further assumptions you would have to say they could all be utilised. In that case I would use an ArrayList, as it is the simplest to understand, IMHO.
Stack, queue and array are all different and efficient data structures, but as a bioinformatician, if you want to store biological data you should choose a stack, because the last-in, first-out behaviour of recursive procedures shows that it is the most suitable data structure. Recursion is essentially a characteristic of a stack: at each procedure call a value can easily be pushed onto it, and it can be retrieved when you exit the procedure. So it is actually an easy method to use.
You cannot really compare one data structure with another; each one has its own advantages and disadvantages.
Arrays are good, but you cannot use them all the time because of their fixed size. They are used when you need O(1) access to elements by index.
But when you only want to insert and access data from the ends, and don't need to search or delete in the middle, you go for a stack or a queue. Stacks and queues differ only in how elements are accessed: in a stack you access data from the same end you inserted it (LIFO), which takes O(1) time; in a queue you access data from the other end (FIFO). Accessing elements in between can be done, but that's not what they are used for.
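To illustrate the difference in access patterns, here is a small sketch using Java's ArrayDeque, which can be used either way; only the end you take elements from changes:

    import java.util.ArrayDeque;

    public class StackVsQueue {
        public static void main(String[] args) {
            // Used as a stack: push and pop work on the same end (LIFO).
            ArrayDeque<String> stack = new ArrayDeque<>();
            stack.push("a");
            stack.push("b");
            stack.push("c");
            System.out.println(stack.pop());   // "c" - last in, first out

            // Used as a queue: offer at the tail, poll from the head (FIFO).
            ArrayDeque<String> queue = new ArrayDeque<>();
            queue.offer("a");
            queue.offer("b");
            queue.offer("c");
            System.out.println(queue.poll());  // "a" - first in, first out
        }
    }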

Speeding up a linked list?

I'm a student and fairly new to Java. I was looking over the different speeds achieved by two collections in Java, LinkedList and ArrayList. I know that an ArrayList is much, much faster at looking up and placing values into its indexes. My question is:
how can one make a linked list faster, if at all possible?
Thanks for any help.
zmahir
When talking about speed, perhaps you mean complexity. Retrieving an element by index from an ArrayList (or an array) is O(1), while for a LinkedList it is O(n). And this cannot be changed - it follows from how the structures are defined.
O(n) means that in order to reach an object at a given position, whether to retrieve it or to insert there, you must traverse, in the worst case, all n items in the list. Hence n operations. For ArrayList this is only one operation.
You probably can't. You don't know the size (well, ok, you can), nor the location of each element. To find element 100 in a linked list, you need to start with item 1, follow its link to item 2, and so on until you find 100. This makes inserting into this list a tedious job.
There are many alternatives depending on your exact goals. You can use b-trees or similar methods to split the large linked list into smaller ones. Or use hash-based structures if you want to quickly find items. Or use simple arrays. But if you want a list that performs like an ArrayList, why not use an ArrayList?
You can split off regions which are linked to the main linked list; this gives you entry points directly inside the list, so you don't have to walk up to them. See the subList method here: http://download.oracle.com/javase/1.4.2/docs/api/java/util/AbstractList.html. This is useful if you have a number of 'sentences' made out of words, say. You can use a separate linked list to iterate over the sentences, which are sublists of the main linked list.
You can also use a ListIterator when adding, removing, or accessing elements. This helps greatly with increasing the speed of sequential access. See the listIterator method for this, and the class: http://download.oracle.com/javase/1.4.2/docs/api/java/util/ListIterator.html.
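For example, here is a small sketch of modifying a LinkedList through its ListIterator while walking it once, instead of restarting an O(n) traversal from the head for every change (the word list is just sample data):

    import java.util.Arrays;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.ListIterator;

    public class ListIteratorDemo {
        public static void main(String[] args) {
            List<String> words = new LinkedList<>(Arrays.asList("the", "quick", "fox"));

            // Walk the list once; insert and replace relative to the cursor.
            ListIterator<String> it = words.listIterator();
            while (it.hasNext()) {
                String word = it.next();
                if (word.equals("the")) {
                    it.set("a");          // replace the element just returned
                } else if (word.equals("quick")) {
                    it.add("brown");      // insert right after the current position
                }
            }
            System.out.println(words);     // [a, quick, brown, fox]
        }
    }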
Speed of a linked list could be improved by using skip lists: http://igoro.com/archive/skip-lists-are-fascinating/
A linked list uses pointers to walk through its items. For example, if you ask for the 5th item, the runtime starts from the first item and follows each pointer until it reaches the 5th item.
There is really not much you can do about it. A linked list may not be a good choice if you need fast access to items. There are some variations, such as a circular linked list or a doubly linked list, where you can walk back and forth through the list, but whether they help really depends on the business logic and the application requirements.
My advice is to avoid linked lists if they do not match your needs; changing to a different data structure might be the best approach.
As a general rule, data structures are designed to do certain things well. LinkedLists are designed to be faster than ArrayLists at inserting elements and removing elements and about the same as ArrayLists at iterating across the list in order. When you change the way a LinkedList works, you make it no longer a true LinkedList, so there's not really any way to modify them to be faster at something and still be a LinkedList.
You'll need to examine the way you're using this particular collection and decide whether a LinkedList is really the best data structure for your purposes. If you share with us how you're using it, and why you need it to be faster, then we can advise you on which data structure you ought to consider using.
Lots of people smarter than you or I have looked at the implementation of the Java collection classes. If there were an optimization to be made, they would have found it and already made it.
Since the collection classes are pretty much as optimized as they can be, our primary task should be to choose the correct one.
When choosing your collection type, don't forget about things like HashSet. If order doesn't matter, and you don't need to put duplicates in the collection, then HashSet may be appropriate.
I'm a student and fairly new to Java. ... how can one make a linked list faster, if at all possible?
The standard Java collection types (indeed, all data structures implemented in any language!) represent compromises on various "measures", such as:
The amount of memory needed to represent the data structure.
The time taken to perform various operations; e.g. for a "list" the operations of interest are insertion, removal, indexing, contains, iteration and so on.
How easy or hard it is to integrate / reuse the collection type; see below.
So for instance:
ArrayList offers lower memory overheads, fast indexing (O(1)), but slow contains, random insertion and removal (O(N)).
LinkedList has higher memory overheads, slow indexing and contains (O(N)), but faster removal (O(1)) under certain circumstances.
The various performance measures are typically determined by the maths of the data structure. For example, if you have a chain of nodes, the only way to get the ith node is to step through them from the beginning. This involves following i pointers.
Sometimes you can modify a data structure to improve one aspect of its performance. But this typically comes at the cost of some other aspect. (For example, you could add a separate index to make indexing of a linked list faster. But the cost of maintaining the index on insertion / deletion would mean that you'd probably be better off using an ArrayList.)
In some cases the integration / reuse requirements have significant impact on performance.
For example, it is theoretically possible to optimize a linked list's space usage by adding a next field to the list element type, combining the element and node objects and saving 16 or so bytes per list entry. However, this would make the list type less general (the member/element class would need to implement a specific interface), and has the restriction that an element can belong to at most one list at any time. These restrictions are so limiting that this approach is rarely used in Java.
For a second example, consider the problem of inserting at a given position in a linked list. For the LinkedList class, this is normally an O(N) operation, because you have to step through the list to find the position. In theory, if an application could find and remember a position, it should be able to perform the insertion at that position in O(1). Unfortunately, the List API provides no way to "remember" a position.
While neither of these examples is a fundamental roadblock to a developer "doing his own thing", they illustrate that using general data structure APIs and general implementations of those APIs has performance implications, and therefore represents a trade-off between performance and ease-of-use.
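To make the first example above concrete, here is a very rough sketch of an "intrusive" list, where the element type itself carries the next link; the interface and class names are invented for illustration, and this is not how java.util.LinkedList works:

    // Sketch of the space optimisation described above: the element provides
    // its own "next" link, so no separate node objects are allocated. The cost
    // is that the element class must implement a specific interface and can
    // belong to at most one such list at a time.
    interface Linkable<T extends Linkable<T>> {
        T getNext();
        void setNext(T next);
    }

    class Job implements Linkable<Job> {
        final String name;
        private Job next;

        Job(String name) { this.name = name; }
        public Job getNext() { return next; }
        public void setNext(Job next) { this.next = next; }
    }

    public class IntrusiveListDemo {
        public static void main(String[] args) {
            Job head = new Job("compile");
            head.setNext(new Job("test"));
            head.getNext().setNext(new Job("deploy"));

            for (Job j = head; j != null; j = j.getNext()) {
                System.out.println(j.name);
            }
        }
    }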
I'm a bit surprised by the answers here. There are big differences between the theoretical performance of LinkedLists and ArrayLists and the actual performance of the Java implementations.
What makes the Java LinkedList slower than a theoretical linked list is that it does a lot more than just the core operations. For example, it checks for concurrent modifications and performs other safety checks.
If you know your use case, you can write your own simple implementation of a linked list, and it will be much faster.

Data structures: Which should I use for these conditions?

This shouldn't be a difficult question, but I'd just like someone to bounce it off of before I continue. I simply need to decide what data structure to use based on these expected activities:
Will need to frequently iterate through in sorted order (starting at the head).
Will need to remove/restore arbitrary elements from the/a sorted view.
Later I'll be frequently resorting the data and working with multiple sorted views.
Also later I'll be frequently changing the position of elements within their sorted views.
This is in Java, by the way.
My best guess is that I'll either be rolling some custom Linked Hash Set (to arrange the links in sorted order) or possibly just using a Tree Set. But I'm still not completely sure yet. Recommendations?
Edit: I guess because of the arbitrary remove/restore, I should probably stick with a Tree Set, right?
Actually, not necessarily. Hmmm...
In theory, I'd say the right data structure is a multiway tree - preferably something like a B+ tree. Traditionally this is a disk-based data structure, but modern main memory has a lot of similar characteristics due to layers of cache and virtual memory.
In-order iteration of a B+ tree is very efficient because (1) you only iterate through the linked-list of leaf nodes - branch nodes aren't needed, and (2) you get extremely good locality.
Finding, removing and inserting arbitrary elements is log(n) as with any balanced tree, though with different constant factors.
Resorting within the tree is mostly a matter of choosing an algorithm that gives good performance when operating on a linked list of blocks (the leaf nodes), minimising the need to touch the branch nodes - variants of quicksort or mergesort seem like likely candidates. Once the items are sorted in the leaf nodes, just propagate the summary information back up through the branch nodes.
BUT - pragmatically, this is only something you'd do if you're very sure that you need it. Odds are good that you're better off using some standard container. Algorithm/data structure optimisation is the best kind of optimisation, but it can still be premature.
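If you do go the standard-container route, here is a quick sketch of how a TreeSet covers the iterate-in-sorted-order and remove/restore requirements; the element type and ordering are just placeholders:

    import java.util.Comparator;
    import java.util.NavigableSet;
    import java.util.TreeSet;

    public class SortedViewDemo {
        public static void main(String[] args) {
            // One "sorted view": ordering is defined by the comparator, and
            // iteration always proceeds from the head (smallest) in order.
            Comparator<String> byLength = Comparator.comparingInt(String::length);
            NavigableSet<String> view =
                    new TreeSet<>(byLength.thenComparing(Comparator.naturalOrder()));
            view.add("pear");
            view.add("fig");
            view.add("banana");
            System.out.println(view);      // [fig, pear, banana] - shortest first

            // Remove and later restore an arbitrary element, each in O(log n);
            // the element drops back into its sorted position automatically.
            view.remove("pear");
            System.out.println(view);      // [fig, banana]
            view.add("pear");
            System.out.println(view);      // [fig, pear, banana]
        }
    }

For the "multiple sorted views" requirement you could keep several TreeSets with different comparators over the same elements, at the cost of extra memory and keeping them in sync.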
Use the standard LinkedHashSet, or LinkedMultiset from Google Collections if you want your data structure to store non-unique values.

If Linked List and Array are fundamental data structures what type of data structure are tree, hash table, heap etc?

I was going through an online data structures class and it was mentioned that the linked list and the array are fundamental data structures. So my question is: are hash tables, heaps, trees and graphs not fundamental data structures, and if not, are they derived from other data structures?
Thanks.
List and array could be considered fundamental because almost every other data structure is composed of pieces of these two original data structures.
Graphs for instance, can be array backed or list backed (usually for sparse graphs).
But AFAIK, as with many things in computer science, there is no formal mathematical definition of what a "fundamental data structure" might mean.
You could consider arrays and linked lists fundamental in that there's essentially a single way to implement them (a sequence of contiguous objects for an array, a linear chain (singly or doubly linked) of objects for a linked list).
The more advanced data structures could be considered "derivative" in that they can be implemented multiple ways, and essentially come back to arrays and linked lists at the lowest level. Examples:
---An n-ary tree is generally implemented as a series of linked nodes (like a list) but with each node containing an array of child links. For a binary tree, you don't usually see the array overtly because the children are usually given the names left and right.
---Hash tables can be implemented in multiple ways. For a chained hash table, it's implemented as a (sparse) array of linked lists (see the sketch after this list). For a probed or open-addressed hash table, it's just a big array with collision logic.
---Sets are usually trees or hash tables, and thus transitively defined in terms of arrays and lists.
---Stacks, queues, vectors, etc. are just arrays with special semantics imposed.
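To make the "derived from arrays and lists" point concrete, here is a stripped-down sketch of a chained hash table: a plain array whose slots hold tiny singly-linked lists. This is purely illustrative (no resizing, no null handling) and is not how java.util.HashMap is actually written:

    // Minimal chained hash table: an array of buckets, each bucket a
    // singly-linked list of entries. String keys and values only.
    public class TinyHashTable {
        private static class Entry {
            final String key;
            String value;
            Entry next;                              // the "linked list" part

            Entry(String key, String value, Entry next) {
                this.key = key;
                this.value = value;
                this.next = next;
            }
        }

        private final Entry[] buckets = new Entry[16];   // the "array" part

        private int indexFor(String key) {
            return Math.abs(key.hashCode() % buckets.length);
        }

        public void put(String key, String value) {
            int i = indexFor(key);
            for (Entry e = buckets[i]; e != null; e = e.next) {
                if (e.key.equals(key)) {             // key already present: overwrite
                    e.value = value;
                    return;
                }
            }
            buckets[i] = new Entry(key, value, buckets[i]);   // prepend to the chain
        }

        public String get(String key) {
            for (Entry e = buckets[indexFor(key)]; e != null; e = e.next) {
                if (e.key.equals(key)) {
                    return e.value;
                }
            }
            return null;
        }

        public static void main(String[] args) {
            TinyHashTable table = new TinyHashTable();
            table.put("tree", "linked nodes");
            table.put("stack", "array with LIFO rules");
            System.out.println(table.get("tree"));   // linked nodes
        }
    }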
I agree with the other posters that CS doesn't really have a "textbook" definition of fundamental data structures, but you can easily see it to be "de facto" true.
Interesting question, by the way...
I'm not aware of an accepted definition in computer science of "fundamental data structure", or even that it is a conventionally used term at all.
But it's an interesting question, as I can think of useful definitions. After all, the data structures we build in an HLL are built out of something. (Simply reducing all data structures to "is the language and platform Turing-complete" seems silly.)
So I can think of two reasonable uses for the term:
It could describe the data structures with which all those others are built, or could be built, or "will be built in this class", or some other sort of axiomatic starting point.
It could describe the data structures that are built into a given language.
In Lisp (well Scheme anyway), the only fundamental data structure is a pair. Everything else is constructed from this type.
You'll have to specify your language to find out what the fundamental data structures for that language are.
Edit: Ok, Java and C++. I'd imagine that all C++ library containers like Vector and Queue would be based on the humble array, but in Java more types may be built-in. I guess it depends on how you define "fundamental".
I'd disagree. The fundamental data type is the array of words. That's what all modern RAM chips provide. Everything else is an abstraction created by programs. Linked list? That uses one word per node for a "next pointer", possibly one word for a "previous pointer", and then some words for the node data. Hashtables are another abstraction on top of that array of words.
Even arrays of bytes are an abstraction; they are implemented by sub-addressing each byte. This abstraction is often provided by the CPU, though.
