How can we reduce search time in linked list?

I read this question on Career Cup but didn't find any good answer other than 'SkipList'. The description of SkipList that I found on Wikipedia was interesting, but I didn't understand some terms like 'geometric/binomial distribution'. I read up on them, and they go deep into probability theory. I simply wanted to implement a way to make searching quicker. So here's what I did:
1. Created indexes.
- I wrote a function to create, say, 1000 nodes. Then I created an array of linked list nodes, looped through the 1000 nodes, picked every 23rd element (an arbitrary number that came to mind), and added it to the array, which I call 'index'.
SLL[] index = new SLL[50];
Now the function to create the index:
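private static void createIndex(SLL[] index, SLL head) {
    int count = 0;
    int i = 0; // next free slot in the index array
    SLL temp = head;
    // Stop at the end of the list or when the index array is full.
    while (temp != null && i < index.length) {
        count++;
        temp = temp.next;
        // Record every 23rd node, unless we just stepped past the last node.
        if (count == 23 && temp != null) {
            index[i] = temp;
            i++;
            count = 0;
        }
    }
}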
Now, finally, the 'find' function. It first takes the input element, say 769. It goes through the 'index' array to find the first entry with index[i] > 769 (this assumes the list is sorted), then passes head = index[i-1] and tail = index[i] to the 'find' function, which searches the short range of 23 elements between them for 769. By my calculation this takes a total of 43 jumps (counting both the array jumps and the node = node.next jumps) to find the element, which would otherwise have taken 769 jumps.
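The idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the list is sorted in ascending order, and the names Node, SparseIndexSearch, STRIDE, buildSortedList, buildIndex and find are hypothetical, not taken from the question's code. Unlike the question's version, this sketch also samples the head node so that values before the first sample are reachable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch, not the asker's actual implementation.
final class SparseIndexSearch {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static final int STRIDE = 23; // sample every 23rd node, as in the question

    // Builds a sorted singly linked list holding 1..n and returns its head.
    static Node buildSortedList(int n) {
        Node head = null;
        for (int v = n; v >= 1; v--) {
            Node node = new Node(v);
            node.next = head;
            head = node;
        }
        return head;
    }

    // Collects the head and every STRIDE-th node after it.
    static List<Node> buildIndex(Node head) {
        List<Node> index = new ArrayList<>();
        int count = 0;
        for (Node n = head; n != null; n = n.next) {
            if (count % STRIDE == 0) index.add(n);
            count++;
        }
        return index;
    }

    // Scans the index for the last sampled node whose value is <= target,
    // then walks forward from there. Only correct on a sorted list.
    static Node find(List<Node> index, int target) {
        Node start = null;
        for (Node sample : index) {
            if (sample.value > target) break;
            start = sample;
        }
        for (Node n = start; n != null && n.value <= target; n = n.next) {
            if (n.value == target) return n;
        }
        return null;
    }
}
```

Because the index is itself sorted and stored in a random-access list, the linear scan over it could be replaced by a binary search; repeating that idea over several levels is essentially what a skip list does.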
Please note: I do NOT consider the code that creates the index array part of searching, so I do not add its time complexity (which is terrible) to the 'find' function's time complexity. I assume the index is built by a separate function after the list has been created, or is rebuilt periodically, just as it takes time for a webpage to show up in Google searches.
Also, this question was asked in a Microsoft interview and I wonder if the solution I provided would be any good or would I look like a fool for providing such kind of a solution. The solution has been written in Java.
Waiting for your feedback.

It is difficult to make out what problem it is that you are trying to solve here, or how your solution is supposed to work. (Hint: complete working code would help with both!)
However there are a couple of general things we can say:
You can't search a list data structure (e.g. find i in the list) in better than O(N) unless some kind of ordering has been placed on it, for example by sorting the elements.
If the elements of the list are sorted and your list is indexable (i.e. getting the element at position i is O(1)), then you can use binary search and find an element in O(logN).
You can't get the element at position i of a linked list in better than O(N).
If you add secondary data (indexes, whatever), you can potentially get better performance for certain operations ... at the expense of more storage space, and making certain other operations more expensive. However, you no longer have a list / linked list. The entire data structure is "something else".
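As a rough illustration of the first two points, the sketch below contrasts a linear scan with a binary search over a sorted, indexable list. The class and method names (SearchCosts, linearSearch, sortedSearch) are made up for this example; Collections.binarySearch is the standard library call.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration only.
final class SearchCosts {
    // O(N): no ordering is assumed, so every element may have to be inspected.
    static int linearSearch(List<Integer> list, int target) {
        int i = 0;
        for (int v : list) {
            if (v == target) return i;
            i++;
        }
        return -1;
    }

    // O(log N) when 'sorted' is in ascending order and supports O(1) get(i),
    // as an ArrayList does.
    static int sortedSearch(List<Integer> sorted, int target) {
        int pos = Collections.binarySearch(sorted, target);
        return pos >= 0 ? pos : -1;
    }
}
```

Passing a LinkedList to sortedSearch still returns the right answer, but the library then has to step through the nodes to reach each probe position, so the traversal cost is linear again.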

Related

Java interview question: get entry by two fields in O(log(n)) time

Hi, I had an interview task. The idea is to store elements with the fields id, name, and updateTime.
There should be methods add(Element), getElement(id), and getLastUpdatedElements().
Requirements:
Code should be in Java
Should be thread safe
Upper bound of computational complexity for all these methods should be O(log(n))
Notes
Update time of any element can be changed at runtime
getLastUpdatedElements - returns the elements updated within the last minute
My thoughts
I cannot use CopyOnWriteArrayList, because finding the last updated elements would take O(N) if the key is id, which breaks the requirement.
To fit O(log(N)) complexity for getLastUpdatedElements(), I can use ConcurrentSkipListSet with a comparator on updateTime, but then it takes O(N) to get an element by id. (Please note that in this case add(Element) is O(log(N)), since we know the updateTime of newly created elements.)
I can use two trees, the first with a comparator on id and the second with a comparator on updateTime, but then I would have to make all access methods synchronized, which makes my program effectively single-threaded.
I think I'm close and just need to find a way to get an element in O(log(N)), but I'm running out of ideas.

I hope I understood you correctly.
If you need to store the elements with "add" and "get" times as low as O(log(N)), that sounds like a classic hash map (Java's HashMap chains colliding entries in a linked list per bucket and, since Java 8 I believe, converts a bucket to a balanced tree once it grows past a certain threshold).
So in the worst case it's O(log(N)).
For the "get last updated" function: you can store each updated element in a stack (not really a stack, just a list you keep appending to), and when the function is called, perform a binary search on that list. When you reach the first item that was updated within the last minute, return the index of that item.
That way you only perform a binary search, which is O(log(N)).
Oh, and of course have a lock for those two data structures.
If you really want to dig into it performance-wise, you can implement two locks: one for inserting/updating entries, and one just for reading them.
This is similar to the "readers-writers problem": https://www.tutorialspoint.com/readers-writers-problem
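One possible shape for this is sketched below, with assumptions stated up front. It keeps a HashMap keyed by id, but swaps the append-only list for a TreeMap keyed by update time, because the question says update times can change and a plain list would then stop being sorted. A single ReentrantReadWriteLock stands in for the reader/writer locking. All names (ElementStore, Element, byId, idsByTime) are hypothetical, and the caller passes in the current time to keep the sketch deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch, not a definitive solution to the interview task.
final class ElementStore {
    static final class Element {
        final long id;
        final String name;
        final long updateTime; // epoch milliseconds
        Element(long id, String name, long updateTime) {
            this.id = id;
            this.name = name;
            this.updateTime = updateTime;
        }
    }

    private final Map<Long, Element> byId = new HashMap<>();
    private final TreeMap<Long, Set<Long>> idsByTime = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Adds a new element, or replaces the one with the same id (an update).
    void add(Element e) {
        lock.writeLock().lock();
        try {
            Element old = byId.put(e.id, e);
            if (old != null) {
                // Drop the stale time entry so the time index stays consistent.
                Set<Long> ids = idsByTime.get(old.updateTime);
                ids.remove(old.id);
                if (ids.isEmpty()) idsByTime.remove(old.updateTime);
            }
            idsByTime.computeIfAbsent(e.updateTime, t -> new HashSet<>()).add(e.id);
        } finally {
            lock.writeLock().unlock();
        }
    }

    Element getElement(long id) {
        lock.readLock().lock();
        try {
            return byId.get(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Returns elements whose updateTime is at or after (now - 60 seconds).
    List<Element> getLastUpdatedElements(long now) {
        lock.readLock().lock();
        try {
            List<Element> result = new ArrayList<>();
            for (Set<Long> ids : idsByTime.tailMap(now - 60_000L, true).values()) {
                for (long id : ids) result.add(byId.get(id));
            }
            return result;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Locating the start of the last-minute range costs O(log(N)); collecting k matching elements adds O(k) on top, which no structure can avoid if all of them must be returned.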

Traversal of Giant LinkedList

For a project I am required to write a method that times the traversal of a LinkedList filled with 5 million random Integers using a listIterator, then with LinkedList's get(index) method.
I had no problem traversing it with the listIterator and it completed in around 75ms. HOWEVER, after trying the get method traversal on 5 million Integers, I just stopped the run at around 1.5 hours.
The getTraverse method I used is something like the code below for example (however mine was grouped with other methods in a class and was non-static, but works the same way).
public static long getTraverse(LinkedList<Integer> list) {
    long start = System.currentTimeMillis();
    // Each get(i) walks the list to reach index i.
    for (int i = 0; i < list.size(); i++) {
        list.get(i);
    }
    long stop = System.currentTimeMillis();
    return stop - start;
}
This worked perfectly fine for LinkedLists of Integers of sizes 50, 500, 5000, 50000, and took quite a while but completed for 500000.
My professor tends to be extremely vague with instructions and very unhelpful when approached with questions. So, I don't know if my code is broken, or if he got carried away with the Integers in the guidelines. Any input is appreciated.

Think about how a LinkedList is implemented - as a chain of nodes - and you'll see that to get to a particular node you have to start at the head and traverse to that node.
You're calling .get() on a LinkedList n times, which requires traversing the list until it reaches that index. This means your getTraverse() method takes O(n^2) (or quadratic) time, because for each element it has to traverse (part of) the list.
As Elliott Frisch said, I suspect you're discovering exactly what your instructor wanted you to discover - that different algorithms can have drastically different runtimes, even if in principle they do the same thing.
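To see the difference on a smaller scale, a sketch along these lines can be run. The class name TraversalDemo and the list size of 20,000 are arbitrary choices for illustration, and the timings it prints will vary by machine.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class; sizes and names are illustrative.
final class TraversalDemo {
    // One pass over the chain of nodes: O(n) in total.
    static long sumWithIterator(List<Integer> list) {
        long sum = 0;
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) sum += it.next();
        return sum;
    }

    // On a LinkedList each get(i) walks the chain again: O(n^2) in total.
    static long sumWithGet(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) sum += list.get(i);
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 20_000; i++) list.add(i);

        long t0 = System.nanoTime();
        long a = sumWithIterator(list);
        long t1 = System.nanoTime();
        long b = sumWithGet(list);
        long t2 = System.nanoTime();

        System.out.println("iterator: " + (t1 - t0) / 1_000_000 + " ms, sum=" + a);
        System.out.println("get(i):   " + (t2 - t1) / 1_000_000 + " ms, sum=" + b);
    }
}
```

Both methods accept any List, so passing an ArrayList instead shows get(i) behaving in linear total time there.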

A LinkedList is optimised for insertion at either end, or at a position an iterator already points to, which is a constant-time operation.
Searching a LinkedList requires you to iterate over every element to find the one you want. You provide the index to the get method, but under the covers it is traversing the list to that index.
If you add some print statements, you'll probably see that the first X elements are retrieved pretty fast and it slows down over time as you index elements further down the list.
An ArrayList (backed by an array) is optimised for retrieval and can index the desired element in constant time. Try changing your code to use an ArrayList and see how much faster get runs.

Data structure in Java that supports quick search and remove in array with duplicates

More specifically, suppose I have an array with duplicates:
{3,2,3,4,2,2,1,4}
I want to have a data structure that supports search and remove the first occurrence of some value faster than O(n), say if the value is 4, then it becomes:
{3,2,3,2,2,1,4}
I also need to iterate the list from head according to the same order. Other operations like get(index) or insert are not needed.
You can use O(n) time to record the original data (say it's an int[]) in your data structure; I just need the later search-and-remove to be faster than O(n).
"Search and remove" is considered ONE operation, as shown above.
If I had to build it myself, I would use a LinkedList to store the data and a HashMap to map every key to a list of all the nodes where it occurs, together with their previous and next nodes.
Is this the right approach? Are there any better choices already available in Java?

The data structure you describe, essentially a hybrid linked list and map, I think is the most efficient way of handling your stated problem. You'll have to keep track of the nodes yourself, since Java's LinkedList doesn't provide access to the actual nodes. The AbstractSequentialList may be helpful here.
The index structure you'll need is a map from an element value to the appearances of that element in the list. I recommend a hash table from hashCode % modulus to a linked list of (value, list of main-list nodes).
Note that this approach is still O(n) in the worst case, when you have universal hash collisions; this applies whether you use open or closed hashing. In the average case it should be something closer to O(ln(n)), but I'm not prepared to prove that.
Consider also whether the overhead of keeping track of all of this is really worth the gains. Unless you've actually profiled running code and determined that a LinkedList is causing problems because remove is O(n), stick with that until you do.
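A minimal sketch of such a hybrid is shown below. It assumes int values, uses java.util.HashMap for the index rather than a hand-rolled hash table, and keeps its own doubly linked nodes because java.util.LinkedList does not expose them. The names IndexedList, Node, occurrences, removeFirst and toList are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hybrid of a doubly linked list and a value-to-nodes index.
final class IndexedList {
    private static final class Node {
        final int value;
        Node prev, next;
        Node(int value) { this.value = value; }
    }

    private Node head, tail;
    // For each value, its nodes in list order (first occurrence at the front).
    private final Map<Integer, Deque<Node>> occurrences = new HashMap<>();

    // O(n) setup: link the nodes and record each occurrence.
    IndexedList(int[] data) {
        for (int v : data) {
            Node n = new Node(v);
            if (tail == null) {
                head = n;
            } else {
                tail.next = n;
                n.prev = tail;
            }
            tail = n;
            occurrences.computeIfAbsent(v, k -> new ArrayDeque<>()).addLast(n);
        }
    }

    // Unlinks the first occurrence of value; returns false if it is absent.
    boolean removeFirst(int value) {
        Deque<Node> nodes = occurrences.get(value);
        if (nodes == null) return false;
        Node n = nodes.pollFirst(); // never null: empty deques are removed below
        if (nodes.isEmpty()) occurrences.remove(value);
        if (n.prev == null) head = n.next; else n.prev.next = n.next;
        if (n.next == null) tail = n.prev; else n.next.prev = n.prev;
        return true;
    }

    // Iterates from the head in the original order.
    List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.value);
        return out;
    }
}
```

With the question's data {3,2,3,4,2,2,1,4}, calling removeFirst(4) leaves the list as {3,2,3,2,2,1,4}; the lookup is a hash probe and the unlink touches only the neighbouring nodes.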

Since your requirement is that the first occurrence of the element should be removed and the remaining occurrences retained, there would be no way to do it faster than O(n) as you would definitely have to move through to the end of the list to find out if there is another occurrence. There is no standard api from Oracle in the java package that does this.

Insertion in the middle of ArrayList vs LinkedList [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
Speaking in the context of Java: if I want to insert in the middle of either an ArrayList or a LinkedList, I've been told that ArrayList will perform terribly.
I understand that this is because we need to shift all the elements and then do the insertion. This should be of the order n/2, i.e. O(n).
But isn't it the same for LinkedList? For a linked list, we need to traverse until we find the middle and then do the pointer manipulation. In this case too, it will take O(n) time, won't it?
Thanks

The reason here is that there's no actual shifting of elements in the linked list. A linked list is built up from nodes, each of which holds an element and a pointer to the next node. To insert an element into a list requires only a few things:
create a new node to hold the element;
set the next pointer of the previous node to the new node;
set the next pointer of the new node to the next element in the list.
If you've ever made a chain of paper clips, you can think of each paper clip as being the beginning of the chain of it and all the paper clips that come after it. To stick a new paper clip into the chain, you only need to disconnect the paper clips at the spot where the new one will go, and insert the new one. A LinkedList is like a paper clip chain.
An ArrayList is kind of like a pillbox or a mancala board where each compartment can hold only a single item. If you want to insert a new one in the middle (and keep all the elements in the same order), you're going to have to shift everything after that spot.
The insertion after a given node in a linked list is constant time, as long as you already have a reference to that node (with a ListIterator in Java), but getting to that position will typically require time linear in the position of the node. That is, to get to the nth node takes n steps. In an array list (or array, or any structure that's based on contiguous memory, really) the address of the nth element in the list is just (address of 1st element) + n × (size of element), a trivial bit of arithmetic, and our computing devices support quick access to arbitrary memory addresses.
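The two cases can be put side by side in a short sketch. The class and method names (MiddleInsert, insertInMiddle) are illustrative; listIterator(int), ListIterator.add and ArrayList.add(int, E) are the standard library calls.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.ListIterator;

// Hypothetical helper class for illustration only.
final class MiddleInsert {
    // Walks to the middle (linear), then splices in the element (constant).
    static void insertInMiddle(LinkedList<String> list, String element) {
        ListIterator<String> it = list.listIterator(list.size() / 2);
        it.add(element);
    }

    // Computes the middle slot directly (constant), then shifts every
    // later element one position to the right (linear).
    static void insertInMiddle(ArrayList<String> list, String element) {
        list.add(list.size() / 2, element);
    }
}
```

Both versions are O(n) overall for a single middle insert. The linked list pulls ahead only when the iterator is already positioned, for example when inserting many elements during one pass.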

I think that when analysing the complexity, you need to take into account the metric you are using. In the ArrayList, your metric is shuffling, which is just assignment. But this is quite a complex operation.
With a LinkedList, on the other hand, you're simply following references, and you only perform one insertion. So while the algorithmic complexity will wind up similar, the actual processes being executed in O(n) time are different. In the case of an ArrayList, it is performing a lot of memory manipulation. In the case of a LinkedList, it's only reading.
For those saying he doesn't understand LinkedLists:
A LinkedList only has a pointer at the start and a pointer at the end. It does not automatically know the node behind the node you want to delete (unless it's a doubly linked list), so you need to traverse the list from the start with a temp pointer until you come to the node before the one you want to delete, and I believe it's this that OP is discussing.

Java Search an array for a matching string

How can I optimize the following:
final String[] longStringArray = {"1","2","3".....,"9999999"};
String searchingFor = "9999998";
for (String s : longStringArray)
{
    if (searchingFor.equals(s))
    {
        // After 9999998 iterations finally found it
        // Do the rest of stuff here (not relevant to the string/array)
    }
}
NOTE: The longStringArray is searched only once per run, is not sorted, and is different each time I run the program.
I'm sure there is a way to improve the worst-case performance here, but I can't seem to find it...
P.S. I would also appreciate a solution that handles the case where the string searchingFor does not exist in the array longStringArray.
Thank you.

Well, if you have to use an array, and you don't know if it's sorted, and you're only going to do one lookup, it's always going to be an O(N) operation. There's nothing you can do about that, because any optimization step would be at least O(N) to start with - e.g. populating a set or sorting the array.
Other options though:
If the array is sorted, you could perform a binary search. This will turn each lookup into an O(log N) operation.
If you're going to do more than one search, consider using a HashSet<String>. This will turn each lookup into an O(1) operation (assuming few collisions).
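The three cases can be sketched as follows. The class and method names (StringLookup, containsOnce, toSet, containsSorted) are made up for illustration; HashSet and Arrays.binarySearch are the standard library pieces.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class for illustration only.
final class StringLookup {
    // One-off search on unsorted data: O(N). Returns false when absent.
    static boolean containsOnce(String[] array, String target) {
        for (String s : array) {
            if (target.equals(s)) return true;
        }
        return false;
    }

    // Many searches: pay O(N) once to build the set, then each
    // set.contains(...) call is O(1) on average.
    static Set<String> toSet(String[] array) {
        return new HashSet<>(Arrays.asList(array));
    }

    // Sorted array: O(log N) per lookup. The array must be sorted in
    // natural String order, e.g. by Arrays.sort.
    static boolean containsSorted(String[] sorted, String target) {
        return Arrays.binarySearch(sorted, target) >= 0;
    }
}
```

Note that natural String order is lexicographic, so "10" sorts before "9"; the binary search only works if the array was sorted with the same ordering it is searched with.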

import org.apache.commons.lang.ArrayUtils;
ArrayUtils.indexOf(array, string);
ArrayUtils documentation

You can create a second array with the hash codes of the strings and binary search on that.
You will have to sort the hash array and move the elements of the original array accordingly. This way you end up with extremely fast searching, but the arrays have to be kept ordered, so inserting new elements is costly.
The best option would be to implement a binary tree or a B-tree; if you really have that much data and have to handle inserts, it's worth it.

Arrays.asList(longStringArray).contains(searchingFor)
