Quadtree performance - java

Does anyone know where I can find documentation on, or know how many operations, insertion and queries take in a quadtree?
Wikipedia says O(log n), but I found another source saying O(n log n), and I need to know which is true.
I'm working with a point quadtree.
http://www.codeproject.com/Articles/30535/A-Simple-QuadTree-Implementation-in-C
http://en.wikipedia.org/wiki/Quadtree

Search: O(log n) for a reasonably balanced tree: it must traverse from the root down to the element. To be specific, the log in this case is log_4, as each node has 4 children. A point quadtree does not rebalance itself, so an unlucky insertion order can make the tree deeper than that.
Insert (single point): O(log n): you must traverse the tree to find the insertion location, then do a small constant amount of work to place the point in that quadrant.
Insert (n points): O(n log n): every point must be inserted, each at O(log n), leading to n log n. I hope this is what the other site you read meant by n log n; otherwise they would be very wrong.
The original paper is called "Quad trees a data structure for retrieval on composite keys" by Finkel and Bentley.
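To make the costs above concrete, here is a minimal sketch of a point quadtree in Java. The class and method names (PointQuadtree, insert, contains) are made up for illustration and are not taken from the linked CodeProject article. insert returns how many existing nodes it visited, so you can see that the cost of one insertion is the depth of the path it follows.

```java
// A minimal point-quadtree sketch. All names here are illustrative assumptions.
class PointQuadtree {
    private static final class Node {
        final double x, y;
        final Node[] child = new Node[4]; // quadrants: 0=SW, 1=SE, 2=NW, 3=NE
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    private Node root;
    private int size;

    // Which quadrant of node n the point (x, y) falls in.
    private static int quadrant(Node n, double x, double y) {
        int q = 0;
        if (x >= n.x) q |= 1;
        if (y >= n.y) q |= 2;
        return q;
    }

    /** Inserts a point and returns the number of existing nodes visited on the way down. */
    int insert(double x, double y) {
        if (root == null) { root = new Node(x, y); size++; return 0; }
        Node cur = root;
        int visited = 0;
        while (true) {
            visited++;
            if (cur.x == x && cur.y == y) return visited; // already present, nothing to add
            int q = quadrant(cur, x, y);
            if (cur.child[q] == null) {
                cur.child[q] = new Node(x, y);
                size++;
                return visited;
            }
            cur = cur.child[q];
        }
    }

    /** Follows a single root-to-leaf path, so the cost is the depth of that path. */
    boolean contains(double x, double y) {
        Node cur = root;
        while (cur != null) {
            if (cur.x == x && cur.y == y) return true;
            cur = cur.child[quadrant(cur, x, y)];
        }
        return false;
    }

    int size() { return size; }
}
```

With well-spread points the visited count stays near log_4 of the tree size. If you insert points in sorted order along a diagonal, such as (0,0), (1,1), (2,2), every point lands in the same quadrant of its predecessor and the count grows linearly, which is the degenerate case for this structure.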

Related

Finding shortest path from a list of points

If I have a list of points returned from a breadth-first-search through a type of 2D array maze in Java, how could I find the shortest path in that list of points?
For example, if my potential target points are [2, 4] and [6, 0] and I have a list of points that goes to each target point, how can I find out which route is shortest?
I'm thinking I might have to use Dijkstra's algorithm or something, but I'm unsure.
Thanks very much
You can use Dijkstra's algorithm here. Given the array maze you mentioned, the first step would be to get all the edges in your maze between 2 adjacent nodes, and the edge costs. For the nodes, you would need to know if each node has been visited, and the current cost of visiting that node.
Now, if you are not concerned with getting the shortest path, you might not need Dijkstra's. Instead, you can use a DP/Recursion approach to get possible paths from a source coordinate to a target coordinate. If you want to see a Dijkstra's implementation, this is something I had written. To get the shortest path, you would obviously need the distance between nodes.
To just find a path between 2 points, I would suggest something like this implementation. It includes both DP and recursion implementations and compares the running times for both (recursion can take very long to run for large datasets).
I feel this much should be enough to get you started. Let me know if you need some other information.
I didn't end up using Dijkstra's, but instead changed my breadth-first-search to add a "level" value or "distance" value from the origin point that would count upward with each visited node. The counts would stay consistent if the path ever branched, and since I already knew the end point, all it would take is to check out what the "count" was in my end points and compare.
Thank you for your help though! You'll get an upvote if the site lets me.
For those facing a similar problem:
I made a simple class called PointC which inherits from Point, but with an added "count" value. The count was updated appropriately with each step of the breadth-first search, then the counts at the end points were compared to obtain the shortest path.
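For anyone who wants to see the idea in code, here is a rough sketch of a breadth-first search that records a step count per cell. It is not the PointC code described above: it assumes the maze is a boolean[][] where true means walkable, and it keeps the counts in a separate int[][] instead of subclassing Point. The names MazeBfs and distances are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Sketch only: the grid representation and all names are assumptions.
class MazeBfs {
    /** Returns dist[r][c] = number of steps from the start cell, or -1 if unreachable. */
    static int[][] distances(boolean[][] open, int startRow, int startCol) {
        int rows = open.length, cols = open[0].length;
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) Arrays.fill(row, -1);

        Deque<int[]> queue = new ArrayDeque<>();
        dist[startRow][startCol] = 0;
        queue.add(new int[] {startRow, startCol});

        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}}; // down, up, right, left
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            for (int[] m : moves) {
                int r = cur[0] + m[0], c = cur[1] + m[1];
                if (r < 0 || c < 0 || r >= rows || c >= cols) continue; // off the grid
                if (!open[r][c] || dist[r][c] != -1) continue;          // wall or already visited
                dist[r][c] = dist[cur[0]][cur[1]] + 1;                  // one step further than cur
                queue.add(new int[] {r, c});
            }
        }
        return dist;
    }
}
```

Because every move costs the same, the first time BFS reaches a cell is along a shortest route, so comparing the values stored at the candidate target cells tells you which target is closest.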

Comparing 2 B-trees to see if they contain the same values

Seeing that 2 B-trees could have the same values, yet a different shape, is there an algorithm to go through the values and check whether both trees have the same keys?
The point is to be able to bail out as soon as possible if they contain different keys.
A recursive algorithm probably won't work unless you are performing a lookup in both B-trees at the same time, I'm guessing.
I've seen algorithms that traverse a B-tree, but I don't want to traverse both and then compare the keys. I want something smarter that will bail out as early as possible if there is a difference.
Basically the function returns true/false.
The fundamental technique is to somehow have an object that represents the current point in the in-order traversal. Once you have two of those, one for each instance of the tree, you just keep pumping them for the next key, and the first time the two return a different next key, you're done.
In C# you'd use yield return to make a traversal that yields up a single key at a time and keeps track of where it is in the tree. You can then pass two of those to SequenceEqual, and it will bail out as soon as it encounters the first difference. In Java you'd have to build that mechanism yourself, but it's not that hard to do.
Assuming you mean a b-tree then all you need to do is iterate over both at once. Any deviation between either iterator will prove that their contents differ. It is unlikely you will find a better algorithm than that without collecting more details as you build the trees.
If you are not talking about the b-tree which is described as:
... a B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time.
then you need to sort it first then traverse it.
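As a rough Java illustration of iterating over both at once: the sketch below takes two iterators that are assumed to yield keys in sorted order and stops at the first mismatch. The JDK has no B-tree class, so the demo uses TreeSet as a stand-in for anything with an in-order iterator. SortedKeyCompare and sameKeys are made-up names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

// Sketch only: assumes both iterators return their keys in the same sorted order.
class SortedKeyCompare {
    /** Returns true only if both iterators produce exactly the same sequence of keys. */
    static <T> boolean sameKeys(Iterator<T> a, Iterator<T> b) {
        while (a.hasNext() && b.hasNext()) {
            if (!a.next().equals(b.next())) {
                return false; // first difference: bail out without reading the rest
            }
        }
        // If one side still has keys left, the trees differ in size.
        return !a.hasNext() && !b.hasNext();
    }

    public static void main(String[] args) {
        // Same keys inserted in a different order, so the internal shape may differ.
        TreeSet<Integer> first = new TreeSet<>(Arrays.asList(5, 1, 9, 3));
        TreeSet<Integer> second = new TreeSet<>(Arrays.asList(9, 3, 5, 1));
        TreeSet<Integer> third = new TreeSet<>(Arrays.asList(1, 2, 5, 9));

        System.out.println(sameKeys(first.iterator(), second.iterator())); // true
        System.out.println(sameKeys(first.iterator(), third.iterator()));  // false
    }
}
```

If your tree already tracks its element count, comparing the two counts first is a cheap extra early exit.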

Dealing with identical data in a binary search tree

I'm in the process of teaching myself data structures and I am currently working on a binary search tree. I was curious how you would sort the tree if you had identical data. For example, say that my data consists of [4,6,2,8,4,5,7,3].
I set 4 as the root element
put 6 to the right of it
put 2 to the left of 4
put 8 to the right of 6
Then I get to the second 4. Where do I put it, since 4 = 4? To the left or the right?
Option #1
Option #2
Is either one of these correct, or are they both wrong? If they are both wrong, could you show me how they should be sorted? Thanks!
Usually binary search trees do not allow duplicate data. If you write a custom implementation, you can store a count of elements in each node. TreeSet in Java is an example: it contains only unique elements.
Actually, the cases you listed break the whole structure of the tree. Search operations will behave oddly and can no longer be performed in O(log n). They will take O(n) in the worst case, so you lose all the benefits of this data structure.
If this is a sort-tree, then what you have will work fine, either way; in the end you'll do a tree-walk and dump the data.
If this is a search-tree, then I'd just drop the extra (redundant) data once it's been encountered; "it exists". You did say this is a search-tree, and while not ideal, it's not actually broken: if you search for "4" you'll simply catch the root node (in this case), and never descend below that to see any other "4". It isn't optimal, having all the extra numbers around.
There will be best-case and worst-case situations regardless of which way you choose; don't worry too much about left/right decisions, as it generally doesn't matter. If you have a solid grasp of the details of a known data stream, you'd be able to make an optimal decision for that specific case.
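Here is a small sketch of the "store a count" idea mentioned above, so that a repeated key never needs a left/right decision at all. It is only one way to do it, the tree is not balanced, and the names (CountingBst, insert, count, inOrder) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: an unbalanced BST that keeps a per-key count instead of duplicate nodes.
class CountingBst {
    private static final class Node {
        final int key;
        int count = 1; // how many times this key has been inserted
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        Node cur = root;
        while (true) {
            if (key == cur.key) { cur.count++; return; } // duplicate: bump the count
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return; }
                cur = cur.right;
            }
        }
    }

    /** Number of times key was inserted, or 0 if it is absent. */
    int count(int key) {
        Node cur = root;
        while (cur != null) {
            if (key == cur.key) return cur.count;
            cur = key < cur.key ? cur.left : cur.right;
        }
        return 0;
    }

    /** In-order walk that emits each key as many times as it was inserted. */
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Node n, List<Integer> out) {
        if (n == null) return;
        walk(n.left, out);
        for (int i = 0; i < n.count; i++) out.add(n.key);
        walk(n.right, out);
    }
}
```

Inserting the question's data [4,6,2,8,4,5,7,3] gives a tree with seven nodes where the root 4 has a count of 2, and inOrder() still returns all eight values in sorted order.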

Euler depth first search algorithm

I've coded a depth-first search on a graph using Euler's algorithm of finding a cycle and splicing subcycles into the result.
The problem is that, for very large data, it isn't fast enough to find the correct path, namely in the DFS worst-case scenario.
I've ordered the adjacency list and start at a given point, to finish at the same starting point. My idea for improving it was to make the search bidirectional, but that adds a lot of complexity in dealing with the dead ends when I want to add order to the result.
My question is basically whether there is some other way to get around the worst-case scenario, or how to properly deal with dead ends in a bidirectional search so that the result stays numerically ordered.
Any input is welcome.

Data structures: Which should I use for these conditions?

This shouldn't be a difficult question, but I'd just like someone to bounce it off of before I continue. I simply need to decide what data structure to use based on these expected activities:
Will need to frequently iterate through in sorted order (starting at the head).
Will need to remove/restore arbitrary elements from the/a sorted view.
Later I'll be frequently resorting the data and working with multiple sorted views.
Also later I'll be frequently changing the position of elements within their sorted views.
This is in Java, by the way.
My best guess is that I'll either be rolling some custom Linked Hash Set (to arrange the links in sorted order) or possibly just using a Tree Set. But I'm still not completely sure yet. Recommendations?
Edit: I guess because of the arbitrary remove/restore, I should probably stick with a Tree Set, right?
Actually, not necessarily. Hmmm...
In theory, I'd say the right data structure is a multiway tree - preferably something like a B+ tree. Traditionally this is a disk-based data structure, but modern main memory has a lot of similar characteristics due to layers of cache and virtual memory.
In-order iteration of a B+ tree is very efficient because (1) you only iterate through the linked-list of leaf nodes - branch nodes aren't needed, and (2) you get extremely good locality.
Finding, removing and inserting arbitrary elements is log(n) as with any balanced tree, though with different constant factors.
Re-sorting within the tree is mostly a matter of choosing an algorithm that gives good performance when operating on a linked list of blocks (the leaf nodes), minimising the need to use branch nodes; variants of quicksort or mergesort seem like likely candidates. Once the items are sorted in the leaf nodes, just propagate the summary information back through the branch nodes.
BUT - pragmatically, this is only something you'd do if you're very sure that you need it. Odds are good that you're better off using some standard container. Algorithm/data structure optimisation is the best kind of optimisation, but it can still be premature.
Use the standard LinkedHashSet, or LinkedHashMultiset from Google Collections (now Guava) if you need the data structure to store non-unique values.
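If you go with a standard container, a sketch along these lines may be enough to try the TreeSet option from the question. It keeps two TreeSet views of the same strings, one in natural order and one ordered by length, and removes and restores an element in both. The class name SortedViews and the helper byLength are invented for the example, and ordering by length is just a placeholder for whatever your second sort key is.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.TreeSet;

// Sketch only: two sorted views over the same items using plain TreeSets.
class SortedViews {
    /** Builds a view sorted by length, with ties broken alphabetically. */
    static TreeSet<String> byLength(Collection<String> items) {
        Comparator<String> byLen = Comparator.comparingInt(String::length);
        Comparator<String> cmp = byLen.thenComparing(Comparator.<String>naturalOrder());
        TreeSet<String> view = new TreeSet<>(cmp);
        view.addAll(items);
        return view;
    }

    public static void main(String[] args) {
        TreeSet<String> alpha = new TreeSet<>(Arrays.asList("pear", "fig", "banana", "kiwi"));
        TreeSet<String> length = byLength(alpha);

        // Remove an arbitrary element from both views; each remove is O(log n).
        alpha.remove("kiwi");
        length.remove("kiwi");
        System.out.println(alpha + " " + length);

        // Restore it; each add is O(log n) and lands in the right sorted position.
        alpha.add("kiwi");
        length.add("kiwi");
        System.out.println(alpha + " " + length);
    }
}
```

Changing an element's position within a view means removing it, changing its sort key, and adding it back, because a TreeSet does not notice when a key changes while the element is inside the set.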
