presort data for balanced quadtree - java

The PMR rectangle quadtree is a quadtree that keeps a list of (rectangle) objects in each leaf. This list is called a bucket.
The structure of this quadtree depends on the order in which the elements are inserted.
For data that is known in advance (static), the inventor of this quadtree proposed pre-sorting the (rectangle) objects by their x and y coordinates before insertion, in order to obtain a balanced quadtree.
What exactly is meant by sorting by x and y coordinates to achieve a balanced quadtree?
Assume we take the SW corner of the rectangle. Does this mean sort by x and, if x is equal, sort by y? Or does it mean the first element is the one with the smallest x, the second the one with the smallest y (independent of x)?
The bible for this topic (Hanan Samet: Foundations of Multidimensional and Metric Data Structures) does not explain that.

This seems to be a topic where the know-how is not really widespread, so I have to answer it myself:
The elements should be pre-sorted in Morton order before they are added to the quadtree (see also the papers by Hanan Samet).
The Morton index computes a single int value from the two coordinates (x, y) by interleaving their bits, so that two coordinates that are close together usually also have a small difference in their Morton index.
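A minimal Java sketch of the Morton index described above, under some assumptions that are not in the original answer: coordinates are non-negative and fit into 16 bits each, each rectangle is represented by one reference point (for example its SW corner) stored as an int[]{x, y}, and the class and method names are made up for illustration.

```java
import java.util.List;

final class Morton {
    // Spread the lower 16 bits of v so that they occupy the even bit positions.
    static int spreadBits(int v) {
        v &= 0x0000FFFF;
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    }

    // Interleave the bits of x (even positions) and y (odd positions).
    static int encode(int x, int y) {
        return spreadBits(x) | (spreadBits(y) << 1);
    }

    // Sort reference points {x, y} by Morton index. The comparison is unsigned
    // because the highest bit of the index may be set for large coordinates.
    static void sortByMorton(List<int[]> points) {
        points.sort((p, q) -> Integer.compareUnsigned(
                encode(p[0], p[1]), encode(q[0], q[1])));
    }
}
```

With this sketch, Morton.encode(3, 5) is 39: the bits of x = 011 and y = 101 interleave to 100111.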

Related

Finding neighbor in 2D array using delta

I was going through one of the online courses on data structures and algorithms. There I saw a topic called "Delta based two dimensional array search", which is described as follows:
A technique to search adjoining array elements of four directions in coordinates of two directions.
Delta value is an array that saves coordinates of four direction in coordinates as well as difference between X and Y.
Delta value is employed to approach the elements located up down left right side of certain element.
As per my understanding, it is basically searching the neighbors of a given array element in four directions. I found a couple of good answers on Stack Overflow:
Finding the neighbors of 2D array
Finding 8 neighbours in 2d Array
Finding a neighbour in a 2d array
More efficient way to check neighbours in a two-dimensional array in Java
Finding neighbours in a two-dimensional array
However, I am not clear about the second point, "Delta value is an array that saves coordinates of four direction in coordinates as well as difference between X and Y". Can anyone explain what this point means? Also, do the Stack Overflow links mentioned above serve the purpose? Thank you.
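One plausible reading of the "delta value" is a pair of small arrays that hold the row and column differences for the four directions. The sketch below illustrates that reading in Java. It is an assumption about what the course means, not its actual material, and the names DR, DC and neighbors are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DeltaNeighbors {
    // Row and column offsets for up, down, left, right (the order is arbitrary).
    static final int[] DR = {-1, 1, 0, 0};
    static final int[] DC = {0, 0, -1, 1};

    // Collect the values of the existing neighbors of cell (r, c).
    static List<Integer> neighbors(int[][] grid, int r, int c) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < 4; k++) {
            int nr = r + DR[k];
            int nc = c + DC[k];
            // Skip positions that fall outside the array.
            if (nr >= 0 && nr < grid.length && nc >= 0 && nc < grid[nr].length) {
                result.add(grid[nr][nc]);
            }
        }
        return result;
    }
}
```

Adding the four diagonal offsets to DR and DC would extend the same loop to eight neighbors.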

graph representation with minimum cost (time and space)

I have to represent a graph in Java, but neither as an adjacency list nor as an adjacency matrix.
The basic idea is that if deg[i] is the out-degree of vertex i, then its neighbors can be stored in edges[i][j], where i <= j <= deg[i]. But given that edges[][] must be initialized with some values, I don't know how to make it differ from an adjacency matrix.
Any suggestions?
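One way to read the layout described in the question is a jagged array: in Java, each row of edges[][] can be allocated separately with exactly deg[i] entries, so it does not have to be a full V x V matrix. A minimal sketch, assuming vertices are numbered from 0 and the input is a list of directed {from, to} pairs (the class name and constructor are hypothetical):

```java
final class JaggedGraph {
    final int[] deg;      // deg[i] = out-degree of vertex i
    final int[][] edges;  // edges[i][j] = j-th neighbor of i, for 0 <= j < deg[i]

    JaggedGraph(int vertexCount, int[][] edgeList) {
        deg = new int[vertexCount];
        for (int[] e : edgeList) {
            deg[e[0]]++;
        }
        // Allocate each row with exactly as many slots as the vertex has neighbors.
        edges = new int[vertexCount][];
        for (int i = 0; i < vertexCount; i++) {
            edges[i] = new int[deg[i]];
        }
        // Fill the rows; next[i] is the next free slot in row i.
        int[] next = new int[vertexCount];
        for (int[] e : edgeList) {
            edges[e[0]][next[e[0]]++] = e[1];
        }
    }
}
```

The total size is one int per edge plus one row reference per vertex, rather than V^2 entries.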
To my knowledge, there are only two ways to represent a graph in a language:
Either use an adjacency matrix
Or use an incidence matrix
You can make an incidence matrix like this:
E1 E2 E3 E4
V1 1 2 1
1 V2 2
1 2 1 V3
1 1 1 2
V4 1 1 2 1
You are fighting against lower bounds on this question. The two main representations of the graph are already very good for their respective use.
Adjacency list: minimizes space. You will be hard pressed to use less memory than one pointer per edge. Space: O(V+E). Search: O(V).
Adjacency matrix: very fast, at the cost of V^2 space. Space: O(V^2). Search: O(1).
So, to make something that is better for both space and time, you will have to combine the ideas from both. Also realize that this will only give better practical performance; theoretically you will not improve on O(1) search or O(V+E) size.
My idea would be to store all the graph nodes in one array. Then for each Node have an adjacency list represented as a bit vector. This would essentially be a matrix like representation, but only for those nodes that exist in the graph, giving you a smaller size than a matrix. Search would be slightly improved over an Adjacency list as the query node can be tested against the bit vector.
Also check out sparse matrix representations.
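The bit-vector idea from this answer could look roughly like the following Java sketch, which uses java.util.BitSet for each node's adjacency. It assumes nodes are numbered 0..n-1; the class and method names are made up for illustration.

```java
import java.util.BitSet;

final class BitSetGraph {
    // adjacency[v] has bit w set when there is an edge v -> w.
    private final BitSet[] adjacency;

    BitSetGraph(int nodeCount) {
        adjacency = new BitSet[nodeCount];
        for (int v = 0; v < nodeCount; v++) {
            adjacency[v] = new BitSet(nodeCount);
        }
    }

    void addEdge(int from, int to) {
        adjacency[from].set(to);
    }

    // Constant-time edge test, like an adjacency matrix.
    boolean hasEdge(int from, int to) {
        return adjacency[from].get(to);
    }

    int outDegree(int v) {
        return adjacency[v].cardinality();
    }
}
```

This still needs up to V^2 bits in total, but only one bit per node pair instead of a full array element.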

Multi-dimensional segment trees

The problem I have to solve is the 4D version of the 1D problem of stabbing queries: find which intervals a number belongs to. I am looking for a multi-dimensional implementation of segment trees. Ideally, it will be in Java and it will use fractional cascading.
Multi-dimensional implementations exist for kd-trees (k-NN searches) and range trees (given a bounding box, find all points in it) but for segment trees I've only found 1D implementations.
I'd be happy to consider other data structures with similar space/time complexity to address the same problem.
To expand on my comment, the binary-space-partitioning algorithm I have in mind is this.
Choose a coordinate x and a threshold t (random coordinate, median coordinate, etc.).
Allocate a new node and assign all of the intervals that intersect the hyperplane x=t to it.
Recursively construct child nodes for (a) the intervals contained entirely within the lower half-space x<t and (b) the intervals contained entirely within the upper half-space x>t.
The stabbing query starts at the root, checks all of the intervals assigned to the current node, descends to the appropriate child, and repeats. It may be worthwhile to switch to brute force for small subtrees.
If too many intervals are getting stabbed by the hyperplane x=t, you could try recursing on (a) the intervals that intersect the lower half-space and (b) the intervals that intersect the upper half-space. This duplicates intervals, so the space requirement is no longer linear, and you probably have to switch over to brute force on collections of intervals for which subdivision proves unproductive.
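The first variant above (boxes that cross the splitting hyperplane stay at the node) can be sketched in Java as follows. Several choices here are assumptions rather than part of the answer: the coordinate cycles with the depth, the threshold is the median of the box centers along that coordinate, boxes are closed, and the input list is non-empty. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BoxStabbingTree {
    // Axis-aligned box given by its lower and upper corner.
    static final class Box {
        final double[] lo, hi;
        Box(double[] lo, double[] hi) { this.lo = lo; this.hi = hi; }
        boolean contains(double[] p) {
            for (int d = 0; d < lo.length; d++) {
                if (p[d] < lo[d] || p[d] > hi[d]) return false;
            }
            return true;
        }
    }

    private final int axis;
    private final double threshold;
    private final List<Box> crossing = new ArrayList<>();
    private final BoxStabbingTree lower, upper;

    // Builds a node from a non-empty list of boxes.
    BoxStabbingTree(List<Box> boxes, int depth) {
        axis = depth % boxes.get(0).lo.length;
        double[] centers = new double[boxes.size()];
        for (int i = 0; i < centers.length; i++) {
            Box b = boxes.get(i);
            centers[i] = (b.lo[axis] + b.hi[axis]) / 2;
        }
        Arrays.sort(centers);
        // The box whose center is the median always crosses the hyperplane,
        // so every node keeps at least one box and the recursion terminates.
        threshold = centers[centers.length / 2];
        List<Box> below = new ArrayList<>();
        List<Box> above = new ArrayList<>();
        for (Box b : boxes) {
            if (b.hi[axis] < threshold) below.add(b);
            else if (b.lo[axis] > threshold) above.add(b);
            else crossing.add(b);
        }
        lower = below.isEmpty() ? null : new BoxStabbingTree(below, depth + 1);
        upper = above.isEmpty() ? null : new BoxStabbingTree(above, depth + 1);
    }

    // Adds every box that contains point p to out.
    void stab(double[] p, List<Box> out) {
        for (Box b : crossing) {
            if (b.contains(p)) out.add(b);
        }
        if (p[axis] < threshold && lower != null) lower.stab(p, out);
        else if (p[axis] > threshold && upper != null) upper.stab(p, out);
    }
}
```

Nothing in the sketch depends on the number of dimensions, so the 4D case only needs boxes with four coordinates per corner.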

Java: distance metric algorithm design

I am trying to work out the following problem in Java (although it could be done in pretty much any other language):
I am given two arrays of integer values, xs and ys, representing data points on the x-axis. Their lengths might not be identical, though both are > 0, and they need not be sorted. What I want to calculate is the minimum distance measure between the two sets of points. What I mean by that is: for each x, I find the closest y in the set of ys and calculate the distance, for instance (x-y)^2. For instance:
xs = [1,5]
ys = [10,4,2]
should return (1-2)^2 + (5-4)^2 + (5-10)^2
Distance measure is not important, it's the algorithm I am interested in. I was thinking about sorting both arrays and advance indexes in both of these arrays somehow to achieve something better than bruteforce (for each elem in x, scan all elems in ys to find min) which is O(len1 * len2).
This is my own problem I am working on, not a homework question. All your hints would be greatly appreciated.
I assume that HighPerformanceMark (first comment on your question) is right and you actually take the larger array, find for each element the closest one of the smaller array and sum up some f(dist) over those distances.
I would suggest your approach:
Sort both arrays
indexSmall = 0
sum = 0
// sum up
for all elements e in bigArray {
    // advance the index as long as the next element is not farther away
    // (the bounds check keeps indexSmall + 1 inside smallArray)
    while (indexSmall + 1 < length(smallArray) &&
           dist(e, smallArray(indexSmall)) >= dist(e, smallArray(indexSmall + 1))) {
        indexSmall++
    }
    sum += f(dist(e, smallArray(indexSmall)))
}
which is O(max(len1,len2)*log(max(len1,len2))) for the sorting. The rest is linear in the length of the larger array. Here dist(x,y) would be something like abs(x-y), and f(d)=d^2 or whatever you want.
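A runnable Java sketch of the sort-and-sweep approach above, assuming both arrays are non-empty, dist(x,y) = abs(x-y) and f(d) = d^2. The class and method names are hypothetical, and the comparison uses >= so that duplicate values in the smaller array do not stall the index.

```java
import java.util.Arrays;

final class NearestSumTwoPointer {
    // For each element of the larger array, add the squared distance to the
    // nearest element of the smaller array. Both arrays must be non-empty.
    static long sumOfSquaredNearest(int[] a, int[] b) {
        int[] big = (a.length >= b.length ? a : b).clone();
        int[] small = (a.length >= b.length ? b : a).clone();
        Arrays.sort(big);
        Arrays.sort(small);
        long sum = 0;
        int i = 0;
        for (int e : big) {
            // Advance while the next candidate is at least as close as the current one.
            while (i + 1 < small.length
                    && Math.abs((long) e - small[i]) >= Math.abs((long) e - small[i + 1])) {
                i++;
            }
            long d = (long) e - small[i];
            sum += d * d;
        }
        return sum;
    }
}
```

For the example from the question, xs = [1,5] and ys = [10,4,2], this returns 1 + 1 + 25 = 27.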
Your proposed idea sounds good to me. You can sort the lists in O(n log n) time. Then you can do a single iteration over the longer list, using a sliding index on the other to find the "pairs". As you progress through the longer list, you will never have to backtrack on the other. So your whole algorithm is O(n log n + n) = O(n log n).
Your approach is pretty good and has O(n1*log(n1)+n2*log(n2)) time complexity.
If the arrays have different lengths, another approach is to:
sort the shorter array;
traverse the longer array from start to finish, using binary search to locate the nearest item in the sorted short array.
This has O((n1+n2)*log(n1)) time complexity, where n1 is the length of the shorter array.
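A Java sketch of this binary-search variant, using Arrays.binarySearch from the standard library. It assumes both arrays are non-empty and uses the squared difference as the distance measure; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class NearestSumBinarySearch {
    // Sort only the shorter array, then look up each element of the longer
    // array with a binary search. Both arrays must be non-empty.
    static long sumOfSquaredNearest(int[] a, int[] b) {
        int[] longer = a.length >= b.length ? a : b;
        int[] shorter = (a.length >= b.length ? b : a).clone();
        Arrays.sort(shorter);
        long sum = 0;
        for (int e : longer) {
            int pos = Arrays.binarySearch(shorter, e);
            long best = 0; // exact match found
            if (pos < 0) {
                // Not found: pos encodes the insertion point as -(ins) - 1.
                int ins = -pos - 1;
                best = Long.MAX_VALUE;
                if (ins < shorter.length) {
                    best = Math.abs((long) shorter[ins] - e);
                }
                if (ins > 0) {
                    best = Math.min(best, Math.abs((long) e - shorter[ins - 1]));
                }
            }
            sum += best * best;
        }
        return sum;
    }
}
```

Only the two elements around the insertion point can be the nearest one, which is why two comparisons per lookup are enough.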

Mapping Pixels to Data

I've written some basic graphing software in Clojure/Java using drawLine() on the graphics context of a modified JPanel. The plotting itself is working nicely, but I've come to an impasse while trying to convert a clicked pixel to the nearest data point.
I have a simple bijection between the list of all pixels that mark end points of my lines and my actual raw data. What I need is a surjection from all the pixels (say, 1200x600 px²) of my graph window to the pixels in my pixel list, giving me a trivial mapping from that to my actual data points.
e.g.
<x,y>(px) ----> <~x,~y>(pixel points) ----> <x,y>(data)
This is the situation as I'm imagining it now:
A pixel is clicked in the main graph window, and the MouseListener catches that event and gives me the <x,y> coordinates of the action.
That information is passed to a function that returns a predicate which determines whether or not a value passed to it is "good enough". I then filter through the list with that predicate and take the first value it okays.
Possibly, instead of a predicate, it returns a function which is passed the list of the pixel points and returns a list of tuples (x index), where the magnitude of x indicates how good the point is and index indicates where that point is. I'd do this with both the x points and the y points. I then filter through that, find the one with the max x, and take that one to be the point the user most likely meant.
Are these reasonable solutions to this problem? It seems that the solution which involves confidence ratings (distance from pix-pt, perhaps) may be too processor heavy, and a bit memory heavy if I'm holding all the points in memory again. The other solution, using just the predicate, doesn't seem like it'd always be accurate.
This is a solved problem, as other graphing libraries have shown, but it's hard to find information about it other than in the source of some of these programs, and there's got to be a better way than digging through thousands of lines of Java to find this out.
I'm looking for better solutions, or just general pointers and advice on the ones I've offered, if possible.
So I'm guessing something like JFreeChart just wasn't cutting it for your app? If you haven't gone down that road yet, I'd suggest checking it out before attempting to roll your own.
Anyway, if you're looking for the nearest point to a mouse event, taking the point with the minimum Euclidean distance (if it's below some threshold) and presenting that will give the most predictable behavior for the user. The downside is that computing the Euclidean distance to every point is relatively slow for large data sets. You can use tricks like skipping the square root (comparing squared distances) or BSP trees to speed it up a bit. But whether those optimizations are even necessary really depends on how many data points you're working with. Profile a somewhat naive solution in a typical case before going into optimization mode.
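A naive Java version of this suggestion might look like the sketch below. It assumes the pixel points are kept in two parallel int arrays and that the threshold is given in pixels; the class name, method name and parameters are hypothetical.

```java
final class NearestPoint {
    // Returns the index of the pixel point closest to the click, or -1 when
    // no point lies within maxDistance pixels. Squared distances are compared
    // so that no square root is needed.
    static int nearestIndex(int[] xs, int[] ys, int clickX, int clickY, double maxDistance) {
        double bestSq = maxDistance * maxDistance;
        int best = -1;
        for (int i = 0; i < xs.length; i++) {
            long dx = (long) xs[i] - clickX;
            long dy = (long) ys[i] - clickY;
            long distSq = dx * dx + dy * dy;
            if (distSq <= bestSq) {
                bestSq = distSq;
                best = i;
            }
        }
        return best;
    }
}
```

The returned index can then be used directly with the bijection between pixel points and raw data.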
I think your approach is decent. This basically only requires one iteration through your data array, a little simple maths and no allocations at each step so should be very fast.
It's probably as good as you are going to get unless you start using some form of spatial partitioning scheme like a quadtree, which would only really make sense if your data array is very large.
Some Clojure code which may help:
(defn squared-distance [x y point]
  (let [dx (- x (.x point))
        dy (- y (.y point))]
    (+ (* dx dx) (* dy dy))))

(defn closest
  ([x y points]
   (let [v (first points)]
     (closest x y (rest points) (squared-distance x y v) v)))
  ([x y points bestdist best]
   (if (empty? points)
     best
     (let [v (first points)
           dist (squared-distance x y v)]
       (if (< dist bestdist)
         (recur x y (rest points) dist v)
         (recur x y (rest points) bestdist best))))))
