java - search sorted list of rectangles

I have a list of Rectangles, created in the usual way with:
List<Rectangle> rects = new ArrayList<>();
Some Rectangles are added (all with non-zero width and height). The number of Rectangles the List contains can be anywhere between 0 and 10,000, and will typically be between 4,000 and 6,000.
The list is sorted by ascending X-coordinate of the Rectangle origin, and then by ascending Y-coordinate for duplicate X-coordinates (though two or more rectangles with the same X-coordinate is rare).
I've verified the sorting is being done correctly (I'm using Collections.sort with a custom comparator).
I need a method that takes as input two ints, x and y, and returns the first Rectangle found containing the point (x,y), or null if no Rectangle in the list contains that point.
public Rectangle findContainingRectangle(int x, int y)
The naive method, which does give the desired functionality, is to just loop through the list and call the contains method on each Rectangle, but that is much too slow.
The List will be modified while the program is running, but at an insignificant rate compared to the rate at which the List needs to be searched, so an algorithm that requires a relatively slow initialization is fine.
I've looked at Collections.binarySearch but couldn't figure out how it might be used. I don't have much experience with Java so if there's another Collection that could be used similarly to a List but better suited to the type of search I need, then that's great (I have read the documentation on things like Maps and Sets but didn't recognize any advantage).

If you keep the list sorted, you can binary search on the X coordinate to narrow down the candidate rectangles that could contain the wanted x, and then binary search those candidates on the Y coordinate.
You will have to implement the binary search yourself; I can't see a way to use the Collections.binarySearch method here.
Expected complexity: O(log n), where n is the number of rectangles.
(It is a bit more in practice, because several rectangles can share the same X coordinate.)
To do so, however, you must keep the list sorted as rectangles are added (re-sort after every insert, or insert each new rectangle at its sorted position).
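A minimal sketch of the binary-search idea, assuming the rectangles are java.awt.Rectangle instances sorted by origin x and then y as in the question. The class name SortedRectSearch and the maxWidth pruning are assumptions added for illustration, not part of the answer above. Sorting by origin alone only tells you that rectangles starting to the right of x cannot match; remembering the widest rectangle also lets the search skip rectangles that start too far to the left. The candidates in between are still checked one by one, so the worst case remains linear when rectangles are wide.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortedRectSearch {
    private final List<Rectangle> rects; // sorted by origin x, then origin y
    private final int maxWidth;          // widest rectangle, used to prune the left side

    SortedRectSearch(List<Rectangle> input) {
        rects = new ArrayList<>(input);
        rects.sort(Comparator.comparingInt((Rectangle r) -> r.x).thenComparingInt(r -> r.y));
        int widest = 0;
        for (Rectangle r : rects) {
            widest = Math.max(widest, r.width);
        }
        maxWidth = widest;
    }

    /** Index of the first rectangle whose origin x is >= key (rects.size() if none). */
    private int lowerBound(int key) {
        int lo = 0, hi = rects.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (rects.get(mid).x < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Returns the first rectangle (in sorted order) containing (px, py), or null. */
    Rectangle findContainingRectangle(int px, int py) {
        // A rectangle can contain px only if origin.x <= px < origin.x + width,
        // so its origin x must lie in (px - maxWidth, px].
        int from = lowerBound(px - maxWidth + 1);
        int to = lowerBound(px + 1);
        for (int i = from; i < to; i++) {
            Rectangle r = rects.get(i);
            if (r.contains(px, py)) {
                return r;
            }
        }
        return null;
    }
}
```

Because the list changes rarely, one option is to rebuild this object after each modification; the rebuild costs one sort, which fits the question's allowance for a slow initialization.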

Use HashSet. Map isn't appropriate here since you're not creating key-value pairs, and a Stream doesn't fit in this context either.
Be sure to override equals() and hashCode() in Rectangle, as described here: Why do I need to override the equals and hashCode methods in Java?

You can search your list using a parallel stream like this (rects is the sorted list from the question):
public Rectangle findContainingRectangle(final int x, final int y) {
    // findFirst() respects the list's encounter order, even on a parallel stream
    return rects.parallelStream()
            .filter(r -> r.contains(x, y))
            .findFirst()
            .orElse(null); // null when no rectangle contains the point
}

Just run binary search repeatedly. Since, as you say, duplicate x values are rare, it won't take many passes, so it will still be O(log n):
a) run binary search
b) if an item is found, remove it and remember the index where it was found
c) repeat from a) on the remaining list until the search returns nothing
d) you then have a small array of indexes, and you can see which one is the smallest
e) reinsert the removed elements at their original positions

You can try a stream and measure its performance. I am not sure it will be fast enough, but you can test it.
Rectangle rec = rects.stream()
        .filter(r -> r.contains(x, y))
        .findFirst()
        .orElse(null); // get() would throw NoSuchElementException when nothing matches

You can create a Map.
A Map is the best way to associate two values. You can associate each 'x' value with the position of its first occurrence in your List. Then you only have to loop from that first 'x' position until the next 'x' value in your list.
If the 'x' is not in the Map, the list has no matching rectangle.
This way you don't visit any of the entries with the wrong 'x'.

Related

Fastest way to check which rectangle is clicked in a list of rectangles

I have a rectangle Object with x, y, width and height. I have a list of these rectangles which are displayed on a screen. It is guaranteed that none of them overlap. Given a user's click position (x and y coordinates), I want to see which of these rectangles were clicked (since they do not overlap, there is a maximum of one rect that can be clicked).
I can obviously look through all of them and check for each one if the user clicked it but this is very slow because there are many on the screen. I can use some kind of comparison to keep the rectangles sorted when I insert a new one into the list. Is there some way to use something similar to binary search in order to decrease the time it takes to find which rect was clicked?
Note: the rectangles can be any size.
Thanks:)
Edit: To get an idea of what I am making visit koalastothemax.com

The best solution depends heavily on your application and on details we're not aware of yet. But, with as little as I know, I'd say you can make a 2D array that points to your rectangles. That 2D array would map directly to regions of pixels on the screen. So if you make the array 10x20, then the x coordinate divided by the screen width times 10 (cast to int) will be the first index, and y divided by the screen height times 20 will be the second index. With those two indexes, you can go directly to the rectangle that cell points to. Some cells might be empty and some might point to more than one rectangle if the rectangles don't line up with the grid, but that seems the easiest way to me without knowing much about the application.
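The grid idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the GridIndex and Rect names, the half-open rectangle bounds, and the choice to store a small list per cell so that a cell can hold more than one rectangle.

```java
import java.util.ArrayList;
import java.util.List;

final class GridIndex {
    /** Hypothetical minimal rectangle type; bounds are half-open. */
    static final class Rect {
        final int x, y, width, height;

        Rect(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        boolean contains(int px, int py) {
            return px >= x && px < x + width && py >= y && py < y + height;
        }
    }

    private final int screenWidth, screenHeight, cols, rows;
    private final List<List<Rect>> cells = new ArrayList<>();

    GridIndex(int screenWidth, int screenHeight, int cols, int rows) {
        this.screenWidth = screenWidth;
        this.screenHeight = screenHeight;
        this.cols = cols;
        this.rows = rows;
        for (int i = 0; i < cols * rows; i++) {
            cells.add(new ArrayList<>());
        }
    }

    // Pixel coordinate -> cell index, clamped to the grid.
    private int col(int px) {
        return Math.min(cols - 1, Math.max(0, (int) ((long) px * cols / screenWidth)));
    }

    private int row(int py) {
        return Math.min(rows - 1, Math.max(0, (int) ((long) py * rows / screenHeight)));
    }

    /** Registers the rectangle in every cell it overlaps. */
    void add(Rect r) {
        for (int cy = row(r.y); cy <= row(r.y + r.height - 1); cy++) {
            for (int cx = col(r.x); cx <= col(r.x + r.width - 1); cx++) {
                cells.get(cy * cols + cx).add(r);
            }
        }
    }

    /** Returns the rectangle under the click, or null if there is none. */
    Rect find(int px, int py) {
        if (px < 0 || py < 0 || px >= screenWidth || py >= screenHeight) {
            return null;
        }
        for (Rect r : cells.get(row(py) * cols + col(px))) {
            if (r.contains(px, py)) {
                return r;
            }
        }
        return null;
    }
}
```

A lookup then only tests the few rectangles registered in one cell. The grid resolution is a tuning knob: finer grids mean shorter per-cell lists but more memory and slower inserts for large rectangles.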

I have tackled a very similar problem in the past when developing a simulation. In my case the coordinates were doubles (so no integer indexing was possible) and there could be hundreds of millions of them that needed to be searched.
My solution was to create an Axis class to represent each axis as a sequence of ranges. The ranges were guaranteed to go from a minimum to a maximum and the class was smart enough to split itself into pieces when new ranges were added. Each range has a single generic object stored. The class used a binary search to find a range quickly.
So roughly the class looks like:
class Axis<T> {
    public Axis(double min, double max, Supplier<T> creator);
    public Stream<T> add(double from, double to);
    public T get(double coord);
}
The add method needs to return a stream because the added range may cover several ranges.
To store rectangles:
Axis<Axis<Rectangle>> rectangles = new Axis<>(0.0, 100.0,
        () -> new Axis<>(0.0, 100.0, Rectangle::new));
rectangles.add(x, x + w).forEach(r -> r.add(y, y + h).forEach(Rectangle::setPresent));
And to find a rectangle:
rectangles.get(x).get(y);
Note that there's always an object stored so you need a representation such as Rectangle.NULL for 'not present'. Or you could make it Optional<Rectangle> (though that indirection eats a lot of memory and processing for large numbers of rectangles).
I've just given the high level design here rather than any implementation details so let me know if you want more info on how to make it work. Getting the logic right on the range splits is not trivial. But I can guarantee that it's very fast even with very large numbers of rectangles.

The fastest way I can come up with is definitely not the most memory efficient. It works by exploiting the fact that a hash table has amortized constant lookup time. It maps every point that a rectangle covers to that rectangle. This is only really effective if you are using integers. You might be able to get it to work with floats if you use a bit of rounding.
Make sure that the Point class implements both hashCode and equals.
public class PointCheck
{
    public Map<Point, Rect> pointMap;

    public PointCheck()
    {
        pointMap = new HashMap<>();
    }

    /**
     * Map every point that the rectangle contains
     * to the rectangle.
     */
    public void addRect(Rect rect)
    {
        for(int i = rect.x; i < rect.x + rect.width; ++i)
        {
            // the inner loop must advance j, not i
            for(int j = rect.y; j < rect.y + rect.height; ++j)
            {
                pointMap.put(new Point(i, j), rect);
            }
        }
    }

    /**
     * Returns the rectangle clicked, null
     * if there is no rectangle.
     */
    public Rect checkClick(Point click)
    {
        return pointMap.get(click);
    }
}
Edit:
Just thought I should mention this: All of the rectangles held in the value of the hash map are references to the original rectangle, they are not clones.

Optimally searching for 2D points in the given area (webservice)

I've got an algorithmic and performance problem to solve with Java. I've got a large collection of 2D points (let's say there are about 100 000 of them). I want to get the set of them that lie in a given area around the search point SP(X_sp, Y_sp), that is, the points P(x, y) that meet the criteria:
x is between X_sp - constValue and X_sp + constValue AND y is between Y_sp - constValue and Y_sp + constValue
To give you an idea of the number relations, constValue will be like 2, 5 or 10, and x, y will range between 0 and 1000. It's meant to be a webservice, so a possibility of searching around many different points at the same time must be taken into account.
As these are fixed points (not to change due to calculations or something), I thought that it would be optimal to provide one list of objects sorted by X and another one, but sorted by Y. Then, I'll first get the points within the X range, and, using references, get the set of this points from another list (sorted by Y). Then I'll narrow this selection by Y and in result get the points in the given area.
I don't know Java inside-out, so I'd like to ask you about the most optimized approach. Which objects should I use to store sorted points so that objects within a range can be found quickly? Or maybe I have to implement my own custom algorithm for this task? Also, when it comes to storing the points in the database, are SQL queries sufficiently fast to deliver the results? Or maybe NoSQL databases are better for this?
I'm going to perform my own tests, but I'm looking for some starting candidates.

I'd probably use a TreeMap<Integer, TreeSet<Integer>>, where the key to the map is the x coordinate and, for each x coordinate, the value is the sorted set of y coordinates. You can then use ceilingEntry and floorEntry to find the lowest and highest x coordinates that fall within your range. Then, for each TreeSet<Integer> that you get, you can use ceiling and floor to get the corresponding y entries.
Of course, this only gives you the coordinates of the bounds of your box (the four corners). But TreeSet also has a subSet method that gives you a range of values, and TreeMap has the equivalent subMap. You will use one of each: subMap for the x coordinates that are within your bounds, then, for each of those x coordinates, subSet for the y coordinates that are within the bounds. So the pseudocode would be sort of like this:
List<Point> result = new ArrayList<>();
// inclusive bounds on both ends, so no separate ceiling/floor lookups (or null checks) are needed
for each entry in points.subMap(x - c, true, x + c, true).entrySet()
    int px = entry.getKey();
    TreeSet<Integer> yCoordinates = entry.getValue();
    for each py in yCoordinates.subSet(y - c, true, y + c, true)
        result.add(new Point(px, py))
I haven't tested this out, so there are probably some bugs or something I've missed. Let me know and I'll correct the answer.
The floor and ceiling calls are log(n) I believe -- this is where you get the performance benefit because if you use a list, it would be O(n) to look that up.
Note: I don't know if this is the most performant. SO is typically not the place for such an open-ended question so you might have more luck elsewhere.
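As a runnable variant of the TreeMap idea, here is a small sketch. The PointIndex class, its method names, and the choice to return points as {x, y} int pairs are assumptions made for this example. Note that a TreeSet silently drops duplicate points, so this only fits if duplicates don't matter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class PointIndex {
    // x coordinate -> sorted set of y coordinates stored at that x
    private final TreeMap<Integer, TreeSet<Integer>> byX = new TreeMap<>();

    void add(int x, int y) {
        byX.computeIfAbsent(x, k -> new TreeSet<>()).add(y);
    }

    /** Returns every stored point with |px - x| <= c and |py - y| <= c as {px, py}. */
    List<int[]> query(int x, int y, int c) {
        List<int[]> result = new ArrayList<>();
        // subMap/subSet with inclusive bounds select the range in O(log n) plus the matches
        for (Map.Entry<Integer, TreeSet<Integer>> column
                : byX.subMap(x - c, true, x + c, true).entrySet()) {
            for (int py : column.getValue().subSet(y - c, true, y + c, true)) {
                result.add(new int[] {column.getKey(), py});
            }
        }
        return result;
    }
}
```

Since the points are fixed, the index can be built once at startup and then only read. Results come back ordered by x and then by y.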

Finding Rectangle which contains a Point

In Java SE 7, I'm trying to solve a problem where I have a series of Rectangles. Through some user interaction, I get a Point. What I need to do is find the (first) Rectangle which contains the Point (if any).
Currently, I'm doing this via the very naive solution of just storing the Rectangles in an ArrayList, and searching for the containing Rectangle by iterating over the list and using contains(). The problem is that, because this needs to be interactive for the user, this technique starts to be too slow for even a relatively small number of Rectangles (say, 200).
My current code looks something like this:
// Given rects is an ArrayList<Rectangle>, and p is a Point:
for(Rectangle r : rects)
{
    if(r.contains(p))
    {
        return r;
    }
}
return null;
Is there a more clever way to solve this problem (namely, in O(log n) instead of O(n), and/or with fewer calls to contains() by eliminating obviously bad candidates early)?

Yes, there is. Build two interval trees: one over the rectangles' x-ranges (x1 to x2) and one over their y-ranges (y1 to y2). Then, when you have the coordinates of the point, perform an O(log n) search in each tree.
That will tell you whether there are possibly rectangles around the point of interest. You still need to check whether the two trees report a rectangle in common.
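One way the two-tree idea might be sketched is with a centered interval tree per axis, intersecting the two result sets at the end. The IntervalTree and RectLocator classes below are hypothetical, assume java.awt.Rectangle with positive width and height, and use lambdas, so they need Java 8 or later rather than the Java SE 7 mentioned in the question. Each stabbing query costs O(log n) plus the number of intervals it reports, so it is only close to O(log n) when few rectangles share the same x or y range.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

/** Centered interval tree over closed int intervals stored as {lo, hi, id}. */
final class IntervalTree {
    private final int center;
    private final List<int[]> byLo = new ArrayList<>(); // intervals covering center, ascending lo
    private final List<int[]> byHi = new ArrayList<>(); // the same intervals, descending hi
    private final IntervalTree left, right;

    private IntervalTree(List<int[]> intervals) {
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingInt((int[] iv) -> iv[0]));
        center = sorted.get(sorted.size() / 2)[0]; // median start point
        List<int[]> lower = new ArrayList<>(), higher = new ArrayList<>();
        for (int[] iv : sorted) {
            if (iv[1] < center) {
                lower.add(iv);
            } else if (iv[0] > center) {
                higher.add(iv);
            } else {
                byLo.add(iv);
                byHi.add(iv);
            }
        }
        byHi.sort(Comparator.comparingInt((int[] iv) -> iv[1]).reversed());
        left = build(lower);
        right = build(higher);
    }

    /** Returns null for an empty list. */
    static IntervalTree build(List<int[]> intervals) {
        return intervals.isEmpty() ? null : new IntervalTree(intervals);
    }

    /** Sets bit id in hits for every interval that contains p. */
    void stab(int p, BitSet hits) {
        if (p < center) {
            for (int[] iv : byLo) {
                if (iv[0] > p) break;
                hits.set(iv[2]);
            }
            if (left != null) left.stab(p, hits);
        } else {
            for (int[] iv : byHi) {
                if (iv[1] < p) break;
                hits.set(iv[2]);
            }
            if (right != null) right.stab(p, hits);
        }
    }
}

final class RectLocator {
    private final IntervalTree xTree, yTree;

    RectLocator(List<Rectangle> rects) {
        List<int[]> xs = new ArrayList<>(), ys = new ArrayList<>();
        for (int id = 0; id < rects.size(); id++) {
            Rectangle r = rects.get(id);
            xs.add(new int[] {r.x, r.x + r.width - 1, id});
            ys.add(new int[] {r.y, r.y + r.height - 1, id});
        }
        xTree = IntervalTree.build(xs);
        yTree = IntervalTree.build(ys);
    }

    /** Index (in the original list) of the first rectangle containing the point, or -1. */
    int find(int px, int py) {
        if (xTree == null) return -1;
        BitSet inX = new BitSet(), inY = new BitSet();
        xTree.stab(px, inX);
        yTree.stab(py, inY);
        inX.and(inY); // keep only rectangles reported by both trees
        return inX.nextSetBit(0);
    }
}
```

The final intersection is the "common rectangle" check mentioned above; taking the lowest set bit preserves the "first rectangle in the list" behaviour of the original loop.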

Fast way to sort really big vector

I have a really big vector that stores 100000 different values, ranging from 0 to 50000.
They represent the cylinders on a hard disk, and I want to sort this vector according to three different algorithms used for disk scheduling.
So far, I read those 100000 values from a file, store them in a vector, and then sort them according to the desired algorithm (FCFS, SCAN, SSTF). The problem is that it takes too long, because I'm doing it in the least creative way possible:
public static Vector<Integer> sortSSTF(Vector<Integer> array){
    Vector<Integer> positions = new Vector<Integer>(array);
    Vector<Integer> return_array = new Vector<Integer>();
    int current_pos = 0,minimum,final_pos;
    while(positions.size() > 0){
        minimum = 999999;
        final_pos = current_pos;
        for(int i=0 ; i < positions.size() ; i++){
            //do some math
        }
        return_array.add(final_pos);
        current_pos = final_pos;
        positions.removeElement(final_pos);
    }
    return return_array;
}
My function takes a vector as a parameter, makes a copy of it, does some math to find the desired element in the copy, and stores it in the other vector, which should end up ordered according to the selected algorithm. But for a vector with N elements, the nested loops take roughly N²/2 iterations to complete, which is way too much, since the code should do that at least 10 times.
My question is, how can I make this sorting more efficient?

Java already has built-in methods to sort a List very quickly; see Collections.sort.
Vector is old and incurs a performance penalty due to its synchronization overhead. Use a List implementation (for example, ArrayList) instead.
That said, based on the content of your question, it sounds like you're instead having difficulty implementing the Shortest Seek Time First algorithm.
See related question Shortest seek time first algorithm using Comparator.
I don't think you can implement the SSTF or SCAN algorithm if you don't also supply the current position of the head as an argument to your sorting method. Assuming the initial value of current_pos is always 0 will just give you a list sorted in ascending order, in which case your method would look like this:
public static List<Integer> sortSSTF(List<Integer> cylinders) {
    List<Integer> result = new ArrayList<Integer>(cylinders);
    Collections.sort(result);
    return result;
}
But that won't necessarily be a correct Shortest Seek Time First ordering if it's ever possible for current_pos > 0 when you first enter the method. Your algorithm will then probably look something like this:
1. Collections.sort(positions);
2. Find the indices in positions that contain the nextLowest and nextHighest positions relative to current_pos (or currentPos, if following Java naming conventions).
3. Whichever position is closer, remove that position from positions and add it to return_array. (If it was nextLowest, also decrement nextLowestIndex. If it was nextHighest, increment nextHighestIndex.)
4. Repeat step 3 until positions is empty.
5. Return return_array.
Of course, you'll also need to check for nextLowestIndex < 0 and nextHighestIndex >= positions.size() in step 3.
Note that you don't need the for loop inside of your while loop--but you would use that loop in step 2, before you enter the while loop.
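The steps above can be sketched as follows. This is one possible reading, not a definitive implementation: the DiskScheduling class name and the currentPos parameter are assumptions, ties are broken in favour of the lower cylinder, and instead of physically removing elements the two indices simply move outward, which avoids the cost of repeated removals from the middle of a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DiskScheduling {
    /** Orders the requests shortest-seek-time-first, starting from currentPos. */
    static List<Integer> sortSSTF(List<Integer> cylinders, int currentPos) {
        // Step 1: sort a copy so the caller's list is left untouched.
        List<Integer> positions = new ArrayList<>(cylinders);
        Collections.sort(positions);
        List<Integer> result = new ArrayList<>(positions.size());

        // Step 2: find the neighbours of the head position.
        int high = 0;
        while (high < positions.size() && positions.get(high) < currentPos) {
            high++;
        }
        int low = high - 1;

        // Steps 3 and 4: repeatedly serve the closer neighbour and move outward.
        while (low >= 0 || high < positions.size()) {
            boolean takeLow;
            if (low < 0) {
                takeLow = false;
            } else if (high >= positions.size()) {
                takeLow = true;
            } else {
                takeLow = currentPos - positions.get(low) <= positions.get(high) - currentPos;
            }
            currentPos = takeLow ? positions.get(low--) : positions.get(high++);
            result.add(currentPos);
        }
        return result;
    }
}
```

After the initial sort, each request is handled in constant time, so the whole ordering is dominated by the O(n log n) sort rather than by a nested scan.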

Storing and Retrieving points (related x and y) quickly in Java. List vs Array

I am attempting to recreate a board game in Java which involves me storing a set of valid places pieces can be placed (for the AI). I thought that perhaps instead of storing as a list of Points, it would be run-time faster if I had an array/list/dictionary of the X coordinates in which there was an array/list of the y coordinates, so once you found the x coordinate you would only have to check its Ys not all the remaining points'.
The trouble I have is that I must change the valid points often. I came up with some possible solutions but have difficulty picking/implementing them:
HashMap < Integer, ArrayList > with X as an integer key and the Ys as an ArrayList.
Problem: I would have to create a new ArrayList every time I add an X.
Also I am unsure about runtime performance of HashMap.
int[X][Y] array initialized to the board size with each point set to its relative location (point 2,3 sets[2][3]) unset point being an invalid integer.
Problem: I would have to iterate through all the points and check every point.
List of Points This would simply be a Linked/Array List of Points.
Problem: Lists are slower than arrays.
How would using a Linked list of Points compare to checking the whole array like above?
Perhaps I should use a 2d linked list? What would be the fastest runtime way to do this?

You're worrying about the wrong things. Accessing collection/map/array items is extremely fast. The graphical part will be way more performance-sensitive. Just use whatever data structure is most natural. It's unlikely that you're going to be storing enough items to really matter anyway. Build it first, then figure out where your performance problems really are.

If you use an ArrayList of Points, you get nearly the same performance as with an array (in Java).
I think this is the fastest solution because, as you already mentioned, you would have to iterate through the complete int array, and with a HashMap the underlying ArrayLists have to be updated whenever coordinates are changed or added.
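For comparison, when the board is small and its size is known, a plain boolean grid is another natural structure. The sketch below is an assumption-laden illustration (the ValidPlaces name and its methods are made up for this example): updating or checking one square is constant time, and listing all valid squares for the AI is a single pass over the board.

```java
import java.util.ArrayList;
import java.util.List;

final class ValidPlaces {
    // valid[x][y] is true when a piece may be placed at (x, y)
    private final boolean[][] valid;

    ValidPlaces(int width, int height) {
        valid = new boolean[width][height];
    }

    void set(int x, int y, boolean isValid) {
        valid[x][y] = isValid;
    }

    boolean isValid(int x, int y) {
        return valid[x][y];
    }

    /** Collects the currently valid places as {x, y} pairs, e.g. for the AI to enumerate. */
    List<int[]> all() {
        List<int[]> places = new ArrayList<>();
        for (int x = 0; x < valid.length; x++) {
            for (int y = 0; y < valid[x].length; y++) {
                if (valid[x][y]) {
                    places.add(new int[] {x, y});
                }
            }
        }
        return places;
    }
}
```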
