Clustering of a set of 3D points - java

I have a 2D array holding n points in 3D space: position[][] stores the XYZ coordinates (e.g. position[0][0] is the X, position[0][1] the Y, and position[0][2] the Z coordinate of point 0).
What I need is to cluster the points into n/k clusters of size k, so that each cluster consists of k points that are closest to each other in 3D space. For instance, if n=100 and k=5, I want 20 clusters of 5 points each, where the points in a cluster are nearest neighbors in space.
How can I achieve that? (I need pseudo-code; for snippets, preferably Java.)
What I have been doing so far is a simple sort on each component, but this does NOT necessarily give me the closest neighbors.
Sort based on X (position[0][0])
Then sort based on Y (position[0][1])
Then sort based on Z (position[0][2])
for (int i=0; i<position.length; i++){
for (int j=i+1; j<position.length; j++){
if(position[i][0] > position[j][0]){
swap (position[j], position[i]); // swap whole points so X, Y and Z stay together
}
}
}
// and do this for position[i][1] (i.e. Y) and then position[i][2] (i.e. Z)
I believe my question slightly differs from the Nearest neighbor search with kd-trees because neighbors in each iteration should not overlap with others. I guess we might need to use it as a component, but how, that's the question.

At the start you do not have an octree but a list of points instead, like:
float position[n][3];
To ease the clustering and octree creation you can use a 3D point density map. It is similar to creating a histogram:
compute the bounding box of your points, O(n)
So process all points and determine the min and max coordinates.
create the density map, O(max(m^3,n))
Divide the used space (bbox) into a 3D voxel grid (use whatever resolution you want/need) and create a density map like:
int map[m][m][m];
And clear it with zeros.
for (int x=0;x<m;x++)
for (int y=0;y<m;y++)
for (int z=0;z<m;z++)
map[x][y][z]=0;
Then process all points, determine each point's cell position from its x,y,z, and increment that cell.
for (int i=0;i<n;i++)
{
int x=(m-1)*(position[i][0]-xmin)/(xmax-xmin);
int y=(m-1)*(position[i][1]-ymin)/(ymax-ymin);
int z=(m-1)*(position[i][2]-zmin)/(zmax-zmin);
map[x][y][z]++;
// here you can add point i into octree belonging to leaf representing this cell
}
That will give you a low-res density map. The higher the number in cell map[x][y][z], the more points are in it, which means a cluster is there, and you can also move the point to that cluster in your octree.
This can be repeated recursively for cells that have enough points. To build your octree, create a 2x2x2 density map and recursively split each cell until its count is less than a threshold or the cell size is too small.
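For Java, the bounding box and density map steps could be sketched roughly as below. This is only a minimal sketch: the class name DensityMap and the method build are made up for illustration, and the guard for a zero-width bounding box is an addition that the steps above do not mention.

```java
final class DensityMap {
    /** Counts how many points fall into each cell of an m x m x m grid over the bounding box. */
    static int[][][] build(float[][] position, int m) {
        // Step 1: bounding box, O(n)
        float[] min = {Float.MAX_VALUE, Float.MAX_VALUE, Float.MAX_VALUE};
        float[] max = {-Float.MAX_VALUE, -Float.MAX_VALUE, -Float.MAX_VALUE};
        for (float[] p : position) {
            for (int a = 0; a < 3; a++) {
                min[a] = Math.min(min[a], p[a]);
                max[a] = Math.max(max[a], p[a]);
            }
        }
        // Step 2: density map; Java arrays start zero-filled, so no clearing loop is needed
        int[][][] map = new int[m][m][m];
        for (float[] p : position) {
            int[] cell = new int[3];
            for (int a = 0; a < 3; a++) {
                float extent = max[a] - min[a];
                // Assumption: if all points share a coordinate, put them in cell 0 on that axis
                cell[a] = extent > 0 ? (int) ((m - 1) * (p[a] - min[a]) / extent) : 0;
            }
            map[cell[0]][cell[1]][cell[2]]++;
        }
        return map;
    }
}
```

Cells with a high count are the candidates to split further when building the octree.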
For more info see similar QAs
Finding holes in 2d point sets? for the density map
Effective gif/image color quantization? for the clustering

What you want is not clustering. From what you said, I think you want to divide your N points into N/k groups of k points each, while keeping the points in each group as close together as possible in 3D space.
Consider an easy example: to do the same thing in one dimension, just sort the numbers, then put the first k points into cluster 1, the next k points into cluster 2, and so on.
Returning to the 3D problem, the answer is the same. First find the point with the minimum x, y and z coordinates and put it, together with its k-1 closest points, into Cluster 1. Then, among the remaining points, find the point with the minimum x, y and z coordinates and put it, with its k-1 closest points that are not yet clustered, into Cluster 2, and so on.
The above process will give you a result, but it may not be meaningful in practice; clustering algorithms such as k-means may help you instead.
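A rough Java sketch of this greedy procedure follows. It rests on assumptions: "minimum x, y and z" is interpreted as the lexicographically smallest point (x first, then y, then z), the names GreedyGrouping and group are invented for the example, and re-sorting the remaining points on every round makes it simple rather than fast.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GreedyGrouping {
    /** Returns groups of point indices; each group is a seed plus its nearest unassigned points. */
    static List<int[]> group(double[][] position, int k) {
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < position.length; i++) remaining.add(i);
        List<int[]> groups = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Seed: lexicographically smallest unassigned point (an assumed reading of "minimum")
            Integer seed = remaining.get(0);
            for (Integer i : remaining) {
                if (lexLess(position[i], position[seed])) seed = i;
            }
            final double[] s = position[seed];
            // Order the unassigned points by distance to the seed; the seed itself comes first
            remaining.sort(Comparator.comparingDouble((Integer i) -> dist2(position[i], s)));
            int size = Math.min(k, remaining.size());
            int[] g = new int[size];
            for (int j = 0; j < size; j++) g[j] = remaining.get(j);
            remaining.subList(0, size).clear(); // mark these points as clustered
            groups.add(g);
        }
        return groups;
    }

    private static boolean lexLess(double[] a, double[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) return a[i] < b[i];
        }
        return false;
    }

    private static double dist2(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return dx * dx + dy * dy + dz * dz;
    }
}
```

If n is not a multiple of k, the last group simply ends up smaller.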

Related

Algorithm to slice into square matrix a matrix

I'm searching for an algorithm that takes a matrix (in fact, a two-dimensional array) and returns an array of matrices, each of which:
is square (WIDTH = HEIGHT)
has all of its elements equal to the same value.
I don't know if that is clear, so imagine that you have an image made of pixels that are red, blue or green, and I want to get an array containing the fewest possible squares, like the picture shows.
EDIT:
OK, maybe it's not clear: I have a grid of elements that can take some values, like this:
0011121
0111122
2211122
0010221
0012221
That was my input, and I want output something like this:
|0|0|111|2|1|
|0|1|111|22|
|2|2|111|22|
|00|1|0|22|1|
|00|1|2|22|1|
where each |X| is an array that is a piece of the input array.
My goal is to minimize the number of output arrays.
This problem does not seem to have an efficient solution.
Consider a subset of instances of your problem defined as follows:
There are only 2 values of matrix elements, say 0 and 1.
Consider only matrix elements with value 0.
Identify each matrix element m_ij with a unit square in a rectangular 2D grid whose lower left corner has the coordinates (i, n-j).
The set of unit squares SU chosen this way must be 'connected' and must not have 'holes'; formally, for each pair of units squares (m_ij, m_kl) \in SU^2: (i, j) != (k, l) there is a sequence <m_ij = m_i(0)j(0), m_i(1)j(1), ..., m_i(q)j(q) = m_kl> of q+1 unit squares such that (|i(r)-i(r+1)| = 1 _and_ j(r)=j(r+1)) _or_ (i(r)=i(r+1) _and_ |j(r)-j(r+1)| = 1 ); r=0...q (unit squares adjacent in the sequence share one side), and the set SUALL of all unit squares with lower left corner coordinates from the integers minus SU is also 'connected'.
Slicing matrices that admit this construction into a minimal number of square submatrices is equivalent to tiling the smallest orthogonal polygon enclosing SU (which is the union of all elements of SU) into the minimum number of squares.
This SE.CS post gives the references (and one proof) that show that this problem is NP-complete for integer side lengths of the squares of the tiling set.
Note that according to the same post, a minimum tiling into rectangles can be computed in polynomial time.
Some hints that may be useful.
To represent the reduced matrix, a vector may be better, because what needs to be stored is (start_x, start_y, value); I am not sure another matrix would be very useful.
Step 1: loop on x for n occurrences of the same value (start with y=0)
Step 2: loop on y for up to n occurrences. In most cases the count m here will be less than n.
(The case of m greater than n is excluded, since that cannot form a square.) Just keep the minimum as the side length m.
Step 3: record (start_x, start_y, value) in the vector
Repeat Steps 1-3 from x=m until the end of x
Step 4: at the end of x, adjust y, starting from the leftmost x found (the m values in the vector; iterate over the vector again).
...
Keep going until the end of the matrix.
You need to be very careful about how the boundaries of the squares are made, so that the result fully covers the initial matrix.
In other words, the full initial matrix must be exactly recomposable from the result vector.
(You need to find the gaps and place them in the vector derived from Step 4.)
Note! This is not a full solution; it is a way to start and to figure out at each step what needs to be adjusted.
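One way to turn these hints into something runnable is a greedy scan: at every cell that is not covered yet, grow the largest same-valued square that fits and record it. The sketch below is a heuristic only and does not guarantee the minimum number of squares; GreedySquares, slice and canGrow are names made up here, and each result is stored as {row, col, size, value}.

```java
import java.util.ArrayList;
import java.util.List;

final class GreedySquares {
    /** Covers the grid with same-valued squares, each returned as {row, col, size, value}. */
    static List<int[]> slice(int[][] grid) {
        int rows = grid.length, cols = grid[0].length;
        boolean[][] used = new boolean[rows][cols];
        List<int[]> out = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (used[r][c]) continue;
                int size = 1;
                while (canGrow(grid, used, r, c, size)) size++;
                // Mark the covered cells so later squares cannot overlap this one
                for (int i = r; i < r + size; i++) {
                    for (int j = c; j < c + size; j++) used[i][j] = true;
                }
                out.add(new int[] {r, c, size, grid[r][c]});
            }
        }
        return out;
    }

    /** True if the square at (r, c) with side `size` can be extended to side size + 1. */
    private static boolean canGrow(int[][] grid, boolean[][] used, int r, int c, int size) {
        if (r + size >= grid.length || c + size >= grid[0].length) return false;
        int v = grid[r][c];
        // Check only the new bottom row and the new right column
        for (int k = 0; k <= size; k++) {
            if (grid[r + size][c + k] != v || used[r + size][c + k]) return false;
            if (grid[r + k][c + size] != v || used[r + k][c + size]) return false;
        }
        return true;
    }
}
```

Because every cell is either skipped as covered or starts a new square, the result always covers the whole matrix exactly once, which is the boundary condition the hints warn about.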

How to generate forests in java

I am creating a game where a landscape is generated. All of the generation works perfectly. A week ago I created a basic 'forest' generation system, which is just a for loop that takes a chunk and places a random number of trees in random locations. But that does not give the result I would like to achieve.
Code:
for(int t = 0; t <= randomForTrees.nextInt(maxTreesPerChunk); t++){
// generates random locations for the X, Z positions\\
// the Y position is the height on the terrain gain with the X, Z coordinates \\
float TreeX = random.nextInt((int) (Settings.TERRAIN_VERTEX_COUNT + Settings.TERRAIN_SIZE)) + terrain.getX();
float TreeZ = random.nextInt((int) (Settings.TERRAIN_VERTEX_COUNT + Settings.TERRAIN_SIZE)) + terrain.getZ();
float TreeY = terrain.getTerrainHeightAtSpot(TreeX, TreeZ);
// creates a tree entity with the previous generated positions \\
Entity tree = new Entity(TreeStaticModel, new Vector3f(TreeX, TreeY, TreeZ), 0, random.nextInt(360), 0, 1);
// checks if the tree is on land \\
if(!(tree.getPosition().y <= -17)){
trees.add(tree);
}
}
Result:
First of all take a look at my:
simple C++ Island generator
As you can see, you can compute biomes from elevation, slope, etc. More sophisticated generators create a Voronoi map that divides your map into biome regions, assigning biome types randomly (with some rules) based on the neighbors already assigned.
Back to your question: you should place your trees more densely around some positions instead of uniformly covering a large area with sparse trees. So you need a slightly different kind of random distribution (like Gaussian). See the legendary:
Understanding “randomness”
on how to get a different distribution from a uniform one.
So what you should do is pick a few random locations that cover your region uniformly, and then generate trees with a density that depends on the minimal distance to these points: the smaller the distance, the denser the tree placement.
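As a sketch of that idea in Java: pick a few uniformly random forest centers, then propose uniformly random tree positions and accept each one with a probability that falls off with the distance to the nearest center. The Gaussian falloff, the sigma parameter, and the names ClusteredTrees and place are assumptions for illustration; they are not taken from the game code in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class ClusteredTrees {
    /** Returns {x, z} tree positions inside a size x size chunk, denser near random centers. */
    static List<float[]> place(Random rng, float size, int centers, int candidates, float sigma) {
        // A few forest centers spread uniformly over the chunk
        float[][] c = new float[centers][2];
        for (float[] p : c) {
            p[0] = rng.nextFloat() * size;
            p[1] = rng.nextFloat() * size;
        }
        List<float[]> trees = new ArrayList<>();
        for (int i = 0; i < candidates; i++) {
            float x = rng.nextFloat() * size;
            float z = rng.nextFloat() * size;
            // Squared distance to the nearest center
            double best = Double.MAX_VALUE;
            for (float[] p : c) {
                double dx = x - p[0], dz = z - p[1];
                best = Math.min(best, dx * dx + dz * dz);
            }
            // Assumed Gaussian falloff: probability 1 at a center, near 0 far away
            double accept = Math.exp(-best / (2.0 * sigma * sigma));
            if (rng.nextDouble() < accept) trees.add(new float[] {x, z});
        }
        return trees;
    }
}
```

The Y coordinate would then come from the terrain height lookup as before, and a smaller sigma gives tighter, more separated forests.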
What you are looking for is a low-discrepancy sequence to generate the positions. The generated numbers are not truly random, but they cover the space evenly. This distinguishes them from ordinary random number generators, whose output tends to form clumps and gaps even when the underlying distribution is uniform.
One example of such a sequence is the Halton sequence, and Apache Commons Math has an implementation which you can use.
double[] nextVector = generator.nextVector();
In your case, using two dimensions, the resulting array also has two entries. What you still need to do is translate the points into your local coordinates by adding the central point of the square where you want to place the forest to each generated vector. Also, to increase the gap between points, you should consider scaling the vectors.
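If you would rather not pull in a library, the 2D Halton sequence is short enough to write by hand. The sketch below uses bases 2 and 3 and is only an illustration: the names Halton, radicalInverse and forestPoints are made up. Since Halton values lie in [0, 1), it subtracts 0.5 before scaling so that the points end up centered on the given point.

```java
final class Halton {
    /** Radical inverse of index in the given base: the digits of index mirrored around the point. */
    static double radicalInverse(int index, int base) {
        double result = 0.0;
        double f = 1.0 / base;
        for (int i = index; i > 0; i /= base) {
            result += f * (i % base);
            f /= base;
        }
        return result;
    }

    /** Returns count points {x, z} in a square of side `size` centered on (centerX, centerZ). */
    static double[][] forestPoints(int count, double centerX, double centerZ, double size) {
        double[][] pts = new double[count][2];
        for (int i = 0; i < count; i++) {
            double u = radicalInverse(i + 1, 2); // first dimension: base 2
            double v = radicalInverse(i + 1, 3); // second dimension: base 3
            pts[i][0] = centerX + (u - 0.5) * size;
            pts[i][1] = centerZ + (v - 0.5) * size;
        }
        return pts;
    }
}
```

The size argument plays the role of the scaling mentioned above: a larger square with the same number of points gives larger gaps between trees.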

Minimum distance between two polygons [duplicate]

I want to find the minimum distance between two polygons with millions of vertices (not the minimum distance between their vertices). I have to find the minimum of the shortest distances between each vertex of the first shape and all of the vertices of the other one. Something like the Hausdorff distance, but I need the minimum instead of the maximum.
Perhaps you should check out (PDF warning! Also note that, for some reason, the order of the pages is reversed) "Optimal Algorithms for Computing the Minimum Distance Between Two Finite Planar Sets" by Toussaint and Bhattacharya:
It is shown in this paper that the minimum distance between two finite planar sets if [sic] n points can be computed in O(n log n) worst-case running time and that this is optimal to within a constant factor. Furthermore, when the sets form a convex polygon this complexity can be reduced to O(n).
If the two polygons are crossing convex ones, perhaps you should also check out (PDF warning! Again, the order of the pages is reversed) "An Optimal Algorithm for Computing the Minimum Vertex Distance Between Two Crossing Convex Polygons" by Toussaint:
Let P = {p1, p2,..., pm} and Q = {q1, q2,..., qn} be two intersecting polygons whose vertices are specified by their cartesian coordinates in order. An optimal O(m + n) algorithm is presented for computing the minimum euclidean distance between a vertex pi in P and a vertex qj in Q.
There is a simple algorithm that uses Minkowski addition to calculate the minimum distance between two convex polygonal shapes; it runs in O(n + m).
Links:
algoWiki, boost.org, neerc.ifmo.ru (in Russian).
If the Minkowski difference of two convex polygons contains (0, 0), then they intersect.

Java Nearest Intersection

I am making my own implementation of a raycaster in a game I am making, and I have come across a very hard problem. I have a player (the black dot), and I need to find the intersection nearest to the player. In the image below, the arrow is pointing to the intersection point I need.
What I guess I am trying to say is that I need a function something like this:
// Each line would take 2 map values for its 2 points
// In turn, the map would have to have an even number of points
public Point getNearestIntersection(int playerX, int playerY, int lineDir, Point[] map) {
// whatever goes here
}
I am going to have to do this about 50 times every frame, with about 100 lines. I would like to get 40 fps at the least if possible... Even if I divide it up into threads I still feel that it would cause a lot of lag.
The class Point has a method called distance, which calculates the distance between two points. You could then loop over all intersection points to find the nearest. It could look something like this:
Point currentNearestIntersection = null;
double smallestDistance = Double.MAX_VALUE; // start above any real distance
for (Point inter : intersections) {
double distance = player.distance(inter);
if (distance < smallestDistance) {
currentNearestIntersection = inter;
smallestDistance = distance;
}
}
Ray (axis)/line intersection really means solving:
p(t)=p0+dp*t
q(u)=q0+(q1-q0)*u
p(t)=q(u)
t=? u=?
where:
p0 is your ray start point (vector)
dp is ray direction (vector)
q0,q1 are line endpoints (vectors)
p(t),q(u) are points on axis,line
t,u are line parameters (scalars)
This is a simple system of 2 linear equations (in vector form), so it can be solved from any pair of the N coordinate equations, where N is the dimensionality of the problem; choose a pair that does not lead to division by zero. The result is valid if:
t>=0 and u in <0.0,1.0>
If you use a unit dp vector for the direction of your ray, then the t parameter computed for the intersection is directly the distance from the ray start point, so you can use it as is.
If you need to speed up the intersection computation, see
brute force line/line intersection with area subdivision
And instead of remembering all intersections, always store just the one with the smallest non-negative t.
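For the 2D case in the question, the whole thing fits in a few lines of Java. This is a sketch under assumptions: segments are passed as {x0, y0, x1, y1} arrays rather than the Point[] map from the question, the direction is given as a vector (dx, dy), and the names RayCaster and nearestHit are invented. It solves the two equations with 2D cross products and keeps only the smallest valid t.

```java
final class RayCaster {
    /**
     * Returns the smallest t >= 0 at which the ray p0 + dp*t hits any segment,
     * or Double.POSITIVE_INFINITY if nothing is hit. Each segment is {x0, y0, x1, y1}.
     */
    static double nearestHit(double px, double py, double dx, double dy, double[][] segments) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] s : segments) {
            double ex = s[2] - s[0], ey = s[3] - s[1]; // q1 - q0
            double wx = s[0] - px, wy = s[1] - py;     // q0 - p0
            double denom = dx * ey - dy * ex;          // cross(dp, q1 - q0)
            if (Math.abs(denom) < 1e-12) continue;     // parallel: no single intersection
            double t = (wx * ey - wy * ex) / denom;    // parameter along the ray
            double u = (wx * dy - wy * dx) / denom;    // parameter along the segment
            if (t >= 0 && u >= 0 && u <= 1 && t < best) best = t;
        }
        return best;
    }
}
```

The hit point is (px + dx*t, py + dy*t), and with a unit-length (dx, dy) the returned t is already the distance to the wall.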
[Notes]
If some of your lines form a grid, you can compute that even faster by exploiting the DDA algorithm, and use real line/line intersection only for the irregular rest. A nice example of this is a Wolfenstein-style pseudo-3D raycaster problem like this

What is the fastest way to calculate nearby points from large data set of points

I have a large set of 3d points, (20,000+), scattered throughout a 3d space. I need to identify which points are within a specific arbitrary range of each point in the set. For example, for each point, what is the group of points that is within a range of 10 units. The permutations for this are pretty big. So, what would be the most computationally efficient way to approach this ? (I need to solve this using java only.)
You can use a k-d tree, which is basically a k-dimensional binary tree. Range search in a k-d tree is very efficient.
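A compact 3D k-d tree with a radius query might look like the sketch below. It is an illustration, not a tuned implementation: the tree is stored implicitly in a reordered copy of the point array, the build re-sorts each slice (simple, but slower than a median-selection build), and the names KdTree and withinRadius are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class KdTree {
    private final double[][] pts; // the median of each slice acts as that subtree's root

    KdTree(double[][] points) {
        pts = points.clone(); // shallow copy: rows are shared, only their order changes
        build(0, pts.length, 0);
    }

    // Sorts the slice [lo, hi) on the current axis so the median lands at mid, then recurses
    private void build(int lo, int hi, int depth) {
        if (hi - lo <= 1) return;
        final int axis = depth % 3;
        Arrays.sort(pts, lo, hi, Comparator.comparingDouble((double[] p) -> p[axis]));
        int mid = (lo + hi) >>> 1;
        build(lo, mid, depth + 1);
        build(mid + 1, hi, depth + 1);
    }

    /** Returns all points whose Euclidean distance to q is at most r. */
    List<double[]> withinRadius(double[] q, double r) {
        List<double[]> out = new ArrayList<>();
        search(0, pts.length, 0, q, r, out);
        return out;
    }

    private void search(int lo, int hi, int depth, double[] q, double r, List<double[]> out) {
        if (lo >= hi) return;
        int axis = depth % 3;
        int mid = (lo + hi) >>> 1;
        double[] p = pts[mid];
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        if (dx * dx + dy * dy + dz * dz <= r * r) out.add(p);
        double diff = q[axis] - p[axis];
        // Descend into a side only if the query ball can reach across the splitting plane
        if (diff - r <= 0) search(lo, mid, depth + 1, q, r, out);
        if (diff + r >= 0) search(mid + 1, hi, depth + 1, q, r, out);
    }
}
```

To answer the question as asked, build the tree once and call withinRadius(point, 10) for every point; each result includes the query point itself.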
Since it's a theoretical question without code, I will throw in my 2 cents here. If you don't use a geometric DB like PostGIS (http://postgis.net/), I suggest the following, on the premise that the points have three coordinates (X, Y, Z).
Make three arrays, each containing the id of the point and one of its coordinates, and sort them by coordinate. Then for each point, check whether the previous and the next entries are within range. If neither is, eliminate that point. Do that for each array. You will then have a much smaller space to compute on. Then, for each point within reach of a given point, calculate the distance and flag or eliminate the farthest points.
Hope this helps.
You can use a space-filling curve and approximate. Treat each point's coordinates as binary numbers and interleave their bits. Then sort the resulting numbers and exploit the fact that the curve visits nearby points first. You can try several curves; which works best most likely depends on the points.
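As one concrete example of bit interleaving, here is a Z-order (Morton) code in Java. The choice of the Z-order curve is an assumption, since the answer does not name a specific curve; the names Morton3D, quantize and encode are invented, and coordinates are assumed to be quantized to 21 bits each so that the three of them fit into one long.

```java
final class Morton3D {
    private static final int BITS = 21; // 3 * 21 = 63 bits, fits in a long

    /** Maps v from [min, max] onto an integer in [0, 2^21 - 1]; assumes max > min. */
    static int quantize(double v, double min, double max) {
        double unit = (v - min) / (max - min);
        return (int) (unit * ((1 << BITS) - 1));
    }

    /** Interleaves the low 21 bits of x, y and z: bit i of x lands at position 3*i, and so on. */
    static long encode(int x, int y, int z) {
        long code = 0L;
        for (int bit = 0; bit < BITS; bit++) {
            code |= (((long) (x >> bit)) & 1L) << (3 * bit);
            code |= (((long) (y >> bit)) & 1L) << (3 * bit + 1);
            code |= (((long) (z >> bit)) & 1L) << (3 * bit + 2);
        }
        return code;
    }
}
```

Sorting the points by this code tends to put spatial neighbors near each other in the array, but not always, so candidates found this way still need an exact distance check.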
Sounds like you need an R-tree. Or maybe a range tree like a k-d tree: it will return all points in a box, and then you just keep those within the desired distance of your query point.
Using an ArrayList that is preallocated to full size (this checks a cube-shaped zone, not a sphere):
import java.util.ArrayList;
import java.util.List;
public class Point3D {
public int x, y, z;
// Returns every point whose x, y and z each lie within `inter` of the given coordinates
public static List<Point3D> allWithinRange(List<Point3D> possiblePoints, int x, int y, int z, int inter) {
List<Point3D> list = new ArrayList<Point3D>(possiblePoints.size());
possiblePoints.stream()
.filter(it -> it.x <= x + inter && it.x >= x - inter)
.filter(it -> it.y <= y + inter && it.y >= y - inter)
.filter(it -> it.z <= z + inter && it.z >= z - inter)
.forEach(list::add);
return list;
}
}
