Optimize algorithm from O(n^3) to O(n^2) - java

The problem I am trying to solve is as follows:
Assume you are given a set of points in a two-dimensional space; how
can we get the maximum number of collinear points?
I did the problem in Java.
First I created a method that checks for linearity:
return (y1 - y2) * (x1 - x3) == (y1 - y3) * (x1 - x2);
Then I used three for loops, which makes my algorithm O(n^3). But I am trying to see if this can be reduced to O(n^2).
After searching on the net I found that my implementation is very similar to what's here. So the question is how we can improve the complexity. Any example would be great.
This is what I ended up doing:
int p = 2;
for (int i = 0; i < points.length; i++) {
    for (int j = i + 1; j < points.length; j++) {
        int count = 2;
        for (int k = 0; k < points.length; k++) {
            if (k == i || k == j)
                continue;
            // use the linearity check on points[i], points[j], points[k];
            // if they are collinear, count++
        }
        p = Math.max(p, count);
    }
}

I came to something very similar to @hqt's solution and want to elaborate on the details they've left out.
Two elements of this class are equal if their dx to dy ratio (i.e., the slope) is the same.
static class Direction {
    Direction(Point p, Point q) {
        // handle anti-parallel via normalization
        // by making (dx, dy) lexicographically non-negative
        if (p.x > q.x) {
            dx = p.x - q.x;
            dy = p.y - q.y;
        } else if (p.x < q.x) {
            dx = q.x - p.x;
            dy = q.y - p.y;
        } else {
            dx = 0;
            dy = Math.abs(p.y - q.y);
        }
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Direction)) return false;
        final Direction other = (Direction) obj;
        return dx * other.dy == other.dx * dy; // avoid division
    }

    @Override
    public int hashCode() {
        // pretty hacky, but round-off error is no problem here
        return dy == 0 ? 42 : Float.floatToIntBits((float) dx / dy);
    }

    private final int dx, dy;
}
Now fill a Guava Multimap<Direction, PointPair> by looping over all pairs (complexity O(n*n)). Iterate over all keys (i.e. directions) and process each List<PointPair> via a union-find algorithm. The resulting partitions are sets of pairs of collinear points. If there are k collinear points, then you'll find a set containing all pairs of them.
Because of the union-find algorithm, the complexity is O(n*n*log(n)), so avoiding the sort didn't help.
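A rough sketch of that pipeline, assuming the Point and Direction classes above, with an int[]{i, j} index pair standing in for PointPair and a small hash-map-based union-find (all helper names are mine, not from the original answer):
import com.google.common.collect.HashMultimap;
import com.google.common.collect.Multimap;
import java.util.HashMap;
import java.util.Map;

static int maxCollinear(Point[] pts) {
    // group every index pair by the direction of the line through it
    Multimap<Direction, int[]> byDirection = HashMultimap.create();
    for (int i = 0; i < pts.length; i++)
        for (int j = i + 1; j < pts.length; j++)
            byDirection.put(new Direction(pts[i], pts[j]), new int[]{i, j});

    int best = Math.min(pts.length, 2);
    for (Direction d : byDirection.keySet()) {
        // fresh union-find over the point indices occurring for this direction;
        // pairs on parallel but distinct lines never share a point, so they stay apart
        Map<Integer, Integer> parent = new HashMap<>();
        Map<Integer, Integer> size = new HashMap<>();
        for (int[] pair : byDirection.get(d)) {
            int a = find(parent, pair[0]), b = find(parent, pair[1]);
            if (a != b) {
                int s = size.getOrDefault(a, 1) + size.getOrDefault(b, 1);
                parent.put(b, a);
                size.put(a, s);
                best = Math.max(best, s); // s points are now known to be collinear
            }
        }
    }
    return best;
}

static int find(Map<Integer, Integer> parent, int x) {
    parent.putIfAbsent(x, x);
    int root = x;
    while (parent.get(root) != root) root = parent.get(root);
    parent.put(x, root); // point x straight at its root (cheap path compression)
    return root;
}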

You can use the angular coefficient (slope) that the line through two points makes with the Ox axis to solve this problem. For example, take 3 points A, B, C: they're collinear if and only if lines AB and AC make the same angular coefficient with the Ox line. Here is my pseudocode:
// Type : an object to store information to use later
List<Type> res = new ArrayList<Type>();
for (int i = 0; i < points.length; i++) {
    for (int j = i + 1; j < points.length; j++) {
        double coefficient = CoeffiecientBetweenTwoLine(
                line(points[i], points[j]), line((0,0), (0,1)));
        res.add(new Type(points[i], points[j], coefficient));
    }
}
After that, use QuickSort to sort the list above by coefficient. Wherever coefficients are equal, we know which points are collinear. The complexity of this algorithm is O(N^2 log N), dominated by sorting a list with O(N^2) elements; only O(N^2) is required to build the list.
#Edit:
So how can we know how many points are collinear when we see equal coefficients?
There are many ways to solve this problem.
At the sort step, you can sort by the first point of the pair when two coefficients are equal. For example, if points 1, 3 and 4 are collinear, the result after sorting should be:
(1 3)
(1 4)
(3 4)
From the above you just need to count the streak of pairs starting with 1; in this example it is 2, so the result is 3 (always k + 1).
Use a formula: the number of equal pairs is always n*(n-1)/2. So you will have n*(n-1)/2 = 3, and you can solve for n = 3 (n >= 0). That means solving a quadratic equation here, which is not too difficult because you always know it has a solution and you just take the positive root.
Edit 2
The step above for counting the collinear points is not quite right, because of cases like this: if A B and C D lie on two parallel (but different) lines, they still have the same coefficient with Ox. I think you can use a Union-Find data structure to fix this. The steps would be:
Sort by angular coefficient again.
For example, (1 2 3 4) are collinear and parallel with (5 6 7), and point 8 stands somewhere else. After sorting, the result should be:
(1 2) (1 3) (1 4) (2 3) (2 4) (5 6) (5 7) (6 7) // equal angular coefficients, but on two different lines
(1 5) (1 6) ... // some pairs connecting the two sets of parallel lines, (1 8),
(5 8) (3 8) ... // in no particular order
Use the Union-Find data structure to join trees: iterate from the second element, and if its angular coefficient equals the previous one, join the two points of the current pair and the two points of the previous pair. For example:
(1,3) == (1,2): join 1 and 2, join 1 and 3.
(1,4) == (1,3): join 1 and 3, join 1 and 4. ...
(5,6): join 2 and 4, join 5 and 6.
(5,7): join 5 and 7, join 5 and 6. ...
(1,8): don't join anything. (5,8): don't join anything. ...
After you finish this step, what you have is a forest in which each tree is a set of collinear points.
In the step above some pairs are joined multiple times; you can simply fix this by marking pairs that are already joined and skipping them, to improve performance.
Note: I don't think this solution is very good; it is just my own reasoning, not a known algorithm. If anyone has a clearer idea, please tell me.

Try the code below:
//just to create 15 random points
Random random = new Random();
ArrayList<Point> points = new ArrayList<Point>();
for (int i = 0; i < 15; i++) {
    Point p = new Point(random.nextInt(3), random.nextInt(3));
    System.out.println("added x = " + p.x + " y = " + p.y);
    points.add(p);
}

//code to count the maximum number of collinear points
//(this only compares x with x and y with y, i.e. horizontal and vertical lines)
int p = 0;
for (int i = 0; i < points.size() - 1; i++) {
    int colinear_with_x = 1;
    int colinear_with_y = 1;
    for (int j = i + 1; j < points.size(); j++) {
        if (points.get(i).x == points.get(j).x) {
            colinear_with_x++;
        }
        if (points.get(i).y == points.get(j).y) {
            colinear_with_y++;
        }
    }
    p = Math.max(p, Math.max(colinear_with_x, colinear_with_y));
}

An approach that relies heavily on a good hash map:
As key, use the linear equation defining the line, so that you have a map along the lines of
map<key=(vector, point), value=quantity> pointsOnLine
where vector and point define the linear function that two points determine.
Then you iterate over all n points:
maxPoints = 1
for i = 1 to n
    for j = i+1 to n
        newKey = lineParametersFromPoints(point[i], point[j])
        if pointsOnLine.contains(newKey)
            pointsOnLine[newKey] += 1
            if maxPoints < pointsOnLine[newKey]
                maxPoints = pointsOnLine[newKey]
        else
            pointsOnLine.add(newKey)
            pointsOnLine[newKey] = 1
maxPoints then contains the maximum number of point pairs lying on a single line; since k collinear points produce k*(k-1)/2 such pairs, the maximum number of collinear points can be recovered from it.
Please note (this is probably most important) that the hash/compare functions of the map must recognize that two keys represent the same linear function, even if the vectors are anti-parallel or the stored points of the two lines are not the same (but each satisfies the other's equation).
This approach does of course heavily rely on the map having fast access and insertion times.
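A Java sketch of this idea, under my own assumptions: integer coordinates, points as int[]{x, y}, and the line key encoded as the normalized triple (A, B, C) of A*x + B*y = C, so that anti-parallel direction vectors and different point pairs on the same line collide on purpose. As in the pseudocode above, it counts pairs per line and converts the maximum pair count back into a point count at the end (helper names are mine):
import java.util.HashMap;
import java.util.Map;

static int maxCollinear(int[][] pts) {
    if (pts.length < 2) return pts.length;
    Map<String, Integer> pairsOnLine = new HashMap<>();
    int maxPairs = 0;
    for (int i = 0; i < pts.length; i++) {
        for (int j = i + 1; j < pts.length; j++) {
            int a = pts[j][1] - pts[i][1];          // A = y2 - y1
            int b = pts[i][0] - pts[j][0];          // B = x1 - x2
            int c = a * pts[i][0] + b * pts[i][1];  // C = A*x1 + B*y1
            int g = gcd(gcd(Math.abs(a), Math.abs(b)), Math.abs(c));
            if (g != 0) { a /= g; b /= g; c /= g; }
            if (a < 0 || (a == 0 && b < 0)) { a = -a; b = -b; c = -c; } // normalize sign
            String key = a + "," + b + "," + c;
            maxPairs = Math.max(maxPairs, pairsOnLine.merge(key, 1, Integer::sum));
        }
    }
    // k collinear points produce k*(k-1)/2 pairs; invert that to recover k
    return (int) Math.round((1 + Math.sqrt(1 + 8.0 * maxPairs)) / 2);
}

static int gcd(int x, int y) { return y == 0 ? x : gcd(y, x % y); }
The normalization is what plays the role of the hash/compare requirement mentioned above: every representation of the same line reduces to the same key.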

Algorithm:
A simple algorithm can do it in O(N^2 * log N):
1. Pick a point p.
2. Find the slope from p to every other point.
3. Sort the points according to those slopes.
4. Scan the sorted array for the maximum number of consecutive points with the same slope value.
5. That maximum value + 1 is the maximum number of points collinear with p (p included).
6. Do steps 1 to 5 for all points and take the maximum collinear count found.
Time complexity: O(N) for the slope calculation and O(N log N) for the sorting per point, hence O(N^2 * log N) over all N points.
Space complexity: O(N) for slopes and points.
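A sketch of those steps, under my own assumptions (points represented as int[]{x, y}, slopes as doubles; exact rational arithmetic would be safer for very large integer coordinates or duplicate points):
import java.util.Arrays;

static int maxCollinear(int[][] pts) {
    int n = pts.length;
    if (n < 3) return n;
    int best = 2;
    for (int i = 0; i < n; i++) {                       // step 1: pick a point p
        double[] slopes = new double[n - 1];
        int m = 0;
        for (int j = 0; j < n; j++) {                   // step 2: slopes from p to the others
            if (j == i) continue;
            double dx = pts[j][0] - pts[i][0];
            double dy = pts[j][1] - pts[i][1];
            slopes[m++] = dx == 0 ? Double.POSITIVE_INFINITY : dy / dx;
        }
        Arrays.sort(slopes);                            // step 3: sort by slope
        int run = 1;
        for (int s = 1; s < m; s++) {                   // step 4: longest run of equal slopes
            run = slopes[s] == slopes[s - 1] ? run + 1 : 1;
            best = Math.max(best, run + 1);             // step 5: +1 for p itself
        }
    }
    return best;                                        // step 6: maximum over all p
}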

Related

Dynamic nested loops with dynamic bounds

I have a LinkedList<Point> points, with random values:
10,20
15,30
13,43
.
.
I want to perform this kind of loop:
for (int i = points.get(0).x; i < points.get(0).y; i++) {
    for (int j = points.get(1).x; j < points.get(1).y; j++) {
        for (int k = points.get(2).x; k < points.get(2).y; k++) {
            ...
            ...
        }
    }
}
How can I do that if I don't know the size of the list?
There's probably a better way to solve equations like that with less CPU and memory consumption, but a brute-force approach like yours could be implemented via recursion or some helper structure to keep track of the state.
With recursion you could do it like this:
void permutate( List<Point> points, int pointIndex, int[] values ) {
    Point p = points.get(pointIndex);
    for( int x = p.x; x < p.y; x++ ) {
        values[pointIndex] = x;
        //this assumes pointIndex to be between 0 and points.size() - 1
        if( pointIndex < points.size() - 1 ) {
            permutate( points, pointIndex + 1, values );
        }
        else { //pointIndex is assumed to be equal to points.size() - 1 here
            //you have collected all intermediate values so solve the equation
            //this is simplified since you'd probably want to collect all values where the result is correct
            //as well as pass the equation somehow
            int result = solveEquation( values );
        }
    }
}

//initial call
List<Point> points = ...;
int[] values = new int[points.size()];
permutate( points, 0, values );
This would first iterate over the points list using recursive calls and advancing the point index by one until you reach the end of the list. Each recursive call would iterate over the point values and add the current one to an array at the respective position. This array is then used to calculate the equation result.
Note that this might result in a stack overflow for huge equations (the meaning of "huge" depends on the environment but is normally at several thousand points). Performance might be really low if you check all permutations in any non-trivial case.

Java random with low percentage on boolean array (quantile function)

I have a boolean array of approximately 10,000 elements. I would like to flip the value of elements with a rather low, set probability (circa 0.1-0.01), while knowing the indexes of the changed elements. The code that comes to mind is something like:
int count = 10000;
Random r = new Random();
for (int i = 0; i < count; i++) {
    double x = r.nextDouble();
    if (x < rate) {
        field[i] = !field[i];
        // do something with the index...
    }
}
However, as I inevitably do this inside a larger loop, this is slow. The only other possibility I can come up with is using a quantile function (Gaussian math), but I have yet to find any free-to-use code or library for it. Do you have any good idea how to work around this problem, or any library (standard would be best) that could be used?
Basically, you have set up a binomial model, with n == count and p == rate. The relevant number of values you should get, x, can be modeled as a normal distribution with center n*p == count*rate and standard deviation sigma == Math.sqrt(n*p*(1-p)) == Math.sqrt(count * rate * (1-rate)).
You can easily calculate
int x = (int) Math.round(Math.sqrt(count * rate * (1 - rate))
        * r.nextGaussian() + count * rate);
Then you can generate x random numbers in the range using the following code.
Set<Integer> indices = new HashSet<Integer>();
while (indices.size() < x) {
    indices.add(r.nextInt(count));
}
indices will now contain the correct indices, which you can use as you wish.
You'll only have to call nextInt a little more than x times, which should be much less than the count times you had to call it before.
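Putting the two snippets together as one helper (the method name and the clamp are mine; the clamp simply guards against the normal approximation occasionally producing a value outside [0, count]):
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

static Set<Integer> flipSome(boolean[] field, double rate, Random r) {
    int count = field.length;
    // draw the number of flips from the normal approximation of Binomial(count, rate)
    int x = (int) Math.round(Math.sqrt(count * rate * (1 - rate)) * r.nextGaussian()
            + count * rate);
    x = Math.max(0, Math.min(count, x));
    // pick x distinct indices and toggle them
    Set<Integer> indices = new HashSet<Integer>();
    while (indices.size() < x) {
        indices.add(r.nextInt(count));
    }
    for (int i : indices) {
        field[i] = !field[i];
    }
    return indices; // so the caller still knows which indices changed
}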

Need help reducing triple for loop to increase efficiency

for (int i = 0; i < 3; ++i) {
    for (int k = 0; k < 7; ++k) {
        for (int h = i; h < 4 + i; ++h) {
            result = state.getAt(k, h);
            if (result == 1) {
                ++firstpl;
            }
            if (result == 2) {
                ++secondpl;
            }
            if (firstpl > 0 && secondpl > 0) {
                break;
            }
            //y = k;
        }
        if (firstpl == 0 && secondpl == 0) {
            break;
        } else if (firstpl > secondpl) {
            score += firstpl * firstpl;
            //if(state.getHeightAt(y)-3 < 3) score += 3+firstpl*2;
        } else {
            score -= secondpl * secondpl;
            //if(state.getHeightAt(y)-3 < 3) score -= 3+secondpl*2;
        }
        firstpl = 0;
        secondpl = 0;
    }
}
Basically I have a 7 by 6 grid. I am going through the 7 columns and looking at every 4 consecutive blocks vertically. Since there are 6 blocks upward, there are 3 windows of four consecutive blocks for each column. state.getAt(k, h) takes an x and a y and returns a value.
I don't think you can improve on this, unless you can figure out an alternative representation for this "state" that allows the computation to be performed incrementally.
And since you haven't properly explained what the state or the calculation actually mean, it is difficult for anyone but you to figure out whether an alternative approach is even feasible. (And I for one am not going to attempt to reverse engineer the meaning from your code.)
OK. For Connect4, the win / lose is a line of 4 checkers horizontally, vertically or diagonally in the 7x6 grid. So what you could do is represent the score-state as an array of counters, corresponding to each of the columns, rows and diagonals in which a winning line could be made. (7 + 5 + 4 + 4 = 20 of them => 20 counters) Then construct a static mapping from an (x,y) position to the indexes of lines that pass through that. When you add a checker at point (x,y) you look up the counters and increment them. When you remove a checker ... decrement.
I'm not sure how that relates to your existing scoring function ... but then I don't see how that function relates to a strategy that would win the game. Either way, you could potentially use the approach above to calculate scores incrementally.
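Here is a rough, hypothetical sketch of that bookkeeping (all names are mine, and the maximal lines are enumerated programmatically rather than counted by hand):
import java.util.ArrayList;
import java.util.List;

class LineCounters {
    static final int W = 7, H = 6;                         // 7 columns by 6 rows
    static final int[][] DIRS = {{1, 0}, {0, 1}, {1, 1}, {1, -1}};
    final List<List<Integer>> lineIds = new ArrayList<>(); // line ids passing through each cell
    final int[][] counters;                                // counters[player][lineId]

    LineCounters() {
        for (int i = 0; i < W * H; i++) lineIds.add(new ArrayList<>());
        int lineId = 0;
        for (int[] d : DIRS)
            for (int x = 0; x < W; x++)
                for (int y = 0; y < H; y++)
                    // a line "starts" where stepping backwards leaves the grid
                    if (!inGrid(x - d[0], y - d[1]) && length(x, y, d) >= 4) {
                        for (int cx = x, cy = y; inGrid(cx, cy); cx += d[0], cy += d[1])
                            lineIds.get(cy * W + cx).add(lineId);
                        lineId++;
                    }
        counters = new int[3][lineId];                     // players are 1 and 2
    }

    void add(int player, int x, int y)    { for (int id : lineIds.get(y * W + x)) counters[player][id]++; }
    void remove(int player, int x, int y) { for (int id : lineIds.get(y * W + x)) counters[player][id]--; }

    static boolean inGrid(int x, int y) { return x >= 0 && x < W && y >= 0 && y < H; }

    static int length(int x, int y, int[] d) {
        int len = 0;
        for (; inGrid(x, y); x += d[0], y += d[1]) len++;
        return len;
    }
}
A scoring function could then read these counters directly instead of rescanning the whole grid each time, which is the incremental update being suggested.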

Counting with booleans

I was wondering if I could add booleans up like numbers. I am making something that uses a grid, and I want it to find the surrounding squares and return a number.
EDIT:
This is how I count with booleans.
int count = 0;
for (int x = -1; x <= 1; x++) {
    for (int y = -1; y <= 1; y++) {
        if (grid[xPos + x][yPos + y]) {
            count++;
        }
    }
}
boolean[] bools = ...
int sum = 0;
for (boolean b : bools) {
    sum += b ? 1 : 0;
}
This assumes that you want true to be 1 and false to be 0.
To add to Jeffrey's answer, don't forget:
If you're at the center cell of your nested for loops, don't check the grid and don't add to count; else you're counting the cell itself in its neighbor count. In your situation, that's where (x == 0 && y == 0).
You will need to check if the cell is on the edge, and if so make sure you're not trying to count cells that are off the grid. I've done this using something like this: int xMin = Math.max(cellX - 1, 0); where xMin is the lower bound of one of the for loops. I do similar for y, and similar for the maximum side of the grid. In your code this will happen when xPos + x < 0 or xPos + x >= MAX_X (MAX_X is a constant for the max x value allowed for the grid), and similar for the y side of things.
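Putting both of those points together, a bounds-safe neighbor count with the center cell skipped might look roughly like this (a boolean[][] grid and the method name are my assumptions):
static int countNeighbors(boolean[][] grid, int xPos, int yPos) {
    final int MAX_X = grid.length, MAX_Y = grid[0].length;
    int count = 0;
    for (int x = Math.max(xPos - 1, 0); x <= Math.min(xPos + 1, MAX_X - 1); x++) {
        for (int y = Math.max(yPos - 1, 0); y <= Math.min(yPos + 1, MAX_Y - 1); y++) {
            if (x == xPos && y == yPos) continue; // don't count the cell itself
            if (grid[x][y]) count++;
        }
    }
    return count;
}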
What is your goal? Speed? Readability? Code terseness?
If you're seeking speed, think about minimizing the number of memory accesses. If you can force your booleans to be stored as bits, you could use >> and & to compare only the bits you care about in each row. Maybe something like this:
byte grid[m][n / 8];
int neighbor_count = 0;
for (int row = yPos - 1; row < yPos + 1; row++) {
    // calculate how much to shift the bits over.
    int shift = 5 - (xPos - 1 % 8);
    if (shift > 0) {
        // exercise for the reader - span bytes.
    } else {
        // map value of on-bits to count of on bits
        static byte count[8] = [0, 1, 1, 2, 1, 2, 2, 3];
        // ensure that only the lowest 3 bits are on.
        low3 = (grid[row][xPos / 8] >> shift) & 7;
        // look up value in map
        neighbor_count += count[low3];
    }
}
caveat coder: this is untested and meant for illustration only. It also contains no bounds-checking: a way around that is to iterate from 1 to max - 2 and have a border of unset cells. Also you should subtract 1 if the cell being evaluated is on.
This might end up being slower than what you have. You can further optimize it by storing the bitmap in int32s (or whatever's native). You could also use multithreading, or just implement Hashlife :)
Obviously this optimizes away from terseness and readability. I think you've got maximum readability in your code.
As Jeffrey alludes to, storing a sparse array of 'on' booleans might be preferable to an array of values, depending on what you're doing.

Java: random integer with non-uniform distribution

How can I create a random integer n in Java, between 1 and k with a "linear descending distribution", i.e. 1 is most likely, 2 is less likely, 3 less likely, ..., k least likely, and the probabilities descend linearly, like this:
I know that there are dozens of threads on this topic already, and I apologize for making a new one, but I can't seem to be able to create what I need from them. I know that using import java.util.*;, the code
Random r=new Random();
int n=r.nextInt(k)+1;
creates a random integer between 1 and k, distributed uniformly.
GENERALIZATION: Any hints for creating an arbitrarily distributed integer, i.e. f(n) = some function, P(n) = f(n)/(f(1)+...+f(k)), would also be appreciated, for example:
This should give you what you need:
public static int getLinnearRandomNumber(int maxSize) {
    //Get a linearly multiplied random number
    int randomMultiplier = maxSize * (maxSize + 1) / 2;
    Random r = new Random();
    int randomInt = r.nextInt(randomMultiplier);

    //Linearly iterate through the possible values to find the correct one
    int linearRandomNumber = 0;
    for (int i = maxSize; randomInt >= 0; i--) {
        randomInt -= i;
        linearRandomNumber++;
    }
    return linearRandomNumber;
}
Also, here is a general solution for POSITIVE functions (negative functions don't really make sense) along the range from start index to stopIndex:
public static int getYourPositiveFunctionRandomNumber(int startIndex, int stopIndex) {
    //Generate a random number whose value ranges from 0.0 to the sum of the values of
    //yourFunction for all the possible integer return values from startIndex to stopIndex.
    double randomMultiplier = 0;
    for (int i = startIndex; i <= stopIndex; i++) {
        randomMultiplier += yourFunction(i); //yourFunction(startIndex) + yourFunction(startIndex + 1) + ... + yourFunction(stopIndex - 1) + yourFunction(stopIndex)
    }
    Random r = new Random();
    double randomDouble = r.nextDouble() * randomMultiplier;

    //For each possible integer return value, subtract the yourFunction value for that
    //possible return value till you get below 0. Once you get below 0, return the current value.
    int yourFunctionRandomNumber = startIndex;
    randomDouble = randomDouble - yourFunction(yourFunctionRandomNumber);
    while (randomDouble >= 0) {
        yourFunctionRandomNumber++;
        randomDouble = randomDouble - yourFunction(yourFunctionRandomNumber);
    }
    return yourFunctionRandomNumber;
}
Note: For functions that may return negative values, one method could be to take the absolute value of that function and apply it to the above solution for each yourFunction call.
So we need the following distribution, from least likely to most likely:
*
**
***
****
*****
etc.
Lets try mapping a uniformly distributed integer random variable to that distribution:
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
etc.
This way, if we generate a uniformly distributed random integer from 1 to, say, 15 (in this case, for K = 5), we just need to figure out which bucket it fits in. The tricky part is how to do this.
Note that the numbers on the right are the triangular numbers! This means that for a randomly-generated X from 1 to T_n, we just need to find n such that T_(n-1) < X <= T_n. Fortunately there is a well-defined formula to find the 'triangular root' of a given number, which we can use as the core of our mapping from the uniform distribution to a bucket:
// Assume k is given, via parameter or otherwise
int k;
// Assume also that r has already been initialized as a valid Random instance
Random r = new Random();
// First, generate a number from 1 to T_k
int triangularK = k * (k + 1) / 2;
int x = r.nextInt(triangularK) + 1;
// Next, figure out which bucket x fits into, bounded by
// triangular numbers by taking the triangular root
// We're dealing strictly with positive integers, so we can
// safely ignore the - part of the +/- in the triangular root equation
double triangularRoot = (Math.sqrt(8 * x + 1) - 1) / 2;
int bucket = (int) Math.ceil(triangularRoot);
// Buckets start at 1 as the least likely; we want k to be the least likely
int n = k - bucket + 1;
n should now have the specified distribution.
Let me try another answer too, inspired by rlibby. This particular distribution is also the distribution of the smaller of two values chosen uniformly at random from the same range.
There are lots of ways to do this, but probably the easiest is just to generate two random integers, one between 0 and k (call it x) and one between 0 and h (call it y). If y > m*x + b (with m and b chosen appropriately...), return k - x, else return x.
Edit: responding to comments up here so I can have a little more space.
Basically my solution exploits symmetry in your original distribution, where p(x) is a linear function of x. I responded before your edit about generalization, and this solution doesn't work in the general case (because there is no such symmetry in the general case).
I imagined the problem like this:
You have two right triangles, each k x h, with a common hypotenuse. The composite shape is a k x h rectangle.
Generate a random point that falls on any point within the rectangle with equal probability.
Half the time it will fall in one triangle, half the time in the other.
Suppose the point falls in the lower triangle.
The triangle basically describes the P.M.F., and the "height" of the triangle over each x-value describes the probability that the point will have such an x-value. (Remember that we're only dealing with points in the lower triangle.) So we just yield the x-value.
Suppose the point falls in the upper triangle.
Invert the coordinates and handle it as above with the lower triangle.
You'll have to take care of the edge cases also (I didn't bother). E.g. I see now that your distribution starts at 1, not 0, so there's an off-by-one in there, but it's easily fixed.
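For what it's worth, the "smaller of two values" observation at the top of this answer already gives a tiny sketch that matches the linearly descending distribution (k as in the question, helper name mine):
import java.util.Random;

static int linearDescending(int k, Random r) {
    // the min of two independent uniform draws is linearly more likely to be small
    return 1 + Math.min(r.nextInt(k), r.nextInt(k));
}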
There is no need to simulate this with arrays and such, if your distribution is such that you can compute its cumulative distribution function (cdf). Above you have a probability distribution function (pdf). h is actually determined, since the area under the curve must be 1. For simplicity of math, let me also assume you're picking a number in [0,k).
The pdf here is f(x) = (2/k) * (1 - x/k), if I read you right. The cdf is just integral of the pdf. Here, that's F(x) = (2/k) * (x - x^2 / 2k). (You can repeat this logic for any pdf function if it's integrable.)
Then you need to compute the inverse of the cdf function, F^-1(x) and if I weren't lazy, I'd do it for you.
But the good news is this: once you have F^-1(x), all you do is generate a random value distributed uniformly in [0,1] and apply the function to it. java.util.Random can provide that with some care. That's your randomly sampled value from your distribution.
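Carrying that inversion through for the pdf above: solving u = (2/k)(x - x^2/(2k)) for x gives x = k*(1 - sqrt(1 - u)), so a minimal sketch (continuous result in [0,k); take the floor and add 1 if you want an integer in 1..k) would be:
import java.util.Random;

static double sampleTriangular(double k, Random r) {
    // F^-1(u) = k * (1 - sqrt(1 - u)) for F(x) = (2/k) * (x - x^2 / (2k))
    return k * (1 - Math.sqrt(1 - r.nextDouble()));
}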
This is called a triangular distribution, although yours is a degenerate case with the mode equal to the minimum value. Wikipedia has equations for how to create one given a uniformly distributed (0,1) variable.
The first solution that comes to mind is to use a blocked array. Each index would specify a range of values depending on how "probable" you want it to be. In this case, you would use a wider range for 1, a narrower one for 2, and so on until you reach a small value (let's say 1) for k.
int[] indexBound = new int[k];
int prevBound = 0;
for (int i = 0; i < k; i++) {
    indexBound[i] = prevBound + prob(i); // prob(i): the integer weight assigned to value i
    prevBound = indexBound[i];
}
int r = new Random().nextInt(prevBound);
for (int i = 0; i < k; i++) {
    if (r < indexBound[i])
        return i;
}
Now the problem is just finding a random number, and then mapping that number to its bucket.
You can do this for any distribution provided you can discretize the width of each interval.
Let me know if I am missing something, either in explaining the algorithm or its correctness. Needless to say, this needs to be optimized.
Something like this....
class DiscreteDistribution
{
    // cumulative distribution
    final private double[] cdf;
    final private int k;

    public DiscreteDistribution(Function<Integer, Double> pdf, int k)
    {
        this.k = k;
        this.cdf = new double[k];
        double S = 0;
        for (int i = 0; i < k; ++i)
        {
            double p = pdf.apply(i + 1);
            S += p;
            this.cdf[i] = S;
        }
        for (int i = 0; i < k; ++i)
        {
            this.cdf[i] /= S;
        }
    }

    /**
     * transform a cumulative distribution between 0 (inclusive) and 1 (exclusive)
     * to an integer between 1 and k.
     */
    public int transform(double q)
    {
        // exercise for the reader:
        // binary search on cdf for the lowest index i where q < cdf[i]
        // return this number + 1 (to get into a 1-based index).
        // If q >= 1, return k.
    }
}
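One way the exercise in transform could be filled in, written here as a standalone helper (cdf and k passed explicitly, so it can be folded back into the class above):
static int transform(double[] cdf, int k, double q)
{
    if (q >= 1) return k;
    int lo = 0, hi = k - 1;
    while (lo < hi)
    {
        int mid = (lo + hi) >>> 1;
        if (q < cdf[mid]) hi = mid;   // mid could still be the lowest index with q < cdf[mid]
        else lo = mid + 1;            // the answer lies strictly above mid
    }
    return lo + 1;                    // convert the 0-based index to a 1-based value
}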
The Cumulative Distribution Function is x^2 for a triangular distribution [0,1] with mode (highest weighted probability) of 1, as shown here.
Therefore, all we need to do to transform a uniform distribution (such as Java's Random::nextDouble) into a convenient triangular distribution weighted towards 1 is: simply take the square root, Math.sqrt(rand.nextDouble()), which can then be multiplied by any desired range.
For your example:
int a = 1; // lower bound, inclusive
int b = k; // upper bound, exclusive
double weightedRand = Math.sqrt(rand.nextDouble()); // use triangular distribution
weightedRand = 1.0 - weightedRand; // invert the distribution (greater density at bottom)
int result = (int) Math.floor((b-a) * weightedRand);
result += a; // offset by lower bound
if(result >= b) result = a; // handle the edge case
The simplest thing to do is to generate a list or array of all the possible values, repeated according to their weights.
int k = /* possible values */
int[] results = new int[k * (k + 1) / 2];
for (int i = 1, r = 0; i <= k; i++)
    for (int j = 0; j <= k - i; j++)
        results[r++] = i;
// k=4 => { 1,1,1,1,2,2,2,3,3,4 }

// to get a value with the given distribution:
int n = results[random.nextInt(results.length)];
This works best for relatively small k values, i.e. k < 1000. ;)
For larger numbers you can use a bucket approach:
int k = ...
int[] buckets = new int[k + 1];
for (int i = 1; i <= k; i++)
    buckets[i] = buckets[i - 1] + k - i + 1;
int r = random.nextInt(buckets[buckets.length - 1]);
int n = Arrays.binarySearch(buckets, r);
n = n < 0 ? -n - 1 : n + 1;
The cost of the binary search is fairly small but not as efficient as a direct look up (for a small array)
For an arbitrary distribution you can use a double[] for the cumulative distribution and use a binary search to find the value.
