While searching Google for genetic algorithms, I came across the OneMax problem. My search showed that it is one of the very first problems that genetic algorithms were applied to. However, I am not exactly sure what the OneMax problem is. Can anyone explain?
Any help is appreciated.
The goal of the One-Max problem is to create a binary string of length n in which every gene is 1. The fitness function is very simple: you iterate through the binary string and count the ones. This is what the sum represents in the formula you provided with your post; it is just the number of ones in the binary string. You could also express the fitness as a percentage by dividing the number of ones by n * 0.01, so a fitter string has a higher percentage. Eventually, at some generation, you will get a string of n ones with a fitness of 100%.
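import java.util.List;
double fitness(List<Integer> chromosome) {
    // count the genes that are set to 1 (count() returns a long)
    long ones = chromosome.stream().filter(g -> g == 1).count();
    // percentage of ones; the parentheses avoid integer division
    return ones / (chromosome.size() * 0.01);
}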
I have written the following function to implement a type of mutation (creep) in my genetic algorithm project. Since I've used Java's built-in random number generator, every index is equally likely to be picked. I've been asked to modify the function so that it uses a binomial distribution instead of a uniform one. As far as I could find on Google, there is no example or tutorial that demonstrates converting uniform to binomial. How do I achieve it?
double mutationRate = 0.001;
public void mutate_creep() {
    if (random.nextDouble() <= mutationRate) {
        // uniform random generation
        int index = random.nextInt(chromoLen);
        if (index % 2 == 0) { // even index
            chromo[index] += 1;
        } else { // odd index
            chromo[index] -= 1;
        }
    }
}
NOTE: I have already seen the solution at A efficient binomial random number generator code in Java. Since my problem here is specific to creep mutation algorithm, I'm not sure how it can be applied directly.
According to Wikipedia, you do this:
One way to generate random samples from a binomial distribution is to use an inversion algorithm. To do so, one must calculate the probability that P(X=k) for all values k from 0 through n. (These probabilities should sum to a value close to one, in order to encompass the entire sample space.) Then by using a pseudorandom number generator to generate samples uniformly between 0 and 1, one can transform the calculated samples U[0,1] into discrete numbers by using the probabilities calculated in step one.
I will leave it to you to "calculate the probability [...] for all values k from 0 through n". After that, it's a weighted distribution.
You can do that using a TreeMap, similar to how I show it in this answer.
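As a rough sketch of the inversion approach quoted above (not necessarily what the linked answer does): the class name `BinomialSampler` and its parameters are made up for illustration, and it assumes 0 < p < 1 and a moderate n, so that (1-p)^n does not underflow. The cumulative probabilities are stored as TreeMap keys, and a uniform draw is mapped to the first key above it.

```java
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical helper: draws k in [0, n] with P(k) = C(n,k) * p^k * (1-p)^(n-k).
final class BinomialSampler {
    // Maps the cumulative probability P(X <= k) to k.
    private final TreeMap<Double, Integer> cdf = new TreeMap<>();
    private final int n;
    private final Random random;

    // Assumes 0 < p < 1.
    BinomialSampler(int n, double p, Random random) {
        this.n = n;
        this.random = random;
        double pmf = Math.pow(1.0 - p, n); // P(X = 0)
        double cumulative = 0.0;
        for (int k = 0; k <= n; k++) {
            cumulative += pmf;
            // keep the smallest k if rounding produces the same key twice
            cdf.putIfAbsent(cumulative, k);
            // P(X = k+1) = P(X = k) * (n-k)/(k+1) * p/(1-p)
            pmf = pmf * (n - k) / (k + 1) * p / (1.0 - p);
        }
    }

    int next() {
        double u = random.nextDouble(); // uniform in [0, 1)
        Map.Entry<Double, Integer> entry = cdf.higherEntry(u);
        // the fallback only matters if rounding left the last key below u
        return entry != null ? entry.getValue() : n;
    }
}
```

In mutate_creep, something like `new BinomialSampler(chromoLen - 1, 0.5, random).next()` could then replace `random.nextInt(chromoLen)`; the choice of p = 0.5 is only an example.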
I'm not looking for answers, as this is an internship interview question for their coding problem. Rather, I'm looking for a clue to head in the right direction.
Basically, the user puts in 2 parameters: number of items and price point. For example, if the user puts in 3 for items and $150 for price point, the algorithm should find as many combinations as possible that are close to the price point of 150.
I've thought really hard about this problem, and my initial attempt was to just divide the price point by the total number of items. But this answer only gives me a restricted range for each item.
Is this a P vs. NP type of question?
This is a variation of the Subset-Sum problem with an additional dimension: the number of items. The problem is NP-Complete, so there is no known polynomial solution to it, but there is a pseudo-polynomial one, assuming the prices are relatively small integers.
The dynamic programming solution has one more dimension than the 'usual' subset sum problem, because there is an additional constraint: the number of elements you want to choose. Here f(x,i,k) counts the ways to reach the remaining price x with k more picks from the first i items. It is based on the following recursive approach:
base:
f(0,i,0) = 1 //a combination with the required number of items and the desired price
f(x,0,k) = 0 (x != 0 or k > 0) //no item is left to choose from and the price was not found
f(x,i,-1) = 0 //used more elements than desired
step:
f(x,i,k) = f(x,i-1,k) + f(x-price[i],i-1,k-1)
               ^                  ^
   did not take element i    used element i
This approach is basically brute-force, checking all possibilities at each step, but avoiding double calculations for smaller subproblems that were already solved.
The dynamic programming solution runs in O(n*k*W), where n is the number of items in the collection, k is the given number of items you want to select (3 in your example) and W is the desired weight/price.
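A bottom-up sketch of this recurrence, assuming non-negative integer prices and that each item may be used at most once. The class and method names are made up for illustration. It keeps only the (k, x) table and folds the item index i into the outer loop, which gives the same O(n*k*W) running time.

```java
final class SubsetCount {
    // Counts the selections of exactly k items whose prices sum to target.
    // Assumes non-negative integer prices; each item is used at most once.
    static long countCombinations(int[] price, int k, int target) {
        long[][] ways = new long[k + 1][target + 1];
        ways[0][0] = 1; // base case: nothing picked, nothing left to pay
        for (int p : price) {
            // go downwards so the current item is not counted twice
            for (int used = k; used >= 1; used--) {
                for (int x = target; x >= p; x--) {
                    // "used element i" adds to the "did not take element i" count
                    ways[used][x] += ways[used - 1][x - p];
                }
            }
        }
        return ways[k][target];
    }
}
```

For example, `SubsetCount.countCombinations(new int[] {10, 20, 30, 40}, 2, 50)` counts the two pairs (10, 40) and (20, 30).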
Edits and clarifications:
If you wish to allow an element to be picked more than once, change the step to:
f(x,i,k) = f(x,i-1,k) + f(x-price[i],i,k-1)
                                     ^
              giving a chosen element a chance to be re-chosen
If you wish to allow some 'tolerance' (allow combinations that sum to W' such that |W-W'| <= CONSTANT), you can do it by changing the first two stop clauses to the following:
f(x,0,k) = 0 (|x| > CONSTANT)
f(x,i,0) = 1 (|x| <= CONSTANT)
An alternative is an O(n^k) solution that generates all combinations of k elements and examines each of them.
The problem is the same as the subset-sum problem, which can be solved with a DP solution similar to the knapsack problem, but with a slight variation: you can have a subset whose sum is greater than the price point. This variation can be handled with a small adjustment to the general subset-sum solution:
DP solution for subset-sum:
1. Sum = Price Point.
2. SubSum(Sum,n) = Maximum(Price[n] + SubSum(Sum-Price[n],n), SubSum(Sum,n-1))
3. SubSum(PricePoint,n) = Maximum Closest Subset Sum to Price Point <= Price Point
The above gives the subset sum that is closest to, but less than or equal to, the Price Point. There are cases, however, where a subset sum slightly greater than the price point is the correct answer, so you need to evaluate the subset sum for values larger than the Price Point. The upper bound up to which you need to evaluate is Bound = PricePoint + MinPrice, where MinPrice is the minimum price among all items. Then find the value of SubSum(x,n), for x up to Bound, that is closest to PricePoint.
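One way to sketch this idea, under some assumptions of mine: prices are positive integers, an item may be reused (as the SubSum(Sum-Price[n],n) term suggests), and the number-of-items constraint is ignored, as it is in this answer. Instead of the Maximum recurrence, the sketch marks every reachable sum up to Bound and then picks the one closest to the price point. The names are hypothetical.

```java
final class ClosestSubsetSum {
    // Returns the reachable sum closest to pricePoint (ties go to the smaller sum).
    // Assumes at least one price, all prices positive, and items reusable.
    static int closestSum(int[] price, int pricePoint) {
        int minPrice = Integer.MAX_VALUE;
        for (int p : price) {
            minPrice = Math.min(minPrice, p);
        }
        int bound = pricePoint + minPrice; // sums beyond this cannot be closer
        boolean[] reachable = new boolean[bound + 1];
        reachable[0] = true; // the empty selection
        for (int s = 1; s <= bound; s++) {
            for (int p : price) {
                if (p <= s && reachable[s - p]) {
                    reachable[s] = true;
                    break;
                }
            }
        }
        int best = 0;
        for (int s = 1; s <= bound; s++) {
            if (reachable[s] && Math.abs(s - pricePoint) < Math.abs(best - pricePoint)) {
                best = s;
            }
        }
        return best;
    }
}
```

With a single price of 80 and a price point of 150, the closest reachable sum is 160, which is the "slightly greater" case described above.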
I have written this code to compute the sine of an angle. It works fine for smaller angles, say up to +-360, but with larger angles it starts giving faulty results. (By larger, I mean something within the range +-720 or +-1080.)
In order to get more accurate results I increased the number of times my loop runs. That gave me better results, but it still had its limitations.
So I was wondering: is there a fault in my logic, or do I need to fiddle with the conditional part of my loop? How can I overcome this shortcoming of my code? The built-in Java sine function gives correct results for all the angles I have tested, so where am I going wrong?
Also, can anyone give me an idea of how to modify the condition of my loop so that it runs until I get a desired decimal precision?
import java.util.Scanner;
class SineFunctionManual
{
    public static void main(String a[])
    {
        System.out.print("Enter the angle for which you want to compute sine : ");
        Scanner input = new Scanner(System.in);
        int degreeAngle = input.nextInt(); //Angle in degree.
        input.close();
        double radianAngle = Math.toRadians(degreeAngle); //Sine computation is done in terms of radian angle
        System.out.println(radianAngle);
        double sineOfAngle = radianAngle,prevVal = radianAngle; //SineofAngle contains actual result, prevVal contains the next term to be added
        //double fractionalPart = 0.1; // This variable is used to check the answer to a certain number of decimal places, as seen in the for loop
        for(int i=3;i<=20;i+=2)
        {
            prevVal = (-prevVal)*((radianAngle*radianAngle)/(i*(i-1))); //x^3/3! can be written as ((x^2)/(3*2))*((x^1)/1!), similarly x^5/5! can be written as ((x^2)/(5*4))*((x^3)/3!) and so on. The negative sign is added because each successive term has alternate sign.
            sineOfAngle+=prevVal;
            //int iPart = (int)sineOfAngle;
            //fractionalPart = sineOfAngle - iPart; //Extracting the fractional part to check the number of decimal places.
        }
        System.out.println("The value of sin of "+degreeAngle+" is : "+sineOfAngle);
    }
}
The polynomial approximation for sine diverges widely for large positive and large negative values. Remember, sine varies from -1 to 1 over all real numbers. Polynomials, on the other hand, particularly ones with higher orders, can't do that.
I would recommend using the periodicity of sine to your advantage.
int degreeAngle = input.nextInt() % 360;
This will give accurate answers, even for very, very large angles, without requiring an absurd number of terms.
The further you get from x=0, the more terms of the Taylor expansion for sin x you need to get within a particular accuracy of the correct answer. Your loop stops at the x^19 term (ten terms in total), which is fine for small angles. If you want better accuracy for large angles, you'll just need to add more terms.
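Both suggestions can be combined. The following is only a sketch: the class name, the `epsilon` parameter and the choice to reduce the angle to [-180, 180] are my assumptions, not part of the original code. It uses the periodicity first, and then keeps adding terms until the last one drops below the requested precision, which also covers the question about the loop condition.

```java
final class TaylorSine {
    // Sine of an angle given in degrees, using the Taylor series around 0.
    // epsilon is the size of the last term at which the loop stops.
    static double sinDegrees(double degrees, double epsilon) {
        double reduced = degrees % 360.0; // Java's % keeps the sign of the dividend
        if (reduced > 180.0) {
            reduced -= 360.0;
        } else if (reduced < -180.0) {
            reduced += 360.0;
        }
        double x = Math.toRadians(reduced); // now |x| <= pi
        double term = x; // first term of the series
        double sum = x;
        for (int i = 3; Math.abs(term) > epsilon; i += 2) {
            // next term: multiply by -x^2 / (i * (i - 1))
            term = -term * x * x / (i * (i - 1));
            sum += term;
        }
        return sum;
    }
}
```

A call such as `TaylorSine.sinDegrees(1080, 1e-12)` then needs no more terms than an angle of 0 degrees does, because 1080 reduces to 0.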
I was wondering what may be the reason to use this median function, instead of just calculating the min + (max - min) / 2:
// used by the random number generator
private static final double M_E12 = 162754.79141900392083592475;
/**
 * Return an estimate of median of n values distributed in [min,max)
 * @param min the minimum value
 * @param max the maximum value
 * @param n
 * @return an estimate of median of n values distributed in [min,max)
 **/
private static double median(double min, double max, int n)
{
    // get random value in [0.0, 1.0)
    double t = (new Random()).nextDouble();
    double retval;
    if (t > 0.5) {
        retval = java.lang.Math.log(1.0-(2.0*(M_E12-1)*(t-0.5)/M_E12))/12.0;
    } else {
        retval = -java.lang.Math.log(1.0-(2.0*(M_E12-1)*t/M_E12))/12.0;
    }
    // We now have something distributed on (-1.0,1.0)
    retval = (retval+1.0) * (max-min)/2.0;
    retval = retval + min;
    return retval;
}
The only downside of my approach would maybe be its deterministic nature, I'd say?
The whole code can be found here, http://www.koders.com/java/fid42BB059926626852A0D146D54F7D66D7D2D5A28D.aspx?s=cdef%3atree#L8, btw.
Thanks
[trying to cover a range here because it's not clear to me what you're not understanding]
first, the median is the middle value. the median of [0,0,1,99,99] is 1.
and so we can see that the code given is not calculating the median (it's not finding a middle value). instead, it's estimating it from some theoretical distribution. as the comment says.
the formula you give is for the mid-point. if many values are uniformly distributed between min and max then yes, that is a good estimate of the median. in this case (presumably) the values are not distributed in that way and so some other method is necessary.
you can see why this could be necessary by calculating the mid point of the numbers above - your formula would give 49.5.
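to make the difference concrete, here is a small sketch (the class and method names are made up) that computes both values for the numbers above. it sorts a copy of the data to find the true median, which is exactly the work an estimate avoids.

```java
import java.util.Arrays;

final class MedianVsMidpoint {
    // True median: the middle value of the sorted data
    // (average of the two middle values for an even count).
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Mid-point of the range, as in the question's formula.
    static double midpoint(double min, double max) {
        return min + (max - min) / 2;
    }
}
```

for [0,0,1,99,99] the median is 1, while midpoint(0, 99) is 49.5.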
the reason for using an estimate is probably that it is much faster than finding the median. the reason for making that estimate random is likely to avoid a bad worst case on multiple calls.
and finally, sorry but i don't know what the distribution is in this case. you probably need to search for the data structure and/or author name to see if you can find a paper or book reference (i thought it might be assuming a power law, but see edit below - it seems to be adding a very small correction) (i'm not sure if that is what you are asking, or if you are more generally confused).
[edit] looking some more, i think the log(...) is giving a central bias to the uniformly random t. so it's basically doing what you suggest, but with some spread around the 0.5. here's a plot of one case which shows that retval is actually a pretty small adjustment.
I can't tell you what this code is attempting to achieve; for a start it doesn't even use n!
But from the looks of it, it's simply generating some sort of exponentially-distributed random value in the range [min,max]. See http://en.wikipedia.org/wiki/Exponential_distribution#Generating_exponential_variates.
Interestingly, Googling for that magic number brings up lots of relevant hits, none of which are illuminating: http://www.google.co.uk/search?q=162754.79141900392083592475.
You are given a list of n numbers L=<a_1, a_2,...a_n>. Each of them is
either 0 or of the form +/- 2^k, 0 <= k <= 30. Describe and implement an
algorithm that returns the largest product of a CONTINUOUS SUBLIST
p=a_i*a_(i+1)*...*a_j, 1 <= i <= j <= n.
For example, for the input <8 0 -4 -2 0 1> it should return 8 (either 8
or (-4)*(-2)).
You can use any standard programming language and can assume that
the list is given in any standard data structure, e.g. int[],
vector<int>, List<Integer>, etc.
What is the computational complexity of your algorithm?
In my first answer I addressed the OP's problem in "multiplying two big big numbers". As it turns out, this wish is only a small part of a much bigger problem which I'm going to address now:
"I still haven't arrived at the final skeleton of my algorithm I wonder if you could help me with this."
(See the question for the problem description)
All I'm going to do is explain the approach Amnon proposed in little more detail, so all the credit should go to him.
You have to find the largest product of a continuous sublist from a list of integers which are powers of 2. The idea is to:
Compute the product of every continuous sublist.
Return the biggest of all these products.
You can represent a sublist by its start and end index. For start=0 there are n possible values for end, namely 0..n-1. This generates all sublists that start at index 0. In the next iteration, You increment start by 1 and repeat the process (this time, there are n-1 possible values for end). This way You generate all possible sublists.
Now, for each of these sublists, You have to compute the product of its elements - that is come up with a method computeProduct(List wholeList, int startIndex, int endIndex). You can either use the built in BigInteger class (which should be able to handle the input provided by Your assignment) to save You from further trouble or try to implement a more efficient way of multiplication as described by others. (I would start with the simpler approach since it's easier to see if Your algorithm works correctly and first then try to optimize it.)
Now that You're able to iterate over all sublists and compute the product of their elements, determining the sublist with the maximum product should be the easiest part.
If it's still too hard for You to make the connection between the two steps, let us know - but please also provide us with a draft of Your code as You work on the problem so that we don't end up incrementally constructing the solution and You copy&pasting it.
edit: Algorithm skeleton
public BigInteger listingSublist(BigInteger[] biArray)
{
    int start = 0;
    int end = biArray.length-1;
    BigInteger maximum = null; // stays null only for an empty array
    for (int i = start; i <= end; i++)
    {
        for (int j = i; j <= end; j++)
        {
            // keep the largest product seen so far
            BigInteger product = computeProduct(biArray, i, j);
            if (maximum == null || product.compareTo(maximum) > 0)
            {
                maximum = product;
            }
        }
    }
    return maximum;
}
public BigInteger computeProduct(BigInteger[] wholeList, int startIndex,
        int endIndex)
{
    // wholeList[startIndex] * wholeList[startIndex+1] * ... * wholeList[endIndex]
    BigInteger product = BigInteger.ONE;
    for (int i = startIndex; i <= endIndex; i++)
    {
        product = product.multiply(wholeList[i]);
    }
    return product;
}
Since k <= 30, any integer i = 2^k will fit into a Java int. However, the product of two such integers might not fit into a Java int, since 2^k * 2^k = 2^(2*k) <= 2^60, which fits into a Java long. This should answer Your question regarding the "(multiplication of) two numbers...".
In case that You might want to multiply more than two numbers, which is implied by Your assignment saying "...largest product of a CONTINUOUS SUBLIST..." (a sublist's length could be > 2), have a look at Java's BigInteger class.
Actually, the most efficient way of multiplication is doing addition instead. In this special case all you have is numbers that are powers of two, and you can get the product of a sublist by simply adding the exponents together (and counting the negative numbers in your product, and making it a negative number in case of an odd number of negatives).
Of course, to store the result you may need the BigInteger, if you run out of bits. Or depending on how the output should look like, just say (+/-)2^N, where N is the sum of the exponents.
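A minimal sketch of that idea, assuming the input is an int[] whose entries are 0 or +/- 2^k as in the assignment. The class and method names are made up. It only falls back to BigInteger for the final result.

```java
import java.math.BigInteger;

final class PowerOfTwoProduct {
    // Product of values[from..to] (inclusive), where every value is 0 or +/- 2^k.
    static BigInteger product(int[] values, int from, int to) {
        int exponentSum = 0;
        boolean negative = false;
        for (int i = from; i <= to; i++) {
            int v = values[i];
            if (v == 0) {
                return BigInteger.ZERO;
            }
            if (v < 0) {
                negative = !negative; // an odd number of negatives flips the sign
            }
            // for +/- 2^k the number of trailing zero bits is k,
            // which also holds for negative values in two's complement
            exponentSum += Integer.numberOfTrailingZeros(v);
        }
        BigInteger result = BigInteger.ONE.shiftLeft(exponentSum); // 2^exponentSum
        return negative ? result.negate() : result;
    }
}
```

For the sublist (-4)*(-2) from the example, the exponents 2 and 1 add up to 3 and the two negatives cancel, giving 2^3 = 8.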
Parsing the input could be a matter of switch-case, since you only have 30 numbers to take care of. Plus the negatives.
That's the boring part. The interesting part is how you get the sublist that produces the largest number. You can take the dumb approach, by checking every single variation, but that would be an O(N^2) algorithm in the worst case (IIRC). Which is really not very good for longer inputs.
What can you do? I'd probably start from the largest non-negative number in the list as a sublist, and grow the sublist to get as many non-negative numbers in each direction as I can. Then, having all the positives in reach, proceed with pairs of negatives on both sides, eg. only grow if you can grow on both sides of the list. If you cannot grow in both directions, try one direction with two (four, six, etc. so even) consecutive negative numbers. If you cannot grow even in this way, stop.
Well, I don't know if this algorithm even works, but if it (or something similar) does, it's an O(N) algorithm, which means great performance. Let's try it out! :-)
Hmmm.. since they're all powers of 2, you can just add the exponent instead of multiplying the numbers (equivalent to taking the logarithm of the product). For example, 2^3 * 2^7 is 2^(7+3)=2^10.
I'll leave handling the sign as an exercise to the reader.
Regarding the sublist problem, there are less than n^2 pairs of (begin,end) indices. You can check them all, or try a dynamic programming solution.
EDIT: I adjusted the algorithm outline to match the actual pseudo code and put the complexity analysis directly into the answer:
Outline of algorithm
Go sequentially over the sequence and store the value and first/last index of the product (positive) since the last 0. Do the same for another product (negative) which only consists of the numbers since the first sign change of the sequence. If you hit a negative sequence element, swap the two products (positive and negative) along with the associated starting indices. Whenever the positive product hits a new maximum, store it and the associated start and end indices. After going over the whole sequence, the result is stored in the maximum variables.
To avoid overflow calculate in binary logarithms and an additional sign.
Pseudo code
maxProduct = 0
maxProductStartIndex = -1
maxProductEndIndex = -1
sequence.push_front( 0 ) // reuses variable initialization of the case n == 0
for every index of sequence
    n = sequence[index]
    if n == 0
        posProduct = 0
        negProduct = 0
        posProductStartIndex = index+1
        negProductStartIndex = -1
    else
        if n < 0
            swap( posProduct, negProduct )
            swap( posProductStartIndex, negProductStartIndex )
            if -1 == posProductStartIndex // start second sequence on sign change
                posProductStartIndex = index
            end if
            n = -n;
        end if
        logN = log2(n) // as indicated all arithmetic is done on the logarithms
        posProduct += logN
        if -1 < negProductStartIndex // start the second product as soon as the sign changes first
            negProduct += logN
        end if
        if maxProduct < posProduct // update current best solution
            maxProduct = posProduct
            maxProductStartIndex = posProductStartIndex
            maxProductEndIndex = index
        end if
    end if
end for
// output solution
print "The maximum product is " 2^maxProduct "."
print "It is reached by multiplying the numbers from sequence index "
print maxProductStartIndex " to sequence index " maxProductEndIndex
Complexity
The algorithm uses a single loop over the sequence, so it's O(n) times the complexity of the loop body. The most complicated operation of the body is log2. Ergo it's O(n) times the complexity of log2. The log2 of a number of bounded size is O(1), so the resulting complexity is O(n), aka linear.
I'd like to combine Amnon's observation about multiplying powers of 2 with one of mine concerning sublists.
Lists are terminated hard by 0's. We can break the problem down into finding the biggest product in each sub-list, and then the maximum of that. (Others have mentioned this).
This is my 3rd revision of this writeup. But 3's the charm...
Approach
Given a list of non-0 numbers (this is what took a lot of thinking), there are two cases, the second with two sub-cases:
1. The list contains an even number of negative numbers (possibly none). This is the trivial case: the optimum result is the product of all numbers, guaranteed to be positive.
2. The list contains an odd number of negative numbers, so the product of all numbers would be negative. To change the sign, it becomes necessary to sacrifice a subsequence containing a negative number. Two sub-cases:
a. sacrifice numbers from the left up to and including the leftmost negative; or
b. sacrifice numbers from the right up to and including the rightmost negative.
In either case, return the product of the remaining numbers. Having sacrificed exactly one negative number, the result is certain to be positive. Pick the winner of (a) and (b).
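The approach can be sketched roughly as follows. This is my reading of it, not the author's code: the names are made up, the input is assumed to be an int[] of values that are 0 or +/- 2^k, and all arithmetic is done on exponents (log2). The sketch also handles two corners the description leaves open: a segment that is a single negative number, and a list with no positive product at all.

```java
import java.math.BigInteger;

final class MaxProductSublist {
    static BigInteger maxProduct(int[] a) {
        if (a.length == 0) throw new IllegalArgumentException("empty list");
        int best = -1; // log2 of the best positive product; -1 = none yet
        boolean sawZero = false;
        int start = 0; // first index of the current zero-free segment
        for (int i = 0; i <= a.length; i++) {
            if (i == a.length || a[i] == 0) {
                if (i < a.length) sawZero = true;
                if (i > start) best = Math.max(best, bestInSegment(a, start, i));
                start = i + 1;
            }
        }
        if (best >= 0) return BigInteger.ONE.shiftLeft(best);
        // No positive product exists: the answer is 0 if the list has one,
        // otherwise the list is a single negative number.
        return sawZero ? BigInteger.ZERO : BigInteger.valueOf(a[0]);
    }

    // log2 of the largest positive product inside the zero-free range
    // a[from..to), or -1 if that range is a single negative number.
    private static int bestInSegment(int[] a, int from, int to) {
        int negatives = 0, firstNeg = -1, lastNeg = -1;
        for (int i = from; i < to; i++) {
            if (a[i] < 0) {
                negatives++;
                if (firstNeg < 0) firstNeg = i;
                lastNeg = i;
            }
        }
        // even number of negatives: take the whole segment
        if (negatives % 2 == 0) return exponentSum(a, from, to);
        // odd: sacrifice from the left through the leftmost negative ...
        int dropLeft = firstNeg + 1 < to ? exponentSum(a, firstNeg + 1, to) : -1;
        // ... or from the right through the rightmost negative
        int dropRight = lastNeg > from ? exponentSum(a, from, lastNeg) : -1;
        return Math.max(dropLeft, dropRight);
    }

    // Sum of exponents k over a[from..to), each value being +/- 2^k.
    private static int exponentSum(int[] a, int from, int to) {
        int sum = 0;
        for (int i = from; i < to; i++) sum += Integer.numberOfTrailingZeros(a[i]);
        return sum;
    }
}
```

For the example input <8 0 -4 -2 0 1> the segments give exponents 3, 3 and 0, so the result is 2^3 = 8.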
Implementation
The input needs to be split into subsequences delimited by 0. The list can be processed in place if a driver method is built to loop through it and pick out the beginnings and ends of non-0 sequences.
Doing the math in longs would only double the possible range. Converting to log2 makes arithmetic with large products easier. It prevents program failure on large sequences of large numbers. It would alternatively be possible to do all math in Bignums, but that would probably perform poorly.
Finally, the end result, still a log2 number, needs to be converted into printable form. Bignum comes in handy there. There's new BigInteger("2").pow(log); which will raise 2 to the power of log.
Complexity
This algorithm works sequentially through the sub-lists, only processing each one once. Within each sub-list, there's the annoying work of converting the input to log2 and the result back, but the effort is linear in the size of the list. In the worst case, the sum of much of the list is computed twice, but that's also linear complexity.
See this code. Here I implement the exact factorial of a very large number. I am just using an integer array to represent big numbers. Download the code from Planet Source Code.