This is a problem given in HackWithInfy 2019 on HackerRank.
I have been stuck on this problem since yesterday.
Question:
You are given an array of N integers. You have to find a pair (i, j)
with 1 <= i < j <= N which maximizes the value of GCD(a[i], a[j]) + (j - i).
Constraints are:
2<= N <= 10^5
1<= a[i] <= 10^5
I've tried this problem using Python.
Here is an approach that could work:
result = 0
min_i = array[1 ... 100000] initialized to 0
for j in [1, 2, ..., n]:
    for d in divisors of a[j]:
        let i = min_i[d]
        if i > 0:
            result = max(result, d + j - i)
        else:
            min_i[d] = j
Here, min_i[d] for each d is the smallest index i such that a[i] % d == 0. We use this in the inner loop to find, for each d, the first element in the array whose GCD with a[j] is at least d. When j is the second index of an optimal pair, the inner-loop iteration with d equal to the optimal GCD sets result to the correct answer.
The maximum possible number of divisors for a natural number less than or equal to 100,000 is 128 (see here). Therefore the inner loop runs at most 128 * 100,000 = 12.8 million times. I imagine this could pass with some optimizations (although maybe not in Python).
(To iterate over divisors, use a sieve to precompute the smallest nontrivial divisor for each integer from 1 to 100000.)
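For concreteness, here is a rough Java sketch of the pseudocode above (the class name, the smallest-prime-factor helper, and the sample array are my own additions; treat it as an illustration rather than tested contest code):

import java.util.ArrayList;
import java.util.List;

public class MaxGcdPlusDistance {
    static final int MAX = 100000;

    // spf[v] = smallest prime factor of v, precomputed with a sieve.
    static int[] smallestPrimeFactor() {
        int[] spf = new int[MAX + 1];
        for (int v = 2; v <= MAX; v++)
            if (spf[v] == 0)
                for (int m = v; m <= MAX; m += v)
                    if (spf[m] == 0) spf[m] = v;
        return spf;
    }

    // All divisors of v, built from its prime factorization via spf.
    static List<Integer> divisors(int v, int[] spf) {
        List<Integer> divs = new ArrayList<>();
        divs.add(1);
        while (v > 1) {
            int p = spf[v], e = 0;
            while (v % p == 0) { v /= p; e++; }
            int size = divs.size(), power = 1;
            for (int k = 1; k <= e; k++) {
                power *= p;
                for (int s = 0; s < size; s++) divs.add(divs.get(s) * power);
            }
        }
        return divs;
    }

    static int solve(int[] a) {
        int[] spf = smallestPrimeFactor();
        int[] minIndex = new int[MAX + 1]; // minIndex[d] = first 1-based index j with d | a[j]
        int result = 0;
        for (int j = 1; j <= a.length; j++) {
            for (int d : divisors(a[j - 1], spf)) {
                int i = minIndex[d];
                if (i > 0) result = Math.max(result, d + j - i);
                else minIndex[d] = j;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(solve(new int[]{2, 3, 4, 9, 6})); // prints 6, from the pair (1, 5)
    }
}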
Here is one way of doing it.
Create a mutable class MinMax for storing the min. and max. index.
Create a Map<Integer, MinMax> for storing the min. and max. index for a particular divisor.
For each index i, find all divisors of a[i] and update the map accordingly, so that the MinMax object for each divisor stores the min. and max. index i seen with that divisor.
When done, iterate over the map and find the entry with the largest value of key + value.max - value.min.
The min. and max. indices of that entry are your pair (i, j).
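A minimal Java sketch of this map-based variant (names are mine; divisors are found by trial division here for brevity, though the sieve from the previous answer works too):

import java.util.HashMap;
import java.util.Map;

public class MinMaxDivisors {
    // Mutable holder for the first and last index seen for a divisor.
    static class MinMax {
        int min, max;
        MinMax(int i) { min = i; max = i; }
    }

    static int solve(int[] a) {
        Map<Integer, MinMax> byDivisor = new HashMap<>();
        for (int i = 1; i <= a.length; i++) {
            for (int d = 1; d * d <= a[i - 1]; d++) {
                if (a[i - 1] % d != 0) continue;
                update(byDivisor, d, i);
                update(byDivisor, a[i - 1] / d, i);
            }
        }
        int best = 0;
        for (Map.Entry<Integer, MinMax> e : byDivisor.entrySet()) {
            MinMax mm = e.getValue();
            if (mm.max > mm.min) // a genuine pair i < j is required
                best = Math.max(best, e.getKey() + mm.max - mm.min);
        }
        return best;
    }

    static void update(Map<Integer, MinMax> map, int d, int i) {
        MinMax mm = map.get(d);
        if (mm == null) map.put(d, new MinMax(i));
        else mm.max = i; // indices arrive in increasing order
    }
}

As with the sieve approach, key + value.max - value.min for a divisor d is a lower bound on GCD(a[i], a[j]) + (j - i) for that pair, and the optimum is reached at the entry where d is the pair's actual GCD.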
I want to count all Pythagorean triples (primitive and non-primitive) whose hypotenuse is <= N, for a given N. This OEIS link gives this count for powers of 10. A simple but somewhat efficient algorithm can easily be derived from this link or from Wikipedia, using Euclid's famous formula. Translated into Java, it becomes:
public class countPTS {
    public static void main(String[] args)
    {
        long N = 1000000L;
        long count = 0;
        for (long m = 2; m * m + 1 <= N; m++)
            for (long n = 1 + m % 2; n < m && n * n + m * m <= N; n += 2)
                if (gcd(m, n) == 1)
                    count += N / (n * n + m * m);
        System.out.println(count);
    }

    public static long gcd(long a, long b)
    {
        if (a == 0)
            return b;
        return gcd(b % a, a);
    }
}
This gives the correct results, but is a bit "slow". Regardless of the actual time complexity of this algorithm in big-O notation, it seems to grow slightly worse than O(N). It is easy to check that almost the entire time is spent in gcd calculations.
If C(N) is the count of such triples, then this code, run single-threaded on a modern PC under the latest version of the Eclipse IDE, yields C(10^10) = 34465432859 in about 90 seconds. This makes me wonder how the high values in the OEIS link were obtained, the largest being C(10^17).
I suspect that an algorithm with a better time complexity was used, perhaps O(N^(2/3)) or O(N^(3/4)), maybe with some more advanced mathematical insight. It is also plausible that the above algorithm was run with significant improvements and perhaps multithreaded. Can anyone shed light on this topic and the higher values?
TL;DR: There is an approach that runs slightly worse than O(N^(2/3)). It consists of enumerating primitive triples with an efficient algorithm up to N^(2/3) and counting the Pythagorean triples they generate, then using inclusion-exclusion to count the primitive triples past that point and scaling up to count the rest. The rest of this answer is a detailed explanation of how this all works.
Your direct calculation over primitive triples cannot be made better than O(N). The reason is that over 60% of pairs of integers are relatively prime, so we can put a lower bound on the number of primitive triples as follows.
There are floor(sqrt(N/2)) choose 2 = O(N) pairs of integers at most sqrt(N/2). Each pair that is relatively prime with one number even gives rise to a primitive Pythagorean triple with hypotenuse at most N.
About 60% of those pairs are relatively prime.
Relatively prime pairs never have both numbers even, and at most 25% of all pairs have both numbers odd. Therefore at least 60% - 25% = 35% of pairs are relatively prime with exactly one of them even.
Therefore there are Ω(N) distinct primitive Pythagorean triples with hypotenuse at most N.
Given that there are Ω(N) primitive triples, you'll have to do at least Ω(N) work to enumerate them all. So you don't want to enumerate them all. Instead we'll enumerate up to some point, and then do something else with the rest. But first, let's make the enumeration more efficient.
Now as you noticed, your GCD checks are taking most of your time. Can we enumerate all of them with hypotenuse less than, say, L more efficiently?
Here is the most efficient approach that I found.
Enumerate the primes up to sqrt(L) with any decent technique. This is O(sqrt(L) log(L)) with the Sieve of Eratosthenes, which is good enough.
Using this list, we can produce the set of odd numbers up to sqrt(L), complete with their prime factorizations, fairly efficiently. (Just recursively produce prime factorizations that fit in the range.)
Given an odd number and its prime factorization, we can run through the even numbers and quickly find which ones are relatively prime to it. That gives us the primitive Pythagorean triples.
For step 3, create a PriorityQueue whose elements start off as the pairs (2*p, 2*p) for each distinct prime factor p of the odd number. For example, for 15 we'd have a queue with (6, 6) and (10, 10). We then do this:
i = the odd number we started with
j = 2
while i*i + j*j < L:
    if queue.peek()[0] == j:
        while queue.peek()[0] == j:
            (x, y) = queue.pop()
            queue.add((x+y, y))
    else:
        i*i + j*j is the hypotenuse of a primitive Pythagorean triple
    j = j + 2
This is very slightly superlinear. In fact, the expected complexity of finding all the numbers out to n that are relatively prime to i is O(n * log(log(log(i)))). Why? Because the number of distinct prime factors is on average O(log(log(i))), the queue operations are O(log(THAT)), and we do O(n) queue operations on average. (I'm waving my hands past some number theory, but the result is correct.)
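Here is a small Java sketch of that queue technique (class and method names are mine; it assumes the distinct prime factors of i are already known from step 2):

import java.util.List;
import java.util.PriorityQueue;

public class CoprimeEvens {
    // For a fixed odd i with known distinct prime factors, report every even j
    // with i*i + j*j < L and gcd(i, j) == 1, without a single gcd call.
    static void primitiveHypotenuses(long i, List<Long> primeFactors, long L) {
        // Each queue entry is {next even multiple, step = 2*p}.
        PriorityQueue<long[]> queue = new PriorityQueue<>((u, v) -> Long.compare(u[0], v[0]));
        for (long p : primeFactors) queue.add(new long[]{2 * p, 2 * p});

        for (long j = 2; i * i + j * j < L; j += 2) {
            if (!queue.isEmpty() && queue.peek()[0] == j) {
                // j is divisible by a prime factor of i: skip it and advance
                // every queue entry currently sitting at j.
                while (!queue.isEmpty() && queue.peek()[0] == j) {
                    long[] e = queue.poll();
                    queue.add(new long[]{e[0] + e[1], e[1]});
                }
            } else {
                // gcd(i, j) == 1, so i*i + j*j is a primitive hypotenuse.
                System.out.println(i + "^2 + " + j + "^2 = " + (i * i + j * j));
            }
        }
    }

    public static void main(String[] args) {
        primitiveHypotenuses(15, List.of(3L, 5L), 1000); // 15 has prime factors 3 and 5
    }
}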
OK, what is our alternative to actually enumerating many of the primitive triples?
The answer is to use inclusion-exclusion to count the number of primitive triples below some bound L, which I will demonstrate by counting the number of primitive triples whose hypotenuse is at most 100.
The first step is to implement a function EvenOddPairCount(L) that counts how many pairs of one odd and one even number there are whose squares sum to at most L. To do this we traverse the fringe of maximal even/odd pairs, which takes O(sqrt(L)). Here is that calculation for 100:
1*1 + 8*8, 4 pairs
3*3 + 8*8, 4 pairs
5*5 + 8*8, 4 pairs
7*7 + 6*6, 3 pairs
9*9 + 4*4, 2 pairs
Which gives 17 pairs.
Now the problem is that we have counted things like 3*3 + 6*6 = 45 that are not relatively prime. But we can subtract off EvenOddPairCount(L/9) to find those that are both divisible by 3. We can likewise subtract off EvenOddPairCount(L/25) to find the ones that are divisible by 5. But now the ones divisible by 15 have been added once, and subtracted twice. So we have to add back in EvenOddPairCount(L/(15*15)) to again not count those.
Keep going and you get a sum, over all products x of distinct odd primes, of EvenOddPairCount(L/(x*x)), where we add if x has an even number of prime factors (including x = 1 with zero of them) and subtract if it has an odd number. Since we already have the odd primes, we can easily produce the sequence of inclusion-exclusion terms. It starts off 1, -9, -25, -49, -121, -169, +225, -289, -361, +441 and so on.
But how much work does this take? If l = sqrt(L), it takes l + l/3 + l/5 + l/7 + l/11 + l/13 + l/15 + ... + l/l < l*(1 + 1/2 + 1/3 + 1/4 + ... + 1/l) = O(l log(l)) = O(sqrt(L) log(L)).
And so we can calculate PrimitiveTriples(L) in time O(sqrt(L) log(L)).
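To make this concrete, here is a hedged Java sketch of both pieces (all names are mine; the Möbius function from a standard linear sieve supplies the add/subtract signs described above):

public class PrimitiveTripleCount {
    // EvenOddPairCount(L): pairs (odd a, even b) with a*a + b*b <= L,
    // counted by walking the fringe in O(sqrt(L)).
    static long evenOddPairCount(long L) {
        long count = 0;
        long b = (long) Math.sqrt((double) L) / 2 * 2 + 2; // start just above the fringe
        for (long a = 1; a * a + 4 <= L; a += 2) {
            while (a * a + b * b > L) b -= 2; // b never drops below 2 here
            count += b / 2;                   // the evens 2, 4, ..., b all fit
        }
        return count;
    }

    // mobius[v] for 1 <= v <= n, via a linear sieve.
    static int[] mobiusSieve(int n) {
        int[] mu = new int[n + 1];
        int[] primes = new int[n + 1];
        boolean[] composite = new boolean[n + 1];
        int cnt = 0;
        mu[1] = 1;
        for (int v = 2; v <= n; v++) {
            if (!composite[v]) { primes[cnt++] = v; mu[v] = -1; }
            for (int k = 0; k < cnt && (long) primes[k] * v <= n; k++) {
                composite[primes[k] * v] = true;
                if (v % primes[k] == 0) { mu[primes[k] * v] = 0; break; }
                mu[primes[k] * v] = -mu[v];
            }
        }
        return mu;
    }

    // PrimitiveTriples(L): inclusion-exclusion over odd squarefree x,
    // with mobius(x) supplying the +/- sign described above.
    static long primitiveTriples(long L) {
        int[] mu = mobiusSieve((int) Math.sqrt((double) L) + 1);
        long total = 0;
        for (long x = 1; x * x <= L; x += 2)
            if (mu[(int) x] != 0)
                total += mu[(int) x] * evenOddPairCount(L / (x * x));
        return total;
    }

    public static void main(String[] args) {
        System.out.println(primitiveTriples(100)); // 16: the 17 pairs minus (3, 6)
    }
}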
What does this buy us? Well, PrimitiveTriples(N) - PrimitiveTriples(N/2) gives us the number of primitive Pythagorean triples whose hypotenuse is in the range from N/2 to N. And 2*(PrimitiveTriples(N/2) - PrimitiveTriples(N/3)) gives us the number of Pythagorean triples which are multiples of primitive ones in the range from N/3 to N/2. And 3*(PrimitiveTriples(N/3) - PrimitiveTriples(N/4)) gives us the number of Pythagorean triples coming from primitives in the range from N/4 to N/3.
Therefore we can enumerate small primitive Pythagorean triples, and figure out how many Pythagorean triples are a multiple of those. And we can use inclusion-exclusion to count the multiples of large Pythagorean triples. Where is the cutoff between the two strategies?
Let's ignore the various log factors for the moment. If we cut off at N^c then we do roughly O(N^c) work on the small. And we need to calculate N^(1-c) thresholds of PrimitiveTriples, the last of which takes work roughly O(N^(c/2)). So let's try setting N^c = N^(1-c) * N^(c/2). That works out when c = (1-c) + c/2 which happens when c = 2/3.
And this is the final algorithm.
Find all primes up to sqrt(N).
Use those primes to enumerate all primitive triples with hypotenuse up to N^(2/3) and count how many Pythagorean triples they give.
Find all of the inclusion-exclusion terms up to N (there are O(sqrt(N)) of them, and they can be produced directly from the factorizations).
Calculate PrimitiveTriples(N/d) for all d up to N^(1/3). Use that to count the rest of the triples.
And all of this runs in time o(N^(2/3 + epsilon)) for any epsilon greater than 0.
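For completeness, here is how the pieces combine before the cutoff optimization is applied, as a method one could add to the PrimitiveTripleCount sketch above (my own naming; the full algorithm only evaluates this sum for d up to N^(1/3) and handles the small primitives by direct enumeration instead):

    // Naive combination of the pieces, before applying the N^(2/3) cutoff:
    // every Pythagorean triple is a unique multiple d of a primitive one, so
    // C(N) = sum over d >= 1 of PrimitiveTriples(N / d).
    static long countAllTriples(long N) {
        long total = 0;
        for (long d = 1; 5 * d <= N; d++) // 5 is the smallest possible hypotenuse
            total += primitiveTriples(N / d);
        return total;                     // countAllTriples(100) == 52
    }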
You can avoid the GCD calculation altogether:
Make a list of all primes < sqrt(N)
Iterate through all subsets of those primes with total product < sqrt(N)
For each subset, the primes in the set will be "m primes", while the other primes will be "n primes"
Let "m_base" be the product of all m primes
For m, iterate through all combinations of the m primes with product < sqrt(N)/m_base, yielding m = m_base * that combination, and:
Calculate the maximum value of n for each m
For n, iterate through all combinations of the n primes with product < that maximum.
Hey, I have the task of writing code to compute the median of an array. My teacher gave me the following instructions:
Algorithm for the list L with length n > 1 and the position i:
1. Divide the n elements of the list L into ⌊n/5⌋ groups of 5 elements and at most 1 group with n mod 5 elements.
2. Compute the median of each of the ⌈n/5⌉ groups.
3. Recursively compute the median x of the medians from step 2.
4. Partition the list L into 2 lists: L1 with all numbers < x and L2 with all numbers > x. Also compute the lengths l1 and l2 of the lists (x will be at position k = l1 + 1).
5. If i = k, return x; if i < k, recursively compute the i-th element of L1; and if i > k, recursively compute the (i - k)-th element of L2.
So my question is: what exactly is the i? I already wrote the code and everything works except step 5, because I don't know what i is and how to use it. How is it defined, and how does it change in the recursion?
This is honestly a very common term in programming and a very basic concept. Here, 'i' means the index (position) you are looking for. So, for instance, if you are asked for position i = 4 and l1 is 3, then i == k (as mentioned before, k = l1 + 1), and you have to return x!
The algorithm you’ve listed solves the following problem:
Given an (unsorted) array A and an index i, return the ith smallest item in array A.
So, for example, if you have an array A of length 5, then with the 1-based convention of your instructions, i = 1 asks for the smallest element in the array, i = 3 for the median, and i = 5 for the largest element.
(This use of the variable i is specific to the problem statement as it was given to you. It’s not something that generally has this meaning.)
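To make the recursion on i concrete, here is a minimal Java sketch of step 5 (my own illustration: it assumes distinct elements and, for brevity, uses the first element as a stand-in for the median-of-medians pivot x; the bookkeeping for i is identical):

import java.util.ArrayList;
import java.util.List;

public class Select {
    // Return the i-th smallest element of list (1-based i, distinct elements).
    static int select(List<Integer> list, int i) {
        int x = list.get(0); // stand-in for the median-of-medians pivot
        List<Integer> l1 = new ArrayList<>(), l2 = new ArrayList<>();
        for (int v : list) {
            if (v < x) l1.add(v);
            else if (v > x) l2.add(v);
        }
        int k = l1.size() + 1;           // x sits at position k
        if (i == k) return x;            // found the i-th smallest
        if (i < k) return select(l1, i); // same rank i, smaller list
        return select(l2, i - k);        // ranks in L2 are shifted down by k
    }

    public static void main(String[] args) {
        System.out.println(select(List.of(7, 1, 5, 3, 9), 3)); // median: 5
    }
}

The only thing that changes in the recursion is that a search in L2 asks for rank i - k, because the k smallest elements have already been accounted for.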
Is the complexity of this code O(log^2(n))? If not, which complexity would it be? Thanks:
public static int f(int n, int x) {
    for (int i = n; i > 0; i /= 2) {
        for (int j = 0; j < i; j++) {
            x += j; // Assume this operation costs 1.
        }
    }
    return x;
}
This is an interesting one. The assumption of log^2(n) is wrong. Henry gave a good reductio ad absurdum in the comments for why it cannot be log^2(n):
We can see that O(log^2(n)) ⊊ O(n).
The first iteration of the inner loop alone runs n times, so the algorithm is ∈ Ω(n).
Since O(log^2(n)) ⊊ O(n), the assumption must be wrong: an algorithm in Ω(n) cannot be in O(log^2(n)).
This also provides us with a lower bound for the algorithm: the whole algorithm takes at least Ω(n).
Now let us get to estimating the execution time. Normally, the first approach is to estimate the inner and outer loops separately and multiply them together. Clearly, the outer loop has complexity log(n). Estimating the inner loop, however, is not trivial, so we can overestimate it with n and get a result of n log(n). This is an upper bound.
To get a more precise estimation, let us make two observations:
The total work of the inner loop is the sum of all values that the outer loop variable i takes.
The loop variable i follows the pattern n, n/2, n/4, ..., 1.
Let us assume that n = 2^k, k ∈ ℕ, k > 0, i.e. n is a power of 2. Then n/2 = 2^(k-1), n/4 = 2^(k-2), and so on. To generalize from this assumption: if n is not a power of 2, we round it down to the next smaller power of 2. This is, in fact, an exact estimation. I leave the reasoning as to why as an exercise for the reader.
It is a well-known fact that 2^k + 2^(k-1) + 2^(k-2) + ... + 1 (+ 0) = sum_(i=0)^k 2^i = 2^(k+1) - 1. Since our input is n = 2^k, we know that 2^(k+1) = 2 * 2^k = 2 * n ∈ O(n). The algorithm's runtime complexity is, in fact, Θ(n), i.e. this is both an upper and a lower bound. It is a lower bound because the estimation we made is exact. Alternatively, we can use our observation that the algorithm is ∈ Ω(n) and arrive at Θ(n) that way.
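As a quick sanity check of the 2n bound (this snippet is my own addition, not part of the original answer), one can count the inner-loop iterations directly:

public class CountOps {
    static long ops(int n) {
        long count = 0;
        for (int i = n; i > 0; i /= 2) // same loops as in the question
            for (int j = 0; j < i; j++)
                count++;
        return count;
    }

    public static void main(String[] args) {
        // For n = 2^k this prints exactly 2n - 1 operations.
        for (int n : new int[]{1 << 10, 1 << 16, 1 << 20})
            System.out.println(n + " -> " + ops(n) + " (2n = " + 2L * n + ")");
    }
}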
First of all, look at the outer loop. It iterates until i < 1, i.e. i = 0. So the outer loop executes for the values i = N, N/2, N/4, ..., N/2^k (executing k times), where
N/2^k < 1
N < 2^k
k > log(N)
so the outer loop executes about log(N) times.
Now look at the inner loop. It executes N times, then N/2 times, then N/4 times, and so on until it reaches 1, once for each of the log(N) outer iterations.
So, the time complexity will be N + N/2 + N/4 + ..., with log(N) terms.
By geometric progression:
a = N, r = 1/2, n = log(N) (remember the log has base 2)
Also, using a^log(b) = b^log(a) and log(2) = 1:
N * (1 - (1/2)^log(N)) / (1 - 1/2) = 2N * (1 - 1^log(N) / N^log(2)) = 2N * (1 - 1/N) = 2(N - 1) = O(N)
So, time complexity is O(N)
Linear O(n)
Total cost = n (first outer loop iteration) + n/2 (second outer loop iteration) + n/4 (third) + ..., over a total of log(n) iterations. This sum is bounded by 2n (the sum of a geometric series with a = n, r = 1/2).
How can I search through an array and find every combination of three values whose sum is divisible by a given number x, in Java?
In other words, every combination where (n1 + n2 + n3) % x == 0.
I know there is a simple solution using a triple for loop, but I need something with a time complexity of O(N^2).
Any ideas?
1 - Hash every (element % x) in the list.
2 - Sum every pair of elements in the list (let's call the sum y) and compute modded_y = y % x.
3 - Check whether (x - modded_y) % x is in the hash table. If it is, and it's not one of the other 2 numbers, then you have found a possible combination. Keep iterating and hashing the combinations you have found so that they don't repeat.
This is called a meet in the middle strategy.
Its complexity is O(n + n^2 * 1) = O(n^2).
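Here is a hedged Java sketch of this idea (names are mine; it counts index triples i < j < k rather than printing them, which sidesteps the duplicate bookkeeping in step 3):

public class TripleSumDivisible {
    // Count triples i < j < k with (a[i] + a[j] + a[k]) % x == 0 in O(n^2):
    // fix j, keep a table cnt[r] of remainders seen at indices before j,
    // and scan every k > j looking up the complementary remainder.
    static long countTriples(int[] a, int x) {
        long total = 0;
        int[] cnt = new int[x]; // cnt[r] = how many i < j have a[i] % x == r
        for (int j = 0; j < a.length; j++) {
            for (int k = j + 1; k < a.length; k++) {
                int need = ((x - (a[j] + a[k]) % x) % x + x) % x;
                total += cnt[need];
            }
            cnt[((a[j] % x) + x) % x]++; // j becomes a candidate i from now on
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countTriples(new int[]{1, 2, 3, 4, 5}, 3)); // 4
    }
}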
I will call the given array A1.
1) Create two structs:
#include <stdbool.h>

struct pair {
    int first;
    int second;
    bool valid;
};

struct third {
    int value;
    int index;
};
2) Using a nested loop, initialize an array B1 of all possible pairs.
3) Loop through B1. If (A1[B1[i].first] + A1[B1[i].second]) % x == 0, then set B1[i].valid to true.
4) Create an array A3 of third structs that stores the index and value of every element of A1 that is divisible by x.
5) Using a nested loop, go through each element of A3 and each element of B1. If B1[i].valid == true,
print A1[B1[i].first] and A1[B1[i].second] together with the element A1[A3[j].index].
That should give you all such combinations without using any triple loops.
I have a Java computational problem in which I am given an array of integers:
For example:
3 -2 -10 0 1
and I am supposed to compute the minimal and maximal triplet products that
can be formed from these integers. (In this case, min = -30, max = 60.)
I initially thought that the maximum would always be positive and the minimum would always be negative.
Hence,
My initial algorithm was:
Scan the array and take out the 3 largest elements inside, store into an array.
At the same time, take out the 3 smallest elements inside, store into another array.
By inequalities, we can deduce the following:
+ve = (-)(-)(+) or (+)(+)(+)
-ve = (+)(+)(-) or (-)(-)(-)
Hence, I used the elements from the two arrays that I computed to try to obtain the maximal and minimal triplet. (i.e. In order to obtain the maximal triplet, I compared the triplet formed by the largest 3 with the triplet formed by the smallest 2 and the largest integer)
However, I realized that if all the given integers were negative, my algorithm would fail, because the maximum would then be negative (and vice versa for the minimum).
I know that I can add more checks to solve this problem, or just use the brute-force O(N^3) solution, but there must be a better way.
This problem must be solved by recursion and only in O(N) time.
I am in a fix. Could someone please guide me?
Thanks.
There is an O(n) solution if you find the 3 maximum and 2 minimum values in linear time.
But you can also use O(n log n) sorting (e.g. quicksort) to do that job easily.
Here is the solution for finding the maximum product triplet in C, with explanations in the comments:
#include <stdlib.h>

int cmpfunc(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y); /* avoids the overflow of plain x - y */
}

long solution(int A[], int N)
{
    long product = (long) A[0] * A[1] * A[2];
    if (N == 3) {
        return product;
    }
    /* O(N log N) */
    qsort(A, N, sizeof(int), cmpfunc);
    if (A[N - 3] >= 0) {
        /* if there are at least 3 non-negative values,
           start with the three maximum values */
        product = (long) A[N - 1] * A[N - 2] * A[N - 3];
        if (A[1] < 0) {
            /* if there are at least 2 negative values */
            if (product < (long) A[N - 1] * A[0] * A[1]) {
                /* take the maximum positive and the two minimum negatives,
                   if that beats the three maximum values */
                product = (long) A[N - 1] * A[0] * A[1];
            }
        }
    } else if (A[N - 1] >= 0) {
        /* else if there is at least 1 non-negative value,
           take the maximum value and the two minimum negatives */
        product = (long) A[N - 1] * A[0] * A[1];
    } else {
        /* otherwise all values are negative:
           take the 3 maximum (closest to zero) values */
        product = (long) A[N - 1] * A[N - 2] * A[N - 3];
    }
    return product;
}
First, you only have to solve one of the two problems, say find the biggest triple product. With this you can find the least by negating all the input values, finding the biggest triple product, and negating to find the answer.
So let's focus on finding the biggest. You have it pretty well worked out. Take the maximum positive number first (if there is one). Then pick either the pair of the two biggest remaining positive numbers or the two biggest (in magnitude) negative numbers, whichever pair has the largest product.
If there are no positive numbers at all, then pick the three smallest negative numbers.
Certainly this can all be done in O(n) time, but this is not an algorithm where recursion has a natural place. You'd have to use trivial tail recursion to substitute for loops.
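For completeness, a minimal O(n) Java sketch of this strategy (class and method names are mine; it uses the standard equivalent formulation max(max1*max2*max3, max1*min1*min2), with the negate-and-reuse trick for the minimum):

public class MaxTripleProduct {
    // One pass tracks the three largest and the two smallest values (n >= 3).
    static long maxProduct(int[] a) {
        long max1 = Long.MIN_VALUE, max2 = Long.MIN_VALUE, max3 = Long.MIN_VALUE;
        long min1 = Long.MAX_VALUE, min2 = Long.MAX_VALUE;
        for (long v : a) {
            if (v > max1) { max3 = max2; max2 = max1; max1 = v; }
            else if (v > max2) { max3 = max2; max2 = v; }
            else if (v > max3) { max3 = v; }
            if (v < min1) { min2 = min1; min1 = v; }
            else if (v < min2) { min2 = v; }
        }
        // Either the three largest, or the largest with the two smallest
        // (which covers the two-big-negatives case and the all-negative case).
        return Math.max(max1 * max2 * max3, max1 * min1 * min2);
    }

    // Minimal product by negating, reusing maxProduct, and negating back.
    static long minProduct(int[] a) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++) b[i] = -a[i];
        return -maxProduct(b);
    }

    public static void main(String[] args) {
        int[] a = {3, -2, -10, 0, 1};
        System.out.println(maxProduct(a)); // 60
        System.out.println(minProduct(a)); // -30
    }
}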