Efficiently counting all the pairs with a given sum - java

I came across the following question.
Given a list where each item represents the duration of a song in seconds, return the total number of pairs of songs whose durations sum to a whole number of minutes (e.g., 1m0s, 2m0s, ...), i.e., to a multiple of 60 seconds.
Example:
Input: [10,50,20,110,40]
Output: 3 (considering pairs at indexes (0,1),(0,3),(2,4))
I can only think of a brute-force approach where I consider all pairs of songs; its time complexity is O(n^2).
Is there any better way of doing it?

The given problem reduces to finding the number of pairs (a, b) in the given list A such that (a + b) mod 60 == 0.
Observation #1: For any integer x, (x mod 60) lies between 0 and 59.
Initialise an array freq of length 60 with all values set to 0; index i of this array will store the number of elements x in the list A such that x mod 60 == i.
int freq[60] = {0};
for(int i = 0; i < A.size(); i++)
    freq[(A[i] % 60)]++;
Now iterate over the array A again. For every x, look up the count at index (60 - (x mod 60)) mod 60 in the frequency array; that is the number of elements x can form a pair with. The cases (x mod 60) == 0 and (x mod 60) == 30 are the tricky ones: there x pairs with its own residue class, so we subtract 1 so that x is not paired with itself. Finally, this loop counts every pair once from each of its two endpoints, so we divide the total by 2 at the end.
long long ans = 0;
for(int i = 0; i < A.size(); i++) {
    int r = A[i] % 60;
    ans += freq[(60 - r) % 60];
    if(r == 0 || r == 30) ans--;
}
ans /= 2;
The overall complexity of the solution is O(n).
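Since the question asks for Java, here is a sketch of the same frequency-bucket idea translated to Java (my own translation, not part of the original answer):
public static int countPairsDivisibleBy60(int[] a) {
    int[] freq = new int[60];
    for (int v : a) freq[v % 60]++;
    long ans = 0;
    for (int v : a) {
        int r = v % 60;
        ans += freq[(60 - r) % 60];    // elements whose residue complements r
        if (r == 0 || r == 30) ans--;  // don't pair an element with itself
    }
    return (int) (ans / 2);            // each pair was counted from both endpoints
}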

Think along the lines of hashing, buckets, and modulo arithmetic. Every duration falls into one of 60 possible buckets according to its remainder mod 60. Then think about how many pairs you can form by choosing two values from complementary buckets; for the two buckets that pair with themselves (remainder 0 and remainder 30), use nC2 to count. Here's my solution in Java.
public int numPairsDivisibleBy60(int[] time) {
    int k = 60;
    int[] mods = new int[k];
    for (int i = 0; i < time.length; i++)
        mods[time[i] % k]++;
    // n(n-1)/2 pairs for multiples of k and for numbers whose remainder is k/2
    int count = ((mods[0] * (mods[0] - 1)) / 2) +
                ((mods[k / 2] * (mods[k / 2] - 1)) / 2);
    for (int i = 1; i < k / 2; i++)
        count += mods[i] * mods[k - i];
    return count;
}
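A quick check against the example from the question (my own test harness; Solution is a hypothetical class name wrapping the method above):
public static void main(String[] args) {
    int[] time = {10, 50, 20, 110, 40};
    // pairs: (10,50), (10,110), (20,40)
    System.out.println(new Solution().numPairsDivisibleBy60(time)); // prints 3
}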

Related

Calculate all possible sums in an array from its sub arrays

I have an array of numbers. I have to generate all possible subarrays of the array and apply a condition to each: for each subarray, find its minimum and the total of its elements, and multiply the two (minimum * total). Finally, add up these products over all subarrays.
Here is the problem statement:
Find the sum of all possible sub-arrays using the below formula:
Sum(left, right) = (min of arr[i]) * (∑ arr[i]), where i ranges from left to right.
Example:
Array = [2,3,2,1]
The sub arrays are: [start_index, end_index]
[0,0] subarray = [2], min is 2 and total of items = 2. min * total = 2*2=4
[0,1] subarray = [2,3], min is 2 and total of items = 5. min * total = 2*5=10
[0,2] subarray = [2,3,2], min is 2 and total of items = 7. min * total = 2*7=14
[0,3] subarray = [2,3,2,1], min is 1 and total of items = 8. min * total = 1*8=8
[1,1] subarray = [3], min is 3 and total of items = 3. min * total = 3*3 = 9
[1,2] subarray = [3,2], min is 2 and total of items = 5. min * total = 2*5 = 10
[1,3] subarray = [3,2,1], min is 1 and total of items = 6. min * total = 1*6 = 6
[2,2] subarray = [2], min is 2 and total of items = 2. min * total = 2*2 = 4
[2,3] subarray = [2,1], min is 1 and total of items = 3. min * total = 1*3 = 3
[3,3] subarray = [1], min is 1 and total of items = 1. min * total = 1*1 = 1
Total = 4 + 10 + 14 + 8 + 9 + 10 + 6 + 4 + 3 + 1 = 69
So the answer is 69 in this case.
Constraints:
Each array element is in the range 1 to 10^9. Array size is 1 to 10^5. Return the response modulo 10^9+7.
This is the code I tried.
public static int process(List<Integer> list) {
    int n = list.size();
    int mod = 7 + 1000_000_000;
    long result = 0;
    for (int i = 0; i < n; i++) {
        long total = 0;
        int min = list.get(i);
        for (int j = i; j < n; j++) {
            int p = list.get(j);
            total = (total + p) % mod;
            min = Math.min(min, p);
            result = (result + (min * total) % mod) % mod;
        }
    }
    return (int) result;
}
How can I reduce the time complexity of this algorithm?
What would be a better approach to solve this task?
Update:
David Eisenstat has given a great answer, but I'm finding it too difficult to understand and turn into a Java program. Can someone provide a Java solution for the approach, or pseudocode, so I can come up with a program?
As user1984 observes, we can't achieve o(n²) by doing constant work for each sub-array. Here's how we get to O(n).
The sub-array minimum is the hardest factor to deal with, so we factor it out. Assume that the elements are pairwise distinct to avoid double counting in the math below (the code won't change). Letting A range over sub-arrays and x over elements,
sum_{A} [(sum_{y in A} y) (min A)] =
sum_{x} [x sum_{A such that min(A) = x} (sum_{y in A} y)].
Focusing on sum_{A | min(A) = x} (sum_{y in A} y) first, the picture is that we have a sub-array like
a b x c d e
where the element to the left of a (if it exists) is less than x, the element to the right of e (if it exists) is less than x, and all of the elements shown are greater than x. We want to sum over all sub-sub-arrays containing x.
a b x
b x
x
a b x c
b x c
x c
a b x c d
b x c d
x c d
a b x c d e
b x c d e
x c d e
We still don't have time to sum over these sub-sub-arrays, but fortunately there's a pattern. Here are the number of times each element appears in a sub-sub-array.
a: 4 = 1 * 4 appearances
b: 8 = 2 * 4 appearances
x: 12 = 3 * 4 appearances
c: 9 = 3 * 3 appearances
d: 6 = 3 * 2 appearances
e: 3 = 3 * 1 appearances
This insight reduces the processing time for one sub-array to O(n), but there are still n sub-arrays, so we need two more ideas.
Now is the right time to figure out what the sub-arrays look like. The first sub-array is the whole array. We split this array at the minimum element and recursively investigate the left sub-array and the right separately.
This recursive structure is captured by the labeled binary tree where
The in-order traversal is the array elements in order;
Every node has a label less than its children. (I'm still assuming distinct elements. In practice what we can do is to declare the array index to be a tiebreaker.)
This is called a treap, and it can be constructed in linear time by an algorithm with a family resemblance to precedence parsing. For the array [3,1,4,5,9,2,6], for example, see below.
  1
 / \
3   2
   / \
  4   6
   \
    5
     \
      9
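Here is a hedged Java sketch (my own code, not David's) of the stack-based linear-time construction alluded to above: walk the array left to right, keep the current right spine on a stack, and pop everything larger than the new element, which then becomes the new element's left subtree. Ties are broken by index, as suggested above. For [3,1,4,5,9,2,6] it reproduces exactly the tree shown.
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class CartesianTreeSketch {
    // Returns parent[i] for each index of a; the root has parent -1.
    static int[] build(int[] a) {
        int n = a.length;
        int[] parent = new int[n], left = new int[n], right = new int[n];
        Arrays.fill(parent, -1);
        Arrays.fill(left, -1);
        Arrays.fill(right, -1);
        Deque<Integer> spine = new ArrayDeque<>(); // indexes on the current right spine
        for (int i = 0; i < n; i++) {
            int last = -1;
            while (!spine.isEmpty() && a[spine.peek()] > a[i]) {
                last = spine.pop();           // pop everything larger than a[i] ...
            }
            if (last != -1) {                 // ... it becomes a[i]'s left subtree
                left[i] = last;
                parent[last] = i;
            }
            if (!spine.isEmpty()) {           // a[i] hangs as right child of the new top
                right[spine.peek()] = i;
                parent[i] = spine.peek();
            }
            spine.push(i);
        }
        return parent;
    }
}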
The final piece is being able to aggregate the sum patterns above. Specifically, we want to implement an API that might look like this in C++:
class ArraySummary {
 public:
  // Constructs an object with underlying array [x].
  ArraySummary(int x);
  // Returns an object representing the concatenation of the underlying arrays.
  ArraySummary Concatenate(ArraySummary that);
  // Returns the sum over i of (i+1)*array[i].
  int WeirdSum();
};
The point of this interface is that we don't actually need to store the whole array to implement WeirdSum(). If we store
The length length of the underlying array,
The usual sum sum of the underlying array,
The weird sum weird_sum of the underlying array;
then we can implement the constructor as
length = 1;
sum = x;
weird_sum = x;
and Concatenate() as
length = length1 + length2;
sum = sum1 + sum2;
weird_sum = weird_sum1 + weird_sum2 + length1 * sum2;
We need two of these, one in each direction. Then it's just a depth-first traversal (actually if you implement precedence parsing, it's just bottom-up).
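Since the update asks for Java or pseudocode, here is a minimal Java sketch of just the summary object described above (my own naming; modular arithmetic is omitted for clarity). The treap construction and the depth-first combination of two such summaries, one per direction, still have to be written around it.
final class ArraySummary {
    final long length;    // length of the underlying (virtual) array
    final long sum;       // plain sum of its elements
    final long weirdSum;  // sum over i of (i + 1) * array[i]

    ArraySummary(long length, long sum, long weirdSum) {
        this.length = length;
        this.sum = sum;
        this.weirdSum = weirdSum;
    }

    // Summary of the single-element array [x].
    static ArraySummary of(long x) {
        return new ArraySummary(1, x, x);
    }

    // Summary of the concatenation this ++ that, in O(1): every element of
    // `that` is shifted right by `length` positions, so its weighted sum
    // grows by length * that.sum, exactly as in the Concatenate() rule above.
    ArraySummary concatenate(ArraySummary that) {
        return new ArraySummary(
                length + that.length,
                sum + that.sum,
                weirdSum + that.weirdSum + length * that.sum);
    }
}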
Your current solution has time complexity O(n^2), assuming that list.get is O(1). There are exactly 1 + 2 + ... + (n-1) + n inner-loop iterations, which is n * (n + 1)/2, hence O(n^2).
Interestingly, n * (n + 1)/2 is also the number of sub-arrays you can get from an array of length n, as defined in your question and evident from your code.
This implies that you are doing constant work per sub-array, which is the required minimum for this approach, since you need to look at each sub-array at least once.
My conclusion is that it isn't possible to reduce the time complexity of this task, unless there is some mathematical insight that helps to do so.
This doesn't necessarily mean that there aren't ways to optimize the code, but that would need testing and may be language specific. Regardless, it wouldn't change the time complexity in terms of n, where n is the length of the input array.
Appreciate any input on my logic. I'm learning myself.
The answer provided by David Eisenstat is very efficient, with a complexity of O(n).
I would like to share another approach that, although it has a time complexity of O(n^2), may be simpler and easier for some (me included) to fully understand.
Algorithm
Initialize a two-dimensional array Matrix[n][n] where each cell holds a pair of integers <sum, min>. For each Matrix[i, j] we will denote the first element of the pair as Matrix[i, j].sum and the second as Matrix[i, j].min.
Initialize the diagonal of the matrix as follows:
for i in [0, n-1]:
    Matrix[i][i] = <arr[i], arr[i]>
Fill the rest of each row by extending the subarray one element to the right:
for i in [0, n-1]:
    for j in [i+1, n-1]:
        Matrix[i, j] = <
            Matrix[i, j-1].sum + arr[j],
            Min(Matrix[i, j-1].min, arr[j])
        >
Calculate the result:
result = 0
for i in [0, n-1]:
    for j in [i, n-1]:
        result += Matrix[i, j].sum * Matrix[i, j].min
Time Complexity Analysis
Step 1: initializing a two-dimensional array of size [n, n] would in theory take O(n^2), since it may require setting every cell to 0, but if we skip the per-cell initialization and just allocate the memory this can be treated as O(1)
Step 2 : Here we iterate from 0 to n-1 doing constant work each iteration and therefore the time complexity is O(n)
Step 3: Here we iterate over half of the matrix cells(all that are right of the diagonal), doing constant work each iteration, and therefore the time complexity is O((n - 1) + (n - 2) + .... + (1) + (0)) = O(n^2)
Step 4: Similar analysis to step 3, O(n^2)
In total we get O(n^2)
Explanation of the solution
This is a simple example of a dynamic programming approach.
Let's define sub[i, j] as the subarray between indexes i and j, where 0 <= i <= j <= n-1.
Then:
Matrix[i, j].sum = sum x in sub[i, j]
Matrix[i, j].min = min x in sub[i, j]
Why?
for sub[i,i] it's obvious that:
sum x in sub[i, i] = arr[i]
min x in sub[i, i] = arr[i]
Just like we calculate in step 2.
Convince yourself that:
sum sub[i, j] = sum sub[i, j-1] + arr[j]
min sub[i, j] = Min(min sub[i, j-1], arr[j])
This explains step 3.
In step 4 we just sum up everything to get the required result.
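For completeness, here is a direct Java translation of that matrix DP (my own sketch, using java.util.List like the code in the question). Note that the n x n tables make it O(n^2) in memory as well as time, so it only illustrates the recurrence and is not practical for n up to 10^5:
public static int processMatrixDp(List<Integer> list) {
    int n = list.size();
    int mod = 1_000_000_007;
    long[][] sum = new long[n][n];
    long[][] min = new long[n][n];
    long result = 0;
    for (int i = 0; i < n; i++) {
        sum[i][i] = list.get(i);              // step 2: diagonal = single elements
        min[i][i] = list.get(i);
        for (int j = i + 1; j < n; j++) {     // step 3: extend sub[i, j-1] by arr[j]
            sum[i][j] = (sum[i][j - 1] + list.get(j)) % mod;
            min[i][j] = Math.min(min[i][j - 1], list.get(j));
        }
    }
    for (int i = 0; i < n; i++)               // step 4: accumulate min * sum
        for (int j = i; j < n; j++)
            result = (result + min[i][j] * sum[i][j]) % mod;
    return (int) result;
}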
There is also an O(n) solution.
Intuition
First of all, we want to handle every subarray through a window of the form
a1 a2 a3 min b1 b2 b3, where min is the minimum. We will use a monotonically increasing stack to achieve this: in every iteration, if the stack's top value is greater than the next element, we pop the stack and do the calculation below until the condition no longer holds.
Secondly, we need to figure out how to calculate the total contribution of an a1 a2 a3 min b1 b2 b3 subarray. Here we will use a prefix of the prefix sum.
Prefix Sum
At first, we need the prefix sum. Assume that p denotes the prefix sum; for the subarray above we get p1 p2 p3 p4 p5 p6 p7. Our prefix sum will be like this:
p1: a1
p2: a1 + a2
p3: a1 + a2 + a3
.
p6 : a1 + a2 + a3 + min + b1 + b2
p7: a1 + a2 + a3 + min + b1 + b2 + b3
With the prefix sum we can now calculate the sum between two indexes. The sum over (start, end] is p_end - p_start. If start = 1 and end = 3, that means p3 - p1 = (a1 + a2 + a3) - (a1) = a2 + a3.
Prefix of Prefix Sum
How can we calculate all possible subarray sums that include our min value?
We separate this calculation to the left side and right side.
The left side included min will be a1 a2 a3 min.
The right side included min will be min b1 b2 b3.
For example, some of the possible sums can be:
a1 + a2 + a3 + min
a1 + a2 + a3 + min + b1
a3 + min + b1 + b2 + b3
min + b1 + b2 + b3
We need to find the sums over all [ai, bj] ranges, where i runs over the left-side indexes and j runs over the right-side indexes. For this we use the prefix of the prefix sum; let's call it P. The total will come out as sum(Pj) - sum(Pi).
Now, how do we calculate our sum(Pj) - sum(Pi)?
sum(Pj) over the right side is P7 - P4.
In the same way, sum(Pi) over the left side is P4 - P1.
How many times does each side get combined?
Each right-side end is combined with every left-side start, giving leftSize * (P7 - P4); in the same way the left-side part contributes rightSize * (P4 - P1).
Final equation to calculate subarray [a1 a2 a3 min b1 b2 b3] is: min * ((leftSize * (P7 - P4)) - (rightSize * (P4 - P1))).
Algorithm
public static int process(List<Integer> list) {
    int n = list.size();
    int mod = (int) 1e9 + 7;
    int[] preSum = new int[n + 2];
    Deque<Integer> stack = new ArrayDeque<>();
    int pre = 0;
    int result = 0;
    for (int i = 0; i <= n; i++) {
        int num = i < n ? list.get(i) : 0;
        // current prefix sum
        pre = (pre + num) % mod;
        // prefix of prefix sum array
        preSum[i + 1] = (preSum[i] + pre) % mod;
        while (!stack.isEmpty() && list.get(stack.peek()) > num) {
            int mid = stack.pop();
            int left = stack.isEmpty() ? -1 : stack.peek();
            int lSize = mid - left;
            int rSize = i - mid;
            long lSum = left < 0 ? preSum[mid] : preSum[mid] - preSum[left];
            long rSum = preSum[i] - preSum[mid];
            result = (int) (result + (list.get(mid) * ((rSum * lSize - lSum * rSize) % mod)) % mod) % mod;
        }
        stack.push(i);
    }
    return (result + mod) % mod;
}
Time complexity: O(n)
Space complexity: O(n)
References
Thanks to #lee215 for one pass solution.
Thanks to #forAc for the explanation of the final equation.
https://leetcode.com/problems/sum-of-total-strength-of-wizards/discuss/2061985/JavaC%2B%2BPython-One-Pass-Solution

Explain this implementation of Euler's Totient

I have seen this code on a coding platform for efficiently calculating Euler's totient for different values.
I am not able to understand this implementation. I really want to learn it. Could anyone please explain it?
for(int i = 1; i < Maxn; i++) { // phi[1....n] in n * log(n)
    phi[i] += i;
    for(int j = 2 * i; j < Maxn; j += i) {
        phi[j] -= phi[i];
    }
}
First, let's note that for prime values p, phi(p) = p - 1. This should be fairly intuitive, because all numbers less than a prime are coprime to it. So then we start into our outer for loop:
for(int i = 1; i < Maxn; i++) { // phi[1....n] in n * log(n)
phi[i] += i;
Here we add the value of i to phi(i). For the prime case, this means we need phi(i) to equal -1 beforehand, and all other phi(i) must be adjusted further to account for the number of coprime integers. Focusing on the prime case, let's convince ourselves that these do equal -1.
If we step through the loop, in the case i = 1 we end up iterating over every other element in the inner loop, subtracting phi(1) = 1 from each.
for(int j = 2 * i; j < Maxn; j += i) {
phi[j] -= phi[i];
}
For any other value to be subtracted, j would have to equal the prime p. But that would require j = 2 * i + i * k to equal p for some iteration k. That cannot be, because 2 * i + i * k == i * (2 + k) implies that p is evenly divisible by i, which it cannot be (since it's prime and i > 1 here). Thus all phi(p) = p - 1.
For non-prime j, we need to subtract out the integers that are not coprime to j. We do this in the inner for loop. Reusing the formula from before, if i divides j we get j / i = (2 + k). So every value less than i can be multiplied by (2 + k) to give a value less than j that shares the factor (2 + k) with j (and is thus not coprime to j).
However, if we subtracted all (i - 1) such multiples for every divisor i, we would count the same numbers multiple times. Instead, for each divisor i we only subtract phi(i), the count of values coprime to i. Thus we are left with phi(x) = x - phi(factor_a) - phi(factor_b) - ..., which is just the classical identity x = sum of phi(d) over all divisors d of x, rearranged to solve for phi(x).
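As a concrete check of that identity: for x = 12 the sieve ends with phi(12) = 12 - phi(1) - phi(2) - phi(3) - phi(4) - phi(6) = 12 - 1 - 1 - 2 - 2 - 2 = 4, which is indeed Euler's totient of 12.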
Putting this into code gives us exactly what you have above:
for(int i = 1; i < Maxn; i++) { // phi[1....n] in n * log(n)
    phi[i] += i;
    for(int j = 2 * i; j < Maxn; j += i) {
        phi[j] -= phi[i];
    }
}
By the way, just out of interest, there's also an O(n) algorithm to achieve the same. We know Euler's product formula for the totient is
phi(n) = n * product((p - 1) / p), where p ranges over the distinct primes that divide n
For example,
phi(18) = 18 * ((2-1)/2) * ((3-1)/3) = 18 * (2/6) = 18 * (1/3) = 6
Now consider a number m = n * p for some prime p.
phi(n) = n * product((p' - 1) / p'), where p' ranges over the distinct primes that divide n
If p divides n, since p already appears in the calculation for phi(n), we do not need to add it to the product section, rather we just add it to the initial multiplier
phi(m) = phi(p * n) = p * n * product((p' - 1) / p') = p * phi(n)
Otherwise, if p does not divide n, we need to use the new prime,
phi(m) = phi(p * n) = p * n * product((p' - 1) / p') * ((p - 1) / p)
       = (p - 1) * n * product((p' - 1) / p')
       = (p - 1) * phi(n)
Either way, we can calculate the totient of a number multiplied by a prime only from the prime and the number's own totient, which can be aggregated in O(n) by repeatedly multiplying the numbers we've generated so far by the next prime we find until we reach Maxn. We find the next prime by incrementing an index to the successor we haven't recorded a totient for (prime generation here is a benefit).
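Here is a hedged Java sketch of that O(n) idea (my own code, not from the original answer), written as the usual linear sieve: every composite m = i * p is generated exactly once through its smallest prime factor p, and phi(m) is derived from phi(i) using the two cases above.
static int[] linearPhi(int Maxn) {
    int[] phi = new int[Maxn];
    boolean[] composite = new boolean[Maxn];
    int[] primes = new int[Maxn];
    int cnt = 0;
    if (Maxn > 1) phi[1] = 1;
    for (int i = 2; i < Maxn; i++) {
        if (!composite[i]) {                  // i is prime
            primes[cnt++] = i;
            phi[i] = i - 1;
        }
        for (int k = 0; k < cnt && (long) i * primes[k] < Maxn; k++) {
            int m = i * primes[k];
            composite[m] = true;
            if (i % primes[k] == 0) {         // p divides i: phi(m) = p * phi(i)
                phi[m] = phi[i] * primes[k];
                break;                        // stop so m is only reached via its smallest prime
            } else {                          // p doesn't divide i: phi(m) = (p - 1) * phi(i)
                phi[m] = phi[i] * (primes[k] - 1);
            }
        }
    }
    return phi;
}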

Identify number of iterations of while loop

I have this code from my computer science class:
int input=15;
while (input < n ) { input = input *3;}
This code has the ceiling of log3(n/15) loops. How can we obtain this result?
I think he is talking about the analytical derivation of the iteration count. It goes something like this (it's been a long time since I did logarithms):
15 * 3^x = n // x would be the number of iterations until value reaches n
ln(15*(3^x)) = ln(n)
ln(15) + ln(3^x) = ln(n)
ln(15) + x*ln(3) = ln(n)
x = (ln(n) - ln(15)) / ln(3)
x = ln(n/15) / ln(3)
x = log3(n/15) / log3(3)
x = log3(n/15)
For what values of n does the code loop k times?
It must be that 15 < n, and 15*3 < n and 15*3*3 < n and .... and 15*3^(k-1) < n. Also, it must be that 15*3^k >= n (otherwise the code would do at least one more loop).
That is, 3^k >= n/15 > 3^(k-1), and taking logs (base 3), k >= log3(n/15) > k-1.
Thus k is the smallest integer greater than or equal to log3(n/15), or equivalently: k = ceil(log3(n/15)) as required.
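For instance, with n = 100 the loop runs 15 → 45 → 135, i.e. k = 2 iterations, and ceil(log3(100/15)) = ceil(1.72...) = 2; with n = 45 it runs once (15 → 45), and ceil(log3(45/15)) = ceil(1) = 1, matching the formula in both cases.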

Running Time of this Simple Program - Time Complexity

I am trying to figure out what the time complexity of this simple program is, but I can't seem to understand what would be the best way to do this.
I have written down the time complexity side by side for each line
1 public int fnA (int n) {
2 int sum = 0; O(1)
3 for (int i = 0; i < n; i++) { O(n)
4 int j = i; O(n)
5 int product = 1; O(1)
6
7 while (j > 1) { O(n)
8 product ∗= j; O(log n)
9 j = j / 2; O(log n)
10 }
11 sum += product; O(1)
12 }
13 return sum; O(1)
14 }
Am I correct to assume these running times and that the final running time is: O(n)
If not, would somebody be able to explain where it is I am going wrong?
Overall:
1 + n + n + 1 + n + logn + logn + 1 + 1
= 3n + 2logn + 4
Final: O(n)
Time complexity for that algorithm is O(NlogN).
The for loop is executed N times (from 0 to N).
The while loop is executed logN times since your are dividing the number to half each time.
Since your are executing the while inside the for, your are executing a logN operation N times, from there it is the O(NlogN).
All remaining operations (assignment, multiplication, division, addition) you can assume take O(1).
The crux of the above program is the while loop: it is the defining factor, the rest of the lines have complexity no more than O(n), and we assume arithmetic operations run in O(1) time.
while (j > 1) {
product ∗= j;
j = j / 2;
}
The above loop will have a run time of O(log(j)), and j varies from 1 to n, so it's the series...
-> O(log(1) + log(2) + log(3) + log(4).....log(n))
-> O(log(1*2*3*4...*n))
-> O(log(n!))
and O(log(n!)) is equal to O(n log(n))
For a proof of the above (log(n!) = Θ(n log(n))), see Stirling's approximation.
No: for every i, the inner loop runs about log n times, and hence for n elements the total complexity is n log n.
You already know that the following loop takes log n:
while (j > 1) {
product ∗= j;
j = j / 2;
}
Now, this particular loop is executed for every i, so it runs n times in total, and the overall complexity becomes n log n.
To start with, you could count all operations. For example:
1 public int fnA (int n) {
2 int sum = 0; 1
3 for (int i = 0; i < n; i++) {
4 int j = i; n
5 int product = 1; n
6
7 while (j > 1) {
8 product ∗= j; ?
9 j = j / 2; ?
10 }
11 sum += product; n
12 }
13 return sum; 1
14 }
Now we can do the counting: the while-loop lines run about log2(i) times for each i, which sums to roughly n log(n), so the total is about 2 + 3n + n log(n).
In a lot of programs the counting is more complex and usually has one outstanding higher-order term, for example 2 + 3n + 2n^2. When talking about performance we really care about the case when n is large, because when n is small the sum is small anyway. When n is large, the higher-order term dwarfs the rest, so in this example 2n^2 is really the term that matters. That's the concept of tilde approximation.
With that in mind, usually one can quickly identify the portion of code that gets executed most often and use its count to represent the overall time complexity. In the example given by the OP, it would look like this:
for (int i = 0; i < n; i++) {
    for (int j = i; j > 1; j /= 2)
        product *= j;
}
which gives ∑ log2(i) summed over i from 1 to n. Usually the counting involves discrete mathematics; one trick I have learned is to replace the sum with an integral and do calculus: ∫ log2(x) dx from 1 to n is approximately n log(n).
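As a quick empirical check (my own harness, not part of any of the answers above), one can count the inner-loop iterations of fnA directly and compare them with n*log2(n):
public class InnerLoopCount {
    // counts how many times the body of the while loop in fnA executes
    static long countOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++) {
            int j = i;
            while (j > 1) {
                j = j / 2;
                ops++;
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 10_000, 100_000}) {
            double estimate = n * (Math.log(n) / Math.log(2));
            System.out.printf("n=%d actual=%d n*log2(n)=%.0f%n", n, countOps(n), estimate);
        }
    }
}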

restricted subset sum into a specified range

I have an array which contains only two types of numbers (x and x-1), e.g. {5,5,4,4,5,5,5}, and I am given a range like 12-14 (inclusive). I already know the length of the array is a constant 7, and I also know how many elements of each type there are in the array (count).
Now I need to find whether there is any combination of elements in the array whose sum falls into that range.
All I need is the number of elements in the subset whose sum falls in that range.
I have solved this problem with brute force in the following way, but it is very inefficient.
Here count is the number of (x-1)'s in the array:
for (int i = 0; i <= 7 - count; i++) {
    for (int j = 0; j <= count; j++) {
        if (x * i + (x - 1) * j >= min && x * i + (x - 1) * j <= max) {
            output1 = i + j;
        }
    }
}
Could someone please tell me if there is a better way of solving this?
Example:
The array given is {5,5,4,4,5,5,5} and the range given is 12-14, so I would pick the subset {5,5,4}, whose sum is 14, and the answer for the number of elements in the subset would be 3. {5,4,4} could also be picked in this solution.
You can improve your brute force by using some analysis.
with N being the array length and n being the result:
0 <= n <=N
0 <= j <= count
0 <= i <= N - count
n = i + j -> j <= n
sum = x * i + (x - 1) * j = x * n - j
min <= x * n - j <= max -> x * n - max <= j <= x * n - min
min <= x * n - j -> n >= (min + j) / x >= min / x
x * n - j <= max -> n <= (max + j) / x <= (max + count) / x
Summing up, you can use your loops but with a tighter range:
for (int n = min / x; n <= min(N, (max + count) / x); n++)
{
    for (int j = max(0, x * n - max); j <= min(count, x * n - min, n); j++)
    {
        sum = x * n - j;
        if (sum >= min && sum <= max)
        {
            output1 = n;
        }
    }
}
P.S.: here's some picture that may help to understand the idea
graph http://i.zlowiki.ru/110917_768e5221.jpg
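Building on the observation above that a subset of n elements, j of which equal x-1, has sum x*n - j, here is a hedged Java sketch (my own, names are illustrative) that checks each candidate size n and runs in O(N) overall. For the example in the question (x = 5, N = 7, count = 2, range 12-14) it returns 3.
static int findSubsetSize(int x, int N, int count, int min, int max) {
    for (int n = 0; n <= N; n++) {
        int jLow = Math.max(0, n - (N - count));   // at most N - count elements can equal x
        int jHigh = Math.min(n, count);            // at most count elements can equal x - 1
        if (jLow > jHigh) continue;                // can't even pick n elements this way
        int lo = x * n - jHigh;                    // smallest reachable sum with n elements
        int hi = x * n - jLow;                     // largest reachable sum with n elements
        if (hi >= min && lo <= max) return n;      // reachable interval meets [min, max]
    }
    return -1;                                     // no subset size works
}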
Say you want to find out the number of a's and b's which add up to n. When testing a number of a's, you only need a division to find the number of b's.
i.e.
number of a * a + number of b * b = n
so
number of b = (n - number of a * a)/b;
EDIT: If this number is a whole number you have a solution.
To test if the division is a whole number you can do
(`n` - `number of a` * `a`) % `b` == 0
If the spread of the range is smaller than b you can do
(`min` - `number of a` * `a`) % `b` <= `max` - `min`
If the spread is greater than or equal to b, you always have a solution for that number of a's.
I am assuming b is positive.
