Divide and conquer sum of array iterative - java

Is it possible to get the sum of an array using divide and conquer? I've tried it, but I always miss some numbers, or I calculate a number twice.
int[] arr = new int[]{1, 2, 3, 4, 5};

public int sum(int[] arr) {
    int begin = 0;
    int end = arr.length - 1;
    int counter = 0;
    while (begin <= end) {
        int mid = (begin + end) / 2;
        counter += arr[end] + arr[mid];
        end = mid - 1;
    }
    return counter;
}

Of course, divide-and-conquer computation of the array's sum is possible, but I can't see a use case for it. You're still computing at least the same number of additions, and you still run into problems if the sum of the array is greater than Integer.MAX_VALUE, ...
There is also no performance benefit, as Codor showed in his answer to a related question.
Starting with Java 8, you can compute the sum in one line using streams.
int sum = Arrays.stream(array).sum();
The main flaw in your code above is that in the first iteration you only sum up index mid (2) and end (4). After that you move to the lower bracket (mid = 0 and end = 1), so you're missing index 3. This problem becomes even more pronounced with larger arrays, because you skip even more indices after the while loop's first iteration.
A quick Google search brought up this nice-looking talk about the general divide-and-conquer principle, which relies on recursion. Maybe it can guide you to a working solution.
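For what it's worth, here is a minimal recursive divide-and-conquer sketch (illustrative only, not a drop-in replacement for your exact method signature). It splits the range in half and sums both halves, so every index is visited exactly once:

// Sums arr[lo..hi] (inclusive) by splitting the range in half and recursing on both halves.
static int sum(int[] arr, int lo, int hi) {
    if (lo > hi)  return 0;        // empty range
    if (lo == hi) return arr[lo];  // single element: nothing left to divide
    int mid = lo + (hi - lo) / 2;
    return sum(arr, lo, mid) + sum(arr, mid + 1, hi);
}

// Usage: sum(new int[]{1, 2, 3, 4, 5}, 0, 4) returns 15.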

Related

Divide N people into K groups: Why is the big O of this algorithm O(N^2 * K)?

The description of the problem and its solution(s) can be found here
https://www.geeksforgeeks.org/count-the-number-of-ways-to-divide-n-in-k-groups-incrementally/
Basically the problem is given N people, how many ways can you divide them into K groups, such that each group is greater than or equal in number of people to the one that came before it?
The solution is to recurse through every possibility, and its complexity can be cut down from O(N^K) to O(N^2 * K) through dynamic programming.
I understand the complexity of the plain recursive solution, but have trouble understanding why the dynamic programming solution has O(N^2 * K) complexity. How does one come to this conclusion about the dynamic programming solution's time complexity? Any help would be appreciated!
First of all, big O notation gives us an idea about the relation between two functions, t(n)/i(n), when n -> infinity. To be more specific, it's an upper bound on that relation, meaning f(n) >= t(n)/i(n). Here t(n) stands for the speed of growth of the time spent on execution, and i(n) describes the speed of growth of the input. In function space (where we work with functions rather than numbers and treat functions almost like numbers: we can divide or compare them, for example) the relation between two elements is also a function. Hence, t(n)/i(n) is a function.
Secondly, there are two methods of determining bounds for that relation.
The empirical, observational approach goes like this. Let's see how much time it takes to execute the algorithm with 10 pieces of input. Then let's increase the input to 100 pieces, then to 1000, and so on. The speed of growth of the input i(n) is exponential (10^1, 10^2, 10^3, ...). Suppose we get an exponential speed of growth of time as well (10^1 sec, 10^2 sec, 10^3 sec, ... respectively).
That means t(n)/i(n) = exp(n)/exp(n) = 1, n -> infinity (for the sake of scientific purity, we can divide and compare functions only when n -> infinity; that doesn't affect the practicality of the method, though). We can say, at least (remember it's an upper bound), that the execution time of our algorithm doesn't grow faster than its input does. We might instead have got, say, a squared exponential speed of growth of time, exp^2(n). In that case t(n)/i(n) = exp^2(n)/exp(n) = a^(2n)/a^n = a^n = exp(n), a > 1, n -> infinity, which means our time complexity is O(exp(n)); big O notation only reminds us that it's not necessarily a tight bound. Also, it's worth pointing out that it doesn't matter which speed of growth of input we choose. We might have wanted to increase our input linearly. Then t(n)/i(n) = exp(n)*n/n = exp(n) would express the same thing as t(n)/i(n) = exp^2(n)/exp(n) = a^(2n)/a^n = exp(n), a > 1. What matters here is the quotient.
The second approach is theoretical and mostly used in the analysis of relatively obvious cases. Say we have the piece of code from the example:
// DP table
static int[][][] dp = new int[500][500][500];

// Function to count the number of ways to divide the number N
// into k groups such that each group has at least as many
// elements as the previous one
static int calculate(int pos, int prev, int left, int k)
{
    // Base case: all k groups have been formed
    if (pos == k)
    {
        if (left == 0)
            return 1;
        else
            return 0;
    }

    // N has been used up in fewer than k groups
    if (left == 0)
        return 0;

    // If the subproblem has already been solved, reuse the value
    if (dp[pos][prev][left] != -1)
        return dp[pos][prev][left];

    int answer = 0;

    // Try all possible group sizes greater than or equal to prev
    for (int i = prev; i <= left; i++)
    {
        answer += calculate(pos + 1, i, left - i, k);
    }
    return dp[pos][prev][left] = answer;
}

// Function to count the number of ways to divide the number N into groups
static int countWaystoDivide(int n, int k)
{
    // Initialize the DP table with -1
    for (int i = 0; i < 500; i++)
    {
        for (int j = 0; j < 500; j++)
        {
            for (int l = 0; l < 500; l++)
                dp[i][j][l] = -1;
        }
    }
    return calculate(0, 1, n, k);
}
The first thing to notice here is the 3-D array dp. It hints at the time complexity of a DP algorithm because, usually, we traverse each of its cells once. So we are concerned with the size of the array. It's declared with size 500*500*500, which by itself doesn't tell us much, because 500 is a constant, not a function of the input; it's done that way for simplicity. Effectively, dp has size k*n*n, under the assumption that k <= 500 and n <= 500.
Let's prove it. The method static int calculate(int pos, int prev, int left, int k) has three actual variables, pos, prev and left, while k remains constant. The range of pos is 0 to k, because the recursion starts with calculate(0, 1, n, k) and the base case is if (pos == k). The range of prev is 1 to left, because it starts from 1 and is iterated up to left in for (int i = prev; i <= left; i++). Finally, the range of left is n down to 0, because it starts from n in calculate(0, 1, n, k) and is decreased towards 0 by that same loop. To recap, the number of possible combinations of pos, prev and left is simply their product, k*n*n.
The second thing is to prove that each range of pos, prev and left is traversed only once. From the code, it can be determined by analysing this block:
for (int i = prev; i <= left; i++)
{
    answer += calculate(pos + 1, i, left - i, k);
}
All three variables get changed only here. pos grows from 0, increasing by 1 at each step. For each particular value of pos, prev is increased by 1 from prev up to left; and for each particular combination of pos and prev, left is decreased by i, which ranges from prev to left.
The idea behind this approach is that once we iterate over an input variable by some rule, we get the corresponding time complexity. We could, for example, iterate over a variable by halving the range at each step; in that case we would get logarithmic complexity. Or we could step on every element of the input, and then we would get linear complexity.
In other words, from common sense we assume a minimum time complexity of t(n)/i(n) = 1 for every algorithm, meaning that t(n) and i(n) grow equally fast; that corresponds to doing nothing with the input. Once we do something with the input, t(n) becomes f(n) times bigger than i(n), and by the logic shown above we need to estimate f(n).
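To make the "each (pos, prev, left) state is expanded only once" argument concrete, here is a small, purely illustrative Java harness (class and variable names are mine, and the memo is a HashMap rather than the 3-D array) that runs the same recursion and counts how many distinct states are actually computed; the count is bounded by k*n*n:

import java.util.HashMap;
import java.util.Map;

public class StateCount {
    static Map<String, Integer> memo = new HashMap<>();
    static long expanded = 0; // distinct (pos, prev, left) states actually computed

    static int calculate(int pos, int prev, int left, int k) {
        if (pos == k) return left == 0 ? 1 : 0;
        if (left == 0) return 0;
        String key = pos + "," + prev + "," + left;
        Integer cached = memo.get(key);
        if (cached != null) return cached;  // revisited state: O(1) lookup, no new work
        expanded++;                         // first and only time this state is expanded
        int answer = 0;
        for (int i = prev; i <= left; i++)
            answer += calculate(pos + 1, i, left - i, k);
        memo.put(key, answer);
        return answer;
    }

    public static void main(String[] args) {
        int n = 20, k = 5;
        System.out.println("ways = " + calculate(0, 1, n, k));
        System.out.println("states expanded = " + expanded + ", bound k*n*n = " + (long) k * n * n);
    }
}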

Is this method to get the closest number in a sorted List most effective?

I have large arrays of integers (with sizes between 10'000 and 1'400'000). I want to get the first integer bigger than a given value. The value is never inside the array.
I've looked for various solutions, but I have only found:
methods that examine every value and are not designed for sorted lists or arrays (O(n) time complexity);
methods that are recursive and/or not designed for very large lists or arrays (O(n) or worse time complexity, in other languages though, so I'm not sure).
I've designed my own method. Here it is:
int findClosestBiggerInt(int value, int[] sortedArray) {
    if (sortedArray[0] > value ||
        value > sortedArray[sortedArray.length - 1]) // for my application's convenience only. It could also return the last.
        return sortedArray[0];
    int exp = (int) (Math.log(sortedArray.length) / Math.log(2)),
        index = (int) Math.pow(2, exp);
    boolean dir; // true = ascend, false = descend.
    while (exp >= 0) {
        dir = sortedArray[Math.min(index, sortedArray.length - 1)] < value;
        exp--;
        index = (int) (index + (dir ? 1 : -1) * Math.pow(2, exp));
    }
    int answer = sortedArray[index];
    return answer > value ? answer : sortedArray[index + 1];
}
It has a O(log n) time complexity. With an array of length 1'400'000, it will loop 21 times inside the while block. Still, I'm not sure that it cannot be improved.
Is there a more effective way to do it, without the help of external packages ? Any time saved is great because this calculation occurs quite frequently.
Well here is one approach that uses a map instead of an array.
int categorizer = 10_000;
// Assume this is your array of ints (generated randomly here; r is a java.util.Random).
Random r = new Random();
int[] arrayOfInts = r.ints(4_000, 10_000, 1_400_000).toArray();
You can group them in a map like so.
Map<Integer, List<Integer>> ranges =
        Arrays.stream(arrayOfInts).sorted().boxed().collect(
                Collectors.groupingBy(n -> n / categorizer));
Now, when you want to find the next element higher, you can get the list that would contain the number.
Say you want the next number larger than 982,828
int target = 982_828;
List<Integer> list = ranges.get(target / categorizer); // gets the list at key = 98
Now just process that list with your favorite method. One note: in some circumstances, depending on the gaps, the next larger number could be in one of the lists that come after this one. You would need to account for that, perhaps by adjusting how the numbers are categorized or by searching subsequent lists, as sketched below. Still, this can greatly reduce the size of the lists you're working with.
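To illustrate that last point, here is a hedged sketch of how the lookup could fall back to later buckets when needed (it assumes the ranges map and categorizer from above; each bucket list is in ascending order because the stream was sorted before grouping):

// Requires: import java.util.List; import java.util.Map; import java.util.NoSuchElementException;
// Returns the smallest element strictly greater than target, scanning later buckets if needed.
static int nextBigger(Map<Integer, List<Integer>> ranges, int categorizer, int target) {
    int maxKey = ranges.keySet().stream().max(Integer::compare).orElse(target / categorizer);
    for (int key = target / categorizer; key <= maxKey; key++) {
        List<Integer> bucket = ranges.get(key);
        if (bucket == null) continue;  // no values fell into this range
        for (int n : bucket) {         // each bucket is already sorted ascending
            if (n > target) return n;
        }
    }
    throw new NoSuchElementException("no element greater than " + target);
}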
As Gene's answer indicates, you can do this with binary search. The built-in java.util.Arrays class provides a binarySearch method to do that for you:
int findClosestBiggerInt(final int value, final int[] sortedArray) {
    final int index = Arrays.binarySearch(sortedArray, value + 1);
    if (index >= 0) {
        return sortedArray[index];
    } else {
        return sortedArray[-(index + 1)];
    }
}
You'll find that to be much faster than the method you wrote; it's still O(log n) time, but the constant factors will be much lower, because it doesn't perform expensive operations like Math.log and Math.pow.
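For example, with purely illustrative values:

int[] sortedArray = {3, 8, 15, 42, 99};
// binarySearch looks for 11, misses, and reports insertion point 2, so this returns 15.
int next = findClosestBiggerInt(10, sortedArray);
System.out.println(next); // 15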
Binary search is easily modified to do what you want.
Standard binary search for exact match with the target maintains a [lo,hi] bracket of integers where the target value - if it exists - is always inside. Each step makes the bracket smaller. If the bracket ever gets to size zero (hi < lo), the element is not in the array.
For this new problem, the invariant is exactly the same except for the definition of the target value. We must take care never to shrink the bracket in a way that might eliminate the next bigger element.
Here's the "standard" binary search:
int search(int tgt, int[] a) {
    int lo = 0, hi = a.length - 1;
    // loop while the bracket is non-empty
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        // if a[mid] is below the target, ignore it and everything smaller
        if (a[mid] < tgt) lo = mid + 1;
        // if a[mid] is above the target, ignore it and everything bigger
        else if (a[mid] > tgt) hi = mid - 1;
        // else we've hit the target
        else return mid;
    }
    // The bracket is empty. Return "nothing."
    return -1;
}
In our new case, the part that obviously needs a change is:
// if a[mid] is above the target, ignore it and everything bigger
else if (a[mid] > tgt) hi = mid - 1;
That's because a[mid] might be the answer. We can't eliminate it from the bracket. The obvious thing to try is keep a[mid] around:
// if a[mid] is above the target, ignore everything bigger
else if (a[mid] > tgt) hi = mid;
But now we've introduced a new problem in one case. If lo == hi, i.e. the bracket has shrunk to 1 element, this if doesn't make progress! It sets hi = mid = lo + (hi - lo) / 2 = lo. The size of the bracket remains 1. The loop never terminates.
Therefore, the other adjustment we need is to the loop condition: stop when the bracket reaches size 1 or less:
// loop while the bracket has more than 1 element.
while (lo < hi) {
For a bracket of size 2 or more, lo + (hi - lo) / 2 is always smaller than hi. Setting hi = mid makes progress.
The last modification we need is checking the bracket after the loop terminates. There are now three cases rather than one in the original algorithm: the bracket is empty; it contains one element, which is the answer; or it contains one element that is not the answer.
It's easy to sort these out just before returning. In all, we have:
int search(int tgt, int[] a) {
    int lo = 0, hi = a.length - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < tgt) lo = mid + 1;
        else if (a[mid] > tgt) hi = mid;
        else return mid;
    }
    return lo > hi || a[lo] < tgt ? -1 : lo;
}
As you point out, for a 1.4 million element array, this loop will execute no more than 21 times. My C compiler produces 28 instructions for the whole thing; the loop is 14. 21 iterations ought to be a handful of microseconds. It requires only small constant space and generates zero work for the Java garbage collector. Hard to see how you'll do better.
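As a quick sanity check, here is a tiny usage example (the snippet above is also valid Java; values are illustrative):

int[] a = {2, 5, 9, 14, 20};
int idx = search(10, a);            // bracket shrinks to index 3
System.out.println(a[idx]);         // 14, the first element >= 10
System.out.println(search(25, a));  // -1, nothing in the array is >= 25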

Sorting by Subtraction

So I was working on this sorting algorithm in Java, and I was wondering if anybody has seen anything like it before. Specifically, this algorithm is meant to sort very large lists with a very small range of numbers. If you have seen it before, or have any suggestions to enhance the algorithm, please comment below. I have the code here:
public static int[] sort(int[] nums)
{
    int lowest = Integer.MAX_VALUE;
    for (int n : nums)
    {
        if (n < lowest)
            lowest = n;
    }

    int index = 0;
    int down = 0;
    while (index < nums.length)
    {
        for (int i = index; i < nums.length; i++)
        {
            if (nums[i] == lowest)
            {
                int temp = nums[i] + down;
                nums[i] = nums[index];
                nums[index] = temp;
                index++;
            }
            else
                nums[i]--;
        }
        down++;
    }
    return nums;
}
If I'm not mistaken, that is your standard-issue BubbleSort. It's simple to implement but has poor performance: O(n^2). Notice the two nested loops: as the size of the array increases, the runtime of the algorithm will increase quadratically.
It's named Bubble Sort because the smallest values will "bubble" to the front of the array, one at a time. You can read more about it on Wikipedia.
So the algorithm seems to work but it does a lot of unnecessary work in the process.
Basically, you are forcing yourself to subtract from a number x times before you add x back and swap it into place, where x is the difference between that number and the lowest number in the array. Take [99, 1] for example. With your algorithm you would update the array to [98, 1] in the first for-loop iteration, and in the next you would make the swap, giving [1, 98]. Then you have to make 97 more passes to bring your down variable up to 98 and the array to the [1, 1] state, after which you add 98 back and swap the element with itself. It's an interesting technique for sure, but it's not very efficient.
The best algorithm for any given job really depends on what you know about your data. Look into other sorting algorithms to get a feel for what they do and why they do it. Make sure that you walk through the algorithm you make and try to get rid of unnecessary steps.
To enhance the algorithm, I would first get rid of finding the lowest value in the set and remove the addition and subtraction steps. If you know that your numbers will all be integers in a given range, look into bucket sorting; otherwise you can try merge sort or quicksort.
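For instance, since the poster says the lists are very large but the range of values is very small, a counting-sort sketch along those bucket-sort lines might look like this (illustrative; it assumes max - min is small enough for the count array to fit in memory):

// Counting sort: O(n + range) time, where range = max - min + 1.
public static int[] countingSort(int[] nums) {
    if (nums.length == 0) return nums;
    int min = nums[0], max = nums[0];
    for (int n : nums) {                     // find the value range
        if (n < min) min = n;
        if (n > max) max = n;
    }
    int[] counts = new int[max - min + 1];
    for (int n : nums) counts[n - min]++;    // tally how often each value occurs
    int index = 0;
    for (int v = 0; v < counts.length; v++)  // write the values back in sorted order
        for (int c = 0; c < counts[v]; c++)
            nums[index++] = v + min;
    return nums;
}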

No. of ways to divide an array

I want to find the number of ways to divide an array into 3 contiguous parts such that the sums of the three parts are equal.
-10^9 <= A[i] <= 10^9
My approach:
Taking Input and Checking for Base Case:
for (int i = 0; i < n; i++) {
    a[i] = in.nextLong();
    sum += a[i];
}
if (sum % 3 != 0) System.out.println("0");
If the answer is not decided above, then I form the prefix and suffix sums.
for (int i = 1; i <= n - 2; i++) {
    xx += a[i - 1];
    if (xx == sum / 3) {
        dp[i] = 1;
    }
}
Suffix sum and updating the binary indexed tree:
for (int i = n; i >= 3; i--) {
    xx += a[i - 1];
    if (xx == sum / 3) {
        update(i, 1, suffix);
    }
}
And now simply looping over the array to find the total number of ways:
int ans = 0;
for (int i = 1; i <= n - 2; i++) {
    if (dp[i] == 1) {
        ans += (query(n, suffix) - query(i + 1, suffix));
        // Checking for the sum/3 in the array where index > i+1
    }
}
I am getting the wrong answer with the above approach and I don't know where I have made a mistake. Please help me correct it.
Update and query functions:
public static void update(int i, int value, int[] arr) {
    while (i < arr.length) {
        arr[i] += value;
        i += i & -i;
    }
}

public static int query(int i, int[] arr) {
    int ans = 0;
    while (i > 0) {
        ans += arr[i];
        i -= i & -i;
    }
    return ans;
}
As far as your approach is concerned, it's correct. But there are some points because of which it might give a wrong answer (WA):
It's very likely that sum overflows int, since each element can have a magnitude of 10^9, so use a 64-bit type (long).
Make sure that the suffix and dp arrays are initialized to 0.
Having said that, using a BIT here is overkill, because this can be done in O(n) compared to your O(n log n) solution (though that hardly matters if you are submitting to an online judge).
For the O(n) approach, just take your suffix[] array. As you have done, mark suffix[i] = 1 if the sum from i to n is sum/3; traversing the array backwards, this can be done in O(n).
Then traverse backwards once more doing suffix[i] += suffix[i+1] (apart from the base case i = n). Now suffix[i] stores the number of indices i <= j <= n such that the sum from index j to n is sum/3, which is what you were trying to achieve with the BIT.
So what I suggest is to either write a brute force or this simple O(n) version (sketched below) and check your code against it, because your approach as such is correct, and debugging is not really what Stack Overflow is for.
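For reference, a minimal sketch of that O(n) suffix-count idea (1-indexed like your arrays; names are illustrative, sum % 3 == 0 is assumed to have been checked already, and all three parts are taken to be non-empty):

long target = sum / 3;
long running = 0;
int[] suffix = new int[n + 2];                   // suffix[i] = number of j >= i with sum(a[j..n]) == target
for (int i = n; i >= 1; i--) {
    running += a[i];
    suffix[i] = suffix[i + 1] + (running == target ? 1 : 0);
}
long ans = 0;
long prefix = 0;
for (int i = 1; i <= n - 2; i++) {               // first part is a[1..i]
    prefix += a[i];
    if (prefix == target) ans += suffix[i + 2];  // third part starts at some j >= i + 2
}
System.out.println(ans);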
First, we calculate an array dp, with dp[i] = sum of a[0..i]; this can be done in O(n):
long[] dp = new long[n];
for (int i = 0; i < n; i++) {
    dp[i] = a[i];
    if (i > 0)
        dp[i] += dp[i - 1];
}
Second, let's say the total sum of the array is x; we need to find the positions where dp[i] == x/3.
For each position i which has dp[i] == 2*x/3, we add to the final result the number of indices j < i for which dp[j] == x/3.
int count = 0;
int result = 0;
for (int i = 0; i < n - 1; i++) {
    if (dp[i] == x / 3)
        count++;
    else if (dp[i] == x * 2 / 3)
        result += count;
}
The answer is in result.
What's wrong with your approach is this part:
if (dp[i] == 1) {
    ans += (query(n, suffix) - query(i + 1, suffix));
    // Checking for the sum/3 in the array where index > i+1
}
This is wrong, it should be
(query(n, suffix) - query(i, suffix));
Because, we only need to remove those from 1 to i, not 1 to i + 1.
Not only that, this part:
for (int i = 1; i <= n - 2; i++) {
    // ....
}
Should be i <= n - 1;
Similarly, this part, for (int i = n; i >= 3; i--), should be i >= 1.
And the first part:
for (int i = 0; i < n; i++) {
    a[i] = in.nextLong();
    sum += a[i];
}
Should be
for (int i = 1; i <= n; i++) {
    a[i] = in.nextLong();
    sum += a[i];
}
There are a lot of small errors in your code, and you need to put effort into debugging them first; jumping straight to asking here is not a good idea.
In the question asked, we need to find three contiguous parts in an array whose sums are the same.
I will mention the steps along with the code snippets that will solve the problem for you.
Get the sum of the array by doing a linear O(n) scan and compute sum/3.
Start scanning the given array from the end. At each index we store the number of ways we can get a sum equal to sum/3, i.e. if end[i] is 3, then there are 3 suffixes starting at index i or later (up to n, the array length) whose sum is sum/3.
The third and final step is to scan from the start and find the indices where the prefix sum is sum/3. On finding such an index i, add end[i+2] to the solution variable (initialized to zero).
What we are doing here is traversing the array from the start up to len(array) - 3. On finding a prefix sum equal to sum/3 at, say, index i, we have the first part that we require.
Now, don't worry about the second part; just add end[i+2] to the solution variable. end[i+2] tells us the total number of ways, starting from index i+2 or later and going to the end, to get a sum equal to sum/3 for the third part.
Here, we have taken care of the first and the third parts, and in doing so we have also taken care of the second part, which will by default equal sum/3. The solution variable will be the final answer to the problem.
Given below are the code snippets for a better understanding of the above algorithm.
Here we do the backward scan to store, for each index, the number of ways to get sum/3 from the end:
long long int *end = (long long int *)calloc(numbers, sizeof(long long int));
long long int temp = array[numbers - 1];
if (temp == sum / 3) {
    end[numbers - 1] = 1;
}
for (i = numbers - 2; i >= 0; i--) {
    end[i] = end[i + 1];
    temp += array[i];
    if (temp == sum / 3) {
        end[i]++;
    }
}
Once we have the end array we do the forward loop and get our final solution
long long int solution = 0;
temp = 0;
for (i = 0; i < numbers - 2; i++) {
    temp += array[i];
    if (temp == sum / 3) {
        solution += end[i + 2];
    }
}
solution stores the final answer i.e. the number of ways to split the array into three contiguous parts having equal sum.

Project Euler 14: Issue with array indexing in a novel solution

The problem in question can be found at http://projecteuler.net/problem=14
I'm trying what I think is a novel solution. At least it is not brute-force. My solution works on two assumptions:
1) The fewer times you have to iterate through sequences, the quicker you'll get the answer. 2) A sequence will necessarily be longer than the sequence starting from any of its elements.
So I implemented an array of all possible numbers that could appear in the sequence. The highest number starting a sequence is 999999 (as the problem only asks you to test numbers less than 1,000,000); therefore the highest possible number in any sequence is 3 * 999999 + 1 = 2999998 (which is even, so would then be divided by 2 for the next number in the sequence). So the array need only be of this size. (In my code the array is actually 2999999 elements, as I have included 0 so that each number matches its array index. However, this isn't necessary, it is for comprehension).
So once a number comes in a sequence, its value in the array becomes 0. If subsequent sequences reach this value, they will know not to proceed any further, as it is assumed they will be longer.
However, when I run the code I get the following error, at the line introducing the "while":
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3188644
For some reason it is trying to access an index of the above value, which shouldn't be reachable as it is over the possible maximum of 2999998. Can anyone understand why this is happening?
Please note that I have no idea if my assumptions are actually sound. I'm an amateur programmer and not a mathematician. I'm experimenting. Hopefully I'll find out whether it works as soon as I get the indexing correct.
Code is as follows:
private static final int MAX_START = 999999;
private static final int MAX_POSSIBLE = 3 * MAX_START + 1;

public long calculate()
{
    int[] numbers = new int[MAX_POSSIBLE + 1];
    for (int index = 0; index <= MAX_POSSIBLE; index++)
    {
        numbers[index] = index;
    }

    int longestChainStart = 0;
    for (int index = 1; index <= numbers.length; index++)
    {
        int currentValue = index;
        if (numbers[currentValue] != 0)
        {
            longestChainStart = currentValue;
            while (numbers[currentValue] != 0 && currentValue != 1)
            {
                numbers[currentValue] = 0;
                if (currentValue % 2 == 0)
                {
                    currentValue /= 2;
                }
                else
                {
                    currentValue = 3 * currentValue + 1;
                }
            }
        }
    }
    return longestChainStart;
}
Given that you can't (easily) put a limit on the possible maximum number of a sequence, you might want to try a different approach. I might suggest something based on memoization.
Suppose you've got an array of size 1,000,000. Each entry i will represent the length of the sequence from i down to 1. Remember, you don't need the sequences themselves, only the lengths of the sequences. You can start filling in your table at 1, whose length is 0. Starting at 2, you've got length 1, and so on. Now, say we're looking at entry n, which is even. You can look at the length of the sequence at entry n/2 and just add 1 to that for the value at n. If you haven't calculated n/2 yet, just do the normal calculations until you get to a value you have calculated. A similar process holds if n is odd.
This should bring your algorithm's running time down significantly, and prevent any problems with out-of-bounds errors.
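A minimal sketch of that memoization idea might look like this (the table size and all names are illustrative; chain values at or above the table bound simply aren't cached):

public class Euler14 {
    // lengths[i] caches the number of Collatz steps from i down to 1; 0 means "not computed yet".
    static int[] lengths = new int[1_000_000];

    static long chainLength(long n) {
        if (n == 1) return 0;
        if (n < lengths.length && lengths[(int) n] != 0) return lengths[(int) n];
        long next = (n % 2 == 0) ? n / 2 : 3 * n + 1;  // use long: intermediate values exceed int range
        long len = 1 + chainLength(next);
        if (n < lengths.length) lengths[(int) n] = (int) len;
        return len;
    }

    public static void main(String[] args) {
        long best = 1, bestLength = 0;
        for (long start = 2; start < 1_000_000; start++) {
            long len = chainLength(start);
            if (len > bestLength) { bestLength = len; best = start; }
        }
        System.out.println(best + " has the longest chain (" + bestLength + " steps)");
    }
}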
You can solve it this way:
import java.util.LinkedList;

public class Problem14 {
    public static void main(String[] args) {
        LinkedList<Long> list = new LinkedList<Long>();
        long length = 0;
        int res = 0;
        for (int j = 10; j < 1000000; j++)
        {
            long i = j;
            while (i != 1)
            {
                if (i % 2 == 0)
                {
                    i = i / 2;
                    list.add(i);
                }
                else
                {
                    i = 3 * i + 1;
                    list.add(i);
                }
            }
            if (list.size() > length)
            {
                length = list.size();
                res = j;
            }
            list.clear();
        }
        System.out.println(res + " highest number and its length " + length);
    }
}
