How to set the mid element when doing a binary search?

This may sound like a very naive question; please excuse me for that. I was working on a problem that involved binary search, and the general way to do it that I had learned is:
set low = 0, set high = array.Length - 1
while (low <= high)
    mid = (low + high) / 2
    if (array[mid] == element) return mid
    if (array[mid] < element) set low = mid + 1
    if (array[mid] > element) set high = mid - 1
Please note the way I am setting the midpoint. Some of the examples/solutions that I saw set mid differently, and I am not able to wrap my head around that. Please see the code snippet below. Any explanation of what setting mid = l + (r-l)/2 means would be greatly appreciated.
int BinarySearch(int A[], int l, int r, int key)
{
    int m;
    while( l <= r )
    {
        m = l + (r-l)/2;
        if( A[m] == key ) // first comparison
            return m;
        if( A[m] < key ) // second comparison
            l = m + 1;
        else
            r = m - 1;
    }
    return -1;
}

The goal in both cases is to find the middle element.
(low + high) / 2: The sum of the left and right indices is divided by 2 to get the midpoint. The problem with this is that the sum could overflow.
If the sum is odd, integer division rounds down, so we pick the lower of the two middle indices. Example: with left and right as [2, 5], the sum is 7 and we pick mid as 3.
l + (r - l) / 2: The difference between the right and the left index is found and divided by 2. It is then added to the left index to find the midpoint. It might be easier to visualize this with an example.
Example: [5, 11] -> (11 - 5) / 2 = 3. The middle of this interval is 3 places, or hops, from the left index. So, add the left index to find the index of the mid element, which is 5 + 3 = 8.

Middle is equal to the left index plus half of the distance between right and left.
Say our range is 5 to 10.
5 + (10-5)/2 == 5 + 5/2 == 5 + 2 == 7, because integer division truncates 2.5 to 2. The exact middle of our range is 7.5, so the formula picks index 7, the lower of the two middle elements.

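This trick is used to avoid the integer overflow problem. There are two common ways to do it:
(i) use the unsigned right shift operator
int mid = (high + low) >>> 1;
or
(ii) int mid = low + (high - low) / 2;
(low+high)/2 will overflow if the sum exceeds Integer.MAX_VALUE. The sum wraps around to a negative number, so you will get wrong results or an ArrayIndexOutOfBoundsException (a RuntimeException) when the negative mid is used as an index.
You can read this article by Josh Bloch: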
https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
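The overflow described above can be reproduced with a small sketch. The class and method names here are made up for illustration, and the index values are arbitrary large numbers rather than indices into a real array.

```java
final class MidpointOverflowDemo {
    // Naive midpoint: the intermediate sum low + high can exceed Integer.MAX_VALUE.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Safe midpoint: high - low cannot overflow when 0 <= low <= high.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    // Unsigned shift: reads the wrapped sum as an unsigned 32-bit value before halving.
    static int shiftMid(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;
        System.out.println("naive: " + naiveMid(low, high)); // a negative number
        System.out.println("safe:  " + safeMid(low, high));  // 1750000000
        System.out.println("shift: " + shiftMid(low, high)); // 1750000000
    }
}
```

Both safe variants assume non-negative indices with low <= high, which is what a binary search over an array guarantees.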

Related

Why does this binary search implementation cause stack overflow in Ruby but not Java?

Ruby
def binary_search(arr, l, r, x)
  if r >= 1 then
    mid = l + (r - 1) / 2
    if arr[mid] == x then
      return mid
    end
    if arr[mid] > x then
      return binary_search(arr, l, mid - 1, x)
    end
    return binary_search(arr, mid + 1, r, x)
  end
  return -1
end
Java
int binarySearch(int arr[], int l, int r, int x)
{
    if (r >= l) {
        int mid = l + (r - l) / 2;
        // If the element is present at the
        // middle itself
        if (arr[mid] == x)
            return mid;
        // If element is smaller than mid, then
        // it can only be present in left subarray
        if (arr[mid] > x)
            return binarySearch(arr, l, mid - 1, x);
        // Else the element can only be present
        // in right subarray
        return binarySearch(arr, mid + 1, r, x);
    }
    // We reach here when element is not present
    // in array
    return -1;
}
When x (the target element) is in the right half of a sorted array, a stack overflow occurs and I get this error in Ruby:
SystemStackError (stack level too deep)
Why does this happen in Ruby and not in Java? I'm running the program in irb. The Java implementation came straight from here: https://www.geeksforgeeks.org/binary-search/ .

First, let's refactor your code to be a little bit clearer. Note: there are zero behavior changes in this code, it is mostly re-formatting with a little bit of refactoring.
def binary_search(arr, l, r, x)
  return -1 unless r >= 1
  mid = l + (r - 1) / 2
  case arr[mid] <=> x
  when 0 then mid
  when -1 then binary_search(arr, mid + 1, r, x)
  when 1 then binary_search(arr, l, mid - 1, x)
  end
end
There are two main problems with your binary search.
First is your termination condition for "element not found". The way your binary search works is that it moves either the left "fence" to the right or the right "fence" to the left, depending on where the desired element must be in the array. That way, the search area gets smaller and smaller.
Now, when the two "fences" pass by each other, there is no more "search area", which means that the element wasn't found. However, you are not checking whether the two "fences" have overtaken each other (r < l); you are checking whether r has dropped below the left end of the original array (r < 1).
This means you will have many, many more useless recursions until you finally give up when the element is not found.
Actually, it is not even that simple, because when l and r pass by each other, your midpoint calculation is also wrong, because now all of a sudden r is smaller than l, or in other words, the right fence is on the left side of the left fence!
Well … except that your midpoint calculation is broken anyway. For example, if l = 10 and r = 12, the midpoint between the two is obviously mid = 11, but according to your formula, it is:
10 + (12 - 1) / 2
10 + ( 11 ) / 2
10 + 5
15
Oops! The middle between left and right is actually way to the right of right! This means that, depending on where in the array, your search value is, you are actually making the search area bigger instead of smaller: you are moving r back to the right, so that you are searching a part of the array again that you have already searched! Again, this means that you need many more recursions in your Ruby version than in your Java version.
The formula for the current distance between l and r is r - l. Now we need half of that distance, which is (r - l) / 2. If we want to find the point halfway between l and r, we need to walk half the distance from l to r, so we need to add the above distance to l: l + (r - l) / 2.
Here's what the fixed code looks like:
def binary_search(arr, l, r, x)
  return -1 if r < l
  mid = l + (r - l) / 2
  case arr[mid] <=> x
  when 0 then mid
  when -1 then binary_search(arr, mid + 1, r, x)
  when 1 then binary_search(arr, l, mid - 1, x)
  end
end
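The arithmetic from the l = 10, r = 12 example can be checked with a short Java sketch. The method names brokenMid and fixedMid are hypothetical; they mirror the two formulas discussed in this answer.

```java
final class FenceMidpoint {
    // The formula from the question's Ruby code: subtracts the constant 1 instead of l.
    static int brokenMid(int l, int r) {
        return l + (r - 1) / 2;
    }

    // The corrected formula: walk half the distance (r - l) starting from l.
    static int fixedMid(int l, int r) {
        return l + (r - l) / 2;
    }

    public static void main(String[] args) {
        System.out.println(brokenMid(10, 12)); // 15, which lies outside [10, 12]
        System.out.println(fixedMid(10, 12));  // 11
    }
}
```

With l = 0 the two formulas nearly agree, which may be why the bug only shows up once the search moves into the right half.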

Largest sum from absolute differences of N elements in an array

This isn't really homework, but rather practice and optimization; still, this seemed like the best section for this type of question.
It is a dynamic programming problem, and it's the following:
Given an unsorted array of N elements, pick K elements from it, keeping their original order, such that the sum of the absolute differences between adjacent picked elements is the largest.
An absolute difference is calculated between adjacent elements here. So if we have an array of 5 elements: 1 5 3 2 1, and k = 3, the absolute differences would be:
1 5 3 = |5-1| + |3-5| = 6
1 5 2 = |5-1| + |2-5| = 7
1 5 1 = |5-1| + |1-5| = 8
etc
With 1 5 1 being the largest, and the one we need, with a sum of 8.
What I've tried so far is solving this by finding all possible combinations of K numbers with recursion and then returning the biggest (brute force).
This turned out to be a terrible idea, because with an array of N=50 and k=25, for example, there are 1.264106064E+14 combinations.
The recursion I used is a simple one for printing all K-element combinations from an array; instead of printing them, it keeps them in an array:
static void solve(int[] numbers, int k, int startPosition, int[] result) {
    if (k == 0) {
        System.out.println(absoluteDifferenceSum(result));
        return;
    }
    for (int i = startPosition; i <= numbers.length - k; i++) {
        result[result.length - k] = numbers[i];
        solve(numbers, k - 1, i + 1, result);
    }
}
What I want to achieve is the optimal complexity (which I suppose can't be lower than O(n^2) here), but I'm out of ideas and don't know how to start. Any help is appreciated!

Generally, we can have a naive O(n^2 * k) formulation, where f(i, k) represents the best result for selecting k elements when the ith element is rightmost in the selection,
f(i, k) = max(
f(j, k - 1) + abs(Ai - Aj)
)
for all j < i
which we can expand to
max( f(j, k - 1) + Ai - Aj )
= Ai + max(f(j, k - 1) - Aj)
when Ai >= Aj
and
max( f(j, k - 1) + Aj - Ai )
= -Ai + max(f(j, k - 1) + Aj)
when Ai < Aj
Since the right summand is independent of Ai, we can build a tree with nodes that store both f(j, k - 1) - Aj, as well as f(j, k - 1) + Aj. Additionally, we'll store in each node both maxes for each subtree. We'll need O(k) trees. Let's skip to the examination of the tree for k = 2 when we reach the last element:
1
5 -> 4 -> (-1, 9)
3 -> 2 -> (-1, 5)
2 -> 3 -> (-1, 5)
3 -> (-3, 5)
tree for k = 2 so far:
3 (-1, 5)
/ \
2 (-1, 5) 5 (-1, 9)
1 is less than 3 so we first add -1
to the max for the right subtree
stored for f(j, k - 1) + Aj
-1 + 9 = 8
(note that this represents {1,5,1})
then we continue left in the tree
and compare 8 to a similar calculation
with node 2: -1 + 5 = 4
(note that this represents {5,2,1})
This way, we can reduce the time complexity to O(n log n * k) with space O(n * k).
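As a baseline, the naive O(n^2 * k) recurrence for f(i, k) from the start of this answer could be written roughly as follows. This is a sketch of the unoptimized version only, not of the tree-based speedup; the class and method names are assumptions made for illustration.

```java
import java.util.Arrays;

final class MaxAbsDiffSubsequence {
    // f[i][c] = best sum when c elements are picked and a[i] is the rightmost one.
    static long maxAbsDiffSum(int[] a, int k) {
        int n = a.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("need 1 <= k <= n");
        }
        final long unreachable = Long.MIN_VALUE;
        long[][] f = new long[n][k + 1];
        for (long[] row : f) {
            Arrays.fill(row, unreachable);
        }
        // A single picked element has no adjacent pair, so its sum is 0.
        for (int i = 0; i < n; i++) {
            f[i][1] = 0;
        }
        for (int c = 2; c <= k; c++) {
            // a[i] can only be the c-th pick if at least c - 1 elements precede it.
            for (int i = c - 1; i < n; i++) {
                for (int j = c - 2; j < i; j++) {
                    if (f[j][c - 1] != unreachable) {
                        long candidate = f[j][c - 1] + Math.abs((long) a[i] - a[j]);
                        f[i][c] = Math.max(f[i][c], candidate);
                    }
                }
            }
        }
        long best = unreachable;
        for (int i = 0; i < n; i++) {
            best = Math.max(best, f[i][k]);
        }
        return best;
    }

    public static void main(String[] args) {
        // The example from the question: picking 1, 5, 1 gives 8.
        System.out.println(maxAbsDiffSum(new int[] {1, 5, 3, 2, 1}, 3)); // 8
    }
}
```

The table uses O(n * k) space; only the previous column is read, so two columns would be enough if memory matters.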

Difference in calculating middle index of array?

I have been trying to implement QuickSort and came across a scenario where we select the middle element as the pivot.
http://www.java2novice.com/java-sorting-algorithms/quick-sort/
// calculate pivot number, I am taking pivot as middle index number
int pivot = array[lowerIndex+(higherIndex-lowerIndex)/2];
How is this different from the way below of getting the middle index?
int pivot = array[(lowerIndex+higherIndex)/2];
I remember seeing this many times before, and I am sure I am missing a scenario where this is helpful, such as when we get an odd number or something.
I tried a few sample values, but I get the same result both ways.
What am I missing?
Thanks for your response.

It is more likely that
(lowerIndex+higherIndex)/2
overflows rather than
lowerIndex+(higherIndex-lowerIndex)/2.
For example for lowerIndex == higherIndex == Integer.MAX_VALUE / 2 + 1.
Edit:
Mathematical proof of equivalence of the expressions when no overflow occurs (for valid array indices, i.e. 0 <= l <= r, so every quantity below is non-negative)
l + (r - l)/2 (in java notation)
= l + round_towards_zero((r - l) / 2) (in math notation)
= round_towards_zero(l + (r - l) / 2) (since l is an integer)
= round_towards_zero((2 * l + r - l) / 2)
= round_towards_zero((r + l) / 2)
= (l + r) / 2 (in java notation)
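The equivalence can also be checked by brute force over a small range. The sketch below assumes non-negative indices with l <= r, as in an array; the names are invented for the example. It also shows that the two expressions can differ for negative values, because Java's integer division rounds toward zero.

```java
final class MidpointEquivalence {
    // Returns true if both formulas agree for every pair 0 <= l <= r <= limit.
    static boolean agreeOnRange(int limit) {
        for (int l = 0; l <= limit; l++) {
            for (int r = l; r <= limit; r++) {
                if (l + (r - l) / 2 != (l + r) / 2) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnRange(200)); // true

        // Outside the array-index setting the formulas can disagree:
        int l = -3, r = 0;
        System.out.println(l + (r - l) / 2); // -2
        System.out.println((l + r) / 2);     // -1
    }
}
```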

How to get peak change of array list in O(1)

I have an array of integers,
e.g. int a[10]=[3,4,6,8,10,9,6,5,4,2].
I mean that the contents of the array are in increasing order and then in decreasing order.
I want to find the peak point of change (the last greater value and its index),
which in the above case is 10, in O(1).
Please note that we can do this in O(n) by comparing elements and noting the change, but I need help solving this in less than O(n) complexity.
Thanks in advance.

Adding to @u_seem_surprised's answer, this definitely cannot be done better than Omega(log n).
Proof:
The index of the peak has n different possibilities for an array of size n.
That is, for each index i, we can construct an array whose result should be this index i.
In the comparison-based model, we need Omega(log_2(n)) comparisons to determine which of these n values is the correct one, because each comparison has only two outcomes.
So we can conclude that the lower bound for this problem is Omega(log(n)).

We can do this using binary search. Suppose we are at our current mid value. By looking at index mid - 1 and index mid + 1, we can check whether the sequence is decreasing or increasing there, and accordingly decide which half to search for the answer.
We know we have found the answer when:
arr[mid - 1] < arr[mid] && arr[mid + 1] < arr[mid]
Pseudo code (n is the length of arr):
int start = 0, end = n - 1;
while(start <= end){
    int mid = (start + end)/2;
    if(mid - 1 >= 0 && mid + 1 < n && arr[mid-1] >= arr[mid] && arr[mid] >= arr[mid+1]){
        //decreasing part
        end = mid-1;
    }else if(mid - 1 >= 0 && mid + 1 < n && arr[mid-1] <= arr[mid] && arr[mid] <= arr[mid+1]){
        //increasing part
        start = mid+1;
    }else{
        //answer found if mid is an interior index; take care of the
        //corner cases mid == 0 and mid == n - 1 separately
        cout << arr[mid] << endl;
        break;
    }
}
The complexity of this solution is O(log2(n)). I don't think a better solution is possible, since you need to search for the peak, and that can only be done by making comparisons, so an O(1) solution might not be possible.
Demo in c++ : http://ideone.com/UXwyaT
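A variant that avoids the boundary corner cases compares only arr[mid] with arr[mid + 1]. The sketch below is a different formulation from the pseudo code above, not a transcription of it, and it assumes the array is non-empty, strictly increasing up to the peak and strictly decreasing after it. The names are hypothetical.

```java
final class BitonicPeak {
    // Returns the index of the largest element of a strictly increasing-then-decreasing array.
    static int peakIndex(int[] a) {
        if (a.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2; // mid < hi, so mid + 1 is always a valid index
            if (a[mid] < a[mid + 1]) {
                lo = mid + 1; // still climbing: the peak is to the right of mid
            } else {
                hi = mid;     // descending: the peak is at mid or to its left
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] a = {3, 4, 6, 8, 10, 9, 6, 5, 4, 2};
        int i = peakIndex(a);
        System.out.println(a[i] + " at index " + i); // 10 at index 4
    }
}
```

Because the loop never reads a[mid - 1], purely increasing or purely decreasing arrays need no special handling.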

In-place interleaving of the two halves of a string

Given a string of even size, say:
abcdef123456
How would I interleave the two halves, such that the same string would become this:
a1b2c3d4e5f6
I tried to develop an algorithm, but couldn't. Would anybody give me some hints as to how to proceed? I need to do this without creating extra string variables or arrays. One or two variables are fine.
I don't just want working code (or an algorithm); I need to develop an algorithm and prove its correctness mathematically.

You may be able to do it in O(N*log(N)) time:
Want: abcdefgh12345678 -> a1b2c3d4e5f6g7h8
a b c d e f g h
1 2 3 4 5 6 7 8
4 1-sized swaps:
a 1 c 3 e 5 g 7
b 2 d 4 f 6 h 8
a1 c3 e5 g7
b2 d4 f6 h8
2 2-sized swaps:
a1 b2 e5 f6
c3 d4 g7 h8
a1b2 e5f6
c3d4 g7h8
1 4-sized swap:
a1b2 c3d4
e5f6 g7h8
a1b2c3d4
e5f6g7h8
Implementation in C:
#include <stdio.h>
#include <string.h>
void swap(void* pa, void* pb, size_t sz)
{
    char *p1 = pa, *p2 = pb;
    while (sz--)
    {
        char tmp = *p1;
        *p1++ = *p2;
        *p2++ = tmp;
    }
}
void interleave(char* s, size_t len)
{
    size_t start, step, i, j;
    if (len <= 2)
        return;
    if (len & (len - 1))
        return; // only power of 2 lengths are supported
    for (start = 1, step = 2;
         step < len;
         start *= 2, step *= 2)
    {
        for (i = start, j = len / 2;
             i < len / 2;
             i += step, j += step)
        {
            swap(s + i,
                 s + j,
                 step / 2);
        }
    }
}
char testData[][64 + 1] =
{
    { "Aa" },
    { "ABab" },
    { "ABCDabcd" },
    { "ABCDEFGHabcdefgh" },
    { "ABCDEFGHIJKLMNOPabcdefghijklmnop" },
    { "ABCDEFGHIJKLMNOPQRSTUVWXYZ0<({[/abcdefghijklmnopqrstuvwxyz1>)}]\\" },
};
int main(void)
{
    unsigned i;
    for (i = 0; i < sizeof(testData) / sizeof(testData[0]); i++)
    {
        printf("%s -> ", testData[i]);
        interleave(testData[i], strlen(testData[i]));
        printf("%s\n", testData[i]);
    }
    return 0;
}
Output (ideone):
Aa -> Aa
ABab -> AaBb
ABCDabcd -> AaBbCcDd
ABCDEFGHabcdefgh -> AaBbCcDdEeFfGgHh
ABCDEFGHIJKLMNOPabcdefghijklmnop -> AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPp
ABCDEFGHIJKLMNOPQRSTUVWXYZ0<({[/abcdefghijklmnopqrstuvwxyz1>)}]\ -> AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz01<>(){}[]/\

In general that problem is quite hard, and it reduces to finding permutation cycles. The number and length of those vary quite a lot depending on the length.
The first and last cycles are always degenerate; the 10 entry array has 2 cycles of lengths 6 and 2 and the 12 entry array has a single cycle of length 10.
Within a cycle one does:
for (i=j; (next=get_next(i)) != j; i=next) swap(i,next);
Even though the function get_next can be implemented as some relatively easy formula of N, the problem is only postponed to the bookkeeping of which indices have already been swapped. In the case of 10 entries, one should [quickly] find the starting positions of the cycles (they are e.g. 1 and 3).
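The cycle lengths quoted above can be reproduced with a small sketch. It assumes the interleaving moves 0-based position i to 2i mod (n - 1), with the last position fixed, for an even length n. It also uses a boolean array to mark visited positions, which the in-place problem does not allow; that array is only there to make the cycle structure visible. All names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class ShuffleCycles {
    // Where the element at 0-based position i ends up after interleaving n elements (n even).
    static int destination(int i, int n) {
        return (i == n - 1) ? i : (2 * i) % (n - 1);
    }

    // Lengths of the cycles, skipping the degenerate ones at positions 0 and n - 1.
    static List<Integer> cycleLengths(int n) {
        boolean[] seen = new boolean[n]; // extra memory, for illustration only
        List<Integer> lengths = new ArrayList<>();
        for (int start = 1; start < n - 1; start++) {
            if (seen[start]) {
                continue;
            }
            int length = 0;
            for (int i = start; !seen[i]; i = destination(i, n)) {
                seen[i] = true;
                length++;
            }
            lengths.add(length);
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(cycleLengths(10)); // [6, 2]
        System.out.println(cycleLengths(12)); // [10]
    }
}
```

For n = 10 the two cycles are found starting from positions 1 and 3, matching the starting positions mentioned above.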

Ok lets start over. Here is what we are going to do:
def interleave(string):
    # string must be a mutable sequence (e.g. a list of characters),
    # because Python str objects cannot be modified in place
    i = (len(string)//2) - 1
    j = i+1
    while(i > 0):
        k = i
        while(k < j):
            tmp = string[k]
            string[k] = string[k+1]
            string[k+1] = tmp
            k+=2 #increment by 2 since we're swapping every OTHER character
        i-=1 #move lower bound by one
        j+=1 #move upper bound by one
Here is an example of what the program is going to do. We are going to use variables i,j,k. i and j will be the lower and upper bounds respectively, where k is going to be the index at which we swap.
Example
`abcd1234`
i = 3 //got this from (length(string)/2) -1
j = 4 //this is really i+1 to begin with
k = 3 //k always starts off reset to whatever i is
swap d and 1
increment k by 2 (k = 3 + 2 = 5), since k > j we stop swapping
result `abc1d234` after the first swap
i = 3 - 1 //decrement i
j = 4 + 1 //increment j
k= 2 //reset k to i
swap c and 1, increment k (k = 2 + 2 = 4), we can swap again since k < j
swap d and 2, increment k (k = 4 + 2 = 6), k > j so we stop
//notice at EACH SWAP, the swap is occurring at index `k` and `k+1`
result `ab1c2d34`
i = 2 - 1
j = 5 + 1
k = 1
swap b and 1, increment k (k = 1 + 2 = 3), k < j so continue
swap c and 2, increment k (k = 3 + 2 = 5), k < j so continue
swap d and 3, increment k (k = 5 + 2 = 7), k > j so we're done
result `a1b2c3d4`
As for proving program correctness, see this link. It explains how to prove this is correct by means of a loop invariant.
A rough proof would be the following:
Initialization: Prior to the first iteration of the loop, i is set to
(length(string)/2) - 1, so i < length(string) before we enter the loop.
Maintenance: After each iteration, i is decremented by one (i-1, i-2, ...), so i < length(string) still holds.
Termination: Since i takes a strictly decreasing sequence of integer values, the loop condition i > 0 will eventually become false and the loop will exit.

The solution is here: J. Ellis and M. Markov. In-situ, stable merging by way of perfect shuffle.
The Computer Journal. 43(1):40-53, (2000).
Also see the various discussions here:
https://cs.stackexchange.com/questions/332/in-place-algorithm-for-interleaving-an-array/400#400
https://cstheory.stackexchange.com/questions/13943/linear-time-in-place-riffle-shuffle-algorithm.

Alright, here's a rough draft. You say you don't just want an algorithm, but you are taking hints, so consider this algorithm a hint:
Length is N.
k = N/2 - 1.
1) Start in the middle, and shift (by successive swapping of neighboring pair elements) the element at position N/2 k places to the left (1st time: '1' goes to position 1).
2) --k. Is k==0? Quit.
3) Shift (by swapping) the element at N/2 (1st time:'f' goes to position N-1) k places to the right.
4) --k.
Edit: The above algorithm is correct, as the code below shows. Actually proving that it's correct is waaay beyond my capabilities, fun little question though.
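#include <iostream>
#include <string>
#include <algorithm>
int main(void)
{
    std::string s("abcdefghij1234567890");
    int N = s.size();
    int k = N/2 - 1;
    // loop on k > 0 so that the program also terminates when N/2 - 1 is even
    while (k > 0)
    {
        // shift the element at N/2 k places to the left
        for (int j=0; j<k; ++j)
        {
            int i = N/2 - j;
            std::swap(s[i], s[i-1]);
        }
        --k;
        if (k==0) break;
        // shift the element at N/2 k places to the right
        for (int j=0; j<k; ++j)
        {
            int i = N/2 + j;
            std::swap(s[i], s[i+1]);
        }
        --k;
    }
    std::cout << s << std::endl;
    return 0;
}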

Here's an algorithm and working code. It is in place, O(N), and conceptually simple.
Walk through the first half of the array, swapping items into place.
Items that started in the left half will be swapped to the right
before we need them, so we use a trick to determine where they
were swapped to.
When we get to the midpoint, unscramble the unplaced left items that were swapped to the right.
A variation of the same trick is used to find the correct order for unscrambling.
Repeat for the remaining half array.
This goes through the array making no more than N+N/2 swaps, and requires no temporary storage.
The trick is to find the index of the swapped items. Left items are swapped into a swap space vacated by the Right items as they are placed. The swap space grows by the following sequence:
Add an item to the end(into the space vacated by a Right Item)
Swap an item with the oldest existing (Left) item.
Adding items 1..N in order gives:
1 2 23 43 435 465 4657 ...
The index changed at each step is:
0 0 1 0 2 1 3 ...
This sequence is exactly OEIS A025480, and can be calculated in O(1) amortized time:
def next_index(n):
    while n&1: n=n>>1
    return n>>1
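For comparison, the same index calculation in Java, checked against the first terms 0 0 1 0 2 1 3 listed above. This is a direct port of next_index under the assumption that n is non-negative; the class name is invented.

```java
final class SwapSpaceIndex {
    // Strip trailing 1 bits, then drop one more bit (assumes n >= 0).
    static int nextIndex(int n) {
        while ((n & 1) == 1) {
            n >>= 1;
        }
        return n >> 1;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int n = 0; n < 7; n++) {
            sb.append(nextIndex(n)).append(' ');
        }
        System.out.println(sb.toString().trim()); // 0 0 1 0 2 1 3
    }
}
```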
Once we get to the midpoint after swapping N items, we need to unscramble. The swap space will contain N/2 items where the actual index of the item that should be at offset i is given by next_index(N/2+i). We can advance through the swaps space, putting items back in place. The only complication is that as we advance, we may eventually find a source index that is left of the target index, and therefore has already been swapped somewhere else. But we can find out where it is by doing the previous index look up again.
def unscramble(start,i):
    j = next_index(start+i)
    while j<i: j = next_index(start+j)
    return j
Note that this is only an indexing calculation, not data movement. In practice, the total number of calls to next_index is < 3N for all N.
That's all we need for the complete implementation:
# swap(a, i, j) and larger_half(n) are helper functions that this answer does not define
def interleave(a, idx=0):
    if (len(a)<2): return
    midpt = len(a)//2
    # the following line makes this an out-shuffle.
    # add a `not` to make an in-shuffle
    base = 1 if idx&1==0 else 0
    for i in range(base,midpt):
        j=next_index(i-base)
        swap(a,i,midpt+j)
    for i in range(larger_half(midpt)-1):
        j = unscramble( (midpt-base)//2, i)
        if (i!=j):
            swap(a, midpt+i, midpt+j)
    interleave(a[midpt:], idx+midpt)
The tail-recursion at the end can easily be replaced by a loop. It's just less elegant with Python's array syntax. Also note that for this recursive version, the input must be a numpy array instead of a python list, because standard list slicing creates copies of the indexes that are not propagated back up.
Here's a quick test to verify correctness. (8 perfect shuffles of a 52 card deck restore it to the original order).
import numpy
A = numpy.arange(52)
B = A.copy()
C = numpy.empty(52)
for _ in range(8):
    # manual interleave
    C[0::2] = numpy.array(A[:26])
    C[1::2] = numpy.array(A[26:])
    # our interleave
    interleave(A)
    print(A)
    assert(numpy.array_equal(A, C))
assert(numpy.array_equal(A, B))
