How does nextClearBit() of BitSet in Java actually work? - java

This method in the BitSet class is used to return the index of the first bit that is set to false.
import java.util.BitSet;

public class BitSetDemo {
    public static void main(String[] args) {
        BitSet b = new BitSet();
        b.set(5);
        b.set(9);
        b.set(6);
        System.out.println("" + b);
        System.out.println(b.nextClearBit(5));
        System.out.println(b.nextClearBit(9));
    }
}
Output :
{5, 6, 9}
7
10
In this code, 6 is set after 9, but the output suggests that the values are stored in sorted order (b.nextClearBit(5) returns 7, the next value). So how does BitSet store these values?

The javadoc of nextClearBit says:
Returns the index of the first bit that is set to false that occurs on or after the specified starting index.
You have set 5, 6 and 9 to true. That means that starting from 5, the first index set to false is 7, and starting from 9, the first index set to false is 10. According to your own output, that is also what is returned.
If you want to know how BitSet works and what it does, read its Javadoc and look at the source. It is included with the JDK.

BitSet uses bits to store the information, like this:
            ╔════╦════╦════╦════╦════╦════╦════╦════╦════╦════╦════╗
Bits:       ║  0 ║  1 ║  0 ║  0 ║  1 ║  1 ║  0 ║  0 ║  0 ║  0 ║  0 ║
         ...╚════╩════╩════╩════╩════╩════╩════╩════╩════╩════╩════╝
Position:     10    9    8    7    6    5    4    3    2    1    0
Whenever you use set(n), it sets the bit in the corresponding position. The underlying implementation uses an array of longs, but for understanding the API it's enough to imagine it as a long sequence of bits (zeros and ones), like in the drawing. It extends itself if it needs to.
When it needs to look for the next clear bit at or after 5, it goes to bit number 5 and searches upward until it reaches a zero. Actually, the implementation is a lot faster, relying on bit-manipulation tricks, but again, to understand the API, that's how you can imagine it.
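The mental model described above can be written down directly. This is only a sketch of that model, not how the JDK does it: naiveNextClearBit is a hypothetical helper that walks upward one bit at a time using the public get() method. The bits are deliberately set in a different order than in the question to show that insertion order plays no role.

```java
import java.util.BitSet;

class NaiveNextClearBit {
    // Hypothetical helper (not part of java.util.BitSet): scan upward from
    // 'from' until a bit that is false is found.
    static int naiveNextClearBit(BitSet bits, int from) {
        int i = from;
        while (bits.get(i)) { // get() returns false for indices beyond the stored bits
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        BitSet b = new BitSet();
        b.set(9); // set in a different order than in the question
        b.set(5);
        b.set(6);
        for (int start = 0; start <= 10; start++) {
            // The two numbers printed on each line should always agree.
            System.out.println(start + " -> " + naiveNextClearBit(b, start)
                    + " (BitSet says " + b.nextClearBit(start) + ")");
        }
    }
}
```

Starting at 5 the scan stops at 7, and starting at 9 it stops at 10, matching the output in the question.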

Your question indicates you might have thought that the result of b.nextClearBit(i) was somehow affected by the order in which the different bits were set to true or false.
This is false because BitSet does not remember the order in which indices were given values.
next means "next in the order of the indices" and not "next in the order in which values were assigned".
b.nextClearBit(i) returns the smallest index j greater than or equal to i for which b.get(j) == false.

BitSet is a set. The order (of insertion) is irrelevant. The method just gives the index of the next higher clear bit.
The internal implementation has been explained in a previous question. For each method, you can check the source. The code may contain obscure "bit bashing"; similar operations are also available in java.lang.Integer and java.lang.Long, and may be implemented as intrinsics.
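To give a feel for the kind of "bit bashing" involved, here is a simplified sketch of a word-based search. It is an illustration under assumptions, not the JDK source: the bits are assumed to live in a plain long[] with 64 bits per word, and WordNextClearBit.nextClearBit is a hypothetical stand-alone function. The only library call it relies on is Long.numberOfTrailingZeros.

```java
class WordNextClearBit {
    // Simplified sketch: words[k] holds bits 64*k .. 64*k+63, lowest bit first.
    static int nextClearBit(long[] words, int from) {
        int w = from >> 6;                      // index of the word that holds bit 'from'
        if (w >= words.length) {
            return from;                        // beyond the stored words, every bit is clear
        }
        // Invert the word so clear bits become 1s, then mask off bits below 'from'.
        // For long operands, << uses only the low 6 bits of the shift distance.
        long word = ~words[w] & (-1L << from);
        while (true) {
            if (word != 0) {
                return (w << 6) + Long.numberOfTrailingZeros(word);
            }
            if (++w == words.length) {
                return words.length << 6;       // first bit past the stored words
            }
            word = ~words[w];
        }
    }

    public static void main(String[] args) {
        long[] words = { (1L << 5) | (1L << 6) | (1L << 9) }; // bits 5, 6 and 9 set
        System.out.println(nextClearBit(words, 5)); // 7
        System.out.println(nextClearBit(words, 9)); // 10
    }
}
```

The point of the design is that a whole 64-bit word is examined in a handful of operations instead of one bit per loop iteration.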

Related

Spark - Find rows with same ID but values with opposite sign

I have a spark dataset with 2 columns - id & value.
It may contain rows with the same id whose values have opposite signs (and the same absolute value). For example:
id   value
a    5
b    10
a    -5
b    -10
a    5
b    10
a    -5
b    5
a    5
b    1
My use-case is to flag all such pairs of rows where ID is same but one value is positive and the other is negative (but absolute value is same). For example:
id   value   flag
a    5       true
b    10      true
a    -5      true
b    -10     true
a    5       true
b    10      false
a    -5      true
b    5       false
a    5       false
b    1       false
Please note that one positive value must be paired with at most one other negative value and vice versa.
I came across a solution in SQL (might need some modifications but the idea is similar): Need to display records which has positive and negative value
But since I’m new to Spark, I’m not able to convert it into equivalent Spark code. Any help would be highly appreciated.
Thanks!
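This would work:
import org.apache.spark.sql.functions.{col, when}

// Self left join: t2 matches rows with the same id and the negated value; unmatched rows get nulls.
df.alias("t1").join(df.alias("t2"), (col("t1.id") === col("t2.id")) && (col("t1.value") === col("t2.value").*(-1)), "left")
  .withColumn("Flag", when(col("t2.id").isNull, "false").otherwise("true"))
  .select("t1.*", "Flag")
  .show()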
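Note that a plain self-join flags every row that has any opposite-signed partner, so by itself it does not enforce the rule from the question that one positive value is paired with at most one negative value. The pairing rule can be sketched outside Spark in plain Java. This is a minimal illustration under assumptions: flagPairs is a hypothetical helper, rows are processed in the order given (Spark itself guarantees no row order without an explicit ordering column), and the sample rows are those of the flagged table in the question.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class OppositeSignPairs {
    // Hypothetical helper: flags[i] is true when row i has been paired with exactly
    // one other row that has the same id and the negated value. Zeros are never paired.
    static boolean[] flagPairs(String[] ids, int[] values) {
        boolean[] flags = new boolean[ids.length];
        // Per (id, |value|): indices of rows still waiting for a partner, split by sign.
        Map<String, Deque<Integer>> waitingPositive = new HashMap<>();
        Map<String, Deque<Integer>> waitingNegative = new HashMap<>();
        for (int i = 0; i < ids.length; i++) {
            if (values[i] == 0) {
                continue;
            }
            // Widen to long so that Math.abs is safe for Integer.MIN_VALUE.
            String key = ids[i] + "|" + Math.abs((long) values[i]);
            boolean positive = values[i] > 0;
            Map<String, Deque<Integer>> mine = positive ? waitingPositive : waitingNegative;
            Map<String, Deque<Integer>> other = positive ? waitingNegative : waitingPositive;
            Deque<Integer> partners = other.get(key);
            if (partners != null && !partners.isEmpty()) {
                int j = partners.pollFirst(); // earliest unpaired row with the opposite sign
                flags[i] = true;
                flags[j] = true;
            } else {
                mine.computeIfAbsent(key, k -> new ArrayDeque<>()).addLast(i);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        String[] ids = { "a", "b", "a", "b", "a", "b", "a", "b", "a", "b" };
        int[] values = { 5, 10, -5, -10, 5, 10, -5, 5, 5, 1 };
        boolean[] flags = flagPairs(ids, values);
        for (int i = 0; i < ids.length; i++) {
            System.out.println(ids[i] + "\t" + values[i] + "\t" + flags[i]);
        }
    }
}
```

In Spark, the same idea is usually expressed by numbering the rows within each (id, value) group and joining on id, negated value and that number, so that the n-th positive row can only match the n-th negative row.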

How does bit manipulation work?

There was a question asked:
"Presented with the integer n, find the 0-based position of the second
rightmost zero bit in its binary representation (it is guaranteed that
such a bit exists), counting from right to left.
Return the value of 2^position_of_the_found_bit."
I had written the solution below, which works fine.
int secondRightmostZeroBit(int n) {
    return (int)Math.pow(2,Integer.toBinaryString(n).length()-1-Integer.toBinaryString(n).lastIndexOf('0',Integer.toBinaryString(n).lastIndexOf('0')-1)) ;
}
But below was the top-voted solution, which I also liked because it takes just a few characters of code and serves the purpose, but I could not understand it. Can someone explain how bit manipulation helps to achieve it?
int secondRightmostZeroBit(int n) {
    return ~(n|(n+1)) & ((n|(n+1))+1) ;
}
Consider some number having at least two 0 bits. Here is an example of such a number with the 2 rightmost 0 bits marked (x...x are bits we don't care about which can be either 0 or 1, and 1...1 are the sequences of zero or more 1 bits to the right and to the left of the rightmost 0 bit) :
x...x01...101...1 - that's n
If you add 1 to that number you get :
x...x01...110...0 - that's (n+1)
which means the right most 0 bit flipped to 1
therefore n|(n+1) would give you:
x...x01...111...1 - that's n|(n+1)
If you add 1 to n|(n+1) you get:
x...x100........0 - that's (n|(n+1))+1
which means the second right most 0 bit also flips to 1
Now, ~(n|(n+1)) is
y...y10.........0 - that's ~(n|(n+1))
where each y bit is the inverse of the corresponding x bit
therefore ~(n|(n+1)) & ((n|(n+1))+1) gives
0...010.........0
where the only 1 bit is at the location of the second rightmost 0 bit of the input number.
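The steps above can be traced on a concrete value. The sketch below uses n = 183 (binary 10110111) as an arbitrary example and names the intermediate value filled purely for illustration; the expression itself is the one from the question.

```java
class SecondZeroBitDemo {
    static int secondRightmostZeroBit(int n) {
        int filled = n | (n + 1);       // the rightmost 0 bit of n is now 1
        return ~filled & (filled + 1);  // isolates the rightmost 0 bit of 'filled'
    }

    public static void main(String[] args) {
        int n = 0b10110111; // 183: zero bits at positions 3 and 6
        int filled = n | (n + 1);
        System.out.println("n            = " + Integer.toBinaryString(n));          // 10110111
        System.out.println("n + 1        = " + Integer.toBinaryString(n + 1));      // 10111000
        System.out.println("n | (n + 1)  = " + Integer.toBinaryString(filled));     // 10111111
        System.out.println("that plus 1  = " + Integer.toBinaryString(filled + 1)); // 11000000
        System.out.println("result       = " + secondRightmostZeroBit(n));          // 64 = 2^6
    }
}
```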

Can I shift bits into an integer that is initially 0 using a for loop that iterates 8 times to get some 8-bit number in Java? [duplicate]

I am working on implementing an 8-bit adder abstracted with code in Java. This 8-bit adder is built from 8 full adder circuits. For those who don't know what a full adder is, it's a circuit that computes the sum of 2 bits and a carry-in, producing a sum bit and a carry-out.
My intention was to use a for loop to add each corresponding bit of the adder's two 8-bit inputs, such that a new bit of the 8-bit result is computed each time the for loop iterates.
Would it be possible to store the new computed bit of each iteration in a variable holding the 8-bit result using bit shifting?
Here's an example to help explain what I am asking. The bold bit would be the one that is shifted into the int holding the result.
0b00001010
+
0b00001011
First Iteration (addition starting w/ LSB)
Sum: 1
Result: 0b00000001
Carry: 0
Second Iteration
Sum: 0
Result: 0b00000001
Carry: 1
Third Iteration
Sum: 1
Result: 0b00000101
Carry: 0
Fourth Iteration
Sum: 0
Result: 0b00000101
Carry: 1
Fifth Iteration
Sum: 1
Result: 0b00010101
Carry: 0
Sixth, Seventh, Eighth Iteration
Sum: 0, 0, 0 respectively
Result: 0b00010101
Carry: 0, 0, 0 respectively
The shift operators in java are : >>>, << and >> , e.g.
System.out.println(1 << 1); // print 2
System.out.println(1 << 2); // print 4
You can't insert 1 from thin air with shifting. To insert 1 try bitwise operators: | and &
If you want to get that exact sequence, you can do it with this operation:
n = (n<<1 | (1&~n));
Starting from n=0, this gives 0b00000001, 0b00000010, 0b00000101, 0b00001010 etc.
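To see where that sequence comes from, the update can be run in a loop. A small sketch, with step as a hypothetical name for the one-line update quoted above:

```java
class AlternatingBitsDemo {
    // One application of the update: shift left by one, then put the inverse
    // of the old lowest bit into the new lowest position.
    static int step(int n) {
        return (n << 1) | (1 & ~n);
    }

    public static void main(String[] args) {
        int n = 0;
        for (int i = 0; i < 8; i++) {
            n = step(n);
            // Left-pad the binary string to 8 digits for readability.
            String bits = String.format("%8s", Integer.toBinaryString(n)).replace(' ', '0');
            System.out.println("0b" + bits);
        }
    }
}
```

The shift makes room for one new bit, and the OR fills it; the shift alone would only ever append zeros.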
refred already mentioned the shift operations.
Shift operations are particularly useful when you are creating a bit mask (say, when a single bit of the whole number is set) or when a consecutive run of bits within a bit field should be set.
Remember that
a = a << 1 is equal to
a = a*2 or a *= 2 respectively.
And by analogy
a = a >> 1 is equal to
a = a/2 or a /= 2 (for non-negative a)
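A quick way to check the correspondence between shifting by one position and multiplying or dividing by two is to compare the results directly. The sketch below uses hypothetical helper names; it also shows that for negative odd numbers the right shift and the division round differently, so the equivalence with a/2 only holds for non-negative values.

```java
class ShiftArithmeticDemo {
    static int doubleByShift(int a) { return a << 1; }
    static int halveByShift(int a)  { return a >> 1; }

    public static void main(String[] args) {
        int a = 13;
        System.out.println(doubleByShift(a) == a * 2); // true (26)
        System.out.println(halveByShift(a) == a / 2);  // true (6)

        int b = -3;
        System.out.println(halveByShift(b)); // -2: >> rounds toward negative infinity
        System.out.println(b / 2);           // -1: integer division rounds toward zero
        System.out.println(b >>> 1);         // 2147483646: >>> shifts in a zero sign bit
    }
}
```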
Whenever the bits that should be set are not consecutive, you have to use binary operations like & and |.
So since you will need binary operations anyway, it does not make that much sense to use shift operations in every case, because you could simply write down the hex value. But it would be possible. I made several steps to make this clear:
int a = 1;  // 0b00000001
a <<= 2;    // 0b00000100
a |= 1;     // 0b00000101
a <<= 1;    // 0b00001010
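For the adder described in the question, one possible arrangement is to place each newly computed sum bit directly at its position with a shift and an OR, rather than shifting the whole result each time. This is a minimal sketch under assumptions: fullAdder and add8 are hypothetical names, inputs are taken as ints whose low 8 bits are used, and the final carry-out is simply dropped.

```java
class EightBitAdder {
    // One full adder: returns the sum bit in bit 0 and the carry-out in bit 1.
    static int fullAdder(int a, int b, int carryIn) {
        int sum = a ^ b ^ carryIn;
        int carryOut = (a & b) | (carryIn & (a ^ b));
        return (carryOut << 1) | sum;
    }

    // Ripple-carry addition of the low 8 bits of x and y.
    static int add8(int x, int y) {
        int result = 0;
        int carry = 0;
        for (int i = 0; i < 8; i++) {
            int a = (x >> i) & 1;          // bit i of the first input
            int b = (y >> i) & 1;          // bit i of the second input
            int out = fullAdder(a, b, carry);
            result |= (out & 1) << i;      // put the new sum bit at position i
            carry = out >> 1;              // carry into the next iteration
        }
        return result;                     // final carry is dropped, so the result wraps modulo 256
    }

    public static void main(String[] args) {
        int sum = add8(0b00001010, 0b00001011);
        System.out.println(sum + " = 0b" + Integer.toBinaryString(sum)); // 21 = 0b10101
    }
}
```

With the inputs from the question, the intermediate values of result follow the same progression as the iteration table there.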

What does this boolean "(number & 1) == 0" mean?

On CodeReview I posted a working piece of code and asked for tips to improve it. One I got was to use a boolean method to check if an ArrayList had an even number of indices (which was required). This was the code that was suggested:
private static boolean isEven(int number)
{
return (number & 1) == 0;
}
As I've already pestered that particular user for a lot of help, I've decided it's time I pestered the SO community! I don't really understand how this works. The method is called and takes the size of the ArrayList as a parameter (i.e. ArrayList has ten elements, number = 10).
I know a single & runs the comparison of both number and 1, but I got lost after that.
The way I read it, it is saying return true if number == 0 and 1 == 0. I know the first isn't true and the latter obviously doesn't make sense. Could anybody help me out?
Edit: I should probably add that the code does work, in case anyone is wondering.
Keep in mind that "&" is a bitwise operation. You are probably aware of this, but it's not totally clear to me based on the way you posed the question.
That being said, the theoretical idea is that you have some int, which can be expressed in bits by some series of 1s and 0s. For example:
...10110110
In binary, because it is base 2, whenever the binary representation of the number ends in 0, it is even, and when it ends in 1 it is odd.
Therefore, doing a bitwise & with 1 for the above is:
...10110110 & ...00000001
Of course, this is 0, so you can say that the original input was even.
Alternatively, consider an odd number. For example, add 1 to what we had above. Then
...10110111 & ...00000001
Is equal to 1, and is therefore, not equal to zero. Voila.
You can determine the number either is even or odd by the last bit in its binary representation:
1 -> 00000000000000000000000000000001 (odd)
2 -> 00000000000000000000000000000010 (even)
3 -> 00000000000000000000000000000011 (odd)
4 -> 00000000000000000000000000000100 (even)
5 -> 00000000000000000000000000000101 (odd)
6 -> 00000000000000000000000000000110 (even)
7 -> 00000000000000000000000000000111 (odd)
8 -> 00000000000000000000000000001000 (even)
& between two integers is bitwise AND operator:
0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1
So, if (number & 1) == 0 is true, this means number is even.
Let's assume that number == 6, then:
6 -> 00000000000000000000000000000110 (even)
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
1 -> 00000000000000000000000000000001
-------------------------------------
0 -> 00000000000000000000000000000000
and when number == 7:
7 -> 00000000000000000000000000000111 (odd)
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
1 -> 00000000000000000000000000000001
-------------------------------------
1 -> 00000000000000000000000000000001
& is the bitwise AND operator. && is the logical AND operator
In binary, if the least significant bit (the ones digit) is set (i.e. one), the number is odd.
In binary, if the least significant bit is zero, the number is even.
(number & 1) is a bitwise AND test of the least significant bit.
Another way to do this (and possibly less efficient but more understandable) is using the modulus operator %:
private static boolean isEven(int number)
{
    if (number < 0)
        throw new IllegalArgumentException("number must be non-negative");
    return (number % 2) == 0;
}
This expression means "the integer represents an even number".
Here is the reason why: the binary representation of decimal 1 is 00000000001. All odd numbers end in a 1 in binary (this is easy to verify: suppose the number's binary representation does not end in 1; then it's composed of non-zero powers of two, which is always an even number). When you do a binary AND of 1 with an odd number, the result is 1; when you do a binary AND of 1 with an even number, the result is 0.
This used to be the preferred method of deciding odd/even back at the time when optimizers were poor to nonexistent, and % operators required twenty times the number of cycles taken by an & operator. These days, if you do number % 2 == 0, the compiler is likely to generate code that executes as quickly as (number & 1) == 0 does.
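The two formulations can be compared side by side. The sketch below uses hypothetical method names and also includes a commonly seen variant, n % 2 == 1, which is not equivalent: in Java the remainder takes the sign of the dividend, so -3 % 2 is -1. The two "is even" checks agree for negative int values as well.

```java
class ParityCheckDemo {
    static boolean isEvenByMask(int n)      { return (n & 1) == 0; }
    static boolean isEvenByRemainder(int n) { return (n % 2) == 0; }

    // Pitfall: misses negative odd numbers, because -3 % 2 evaluates to -1 in Java.
    static boolean isOddNaive(int n)        { return (n % 2) == 1; }

    public static void main(String[] args) {
        for (int n = -4; n <= 4; n++) {
            System.out.println(n
                    + ": mask=" + isEvenByMask(n)
                    + " remainder=" + isEvenByRemainder(n)
                    + " naiveOdd=" + isOddNaive(n));
        }
    }
}
```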
A single & is the bitwise AND operator, not a comparison.
So this code checks whether the first bit (least significant, rightmost) is set, which indicates whether the number is odd, because all odd numbers end with 1 in the least significant bit, e.g. xxxxxxx1.
& is a bitwise AND operation.
For number = 8:
  1000
& 0001
  ----
  0000
The result is that (8 & 1) == 0. This is the case for all even numbers, since they are multiples of 2 and the first binary digit from the right is always 0. 1 has a binary value of 1 with leading 0s, so when we AND it with an even number we're left with 0.
The & operator in Java is the bitwise-and operator. Basically, (number & 1) performs a bitwise-and between number and 1. The result is either 0 or 1, depending on whether it's even or odd. Then the result is compared with 0 to determine if it's even.
Here's a page describing bitwise operations.
It is performing a binary and against 1, which returns 0 if the least significant bit is not set
for your example
00001010 (10)
00000001 (1)
===========
00000000 (0)
This is the bitwise & (AND) operator, a concept from logic design.
return (2 & 1); means: convert the values to their binary representations, AND them bit by bit, and return the result.
Refer to this link: http://www.roseindia.net/java/master-java/java-bitwise-and.shtml

How does this code involving xor actually work?

I have a variable that represents the XOR of 2 numbers. For example: int xor = 7 ^ 2;
I am looking at some code that, according to its comments, finds the rightmost bit that is set in XOR:
int rightBitSet = xor & ~(xor - 1);
I can't follow how exactly this piece of code works. In the case of 7^2 it will indeed set rightBitSet to 0001 (in binary), i.e. 1 (indeed the rightmost bit set).
But if the xor is 7^3 then the rightBitSet is being set to 0100 i.e 4 which is also the same value as xor (and is not the rightmost bit set).
The logic of the code is to find a number that represents a different bit between the numbers that make up xor and although the comments indicate that it finds
the right most bit set, it seems to me that the code finds a bit pattern with 1 differing bit in any place.
Am I correct? I am not sure also how the code works. It seems that there is some relationship between a number X and the number X-1 in its binary representation?
What is this relationship?
The effect of subtracting 1 from a binary number is to replace the least significant 1 in it with a 0, and set all the less significant bits to 1. For example:
5 - 1 = 101 - 1 = 100 = 4
4 - 1 = 100 - 1 = 011 = 3
6 - 1 = 110 - 1 = 101 = 5
So in evaluating x & ~(x - 1): above x's least significant 1, ~(x - 1) has the same set bits as ~x, so above x's least significant 1, x & ~(x - 1) has no 1 bits. By definition, x has a 1 bit at its least significant 1, and as we saw above ~(x - 1) will, too, but ~(x - 1) will have 0s below that point. Therefore, x & ~(x - 1) will have only one 1 bit, at the position of the least significant 1 bit of x. For 7 ^ 3 = 100 in binary, the only set bit is also the rightmost set bit, so a result of 4 is exactly what the comment promises.
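The argument can be checked on the two values from the question. A short sketch, with lowestSetBit as a hypothetical name for the expression; Integer.lowestOneBit from the standard library computes the same quantity and is printed alongside for comparison.

```java
class LowestSetBitDemo {
    // Keeps only the least significant 1 bit of x (0 stays 0).
    static int lowestSetBit(int x) {
        return x & ~(x - 1);
    }

    public static void main(String[] args) {
        int[] samples = { 7 ^ 2, 7 ^ 3, 12, 0 };
        for (int x : samples) {
            System.out.println(Integer.toBinaryString(x)
                    + " -> " + Integer.toBinaryString(lowestSetBit(x))
                    + " (Integer.lowestOneBit: " + Integer.toBinaryString(Integer.lowestOneBit(x)) + ")");
        }
    }
}
```

For 7 ^ 2 = 101 the result is 1, and for 7 ^ 3 = 100 the result is 100: when a value has a single set bit, that bit is its own rightmost set bit.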
