For some reason, the following two bitwise operations provide different results yet it seems intuitive that they should provide the same result since the masks used should be the same. What am I missing here? Why would the results of using the two masks vary?
public class BitShiftTest {
private long bitString = -8784238533840732024L ;
private final int MAX_BITS_POSITION = 63 ;
public static void main(String[] args) {
BitShiftTest bst = new BitShiftTest() ;
System.out.printf("Before applying mask: %s\n", Long.toBinaryString(bst.bitString));
System.out.printf("Using Mask 1: %s\n", Long.toBinaryString(bst.clearBitMask(60)));
System.out.printf("Using Mask 2: %s\n", Long.toBinaryString(bst.clearBitMaskAlternative(60)));
}
public long clearBitMask(int position) {
return bitString & ~(1 << position) ;
}
public long clearBitMaskAlternative(int position) {
return bitString & (0x8000000000000000L >>> MAX_BITS_POSITION - position) ;
}
}
The results produced are
Before applying mask: 1000011000011000000111011001100000101000001000000000000010001000
Using Mask 1: 1000011000011000000111011001100000101000001000000000000010001000
Using Mask 2: 0
You are assuming that ~(1<<position) and 0x8000000000000000L >>> MAX_BITS_POSITION - position are equal, but that's not true.
Basically, you're missing a ~ in the alternative case. Without it, the mask extracts a single bit instead of clearing it. It should be:
~(0x8000000000000000L >>> MAX_BITS_POSITION - position)
But also, as noted by @RolandIllig, 1 << position is actually done in int arithmetic. You're not seeing this in your output because neither the 60th nor 28th bits happen to be set in the long you are masking. 1L << position fixes this.
1 << 63 is of type int. For int shifts, Java uses only the low five bits of the shift distance, so 1 << 63 is the same as 1 << 31, which is Integer.MIN_VALUE. (In some other languages, such as C, the result would be undefined since you shift by more bits than an int has.)
To fix this, use 1L << 63, which operates on long.
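To make the int-versus-long difference concrete, here is a minimal sketch. The class and method names (ClearBitDemo, clearBit, clearBitFromTop) are made up for illustration and are not from the question. It shows the shift distance wrapping for int, and both masks agreeing once the shift is done on a long and the ~ is in place.

```java
final class ClearBitDemo {
    // Clears the bit at `position` (0 = least significant) using a long-typed mask.
    static long clearBit(long bits, int position) {
        return bits & ~(1L << position);
    }

    // Same result, built from the top bit as in the question's alternative, with the ~ added.
    static long clearBitFromTop(long bits, int position) {
        return bits & ~(0x8000000000000000L >>> (63 - position));
    }

    public static void main(String[] args) {
        // int shifts use only the low 5 bits of the distance, so 60 behaves like 28.
        System.out.println((1 << 60) == (1 << 28));                   // true
        System.out.println((1 << 63) == Integer.MIN_VALUE);           // true

        long allOnes = -1L;
        System.out.println(Long.toHexString(clearBit(allOnes, 60)));  // efffffffffffffff
        System.out.println(Long.toHexString(allOnes & ~(1 << 60)));   // ffffffffefffffff (bit 28 cleared instead)
    }
}
```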
Related
If I had a byte instead of an integer, I could easily create a boolean array with 256 positions and check:
boolean[] allBytes = new boolean[256];
if (allBytes[value & 0xFF] == true) {
// ...
}
Because I have an integer, I can't have an array with size 2 billion. What is the fastest way to check if an integer is true or false? A set of Integers? A hashtable?
EDIT1: I want to associate for every possible integer (2 billion) a true or false flag.
EDIT2: I have ID X (integer) and I need a quick way to know if ID X is ON or OFF.
A BitSet can't handle negative numbers. But there's a simple way around:
class BigBitSet {
private final BitSet[] bitSets = new BitSet[] {new BitSet(), new BitSet()};
public boolean get(int bitIndex) {
return bitIndex < 0 ? bitSets[1].get(~bitIndex)
: bitSets[0].get(bitIndex);
}
...
}
The second BitSet is for negative numbers, which get translated via the '~' operator (that's better than simply negating as it works for Integer.MIN_VALUE, too).
The memory consumption may get up to 4 Gibit, i.e., 512 MiB (about 537 MB).
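For what it's worth, a fuller version of the same idea could look like the sketch below. The class name SignedBitSet and the set method are my own additions for illustration; only the ~ mapping comes from the answer above. Each BitSet grows on demand, so memory is only spent on the index range actually used.

```java
import java.util.BitSet;

final class SignedBitSet {
    // Slot 0 holds non-negative keys, slot 1 holds negative keys mapped through ~.
    private final BitSet[] bitSets = {new BitSet(), new BitSet()};

    private static int slot(int key) {
        return key < 0 ? 1 : 0;
    }

    // ~key maps -1 to 0 and Integer.MIN_VALUE to Integer.MAX_VALUE, so it never overflows.
    private static int index(int key) {
        return key < 0 ? ~key : key;
    }

    boolean get(int key) {
        return bitSets[slot(key)].get(index(key));
    }

    void set(int key, boolean value) {
        bitSets[slot(key)].set(index(key), value);
    }
}
```

Plain negation would not work for the lowest key, because -Integer.MIN_VALUE overflows back to Integer.MIN_VALUE.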
I feel stupid for even elaborating on this.
The smallest unit of information your computer can store is a bit, right? A bit has two states, you want two states, so let's just say bit=0 is false and bit=1 is true.
So you need as many bits as there are possible ints, 2^32 = 4,294,967,296. You can fit 8 bits into a byte, so you need only 2^32 / 8 = 536,870,912 bytes.
From that easily follows code to address each of these bits in the bytes...
byte[] store = new byte[1 << 29]; // 2^29 bytes provide 2^32 bits
void setBit(int i) {
int byteIndex = i >>> 3;
int bitMask = 1 << (i & 7);
store[byteIndex] |= bitMask;
}
boolean testBit(int i) {
int byteIndex = i >>> 3;
int bitMask = 1 << (i & 7);
return (store[byteIndex] & bitMask) != 0;
}
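One detail that is easy to miss is why this works for negative ints: the unsigned shift >>> treats the int as a 32-bit pattern, so every value lands on a valid index. A small sketch of just the index arithmetic (BitIndexDemo and its method names are made up here, and no 512 MiB array is allocated):

```java
final class BitIndexDemo {
    // Byte that holds the flag for i; >>> keeps the index non-negative even for negative i.
    static int byteIndex(int i) {
        return i >>> 3;
    }

    // Mask selecting the flag's bit inside that byte.
    static int bitMask(int i) {
        return 1 << (i & 7);
    }

    public static void main(String[] args) {
        System.out.println(byteIndex(-1));                 // 536870911, the last slot of a 2^29-byte array
        System.out.println(bitMask(-1));                   // 128
        System.out.println(byteIndex(Integer.MIN_VALUE));  // 268435456
        System.out.println(-1 >> 3);                       // -1, a signed shift would give an invalid index
    }
}
```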
java.util.BitSet provides practically the same premade in a nice class, only you can use it to store a maximum of 2^31 bits since it does not work with negative bit indices.
Since you're using Java, use BitSet. It's fast and easy. If you prefer, you could also use an array of primitive longs or BigInteger, but this is really what BitSet is for.
http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html
I need a specific bit in a byte value stored as int value. My code is as shown below.
private int getBitValue(int byteVal, int bitShift){
byteVal = byteVal << bitShift;
int bit = (int) (byteVal >>>7);
return bit;
}
It is working when I give the bitshift as 1 but when I give the bitshift as 2 and the byteVal as 67(01000011 in binary), I get the value of 'byteVal' as 268 while 'byteVal' should be 3(000011 in binary) after the first line in the method(the left shift). What am I doing wrong here?
For some reason when I try your code I don't get what you get. For your example, if you say byteVal = 0b01000011 and bitShift = 2, then this is what I get:
byteVal = 0b01000011 << 2 = 0b0100001100
bit = (int) (0b0100001100 >>> 7) = (int) (0b010) // redundant cast
returned value: 0b010 == 2
I believe what you intended to do was shift the bit you wanted to the leftmost position, and then shift it all the way to the right to get the bit. However, your code won't do that for a few reasons:
You need to shift left by (variable length - bitShift) to get the desired bit to the place you want. So in this case, what you really want is to shift byteVal left by 6 places, not 2.
int variables are 32 bits wide, not 8. (so you actually want to shift byteVal left by 30 places)
In addition, your question appears to be somewhat contradictory. You state you want a specific bit, yet your example implies you want the bitShift-th least significant bits.
An easier way of getting a specific bit might be to simply shift right as far as you need and then mask with 1: (also, you can't use return with void, but I'm assuming that was a typo)
private int getBitValue(int byteVal, int bitShift) {
byteVal = byteVal >> bitShift; // makes the bitShift-th bit the rightmost bit
// Assumes bit numbers are 0-based (i.e. original rightmost bit is the 0th bit)
return byteVal & 1; // AND the result with 1, which keeps only the rightmost bit
}
If you want the bitShift-th least significant bits, I believe something like this would work:
private int getNthLSBits(int byteVal, int numBits) {
return byteVal & ((1 << numBits) - 1);
// ((1 << numBits) - 1) gives you numBits ones
// i.e. if numBits = 3, (1 << numBits) - 1 == 0b111
// AND that with byteVal to get the numBits least significant bits
}
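To check both readings of the question against the example value 67 (0b01000011), here is a small sketch. The names BitExtractDemo, bitAt and lowBits are hypothetical, and count is assumed to be in the range 0..31.

```java
final class BitExtractDemo {
    // Value (0 or 1) of the bit at `index`, counting from 0 at the least significant bit.
    static int bitAt(int value, int index) {
        return (value >>> index) & 1;
    }

    // The `count` least significant bits of value; assumes 0 <= count <= 31.
    static int lowBits(int value, int count) {
        return value & ((1 << count) - 1);
    }

    public static void main(String[] args) {
        int byteVal = 0b01000011;                 // 67
        System.out.println(bitAt(byteVal, 0));    // 1
        System.out.println(bitAt(byteVal, 2));    // 0
        System.out.println(bitAt(byteVal, 6));    // 1
        System.out.println(lowBits(byteVal, 2));  // 3
    }
}
```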
I'm curious why the answer should be 3 and I think we need more information on what the function should do.
Assuming you want the value of the byteVal's lowest bitShift bits, I'd do the following.
private int getBitValue(int byteVal, int bitShift){
int mask = 1 << bitShift; // mask = 1000.... (number of 0's = bitShift)
mask--; // mask = 000011111 (number of 1's = bitShift)
return (byteVal & mask);
}
At the very least, this function will return 1 for getBitValue(67, 1) and 3 for getBitValue(67,2).
Is there a way to shift left or right without losing bits, so that the vacated positions are filled with the bits that were shifted out?
e.g.:10010 shr 2 => 10100
or: 11001 shl 4 => 11100
The loss of information seems quite inconvenient, since you're not supposed to use shifts for math anyway.
I just want to send packets over the network in a different byte order, so being able to shift back is important to me.
What you're trying to do is bitwise rotation which is supported in Java.
public class Binary {
public static void main(String[] args) {
Integer i = 18;
System.out.println(Integer.toBinaryString(i));
i = Integer.rotateRight(i, 2);
System.out.println(Integer.toBinaryString(i));
}
}
This will print out:
10010
10000000000000000000000000000100
The 2 bits which were shifted off have been rotated round to the start. However there is a lot of 0 padding in the middle because an integer in Java takes up 32 bits.
If you wanted to implement this behaviour yourself, internally it is implemented as:
public static int rotateLeft(int i, int distance) {
return (i << distance) | (i >>> -distance);
}
And:
public static int rotateRight(int i, int distance) {
return (i >>> distance) | (i << -distance);
}
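The negative distance works because int shifts only use the low five bits of the distance, so i >>> -distance is the same as i >>> (32 - distance). If you want the rotation to wrap inside a narrower field, as in the 5-bit examples from the question, something along these lines might do. This is only a sketch: NarrowRotate and the width parameter are my own names, and width is assumed to be between 1 and 31.

```java
final class NarrowRotate {
    // Rotates the low `width` bits of value to the right; assumes 1 <= width <= 31.
    static int rotateRight(int value, int distance, int width) {
        int mask = (1 << width) - 1;
        int d = Math.floorMod(distance, width);  // normalise, also handles negative distances
        int v = value & mask;
        return ((v >>> d) | (v << (width - d))) & mask;
    }

    // A left rotation is a right rotation by the negated distance.
    static int rotateLeft(int value, int distance, int width) {
        return rotateRight(value, -distance, width);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(rotateRight(0b10010, 2, 5)));  // 10100
        System.out.println(Integer.toBinaryString(rotateLeft(0b11001, 4, 5)));   // 11100
    }
}
```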
We divided an int to save three values into it. For example the first 8 bits (from left to right) hold one value, the 8th to 12th bits hold another value and rest of bits hold the third value.
I am writing a utility method to get the value from a certain range of bits of an int. Is it good enough? Do you have a better solution? The startBitPos and endBitPos are counted from right to left.
public static int bitsValue(int intNum, int startBitPos, int endBitPos)
{
//parameters checking ignored for now
int tempValue = intNum << endBitPos;
return tempValue >> (startBitPos + endBitPos);
}
EDIT:
I am sure all values will be unsigned.
No, this isn't quite right at the moment:
You should use the unsigned right shift operator to avoid ending up with negative numbers when you don't want them. (That's assuming the original values are unsigned, of course.)
You're not shifting left by the appropriate amount to clear the extraneous high bits.
I suspect you want:
// Clear unnecessary high bits
int tempValue = intNum << (31 - endBitPos);
// Shift back to the lowest bits
return tempValue >>> (31 - endBitPos + startBitPos);
Personally I'd feel more comfortable with a mask-and-shift than this double shifting, but I'm finding it hard to come up with something as short as the above.
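For comparison, a mask-and-shift version might look like the sketch below. It assumes both positions are 0-based from the least significant bit, inclusive, with startBitPos <= endBitPos. The class name BitRange is made up for the example.

```java
final class BitRange {
    // Extracts bits startBitPos..endBitPos (inclusive, 0 = least significant) as an unsigned value.
    static int bitsValue(int intNum, int startBitPos, int endBitPos) {
        int width = endBitPos - startBitPos + 1;  // 1..32
        int mask = -1 >>> (32 - width);           // `width` ones; also correct when width is 32
        return (intNum >>> startBitPos) & mask;
    }

    public static void main(String[] args) {
        int packed = 0x12345678;
        System.out.println(Integer.toHexString(bitsValue(packed, 24, 31)));  // 12
        System.out.println(Integer.toHexString(bitsValue(packed, 8, 15)));   // 56
        System.out.println(Integer.toHexString(bitsValue(packed, 0, 31)));   // 12345678
    }
}
```

Building the mask with -1 >>> (32 - width) rather than (1 << width) - 1 avoids the case where width is 32 and the shift distance wraps around to 0.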
public static int bitsValue(int intNum, int startBitPos, int endBitPos)
{
int mask = ~0; //or 0xffffffff
//parameters checking ignored for now
mask = ~(mask<<(endBitPos)) & mask<<startBitPos;
return intNum & mask;
}
However, if you have commonly used bit ranges, it's better to keep masks for them statically:
0xff000000 // is the 8 most significant bits
0x00e00000 // is the next 3 bits and
0x001fffff // are the remaining 21 bits
If you only have a couple of fixed length 'masks' you could store them explicitly and use them like this:
int[] masks = new int[4];
masks[0] = 0b11111111; // lowest 8 bits (0xff)
masks[1] = 0b111100000000; // next 4 bits (0xf00)
// ...
public int getValue(int input, int maskIndex) {
return input & masks[maskIndex];
}
I'm currently looking at a simple programming problem that might be fun to optimize - at least for anybody who believes that programming is art :) So here is it:
How to best represent long's as Strings while keeping their natural order?
Additionally, the String representation should match ^[A-Za-z0-9]+$. (I'm not too strict here, but avoid using control characters or anything that might cause headaches with encodings, is illegal in XML, has line breaks, or similar characters that will certainly cause problems)
Here's a JUnit test case:
@Test
public void longConversion() {
final long[] longs = { Long.MIN_VALUE, Long.MAX_VALUE, -5664572164553633853L,
-8089688774612278460L, 7275969614015446693L, 6698053890185294393L,
734107703014507538L, -350843201400906614L, -4760869192643699168L,
-2113787362183747885L, -5933876587372268970L, -7214749093842310327L, };
// keep it reproducible
//Collections.shuffle(Arrays.asList(longs));
final String[] strings = new String[longs.length];
for (int i = 0; i < longs.length; i++) {
strings[i] = Converter.convertLong(longs[i]);
}
// Note: Comparator is not an option
Arrays.sort(longs);
Arrays.sort(strings);
final Pattern allowed = Pattern.compile("^[A-Za-z0-9]+$");
for (int i = 0; i < longs.length; i++) {
assertTrue("string: " + strings[i], allowed.matcher(strings[i]).matches());
assertEquals("string: " + strings[i], longs[i], Converter.parseLong(strings[i]));
}
}
and here are the methods I'm looking for
public static class Converter {
public static String convertLong(final long value) {
// TODO
}
public static long parseLong(final String value) {
// TODO
}
}
I already have some ideas on how to approach this problem. Still, I thought I might get some nice (creative) suggestions from the community.
Additionally, it would be nice if this conversion would be
as short as possible
easy to implement in other languages
EDIT: I'm quite glad to see that two very reputable programmers ran into the same problem as I did: using '-' for negative numbers can't work as the '-' doesn't reverse the order of sorting:
-0001
-0002
0000
0001
0002
Ok, take two:
class Converter {
public static String convertLong(final long value) {
return String.format("%016x", value - Long.MIN_VALUE);
}
public static long parseLong(final String value) {
String first = value.substring(0, 8);
String second = value.substring(8);
long temp = (Long.parseLong(first, 16) << 32) | Long.parseLong(second, 16);
return temp + Long.MIN_VALUE;
}
}
This one takes a little explanation. Firstly, let me demonstrate that it is reversible and the resultant conversions should demonstrate the ordering:
for (long aLong : longs) {
String out = Converter.convertLong(aLong);
System.out.printf("%20d %16s %20d\n", aLong, out, Converter.parseLong(out));
}
Output:
-9223372036854775808 0000000000000000 -9223372036854775808
9223372036854775807 ffffffffffffffff 9223372036854775807
-5664572164553633853 316365a0e7370fc3 -5664572164553633853
-8089688774612278460 0fbba6eba5c52344 -8089688774612278460
7275969614015446693 e4f96fd06fed3ea5 7275969614015446693
6698053890185294393 dcf444867aeaf239 6698053890185294393
734107703014507538 8a301311010ec412 734107703014507538
-350843201400906614 7b218df798a35c8a -350843201400906614
-4760869192643699168 3dedfeb1865f1e20 -4760869192643699168
-2113787362183747885 62aa5197ea53e6d3 -2113787362183747885
-5933876587372268970 2da6a2aeccab3256 -5933876587372268970
-7214749093842310327 1be00fecadf52b49 -7214749093842310327
As you can see Long.MIN_VALUE and Long.MAX_VALUE (the first two rows) are correct and the other values basically fall in line.
What is this doing?
Assuming signed byte values you have:
-128 => 0x80
-1 => 0xFF
0 => 0x00
1 => 0x01
127 => 0x7F
Now if you add 0x80 to those values you get:
-128 => 0x00
-1 => 0x7F
0 => 0x80
1 => 0x81
127 => 0xFF
which is the correct order (with overflow).
Basically the above is doing that with 64 bit signed longs instead of 8 bit signed bytes.
The conversion back is a little more roundabout. You might think you can use:
return Long.parseLong(value, 16);
but you can't. Pass in 16 f's to that function (which should decode to Long.MAX_VALUE) and it will throw a NumberFormatException. Long.parseLong reads the digits as a positive magnitude, and 0xffffffffffffffff is larger than Long.MAX_VALUE, which long cannot accommodate. So instead I split it in half and parse each piece, combining them together, left-shifting the first half by 32 bits.
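If you can rely on Java 8 or later, Long.parseUnsignedLong accepts the full 16-digit range, so the split is not needed. A minimal sketch under that assumption (SortableHex is a made-up class name, and flipping the sign bit with ^ is equivalent to the subtraction of Long.MIN_VALUE used above):

```java
final class SortableHex {
    // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto 0..2^64-1 in order.
    static String convertLong(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    // Long.parseUnsignedLong (Java 8+) accepts all 16 hex digits, including "ffffffffffffffff".
    static long parseLong(String text) {
        return Long.parseUnsignedLong(text, 16) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(convertLong(Long.MIN_VALUE));     // 0000000000000000
        System.out.println(convertLong(0L));                 // 8000000000000000
        System.out.println(convertLong(Long.MAX_VALUE));     // ffffffffffffffff
        System.out.println(parseLong("ffffffffffffffff"));   // 9223372036854775807
    }
}
```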
EDIT: Okay, so just adding the negative sign for negative numbers doesn't work... but you could convert the value into an effectively "unsigned" long such that Long.MIN_VALUE maps to "0000000000000000", and Long.MAX_VALUE maps to "FFFFFFFFFFFFFFFF". Harder to read, but will get the right results.
Basically you just need to add 2^63 to the value before turning it into hex - but that may be a slight pain to do in Java due to it not having unsigned longs... it may be easiest to do using BigInteger:
private static final BigInteger OFFSET = BigInteger.valueOf(Long.MIN_VALUE)
.negate();
public static String convertLong(long value) {
BigInteger afterOffset = BigInteger.valueOf(value).add(OFFSET);
return String.format("%016x", afterOffset);
}
public static long parseLong(String text) {
BigInteger beforeOffset = new BigInteger(text, 16);
return beforeOffset.subtract(OFFSET).longValue();
}
That wouldn't be terribly efficient, admittedly, but it works with all your test cases.
If you don't need a printable String, you can encode the long in four chars after you've shifted the value by Long.MIN_VALUE (-0x8000000000000000L) to emulate an unsigned long:
public static String convertLong(long value) {
value += Long.MIN_VALUE;
return "" +
(char)(value>>48) + (char)(value>>32) +
(char)(value>>16) + (char)value;
}
public static long parseLong(String value) {
return (
(((long)value.charAt(0))<<48) +
(((long)value.charAt(1))<<32) +
(((long)value.charAt(2))<<16) +
(long)value.charAt(3)) + Long.MIN_VALUE;
}
Usage of surrogate pairs is not a problem, since the natural order of a string is defined by the UTF-16 values of its chars and not by Unicode code point values.
There's a technique in RFC2550 -- an April 1st joke RFC about the Y10K problem with 4-digit dates -- that could be applied to this purpose. Essentially, each time the integer's string representation grows to require another digit, another letter or other (printable) character is prepended to retain desired sort-order. The negative rules are more arcane, yielding strings that are harder to read at a glance... but still easy enough to apply in code.
Nicely, for positive numbers, they're still readable.
See:
http://www.faqs.org/rfcs/rfc2550.html