Is it possible to create a bitmask for ~100 constants?

Would that mean that the 100th constant would have to be 1 << 100?

You can use a BitSet, which lets you set or clear any number of bits, e.g.:
BitSet bitSet = new BitSet(101);
bitSet.set(100);
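As a minimal sketch of how this could stand in for ~100 int constants: each constant becomes a bit index instead of a shifted value. The class name BitSetFlags, the constants FLAG_FIRST and FLAG_HUNDREDTH, and the helpers of and hasAll are made up for illustration.

```java
import java.util.BitSet;

class BitSetFlags {
    // Hypothetical constants: these are bit indices, not 1 << n values.
    static final int FLAG_FIRST = 0;
    static final int FLAG_HUNDREDTH = 99;

    // Builds a BitSet with the given flag indices set.
    static BitSet of(int... flags) {
        BitSet set = new BitSet(100);
        for (int f : flags) {
            set.set(f);
        }
        return set;
    }

    // True if every bit set in required is also set in set.
    static boolean hasAll(BitSet set, BitSet required) {
        BitSet missing = (BitSet) required.clone();
        missing.andNot(set);      // clear every required bit that is present in set
        return missing.isEmpty(); // nothing left means nothing was missing
    }
}
```

Combining two flag sets corresponds to BitSet.or, and testing a single flag to BitSet.get.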

You can't do it directly, because the largest primitive type that can serve as a bitmask is a long, which has 64 bits. What you can do is split the bitmask across two or more ints or longs and manage it by hand:
int[] mask = new int[4];
final int MAX_SHIFT = 32;
void set(int b) {
mask[b / MAX_SHIFT] |= 1 << (b % MAX_SHIFT);
}
boolean isSet(int b) {
return (mask[b / MAX_SHIFT] & (1 << (b % MAX_SHIFT))) != 0;
}

A simple bitmask can only have as many bits as the primitive type holds.
An int in Java is 32 bits, so 1 << 31 is the furthest you can shift the low bit.
To handle more constants, use an array of int elements: pick the array element by dividing the bit index by 32, and shift by the bit index % 32 (modulo) within the selected element.

Effective Java Item #32 suggests using an EnumSet instead of bit fields. Internally it uses a bit vector, so it is efficient, and it is more readable because each bit has a descriptive name (the enum constant).
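A small sketch of what that could look like. The enum Permission and its four constants are hypothetical placeholders; a real enum would list all ~100 constants.

```java
import java.util.EnumSet;

class EnumSetFlags {
    // Hypothetical flags for illustration only.
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    // Returns a new set holding every flag from a and b.
    static EnumSet<Permission> union(EnumSet<Permission> a, EnumSet<Permission> b) {
        EnumSet<Permission> result = EnumSet.copyOf(a);
        result.addAll(b); // plays the role of bitwise OR
        return result;
    }
}
```

Here contains replaces the usual (mask & FLAG) != 0 test, and the approach is not limited to 64 constants.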

Yes, if you intend to be able to bitwise OR any or all of those constants together, then you're going to need a bit representing each constant. Of course if you use an int you will only have 32 bits and a long will only give you 64 bits.

Related

Getting 16 least and most significant bits of an arbitrary length binary number

So the problem I am having is obtaining the least significant and most significant 16 bits of a number over 16 bits but not necessarily of any certain length.
If the number was an int which is 32 bits I believe I could just do something like:
int Num = 0xFFFFFFFF;
short most = (short)(Num & 0xFFFF0000);
short least =(short)(Num & 0x0000FFFF);
Result:
most=0xFFFF
least=0xFFFF
Which in theory should get me 16-bit shorts holding the least and most significant bits. But the problem is that I need to do this for a number with an arbitrary number of bits, so this approach will not work, because the mask I need to & the number with changes. Is there a better approach to getting these values?
It seems like there would be a fairly simple way to do this, but I can't find anything online.
Thanks

Before the main subject: your code for getting most is wrong.
You need to shift right by 16 bits (four hex digits):
short most = (short)((Num & 0xFFFF0000) >> 0x10);
I guess you want something like this:
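// lenMost must be in 1 to 31: Java only uses the low 5 bits of an int
// shift distance, so a shift by 32 would behave like a shift by 0
int[] divide(int target, int lenMost) {
int MASK = 0xFFFFFFFF;
int lenLeast = 32 - lenMost;
int[] ret = new int[2];
// get most; the unsigned shift (>>>) keeps the sign bit from being copied down
ret[0] = (target & (MASK << lenLeast)) >>> lenLeast;
// get least; >>> is needed here too, because MASK >> lenMost would stay all ones
ret[1] = target & (MASK >>> lenMost);
return ret;
}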
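For a number wider than a primitive type, one possible approach is java.math.BigInteger. The sketch below assumes that "most significant 16 bits" means the 16 bits starting at the highest set bit, and that the value is non-negative and at least 16 bits long. The class and method names are made up for illustration.

```java
import java.math.BigInteger;

class SixteenBitEnds {
    private static final BigInteger LOW_MASK = BigInteger.valueOf(0xFFFF);

    // Lowest 16 bits of n.
    static int least16(BigInteger n) {
        return n.and(LOW_MASK).intValue();
    }

    // Highest 16 bits of n, counted from its top set bit.
    // Assumes n is non-negative and n.bitLength() >= 16.
    static int most16(BigInteger n) {
        return n.shiftRight(n.bitLength() - 16).intValue();
    }
}
```

If the width is fixed and leading zero bits count, shift by (width - 16) instead of using bitLength().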

Very fast universal hash function for 128 bit keys

I need a very fast universal hash function for a 128-bit key. The returned value needs to be about 32 bit (well, 16 bit would be sufficient; in most cases I only need 1-4 bits actually).
Universal hash means there are two parameters: key (128 bit) and index (64 bit). For two keys, the universal hash function needs to return different results eventually, if called with different indexes. So with a different index, the universal hash should behave like a different hash function. For x = universalHash(k, i) and y = universalHash(k, i + 1), it would be best if on average 50% of all bits differ between x and y (randomly). The same applies if the method is called with different keys. In practice, 5% off is OK for me.
It needs to be very fast (one or two multiplications at most). It is called millions of times. Please don't say: no, you won't need it to be fast. It also needs to return different values eventually.
What I have so far (Java code, but C is (due to the lack of a 128 bit data type, the key is the composite of a and b, which are 64 bit each):
int universalHash(long a, long b, long index) {
long x = a ^ Long.rotateLeft(b, (int) index) ^ index;
int y = (int) ((x >>> 32) ^ x);
y = ((y >>> 16) ^ y) * 0x45d9f3b;
y = ((y >>> 16) ^ y) * 0x45d9f3b;
y = (y >>> 16) ^ y;
return y;
}
int universalHash2(long a, long b, long index) {
long x = Long.rotateLeft(a, (int) index) ^
Long.rotateRight(b, (int) index) ^ index;
x = (x ^ (x >>> 32)) * 0xbf58476d1ce4e5b9L;
return (int) ((x >>> 32) ^ x);
}
(The second method is actually broken for some values.)
I would like to have a hash function that is faster than those above and is guaranteed to work in all cases (if possible provably correct, even though that's not a strict requirement; it doesn't need to be cryptographically secure, however).
I will call the universalHash method with incrementing index (first index 0, then index 1, and so on) for the same keys. It would be best if the next result could be calculated faster (e.g. without multiplication) from the previous result. But I also need to have a fast "direct access" if the index is some value (as in the example code).
Background
The problem I'm trying to solve is finding an MPHF (minimal perfect hash function) for a relatively small set of keys (up to 16 keys by direct mapping, and up to about 1024 keys by splitting into smaller subsets). For details on the algorithm, see my MinPerf project, especially the RecSplit algorithm. To support sets of size 10^12 (like BBHash), I'm trying to internally use 128-bit signatures, which would simplify the algorithm.

You need a hash function that outputs 32 bits for 128 bits of inputs.
A simple way would be to just return "some" 32 bits out of the original 128 bits. There are many ways of choosing 32 bits, and every choice will have collisions, but the index can decide which 32 bits to choose.
128/32 = 4, so 4 indices are enough to find at least one different bit.
For index 0 you choose the lowest 32 bits
For index 1 you choose the next 32 bits
and so on.
The C implementation would be
uint32_t universal_hash(uint64_t key_higher, uint64_t key_lower, int index) {
// For a lack of portable 128 bit datatype we take the key in parts.
return 0xFFFFFFFF & ( index >=2 ? key_higher >> ((index - 2)*32) : key_lower >> (index*32));
}
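Since the question uses Java, here is a rough Java translation of the same idea. Reducing the index modulo 4 is an assumption added here so that any index is accepted; the class name is made up. It selects one of only four 32-bit words, so it guarantees that two different keys differ for at least one of four consecutive indices, not the bit-mixing behaviour asked for in the question.

```java
class WordSelectHash {
    // Returns one of the four 32-bit words of the 128-bit key (keyHigher:keyLower).
    // Assumption: the index is reduced modulo 4, so index 4 selects the same word as index 0.
    static int universalHash(long keyHigher, long keyLower, long index) {
        int word = (int) (index & 3);
        long half = word >= 2 ? keyHigher : keyLower;
        // Even words are the low 32 bits of their half, odd words the high 32 bits.
        return (int) (half >>> ((word & 1) * 32));
    }
}
```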

Java, combining two integers to long results negative number

I am trying to combine two integers into a long in Java. Here is the code I am using:
Long combinedValue = (long) a << 32 | b;
When a = 0x03 and b = 0x1B56 ED23, I get the expected value (combinedValue = 13343583523 as a long).
However, when a = 0x00 and b = 0xA2BF E1C7, I get a negative value, -1564483129, instead of 2730484167. Can anyone explain why shifting an integer 0 by 32 bits causes the first 32 bits to become 0xFFFF FFFF?
Thanks

b is negative, too. That's what that constant means. What you probably want is ((long) a << 32) | (b & 0xFFFFFFFFL).

When you OR (long) a << 32 with b, if b is an int then it will be promoted to a long because the operation must be done between two values of the same type. This is called a widening conversion.
When this conversion from int to long happens, b will be sign extended, meaning that if the top bit is set then it will be copied into the top 32 bits of the 64 bit long value. This is what causes the top 32 bits to be 0xffffffff.
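The difference can be seen side by side in a short sketch. The class and method names are made up for illustration; combineNaive reproduces the expression from the question.

```java
class IntPairToLong {
    // Masking after widening treats b as an unsigned 32-bit value,
    // so the sign-extended upper 32 bits are cleared before the OR.
    static long combine(int a, int b) {
        return ((long) a << 32) | (b & 0xFFFFFFFFL);
    }

    // The expression from the question: b is sign-extended to 64 bits before the OR.
    static long combineNaive(int a, int b) {
        return (long) a << 32 | b;
    }
}
```

With a = 0 and b = 0xA2BFE1C7, combine returns 2730484167 while combineNaive returns a negative number. On Java 8 and later, Integer.toUnsignedLong(b) can be used in place of the explicit mask.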

Bit Packing in Java

I am working on a compression algorithm. I am reading an image file with 8 bits/pixel, and I want to pack these 8-bit values into 4 bits in order to compress it. I would like some insight into bit packing in Java and how to approach this problem. I don't need a working solution, just guidance.
Thanks in advance

Your compression routine could look as follows:
void compress(byte[] pic, byte[] picCompressed) {
boolean odd = false;
int pos = 0;
for (byte p : pic)
{
byte b = quantize(p);
if (odd) {
picCompressed[pos++] |= (byte)(b << 4);
} else {
picCompressed[pos] = b;
}
odd = !odd;
}
}
The original array is traversed in a loop. Controlled by an alternating odd variable, the compressed 4 bits are stuffed either in the upper or in the lower half of the byte position in the compressed array.
A simplistic quantizing routine would just ignore the lower 4 bits:
byte quantize(byte p) {
return (byte)((p >> 4) & 0x0F);
}
In practice, quantizing is non-uniform and often implemented using a look-up table. You could use an array of 256 bytes to assign a target value to every possible byte value.
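To show the round trip, here is a self-contained sketch that packs two 4-bit values per byte and unpacks them again. It uses the same simplistic quantizer; the class name, the index-based loop, and the decompress method are additions for illustration. The 4 discarded bits are lost, so decompression only restores the upper half of each pixel.

```java
class NibblePacking {
    // Keeps the upper 4 bits of a pixel as a value in 0..15.
    static byte quantize(byte p) {
        return (byte) ((p >> 4) & 0x0F);
    }

    // Even pixels go into the low nibble, odd pixels into the high nibble.
    static byte[] compress(byte[] pic) {
        byte[] packed = new byte[(pic.length + 1) / 2];
        for (int i = 0; i < pic.length; i++) {
            int q = quantize(pic[i]);
            if ((i & 1) == 0) {
                packed[i / 2] = (byte) q;
            } else {
                packed[i / 2] |= (byte) (q << 4);
            }
        }
        return packed;
    }

    // Reverses compress; pixelCount is needed because the last byte may be half used.
    static byte[] decompress(byte[] packed, int pixelCount) {
        byte[] pic = new byte[pixelCount];
        for (int i = 0; i < pixelCount; i++) {
            int b = packed[i / 2] & 0xFF; // read the byte as unsigned
            int q = (i & 1) == 0 ? (b & 0x0F) : (b >>> 4);
            pic[i] = (byte) (q << 4);     // move the nibble back to the upper half
        }
        return pic;
    }
}
```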

Java has operators to test/manipulate the bits of a number. Take a look at this:
Bitwise and Bit Shift Operators
If you need to handle a larger number of bits, there is also the BitSet class.
Basically what you need is just the bitwise operators to test/manipulate the bits of variables of either byte or int type.

What is the fastest way to associate a boolean flag to every possible integer value?

If I had a byte instead of an integer, I could easily create a boolean array with 256 positions and check:
boolean[] allBytes = new boolean[256];
if (allBytes[value & 0xFF] == true) {
// ...
}
Because I have an integer, I can't have an array with size 2 billion. What is the fastest way to check if an integer is true or false? A set of Integers? A hashtable?
EDIT1: I want to associate for every possible integer (2 billion) a true or false flag.
EDIT2: I have ID X (integer) and I need a quick way to know if ID X is ON or OFF.

A BitSet can't handle negative numbers. But there's a simple way around:
class BigBitSet {
private final BitSet[] bitSets = new BitSet[] {new BitSet(), new BitSet()};
public boolean get(int bitIndex) {
return bitIndex < 0 ? bitSets[1].get(~bitIndex)
: bitSets[0].get(bitIndex);
}
...
}
The second BitSet is for negative numbers, which get translated via the '~' operator (that's better than simply negating as it works for Integer.MIN_VALUE, too).
The memory consumption may get up to 4 Gibit, i.e., 512 MiB (about 537 MB).
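The elided part of the class is not shown in the answer. A complete version could look roughly like the following sketch, where the class name SignedBitSet and the set method are assumptions rather than the original code.

```java
import java.util.BitSet;

class SignedBitSet {
    // nonNegative holds indices >= 0; negative holds ~index for indices < 0.
    private final BitSet nonNegative = new BitSet();
    private final BitSet negative = new BitSet();

    public boolean get(int index) {
        return index < 0 ? negative.get(~index) : nonNegative.get(index);
    }

    public void set(int index, boolean value) {
        if (index < 0) {
            negative.set(~index, value); // ~index maps -1 to 0, -2 to 1, and so on
        } else {
            nonNegative.set(index, value);
        }
    }
}
```

Each BitSet grows on demand, so memory stays small as long as the IDs in use are close to zero.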

I feel stupid for even elaborating on this.
The smallest unit of information your computer can store is a bit, right? A bit has two states, you want two states, so let's just say bit=0 is false and bit=1 is true.
So you need as many bits as there are possible int values, 2^32 = 4,294,967,296. You can fit 8 bits into a byte, so you need only 2^32 / 8 = 536,870,912 bytes.
From that easily follows code to address each of these bits in the bytes...
byte[] store = new byte[1 << 29]; // 2^29 bytes provide 2^32 bits
void setBit(int i) {
int byteIndex = i >>> 3;
int bitMask = 1 << (i & 7);
store[byteIndex] |= bitMask;
}
boolean testBit(int i) {
int byteIndex = i >>> 3;
int bitMask = 1 << (i & 7);
return (store[byteIndex] & bitMask) != 0;
}
java.util.BitSet provides practically the same thing ready-made in a nice class, except that it can store at most 2^31 bits, since it does not work with negative bit indices.

Since you're using Java, use BitSet. It's fast and easy. If you prefer, you could also use an array of primitive longs or BigInteger, but this is really what BitSet is for.
http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html
