Use of bitwise shift operators in ConcurrentHashMap - java

As I was going through the ConcurrentHashMap source code, I encountered many bitwise shift operators. Some are used to create constants and some are applied to variables.
static final int MAXIMUM_CAPACITY = 1 << 30;
static final int MAX_SEGMENTS = 1 << 16; // slightly conservative
long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
I am not able to understand: if a constant like MAXIMUM_CAPACITY could be declared directly, what is the point of using a bitwise shift operator?

They are not using the number in decimal form (base 10). Instead, they are saying "this is a number with 30 trailing 0 bits", signaling that the number is meant for base-2 arithmetic.
The bit shifting makes the value easier for the reader to grasp. In base 10 it would be 1073741824, which looks like a random number.
This is common in programming. For example:
int secondsInDay = 60 * 60 * 24;
We are representing the amount of seconds in a minute, multiplied by the amount of minutes in an hour, multiplied by the amount of hours in a day.
We could have just put 86400, but what if we wanted to change how many minutes are in an hour (to represent time on some other planet)? You would have to manually calculate it to change the value.
On the other hand, by breaking it down into units as shown above, we can simply change the middle 60 to change how many minutes are in an hour.
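The same readability argument applies to the shifted constants above. Here is a minimal sketch (the masking idiom at the end is a standard reason hash tables favor power-of-two capacities, shown as an illustration rather than ConcurrentHashMap's actual code):
public class ShiftConstantDemo {
    // "1 followed by 30 zero bits" -- obviously a power of two
    static final int MAXIMUM_CAPACITY = 1 << 30;

    public static void main(String[] args) {
        System.out.println(MAXIMUM_CAPACITY);                          // 1073741824
        System.out.println(Integer.toBinaryString(MAXIMUM_CAPACITY));  // 1 followed by 30 zeros

        // Power-of-two capacities let a table replace a modulo with a cheap mask:
        int hash = 2_000_000_000;
        System.out.println(hash % MAXIMUM_CAPACITY);        // 926258176
        System.out.println(hash & (MAXIMUM_CAPACITY - 1));  // 926258176, same result
    }
}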

Most efficient way to round a timestamp to the nearest 10 seconds

What is the most efficient way in Java (11) to round a given timestamp (e.g. System.currentTimeMillis()) to the nearest 10 seconds?
e.g. 12:55:11 would be 12:55:10 and 12:55:16 would be 12:55:20
This code is executed ~10-20 times per second, so it must be efficient.
Any ideas?
Thanks
Probably this:
long time = System.currentTimeMillis();
long roundedTime = (time + 5_000) / 10_000 * 10_000;
Basically three 64-bit primitive arithmetic operations.
(If you want to truncate to 10 seconds granularity, just remove the + 5_000.)
Theoretically we should consider integer overflow. In practice the above code should be OK for roughly the next 292 million years. (Source: Wikipedia.)
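A self-contained sketch of this (the helper name roundToNearest10s is made up for illustration):
import java.time.Instant;

public class RoundDemo {
    // Round a millisecond timestamp to the nearest 10 seconds.
    static long roundToNearest10s(long millis) {
        return (millis + 5_000) / 10_000 * 10_000;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(Instant.ofEpochMilli(now));                     // e.g. ...T12:55:11Z
        System.out.println(Instant.ofEpochMilli(roundToNearest10s(now)));  // e.g. ...T12:55:10Z
    }
}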

Finding Java.util.Random seed from bounded nextInt(int bound) results

Background
I've been reading and trying to wrap my head around various questions/answers about finding the seed of java.util.Random given its output from nextInt().
The implementation of nextInt(int bound) is:
public int nextInt(int bound) {
    if (bound <= 0)
        throw new IllegalArgumentException("bound must be positive");

    if ((bound & -bound) == bound)  // i.e., bound is a power of 2
        return (int)((bound * (long)next(31)) >> 31);

    int bits, val;
    do {
        bits = next(31);
        val = bits % bound;
    } while (bits - val + (bound-1) < 0);
    return val;
}
The implementation of next(int bits) is:
protected int next(int bits) {
    long oldseed, nextseed;
    AtomicLong seed = this.seed;
    do {
        oldseed = seed.get();
        nextseed = (oldseed * multiplier + addend) & mask;
    } while (!seed.compareAndSet(oldseed, nextseed));
    return (int)(nextseed >>> (48 - bits));
}
where the multiplier is 0x5DEECE66DL, the addend is 0xBL, and the mask is (1L << 48) - 1. These are hexadecimal values; the L suffix is Java's notation for a long literal.
By calling nextInt() without a bound, the full 32 bits are returned from next(32) instead of dropping bits with bits % bound.
Questions
Without completely brute-forcing the full 2^48 possibilities, how would I go about finding the current seed after x calls to nextInt(n) (assuming the bound is never a power of 2)? For example, let's assume I want to find the seed given 10 calls to nextInt(344): [251, 331, 306, 322, 333, 283, 187, 54, 170, 331].
How can I determine the amount of data I'd need to find the correct seed, not just another one that produces the same starting data?
Does this change given bounds that are either odd/even?
Without completely brute-forcing the full 2^48 possibilities, how would I go about finding the current seed after x calls to nextInt(n) (assuming the bound is never a power of 2)?
Let's first remove the code that is there for multi-threading, error checking, and the power-of-two bound case. Things boil down to:
public int nextInt_equivalent(int bound) {
    int bits, val;
    do {
        seed = (seed * multiplier + addend) & mask;  // 48-bit LCG step
        bits = (int)(seed >>> 17);                   // keep the top 31 bits
        val = bits % bound;                          // build val in [0, bound)
    } while (bits - val + (bound-1) < 0);
    return val;
}
Next we must understand what the while (bits - val + (bound-1) < 0) is about. It is there to use bits only when it falls in an interval whose width is a multiple of bound, thus ensuring a uniform distribution of val. That interval is [0, (1L<<31)/bound*bound).
The while condition is equivalent to while (bits >= (1L<<31)/bound*bound), but executes faster. This condition triggers for the (1L<<31)%bound highest values of bits out of 1L<<31. When bound is 344, this happens for 8 values of bits out of 2^31, or about 3.7 per billion.
This is so rare that one reasonable approach is to assume it does not occur. Another is to hope that it occurs, and to test whether (and when) it does by checking if the seeds that cause that rare event lead to the sequence of val found in the givens. We have only ((1L<<31)%bound)<<17 (here slightly above a million) values of seed to test, which is quite feasible. In any case, in the rest I assume this is ruled out, and consider the generator without the while.
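As a quick sanity check of those numbers (a throwaway snippet, not part of the original answer):
long rejected = (1L << 31) % 344;              // 8 values of bits hit the while
double rate = rejected / (double) (1L << 31);  // ~3.7e-9, i.e. about 3.7 per billion
System.out.println(rejected + " rejected, rate " + rate);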
When bound is even, or more generally a multiple of 2^S for some S>0, observe that the low-order S bits of the output val (which we can find) are also the low-order S bits of bits, and thus the bits of rank [17, 17+S) of seed. Also, the low 17+S bits of seed evolve fully independently of the 31-S other ones. When bound is 344 = 8×43, we have S = 3, and thus we can attack the 17+S = 20 low-order bits of seed independently. We directly get S = 3 bits of seed from the first val.
We get the 17 low-order bits of seed by elimination: for each of the 2^17 candidates, and given the S = 3 bits we know, do the 17+S = 20 bits of seed lead to a sequence of val whose low-order S bits match the given sequence? With enough values we can make a full determination of the 17+S bits. We need ⌈17/S⌉+1 = 7 val to narrow down to a single value for the 17+S low-order bits of seed in this way. If we have fewer, we need to keep several candidates for the next step. In the question we have ample val to narrow down to a single value, and be confident we got it right.
And then, when we have these 17+S = 20 bits of seed, we can find the remaining 31-S = 28 with a moderate amount of brute force. We could test 2^28 values of the yet-unknown bits of seed and check which gives a full match with the known val. But better: we know seed % (bound<<17) exactly, and thus need only test 2^31/bound values of seed (here roughly 6 million).
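Here is a minimal sketch of the first phase (recovering the 20 low-order bits). For simplicity it brute-forces all 2^20 low-bit candidates rather than splitting off the S known bits, and it relies on the fact that the low k bits of the LCG state evolve independently of the higher ones; the class and method names are made up:
import java.util.ArrayList;
import java.util.List;

public class LowBitsPhase {
    static final long MULT = 0x5DEECE66DL, ADD = 0xBL;

    // vals: observed outputs of nextInt(344); 344 = 8*43, so S = 3.
    static List<Long> candidateLow20(int[] vals) {
        List<Long> survivors = new ArrayList<>();
        long mask20 = (1L << 20) - 1;
        for (long low20 = 0; low20 <= mask20; low20++) {
            long s = low20;
            boolean ok = true;
            for (int v : vals) {
                s = (s * MULT + ADD) & mask20;      // low 20 bits evolve on their own
                if ((int) (s >>> 17) != (v & 7)) {  // bits [17, 20) of seed == val mod 8
                    ok = false;
                    break;
                }
            }
            if (ok) survivors.add(low20);
        }
        return survivors;  // with ~7 or more vals this is typically a single candidate
    }
}
The second phase would then enumerate the roughly 2^31/bound ≈ 6 million remaining high-bit candidates against the full val sequence, as described above.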
How can I determine the amount of data I'd need to find the correct seed?
A working heuristic for all but pathological LCGs, and many other PRNGs, is that you need as much information as there are bits in the state, thus 48 bits. Each output gives you log2(bound) bits, thus you need ⌈48/log2(bound)⌉ values, here 6 (which would require keeping track of a few candidates for the low 20 bits of seed, and thus correspondingly more work in the second phase). Extra values give confidence that the actual state was recovered, but AFAIK wrong guesses just will not happen unless the while comes into play.
Does this change given bounds that are either odd/even?
The above attack strategy does not work well for an odd bound (we can't separately guess the low-order bits, and would need to search 2^48/bound values of seed). However, there are better attacks with less guesswork, applicable even if we raise the number of state bits considerably, and including for an odd bound. They are more difficult to explain (read: I can hardly get them working with a math package, and can't explain how yet; see this question).

Cast one of the operands of multiplication operation to a long?

I am multiplying two int values and storing the result in a long. But it displays an error:
"Cast one of the operands of this multiplication operation to a long." How do I solve this?
private int milisecPerSecond = 1000;
/**
* The update interval in seconds.
*/
// Update frequency in seconds
private int updateIntervalInSec = 5;
/**
* The update interval.
*/
// Update frequency in milliseconds
private long updateInterval = milisecPerSecond * updateIntervalInSec;
First, this will not result in a compiler error. It's the usual case to multiply two int numbers and store the result in an int.
A signed int is 32 bits wide and can hold numbers up to approx. 2 billion. If your code does what it says, this allows an update interval of up to 2 million seconds, or about 23 days.
If you really want to force a long-based calculation, just cast one of its factors:
long updateInterval = milisecPerSecond * (long) updateIntervalInSec;
The second factor will get converted automatically then.
Keep in mind that only casting the result will not be enough:
private long updateInterval = (long) (milisecPerSecond * updateIntervalInSec);
// THIS WON'T DO AS EXPECTED
as it will do the problematic calculation on int and just cast the possibly wrong result into a long.
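A quick demonstration of the difference (values chosen arbitrarily to force an int overflow):
int big = 1_000_000;
long wrong = (long) (big * big);  // the int multiply overflows first: -727379968
long right = (long) big * big;    // promoted to long before multiplying: 1000000000000
System.out.println(wrong + " vs " + right);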
Casting is something that's described on the first three pages of every book on Java. If you're serious enough about code quality to run code quality tools, then have someone with knowledge of the language you use sit beside you. A trained pair of fresh eyes for a code review is such a great resource.

new BigInteger(String) performance / complexity

I'm wondering about the performance/complexity of constructing BigInteger objects with the new BigInteger(String) constructor.
Consider the following method:
public static void testBigIntegerConstruction()
{
    for (int exp = 1; exp < 10; exp++)
    {
        StringBuffer bigNumber = new StringBuffer((int) Math.pow(10.0, exp));
        for (int i = 0; i < Math.pow(10.0, exp - 1); i++)
        {
            bigNumber.append("1234567890");
        }
        String val = bigNumber.toString();
        long time = System.currentTimeMillis();
        BigInteger bigOne = new BigInteger(val);
        System.out.println("time for constructing a 10^" + exp
            + " digits BigInteger : " + ((System.currentTimeMillis() - time))
            + " ms");
    }
}
This method creates BigInteger objects of Strings with 10^x digits, where x=1 at the beginning, and it's increased with every iteration. It measures and outputs the time required for constructing the corresponding BigInteger object.
On my machine (Intel Core i5 660, JDK 6 Update 25 32 bit) the output is:
time for constructing a 10^1 digits BigInteger : 0 ms
time for constructing a 10^2 digits BigInteger : 0 ms
time for constructing a 10^3 digits BigInteger : 0 ms
time for constructing a 10^4 digits BigInteger : 16 ms
time for constructing a 10^5 digits BigInteger : 656 ms
time for constructing a 10^6 digits BigInteger : 59936 ms
time for constructing a 10^7 digits BigInteger : 6227975 ms
While ignoring the lines up to 10^5 (because of possible distortions introduced by (processor) caching effects, JIT compilation etc.), we can clearly see a complexity of O(n^2) here.
Keeping in mind that every operation on a BigInteger creates a new one due to immutability, this is a major performance penalty for huge numbers.
Questions:
Did I miss something?
Why is this the case?
Is this fixed in more recent JDKs?
Are there any alternatives?
UPDATE:
I did further measurements and can confirm the statement from some of the answers: it seems that BigInteger is optimized for subsequent numerical operations at the expense of higher construction costs for huge numbers, which seems reasonable to me.
Simplifying from the source somewhat, it's the case because in the "traditional" String parsing loop
for each digit y from left to right:
    x = 10 * x + y
you have the issue that 10 * x takes time linear in the length of x, unavoidably, and that length grows by more-or-less a constant factor for each digit, also unavoidably.
(The actual implementation is somewhat smarter than this -- it tries to parse an int's worth of binary digits at a time, and so the actual multiplier in the loop is more likely 1 or 2 billion -- but yeah, it's still quadratic overall.)
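For illustration, here is a minimal Java sketch of such a Horner-style parse (not the actual JDK code): each multiply-by-ten is linear in the current length of x, so the total work is 1 + 2 + ... + n word operations, i.e. O(n^2).
import java.math.BigInteger;

static BigInteger parseDecimal(String s) {
    BigInteger x = BigInteger.ZERO;
    for (int i = 0; i < s.length(); i++) {
        int digit = s.charAt(i) - '0';
        // multiply() copies and scales every word of x on each iteration
        x = x.multiply(BigInteger.TEN).add(BigInteger.valueOf(digit));
    }
    return x;
}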
That said, a number with 10^6 digits is far beyond a googol, and bigger than any number I've heard of being used even for cryptographic purposes. You're parsing a string that takes two megabytes of memory. Yes, it'll take a while, but I suspect the JDK authors didn't see the point of optimizing for such a rare use case.
The O(n^2) effort is caused by the decimal to binary conversion if the BigInteger is specified as decimal digits.
Also, 10^7 digits is a really huge number. For typical cryptographic algorithms like RSA you would deal with 10^3 to 10^4 digits. Most of the BigInteger operations are not optimized for such a large number of digits.
You're actually measuring the time it takes to parse a string and create the BigInteger. Numeric operations involving BigIntegers would be a lot more efficient than this.

Number of bits used in long - java

I want to know the number of bits really used in the long datatype in Java.
For instance:
long time = System.currentTimeMillis();
System.out.println(java.lang.Long.toHexString(time));
Output:
12c95165393
Here you can see it only has 11 hex digits, which means only 44 bits out of 64 are utilized; the remaining 20 bits are unused. Is there a way to know at runtime how many bits are used, or to pad the remaining bits?
Try Long.numberOfLeadingZeros():
long time = System.currentTimeMillis();
System.out.println(Long.numberOfLeadingZeros(time));
For used bits:
Long.SIZE - Long.numberOfLeadingZeros(time)
According to the API for 'toHexString', the leading zeros are stripped. http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Long.html#toHexString(long)
Therefore, you can't answer this question with the toHexString method.
You can format it with padding using String.format:
System.out.println(String.format("%016x", time));
If you want to know how many of the leading bits are unset you can either check the bits individually or you can right shift the number (with >>>, not >>) until it's zero.
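A hand-rolled sketch of that shifting approach, equivalent to Long.SIZE - Long.numberOfLeadingZeros(value):
static int usedBits(long value) {
    int count = 0;
    while (value != 0) {
        value >>>= 1;  // unsigned shift, so even a set sign bit eventually clears
        count++;
    }
    return count;
}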
There are just 11 hex digits in your example, because toHexString doesn't output leading zeroes.
Code:
String minBin = Long.toBinaryString(Long.MIN_VALUE);
String maxBin = Long.toBinaryString(Long.MAX_VALUE);
System.out.println("Number of bits in Min Long : " + minBin.length()
        + ", Max Long : " + maxBin.length());
Output:
Number of bits in Min Long : 64, Max Long : 63 //1 extra bit in min long to represent sign
According to the Java documentation, a long variable is always 64 bits long.
If you mean to count the bits from the least significant bit up to the most significant bit that is set to 1, then you can compute:
int usedBits(long time)
{
    if (time > 0)
        // note: double precision makes this unreliable for very large values;
        // Long.SIZE - Long.numberOfLeadingZeros(time) is exact
        return (int) (Math.log(time) / Math.log(2)) + 1;
    else
        return 0;
}
The number of bits used is always 64. The number of bits set can be anywhere between 0 and 64 (you should assume it could be any of these). The top set bit can be determined, but is highly unlikely to be useful. In short, you don't need to know.
If you want to print a fairly random number with a fixed width of digits, or in hex, that is a printing/formatting issue, and you can handle it with printf.
