I know that we can optimise "find even numbers" code by using bitwise operator &. The following program:
if(i%2==0) sout("even")
else sout("odd")
can be optimised to:
if(i&1==0) sout("even")
else sout("odd")
The above approach works only for 2 as a divisor. What if we have to optimise the code when we have multiple divisors like 4, 9, 20, 56 and so on? Is there a way to further optimise this code?
You obviously didn't even try what you posted, because it doesn't compile (even with a reasonable sout added). First, expression statements in Java end in a semicolon; second, i&1==0 parses as i & (1==0) -> i & true, and the & operator doesn't take an int and a boolean.
If i is negative and odd, i%2 is -1 while i&1 is +1. That's because % is remainder, not modulo.
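For reference, a version that actually compiles (a minimal sketch, with the sout shorthand expanded to System.out.println), which also shows the remainder-vs-mask difference for a negative number:
public class EvenOdd {
    public static void main(String[] args) {
        int i = -3;
        // & binds more weakly than ==, so the parentheses are required:
        if ((i & 1) == 0) System.out.println("even");
        else System.out.println("odd");

        System.out.println(-3 % 2); // -1: % is remainder, keeps the sign of the dividend
        System.out.println(-3 & 1); //  1: the mask looks only at the lowest bit
    }
}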
In the limited cases where i%n and (i&(n-1)) are the same -- i nonnegative and n a power of two -- the Java runtime compiler (JIT) will, as the commenters said, actually produce the same code for both, so obfuscating the source only makes your program more likely to be or become wrong, without providing any benefit.
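To make the "limited cases" concrete, here is a small sketch (the class name and the sample dividend 13 are mine) showing that the mask matches % only when the divisor is a power of two -- so of the divisors in the question, only 4 qualifies:
public class ModVsMask {
    public static void main(String[] args) {
        for (int n : new int[] { 4, 9, 20, 56 }) {
            boolean powerOfTwo = (n & (n - 1)) == 0; // classic power-of-two test
            System.out.println(n + (powerOfTwo ? " is" : " is not") + " a power of two: "
                    + "13 % " + n + " = " + (13 % n)
                    + ", 13 & " + (n - 1) + " = " + (13 & (n - 1)));
        }
    }
}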
Fifty years ago, when people were writing in assembler for machines with microsecond clocks (i.e. not even megahertz, much less gigahertz), this sometimes helped -- only sometimes, because usually only a small fraction of the code matters to execution time. In this century it's at best a waste and often harmful.
A few weeks ago I wrote an exam. The first task was to find the right approximation of a function with the given properties: [image: Properties of the function]
I had to check every approximation against tests I wrote for the properties. Properties 2, 3 and 4 were no problem, but I don't know how to check for property 1 using a JUnit test written in Java. My approach was to do it this way:
@Test
void test1() {
    double x = 0.01;
    double res = underTest.apply(x);
    for (int i = 0; i < 100000; ++i) {
        x = rnd(0, x);               // pick an even smaller x
        double lastRes = res;
        res = underTest.apply(x);
        assertTrue(res <= lastRes);  // the result must not increase as x shrinks
    }
}
Where rnd(0, x) is a function call that generates a random number within (0,x].
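For completeness, one way such a helper could be implemented (a hypothetical sketch, not from the original code):
// Returns a uniformly distributed double in (lo, hi].
// Math.random() lies in [0, 1), so hi is reachable and lo is not.
static double rnd(double lo, double hi) {
    return hi - Math.random() * (hi - lo);
}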
But this can't be the right way, because it only checks that, as the x values get smaller, each result is smaller than the previous one. The test would also succeed if the first res were -5 and every following result were only a little smaller than the one before, for all 100000 iterations; the result after the 100000th iteration could then be -5.5 (or something else). That means the test would also pass for a function whose right-hand limit at 0 is -6. Is there a way to check for property 1?
Is there a way to check for property 1?
Trivially? No, of course not. Calculus is all about what happens at infinity, or infinitely close to a particular point -- where basic arithmetic gets stuck trying to divide a number that grows ever tinier by a range that is ever smaller; basic arithmetic can't do '0 divided by 0'.
Computers are like basic arithmetic here -- or at least, you are applying basic arithmetic in the code you have pasted -- and making the computer do actual calculus is considerably more complicated than what you do here. As in, years-of-programming-experience-required more complicated; I very much doubt you were supposed to write a few hundred thousand lines of code to build an entire computerized math platform just to write this test.
More generally, you seem to be labouring under the notion that a test proves anything.
They do not. A test cannot prove correctness, ever. Just like science, which can never prove anything, only disprove (while building up ever more solid foundations and confirmations -- which any rational person usually considers more than enough to e.g. take as given that the law of gravity will hold, even though there is no proof of it and there never will be).
A test can only prove that you have a bug. It can never prove that you don't.
Therefore, anything you can do here is merely a shades of gray scenario: You can make this test catch more scenarios where you know the underTest algorithm is incorrect, but it is impossible to catch all of them.
What you have pasted is already useful: it shows that as you feed in double values ever closer to 0, the output of the function keeps getting smaller. That's worth something. You seem to think this particular shade of gray isn't dark enough for you: you want to show that the output gets fairly close to negative infinity.
That's... your choice. You cannot PROVE that it goes to negative infinity, because computers are like basic arithmetic with some approximation sprinkled in.
There isn't a heck of a lot you can do here. Yes, you can test whether the final res value is, for example, less than -1000000. You can even test whether it is literally negative infinity, but there is no guarantee that this is even correct: a function defined as 'as the input goes to 0 from the positive side, the output tends towards negative infinity' is free to do so only for inputs so incredibly tiny that double cannot represent them at all. Computers are not magical; a double takes 64 bits, which means there are at most 2^64 unique values a double can represent. 2^64 is a very large number, but it is nothing compared to the doubly-infinite set of 'all numbers imaginable' (there are infinitely many numbers between 0 and 1, and infinitely many such intervals across the whole number line). Thus, there are plenty of very tiny numbers that double just cannot represent. At all.
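If you want that darker shade of gray anyway, here is a sketch of such a test (same underTest as in the question; the final bound is arbitrary -- it can disprove a limit like -6, but it proves nothing about infinity):
@Test
void headsTowardNegativeInfinity() {
    double x = 0.01;
    double res = underTest.apply(x);
    while (x / 2 > 0) {              // halve deterministically (no randomness; see below)
        x /= 2;
        double lastRes = res;
        res = underTest.apply(x);
        assertTrue(res <= lastRes);  // still monotone along this sequence
    }
    // x is now the smallest positive double we reached; demand a 'very negative' result.
    assertTrue(res < -100);          // e.g. Math.log(Double.MIN_VALUE) is about -744
}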
For your own sanity, using randomness in a unit test is a bad idea -- and for some definitions of the word 'unit test', literally broken. Some hold that a test that isn't reliable cannot be considered a unit test at all, and that is not such a crazy notion if you look at what unit tests are pragmatically used for: test environments run them automatically, repeatedly and near continuously, in order to flag as soon as possible when someone breaks one. If the CI server runs a unit test 1819 times a day, it would get quite annoying if, by sheer random chance, one in 20000 runs failed; the server would then blame the most recent commit, and no framework I know of re-runs a failing unit test a few more times to check. In the end, programming works best if you move away from the notion of proofs and cold hard definitions, and move towards 'do what the community thinks things mean'. For unit tests, that means: don't use randomness.
Firstly: you cannot reliably test ANY of the properties:
- the function f could break one of the properties at a point x that is not even representable as a double, due to limited precision
- there are too many points to test; realistically you need to pick a subset of the domain
Secondly:
your definition of a limit is wrong. You check that the function is monotonically decreasing along your sequence of inputs, but that is not required by the definition of a limit - the function could fluctuate while approaching it. In the general case, I would probably follow the Weierstrass definition: the right-hand limit at 0 is minus infinity iff for every bound M there is a delta > 0 such that f(x) < M for all 0 < x < delta (see the sketch below).
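A sketch of what such a check could look like (it samples only finitely many points, so like any test it can disprove but never prove; names follow the question's code):
@Test
void limitFromTheRightIsMinusInfinity() {
    // For each bound M, search for a delta with f(delta) < M ...
    for (double m : new double[] { -10, -100, -700 }) {
        double delta = 1.0;
        while (underTest.apply(delta) >= m && delta / 2 > 0) {
            delta /= 2;
        }
        assertTrue(underTest.apply(delta) < m, "no delta found for M = " + m);
        // ... and spot-check a few points inside (0, delta).
        for (double x = delta; x > delta / 16; x /= 2) {
            assertTrue(underTest.apply(x) < m);
        }
    }
}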
But:
By eyeballing the conditions you can quickly notice that a logarithmic function (with any base) meets the criteria (in particular, the results do keep decreasing as x shrinks towards 0).
Let's pick natural logarithm, and check its value at smallest x that can be represented as a double:
System.out.println(Double.MIN_VALUE);
System.out.println(Math.log(Double.MIN_VALUE));
System.out.println(Math.log(Double.MIN_VALUE) < -1000);
// prints
4.9E-324
-744.4400719213812
false
As you can see, the value is circa -744, which is really far from minus infinity. You cannot get any closer with a double represented in 64 bits.
So I was examining the Integer class's source code (JDK 8) to understand how an int gets converted to a String. It seems to use a package-private method called getChars (line 433) to convert an int to a char array.
While the code is not that difficult to understand, there are multiple lines where it uses bitwise shift operations instead of simple arithmetic multiplication/division, such as the following:
// really: r = i - (q * 100);
r = i - ((q << 6) + (q << 5) + (q << 2));
and
q = (i * 52429) >>> (16+3);
r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
I just do not understand the point of doing that. Is this actually an optimization, and does it affect the runtime of the algorithm?
Edit: To put it another way, since the compiler does this type of optimization internally, is this manual optimization necessary?
I don't know the reason for this specific change and unless you find the original author, it's unlikely you'll find an authoritative answer to that anyway.
But I'd like to respond to the wider point, which is that a lot of code in the runtime library (java.* and many internal packages) is optimized to a degree that would be very unusual (and I dare say irresponsible) to apply to "normal" application code.
And that has basically two reasons:
It's called a lot, and in many different environments. Optimizing a method on your server to take 0.1% less CPU time when it's only executed 50 times per day on each of 3 servers won't be worth the effort you put into it. If, however, you can make Integer.toString 0.1% faster for everyone who will ever execute it, then this can turn into a very big win indeed.
If you optimize your application code on a specific VM then updating that VM to a newer version can easily undo your optimization, when the compiler decides to optimize differently. With code in java.* this is far less of an issue, because it is always shipped with the runtime that will run it. So if they introduce a compiler change that makes a given optimization no longer optimal, then they can change the code to match this.
tl;dr java.* code is often optimized to an insane degree because it's worth it and they can know that it will actually work.
There are a couple of reasons this is done. Being a long-time embedded developer, using tiny microcontrollers that sometimes didn't even have multiplication and division instructions, I can tell you that this is significantly faster. The key here is that the multiplier is a constant. If you were multiplying two variables, you'd either need to use the slower multiply and divide operators or, if they didn't exist, perform the multiplication in a loop with the add operator.
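To make that concrete, here is a small self-check of the two tricks from the JDK snippet above (a sketch; the class name is mine). 100 = 64 + 32 + 4 explains the shift-add line, and 52429 / 2^19 is just above 1/10, which explains the multiply-shift line:
public class JdkTricks {
    public static void main(String[] args) {
        // q * 100 as shifts: 100 = 64 + 32 + 4.
        for (int q = 0; q <= 1_000_000; q++) {
            if (((q << 6) + (q << 5) + (q << 2)) != q * 100)
                throw new AssertionError("shift-add mismatch at " + q);
        }
        // (i * 52429) >>> 19 as i / 10: the product can overflow the signed
        // int range for 16-bit i, but it stays below 2^32, and >>> treats the
        // bits as unsigned, so the quotient still comes out right.
        for (int i = 0; i <= 65535; i++) {
            if (((i * 52429) >>> (16 + 3)) != i / 10)
                throw new AssertionError("multiply-shift mismatch at " + i);
        }
        System.out.println("both identities hold over the tested ranges");
    }
}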
As far as I know, modulo (%) is a pretty costly operation, backed by a division underneath - one of the slowest operations a CPU has.
Is it worth substituting this operation explicitly with its bitwise analogue, number & (divisor - 1), or can the JIT do this for us implicitly?
As far as I am aware, the JIT does not optimize such expressions, so:
number % divisor is never faster (it is slower or at best the same speed) than number & (divisor - 1) when the divisor is a constant (so divisor - 1 can be calculated at compile time).
It is difficult to say how big the difference will be, because on a modern CPU it depends on the surrounding code, cache state, and many other factors.
PS: Keep in mind that this optimization only works when the divisor is a power of 2.
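Given those caveats, the honest answer is to measure on your own workload. A minimal JMH sketch (assuming the org.openjdk.jmh dependency; class and field names are illustrative):
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class ModVsMaskBenchmark {
    int number = 1_234_567;
    int divisor = 16;  // power of two; a field, so not a compile-time constant

    @Benchmark
    public int remainder() {
        return number % divisor;
    }

    @Benchmark
    public int mask() {
        return number & (divisor - 1);
    }
}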
While reading the Java source code for the Collections.reverse method, I noticed that the right-shift operator is used to find the middle:
......
for (int i=0, mid=size>>1, j=size-1; i<mid; i++, j--) // Right Shift
swap(list, i, j);
.....
The same can be done using the traditional divide-by-2 approach.
I explored Right Shift to Perform Divide by 2 On -1 here on Stack Overflow and found that it's better to use the division operator and not the right shift.
UPDATE: But then why did the Java developers use a right shift and not division?
So which approach is better to use, and why?
Signed division by 2 and right shift by 1 are not completely equivalent. Division by 2 rounds towards zero, even for negative numbers. Right shift by 1 rounds downwards, which means -1 >> 1 is -1 (whereas -1 / 2 is zero).
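A two-line check makes the difference visible:
System.out.println(-1 / 2);  // 0  - division rounds towards zero
System.out.println(-1 >> 1); // -1 - the shift rounds downwards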
Concretely, that means that if the JIT compiler cannot (or does not) prove that the number cannot be negative (if you had posted the full code, I might have been able to check that), it has to do something more complicated than merely a right shift - something like this (divides eax by 2 and clobbers edi, based on what GCC outputs):
mov edi, eax
shr eax, 31
add eax, edi
sar eax, 1
If you had used a right shift by 1, it would just be something like
sar eax, 1
It's not a big difference, but it is a difference, so the "it doesn't make any difference" crowd can go home now. OK, it's only in the loop initialization, so it doesn't have a serious impact on performance, but let's not forget that this is library code - different guidelines apply. Specifically, readability is less emphasized, and the guideline "don't waste performance unless you absolutely must" is more emphasized. Under the circumstances, there is no good reason to write size / 2 there; all that would do is make the performance a tiny bit worse. There is no upside.
Also, I find this readability argument a little silly in this case. If someone really doesn't know what size >> 1 does, that's their problem - it's just one of the basic operators, not even some convoluted combination of operators. If you can't read it, you don't know Java.
But feel free to use size / 2 in your own code. The takeaway from this answer shouldn't be "division by 2 is bad", but rather, "library code shouldn't sacrifice performance for readability".
It's always better to use the more readable option, unless there is a pressing need for speed.
Go with the clear, obvious division and then if you find yourself needing to optimize later you can change to the right shift and comment clearly.
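For example, a sketch of what that might look like:
// Clear, obvious version first:
int mid = size / 2;
// Later, if profiling shows this really matters, switch and say why:
mid = size >> 1; // same as size / 2 here, because size is never negative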
In practice it doesn't matter, but a division makes the intent cleaner, and I can only imagine that one would attempt a bitshift for "performance reasons". But it's 2014 and you're not writing x86 assembly by hand, so trying to optimize code like this is a waste of time.
The right shift operator is fast compared to the division operator.
As you know, all data is stored and processed in binary. A right shift works directly on that binary representation and maps to a single cheap instruction, whereas integer division is a multi-cycle operation, so the shift is faster in isolation.
But processors are fast enough these days that it rarely makes a real difference which operator you use, so the choice is really yours.
If you think your app already takes too much processor time and you really feel the need to lessen the load, you can use the right shift; for lightweight applications, the division operator is fine.
I have a situation where I might need to apply a multiplier to a value in order to get the correct results. This involves computing the value using floating point division.
I'm thinking it would be a good idea to check the values before I perform floating point logic on them, to save processor time; however, I'm not sure how efficient that will be at run time either way.
I'm assuming that the if check is 1 or 2 instructions (been a while since assembly class), and that the floating point operation is going to be many more than that.
// Check
if (a != 10) {           // 1 or 2 instructions?
    b *= (float) a / 10; // Many instructions?
}
Value a is going to be 10 most of the time, but there are a few instances where it won't be. Is the floating point division going to take very many cycles even if a is equal to the divisor?
Will the previous code with the if statement execute more efficiently than simply the next one without?
// Don't check
b *= (float) a / 10;     // Many instructions?
Granted, there won't be any noticeable difference either way; I'm just curious about the behavior of the floating point operations when the divisor is equal to the dividend, in case things get processor heavy.
Assuming this is in some incredibly tight loop, executed billions of times, so the difference of 1-2 instructions matters, since otherwise you should probably not bother --
Yes, you are right to weigh the cost of the additional check each time against the savings when the check is true. But my guess is that it has to be true a lot to overcome not only the extra overhead, but also the fact that you're introducing a branch, which in the JIT-compiled code may ultimately slow you down more via pipeline stalls than you gain otherwise.
If a == 10 a whole lot, I'd imagine there's a better and faster way to take advantage of that somehow, earlier in the code.
IIRC, floating-point multiplication is much less expensive than division, so this might be faster than both (though note that 0.1 has no exact binary representation, so the result can differ from a / 10 in the last bit):
b *= (a * 0.1);
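If you want to see whether the two forms agree closely enough for your use case, a quick check (values here are arbitrary):
float a = 7f, b = 3f;
float viaDivision = b * (a / 10);   // divide first, then multiply
double viaMultiply = b * (a * 0.1); // 0.1 itself is not exactly representable
System.out.println(viaDivision + " vs " + viaMultiply); // can differ in the last bits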
If you do end up needing to optimize this code, I would recommend using Caliper to micro-benchmark your inner loops. It's very hard to predict accurately what effect these small modifications will have, especially in Java, where the VM's behavior is a bit of an unknown since in theory it can optimize the code on the fly. It's best to try several strategies and see what works.
http://code.google.com/p/caliper/