I recently read a StackOverflow question that indicated, when accessing variables, it is faster to use the stack than the heap:
void f() {
int x = 123; // <- located in stack
}
int x; // <- located in heap
void f() {
x = 123;
}
However, I can't work out which is faster in my example (since I assume they are both using the stack). I'm working on hitbox calculations and such, which use a lot of X-Y, width, and height variables (up to 10-20 times each) in the function.
Is it faster to use an object's get() method each time or set it to a local variable at the start of the function?
In code, is it faster to (or more efficient to):
void f() {
doSomething(foo.getValue() + bar.getValue());
doSomethingElse(foo.getValue(), bar.getValue());
doAnotherThing(foo.getValue(), bar.getValue());
// ... <- lots of accessing (something.getValue());
}
or
void g() {
int firstInt = foo.getValue();
int secondInt = bar.getValue();
doSomething(firstInt + secondInt);
doSomethingElse(firstInt, secondInt);
doAnotherThing(firstInt, secondInt);
// ... <- lots of accessing firstInt and secondInt
}
where foo and bar are MyObject instances:
public class MyObject {
int x = 1;
public int getValue() {
return x;
}
}
If they are about the same efficiency, how many times do I have to perform a .getValue() for it to become less efficient?
Thanks in advance!
The JIT compiler will change (optimize) your code at runtime, so this is not important in Java. One simple JIT optimization is method inlining.
For further optimization, read about micro-benchmarking and look at this question: How do I write a correct micro-benchmark in Java?
If you do use local variables, you can tell the compiler that the value will not change by declaring it as final int x = ...;, which turns it into a kind of local constant.
Reassigning values (recycling objects and such) may help reduce garbage collection (GC).
Write an extreme stress test, if possible a visual one too, let it run for a long time, and measure both real and perceived performance. The fastest time is not always the best result: sometimes the code may be a bit slower but run more smoothly across the whole execution.
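To make the two variants from the question concrete, here is a minimal sketch. The Box class and the two overlaps methods are made up for illustration and are not taken from the question. Both versions compute the same result, and for trivial getters like these the JIT will most likely inline the calls, so any speed difference has to be measured rather than assumed.

```java
class HitboxDemo {
    // Variant 1: call the getters every time a value is needed.
    static boolean overlapsWithGetters(Box a, Box b) {
        return a.getX() < b.getX() + b.getWidth()
            && b.getX() < a.getX() + a.getWidth()
            && a.getY() < b.getY() + b.getHeight()
            && b.getY() < a.getY() + a.getHeight();
    }

    // Variant 2: read each value once into a final local variable.
    static boolean overlapsWithLocals(Box a, Box b) {
        final int ax = a.getX(), ay = a.getY(), aw = a.getWidth(), ah = a.getHeight();
        final int bx = b.getX(), by = b.getY(), bw = b.getWidth(), bh = b.getHeight();
        return ax < bx + bw && bx < ax + aw && ay < by + bh && by < ay + ah;
    }

    public static void main(String[] args) {
        Box player = new Box(0, 0, 10, 10);
        Box enemy = new Box(5, 5, 10, 10);
        Box farAway = new Box(100, 100, 10, 10);
        System.out.println(overlapsWithGetters(player, enemy));  // true
        System.out.println(overlapsWithLocals(player, enemy));   // true
        System.out.println(overlapsWithLocals(player, farAway)); // false
    }
}

// Hypothetical hitbox type, invented for this sketch.
class Box {
    private final int x, y, width, height;

    Box(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    int getX() { return x; }
    int getY() { return y; }
    int getWidth() { return width; }
    int getHeight() { return height; }
}
```

The second variant is often chosen for readability alone; whether it is also faster is exactly the kind of question a proper micro-benchmark should answer.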
So, the question is: if I'm calling the method guess from the class Player and it is a void method without a return statement in it, how come I'm able to store the result of number = (int)(Math.random() * 10) in the number variable for 3 different objects (p1, p2, p3)?
I'm a little confused about when I should use a return statement or a void method. If number = (int)(Math.random() * 10) gives a result that I want to use, why don't I need to return this result from the method to pass it to the number variable which I declared with int number = 0;?
public class Player {
int number = 0;
public void guess() {
number = (int)(Math.random() * 10);
System.out.println("I'm guessing " + number);
}
}
A void method does not return anything, but it still allows you to do things (print to the console, modify variables, etc.). The void keyword just means that it doesn't return a value. (In void methods you can still use a blank return; to end the method.) And because you are modifying the number variable in the GuessGame object, the changes you make will stay even though you don't return anything. Try this simple test to see what I mean:
//In your GuessGame class
int number = 0;
public void foo() {
number++;
}
public static void main(String[] args) {
GuessGame games = new GuessGame();
games.foo();
System.out.println(games.number);
//Outputs 1
}
docs for the return statement
The point is: where is the result of Math.random() * 10 physically stored on your computer when your program is run? You list two options.
Option 1: Instance field
In this case the JVM reserves space for an int variable for the whole life of the Player object. The Player object may live for microseconds, seconds, minutes, hours, days, months, ... it depends! This storage space is usually found in the RAM of the computer, and from Java you can access it with the syntax myPlayer.number as long as you have a Player reference somewhere.
Option 2: Return value
In this case the compiler finds the space to store the result of the computation in a register of the Java virtual machine, which you can mentally map to a register of the physical processor. This value will at best survive for only a couple of processor cycles (there are gazillions per second in a GHz CPU, so it's really a tiny fraction of a second) if you don't store it somewhere else, and if you don't, it's lost forever. See the following example:
private int someRandom;
private int gimmeARandom() {
    // Math.random() returns a double, so the result must be cast to int
    return (int) (Math.random() * 10);
}
private void test() {
    int someRandom = gimmeARandom(); // --> store the value until the end of the method
    this.someRandom = someRandom;    // --> keep it further so we can read it later
    gimmeARandom();                  // --> what did it return? We'll never know
}
Void is different from static: void just means the function does not return anything, but it can still be an instance method, i.e. one that is associated with each new instance of a class. I think you're confusing this with the functionality of static, which allows methods to be called without an instance of the class.
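A minimal sketch of that distinction, using a hypothetical Counter class that is not part of the question: increment() is a void instance method that changes the state of one particular object, while describe() is static and can be called without any instance.

```java
class Counter {
    private int count = 0;

    // void instance method: returns nothing, but modifies this object's field.
    void increment() {
        count++;
    }

    // Non-void instance method: hands the current value back to the caller.
    int getCount() {
        return count;
    }

    // static method: belongs to the class, so no Counter object is needed to call it.
    static String describe() {
        return "counts upwards from 0";
    }

    public static void main(String[] args) {
        Counter first = new Counter();
        Counter second = new Counter();
        first.increment();
        first.increment();
        second.increment();
        System.out.println(first.getCount());   // 2
        System.out.println(second.getCount());  // 1
        System.out.println(Counter.describe()); // counts upwards from 0
    }
}
```

Each Counter keeps its own count, which is why the three Player objects in the question can each end up with a different number.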
Inspired by another question on Stack Overflow, I have written a micro-benchmark to check, what is more efficient:
conditionally checking for zero divisor or
catching and handling an ArithmeticException
Below is my code:
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {
private int a = 10;
// a little bit less obvious than int b = 0;
private int b = (int) Math.floor(Math.random());
@Benchmark
public float conditional() {
if (b == 0) {
return 0;
} else {
return a / b;
}
}
@Benchmark
public float exceptional() {
try {
return a / b;
} catch (ArithmeticException aex) {
return 0;
}
}
}
I am totally new to JMH and not sure if the code is all right.
Is my benchmark correct? Do you see any mistakes?
Side note: please don't suggest asking on https://codereview.stackexchange.com. On Code Review, code must already work as intended. I am not sure this benchmark works as intended.
The big thing I see missing is any sort of randomness. That will make it easier for the branch prediction to do its work, which will make both methods faster than they probably would be in practice for division by 0.
I would do three variations of each method:
with a random array with zeros intermixed, and have the benchmark be parameterized with an index into that array.
with a random array of non-zero numbers
with all 0s
That should give you a good idea of the overall performance, including branch prediction. For point (1), it may also be interesting to play with the ratio of 0s to non-0s.
I forget if JMH lets you parameterize directly on individual values of an array. If it does, then I'd use that. Otherwise, you'll have to parameterize on the index into that array. In that case, I would also put the all-0s in an array so that the array access is part of all tests. I would also probably create a "control" that just accesses the array and returns its value, so that you can find out that overhead more directly.
Also, a minor nit: I don't think you need to return floats, since they'll just be converted from the ints that the division actually produces.
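The three input variations could be prepared roughly like this. This is only a sketch of the data setup: the JMH annotations are left out so that it runs on its own, and the class name, method names, array size, zero ratio and fixed seed are all assumptions made for illustration. It also returns int rather than float, in line with the nit above.

```java
import java.util.Random;

class DivisorInputs {
    // Variation 1: random divisors with zeros mixed in at roughly the given ratio.
    static int[] mixed(int size, double zeroRatio, long seed) {
        Random random = new Random(seed);
        int[] divisors = new int[size];
        for (int i = 0; i < size; i++) {
            divisors[i] = random.nextDouble() < zeroRatio ? 0 : 1 + random.nextInt(100);
        }
        return divisors;
    }

    // Variation 2: random divisors that are never zero.
    static int[] nonZero(int size, long seed) {
        return mixed(size, 0.0, seed);
    }

    // Variation 3: nothing but zeros (a new int[] is zero-filled by default).
    static int[] allZeros(int size) {
        return new int[size];
    }

    // Strategy 1: test the divisor before dividing.
    static int conditional(int a, int b) {
        return b == 0 ? 0 : a / b;
    }

    // Strategy 2: divide and handle the exception.
    static int exceptional(int a, int b) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            return 0;
        }
    }

    // Control: only the array access, to estimate that overhead separately.
    static int control(int[] divisors, int index) {
        return divisors[index];
    }

    public static void main(String[] args) {
        int[] divisors = mixed(1_000, 0.25, 42L);
        long difference = 0;
        for (int divisor : divisors) {
            difference += conditional(10, divisor) - exceptional(10, divisor);
        }
        System.out.println(difference); // 0: both strategies agree on every input
    }
}
```

In an actual JMH benchmark the arrays would live in a @State class and be filled in a @Setup method, so that building them is not part of the measured time.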
This question is specifically geared towards the Java language, but I would not mind feedback about this being a general concept if so. I would like to know which operation might be faster, or if there is no difference between assigning a variable a value and performing tests for values. For this issue we could have a large series of Boolean values that will have many requests for changes. I would like to know if testing for the need to change a value would be considered a waste when weighed against the speed of simply changing the value during every request.
public static void main(String[] args){
Boolean array[] = new Boolean[veryLargeValue];
for(int i = 0; i < array.length; i++) {
array[i] = randomTrueFalseAssignment;
}
for(int i = 400; i < array.length - 400; i++) {
testAndChange(array, i);
}
for(int i = 400; i < array.length - 400; i++) {
justChange(array, i);
}
}
This could be the testAndChange method
public static void testAndChange(Boolean[] pArray, int ind) {
if(pArray[ind])
pArray[ind] = false;
}
This could be the justChange method
public static void justChange(Boolean[] pArray, int ind) {
pArray[ind] = false;
}
If we were to end up with the very rare case that every value within the range supplied to the methods were false, would there be a point where one method would eventually become slower than the other? Is there a best practice for issues similar to this?
Edit: I wanted to add this to help clarify the question a bit more. I realize that the data type can factor into the answer, as larger or more efficient data types can be utilized. I am more focused on the task itself: is a test "if(aConditionalTest)" slower than, faster than, or not comparable without additional information (such as the data type) to an assignment "x=avalue"?
As @TrippKinetics points out, there is a semantic difference between the two methods. Because you use Boolean instead of boolean, it is possible that one of the values is a null reference. In that case the first method (with the if statement) will throw an exception, while the second simply assigns values to all the elements in the array.
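A small sketch of that difference, assuming a Boolean[] whose elements were never assigned. testAndChange and justChange follow the question, with the test written as pArray[ind]; the NullDemo class and its main method are only there for illustration.

```java
class NullDemo {
    static void testAndChange(Boolean[] pArray, int ind) {
        if (pArray[ind]) {      // auto-unboxing a null Boolean throws NullPointerException
            pArray[ind] = false;
        }
    }

    static void justChange(Boolean[] pArray, int ind) {
        pArray[ind] = false;    // plain assignment, never unboxes the old value
    }

    public static void main(String[] args) {
        Boolean[] flags = new Boolean[2]; // elements start out as null, not false

        justChange(flags, 0);
        System.out.println(flags[0]);     // false

        try {
            testAndChange(flags, 1);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for the null element");
        }
    }
}
```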
Assuming you use boolean[] instead of Boolean[]: optimization is an undecidable problem. There are very rare cases where adding an if statement could result in better performance. For instance, most processors use caches, and the if statement can have the effect that the executed code fits exactly into two cache lines, whereas without the if it would span more, resulting in cache misses. Perhaps you think you will save an assignment instruction, but this comes at the cost of a fetch instruction and a conditional instruction (which breaks the CPU pipeline). Assigning has more or less the same cost as fetching a value.
In general, however, one can assume that adding an if statement is useless and will nearly always result in slower code. So you can quite safely assume that the if statement will slow down your code.
More specifically on your question, there are faster ways to set a range to false. For instance using bitvectors like:
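long[] data = new long[(veryLargeValue + 0x3f) >> 0x06]; // a long has 64 bits
// assign random values
int low = 400 >> 0x06;
int high = (veryLargeValue - 400) >> 0x06;
// bit j of a word is stored most significant bit first, i.e. at position 63 - j
data[low] &= 0xffffffffffffffffL << (0x40 - (400 & 0x3f)); // keep only the bits before index 400
for (int i = low + 0x01; i < high; i++) {
    data[i] = 0x00;
}
data[high] &= 0xffffffffffffffffL >>> ((veryLargeValue - 400) & 0x3f); // keep only the bits from index veryLargeValue - 400 on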
The advantage is that a processor can perform operations on 32- or 64-bits at once. Since a boolean is one bit, by storing bits into a long or int, operations are done in parallel.
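The standard library already offers this kind of packed storage: java.util.BitSet keeps its bits in a long[] internally. A minimal sketch, with the size picked arbitrarily for illustration and 400 as the margin from the question:

```java
import java.util.BitSet;

class BitSetRangeDemo {
    // Returns `size` bits that are all true except for the range [margin, size - margin).
    static BitSet clearMiddle(int size, int margin) {
        BitSet bits = new BitSet(size);
        bits.set(0, size);                 // start with every bit set to true
        bits.clear(margin, size - margin); // from index inclusive, to index exclusive
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = clearMiddle(10_000, 400);
        System.out.println(bits.get(399));      // true
        System.out.println(bits.get(400));      // false
        System.out.println(bits.get(9_599));    // false
        System.out.println(bits.get(9_600));    // true
        System.out.println(bits.cardinality()); // 800
    }
}
```

The cleared range matches the loops in the question, which run from index 400 up to, but not including, array.length - 400.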
The 2 following versions of the same function (which basically tries to recover a password by brute force) do not give same performance:
Version 1:
private static final char[] CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".toCharArray();
private static final int N_CHARS = CHARS.length;
private static final int MAX_LENGTH = 8;
private static char[] recoverPassword()
{
char word[];
int refi, i, indexes[];
for (int length = 1; length <= MAX_LENGTH; length++)
{
refi = length - 1;
word = new char[length];
indexes = new int[length];
indexes[length - 1] = 1;
while(true)
{
i = length - 1;
while ((++indexes[i]) == N_CHARS)
{
word[i] = CHARS[indexes[i] = 0];
if (--i < 0)
break;
}
if (i < 0)
break;
word[i] = CHARS[indexes[i]];
if (isValid(word))
return word;
}
}
return null;
}
Version 2:
private static final char[] CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".toCharArray();
private static final int N_CHARS = CHARS.length;
private static final int MAX_LENGTH = 8;
private static char[] recoverPassword()
{
char word[];
int refi, i, indexes[];
for (int length = 1; length <= MAX_LENGTH; length++)
{
refi = length - 1;
word = new char[length];
indexes = new int[length];
indexes[length - 1] = 1;
while(true)
{
i = refi;
while ((++indexes[i]) == N_CHARS)
{
word[i] = CHARS[indexes[i] = 0];
if (--i < 0)
break;
}
if (i < 0)
break;
word[i] = CHARS[indexes[i]];
if (isValid(word))
return word;
}
}
return null;
}
I would expect version 2 to be faster, as it does (and that is the only difference):
i = refi;
...as compared to version 1:
i = length -1;
However, it's the opposite: version 1 is faster by over 3%!
Does anyone know why? Is it due to some optimization done by the compiler?
Thank you all for your answers so far.
Just to add that the goal is actually not to optimize this piece of code (which is already quite optimized), but rather to understand, from a compiler / CPU / architecture perspective, what could explain such a performance difference.
Your answers have been very helpful, thanks again!
It is difficult to check this in a micro-benchmark because you cannot say for sure how the code has been optimised without reading the generated machine code. Even then, the CPU can do plenty of tricks to optimise it further, e.g. it turns the x86 code into RISC-style instructions to actually execute.
A computation takes as little as one cycle and the CPU can perform up to three of them at once. An access to L1 cache takes 4 cycles and for L2, L3, main memory it takes 11, 40-75, 200 cycles.
Storing values to avoid a simple calculation is actually slower in many cases. BTW using division and modulus is quite expensive and caching this value can be worth it when micro-tuning your code.
The correct answer should be retrievable with a decompiler (I mean a .class -> .java converter),
but my guess is that the compiler might have decided to get rid of refi altogether and to store length - 1 in an auxiliary register.
I'm more of a C++ guy, but I would start by trying (with final as Java's counterpart of const):
final int refi = length - 1;
inside the for loop. Also you should probably use
indexes[ refi ] = 1;
Comparing the running times of code does not give exact or guaranteed results
First of all, this is not the way to compare performance. A running time analysis is needed here. Both pieces of code have the same loop structure, so their running time complexity is the same. You may measure different running times when you run them, but those mostly differ because of cache hits, I/O times, and thread and process scheduling. There is no guarantee that the code always completes in an exact time.
However, there are still differences in your code; to understand them you should look into your CPU architecture. I can explain it roughly in terms of the x86 architecture.
What happens behind the scenes?
i = refi;
The CPU loads refi and i from RAM into its registers. There are 2 accesses to RAM if the values are not in the cache, and the value of i will be written back to RAM. However, the time this takes always varies with thread and process scheduling. Furthermore, if the values are in virtual memory, it will take longer.
i = length -1;
The CPU also accesses i and length from RAM or cache, so there is the same number of accesses. In addition, there is a subtraction here, which means extra CPU cycles. That is why you think this one should take longer to complete. That is the expected outcome, but the issues I mentioned above explain why the other version can take longer.
Summary
As I explained, this is not the way to compare performance. I think there is no real difference between these two pieces of code. There are lots of optimizations inside the CPU and also in the compiler. You can see the optimized code if you decompile the .class files.
My advice is to minimize the Big-O running time first. Finding a better algorithm is the best way to optimize code. If you still have bottlenecks in your code, you may try micro-benchmarking.
See also
Analysis of algorithms
Big O notation
Microprocessor
Compiler optimization
CPU Scheduling
To start with, you can't really compare the performance by just running your program - micro benchmarking in Java is complicated.
Also, a subtraction on modern CPUs can take as little as a third of a clock cycle on average. On a 3GHz CPU, that is 0.1 nanoseconds. And nothing tells you that the subtraction actually happens as the compiler might have modified the code.
So:
You should try to check the generated assembly code.
If you really care about the performance, create an appropriate micro-benchmark.
Consider the following two segments of code in Java,
Integer x=new Integer(100);
Integer y=x;
Integer z=x;
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
For this, the memory usage when tested on my system was: Used memory (bytes): 287848
and
int a=100;
int b=a;
int c=a;
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
For this, the memory usage when tested on my system was: Used memory (bytes): 287872
and the following
Integer x=new Integer(100);
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
and
int a=100;
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
In both of the above cases, the memory usage was exactly the same when tested on my system: Used memory (bytes): 287872
The statement
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
will display the total memory currently in use [Total available memory-Currently free available memory], (in bytes).
Using the methods mentioned above, I have verified that in the first case the memory usage (287848) was lower than in the second one (287872), while in the remaining two cases it was exactly the same (287872). Obviously it should be so, because in the very first case y and z contain a copy of the reference held in x, and all three (x, y and z) point to the same object. That would mean the first case is better and more appropriate than the second one, and in the remaining two cases the statements are equivalent, with exactly the same memory usage (287872). If that is so, then primitive data types in Java would be useless and avoidable, even though they were designed for better memory usage and CPU utilization. So why do primitive data types exist in Java?
A question somewhat similar to this one was already posted here but it did not have such a scenario.
That question is here.
I wouldn't pay attention to Runtime.freeMemory -- it's very ambiguous (does it include unused stack space? PermGen space? gaps between heap objects that are too small to be used?), and giving any precise measurement without halting all threads is impossible.
Integers are necessarily less space efficient than ints, because just the reference to the Integer takes 32 bits (64 for a 64-bit JVM without compressed pointers).
If you really want to test it empirically, have many threads recurse deeply and then wait. As in
class TestThread extends Thread {
    private void recurse(int depth) {
        int a, b, c, d, e, f, g;
        if (depth < 100)
            recurse(depth + 1);
        for (;;) try {
            Thread.sleep(Long.MAX_VALUE);
        } catch (InterruptedException ignored) {} // must not be named e: that would clash with the local above
    }
    @Override public void run() {
        recurse(0);
    }
    public static void main(String[] args) {
        for (int i = 0; i < 500; ++i)
            new TestThread().start();
    }
}
For a start, an Integer wraps an int, therefore Integer has to be at least as big as int.
From the docs (I really doubt this is necessary):
The Integer class wraps a value of the primitive type int in an
object. An object of type Integer contains a single field whose type
is int.
So obviously a primitive int is still being used.
Not only that but objects have more overhead, and the most obvious one is that when you're using objects your variable contains a reference to it:
Integer obj = new Integer(100);
int prim = 100;
i.e. obj stores a reference to an Integer object, which contains an int, whereas prim stores the value 100. That alone is enough to prove that using Integer over int brings more overhead with it. And there's more overhead than just that.
The wrapper contains a primitive as a field, but it causes additional overhead because it's an object. The reference takes up space as well, but your example isn't really designed to show this.
The tests you designed aren't really well-suited for a precise measurement, but since you used them, try this example instead:
public static void main(String[] args) {
int numInts = 100000;
Integer[] array = new Integer[numInts];
// int[] array = new int[numInts];
for(int i = 0; i < numInts; i++){
array[i] = i; //put some real data into the arrays using auto-boxing if needed
}
System.out.println("Used memory (bytes): " +
(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
}
Now try it again but uncomment the primitive line and comment out the wrapper line. You should see that the wrapper takes up much more memory.
In your first example, you have the equivalent of 1 integer and 2 pointers.
Because Integer is an Object, it has pointer properties, and contains functions.
By using int instead of Integer, you are copying the value 3 times.
You have a difference of 24 bytes, which is used for storing the headers and values of your extra 2 ints. Although I wouldn't trust your test: the JVM can be somewhat random, and its garbage collection is quite dynamic. As far as the required memory for a single Integer vs int is concerned, Integer will take up more space because it is an Object, and thus contains more information.
Runtime.getRuntime().freeMemory() : getting delta on this does not give you the correct statistics as there are many moving parts like garbage collection and other threads.
Integer takes more memory than int primitive.
Your test case is too simple to be of any conclusive result.
Any test case that takes less than 5 seconds doesn't mean anything.
You need to at least do something with these objects you are creating. The JVM can simply look at your code and just not do anything because your objects aren't ever used, and you exit. (Can't say for certain what the JVM interpreter does, but the JIT will use escape analysis to optimize your entire testcase into nothing)
First of all, if you're looking for memory effectiveness, primitives are smaller because they are what size they are. The wrapper objects are objects, and need to be garbage collected. They have tons of fields within them that you can use, those fields are stored somewhere...
Primitives aren't "designed" to be more effective. Wrapper objects were designed to be more feature friendly. You need primitives, because how else are you going to store a number?
If you really want to see the memory difference, take a real application. If you want to write it yourself, go ahead, but it'll take some time. Use some text editor and search and replace every single int declaration with Integer, and long with Long, etc. Then take a look at the memory footprint. I wouldn't be surprised if you see your computer explode.
From a programming point of view, you need to use primitives when necessary, and wrapper objects when necessary. When it's applicable to do both, it's your preference. Trust me, there aren't that many.
http://www.javaspecialists.eu/archive/Issue193.html
This might help you understand/explore things a little bit more. An excellent article! Cheers!
If you look at the source code of java.lang.Integer, the value is stored as an int.
private int value;
Your test is not valid, that's all there is to it.
Proof:
When you run these tests you'll get an AssertionError in the second test (because the memory gets lower, even if you stop resetting the memory field). Once you try these tests with 10,000 loops, both will fail with a StackOverflowError.
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;
import org.junit.Test;
public class TestRedundantIntegers {
private long memory;
@Test
public void whenRecursiveIntIsSet() {
memory = Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory();
recurseInt(0, 100);
}
private void recurseInt(int depth, int someInt) {
int x = someInt;
assertThat(memory,is(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
memory=Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory();
if (depth < 1000)
recurseInt(depth + 1, x);
}
@Test
public void whenRecursiveIntegerIsSet() {
memory = Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory();
recurseInteger(0, new Integer(100));
}
private void recurseInteger(int depth, Integer someInt) {
Integer x = someInt;
assertThat(memory,is(Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory()));
memory=Runtime.getRuntime().totalMemory()-Runtime.getRuntime().freeMemory();
if (depth < 1000)
recurseInteger(depth + 1, x);
}
}
As for "where and when": use the non-primitive types where an Object is required, and the primitives everywhere else. For example, the types of a generic can't be primitive, so you can't use primitives with them. Even before generics were introduced, things like HashSet and HashMap couldn't store primitives.
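For instance, in the following sketch (class and method names are made up for illustration) the collection has to be declared with the wrapper type, while the arithmetic stays on primitives:

```java
import java.util.ArrayList;
import java.util.List;

class WrapperVsPrimitive {
    static int sum(List<Integer> values) {
        int total = 0;             // primitive accumulator, no object involved
        for (int value : values) { // each Integer is auto-unboxed to an int
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // List<int> does not compile: type arguments must be reference types.
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            values.add(i);         // each int is auto-boxed into an Integer
        }
        System.out.println(sum(values)); // 10
    }
}
```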