Is checking if a value is 0 faster than just overwriting it? - java

Let's just say I have an array which contains some numbers contained between 1 and 9, and another array of 9 elements which contains every number from 1 to 9 (1, 2, 3, ... 9). I want to read every number from the first array, and when I read the number X, put to 0 the X value in the second array (which would be contained in second_array[X-1]). Is it faster for the CPU to do this:
//size is the size of the first array
int temp = 0;
for (int i = 0; i < size; i++)
{
    temp = first_array[i];
    if (second_array[temp - 1] != 0) second_array[temp - 1] = 0;
}
Or the same without the control:
int temp = 0;
for (int i = 0; i < size; i++)
{
    temp = first_array[i];
    second_array[temp - 1] = 0;
}
To be clear: does it take more time to make a control on the value, or to overwrite it? I need my code to execute as fast as possible, so every nanosecond saved would be useful. Thanks in advance.

The second version is more performant, as it does not require the check, which happens in every iteration of the loop and yields true in at most one case.
Furthermore, you can improve even more if you write:
int temp = 0;
for (int i = 0; i < size; i++)
{
    temp = first_array[i];
}
second_array[temp - 1] = 0;

Writing data to memory is fast compared to an if condition.
Reason: to compare two values, you need to read them first and then perform the comparison, while in the other case you are simply writing data to memory without caring what is already there.

If the value is checked repeatedly, it may be cached in the CPU cache and accessed much faster from there than from main RAM. In contrast, writing the value requires it to make its way all the way to main memory, which may take more time. The write may be deferred, but the dirty cache line must be flushed sooner or later.
Hence the version that only repeatedly reads may be faster than the version that repeatedly writes.

You can just go with:
for (int i = 0; i < size; i++)
{
    second_array[first_array[i] - 1] = 0;
}
Even if the check is faster (I doubt it), it would only pay off if a large part of second_array started as 0, as any non-zero location requires two operations.
Anyway, the best way to answer these kinds of questions is to run it and benchmark! A quick benchmark on my side shows the check is 60% slower if second_array contains no zeros, but 50% faster when second_array is all zeros. That would lead me to conclude the check is faster if more than roughly 55% of second_array starts zeroed out.
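The check-vs-overwrite numbers above are easy to reproduce. Below is a minimal, self-contained sketch of such a benchmark (the class and method names are invented for illustration; for trustworthy numbers a harness like JMH is the better tool, since plain System.nanoTime() timings are sensitive to warm-up and JIT effects):

```java
import java.util.Arrays;
import java.util.Random;

public class ZeroBenchmark {
    // Version with the check: only write when the slot is non-zero.
    static void withCheck(int[] first, int[] second) {
        for (int i = 0; i < first.length; i++) {
            int idx = first[i] - 1;
            if (second[idx] != 0) second[idx] = 0;
        }
    }

    // Version without the check: always write.
    static void withoutCheck(int[] first, int[] second) {
        for (int i = 0; i < first.length; i++) {
            second[first[i] - 1] = 0;
        }
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int[] first = new int[10_000_000];
        for (int i = 0; i < first.length; i++) first[i] = 1 + rnd.nextInt(9);
        int[] second = new int[9];

        // Warm up so the JIT compiles both methods before timing.
        for (int w = 0; w < 10; w++) {
            Arrays.fill(second, 1); withCheck(first, second);
            Arrays.fill(second, 1); withoutCheck(first, second);
        }

        Arrays.fill(second, 1);
        long t0 = System.nanoTime();
        withCheck(first, second);
        long checkNs = System.nanoTime() - t0;

        Arrays.fill(second, 1);
        t0 = System.nanoTime();
        withoutCheck(first, second);
        long writeNs = System.nanoTime() - t0;

        System.out.println("check: " + checkNs / 1_000_000 + " ms, write: " + writeNs / 1_000_000 + " ms");
    }
}
```

Varying how second_array is pre-filled (all ones vs. all zeros) lets you reproduce the trade-off described above.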

Related

Very large Java ArrayList has slow traversal time

Solution: My ArrayList was filled with duplicates. I modified my code to filter these out, which reduced running times to about 1 second.
I am working on an algorithms project that requires me to look at large amounts of data.
My program has a potentially very large ArrayList (A) that has every element in it traversed. For each of these elements in (A), several other, calculated elements are added to another ArrayList (B). (B) will be much, much larger than (A).
Once my program has run through seven of these ArrayLists, the running time goes up to approximately 5 seconds. I'm trying to get that down to < 1 second, if possible.
I am open to different ways to traverse the ArrayList, as well as using a completely different data-structure. I don't care about the order of the values inside the lists, as long as I can go through all values, very fast. I have tried a linked-list and it was significantly slower.
Here is a snippet of code, to give you a better understanding. The code tries to find all single-digit permutations of a prime number.
public static Integer primeLoop(ArrayList current, int endVal, int size)
{
    Integer compareVal = 0;
    Integer currentVal = 0;
    Integer tempVal = 0;
    int currentSize = current.size() - 1;
    ArrayList next = new ArrayList();
    for (int k = 0; k <= currentSize; k++)
    {
        currentVal = Integer.parseInt(current.get(k).toString());
        for (int i = 1; i <= 5; i++)
        {
            for (int j = 0; j <= 9; j++)
            {
                compareVal = orderPrime(currentVal, endVal, i, j);
                //System.out.println(compareVal);
                if (!compareVal.equals(tempVal) && !currentVal.equals(compareVal))
                {
                    tempVal = compareVal;
                    next.add(compareVal);
                    //System.out.println("Inserted: " + compareVal + " with parent: " + currentVal);
                    if (compareVal.equals(endVal))
                    {
                        System.out.println("Separation: " + size);
                        return -1;
                    }
                }
            }
        }
    }
    size++;
    //System.out.println(next);
    primeLoop(next, endVal, size);
    return -1;
}
*Edit: Removed unnecessary code from the snippet above. Created a currentSize variable that stops the program from having to call size() on current every iteration. Still no difference. Here is an idea of how the ArrayList grows:
2,
29,
249,
2293,
20727,
190819,
When something is slow, the typical advice is to profile it. This is generally wise, as it's often difficult to determine what's the cause of slowness, even for performance experts. Sometimes it's possible to pick out code that's likely to be a performance problem, but this is hit-or-miss. There are some likely things in this code, but it's hard to say for sure, since we don't have the code for the orderPrime() and primeLoop() methods.
That said, there's one thing that caught my eye. This line:
currentVal = Integer.parseInt(current.get(k).toString());
This gets an element from current, turns it into a string, parses it back to an int, and then boxes it into an Integer. Conversion to and from String is pretty expensive, and it allocates memory, so it puts pressure on garbage collection. Boxing primitive int values to Integer objects also allocates memory, contributing to GC pressure.
It's hard to say what the fix is, since you're using the raw type ArrayList for current. I surmise it might be ArrayList<Integer>, and if so, you could just replace this line with
currentVal = (Integer)current.get(k);
You should be using generics in order to avoid the cast. (But that doesn't affect performance, just the readability and type-safety of the code.)
If current doesn't contain Integer values, then it should. Whatever it contains should be converted to Integer beforehand, instead of putting conversions inside a loop.
After fixing this, you are still left with boxing/unboxing overhead. If performance is still a problem, you'll have to switch from ArrayList<Integer> to int[] because Java collections cannot contain primitives. This is inconvenient, since you'll have to implement your own list-like structure that simulates a variable-length array of int (or find a third party library that does this).
But even all of the above might not be enough to make your program run fast enough. I don't know what your algorithm is doing, but it looks like it's doing linear searching. There are a variety of ways to speed up searching. But another commenter suggested binary search, and you said it wasn't allowed, so it's not clear what can be done here.
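As a rough illustration of the "list-like structure that simulates a variable-length array of int" mentioned above, here is a minimal sketch (the class and method names are invented for this example; third-party libraries such as fastutil or Eclipse Collections provide production-quality equivalents):

```java
import java.util.Arrays;

// Minimal growable array of primitive ints -- a sketch, not a drop-in
// ArrayList replacement. Avoids per-element boxing entirely.
public class IntList {
    private int[] data = new int[16];
    private int size = 0;

    public void add(int value) {
        if (size == data.length) {
            // Grow by doubling, like ArrayList's amortized growth strategy.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    public int get(int index) {
        if (index >= size) throw new IndexOutOfBoundsException(String.valueOf(index));
        return data[index];
    }

    public int size() {
        return size;
    }
}
```

Because the elements live in one contiguous int[], iteration also tends to be cache-friendly compared with a list of boxed Integer objects.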
Here is an idea of how the ArrayList grows: 2, 29, 249, 2293, 20727, 190819
Your next list grows too large, so it must contain duplicates:
190_819 entries for 100_000 numbers?
According to primes.utm.edu/howmany.html there are only 9,592 primes up to 100_000.
Getting rid of the duplicates will certainly improve your response times.
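One possible sketch of how the duplicates could be filtered out while building next (adapting the snippet's raw lists to generics; Set.add() conveniently reports whether the value was new):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Dedup {
    // Collects each candidate at most once. add() returns false when the
    // value is already in the set, so duplicates never reach `next`.
    static List<Integer> withoutDuplicates(List<Integer> candidates) {
        Set<Integer> seen = new LinkedHashSet<>(); // keeps insertion order
        List<Integer> next = new ArrayList<>();
        for (Integer compareVal : candidates) {
            if (seen.add(compareVal)) {
                next.add(compareVal);
            }
        }
        return next;
    }
}
```

Since HashSet/LinkedHashSet lookups are O(1) on average, this keeps the per-element cost constant instead of letting the list grow with duplicates.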
Why do you have this line:
current.iterator();
You don't use the iterator at all; you don't even have a variable for it. It's just a waste of time.
for (int k = 0; k <= current.size() - 1; k++)
Instead of calling size() every iteration, create a variable like:
int curSize = current.size() - 1;
and use it in the loop.
It can save some time.

Slow initialization of large array of small objects

I've stumbled upon this case today and I'm wondering what is the reason behind this huge difference in time.
The first version initializes a 5k x 5k array of raw ints:
public void initializeRaw() {
    int size = 5000;
    int[][] a = new int[size][size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            a[i][j] = -1;
}
and it takes roughly 300ms on my machine.
On the other hand, initializing the same array with simple 2-int structs:
public class Struct { public int x; public int y; }
public void initializeStruct() {
    int size = 5000;
    Struct[][] a = new Struct[size][size];
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            a[i][j] = new Struct();
}
takes over 15000ms.
I would expect it to be a bit slower, after all there is more memory to allocate (10 bytes instead of 4 if I'm not mistaken), but I cannot understand why could it take 50 times longer.
Could anyone explain this? Maybe there is just a better way to do this kind of initialization in Java?
EDIT: For some comparison - the same code that uses Integer instead of int/Struct works 700ms - only two times slower.
I would expect it to be a bit slower, after all there is more memory to allocate (10 bytes instead of 4 if I'm not mistaken), but I cannot understand why could it take 50 times longer.
No, it's much worse than that. In the first case, you're creating 5001 objects. In the second case, you're creating 25,005,001 objects. Each of the Struct objects is going to take between 16 and 32 bytes, I suspect. (It will depend on various JVM details, but that's a rough guess.)
Your 5001 objects in the first case will take a total of ~100MB. The equivalent objects (the arrays) may take a total of ~200MB, if you're on a platform with 64-bit references... and then there's the other 25 million objects to allocate and initialize.
So yes, a pretty huge difference...
When you create an array of 5000 ints, you are allocating all the space you need for all those ints in one go, as a single block of consecutive elements. When you assign an int to each array element, you are not allocating anything. Contrast that with an array of 5000 Struct instances. You iterate through that array and for every single one of those 5000 elements you allocate a Struct instance. Allocating an object takes a lot longer than simply writing an int value into a variable.
The fact that you have two-dimensional arrays doesn't make much comparative difference here, as it just means you allocate the same 5001 array objects in both cases.
If you are timing an array of Integer objects, and you are then setting each element to -1, then you are not allocating separate Integer objects each time. Instead, you are using autoboxing, which means the compiler is implicitly calling Integer.valueOf(-1) and that method returns the same object from the cache each time.
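The Integer cache mentioned above is easy to observe directly. A small sketch (the cache range of -128 to 127 is guaranteed by the JLS; on HotSpot the upper bound can reportedly be raised with -XX:AutoBoxCacheMax, though that flag is VM-specific):

```java
public class BoxingCache {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(-1);
        Integer b = Integer.valueOf(-1);
        // Values in [-128, 127] come from a shared cache: same object.
        System.out.println(a == b);       // true: identical reference

        Integer c = Integer.valueOf(1000);
        Integer d = Integer.valueOf(1000);
        // Outside the cache range a new object may be allocated each time,
        // so == is not reliable there; compare with equals() instead.
        System.out.println(c.equals(d));  // true: equal values
    }
}
```

This is why autoboxing -1 into 25 million array slots allocates nothing extra, while 25 million new Struct() calls each allocate a fresh object.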
UPDATE: Going back to addressing your concern, if I understand correctly, you have a requirement to keep 5000x5000 Structs in a 2D array, and you are disappointed that creating this array takes a lot longer than using primitives. To improve performance, you can create two arrays of primitives, one for each field of Struct, but this would reduce code clarity.
You can also create a single array of longs (since each long is double the size of an int) and use the & and >> operators to get your original ints. Again, this would reduce code clarity, but you'll only have one array.
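A minimal sketch of that packing idea (the layout here is an assumption for illustration: x in the high 32 bits, y in the low 32 bits; the mask on y is needed so a negative y does not clobber x's bits):

```java
public class PackedPoints {
    // Pack x into the high 32 bits and y into the low 32 bits of one long.
    static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFF_FFFFL);
    }

    static int unpackX(long packed) {
        return (int) (packed >> 32); // arithmetic shift restores x's sign
    }

    static int unpackY(long packed) {
        return (int) packed; // truncation keeps the low 32 bits
    }

    public static void main(String[] args) {
        long[][] a = new long[3][3]; // one flat long per would-be Struct
        a[1][2] = pack(-5, 7);
        System.out.println(unpackX(a[1][2]) + "," + unpackY(a[1][2])); // prints -5,7
    }
}
```

The long[][] here allocates only the 5001 array objects of the primitive case, at the cost of the pack/unpack calls on every access.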
However, you seem to be concentrating on a single part of the code, namely the creation of the array. You may well find that the processing you do on each element overshadows the time it takes to create the array. Profile your whole application and see if the creation of arrays is significant.

Does java cache array length calculation in loops [duplicate]

This question already has answers here:
What is the Cost of Calling array.length
(8 answers)
Closed 7 years ago.
Let's say that I have an array which I would like to iterate over:
int[] someArray = {1, 2, 3, 4};
for (int i = 0; i < someArray.length; i++) {
    // do stuff
}
Will the length of the array be calculated on each iteration, or will it be optimized so it is calculated only once?
Should I iterate over arrays by calculating the length in advance and passing that to the loop?
for (int i = 0, length = someArray.length; i < length; i++) {
    // do stuff
}
From JLS 7
10.7 Array Members
The members of an array type are all of the following:
• The public final field length, which contains the number of components of
the array. length may be positive or zero.
Coming back to your question: Java doesn't recount the number of elements in the array on array.length. It returns the value of the public final int length field, which is set during array creation.
As always for performance: write the simplest code you can, and test it to see whether it performs well enough.
If you only need the element (and not the index) I would encourage you to use the enhanced-for loop:
for (int value : array) {
    ...
}
As per JLS 14.14.2 that's basically equivalent to your first piece of code, but the code only talks about what you're actually interested in.
But if you do need the index, and assuming you don't change array anywhere, I believe the JIT compiler will optimize the native code to only fetch the length once. Obtaining the length is an O(1) operation, as it's basically just a field within the array, but obviously it does involve hitting memory, so it's better for the eventual code to only do this once... but that doesn't mean that your code has to do this. Note that I wouldn't expect the Java compiler (javac) to perform this optimization - I'd expect the JIT to.
In fact, I believe a good JIT will actually see code such as:
for (int i = 0; i < array.length; i++) {
    int value = array[i];
    ...
}
and be able to optimize away the array bounds checks - it can recognize that if it accesses the same array object all the time, that can't possibly fail with an array bounds error, so it can avoid the check. It may be able to do the same thing for more "clever" code that fetches the length beforehand, but JIT optimizations often deliberately target very common code patterns (in order to get the biggest "bang for buck") and the above way of iterating over an array is very common.
Since length is a member of the array, it is already set when you create the array; on each iteration you are only accessing that field, nothing else.
So whether you access it like:
int myArrayLength = arr.length;
for (int i = 0; i < myArrayLength; i++)
or like:
for (int i = 0; i < arr.length; i++)
there will be no measurable performance change.

Efficient loop through Java List

The following list is from the google I/O talk in 2008 called "Dalvik Virtual Machine Internals" its a list of ways to loop over a set of objects in order from most to least efficient:
(1) for (int i = initializer; i >=0; i--) //hard to loop backwards
(2) int limit = calculate_limit(); for (int i= 0; i< limit; i++)
(3) Type[] array = get_array(); for (Type obj : array)
(4) for (int i =0; i< array.length; i++) //gets array.length everytime
(5) for (int i=0; i < this.var; i++) //has to calculate what this.var is
(6) for (int i=0; i < obj.size(); i++) //even worse calls function each time
(7) Iterable list = get_list(); for (Type obj : list) //generic object based iterators slow!
The first three are in the same territory of efficiency; avoid (7) if possible. This is mainly advice to help battery life, but could potentially help Java SE code too.
My question is: why (7) is slow and why (3) is good? I thought it might be the difference between Array and List for (3) and (7). Also, as Dan mentioned (7) creates loads of small temporary objects which have to be GCed, I'm a bit rusty on Java nowadays, can someone explain why? It's in his talk video at 0:41:10 for a minute.
This list is a bit outdated, and shouldn't be really useful today.
It was a good reference some years ago, when Android devices were slow and had very limited resources. The Dalvik VM implementation also lacked a lot of optimizations available today.
On such devices, a simple garbage collection easily took 1 or 2 seconds (for comparison, it takes around 20ms on most devices today). During the GC, the device simply froze, so developers had to be very careful about memory consumption.
You shouldn't have to worry too much about that today, but if you really care about performance, here are some details :
(1) for (int i = initializer; i >= 0; i--) //hard to loop backwards
(2) int limit = calculate_limit(); for (int i=0; i < limit; i++)
(3) Type[] array = get_array(); for (Type obj : array)
These ones are easy to understand. i >= 0 is faster to evaluate than i < limit because it doesn't read the value of a variable before doing the comparison. It works directly with an integer literal, which is faster.
I don't know why (3) should be slower than (2). The compiler should produce the same loop as (2), but maybe the Dalvik VM didn't optimize it correctly at this time.
(4) for (int i=0; i < array.length; i++) //gets array.length everytime
(5) for (int i=0; i < this.var; i++) //has to calculate what this.var is
(6) for (int i=0; i < obj.size(); i++) //even worse calls function each time
These ones are already explained in the comments.
(7) Iterable list = get_list(); for (Type obj : list)
Iterables are slow because they allocate memory, do some error handling, call multiple methods internally, ... All of this is much slower than (6) which does only a single function call on each iteration.
I felt my first answer was not satisfactory and really didn't help to explain the question; I had posted the link to this site and elaborated a bit, which covered some basic use cases, but not the nitty-gritty of the issue. So, I went ahead and did a little hands-on research instead.
I ran two separate codes:
// Code 1
int i = 0;
Integer[] array = { 1, 2, 3, 4, 5 };
for (Integer obj : array) {
i += obj;
}
System.out.println(i);
// Code 2
int i = 0;
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
list.add(4);
list.add(5);
for (Integer obj : list) {
i += obj;
}
System.out.println(i);
Of course, both print out 15, and both work with Integer objects (no primitive ints).
Next, I used javap to disassemble these and look at the bytecode. (I ignored the initialization; everything before the for loop is commented out.) Since those are quite lengthy, I posted them at PasteBin here.
Now, while the bytecode for code 1 is actually longer, it is less intensive. It uses invokevirtual only once (aside from the println), and no other invocations are necessary. In code 1, it seems to optimize the iteration to a basic loop; checking the array length, and loading into our variable, then adding to i. This appears to be optimized to behave exactly as for (int i = 0; i < array.length; i++) { ... }.
Now, in code 2, the bytecode gets much more intensive. It has to make two invokeinterface calls (both on the Iterator) in addition to every other call needed above. Additionally, code 2 has to call checkcast because it is a generic Iterator (which is not optimized, as I mentioned above). Now, despite the fact that there are fewer load and store operations, the aforementioned calls involve substantially more overhead.
As he says in the video, if you find yourself needing to do a lot of these, you may run into problems. Running one at the start of an Activity, for example, is probably not too big a deal. Just beware of creating many of them, especially when iterating in onDraw, for example.
I guess that the compiler optimizes (3) to this (this is the part where I'm guessing):
for (int i = 0; i < array.length; ++i)
{
    Type obj = array[i];
}
And (7) can't be optimized, since the compiler doesn't know what kind of Iterable it is, which means it really has to create a new Iterator on the heap. Allocating memory is expensive. And every time you ask for the next object, it goes through some calls.
To give a rough sketch of what happens when (7) gets compiled (sure about this):
Iterable<Type> iterable = get_iterable();
Iterator<Type> it = iterable.iterator(); // new object on the heap
while (it.hasNext()) // method call, some pushing and popping on the stack
{
    Type obj = it.next(); // method call, again pushing and popping
}
I guess you have to marshal the objects through a "linked list"-style Iterator, and then support an API, as opposed to a chunk of memory and a pointer (an array).
The third variant is faster than (7) because arrays are a reified type, so the JVM just has to assign a pointer to the correct value. But when you iterate over a collection, the compiler may perform additional casts because of erasure. The compiler actually inserts these casts into generic code so that problems such as the use of deprecated raw types are detected as soon as possible.
P.S. This is just a guess. In fact, I think the compiler and the JIT compiler can perform any optimisation (the JIT even at runtime), and the result can depend on particular details like the JVM version and vendor.
In case of Android, this is the video from Google Developers in 2015.
To Index or Iterate? (Android Performance Patterns Season 2 ep6)
https://www.youtube.com/watch?v=MZOf3pOAM6A
They ran the test on the Dalvik runtime, a 4.4.4 build, 10 times, to get average results.
The results show the "for index" loop is the best:
int size = list.size();
for (int index = 0; index < size; index++) {
    Object object = list.get(index);
    ...
}
They also suggest, at the end of the video, doing a similar test yourself on your own platform.
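A minimal version of such a do-it-yourself test might look like the sketch below (class and method names are invented; as with any hand-rolled timing loop, treat the numbers as rough indications only and prefer a harness such as JMH on the JVM, or on-device profiling for Android):

```java
import java.util.ArrayList;
import java.util.List;

public class LoopTest {
    static long sumIndexed(List<Integer> list) {
        long sum = 0;
        int size = list.size(); // hoist size, as in the snippet above
        for (int i = 0; i < size; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    static long sumForEach(List<Integer> list) {
        long sum = 0;
        for (Integer value : list) { // allocates an Iterator under the hood
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) list.add(i);

        // Warm up so the JIT compiles both methods before timing.
        for (int w = 0; w < 10; w++) { sumIndexed(list); sumForEach(list); }

        long t0 = System.nanoTime();
        long s1 = sumIndexed(list);
        long indexedNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        long s2 = sumForEach(list);
        long forEachNs = System.nanoTime() - t0;

        System.out.println("indexed: " + indexedNs + " ns, for-each: " + forEachNs
                + " ns, sums equal: " + (s1 == s2));
    }
}
```

Both methods compute the same sum, so any timing difference comes purely from the iteration strategy.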

Is there any reason for initializing iterations of a loop from zero instead of one when not using arrays?

I mean
for (int i = 1; i < 7; i++)
is much more readable, when it is purely about the number of iterations, than
for (int i = 0; i < 6; i++)
but somehow the other approach has become a standard.
What do you think? Is it bad practice or discouraged?
I think it's simply because arrays are almost always 0-based, that when designers create other non-array objects that have collections, they have a tendency to make them 0-based as well. 0-based is just a standard, so sticking to it is just consistency and ease of use for maintainers.
Also, to me,
for (int i = 0; i < 6; i++) {
is more readable because I know that when one iteration has completed, the count (i) will be one. With 1-based, after the first iteration the count is i = 2. This throws me off a bit.
Convention. Zero indexing is so widespread and historically prevalent that, unless there is a specific reason to do otherwise, it is what programmers expect to see. If they see a loop indexed from 1, they usually don't think "Ah, that's easier to read," they think "Oh, better figure out why we're starting from one here."
I would argue that
for (int i = 1; i < 7; i++)
is harder to read than
for (int i = 0; i < 6; i++)
because it is not immediately obvious that the first loop runs 6 iterations; you need to subtract 1 from 7 to figure that out.
The two options are usually to start at 0 and use < as your condition, or to start at 1 and use <= as your condition, like so:
for (int i = 1; i <= 6; i++)
Note that some languages (e.g. MATLAB) do index everything starting at 1.
What usually happens is that all the loops in the language follow the indexing convention of its most basic/widely-used naturally indexed construct. In Java, arrays are indexed at 0, and because it's a pain to switch from 0-based indexing to 1-based indexing and back, everything becomes 0-indexed.
For comparison, in MATLAB the arrays and matrices are 1-indexed, so all the loops are typically 1-indexed as well.
It stems back to the C pointer days (my C is rusty, so forgive me).
The following is an array declaration in C:
int arr[10] = { ... };
In reality, in most expressions arr decays to type int * (but declaring it as above lets the compiler reclaim the storage automatically when it falls out of scope, as opposed to an int * with a malloc), which points to a memory location with enough contiguous space for 10 int values. Then you access the array:
int element = arr[index];
What's really going on in the access is this:
int element = *(arr + index);
At the byte level, this adds index * sizeof(int) bytes to the pointer (pointer arithmetic scales by the element size automatically) to reach the memory location of the element you're interested in, and then dereferences it to get the value (the *).
So, for the very first element in the array, your index is 0, because the pointer is already pointing at the first element, so you don't add anything to the memory location.
Think of the index as "how many slots do I have to add to my pointer to find what I want".
