It's my first time working on a fairly big project, and I've been asked to get the best possible performance.
So I've thought of replacing my for loops with a ListIterator, because I've got around 180 loops which call list.get(i) on lists with about 5000 elements.
So I've got two questions.
1) Are these two snippets equivalent? I mean, do they produce the same output? If not, how can I correct the ListIterator version?
ListIterator<Corsa> ridesIterator = rides.listIterator();
while (ridesIterator.hasNext()) {
ridesIterator.next();
Corsa previous = ridesIterator.previous(); //rides.get(i-1)
Corsa current = ridesIterator.next(); //rides.get(i)
if (current.getOP() < d.getFFP() && previous.getOA() > d.getIP() && current.wait(previous) > DP) {
doSomething();
break;
}
}
__
for (int i = 1; i < rides.size(); i++) {
if (rides.get(i).getOP() < d.getFP() && rides.get(i - 1).getOA() > d.getIP() && rides.get(i).getOP() - rides.get(i - 1).getOA() > DP) {
doSomething();
break;
}
}
2) What would the first snippet look like if I had something like this instead? (I changed i and its exit condition.)
for (int i = 0; i < rides.size() - 1; i++) {
if (rides.get(i).getOP() < d.getFP() && rides.get(i + 1).getOA() > d.getIP() && rides.get(i).getOP() - rides.get(i + 1).getOA() > DP) {
doSomething();
break;
}
}
I'm asking because it's the first time that I'm using a ListIterator and I can't try it now!
EDIT: I'm not using an ArrayList, it's a custom List based on a LinkedList
EDIT 2: I'm adding some more info.
I can't use a caching system because my data changes on every iteration, and managing the cache would be hard as I'd have to deal with inconsistent data.
I can't merge some of these loops into one big loop either, as they live in different methods because they need to do a lot of different things.
So, sticking to this particular case, what do you think is the best practice?
Is ListIterator the best way to deal with my case? And how can I use the ListIterator if my for loop runs from 0 to size-1?
If you know the maximum size, you will get the best performance by giving up collections such as ArrayList and replacing them with plain arrays.
So instead of creating an ArrayList<Corsa> with 5000 elements, use Corsa[] rides = new Corsa[5000]. Instead of hard-coding 5000, declare a constant such as static final int MAX_RIDES = 5000 to avoid a magic number in the code. Then iterate with a normal for loop, referring to rides[i].
Generally, if you are after performance, you should write Java as if it were C/C++ (where you can, of course). The code is less object-oriented and less pretty, but it's fast. Remember to always optimize at the end, once you are sure you have found a bottleneck. Otherwise your efforts are futile and only make the code less readable and maintainable. Also use a profiler to make sure your changes are in fact improvements and not regressions.
Another downside of using ListIterator is that it allocates memory internally. So the GC (garbage collector) may run more often, which can also affect overall performance.
No, they do not do the same thing.
while (ridesIterator.hasNext()) {
ridesIterator.next();
Corsa previous = ridesIterator.previous(); //rides.get(i-1)
Corsa current = ridesIterator.next(); //rides.get(i)
The variables previous and current would contain the same "Corsa" value, see the ListIterator documentation for details (iterators are "in between" positions).
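To see the "in between" cursor behaviour in isolation, here is a minimal sketch. The class and method names are made up for illustration, and plain strings stand in for Corsa objects. It checks that a next() immediately followed by a previous() hands back the very same element:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class; not part of the original code.
class CursorDemo {
    // Returns true when previous() returns the element that next() just returned.
    static <T> boolean nextThenPreviousIsSame(List<T> list) {
        ListIterator<T> it = list.listIterator();
        T first = it.next();      // cursor moves from before element 0 to after it
        T second = it.previous(); // cursor moves back and returns element 0 again
        return first == second;
    }

    public static void main(String[] args) {
        List<String> rides = new LinkedList<>(Arrays.asList("r0", "r1", "r2"));
        System.out.println(nextThenPreviousIsSame(rides)); // prints "true"
    }
}
```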
The correct code would look as follows:
while (ridesIterator.hasNext()) {
Corsa previous = ridesIterator.next(); //rides.get(i-1)
if(!ridesIterator.hasNext())
break; // We are already at the last element
Corsa current = ridesIterator.next(); //rides.get(i)
ridesIterator.previous(); // going back 1, to start correctly next time
// ... compare previous and current here, as in your if statement ...
}
For your second question (indices i and i + 1), the code would look exactly the same; only the interpretation (as shown in the comments) would be different:
while (ridesIterator.hasNext()) {
Corsa previous = ridesIterator.next(); //rides.get(i)
if(!ridesIterator.hasNext())
break; // We are already at the last element
Corsa current = ridesIterator.next(); //rides.get(i+1)
ridesIterator.previous(); // going back 1, to start correctly next time
// ... compare previous and current here, as in your if statement ...
}
From a (premature?) optimization viewpoint, the ListIterator implementation is better.
LinkedList is a doubly-linked list, which means that each element links to both its predecessor (previous) and its successor (next). So the iterator version follows 3 references per loop iteration. => 3*N
Each get(i) has to walk through the list to reach index i. That is N/4 references on average per call. (You'd think N/2, but LinkedList starts from the beginning or the end of the list, whichever is closer.) With two get calls per loop iteration => 2 * N * N/4 == N^2 / 2
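As a variation on the iterator loop above, the step back with previous() can be avoided by remembering the last element in a local variable, so each node is visited exactly once. This is only a sketch under assumptions: the class name, the method name and the BiPredicate parameter are invented for illustration, and integers stand in for Corsa objects.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;
import java.util.function.BiPredicate;

// Hypothetical helper; not part of the original code.
class AdjacentPairs {
    // Returns the index i of the first pair (list[i-1], list[i]) accepted by
    // the test, or -1 if no pair matches. Uses one next() per element.
    static <T> int firstMatch(List<T> list, BiPredicate<T, T> test) {
        ListIterator<T> it = list.listIterator();
        if (!it.hasNext()) {
            return -1; // empty list: no pairs at all
        }
        T previous = it.next();
        while (it.hasNext()) {
            int i = it.nextIndex(); // index of the element next() is about to return
            T current = it.next();
            if (test.test(previous, current)) {
                return i;
            }
            previous = current; // current becomes "i - 1" for the next round
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>(Arrays.asList(1, 2, 4, 5, 9));
        // First adjacent pair whose difference exceeds 2 is (5, 9) at index 4.
        System.out.println(firstMatch(values, (p, c) -> c - p > 2)); // prints "4"
    }
}
```

The same loop covers both index conventions from the question: whether the pair is read as (i-1, i) or (i, i+1) only changes how the two variables are named.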
Here are some suggestions, hopefully one or two will be applicable to your situation.
Try to do only one rides.get(x) per loop.
Cache method results in local variables as appropriate for your code.
In some cases the compiler can optimize multiple calls to the same thing doing it just once instead, but not always for many subtle reasons. As a programmer, if you know for a fact that these should deliver the same values, then cache them in local variables.
For example,
int sz = rides.size ();
float dFP = d.getFP (); // wasn't sure of the type, so just called it float..
float dIP = d.getIP ();
Corsa lastRide = rides.get ( 0 );
for ( int i = 1; i < sz; i++ ) {
Corsa r = rides.get ( i );
float rOP = r.getOP ();
if ( rOP < dFP ) {
float lastRideOA = lastRide.getOA (); // only get OA if rOP < dFP
if ( lastRideOA > dIP && rOP - lastRideOA > DP ) {
doSomething ();
// maybe break;
}
}
lastRide = r;
}
These are optimizations that may not work in all cases. For example, if your doSomething expands the list, then you need to recompute sz, or maybe go back to calling rides.size() each iteration. These optimizations also assume that the list is stable, in that the elements don't change during the get..() calls. If doSomething makes changes to the list, then you'd need to cache less. Hopefully you get the idea. You can apply some of these techniques to the iterator form of the loop as well.
I'm trying to improve the Bellman-Ford algorithm's performance and I would like to know if the improvement is correct.
I run the relaxing part not V-1 but V times, and I use a boolean variable which is set to true if any relaxation happened during an iteration of the outer loop. If no relaxation happened in the n-th iteration, where n <= V, it returns from the loop with the shortest path, but if it still relaxes in iteration n = V, that means we have a negative cycle.
I thought it might improve the runtime, since sometimes we don't have to iterate V-1 times to find the shortest path and can return earlier, and it's also more elegant than checking for the cycle with another block of code.
AdjacencyListALD graph;
int[] distTo;
int[] edgeTo;
public BellmanFord(AdjacencyListALD g)
{
graph = g;
}
public int findSP(int source, int dest)
{
// initialization
distTo = new int[graph.SIZE];
edgeTo = new int[graph.SIZE];
for (int i = 0;i<graph.SIZE;i++)
{
distTo[i] = Integer.MAX_VALUE;
}
distTo[source] = 0;
// relaxing V-1 times + 1 for checking negative cycle = V times
for(int i = 0;i<(graph.SIZE);i++)
{
boolean hasRelaxed=false;
for(int j = 0;j<graph.SIZE;j++)
{
for(int x=0;x<graph.sources[j].length;x++)
{
int s = j;
int d = graph.sources[j].get(x).label;
int w = graph.sources[j].get(x).weight;
if(distTo[d] > distTo[s]+w)
{
distTo[d] = distTo[s]+w;
hasRelaxed = true;
}
}
}
if(!hasRelaxed)
return distTo[dest];
}
System.out.println("Negative cycle detected");
return -1;
}
Good comments on the need for testing. That's a given. But it doesn't address the underlying question, whether the OP's modifications to Bellman-Ford constitute an improvement to the algorithm. And the answer is, yes, this is actually a well-known improvement, as G. Bach pointed out in comments.
The OP's observation is that if, in any relaxation iteration, nothing relaxes, then there will be no changes in subsequent iterations and we can therefore just stop. Absolutely correct. There are no outside influences on the values assigned to the vertices. The only thing updating those values is the relaxation step itself. If it finds nothing to do on any iteration there is no way that something to do will materialize out of the aether. Ergo we can terminate.
This doesn't affect the complexity of the algorithm, nor does it help with worst case graphs, but it can reduce actual running time in practice.
As for running the relaxation one more time (|V| times rather than the usual |V|-1), this is just another way of stating the check for negative cycles that follows the relaxation step. It's just another way of saying that, when we terminate by running |V|-1 relaxation iterations, we need to see if any improvement can still be calculated, which reveals a negative cycle.
Bottom line: OP's approach is sound. Now, yes, test the code.
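For reference, the early-exit idea can be sketched in a self-contained form. This is not the OP's class: it assumes a plain edge list (each edge is {from, to, weight}) instead of AdjacencyListALD, and the class and method names are invented. It also skips edges whose source is still unreached, because adding a weight to Integer.MAX_VALUE would overflow.

```java
import java.util.Arrays;

// Hypothetical sketch; the edge-list representation is an assumption.
class BellmanFordSketch {
    static final int INF = Integer.MAX_VALUE;

    // Returns the distances from source, or null if a negative cycle is
    // reachable from source. Each edge is {from, to, weight}.
    static int[] shortestPaths(int vertexCount, int[][] edges, int source) {
        int[] dist = new int[vertexCount];
        Arrays.fill(dist, INF);
        dist[source] = 0;
        // V passes: V-1 for relaxing, plus one that doubles as the cycle check.
        for (int pass = 0; pass < vertexCount; pass++) {
            boolean relaxed = false;
            for (int[] e : edges) {
                int s = e[0], d = e[1], w = e[2];
                // Skip unreached sources so INF + w cannot overflow.
                if (dist[s] != INF && dist[s] + w < dist[d]) {
                    dist[d] = dist[s] + w;
                    relaxed = true;
                }
            }
            if (!relaxed) {
                return dist; // nothing changed, so later passes cannot change anything
            }
        }
        return null; // still relaxing on pass V: negative cycle
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1, 4}, {0, 2, 1}, {2, 1, 2} };
        System.out.println(Arrays.toString(shortestPaths(3, edges, 0))); // prints "[0, 3, 1]"
    }
}
```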
I'm developing a game in Java, and part of it requires that objects spawn at the top of the screen and proceed to fall down. I have three objects that can possibly spawn, and three possible x coordinates for them to spawn at, all stored in an array called xCoordinates[].
One of the objects is of a class called Enemy, which extends a class I have called FallingThings. In the FallingThings class, I have methods to generate new objects; my enemy method is below:
public static void generateNewEnemy() {
xIndexEnemyOld = xIndexEnemy;
xIndexEnemy = new Random().nextInt(3);
if (delayTimer == 0) {
while (xIndexEnemy == xIndexEnemyOld) {
xIndexEnemy = new Random().nextInt(3);
}
}
if (xIndexEnemy != xIndexMoney && xIndexEnemy != xIndexFriend) {
Enemy enemy = new Enemy(xCoordinates[xIndexEnemy]);
enemies.add((Enemy) enemy);
} else {
generateNewEnemy();
}
}
xIndexEnemy represents the index of the xCoordinates array.
xIndexMoney and xIndexFriend are the indexes of the xCoordinates array for the two other objects (the comparisons with these values ensures that one object does not spawn directly on top of another).
The delayTimer variable represents the random delay between when new objects spawn, which was set earlier in my main class.
I store each instance of an Enemy object in an ArrayList.
Everything works except for the fact that sometimes, an object will spawn over itself (for example, the delay is 0, so two enemy objects spawn directly on top of each other, and proceed to fall down at the same speed at the same time).
I've been trying to crack this for the past two days, but I can't understand exactly why my code right now isn't working properly. I even tried implementing collision detection to check if another object already exists in the space, but that didn't work either.
I would be extremely grateful for any suggestions and ideas.
EDIT2
It seems that you still don't understand the problem with your function. It was addressed in the other answer but I'll try to make it more clear.
public static void generateNewEnemy() {
xIndexEnemyOld = xIndexEnemy;
This is just wrong. You can't set the Old index without having actually used a new index yet.
xIndexEnemy = new Random().nextInt(3);
if (delayTimer == 0) {
while (xIndexEnemy == xIndexEnemyOld) {
xIndexEnemy = new Random().nextInt(3);
}
}
This is actually ok. You're generating an index until you get one that is different. It may not be the most elegant of solutions but it does the job.
if (xIndexEnemy != xIndexMoney && xIndexEnemy != xIndexFriend) {
Enemy enemy = new Enemy(xCoordinates[xIndexEnemy]);
enemies.add((Enemy) enemy);
} else {
generateNewEnemy();
}
}
This is your problem (along with setting the Old index back there). Not only do you have to generate an index that's different from the Old index, it must also be different from IndexMoney and IndexFriend.
Now, what happens if, for example, IndexOld = 0, IndexMoney = 1 and IndexFriend = 2? You have to generate an index that's different from 0, so you get (again, for instance) 1. IndexMoney is 1 too, so the condition will fail and you do a recursive call. (Why do you even have a recursive call?)
OldIndex was 0, and now in the next call you're setting it to 1. So IndexOld = 1, IndexMoney = 1 and IndexFriend = 2. Do you see the problem now? The overlapped index is now wrong. And the new index can only be 0 no matter how many recursive calls it takes.
You're shooting yourself in the foot more than once. The recursive call does not result in an infinite loop (stack overflow actually) because you're changing the Old index. (Which, again is in the wrong place)
That if condition is making it so the newly generated index cannot overlap ANY of the previous indexes. From what you said before it's not what you want.
You can simplify your function like this,
public static void generateNewEnemy() {
xIndexEnemy = new Random().nextInt(3);
if (delayTimer == 0) {
while (xIndexEnemy == xIndexEnemyOld) {
xIndexEnemy = new Random().nextInt(3);
}
}
Enemy enemy = new Enemy(xCoordinates[xIndexEnemy]);
enemies.add((Enemy) enemy);
xIndexEnemyOld = xIndexEnemy;
// Now that you used the new index you can store it as the Old one
}
Will it work? It will certainly avoid overlapping when the delayTimer is 0, but I don't know the rest of your code (nor do I want to) or what it does. You are the one who should know.
About my suggestions, they were alternatives for how to generate the index you wanted. I was assuming you would know how to fit them in your code, but you're still free to try them after you've fixed the actual problem.
Original Answer
Here's one suggestion.
One thing you could do is to have these enemies "borrow" elements from the array. Say you have an array,
ArrayList< Float > coordinates = new ArrayList< Float >();
// Add the coordinates you want ...
You can select one of the indexes as you're doing, but use the maximum size of the array instead and then remove the element that you choose. By doing that you are removing one of the index options.
int nextIndex = new Random().nextInt( coordinates.size() );
float xCoordinate = coordinates.get( nextIndex );
coordinates.remove( nextIndex ); // Remove the coordinate
Later, when you're done with the value (say, when enough time has passed, or the enemy dies) you can put it back into the array.
coordinates.add( xCoordinate );
Now the value is available again and you don't have to bother with checking indexes.
Well, this is the general idea for my suggestion. You will have to adapt it to make it work the way you need, specifically when you place the value back into the array as I don't know where in your code you can do that.
EDIT:
Another alternative is, you keep the array that you previously had. No need to remove values from it or anything.
When you want to get a new coordinate create an extra array with only the values that are available, that is the values that won't overlap other objects.
...
if (delayTimer == 0) {
ArrayList< Integer > availableIndexes = new ArrayList< Integer >();
for ( int i = 0; i < 3; ++i ) {
if ( i != xIndexEnemyOld ) {
availableIndexes.add( i );
}
}
int selectedIndex = new Random().nextInt( availableIndexes.size() );
xIndexEnemy = availableIndexes.get( selectedIndex );
}
// Else no need to use the array
else {
xIndexEnemy = new Random().nextInt( 3 );
}
...
And now you're sure that the index you're getting should be different, so no need to check if it overlaps.
The downside is that you have to create this extra array, but it makes your conditions simpler.
(I'm keeping the "new Random()" from your code, but other answers/comments point out that you should use a single instance; keep that in mind.)
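The "available indexes" idea combined with a single shared Random could be sketched as follows. The class name, the method name and its parameters are assumptions made for this example; only the technique comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper; names are invented for illustration.
class SpawnIndexPicker {
    // One Random for the whole game instead of "new Random()" on every call.
    static final Random RANDOM = new Random();

    // Picks an index in [0, laneCount) that differs from excludedIndex.
    // Assumes laneCount >= 2, otherwise no index would be available.
    static int pickIndex(int laneCount, int excludedIndex, Random random) {
        List<Integer> available = new ArrayList<>();
        for (int i = 0; i < laneCount; i++) {
            if (i != excludedIndex) {
                available.add(i);
            }
        }
        return available.get(random.nextInt(available.size()));
    }

    public static void main(String[] args) {
        int previousLane = 1;
        int lane = pickIndex(3, previousLane, RANDOM);
        System.out.println(lane != previousLane); // prints "true"
    }
}
```

Passing the Random in as a parameter also makes the method easy to test with a fixed seed.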
As I see it, if delayTimer == 0 all is good, but if not, you have a chance of generating a new enemy with the same index. Maybe you want to return early if delayTimer != 0?
UPDATED
Look what you have in such case:
OldEnemyIndex = 1
NewEnemyIndex = random(3) -> 1
DelayTimer = 2
Then you do not enter your first if statement. In the next if everything is fine as long as the enemy does not share its index with money or something else, so you create a new enemy with the same index as the previous one.
I'm creating an A* search at the moment ( wiki page with pseudocode ) and I've been spending the last hour or so coming up with heuristic equations. When I think I finally found a good one, I removed the print statement that was allowing me to see what states were being visited. For some reason, that made my search go much much slower. If I add the print back in, it becomes fast again. What could possibly be going on?
I even tried changing what it prints. No matter what I am printing (as long as it is 2 characters or more), the result is the same.
Some of the code:
I apologize beforehand for messy code, this is my first time working with something like this:
while(!toVisit.isEmpty()){//toVisit is a set of states that need to be visited
int f = Integer.MAX_VALUE;
State temp;
State visiting = new State();
Iterator<State> it = toVisit.iterator();
while(it.hasNext()){//find state with smallest f value
temp = it.next();
if(temp.getF() < f){
f = temp.getF();
visiting = temp;//should be state with smallest f by end of loop
}
}
System.out.println("Visiting: ");//THIS LINE HERE
//LINE THAT MAGICALLY MAKES IT FAST ^^^^
if(numConflicts(visiting.getList()) == 0){//checking if visiting state is the solution
best = visiting.getList();//sets best answer
return visiting;//ends algorithm
}
........
info on toVisit and visiting.getList():
HashSet<State> toVisit = new HashSet<State>();//from Java.util
public ArrayList<Node> State.getList(){return list;}
Node is my own class. It only contains some coordinates
This consistently solves the problem in about 6 seconds. If I change that line to print nothing or something shorter than about 2 characters, it takes anywhere from 20 to 70 seconds.
This question is specifically geared towards the Java language, but I would not mind feedback about this being a general concept if so. I would like to know which operation might be faster, or if there is no difference between assigning a variable a value and performing tests for values. For this issue we could have a large series of Boolean values that will have many requests for changes. I would like to know if testing for the need to change a value would be considered a waste when weighed against the speed of simply changing the value during every request.
public static void main(String[] args){
Boolean array[] = new Boolean[veryLargeValue];
for(int i = 0; i < array.length; i++) {
array[i] = randomTrueFalseAssignment;
}
for(int i = 400; i < array.length - 400; i++) {
testAndChange(array, i);
}
for(int i = 400; i < array.length - 400; i++) {
justChange(array, i);
}
}
This could be the testAndChange method
public static void testAndChange(Boolean[] pArray, int ind) {
if(pArray[ind])
pArray[ind] = false;
}
This could be the justChange method
public static void justChange(Boolean[] pArray, int ind) {
pArray[ind] = false;
}
If we were to end up with the very rare case that every value within the range supplied to the methods were false, would there be a point where one method would eventually become slower than the other? Is there a best practice for issues similar to this?
Edit: I wanted to add this to help clarify the question a bit more. I realize that the data type can be factored into the answer, as larger or more efficient data types can be used. I am more focused on the task itself. Is the task of a test "if(aConditionalTest)" slower, faster, or indeterminable without additional information (such as data type) compared to the task of an assignment "x=avalue"?
As @TrippKinetics points out, there is a semantic difference between the two methods. Because you use Boolean instead of boolean, it is possible that one of the values is a null reference. In that case the first method (with the if statement) will throw a NullPointerException, while the second simply assigns values to all the elements in the array.
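A small sketch makes the difference visible. The wrapper class and the two checking methods are invented for this example; testAndChange and justChange follow the question's methods.

```java
// Hypothetical demo class; not part of the original code.
class NullBooleanDemo {
    static void testAndChange(Boolean[] pArray, int ind) {
        if (pArray[ind]) { // unboxing a null Boolean throws NullPointerException
            pArray[ind] = false;
        }
    }

    static void justChange(Boolean[] pArray, int ind) {
        pArray[ind] = false; // never reads the old value, so null is harmless
    }

    static boolean testAndChangeFailsOnNull() {
        Boolean[] flags = new Boolean[1]; // elements of a Boolean[] default to null
        try {
            testAndChange(flags, 0);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    static boolean justChangeHandlesNull() {
        Boolean[] flags = new Boolean[1];
        justChange(flags, 0);
        return Boolean.FALSE.equals(flags[0]);
    }

    public static void main(String[] args) {
        System.out.println(testAndChangeFailsOnNull()); // prints "true"
        System.out.println(justChangeHandlesNull());    // prints "true"
    }
}
```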
Now assume you use boolean[] instead of Boolean[]. Optimization is an undecidable problem, and there are very rare cases where adding an if statement could result in better performance. For instance, most processors use caches, and the if statement could happen to make the executed code fit exactly on two cache lines where the version without the if spills onto more, resulting in cache misses. Perhaps you think you will save an assignment instruction, but you pay for it with a fetch instruction and a conditional branch (which can stall the CPU pipeline). Assigning has more or less the same cost as fetching a value.
In general, however, one can assume that adding an if statement is useless and will nearly always result in slower code. So it is quite safe to assume that the if statement will slow down your code.
More specifically on your question, there are faster ways to set a range to false. For instance using bitvectors like:
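long[] data = new long[(veryLargeValue + 0x3f) >> 0x06]; // a long has 64 bits
// assign random values
// bit b lives in data[b >> 6] at position (b & 0x3f), counted from the lowest bit
int from = 400;                  // first bit to clear (inclusive)
int to = veryLargeValue - 400;   // end of the range to clear (exclusive)
int low = from >> 0x06;
int high = to >> 0x06;
// assumes low < high, i.e. the range spans more than one long
data[low] &= (1L << (from & 0x3f)) - 1; // keep only the bits below 'from'
for (int i = low + 0x01; i < high; i++) {
    data[i] = 0x00; // clears 64 flags at once
}
data[high] &= -1L << (to & 0x3f); // keep only the bits at or above 'to'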
The advantage is that a processor can operate on 32 or 64 bits at once. Since a boolean carries one bit of information, packing the flags into an int or a long lets a single operation process many of them in parallel.
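If you would rather not maintain the masks by hand, the standard library's java.util.BitSet packs its bits into a long[] as well and offers a ranged clear. A minimal sketch, with a class and method name invented for this example:

```java
import java.util.BitSet;

// Hypothetical demo class; not part of the original code.
class RangeClear {
    // Sets 'size' flags to true, clears the range [from, to) and returns
    // how many flags are still true.
    static int clearAndCount(int size, int from, int to) {
        BitSet flags = new BitSet(size);
        flags.set(0, size);    // all flags true (toIndex is exclusive)
        flags.clear(from, to); // ranged clear, toIndex exclusive as well
        return flags.cardinality();
    }

    public static void main(String[] args) {
        // 2000 flags, clear 400..1599: 400 remain at each end.
        System.out.println(clearAndCount(2000, 400, 1600)); // prints "800"
    }
}
```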
Imagine you want to count how many non-ASCII chars a given char[] contains. Imagine, the performance really matters, so we can skip our favorite slogan.
The simplest way is obviously
int simpleCount() {
int result = 0;
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
Then you think that many inputs are pure ASCII and that it could be a good idea to deal with them separately. For simplicity assume you write just this
private int skip(int i) {
for (; i < string.length; i++) {
if (string[i] >= 128) break;
}
return i;
}
Such a trivial method could be useful for more complicated processing and here it can't do any harm, right? So let's continue with
int smartCount() {
int result = 0;
for (int i = skip(0); i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
It's the same as simpleCount. I'm calling it "smart" as the actual work to be done is more complicated, so skipping over ASCII quickly makes sense. If there's no ASCII prefix or only a very short one, it can cost a few cycles more, but that's all, right?
Maybe you want to rewrite it like this, it's the same, just possibly more reusable, right?
int smarterCount() {
return finish(skip(0));
}
int finish(int i) {
int result = 0;
for (; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
And then you run a benchmark on some very long random string and get this
The parameters determine the ASCII to non-ASCII ratio and the average length of a non-ASCII sequence, but as you can see they don't matter. Trying different seeds and whatever doesn't matter. The benchmark uses caliper, so the usual gotchas don't apply. The results are fairly repeatable, the tiny black bars at the end denote the minimum and maximum times.
Does anybody have an idea what's going on here? Can anybody reproduce it?
Got it.
The difference is in the possibility for the optimizer/CPU to predict the number of loops in for. If it is able to predict the number of repeats up front, it can skip the actual check of i < string.length. Therefore the optimizer needs to know up front how often the condition in the for-loop will succeed and therefore it must know the value of string.length and i.
I made a simple test by replacing string.length with a local variable that is set once in the setup method. Result: smarterCount has about the same runtime as simpleCount. Before the change, smarterCount took about 50% longer than simpleCount. smartCount did not change.
It looks like the optimizer loses the information about how many iterations it will have to do when a call to another method occurs. That's the reason why finish() immediately ran faster with the constant set, but not smartCount(), as smartCount() has no clue what i will be after the skip() step. So I did a second test, where I copied the loop from skip() into smartCount().
And voilà, all three methods return within the same time (800-900 ms).
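For illustration, the two changes described above could be combined as in the following sketch. It is my reading of the description, not the exact code that was benchmarked: the array is passed as a parameter instead of being a field, the length is read once into a local variable, and the loop from skip() is copied inline. It shows only the shape of the code and makes no timing claim.

```java
// Hypothetical reconstruction; names and structure are assumptions.
class NonAsciiCount {
    static int count(char[] string) {
        final int length = string.length; // read the length once into a local
        int i = 0;
        // Loop copied from skip(): advance past the leading ASCII characters.
        for (; i < length; i++) {
            if (string[i] >= 128) break;
        }
        // Same counting loop as simpleCount, starting where the skip stopped.
        int result = 0;
        for (; i < length; i++) {
            result += string[i] >= 128 ? 1 : 0;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two of the six characters are outside the ASCII range.
        System.out.println(count("abc\u00e9d\u4e2d".toCharArray())); // prints "2"
    }
}
```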
My tentative guess would be that this is about branch prediction.
This loop:
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
Contains exactly one branch, the backward edge of the loop, and it is highly predictable. A modern processor will be able to accurately predict this, and so fill its whole pipeline with instructions. The sequence of loads is also highly predictable, so it will be able to pre-fetch everything the pipelined instructions need. High performance results.
This loop:
for (; i < string.length - 1; i++) {
if (string[i] >= 128) break;
}
Has a dirty great data-dependent conditional branch sitting in the middle of it. That is much harder for the processor to predict accurately.
Now, that doesn't entirely make sense, because (a) the processor will surely quickly learn that the break branch will usually not be taken, (b) the loads are still predictable, and so just as pre-fetchable, and (c) after that loop exits, the code goes into a loop which is identical to the loop which goes fast. So I wouldn't expect this to make all that much difference.