A print statement is speeding up my A* search?

I'm creating an A* search at the moment (wiki page with pseudocode) and I've spent the last hour or so coming up with heuristic equations. When I thought I had finally found a good one, I removed the print statement that let me see which states were being visited. For some reason, that made my search much, much slower. If I add the print back in, it becomes fast again. What could possibly be going on?
I even tried changing what it prints. No matter what I print (as long as it is 2 characters or more), the result is the same.
Some of the code:
I apologize in advance for the messy code; this is my first time working with something like this:
while(!toVisit.isEmpty()){//toVisit is a set of states that need to be visited
int f = Integer.MAX_VALUE;
State temp;
State visiting = new State();
Iterator<State> it = toVisit.iterator();
while(it.hasNext()){//find state with smallest f value
temp = it.next();
if(temp.getF() < f){
f = temp.getF();
visiting = temp;//should be state with smallest f by end of loop
}
}
System.out.println("Visiting: ");//THIS LINE HERE
//LINE THAT MAGICALLY MAKES IT FAST ^^^^
if(numConflicts(visiting.getList()) == 0){//checking if visiting state is the solution
best = visiting.getList();//sets best answer
return visiting;//ends algorithm
}
........
info on toVisit and visiting.getList():
HashSet<State> toVisit = new HashSet<State>();//from Java.util
public ArrayList<Node> State.getList(){return list;}
Node is my own class. It only contains some coordinates.
With the print statement in place, this consistently solves the problem in about 6 seconds. If I change that line to print nothing, or something shorter than about 2 characters, it takes anywhere from 20 to 70 seconds.


Prevent Depth First Search Algorithm from getting stuck in an infinite loop, 8 Puzzle

So my code works for basic 8 Puzzle problems, but when I test it with harder puzzle configurations it runs into an infinite loop. Can someone please edit my code to prevent this from happening? Note that I have included code that is meant to prevent loops or cycles. I also tried the iterative depth-first search technique, but that did not work either. Can someone please review my code?
/** Implementation for the Depth first search algorithm */
static boolean depthFirstSearch(String start, String out ){
LinkedList<String> open = new LinkedList<String>();
open.add(start);
Set<String> visitedStates = new HashSet<String>(); // to prevent the cycle or loop
LinkedList<String> closed = new LinkedList<String>();
boolean isGoalState= false;
while((!open.isEmpty()) && (isGoalState != true) ){
String x = open.removeFirst();
System.out.println(printPuzzle(x)+"\n\n");
jtr.append(printPuzzle(x) +"\n\n");
if(x.equals(out)){ // if x is the goal
isGoalState = true;
break;
}
else{
// generate children of x
LinkedList<String> children = getChildren(x);
closed.add(x); // put x on closed
open.remove(x); // since x is now in closed, take it out from open
//eliminate the children of X if its on open or closed ?
for(int i=0; i< children.size(); i++){
if(open.contains(children.get(i))){
children.remove(children.get(i));
}
else if(closed.contains(children.get(i))){
children.remove(children.get(i));
}
}
// put remaining children on left end of open
for(int i= children.size()-1 ; i>=0 ; i--){
if ( !visitedStates.contains(children.get(i))) { // check if state already visited
open.addFirst(children.get(i)); // add last child first, and so on
visitedStates.add(children.get(i));
}
}
}
}
return true;
}
I would suggest putting the positions that you are considering into a PriorityQueue (https://docs.oracle.com/javase/7/docs/api/java/util/PriorityQueue.html), with a priority based on how close they are to being solved.
So what you do is take the closest position off the queue and add all of the one-move options from there that haven't yet been processed. Then repeat. You'll spend most of your time exploring possibilities that are close to solved instead of just moving randomly forever.
Now your question is "how close are we to solving it?". One approach is to take the sum of all of the taxicab distances between where the squares are and where they need to be. A better heuristic may be to give more weight to getting squares away from the corner into place first. If you get the structure right, changing your heuristic should be easy.
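The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: the board is encoded as a 9-character String with '0' for the blank (similar to the String states in the question), and the class, `Entry`, `manhattan`, `neighbours` and `solve` are hypothetical names, not part of the asker's code. It orders the queue by moves made so far plus the taxicab heuristic (A* ordering) rather than by the heuristic alone, and it remembers the best known cost per state so that no state is expanded endlessly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class EightPuzzleSearch {

    /** Queue entry: a board, the moves used to reach it (g) and g + heuristic (f). */
    static final class Entry {
        final String state;
        final int g;
        final int f;
        Entry(String state, int g, int f) { this.state = state; this.g = g; this.f = f; }
    }

    /** Sum of taxicab distances of every tile (blank excluded) from its goal cell. */
    static int manhattan(String state, String goal) {
        int total = 0;
        for (int i = 0; i < 9; i++) {
            char tile = state.charAt(i);
            if (tile == '0') continue;
            int target = goal.indexOf(tile);
            total += Math.abs(i / 3 - target / 3) + Math.abs(i % 3 - target % 3);
        }
        return total;
    }

    /** All boards reachable by sliding one tile into the blank. */
    static List<String> neighbours(String state) {
        List<String> result = new ArrayList<>();
        int blank = state.indexOf('0');
        int row = blank / 3, col = blank % 3;
        int[][] moves = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
        for (int[] m : moves) {
            int r = row + m[0], c = col + m[1];
            if (r < 0 || r > 2 || c < 0 || c > 2) continue;
            char[] cells = state.toCharArray();
            cells[blank] = cells[r * 3 + c];
            cells[r * 3 + c] = '0';
            result.add(new String(cells));
        }
        return result;
    }

    /** Returns the number of moves in a shortest solution, or -1 if the goal is unreachable. */
    static int solve(String start, String goal) {
        Map<String, Integer> bestCost = new HashMap<>();
        PriorityQueue<Entry> open = new PriorityQueue<>((a, b) -> Integer.compare(a.f, b.f));
        open.add(new Entry(start, 0, manhattan(start, goal)));
        bestCost.put(start, 0);
        while (!open.isEmpty()) {
            Entry current = open.poll();
            if (current.g > bestCost.get(current.state)) continue; // stale entry, a cheaper path was found
            if (current.state.equals(goal)) return current.g;
            for (String next : neighbours(current.state)) {
                int g = current.g + 1;
                Integer known = bestCost.get(next);
                if (known == null || g < known) { // first visit, or a cheaper way to reach this board
                    bestCost.put(next, g);
                    open.add(new Entry(next, g, g + manhattan(next, goal)));
                }
            }
        }
        return -1; // every reachable board was explored without finding the goal
    }

    public static void main(String[] args) {
        System.out.println(solve("123456708", "123456780"));
    }
}
```

Because every board is stored in `bestCost` the first time it is seen, the loop ends even for unsolvable start positions: it simply runs out of boards to expand.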

Bellman-Ford improvement: does it work?

I'm trying to improve the performance of the Bellman-Ford algorithm, and I would like to know whether the improvement is correct.
I run the relaxation part V times instead of V-1, and I keep a boolean variable that is set to true if any relaxation happened during the current iteration of the outer loop. If no relaxation happened in the n-th iteration, where n <= V, the method returns the shortest path right away; but if an edge is still relaxed in iteration n = V, that means we have a negative cycle.
I thought it might improve runtime, since sometimes we don't have to iterate V-1 times to find the shortest path and can return earlier. It is also more elegant than checking for the cycle with a separate block of code.
AdjacencyListALD graph;
int[] distTo;
int[] edgeTo;
public BellmanFord(AdjacencyListALD g)
{
graph = g;
}
public int findSP(int source, int dest)
{
// initialization
distTo = new int[graph.SIZE];
edgeTo = new int[graph.SIZE];
for (int i = 0;i<graph.SIZE;i++)
{
distTo[i] = Integer.MAX_VALUE;
}
distTo[source] = 0;
// relaxing V-1 times + 1 for checking negative cycle = V times
for(int i = 0;i<(graph.SIZE);i++)
{
boolean hasRelaxed=false;
for(int j = 0;j<graph.SIZE;j++)
{
for(int x=0;x<graph.sources[j].length;x++)
{
int s = j;
int d = graph.sources[j].get(x).label;
int w = graph.sources[j].get(x).weight;
if(distTo[d] > distTo[s]+w)
{
distTo[d] = distTo[s]+w;
hasRelaxed = true;
}
}
}
if(!hasRelaxed)
return distTo[dest];
}
System.out.println("Negative cycle detected");
return -1;
}
Good comments on the need for testing. That's a given. But they don't address the underlying question: whether the OP's modifications to Bellman-Ford constitute an improvement to the algorithm. And the answer is yes; this is actually a well-known improvement, as G. Bach pointed out in the comments.
The OP's observation is that if, in any relaxation iteration, nothing relaxes, then there will be no changes in subsequent iterations and we can therefore just stop. Absolutely correct. There are no outside influences on the values assigned to the vertices. The only thing updating those values is the relaxation step itself. If it finds nothing to do on any iteration, there is no way that something to do will materialize out of the aether. Ergo we can terminate.
This doesn't affect the worst-case complexity of the algorithm, nor does it help with worst-case graphs, but it can reduce actual running time in practice.
As for running the relaxation one more time (|V| times rather than the usual |V|-1), this is just another way of stating the check for negative cycles that normally follows the relaxation step: after |V|-1 relaxation iterations, we see whether any improvement can still be found, and if one can, that reveals a negative cycle.
Bottom line: OP's approach is sound. Now, yes, test the code.
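For reference, here is a minimal standalone sketch of the early-exit variant discussed above. It does not use the OP's AdjacencyListALD class; the edge-list format {from, to, weight} and all names are assumptions made for illustration. One detail is my own addition rather than something from the question: vertices that are still unreachable are skipped, because adding a weight to the "infinity" sentinel would otherwise overflow and cause bogus relaxations.

```java
import java.util.Arrays;

class BellmanFordEarlyExit {

    static final long INF = Long.MAX_VALUE; // sentinel for "not reached yet"

    /**
     * edges[k] = {from, to, weight}. Returns the distances from source,
     * or null if a negative cycle reachable from source is detected.
     */
    static long[] shortestPaths(int vertexCount, int[][] edges, int source) {
        long[] dist = new long[vertexCount];
        Arrays.fill(dist, INF);
        dist[source] = 0;
        // V passes: at most V-1 are needed to settle distances, the V-th only detects a negative cycle
        for (int pass = 0; pass < vertexCount; pass++) {
            boolean relaxed = false;
            for (int[] e : edges) {
                if (dist[e[0]] == INF) continue; // unreachable so far; also avoids overflow
                long candidate = dist[e[0]] + e[2];
                if (candidate < dist[e[1]]) {
                    dist[e[1]] = candidate;
                    relaxed = true;
                }
            }
            if (!relaxed) return dist; // nothing changed, so later passes cannot change anything either
        }
        return null; // still relaxing on pass V: negative cycle
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 4}, {0, 2, 5}, {1, 2, -2}};
        System.out.println(Arrays.toString(shortestPaths(3, edges, 0)));
    }
}
```

Returning null for the negative-cycle case avoids the ambiguity of returning -1, which could also be a legitimate distance when negative weights are allowed.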

Java A* Implementation Issues

I have written an implementation of the A* algorithm, taken mainly from this wiki page. However, I have a major problem: I believe I am visiting far too many nodes while calculating a route, which ruins my performance. I've been trying to figure out the issue for a few days and I can't see what's wrong. Please note that all my data structures are self-implemented; however, I've tested them and believe they're not the issue.
I've included my Priority Queue implementation just in case.
closedVertices is a hash map of vertices.
private Vertex routeCalculation(Vertex startLocation, Vertex endLocation, int routetype)
{
Vertex vertexNeighbour;
pqOpen.AddItem(startLocation);
while (!(pqOpen.IsEmpty()))
{
tempVertex = pqOpen.GetNextItem();
for (int i = 0; i < tempVertex.neighbors.GetNoOfItems(); i++) //for each neighbor of tempVertex
{
currentRoad = tempVertex.neighbors.GetItem(i);
currentRoad.visited = true;
vertexNeighbour = allVertices.GetNewValue(currentRoad.toid);
//if the neighbor is in closed set, move to next neighbor
checkClosed();
nodesVisited++;
setG_Score();
//checks if neighbor is in open set
findNeighbour();
//if neighbour is not in open set
if (!foundNeighbor || temp_g_score < vertexNeighbour.getTentativeDistance())
{
vertexNeighbour.setTentativeDistance(temp_g_score);
//calculate H once, store it and then do an if statement to see if it's been used before - if true, grab from memory, else calculate.
if (vertexNeighbour.visited == false)
vertexNeighbour.setH(heuristic(endLocation, vertexNeighbour));
vertexNeighbour.setF(vertexNeighbour.getH() + vertexNeighbour.getTentativeDistance());
// if neighbor isn't in open set, add it to open set
if (!(foundNeighbor))
{
pqOpen.AddItem(vertexNeighbour);
}
else
{
pqOpen.siftUp(foundNeighbourIndex);
}
}
}
}
}
return null;
}
Can anyone see where I may be exploring too many nodes?
Also, I've attempted to implement a way to calculate the quickest (timed) route by modifying F by the speed of the road. Am I right in saying this is the correct way to do it?
(I divided the speed of the road by 100 because it was taking a long time to execute otherwise.)
I found my own error: I had implemented the heuristic calculation for each node incorrectly. I had an if statement to check whether H had already been calculated, but the check was wrong, so H was never actually calculated for some nodes, which resulted in excessive node exploration. I simply removed the line if (vertexNeighbour.visited == false) and now I have perfect calculations.
However, I am still trying to figure out how to calculate the fastest route in terms of time.
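One common way to set up a fastest-route search, offered here as an assumption and not as the asker's eventual solution, is to make the edge cost the travel time of the road and to estimate the remaining time as straight-line distance divided by the highest speed anywhere in the network. That estimate never overstates the true remaining time, which is what A* needs to keep returning optimal routes. The class, method names and units below are hypothetical.

```java
class TravelTimeCost {

    /** Cost of driving one road, in hours: this is what would be added to g. */
    static double edgeHours(double lengthKm, double speedKmh) {
        return lengthKm / speedKmh;
    }

    /** Optimistic estimate of the remaining hours: a straight line at the fastest speed in the network. */
    static double heuristicHours(double straightLineKm, double maxSpeedKmh) {
        return straightLineKm / maxSpeedKmh;
    }

    public static void main(String[] args) {
        System.out.println(edgeHours(100.0, 50.0));       // time spent on a 100 km road at 50 km/h
        System.out.println(heuristicHours(100.0, 100.0)); // best case for 100 km still to go
    }
}
```

With this setup g, h and f are all measured in hours, so no extra scaling of F should be needed.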

Optimization: replace for loop with ListIterator

It's my first time working on quite a big project, and I've been asked to get the best performance possible.
So I thought I would replace my for loops with a ListIterator, because I've got around 180 loops that call list.get(i) on lists with about 5000 elements.
So I've got two questions.
1) Are these 2 snippets equivalent? I mean, do they produce the same output? If not, how can I correct the ListIterator version?
ListIterator<Corsa> ridesIterator = rides.listIterator();
while (ridesIterator.hasNext()) {
ridesIterator.next();
Corsa previous = ridesIterator.previous(); //rides.get(i-1)
Corsa current = ridesIterator.next(); //rides.get(i)
if (current.getOP() < d.getFFP() && previous.getOA() > d.getIP() && current.wait(previous) > DP) {
doSomething();
break;
}
}
__
for (int i = 1; i < rides.size(); i++) {
if (rides.get(i).getOP() < d.getFP() && rides.get(i - 1).getOA() > d.getIP() && rides.get(i).getOP() - rides.get(i - 1).getOA() > DP) {
doSomething();
break;
}
}
2) What would the first snippet look like if I had something like this? (changed i and its exit condition)
for (int i = 0; i < rides.size() - 1; i++) {
if (rides.get(i).getOP() < d.getFP() && rides.get(i + 1).getOA() > d.getIP() && rides.get(i).getOP() - rides.get(i + 1).getOA() > DP) {
doSomething();
break;
}
}
I'm asking because it's the first time that I'm using a ListIterator and I can't try it now!
EDIT: I'm not using an ArrayList; it's a custom List based on a LinkedList.
EDIT 2: I'm adding some more info.
I can't use a caching system because my data changes on every iteration, and managing the cache would be hard as I'd have to deal with inconsistent data.
I can't merge some of these loops into one big loop either, as they live in different methods because they need to do a lot of different things.
So, sticking to this particular case, what do you think is the best practice?
Is ListIterator the best way to deal with my case? And how can I use the ListIterator if my for loop runs from 0 to size-1?
If you know the maximum size, you will get the best performance by giving up collections such as ArrayList and replacing them with plain arrays.
So instead of creating an ArrayList<Corsa> with 5000 elements, use Corsa[] rides = new Corsa[5000]. Instead of hard-coding 5000, declare it as a constant, for example final static int MAX_RIDES = 5000, to avoid a magic number in the code. Then iterate with a normal for loop, referring to rides[i].
Generally, if you are after performance, you should write Java as if it were C/C++ (where you can, of course). The code is not as object-oriented and beautiful, but it's fast. Remember to always optimize at the end, once you are sure you have found a bottleneck. Otherwise your efforts are futile and only make the code less readable and maintainable. Also use a profiler to make sure your changes are in fact upgrades, not downgrades.
Another downside of using ListIterator is that it allocates memory internally, so the GC (garbage collector) will run more often, which can also affect overall performance.
No, they do not do the same thing.
while (ridesIterator.hasNext()) {
ridesIterator.next();
Corsa previous = ridesIterator.previous(); //rides.get(i-1)
Corsa current = ridesIterator.next(); //rides.get(i)
The variables previous and current would contain the same "Corsa" value, see the ListIterator documentation for details (iterators are "in between" positions).
The correct code would look as follows:
while (ridesIterator.hasNext()) {
Corsa previous = ridesIterator.next(); //rides.get(i-1)
if(!ridesIterator.hasNext())
break; // We are already at the last element
Corsa current = ridesIterator.next(); //rides.get(i)
ridesIterator.previous(); // going back 1, to start correctly next time
// ... the if check on previous/current from the question goes here ...
}
For your second question (the loop that runs from 0 to size-1), the code would actually look exactly the same; only the interpretation (as shown in the comments) would be different:
while (ridesIterator.hasNext()) {
Corsa previous = ridesIterator.next(); //rides.get(i)
if(!ridesIterator.hasNext())
break; // We are already at the last element
Corsa current = ridesIterator.next(); //rides.get(i+1)
ridesIterator.previous(); // going back 1, to start correctly next time
// ... the if check from the question goes here ...
}
From a (premature?) optimization viewpoint the ListIterator implementation is better.
LinkedList is a doubly-linked list, which means that each element links to both its predecessor (previous) and its successor (next). The iterator version therefore does 3 referrals per loop iteration. => 3*N
Each get(i) has to walk from one end of the list to index i. That is N/4 referrals per call on average (you'd think N/2, but LinkedList starts from whichever end of the list is closer), and the for loop needs at least 2 such calls per iteration. => 2 * N * N/4 == N^2/2
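A variation that avoids stepping the iterator back at all is to carry the previous element in a local variable, so each element is visited exactly once. The sketch below is an illustration only: it uses a list of Integer as a stand-in because the Corsa class is not shown, and the class name, method name and the "gap" condition are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class AdjacentPairs {

    /** Returns the index i of the first pair (values[i-1], values[i]) whose difference exceeds minGap, or -1. */
    static int firstGapLargerThan(List<Integer> values, int minGap) {
        Iterator<Integer> it = values.iterator();
        if (!it.hasNext()) return -1; // empty list: no pairs at all
        int previous = it.next();     // plays the role of rides.get(i - 1)
        int index = 1;
        while (it.hasNext()) {
            int current = it.next();  // plays the role of rides.get(i)
            if (current - previous > minGap) return index;
            previous = current;       // slide the window by one element
            index++;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>(Arrays.asList(1, 2, 10, 11));
        System.out.println(firstGapLargerThan(values, 5));
    }
}
```

This keeps the traversal linear on a linked list, since the iterator only ever moves forward and no get(i) call is made.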
Here are some suggestions, hopefully one or two will be applicable to your situation.
Try to do only one rides.get(x) per loop.
Cache method results in local variables as appropriate for your code.
In some cases the compiler can optimize multiple calls to the same thing doing it just once instead, but not always for many subtle reasons. As a programmer, if you know for a fact that these should deliver the same values, then cache them in local variables.
For example,
int sz = rides.size ();
float dFP = d.getFP (); // wasn't sure of the type, so just called it float..
float dIP = d.getIP ();
Corsa lastRide = rides.get ( 0 );
for ( int i = 1; i < sz; i++ ) {
Corsa r = rides.get ( i );
float rOP = r.getOP ();
if ( rOP < dFP ) {
float lastRideOA = lastRide.getOA (); // only get OA if rOP < dFP
if ( lastRideOA > dIP && rOP - lastRideOA > DP ) {
doSomething ();
// maybe break;
}
}
lastRide = r;
}
These are optimizations that may not work in all cases. For example, if your doSomething expands the list, then you need to recompute sz, or maybe go back to calling rides.size() on each iteration. These optimizations also assume that the list is stable, in that the elements don't change during the get..() calls. If doSomething makes changes to the list, then you'd need to cache less. Hopefully you get the idea. You can apply some of these techniques to the iterator form of the loop as well.

Tips optimizing Java code

So, I've written a spellchecker in Java and things work as they should. The only problem is that if I use a word where the max allowed distance of edits is too large (like say, 9) then my code runs out of memory. I've profiled my code and dumped the heap into a file, but I don't know how to use it to optimize my code.
Can anyone offer any help? I'm more than willing to put up the file/use any other approach that people might have.
-Edit-
Many people asked for more details in the comments. I figured that other people would find them useful, and they might get buried in the comments. Here they are:
I'm using a Trie to store the words themselves.
In order to improve time efficiency, I don't compute the Levenshtein Distance upfront; I calculate it as I go. What I mean by this is that I keep only two rows of the LD table in memory. Since a Trie is a prefix tree, every time I recurse down a node, the previous letters of the word (and therefore the distances computed for them) remain the same. Therefore, I only calculate the distance with that new letter included, with the previous row remaining unchanged.
The suggestions that I generate are stored in a HashMap. The rows of the LD table are stored in ArrayLists.
Here's the code of the function in the Trie that leads to the problem. Building the Trie is pretty straight forward, and I haven't included the code for the same here.
/*
* @param letter: the letter that is currently being looked at in the trie
* word: the word that we are trying to find matches for
* previousRow: the previous row of the Levenshtein Distance table
* suggestions: all the suggestions for the given word
* maxd: max distance a word can be from the query and still be returned as a suggestion
* suggestion: the current suggestion being constructed
*/
public void get(char letter, ArrayList<Character> word, ArrayList<Integer> previousRow, HashSet<String> suggestions, int maxd, String suggestion){
// the new row of the Levenshtein Distance table that is to be computed.
ArrayList<Integer> currentRow = new ArrayList<Integer>(word.size()+1);
currentRow.add(previousRow.get(0)+1);
int insert = 0;
int delete = 0;
int swap = 0;
int d = 0;
for(int i=1;i<word.size()+1;i++){
delete = currentRow.get(i-1)+1;
insert = previousRow.get(i)+1;
if(word.get(i-1)==letter)
swap = previousRow.get(i-1);
else
swap = previousRow.get(i-1)+1;
d = Math.min(delete, Math.min(insert, swap));
currentRow.add(d);
}
// if this node represents a word and the distance so far is <= maxd, then add this word as a suggestion
if(isWord==true && d<=maxd){
suggestions.add(suggestion);
}
// if any of the entries in the current row are <=maxd, it means we can still find possible solutions.
// recursively search all the branches of the trie
for(int i=0;i<currentRow.size();i++){
if(currentRow.get(i)<=maxd){
for(int j=0;j<26;j++){
if(children[j]!=null){
children[j].get((char)(j+97), word, currentRow, suggestions, maxd, suggestion+String.valueOf((char)(j+97)));
}
}
break;
}
}
}
Here's some code I quickly crafted showing one way to generate the candidates and to then "rank" them.
The trick is: you never "test" a non-valid candidate.
To me, your "I run out of memory when I've got an edit distance of 9" screams "combinatorial explosion".
Of course, to dodge a combinatorial explosion you don't do things like trying to generate all the words that are at a distance of 9 from your misspelled word yourself. You start from the misspelled word and generate (quite a lot of) possible candidates, but you refrain from creating too many candidates, for then you'd run into trouble.
(Also note that it doesn't make much sense to compute up to a Levenshtein Edit Distance of 9, because technically any word of fewer than 10 letters can be transformed into any other word of fewer than 10 letters in at most 9 transformations.)
Here's why you simply cannot test all words up to a distance of 9 without either having an OutOfMemory error or simply a program never terminating:
generating all the LED up to 1 for the word "ptmizing", by only adding one letter (from a to z), already generates 9*26 variations (i.e. 234 variations) [there are 9 positions where you can insert one out of 26 letters]
generating all the LED up to 2, by only adding one letter to what we now have, already generates 10*26*234 variations (60 840)
generating all the LED up to 3 gives 11*26*60 840 = 17 400 240 variations
And that is only by considering the cases where we add one, two or three letters (we're not counting deletions, swaps, etc.). And that is on a misspelled word that is only eight characters long. On "real" words, it explodes even faster.
Sure, you could get "smart" and generate this in a way that avoids too many dupes etc., but the point stays: it's a combinatorial explosion, and it explodes fast.
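The growth in the figures above can be reproduced with a few lines of arithmetic. The sketch below only counts insertions, duplicates included, exactly as in the estimate: a word of length n has n+1 insertion positions and 26 letters per position, and each inserted letter makes the word one character longer. The class and method names are hypothetical.

```java
class InsertionCount {

    /** Number of strings produced by inserting `insertions` letters one at a time (duplicates counted). */
    static long variations(int wordLength, int insertions) {
        long total = 1;
        for (int k = 0; k < insertions; k++) {
            // after k insertions the word has wordLength + k letters, hence wordLength + k + 1 positions
            total *= (long) (wordLength + k + 1) * 26;
        }
        return total;
    }

    public static void main(String[] args) {
        // "ptmizing" has 8 letters
        for (int k = 1; k <= 3; k++) {
            System.out.println(k + " insertion(s): " + variations(8, k));
        }
    }
}
```

Each extra insertion multiplies the count by a few hundred, so a distance of 9 is far out of reach for this kind of brute-force generation.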
Anyway... Here's an example. I'm simply passing the dictionary of valid words (containing only four words in this case) to the corresponding method to keep this short.
You'll obviously want to replace the call to the LED with your own LED implementation.
The double-metaphone is just an example: in a real spellchecker, words that "sound alike" despite a larger LED should be considered "more correct" and hence often be suggested first. For example, "optimizing" and "aupteemising" are quite far apart from an LED point of view, but using the double-metaphone you should get "optimizing" as one of the first suggestions.
(disclaimer: following was cranked in a few minutes, it doesn't take into account uppercase, non-english words, etc.: it's not a real spell-checker, just an example)
@Test
public void spellCheck() {
final String src = "misspeled";
final Set<String> validWords = new HashSet<String>();
validWords.add("boing");
validWords.add("Yahoo!");
validWords.add("misspelled");
validWords.add("stackoverflow");
final List<String> candidates = findNonSortedCandidates( src, validWords );
final SortedMap<Integer,String> res = computeLevenhsteinEditDistanceForEveryCandidate(candidates, src);
for ( final Map.Entry<Integer,String> entry : res.entrySet() ) {
System.out.println( entry.getValue() + " # LED: " + entry.getKey() );
}
}
private SortedMap<Integer, String> computeLevenhsteinEditDistanceForEveryCandidate(
final List<String> candidates,
final String mispelledWord
) {
final SortedMap<Integer, String> res = new TreeMap<Integer, String>();
for ( final String candidate : candidates ) {
res.put( dynamicProgrammingLED(candidate, mispelledWord), candidate );
}
return res;
}
private int dynamicProgrammingLED( final String candidate, final String misspelledWord ) {
return Levenhstein.getLevenshteinDistance(candidate,misspelledWord);
}
Here you generate all possible candidates using several methods. I've only implemented one such method (and quickly so it may be bogus but that's not the point ; )
private List<String> findNonSortedCandidates( final String src, final Set<String> validWords ) {
final List<String> res = new ArrayList<String>();
res.addAll( allCombinationAddingOneLetter(src, validWords) );
// res.addAll( allCombinationRemovingOneLetter(src) );
// res.addAll( allCombinationInvertingLetters(src) );
return res;
}
private List<String> allCombinationAddingOneLetter( final String src, final Set<String> validWords ) {
final List<String> res = new ArrayList<String>();
for (char c = 'a'; c <= 'z'; c++) { // <= so that 'z' is tried as well
for (int i = 0; i < src.length(); i++) {
final String candidate = src.substring(0, i) + c + src.substring(i, src.length());
if ( validWords.contains(candidate) ) {
res.add(candidate); // only adding candidates we know are valid words
}
}
if ( validWords.contains(src+c) ) {
res.add( src + c );
}
}
return res;
}
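The example above delegates the distance computation to a Levenhstein helper that is not shown. As a hedged stand-in, here is a minimal two-row dynamic-programming version that could sit behind dynamicProgrammingLED; the class and method names are hypothetical, and it treats characters case-sensitively.

```java
class TwoRowLevenshtein {

    /** Levenshtein edit distance between a and b, keeping only two rows of the table in memory. */
    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = previous[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous; // reuse the two arrays instead of allocating a new row
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("misspeled", "misspelled"));
    }
}
```

Note that the example stores results in a SortedMap keyed by distance, so two candidates with the same distance would overwrite each other; a list sorted by distance would keep them all.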
One thing you could try is increasing the Java heap size, in order to overcome the out-of-memory error.
The following article explains how to increase the heap size in Java:
http://viralpatel.net/blogs/2009/01/jvm-java-increase-heap-size-setting-heap-size-jvm-heap.html
But I think the better way to address your problem is to find a better algorithm than the current one.
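To check which heap limit the JVM is actually running with, a small probe like the following can help. The class name is hypothetical; Runtime.maxMemory() is the standard call that reports the maximum amount of memory the JVM will attempt to use.

```java
class HeapLimit {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it once as `java HeapLimit` and once as `java -Xmx2g HeapLimit` shows the effect of the -Xmx option. A larger heap only postpones the failure if memory use grows combinatorially with the edit distance.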
Well, without more information on the topic there is not much the community can do for you... You can start with the following:
Look at what your profiler says (after it has run for a little while): does anything pile up? Are there a lot of objects of one kind? This should normally give you a hint about what is wrong with your code.
Publish your saved dump somewhere and link it in your question, so someone else can take a look at it.
Tell us which profiler you are using; then somebody can give you hints on where to look for valuable information.
After you have narrowed the problem down to a specific part of your code, if you cannot figure out why there are so many objects of $FOO in memory, post a snippet of the relevant part.
