I have built an 8-puzzle solver using breadth-first search. I would now like to modify the code to use heuristics. I would be grateful if someone could answer the following two questions:
Solvability
How do we decide whether an 8-puzzle is solvable, given a starting state and a goal state?
This is what Wikipedia says:
The invariant is the parity of the permutation of all 16 squares plus
the parity of the taxicab distance (number of rows plus number of
columns) of the empty square from the lower right corner.
Unfortunately, I couldn't understand what that means; it is a bit complicated. Can someone explain it in simpler language?
Shortest Solution
Given a heuristic, is it guaranteed to give the shortest solution using the A* algorithm? To be more specific, will the first node in the open list always have a depth (or number of movements made so far) which is the minimum of the depths of all the nodes present in the open list?
Should the heuristic satisfy some condition for the above statement to be true?
Edit: How is it that an admissible heuristic will always provide the optimal solution? And how do we test whether a heuristic is admissible?
I would be using the heuristics listed here
Manhattan Distance
Linear Conflict
Pattern Database
Misplaced Tiles
Nilsson's Sequence Score
N-MaxSwap X-Y
Tiles out of row and column
For clarification, from Eyal Schneider:
I'll refer only to the solvability issue. Some background in permutations is needed.
A permutation is a reordering of an ordered set. For example, 2134 is a reordering of the list 1234, where 1 and 2 swap places. A permutation has a parity property; it refers to the parity of the number of inversions. For example, in the following permutation you can see that exactly 3 inversions exist (23,24,34):
1234
1432
That means that the permutation has an odd parity. The following permutation has an even parity (12, 34):
1234
2143
Naturally, the identity permutation (which keeps the items order) has an even parity.
Any state in the 15 puzzle (or 8 puzzle) can be regarded as a permutation of the final state, if we look at it as a concatenation of the rows, starting from the first row. Note that every legal move changes the parity of the permutation (because we swap two elements, and the number of inversions involving items in between them must be even). Therefore, if you know that the empty square has to travel an even number of steps to reach its final state, then the permutation must also be even. Otherwise, you'll end with an odd permutation of the final state, which is necessarily different from it. Same with odd number of steps for the empty square.
According to the Wikipedia link you provided, the criterion above is both necessary and sufficient for a given puzzle to be solvable.
The A* algorithm is guaranteed to find a shortest solution (one of them, if several are equally short), provided your heuristic always underestimates the real cost (in your case, the real number of moves needed to reach the solution).
But off the top of my head I cannot come up with a good heuristic for your problem; that needs some thought.
The real art of using A* is finding a heuristic that always underestimates the real cost, but by as little as possible, to speed up the search.
First ideas for such a heuristic:
A quite bad but valid heuristic that popped into my mind is the Manhattan distance of the empty field to its final destination.
The sum of the Manhattan distances of each field to its final destination, divided by the maximal number of fields that can change position within one move. (I think this is quite a good heuristic.)
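For illustration, here is a minimal Java sketch of the second idea in its most common form: the plain sum of the tiles' Manhattan distances, with the blank left out. Because each move slides exactly one tile by one cell, this sum can drop by at most 1 per move, so it never overestimates. The flat int[9] board layout, the class name, and the goal array are assumptions for the example, not taken from the question.

    // Sketch of the summed Manhattan distance heuristic for the 8-puzzle.
    // Assumes a flat int[9] board in row-major order with 0 for the blank.
    public final class ManhattanHeuristic {

        // Example goal layout; substitute your own.
        private static final int[] GOAL = {1, 2, 3, 8, 0, 4, 7, 6, 5};

        /** Sum of the Manhattan distances of every tile (blank excluded) to its goal cell. */
        static int estimate(int[] board) {
            int h = 0;
            for (int i = 0; i < board.length; i++) {
                int tile = board[i];
                if (tile == 0) continue;                    // the blank does not count
                int g = indexOf(GOAL, tile);
                h += Math.abs(i / 3 - g / 3)                // row distance
                   + Math.abs(i % 3 - g % 3);               // column distance
            }
            return h;
        }

        private static int indexOf(int[] a, int value) {
            for (int i = 0; i < a.length; i++) if (a[i] == value) return i;
            return -1;
        }
    }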
For anyone coming along later, I will attempt to explain how the OP got the value pairs, as well as how he determines the highlighted ones, i.e. inversions, as it took me several hours to figure out. First, the pairs.
First take the goal state and imagine it as a 1D array(A for example)
[1,2,3,8,0,4,7,5]. Each value in that array has its own column in the table (going all the way down), which is the first value of the pair.
Then move over one value to the right in the array (i + 1) and go all the way down again for the second pair value. For example (state A): in the first column, the second values will start [2,3,8,0,4,7,5] going down; in the second column they will start [3,8,0,4,7,5], etc.
Okay, now for the inversions. For each of the two pair values, find their index location in the start state. If the left index > right index then it's an inversion (highlighted). The first four pairs of state A are: (1,2), (1,3), (1,8), (1,0).
1 is at index 3
2 is at index 0
3 > 0, so inversion.
1 is at index 3
3 is at index 2
3 > 2, so inversion.
1 is at index 3
8 is at index 1
3 > 1, so inversion.
1 is at index 3
0 is at index 7
3 < 7, so no inversion.
Do this for each pair and tally up the total inversions.
If both are even or both are odd (the Manhattan distance of the blank spot and the total number of inversions),
then it's solvable. Hope this helps!
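Put together, the test above could look roughly like the following Java sketch: count the inversions of the start state relative to the goal ordering (blank included, as in the worked example) and compare that parity with the parity of the blank's Manhattan distance to its goal cell. The flat int[9] board representation and the class name are assumptions.

    // Solvability check for the 8-puzzle, following the parity rule described above.
    // Boards are flat int[9] arrays in row-major order, with 0 for the blank.
    public final class Solvability {

        static boolean isSolvable(int[] start, int[] goal) {
            // Where does each value sit in the start state?
            int[] posInStart = new int[9];
            for (int i = 0; i < 9; i++) posInStart[start[i]] = i;

            // Inversions: pairs in goal order whose positions are reversed in the start state.
            int inversions = 0;
            for (int i = 0; i < 9; i++)
                for (int j = i + 1; j < 9; j++)
                    if (posInStart[goal[i]] > posInStart[goal[j]]) inversions++;

            // Manhattan distance of the blank between its start and goal cells.
            int s = posInStart[0];
            int g = indexOf(goal, 0);
            int blankDistance = Math.abs(s / 3 - g / 3) + Math.abs(s % 3 - g % 3);

            // Solvable iff both parities match (both even or both odd).
            return inversions % 2 == blankDistance % 2;
        }

        private static int indexOf(int[] a, int value) {
            for (int i = 0; i < a.length; i++) if (a[i] == value) return i;
            return -1;
        }
    }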
I am trying to figure out how to solve the subset sum problem with an extra constraint: the subset of the array needs to be contiguous (the indexes need to be consecutive). I am trying to solve it using recursion in Java.
I know the solution for the non-constrained problem: Each element can be in the subset (and thus I perform a recursive call with sum = sum - arr[index]) or not be in it (and thus I perform a recursive call with sum = sum).
I am thinking about maybe adding another parameter for knowing whether or not the previous index is part of the subset, but I don't know what to do next.
You are on the right track.
Think of it this way:
For every entry you have to decide: do you want to start a new sum at this point, or skip it and reconsider the next entry?
a + b + c + d contains the sum of b + c + d. Do you want to recompute the sums?
Maybe a bottom-up approach would be better
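As a concrete (hypothetical) reading of that bottom-up hint, here is a Java sketch that keeps a running prefix sum and remembers the prefix sums already seen, so no partial sum is ever recomputed; it runs in roughly O(n) and also copes with negative values. The class and method names are illustrative only.

    import java.util.HashMap;
    import java.util.Map;

    // Prefix-sum sketch: a slice arr[j+1..i] sums to target exactly when
    // prefix(i) - prefix(j) == target, i.e. when prefix(i) - target was seen before.
    public final class ContiguousSubsetSum {

        static boolean hasContiguousSum(int[] arr, int target) {
            Map<Long, Integer> seen = new HashMap<>();
            seen.put(0L, -1);                        // empty prefix covers slices starting at index 0
            long prefix = 0;
            for (int i = 0; i < arr.length; i++) {
                prefix += arr[i];
                if (seen.containsKey(prefix - target)) return true;
                seen.putIfAbsent(prefix, i);
            }
            return false;
        }
    }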
The O(n) solution that you asked for:
This solution keeps track of three numbers: the start and end indices, and the total sum of the span.
Starting from element 0 (or from the end of the list if you want) increase the end index until the total sum is greater than or equal to the desired value. If it is equal, you've found a subset sum. If it is greater, move the start index up one and subtract the value of the previous start index. Finally, if the resulting total is greater than the desired value, move the end index back until the sum is less than the desired value. In the other case (where the sum is less) move the end index forward until the sum is greater than the desired value. If no match is found, repeat
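A compact version of that two-index scan might look like the sketch below; note that it relies on the values being non-negative (with negative numbers the window sum is not monotone and the scan breaks down). Names are illustrative.

    // Sliding-window sketch of the scan described above: grow the window on the right,
    // shrink it on the left while the running total is too large.
    public final class SubsetSumWindow {

        static boolean hasWindowSum(int[] arr, int target) {
            int start = 0;
            long total = 0;                          // sum of arr[start .. end]
            for (int end = 0; end < arr.length; end++) {
                total += arr[end];                   // extend the window to the right
                while (total > target && start <= end) {
                    total -= arr[start++];           // drop values from the left while too big
                }
                if (start <= end && total == target) return true;
            }
            return false;
        }
    }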
So, caveats:
Is this "fairly obvious"? Maybe, maybe not. I was making assumptions about order of magnitude similarity when I said both "fairly obvious" and o(n) in my comments
Is this actually o(n)? It depends a lot on how similar (in terms of order of magnitude (digits in the number)) the numbers in the list are. The closer all the numbers are to each other, the fewer steps you'll need to make on the end index to test if a subset exists. On the other hand, if you have a couple of very big numbers (like in the thousands) surrounded by hundreds of pretty small numbers (1's and 2's and 3's) the solution I've presented will get closers to O(n^2)
This solution only works because of your restriction that the subset values are contiguous.
I was reading Data Structures and Algorithms in Java book and I came across the following question that I would like to get help with:
Suppose you are given an array, A, containing 100 integers that were generated using the method r.nextInt(10), where r is an object of type java.util.Random. Let x denote the product of the integers in A. There is a single number that x will equal with probability at least 0.99. What is that number and what is a formula describing the probability that x is equal to that number?
I think x is equal to zero, as most probably a 0 will be generated somewhere. However, that's just a guess. I wasn't able to find the formula. The Java documentation doesn't specify the randomization equation, and I wasn't able to find any related topics either here or by searching with Google.
I would like to get some help with the probability formula please. Thanks in advance.
The possible values for the array elements are 0 .. 9, each with probability 1/10. If one of the elements is 0, the product will be 0 as well. So we calculate the probability that at least one element is 0.
This is the complement of all elements being greater than zero. The probability for an element to be greater than 0 is 9/10, and the probability that all elements are greater than zero is therefore (9/10)^100.
The probability that at least one element is 0 is therefore 1 - (9/10)^100 which is approximately 0.9999734.
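A short Java snippet can confirm the closed form and, as an optional sanity check, a Monte Carlo estimate (the class name and trial count are arbitrary):

    import java.util.Random;

    public final class ZeroProductProbability {
        public static void main(String[] args) {
            // Closed form: P(product == 0) = 1 - (9/10)^100
            System.out.println("1 - (9/10)^100 = " + (1.0 - Math.pow(0.9, 100)));  // about 0.9999734

            // Monte Carlo sanity check: does any of the 100 draws hit 0?
            Random r = new Random();
            int trials = 1_000_000, zeroProducts = 0;
            for (int t = 0; t < trials; t++) {
                for (int i = 0; i < 100; i++) {
                    if (r.nextInt(10) == 0) { zeroProducts++; break; }
                }
            }
            System.out.println("simulated: " + (double) zeroProducts / trials);
        }
    }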
Regarding nextInt: The javadoc specifies:
uniformly distributed int value between 0 (inclusive) and the
specified value (exclusive)
a "uniform distribution" is a distribution where each outcome is equally likely.
hence the chances for a particular outcome are "1/[number of possible outcomes]" (so they all add up to 1).
Regarding the array:
Filling the array can be regarded as observing 100 statistically independent events.
You should read up on how the maths works when combining multiple independent events.
https://docs.oracle.com/javase/7/docs/api/java/util/Random.html#nextInt(int)
As you already expect, it will almost certainly be 0.
r.nextInt(10)
will return numbers from 0 to 9
Any product that includes a 0 is 0, so for 100 random numbers the chance that no 0 at all is returned from this function is pretty low.
I was thinking of making a Sudoku solver, I have 2 questions:
1) What would be faster?
A) Go through all the empty spots, keeping a list of numbers (1-9) for each; remove a number if it already appears in the same row, column, or 3x3 box, and if only one number remains, fill it in. Repeat this while needed.
B) Go through all the numbers, then check all the spots to see if they can have that number. Repeat this while needed.
2) What is the most efficient list type for holding a list of at most 9 elements?
Answer 2) Not a list but a set would make sense. In this case BitSet.
Case 1) There are 27 rules in a 9x9 sudoku.
Case 1A) Every spot participates in 3 rules.
Case 1B) Every number appears 9 times; each occurrence participates in 3 rules.
Answer 1) 1A and 1B should theoretically not be different, but 1A seems to make the algorithm and data structure easier.
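As an illustration of the BitSet suggestion, a per-cell candidate set could be computed roughly like this (the int[9][9] board with 0 for empty cells, and the names, are assumptions, not from the question):

    import java.util.BitSet;

    public final class Candidates {

        // Candidates for one empty cell, pruned by the three rules it participates in.
        static BitSet candidatesFor(int[][] board, int row, int col) {
            BitSet candidates = new BitSet(10);
            candidates.set(1, 10);                      // digits 1..9 all start as possible
            for (int i = 0; i < 9; i++) {
                candidates.clear(board[row][i]);                                    // same row
                candidates.clear(board[i][col]);                                    // same column
                candidates.clear(board[row / 3 * 3 + i / 3][col / 3 * 3 + i % 3]);  // same 3x3 box
            }
            return candidates;   // cardinality() == 1 means the cell is forced (approach A)
        }
    }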
I think B works! You can use a backtracking algorithm to try the empty spot with each of the numbers 1-9 (in order). Fill the spot with the first available choice (1-9) and move ahead. If at any point you are unable to insert a number into a slot, backtrack to the previous slot and try a different number.
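A bare-bones Java sketch of that backtracking idea (again assuming an int[9][9] board with 0 for empty cells) might look like this:

    // Backtracking solver: find the next empty cell, try each digit that breaks no rule,
    // recurse, and undo the placement on failure.
    public final class SudokuBacktracking {

        static boolean solve(int[][] board) {
            for (int row = 0; row < 9; row++) {
                for (int col = 0; col < 9; col++) {
                    if (board[row][col] != 0) continue;       // already filled
                    for (int digit = 1; digit <= 9; digit++) {
                        if (fits(board, row, col, digit)) {
                            board[row][col] = digit;          // tentative placement
                            if (solve(board)) return true;    // the rest worked out
                            board[row][col] = 0;              // backtrack
                        }
                    }
                    return false;    // no digit fits this empty cell
                }
            }
            return true;             // no empty cell left: solved
        }

        static boolean fits(int[][] board, int row, int col, int digit) {
            for (int i = 0; i < 9; i++) {
                if (board[row][i] == digit) return false;                                     // row
                if (board[i][col] == digit) return false;                                     // column
                if (board[row / 3 * 3 + i / 3][col / 3 * 3 + i % 3] == digit) return false;   // box
            }
            return true;
        }
    }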
This might be helpful :
http://edwinchan.wordpress.com/2006/01/08/sudoku-solver-in-c-using-backtracking/
Could someone guide me on how to solve this problem?
We are given a set S with k elements.
Now we have to divide the set S into x subsets such that the difference in the number of elements between any two subsets is at most 1, and the sums of the subsets are as close to each other as possible.
Example 1:
{10, 20, 90, 200, 100} has to be divided into 2 subsets
Solution:{10,200}{20,90,100}
Sums are 210 and 210.
Example 2:
{1, 1, 2, 1, 1, 1, 1, 1, 1, 6}
Solution:{1,1,1,1,6}{1,2,1,1,1}
Sums are 10 and 6.
I see a possible solution in two stages.
Stage One
Start by selecting the number of subsets, N.
Sort the original set, S, if possible.
Distribute the largest N numbers from S into subsets 1 to N in order.
Distribute the next N largest numbers from S to the subsets in reverse order, N to 1.
Repeat until all numbers are distributed.
If you can't sort S, then distribute each number from S into the subset (or one of the subsets) with the least entries and the smallest total.
You should now have N subsets all sized within 1 of each other and with very roughly similar totals.
Stage Two
Now try to refine the approximate solution you have.
Pick the subset with the largest total, L, and the subset with the smallest total, M. Pick a number in L that is larger than a number in M, but by less than the current difference between the two totals, and swap the two numbers; that narrows the gap. Repeat. Not all pairs of subsets will have swappable numbers. Each swap keeps the subsets the same size.
If you have a lot of time you can do a thorough search; if not then just try to pick off a few obvious cases. I would say don't swap numbers if there is no decrease in difference; otherwise you might get an infinite loop.
You could interleave the stages once there are at least two members in some subsets.
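For what it's worth, Stage One can be sketched in a few lines of Java: sort, then deal the values out in a "snake" order so the subset sizes stay within one of each other and the totals start out roughly balanced. The Stage Two swapping pass is not shown, and the class and method names are made up for the example.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public final class BalancedPartition {

        // Stage One: snake distribution of the sorted values over n subsets.
        static List<List<Integer>> distribute(int[] values, int n) {
            int[] sorted = values.clone();
            Arrays.sort(sorted);                                          // ascending; we deal from the large end

            List<List<Integer>> subsets = new ArrayList<>();
            for (int i = 0; i < n; i++) subsets.add(new ArrayList<>());

            for (int k = 0; k < sorted.length; k++) {
                int round = k / n, offset = k % n;
                int target = (round % 2 == 0) ? offset : n - 1 - offset;  // reverse direction every round
                subsets.get(target).add(sorted[sorted.length - 1 - k]);   // k-th largest value
            }
            return subsets;
        }

        public static void main(String[] args) {
            // Example 1 from the question: totals come out as 230 and 190 before any Stage Two refinement.
            System.out.println(distribute(new int[]{10, 20, 90, 200, 100}, 2));
        }
    }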
There is no easy algorithm for this problem.
Check out the partition problem, also known as the easiest hard problem, which solves this for 2 sets. This problem is NP-complete, and you should be able to find the algorithms to solve it on the web.
I know your problem is a bit different since you can choose the number of sets, but you can take inspiration from solutions to the previous one.
For example :
You can transform this into a series of linear programs; let k be the number of elements in your set.
{a1 ... ak} is your set
For i = 2 to k:
try to solve the following program:
x[j][l] = 1 if element j of the set is in subset number l (l <= i), and 0 otherwise
minimise max over all m, n of |sum over p of (a_p * x[p][n]) - sum over p of (a_p * x[p][m])|   // minimise the largest difference between two subset sums
s.t.
sum over n of x[p][n] = 1, for every element p
(sum over p of x[p][n]) - (sum over p of x[p][m]) <= 1, for all m, n   // the numbers of elements in any two subsets differ by at most one
x[p][n] in {0, 1}
if you find a min less than one then stop
otherwise continue
end for
Hope my notation is clear.
The complexity of this program is exponential, and if you found a polynomial way to solve it you would prove P = NP, so I don't think you can do better.
EDIT
I saw your comment; I misunderstood the constraint on the size of the subsets (I thought it was the difference between 2 sets).
I don't have time to update it right now; I will do it when I have time.
Sorry.
EDIT 2
I edited the linear program and it should do what it's asked to do. I just added a constraint.
Hope this time the problem is fully understood, even though this solution might not be optimal
I'm no scientist, so I'd try two approaches:
After sorting the items, go from both "ends", moving the first and last item into the current subset, then shift to the next subset, and loop.
Then:
Check the differences between the subset sums, and shuffle items between subsets if it would help.
Or: encode the resulting sets appropriately and try genetic algorithms.
I have read about the selection algorithm and I have a question; maybe it looks silly, but why do we consider the array as groups of 5 elements? Could we use groups of 7 or 3 elements instead? Thanks. Also, is there any link that could help me understand this better?
Also, this is my proof that when we consider the array in groups of 3 elements it is still of order n. Why? Is this correct?
T(n) <= T(n/3) + T(n/3) + Θ(n)
Claim: T(n) <= cn
Proof: assume for all k < n that T(k) <= ck. Then
T(n) <= (cn/3) + (cn/3) + Θ(n)
T(n) <= (2cn/3) + Θ(n)
T(n) <= cn - (cn/3 - Θ(n)), and for cn/3 >= Θ(n) (i.e. c large enough) this algorithm, under this recurrence, would have order n too!
A little bit of googling and I found this. There is a very small section on why 5, but it doesn't really answer your question specifically, other than to say that it is the smallest possible odd number that can be used (it must be odd to give a median). There is some mathematical proof that it can't be 3 (but I don't really understand it myself). I think it is basically saying it can be any odd number, 5 or greater, but the smaller the better, I guess because it will be quicker to find the median in the smaller groups?
I think you made a mistake for T(n). It should be T(n)=T(n/3)+T(2n/3)+O(n).
The T(n/3) is for finding the pivot (the median of medians). Only half of all the n/3 groups have a median smaller than the pivot, and those groups each have 2 elements smaller than the pivot, giving 2 * (1/2 * n/3) == n/3 elements smaller than the pivot. Thus only 33% are guaranteed to be smaller than the pivot (and 33% greater than the pivot). So, in the worst case you still have 66% left for the next iteration, T(2n/3).
I can't read your proof well, but with the corrected recurrence it is impossible to prove a linear bound. Right?
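To add one step of reasoning (not from the original answer, but standard analysis): with T(n) = T(n/3) + T(2n/3) + cn the two subproblems at every level of the recursion tree together cover about the whole input again (n/3 + 2n/3 = n), so each level costs roughly cn and there are Θ(log n) levels, giving Θ(n log n) rather than O(n). With groups of 5 the recurrence becomes T(n) = T(n/5) + T(7n/10) + cn, and since n/5 + 7n/10 = 9n/10 < n the work per level shrinks geometrically, which is exactly what makes the total O(n).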