I am thinking of writing a Sudoku solver, and I have 2 questions:
1) What would be faster?
A) Go through all the empty spots. For each one, keep a list of the numbers 1-9 and remove every number that already appears in the same row, column, or box. If the list ends up with length 1, fill in the only number remaining. Repeat this while needed.
B) Go through all the numbers, then check all the spots to see if they can have that number. Repeat this while needed.
2) What is the most efficient List for holding at most 9 elements?
Thanks,
Legend
Answer 2) Not a list but a set would make sense. In this case, a BitSet.
Case 1) There are 27 rules in a 9x9 Sudoku (9 rows, 9 columns, 9 boxes).
Case 1A) Every spot participates in 3 rules.
Case 1B) Every number appears 9 times, and each occurrence participates in 3 rules.
Answer 1) 1A and 1B should theoretically not differ, but 1A seems to make the algorithm and data structure easier.
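A minimal sketch of option A with a BitSet, assuming the grid is an int[9][9] in which 0 marks an empty spot. The class and method names (CandidateElimination, candidates, fillSingles) are made up for illustration:

```java
import java.util.BitSet;

class CandidateElimination {
    // Candidates for the cell (row, col); 0 marks an empty cell in the grid.
    static BitSet candidates(int[][] grid, int row, int col) {
        BitSet free = new BitSet(10);
        free.set(1, 10); // start with 1..9 (the upper bound is exclusive)
        for (int k = 0; k < 9; k++) {
            free.clear(grid[row][k]);                                        // same row
            free.clear(grid[k][col]);                                        // same column
            free.clear(grid[3 * (row / 3) + k / 3][3 * (col / 3) + k % 3]);  // same box
        }
        return free;
    }

    // Repeats option A until no empty cell has exactly one candidate left.
    // Returns the number of cells that were filled in.
    static int fillSingles(int[][] grid) {
        int filled = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int r = 0; r < 9; r++) {
                for (int c = 0; c < 9; c++) {
                    if (grid[r][c] != 0) continue;
                    BitSet free = candidates(grid, r, c);
                    if (free.cardinality() == 1) {
                        grid[r][c] = free.nextSetBit(1);
                        filled++;
                        progress = true;
                    }
                }
            }
        }
        return filled;
    }
}
```

Note that this alone only solves puzzles where some spot always has a single candidate; harder puzzles still need a search on top of it.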
I think B works! You can use a backtracking algorithm: try the numbers 1-9 (in order) in an empty spot, fill the spot with the first number that fits, and move ahead. If at any point you are unable to insert a number into a slot, backtrack to the previous slot and try a different number there.
This might be helpful :
http://edwinchan.wordpress.com/2006/01/08/sudoku-solver-in-c-using-backtracking/
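The backtracking idea above can be sketched roughly as follows. This is not the code from the linked post, just a minimal illustration that assumes an int[9][9] grid with 0 for empty slots; the names BacktrackingSudoku, allowed and solve are hypothetical:

```java
class BacktrackingSudoku {
    // True if value v does not yet occur in the row, column, or box of (row, col).
    static boolean allowed(int[][] grid, int row, int col, int v) {
        for (int k = 0; k < 9; k++) {
            if (grid[row][k] == v) return false;
            if (grid[k][col] == v) return false;
            if (grid[3 * (row / 3) + k / 3][3 * (col / 3) + k % 3] == v) return false;
        }
        return true;
    }

    // Fills the grid in place; returns false if no solution exists.
    static boolean solve(int[][] grid) {
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                if (grid[r][c] != 0) continue;
                for (int v = 1; v <= 9; v++) {
                    if (allowed(grid, r, c, v)) {
                        grid[r][c] = v;              // tentative choice
                        if (solve(grid)) return true;
                        grid[r][c] = 0;              // undo and try the next number
                    }
                }
                return false; // nothing fits here, so backtrack to the previous slot
            }
        }
        return true; // no empty slot left
    }
}
```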
Related
So, for my final project in a data structures class, we are to develop algorithms for palindromes, but I would like to fancy it up a bit and turn it into a mini program. What real-life situations make use of palindromes, other than work on strings?
Thanks!
In real life, palindromes could be used in some compression algorithms.
For example, there is research on compression algorithms for biological sequences that exploit this property.
HERE, HERE and HERE more details
Palindromes are used in DNA for marking and permitting cutting. They are also used to fold a one-dimensional chain into a 2- or 3-dimensional structure.
One interesting application of the Longest Palindromic Substring (LPS) problem, which Manacher's Algorithm solves, came to my mind: while playing the Indian Rummy card game (aka Rummy 13), if the LPS has length 5, then the card occurring in the middle of that substring is a great candidate for the next selection, because it gives you two rummies. An LPS of length 6, 4, etc. has similar potential.
Example (played with more than 1 deck): there are 13 cards, 5 of which are as below:
Clubs: 2, 3, 4, 3, 2
Let this be the longest palindrome we got out of all 13 cards. Here the middle card is the 4 of Clubs, which is a great candidate to be selected next.
That is because, if you get another 4 of Clubs in the next round, you could have two rummies: 234 and 234.
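As a rough illustration of finding that longest palindromic run, here is a sketch that uses the simpler expand-around-center technique (quadratic time) rather than Manacher's Algorithm. It assumes the cards of one suit are given as an int array of ranks; the names RummyPalindrome and longestPalindrome are invented for this example:

```java
class RummyPalindrome {
    // Returns {start, length} of the longest palindromic contiguous run in ranks.
    static int[] longestPalindrome(int[] ranks) {
        int bestStart = 0;
        int bestLen = ranks.length == 0 ? 0 : 1;
        // Each of the 2n-1 centers is either a card or the gap between two cards.
        for (int center = 0; center < 2 * ranks.length - 1; center++) {
            int lo = center / 2;
            int hi = lo + center % 2;
            while (lo >= 0 && hi < ranks.length && ranks[lo] == ranks[hi]) {
                lo--;
                hi++;
            }
            int len = hi - lo - 1; // lo and hi are now one step outside the palindrome
            if (len > bestLen) {
                bestStart = lo + 1;
                bestLen = len;
            }
        }
        return new int[] { bestStart, bestLen };
    }

    // Rank of the middle card when the longest run has odd length, otherwise -1.
    static int middleCard(int[] ranks) {
        int[] best = longestPalindrome(ranks);
        return best[1] % 2 == 1 ? ranks[best[0] + best[1] / 2] : -1;
    }
}
```

For the hand above, middleCard(new int[] {2, 3, 4, 3, 2}) picks out the 4.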
Palindromes are strings that read the same forwards as backwards such as:
A man, a plan, a canal, Panama!
Was it Eliot's toilet I saw?
Dennis And Edna Sinned
There aren't many real-world applications for this, and finding Palindromes is fairly specific to Strings... Even numeric palindromes operate on the digits within a String...
ie. 580085
is a numeric palindrome, but would still be found by analysing characters in a String.
However, the skills you get from learning to traverse Strings in reverse, recognise special cases such as shared middle characters, perform case insensitive comparisons and strip out non alphanumeric characters from Strings when performing comparisons are useful to all sorts of real-world applications.
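A small sketch of those skills combined, assuming that "non-alphanumeric" means anything Character.isLetterOrDigit rejects. The class name PalindromeCheck is just for illustration:

```java
class PalindromeCheck {
    // Compares characters from both ends, skipping punctuation and spaces
    // and ignoring case, so no cleaned-up copy of the string is needed.
    static boolean isPalindrome(String text) {
        int i = 0;
        int j = text.length() - 1;
        while (i < j) {
            char a = text.charAt(i);
            char b = text.charAt(j);
            if (!Character.isLetterOrDigit(a)) {
                i++;
            } else if (!Character.isLetterOrDigit(b)) {
                j--;
            } else if (Character.toLowerCase(a) != Character.toLowerCase(b)) {
                return false;
            } else {
                i++;
                j--;
            }
        }
        return true; // the pointers met, possibly on a shared middle character
    }
}
```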
I have built an 8 puzzle solver using Breadth First Search. I would now like to modify the code to use heuristics. I would be grateful if someone could answer the following two questions:
Solvability
How do we decide whether an 8 puzzle is solvable ? (given a starting state and a goal state )
This is what Wikipedia says:
The invariant is the parity of the permutation of all 16 squares plus
the parity of the taxicab distance (number of rows plus number of
columns) of the empty square from the lower right corner.
Unfortunately, I couldn't understand what that meant. It was a bit complicated to understand. Can someone explain it in a simpler language?
Shortest Solution
Given a heuristic, is the A* algorithm guaranteed to give the shortest solution? To be more specific, will the first node in the open list always have a depth (the number of moves made so far) which is the minimum of the depths of all the nodes present in the open list?
Should the heuristic satisfy some condition for the above statement to be true?
Edit : How is it that an admissible heuristic will always provide the optimal solution? And how do we test whether a heuristic is admissible?
I would be using the heuristics listed here
Manhattan Distance
Linear Conflict
Pattern Database
Misplaced Tiles
Nilsson's Sequence Score
N-MaxSwap X-Y
Tiles out of row and column
For clarification from Eyal Schneider :
I'll refer only to the solvability issue. Some background in permutations is needed.
A permutation is a reordering of an ordered set. For example, 2134 is a reordering of the list 1234, where 1 and 2 swap places. A permutation has a parity property; it refers to the parity of the number of inversions. For example, in the following permutation you can see that exactly 3 inversions exist (23,24,34):
1234
1432
That means that the permutation has an odd parity. The following permutation has an even parity (12, 34):
1234
2143
Naturally, the identity permutation (which keeps the items order) has an even parity.
Any state in the 15 puzzle (or 8 puzzle) can be regarded as a permutation of the final state, if we look at it as a concatenation of the rows, starting from the first row. Note that every legal move changes the parity of the permutation (because we swap two elements, and the number of inversions involving items in between them must be even). Therefore, if you know that the empty square has to travel an even number of steps to reach its final position, then the permutation must also be even. Otherwise, you'll end up with an odd permutation of the final state, which is necessarily different from it. The same holds for an odd number of steps: the permutation must then be odd.
According to the Wikipedia link you provided, the criterion above is both necessary and sufficient for a given puzzle to be solvable.
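The parity test described above could be sketched like this. It assumes states are row-major int arrays with 0 for the empty square and tiles numbered 1 to n-1; the names PuzzleSolvability and isSolvable are hypothetical:

```java
class PuzzleSolvability {
    // Solvable iff the parity of the permutation (start relative to goal, blank included)
    // equals the parity of the taxicab distance the blank has to travel.
    static boolean isSolvable(int[] start, int[] goal, int width) {
        int n = start.length;
        int[] goalIndex = new int[n]; // goalIndex[tile] = position of that tile in the goal
        for (int i = 0; i < n; i++) {
            goalIndex[goal[i]] = i;
        }
        int inversions = 0;
        int blankStart = 0;
        for (int i = 0; i < n; i++) {
            if (start[i] == 0) blankStart = i;
            for (int j = i + 1; j < n; j++) {
                // The pair is inverted if the two tiles appear in the opposite order in the goal.
                if (goalIndex[start[i]] > goalIndex[start[j]]) inversions++;
            }
        }
        int blankGoal = goalIndex[0];
        int distance = Math.abs(blankStart / width - blankGoal / width)
                     + Math.abs(blankStart % width - blankGoal % width);
        return (inversions + distance) % 2 == 0;
    }
}
```

For the 8 puzzle, width is 3; the same check should carry over to the 15 puzzle with width 4.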
The A* algorithm is guaranteed to find the shortest solution (or one of them, if several are equally short), provided your heuristic never overestimates the real cost (in your case, the real number of moves needed to reach the solution). Such a heuristic is called admissible.
Off the top of my head I cannot come up with a good heuristic for your problem. Finding one needs some thinking.
The real art of using A* is to find a heuristic that never overestimates the real cost, but underestimates it by as little as possible, to speed up the search.
First ideas for such a heuristic:
A quite bad but valid heuristic that popped into my mind is the Manhattan distance of the empty field to its final destination.
The sum of the Manhattan distances of each field to its final destination, divided by the maximal number of fields that can change position within one move. (I think this is quite a good heuristic.)
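A sketch of the second idea for the 8 puzzle, where a move shifts exactly one tile by one square, so the divisor is 1 and the sum never overestimates the number of moves. It assumes row-major int arrays with 0 for the empty field, which is left out of the sum; the names are illustrative only:

```java
class ManhattanHeuristic {
    // Sum over all tiles (blank excluded) of |row difference| + |column difference|
    // between the tile's current position and its position in the goal.
    static int estimate(int[] state, int[] goal, int width) {
        int[] goalIndex = new int[goal.length]; // goalIndex[tile] = position in the goal
        for (int i = 0; i < goal.length; i++) {
            goalIndex[goal[i]] = i;
        }
        int sum = 0;
        for (int i = 0; i < state.length; i++) {
            int tile = state[i];
            if (tile == 0) continue; // the empty field is not a tile
            int target = goalIndex[tile];
            sum += Math.abs(i / width - target / width)
                 + Math.abs(i % width - target % width);
        }
        return sum;
    }
}
```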
For anyone coming along, I will attempt to explain how the OP got the value pairs, as well as how he determines the highlighted ones (i.e. the inversions), as it took me several hours to figure it out. First the pairs.
First take the goal state and imagine it as a 1D array (A, for example):
[1,2,3,8,0,4,7,5]. Each value in that array has its own column in the table (going all the way down), and it is the first value of each pair in that column.
Then move over 1 value to the right in the array (i + 1) and go all the way down again to get the second value of each pair. For example (state A): in the first column, the second values will run through [2,3,8,0,4,7,5] going down; in the second column, they will run through [3,8,0,4,7,5], etc.
Okay, now for the inversions. For each of the 2 pair values, find their INDEX location in the start state. If the left INDEX > the right INDEX, then it's an inversion (highlighted). The first four pairs of state A are: (1,2), (1,3), (1,8), (1,0)
1 is at index 3
2 is at index 0
3 > 0, so inversion.
1 is at index 3
3 is at index 2
3 > 2, so inversion.
1 is at index 3
8 is at index 1
3 > 1, so inversion.
1 is at index 3
0 is at index 7
3 < 7, so no inversion.
Do this for each pair and tally up the total number of inversions.
If the Manhattan distance of the blank spot and the total number of inversions are both even or both odd,
then it's solvable. Hope this helps!
I am writing a simple Java program that reads a text file containing numbers that represent an (n x n) matrix, with the numbers separated by spaces. For example:
1 2 3 4
5 6 7 8
9 1 2 3
4 5 6 7
I then want to store these numbers in a data structure that I will use to manipulate the data (which will include comparing adjacent numbers and also deleting certain numbers based on specific rules).
If a number is deleted, all the numbers above it fall down by one space.
For the example above, if I delete, say, 8 and 9, then the result would be:
() 2 3 ()
1 6 7 4
5 1 2 3
4 5 6 7
so the numbers fall down in their columns.
And lastly, the matrix given will always be square (so always n x n, where n will always be given and will always be positive); therefore, the data structure has to be flexible enough to accept virtually any n-value.
I was originally implementing it as a 2-d array, but I was wondering if someone had an idea of a better data structure that I could use to improve efficiency (something that will allow me to access all the adjacent numbers in the matrix, along rows and columns, more quickly).
Ultimately, my program will automatically check adjacent numbers against the rules, delete numbers, re-format the matrix, and keep going. In the end I want to be able to create an AI that will remove as many numbers from the matrix as possible in as few moves as possible, for any n x n matrix.
In my opinion, if you know the length of your array when you start, you are better off using an array. A simple data type will be easier to navigate (direct access). Then again, using LinkedLists, you will be able to remove a middle value without having to re-arrange the data inside your matrix. This will leave your "top" value as null. In your example:
null 2 3 null
1 6 7 4
5 1 2 3
4 5 6 7
Hope this helps.
You could use a one-dimensional array of size n*n.
int[] myMatrix = new int[n * n];
To access the element with coordinates (i,j), use myMatrix[i + j * n]. If i is the row and j the column, each column is stored contiguously, so to make elements fall you can use System.arraycopy to shift the part of the column above the hole.
Use a special value (e.g. Integer.MIN_VALUE) as a mark for the () hole.
I expect it would be the fastest and most memory-efficient solution.
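A sketch of how this could look, under the assumption that i is the row and j the column, so that each column occupies a contiguous slice of the array. The FallingMatrix class and its methods are invented for this example:

```java
class FallingMatrix {
    static final int HOLE = Integer.MIN_VALUE; // marks the () hole
    final int n;
    final int[] cells; // element (row, col) lives at cells[row + col * n]

    FallingMatrix(int[][] rows) {
        n = rows.length;
        cells = new int[n * n];
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                cells[row + col * n] = rows[row][col];
            }
        }
    }

    int get(int row, int col) {
        return cells[row + col * n];
    }

    // Deletes (row, col): everything above it in the column falls down by one
    // and a hole appears at the top of the column.
    void delete(int row, int col) {
        int top = col * n; // index of the column's first (topmost) element
        // arraycopy handles overlapping source and destination ranges correctly.
        System.arraycopy(cells, top, cells, top + 1, row);
        cells[top] = HOLE;
    }
}
```

With the 4 x 4 example from the question, delete(1, 3) removes the 8 and delete(2, 0) removes the 9.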
Array access is pretty fast. Accessing adjacent elements is easy, as you just increment the relevant index(s) (being cognizant of boundaries). You could write methods to encapsulate those operations that are well tested. Having elements 'fall down' though might get complicated, but shouldn't be too bad if you modularize it out by writing well tested methods.
All that said, if you don't need the absolute best speed, there are other options.
You also might want to consider a modified circularly linked list. When implementing a sudoku solver, I used the structure outlined here. Looking at the image, you will see that this will allow you to modify your 2d array as you want, since all you need to do is move pointers around.
I'll post a screen shot of relevant picture describing the datastructure here, although I would appreciate it if someone will warn me if I am violating some sort of copy right or other rights of the author, in which case I'll take it down...
Try an array of LinkedLists.
If you want the numbers to fall automatically, I suggest you use a list for each of the columns.
I'm writing a program that generates bingo card numbers. The bingo card is composed of 5 columns, with 4 numbers in each column. The first column can only contain the numbers 1-8, the second 9-16, and so on (up to 40).
In the database, I have two tables for this. The first table is for the column numbers. Each column contains a unique set of numbers (70 sets for each column, which is the number of combinations of 8 taken 4). For 5 columns, I will have 350 sets. The second table is the card numbers. This is composed of 5 columns, each corresponding to the row for B, I, N, G, O. All in all, there are 1,680,700,000 possible combinations for this table. I did it this way because the cards are duplicated for each game; only the control numbers for the cards are unique.
I want to track the winning card for every drawn number. I need the tracking to be as fast as possible, because we're talking about millions of cards here. I thought of 2 options for doing this:
First, checking each drawn number if it exists on the cards, minimizing the card pool for each draw.
Second, associating a unique prime number with each number (1-40), multiplying the primes of a column's numbers, and associating the product with the column (which I call the prime index). The 5 prime indexes, one for each column, are multiplied, and the product is associated with each card/combination (which I call the card index). When a number is drawn, I check whether its prime is a factor of the card index and, if so, divide it out. Each consecutive draw reduces the card index (for each card in the pool), which thus reaches 1 if a winning card exists.
I will be using MySQL and Java. Which of these 2 techniques is the faster approach? I also consider memory space, load, etc., but the speed of the tracking is more important to me. Thanks a lot!
P.S. Sorry for the long explanation. I just want to clarify things. :D
If you want to be really fast, just keep your 24 million cards in memory while they are needed and do a simple comparison. Using the database for this is overkill and just makes everything more difficult. RAM is not expensive anymore.
There are exactly 70^5=1,680,700,000 possible cards. There is no need to store the cards themselves. You can calculate the numbers on a card directly from its index alone. The other way around (given the numbers, find the indices of the cards) is just a little bit harder.
For example, take card #1421934546. Writing this in base 70 gives: 59 15 40 50 46. (I mean 46 + 70*50 + 70^2*40 + 70^3*15 + 70^4*59 = 1421934546.) So, the first column is the 46th (actually the 47th, since indices start at 0) of the 70 possible takes.
Given the drawn numbers, you can quickly find the columns that match. For example, with the numbers 1, 2 and 3, there are 5 sets in the first column that match: 1234, 1235, 1236, 1237 and 1238. So, for all matching cards, the index % 70 is one of those 5 ids. If you find all the possible sets for each column, the Cartesian product will give all matching cards.
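A sketch of the index-to-card direction. The base-70 decoding follows the example above; the ordering of the 70 sets within a column is an assumption made here (lexicographic, so that sets 0 to 4 of the first column are 1234 through 1238), and all names are made up:

```java
import java.util.ArrayList;
import java.util.List;

class BingoCardIndex {
    static final int SETS_PER_COLUMN = 70; // 8 choose 4

    // All 4-element subsets of {0..7} in lexicographic order (assumed ordering).
    static List<int[]> columnSets() {
        List<int[]> sets = new ArrayList<>();
        for (int a = 0; a < 8; a++)
            for (int b = a + 1; b < 8; b++)
                for (int c = b + 1; c < 8; c++)
                    for (int d = c + 1; d < 8; d++)
                        sets.add(new int[] { a, b, c, d });
        return sets;
    }

    // Base-70 digits of the card index, least significant digit (first column) first.
    static int[] setIndices(long cardIndex) {
        int[] digits = new int[5];
        for (int col = 0; col < 5; col++) {
            digits[col] = (int) (cardIndex % SETS_PER_COLUMN);
            cardIndex /= SETS_PER_COLUMN;
        }
        return digits;
    }

    // The 5 x 4 card: column col holds numbers from 8*col+1 to 8*col+8.
    static int[][] card(long cardIndex) {
        List<int[]> sets = columnSets();
        int[] digits = setIndices(cardIndex);
        int[][] card = new int[5][4];
        for (int col = 0; col < 5; col++) {
            int[] set = sets.get(digits[col]);
            for (int k = 0; k < 4; k++) {
                card[col][k] = set[k] + 1 + 8 * col;
            }
        }
        return card;
    }
}
```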
You don't have to arrange the data in memory the same as you have on the card. e.g. if you have N squares which can be either selected or not selected, a BitSet may be a good choice. This uses 1 bit per square (with some overhead).
Say you have up to 64 squares: this fits in one long value (64 bits). If you have 1 million cards, this will take up 8 MB of memory. Once you determine which card(s) are winners, you can determine who the owners are. (This could be stored in a database.)
Say you sell a card to every adult in the US. (AFAIK, no lottery has ever been this popular.) At, say, one dollar each, you would be bringing in 200 million dollars. You would need 1.6 GB of memory, which would fit easily into a 4 GB server costing about $500. You could buy a 16 GB server for about $1000 just to be sure. ;)
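A minimal sketch of the one-long-per-card idea, assuming bit k of the long stands for the number k (1-40) and that a card wins once all of its numbers have been drawn. The names BingoBitmask, maskOf and isWinner are hypothetical:

```java
class BingoBitmask {
    // Packs a set of numbers (each between 0 and 63) into one long, one bit per number.
    static long maskOf(int... numbers) {
        long mask = 0L;
        for (int number : numbers) {
            mask |= 1L << number;
        }
        return mask;
    }

    // A card wins when none of its bits is missing from the drawn set.
    static boolean isWinner(long card, long drawn) {
        return (card & ~drawn) == 0L;
    }

    // Counts the winners in a pool of cards with one bitwise test per card.
    static int countWinners(long[] cards, long drawn) {
        int winners = 0;
        for (long card : cards) {
            if (isWinner(card, drawn)) winners++;
        }
        return winners;
    }
}
```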
I have read about the selection algorithm and I have a question that may look silly: why do we divide the array into groups of 5 elements? Could we use groups of 7 or 3 instead? Also, is there any link that would help me understand this better? Thanks.
Also, this is my proof that with groups of 3 elements the algorithm is still of order n. Is this correct?
T(n) <= T(n/3) + T(n/3) + theta(n)
Claim: T(n) <= cn
Proof: assume that for all k < n: T(k) <= ck. Then
T(n) <= (nc/3) + (nc/3) + theta(n)
T(n) <= (2nc/3) + theta(n)
T(n) <= cn - (cn/3 - theta(n)), and this is at most cn when cn/3 >= theta(n), i.e. when c is at least 3 times the constant hidden in theta(n). So under this condition the algorithm is of order n, too!
A little bit of googling and I found this. There is a very small section on why 5, but it doesn't really answer your question specifically, other than to say that it is the smallest possible odd number that can be used (it must be odd to give a median). There is some mathematical proof that it can't be 3 (but I don't really understand it myself). I think it is basically saying it can be any odd number, 5 or greater, but the smaller the better, I guess because it is quicker to find the median of a smaller group?
I think you made a mistake in T(n). It should be T(n)=T(n/3)+T(2n/3)+O(n).
The T(n/3) is for finding the pivot (the median of medians). Only half of the n/3 groups have a median smaller than the pivot. Each of those groups has 2 elements smaller than the pivot, giving 2*(1/2 * n/3) == n/3 elements smaller than the pivot. Thus only 33% are guaranteed to be smaller than the pivot (and 33% are guaranteed to be greater than the pivot). So, in the worst case you still have 66% left for the next iteration, which is the T(2n/3).
I can't read your proof well, but with this recurrence it is no longer possible to prove the linear bound, because n/3 + 2n/3 adds up to the full n. Right?
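To spell that out, here is the substitution step for both group sizes, assuming the inductive hypothesis T(k) <= ck for all k < n and writing a for the constant hidden in O(n) (a symbol introduced here). The groups-of-5 recurrence T(n) <= T(n/5) + T(7n/10) + O(n) is the usual textbook one and does not come from this thread:

```latex
\begin{align*}
\text{Groups of 3:}\quad T(n) &\le T(n/3) + T(2n/3) + an \\
  &\le \tfrac{cn}{3} + \tfrac{2cn}{3} + an = cn + an > cn
  \quad\text{(the induction does not close)} \\[4pt]
\text{Groups of 5:}\quad T(n) &\le T(n/5) + T(7n/10) + an \\
  &\le \tfrac{cn}{5} + \tfrac{7cn}{10} + an = \tfrac{9cn}{10} + an \le cn
  \quad\text{whenever } c \ge 10a
\end{align*}
```

With groups of 5 the two recursive fractions sum to 9/10 < 1, which leaves room to absorb the linear term; with groups of 3 they sum to exactly 1, so there is no slack.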