Quickly find matrix in another matrix - java

If there is a matrix A[][] of order m and another matrix B[][] of order n such that m > n, you have to find the occurrence of matrix B[][] in matrix A[][].
A[5][5]=
1,2,3,4,5
5,4,1,9,7
2,1,7,3,4
6,4,8,2,7
0,2,4,5,8
B[3][3]=
1,9,7
7,3,4
8,2,7
This matrix B exists in A. I can do it with a sliding-window algorithm with time complexity O(p^2*n^2), where p = m-n+1, but I want to do this with minimum time complexity.

You can use the Boyer-Moore string search for problems like this:
Compare right to left. In the first row, you compare 3 with 7. 3 doesn't appear in the first row of B, so you can move your window to the right by 3 elements. When you start the loop again, the window doesn't fit into the remains of A's first row. This means you could process the first row with 2 compares.
In the next row, you compare 1 with 7. 1 appears in B, so you move your window just enough that the 1 in B is over the 1 in A.
The next level would then be to start comparing with the lower right corner of B. That would compare 7 with 7. Since 7 appears three times in B, you have to figure out how to move the window efficiently using similar rules as Boyer-Moore.
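A minimal sketch of the row-wise idea above, in Java. It is not the full two-dimensional Boyer-Moore the answer hints at: it only applies a Horspool-style bad-character shift to the first row of B and then verifies the remaining rows directly. The class and method names (MatrixSearch, find, restMatches) are made up for illustration, and B is assumed to be a non-empty square matrix.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MatrixSearch {
    /** Returns {row, col} of the first occurrence of b in a, or null if there is none. */
    static int[] find(int[][] a, int[][] b) {
        int n = b.length;
        // Bad-character table for the first row of b (last column excluded):
        // distance from the rightmost occurrence of a value to the end of the row.
        Map<Integer, Integer> shift = new HashMap<>();
        for (int j = 0; j < n - 1; j++) {
            shift.put(b[0][j], n - 1 - j);
        }
        for (int r = 0; r + n <= a.length; r++) {
            int c = 0;
            while (c + n <= a[r].length) {
                // Compare the window with the first row of b, right to left.
                int j = n - 1;
                while (j >= 0 && a[r][c + j] == b[0][j]) {
                    j--;
                }
                if (j < 0 && restMatches(a, b, r, c)) {
                    return new int[] {r, c};
                }
                // Shift by the table entry of the value under the window's last column;
                // values that do not occur in the table allow a full jump of n.
                c += shift.getOrDefault(a[r][c + n - 1], n);
            }
        }
        return null;
    }

    /** Checks rows 1..n-1 of b against a, with the window's top-left corner at (r, c). */
    static boolean restMatches(int[][] a, int[][] b, int r, int c) {
        for (int i = 1; i < b.length; i++) {
            for (int j = 0; j < b[i].length; j++) {
                if (a[r + i][c + j] != b[i][j]) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3, 4, 5}, {5, 4, 1, 9, 7}, {2, 1, 7, 3, 4}, {6, 4, 8, 2, 7}, {0, 2, 4, 5, 8}};
        int[][] b = {{1, 9, 7}, {7, 3, 4}, {8, 2, 7}};
        System.out.println(Arrays.toString(find(a, b))); // prints [1, 2]
    }
}
```

The worst case is no better than the sliding window, but on inputs like the example many window positions are skipped without being compared.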

Here is an abstract that I found on Stack Overflow previously. I hope it gives you an idea of the possible algorithms and approaches you could use: "An algorithm for searching for a two dimensional m x m pattern in a two dimensional n x n text is presented."

Related

Algorithm to slice into square matrix a matrix

I'm searching for an algorithm that takes a matrix (in fact, a two-dimensional array) and returns an array of matrices, each of which:
is square (WIDTH = HEIGHT)
has all of its elements equal to the same value.
I don't know if that is clear, so imagine that you have an image made of pixels that are red, blue or green, and I want to get an array that contains the fewest possible squares, as the picture shows.
EDIT:
OK, maybe it's not clear: I have a grid of elements that can have values like this:
0011121
0111122
2211122
0010221
0012221
That was my input, and I want output something like this:
|0|0|111|2|1|
|0|1|111|22|
|2|2|111|22|
|00|1|0|22|1|
|00|1|2|22|1|
where each |X| is an array that is a piece of the input array.
My goal is to minimize the number of output arrays.
This problem does not seem to have an efficient solution.
Consider a subset of instances of your problem defined as follows:
There are only 2 values of matrix elements, say 0 and 1.
Consider only matrix elements with value 0.
Identify each matrix element m_ij with a unit square in a rectangular 2D grid whose lower left corner has the coordinates (i, n-j).
The set of unit squares SU chosen this way must be 'connected' and must not have 'holes'; formally, for each pair of unit squares (m_ij, m_kl) \in SU^2: (i, j) != (k, l) there is a sequence <m_ij = m_i(0)j(0), m_i(1)j(1), ..., m_i(q)j(q) = m_kl> of q+1 unit squares such that (|i(r)-i(r+1)| = 1 _and_ j(r)=j(r+1)) _or_ (i(r)=i(r+1) _and_ |j(r)-j(r+1)| = 1 ); r=0...q-1 (unit squares adjacent in the sequence share one side), and the set SUALL of all unit squares with lower left corner coordinates from the integers minus SU is also 'connected'.
Slicing matrices that admit this construction into a minimal number of square submatrices is equivalent to tiling the smallest orthogonal polygon enclosing SU (which is the union of all elements of SU) into the minimum number of squares.
This SE.CS post gives the references (and one proof) that show that this problem is NP-complete for integer side lengths of the squares of the tiling set.
Note that according to the same post, a tiling into rectangles runs in polynomial time.
Some hints may be useful.
For representing the reduced matrix, a vector may be better, because each piece needs to be stored as (start_x, start_y, value, ...); I'm not sure another matrix would be very useful.
Step 1: loop on x for n occurrences of the same value (start with y=0).
Step 2: loop on y for up to n occurrences. In most cases the count m found here will be less than n.
(The case of m greater than n is excluded, since that cannot form a square.) Then just keep the minimum value, m.
Step 3: record (start_x, start_y, value) in the vector.
Repeat steps 1-3 from x=m until the end of x.
Step 4: at the end of x, adjust y, starting from the leftmost x found (the m stored in the vector; iterate over the vector again).
...
Keep going until the end of the matrix.
Be very careful about how the boundaries of the squares are made, so that the result fully covers the initial matrix.
In other words, the full initial matrix must be recomposable exactly from the result vector.
(You need to find the gaps and place them in the vector derived from step 4.)
Note: this is not a full solution; it is perhaps a way to start and to figure out at each step what needs to be adjusted.
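One way to make the hints above concrete is a plain greedy scan, sketched here in Java. This is a heuristic under my own assumptions, not the procedure from the hints verbatim, and it does not guarantee the minimum number of squares (the problem is NP-complete, as noted above). It visits cells row by row and grows the largest uniform square that fits at each uncovered cell. The names GreedySquares, slice and canGrow are hypothetical; each result entry is {row, col, size, value}.

```java
import java.util.ArrayList;
import java.util.List;

class GreedySquares {
    /** Covers the grid with uniform squares; each entry is {row, col, size, value}. */
    static List<int[]> slice(int[][] g) {
        int rows = g.length, cols = g[0].length;
        boolean[][] used = new boolean[rows][cols];
        List<int[]> out = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (used[r][c]) {
                    continue;
                }
                // Grow the square anchored at (r, c) while it stays uniform and uncovered.
                int size = 1;
                while (canGrow(g, used, r, c, size)) {
                    size++;
                }
                for (int i = r; i < r + size; i++) {
                    for (int j = c; j < c + size; j++) {
                        used[i][j] = true;
                    }
                }
                out.add(new int[] {r, c, size, g[r][c]});
            }
        }
        return out;
    }

    /** True if the square at (r, c) can be enlarged from size to size + 1. */
    static boolean canGrow(int[][] g, boolean[][] used, int r, int c, int size) {
        if (r + size >= g.length || c + size >= g[0].length) {
            return false;
        }
        int v = g[r][c];
        for (int k = 0; k <= size; k++) {
            // Check the new bottom row and the new right column.
            if (g[r + size][c + k] != v || used[r + size][c + k]) {
                return false;
            }
            if (g[r + k][c + size] != v || used[r + k][c + size]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] g = {
            {0, 0, 1, 1, 1, 2, 1},
            {0, 1, 1, 1, 1, 2, 2},
            {2, 2, 1, 1, 1, 2, 2},
            {0, 0, 1, 0, 2, 2, 1},
            {0, 0, 1, 2, 2, 2, 1}};
        for (int[] s : slice(g)) {
            System.out.println("row=" + s[0] + " col=" + s[1] + " size=" + s[2] + " value=" + s[3]);
        }
    }
}
```

Because every cell ends up in exactly one square, the original grid can be rebuilt from the result list, which is the requirement the hints stress.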

Does QR/SVD decomposition in ojAlgo require as many rows as columns?

When doing QR or SVD decomposition on an m x n matrix A in ojalgo, I've hit a snag. My purpose is to find a basis for the column null space. If m >= n, things work fine. For instance, QR decomposition of the transpose A' of a 5 x 4 matrix A with rank 2 gives me a 4 x 4 Q matrix whose last two columns span the null space of A.
On the other hand, if I start with a 5 x 7 matrix A with rank 5 (and do a QR decomposition of A'), I get the correct rank, but Q is 5 x 5 rather than 7 x 7, and I don't get the null space basis. Similarly, SVD with that same matrix A gets me five positive singular values (no zeros), and the Q2 matrix is 5 x 7 rather than 7 x 7 (no null vectors).
Is this expected behavior? I found a work-around for matrices with n > m (adding n-m rows of zeros to A), but it's clunky.
The matrices can be any size/shape, but calculating the economy sized decomposition is the default behaviour. It is what most users need/want. But there is an interface MatrixDecomposition.EconomySize that lets you control this (to optionally get the full size decomposition). Currently the QR, SVD and Bidiagonal decompositions implement it.

8 puzzle: Solvability and shortest solution

I have built an 8 puzzle solver using Breadth First Search. I would now like to modify the code to use heuristics. I would be grateful if someone could answer the following two questions:
Solvability
How do we decide whether an 8 puzzle is solvable (given a starting state and a goal state)?
This is what Wikipedia says:
The invariant is the parity of the permutation of all 16 squares plus
the parity of the taxicab distance (number of rows plus number of
columns) of the empty square from the lower right corner.
Unfortunately, I couldn't understand what that meant. It was a bit complicated to understand. Can someone explain it in a simpler language?
Shortest Solution
Given a heuristic, is it guaranteed to give the shortest solution using the A* algorithm? To be more specific, will the first node in the open list always have a depth (or the number of movements made so far) which is the minimum of the depths of all the nodes present in the open list?
Should the heuristic satisfy some condition for the above statement to be true?
Edit : How is it that an admissible heuristic will always provide the optimal solution? And how do we test whether a heuristic is admissible?
I would be using the heuristics listed here
Manhattan Distance
Linear Conflict
Pattern Database
Misplaced Tiles
Nilsson's Sequence Score
N-MaxSwap X-Y
Tiles out of row and column
For clarification from Eyal Schneider :
I'll refer only to the solvability issue. Some background in permutations is needed.
A permutation is a reordering of an ordered set. For example, 2134 is a reordering of the list 1234, where 1 and 2 swap places. A permutation has a parity property; it refers to the parity of the number of inversions. For example, in the following permutation you can see that exactly 3 inversions exist (23,24,34):
1234
1432
That means that the permutation has an odd parity. The following permutation has an even parity (12, 34):
1234
2143
Naturally, the identity permutation (which keeps the items order) has an even parity.
Any state in the 15 puzzle (or 8 puzzle) can be regarded as a permutation of the final state, if we look at it as a concatenation of the rows, starting from the first row. Note that every legal move changes the parity of the permutation (because we swap two elements, and the number of inversions involving items in between them must be even). Therefore, if you know that the empty square has to travel an even number of steps to reach its final state, then the permutation must also be even. Otherwise, you'll end with an odd permutation of the final state, which is necessarily different from it. Same with odd number of steps for the empty square.
According to the Wikipedia link you provided, the criterion above is sufficient and necessary for a given puzzle to be solvable.
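The parity argument above can be checked mechanically. Below is a small Java sketch, assuming states are row-major arrays with tiles numbered 1..n-1 and 0 for the empty square; the names PuzzleSolvability and isSolvable are mine. It counts the inversions of the start state viewed as a permutation of the goal state and compares that parity with the parity of the taxicab distance the empty square has to travel.

```java
class PuzzleSolvability {
    /**
     * start and goal are row-major boards with 0 for the empty square;
     * width is the board width (3 for the 8 puzzle, 4 for the 15 puzzle).
     */
    static boolean isSolvable(int[] start, int[] goal, int width) {
        int n = start.length;
        int[] goalIndex = new int[n];
        for (int i = 0; i < n; i++) {
            goalIndex[goal[i]] = i;
        }
        // Express the start state as a permutation of goal positions.
        int[] perm = new int[n];
        int blankStart = 0;
        for (int i = 0; i < n; i++) {
            perm[i] = goalIndex[start[i]];
            if (start[i] == 0) {
                blankStart = i;
            }
        }
        // Count inversions: pairs that appear in the opposite order to the goal.
        int inversions = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (perm[i] > perm[j]) {
                    inversions++;
                }
            }
        }
        // Taxicab distance of the empty square from its goal position.
        int blankGoal = goalIndex[0];
        int distance = Math.abs(blankStart / width - blankGoal / width)
                + Math.abs(blankStart % width - blankGoal % width);
        // Every legal move flips both parities, so they must agree.
        return (inversions + distance) % 2 == 0;
    }

    public static void main(String[] args) {
        int[] goal = {1, 2, 3, 4, 5, 6, 7, 8, 0};
        System.out.println(isSolvable(new int[] {1, 2, 3, 4, 5, 6, 7, 0, 8}, goal, 3)); // prints true
        System.out.println(isSolvable(new int[] {1, 2, 3, 4, 5, 6, 8, 7, 0}, goal, 3)); // prints false
    }
}
```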
The A* algorithm is guaranteed to find a shortest solution (one of them, if several are equally short) if your heuristic never overestimates the real cost (in your case, the real number of moves needed to reach the solution).
But off the top of my head I cannot come up with a good heuristic for your problem. Finding one needs some thinking.
The real art of using A* is to find a heuristic that never overestimates the real cost, but underestimates it by as little as possible, to speed up the search.
First ideas for such a heuristic:
A quite bad but valid heuristic that popped into my mind is the Manhattan distance of the empty field to its final destination.
The sum of the Manhattan distances of each field to its final destination, divided by the maximal number of fields that can change position within one move. (I think this is quite a good heuristic.)
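To illustrate how an underestimating heuristic plugs into A*, here is a compact Java sketch for the 8 puzzle. It uses the Manhattan Distance heuristic from the question's list (sum over the tiles, empty square excluded), which never overestimates because each move shifts exactly one tile by one cell. The goal layout "123456780", the string encoding of states and all names (EightPuzzleAStar, Node, manhattan, solve) are assumptions made for this example.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class EightPuzzleAStar {
    static final String GOAL = "123456780"; // assumed goal layout, '0' is the empty square

    static final class Node {
        final String state;
        final int g; // moves made so far
        final int f; // g plus heuristic estimate
        Node(String state, int g, int f) {
            this.state = state;
            this.g = g;
            this.f = f;
        }
    }

    /** Sum of the Manhattan distances of all tiles (empty square excluded) to their goal cells. */
    static int manhattan(String s) {
        int sum = 0;
        for (int i = 0; i < 9; i++) {
            char t = s.charAt(i);
            if (t == '0') {
                continue;
            }
            int goalPos = GOAL.indexOf(t);
            sum += Math.abs(i / 3 - goalPos / 3) + Math.abs(i % 3 - goalPos % 3);
        }
        return sum;
    }

    /** Returns the length of a shortest solution, or -1 if the goal cannot be reached. */
    static int solve(String start) {
        Map<String, Integer> best = new HashMap<>(); // cheapest known g per state
        PriorityQueue<Node> open = new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.f));
        best.put(start, 0);
        open.add(new Node(start, 0, manhattan(start)));
        int[] dr = {-1, 1, 0, 0};
        int[] dc = {0, 0, -1, 1};
        while (!open.isEmpty()) {
            Node cur = open.poll();
            if (cur.g > best.get(cur.state)) {
                continue; // stale entry, a cheaper path to this state was found later
            }
            if (cur.state.equals(GOAL)) {
                return cur.g;
            }
            int z = cur.state.indexOf('0');
            for (int k = 0; k < 4; k++) {
                int r = z / 3 + dr[k], c = z % 3 + dc[k];
                if (r < 0 || r > 2 || c < 0 || c > 2) {
                    continue;
                }
                // Slide the neighbouring tile into the empty square.
                char[] a = cur.state.toCharArray();
                a[z] = a[r * 3 + c];
                a[r * 3 + c] = '0';
                String next = new String(a);
                int g = cur.g + 1;
                if (g < best.getOrDefault(next, Integer.MAX_VALUE)) {
                    best.put(next, g);
                    open.add(new Node(next, g, g + manhattan(next)));
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(solve("120453786")); // prints 2
    }
}
```

Note that the open list is ordered by f = g + h, not by depth alone, so the first node in the list is not necessarily the shallowest one.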
For anyone coming along, I will attempt to explain how the OP got the value pairs, as well as how he determines the highlighted ones, i.e. the inversions, as it took me several hours to figure it out. First the pairs.
First take the goal state and imagine it as a 1D array (A for example)
[1,2,3,8,0,4,7,5]. Each value in that array has its own column in the table (going all the way down, which is the first value of the pair).
Then move over 1 value to the right in the array (i + 1) and go all the way down again; this is the second value of the pair. For example (state A): the first column's second values will be [2,3,8,0,4,7,5] going down, the second column's will be [3,8,0,4,7,5], etc.
Okay, now for the inversions. For each of the 2 values in a pair, find its INDEX location in the start state. If the left INDEX > right INDEX, then it's an inversion (highlighted). The first four pairs of state A are: (1,2),(1,3),(1,8),(1,0)
1 is at Index 3
2 is at Index 0
3 > 0 so inversion.
1 is 3
3 is 2
3 > 2 so inversion
1 is 3
8 is 1
3 > 1 so inversion
1 is 3
0 is 7
3 < 7 so No inversion
Do this for each pairs and tally up the total inversions.
If both even or both odd (Manhattan distance of blank spot And total inversions)
then it's solvable. Hope this helps!

Best data structure to store and manipulate my data?

I am writing a simple Java program that will input a text file which will have some numbers representing a (n x n) matrix where numbers are separated by spaces. for ex:
1 2 3 4
5 6 7 8
9 1 2 3
4 5 6 7
I then want to store these numbers in a data structure that I will then use to manipulate the data (which will include comparing adjacent numbers and also deleting certain numbers based on specific rules).
If a number is deleted, all the other numbers above it fall down the amount of spaces.
For the example above, if say i delete 8 and 9, then the result would be:
() 2 3 ()
1 6 7 4
5 1 2 3
4 5 6 7
so the numbers fall down in their columns.
And lastly, the matrix given will always be square (so always n x n, where n will be always given and will always be positive), therefore, the data structure has to be flexible to virtually accept any n-value.
I was originally implementing it in a 2-d array, but I was wondering if someone had an idea of a better data structure that I could use in order to improve efficiency (something that will allow me to more quickly access all the adjacent numbers in the matrix, in rows and columns).
Ultimately, my program will automatically check adjacent numbers against the rules, delete numbers, re-format the matrix, and keep going, and in the end I want to be able to create an AI that will remove as many numbers from the matrix as possible in the fewest moves possible, for any n x n matrix.
In my opinion, since you know the length of your array when you start, you are better off using an array. A simple data type will be easier to navigate (direct access). Then again, using LinkedLists, you will be able to remove a middle value without having to rearrange the data inside your matrix. This will leave your "top" value as null. In your example:
null 2 3 null
1 6 7 4
5 1 2 3
4 5 6 7
Hope this helps.
You could use a one-dimensional array of size n*n.
int[] myMatrix = new int[n * n];
To access the element with coordinates (i,j), use myMatrix[i + j * n]. To make elements fall, use System.arraycopy to move lines.
Use a special value (e.g. Integer.MIN_VALUE) as a mark for the () hole.
I expect it would be the fastest and most memory-efficient solution.
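A rough Java sketch of this one-dimensional layout, assuming i is the row and j the column, so that each column occupies a contiguous run of the array and a single System.arraycopy call can shift it. The class FallingGrid and its methods are hypothetical names for illustration.

```java
class FallingGrid {
    static final int HOLE = Integer.MIN_VALUE; // marks an empty () cell
    final int n;
    final int[] cells; // cell (row, col) lives at row + col * n

    FallingGrid(int[][] rows) {
        n = rows.length;
        cells = new int[n * n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                cells[r + c * n] = rows[r][c];
            }
        }
    }

    int get(int row, int col) {
        return cells[row + col * n];
    }

    /** Deletes the cell at (row, col); everything above it in the column drops by one. */
    void delete(int row, int col) {
        int base = col * n;
        // Shift rows 0..row-1 of this column down by one position.
        // System.arraycopy handles the overlapping source and destination correctly.
        System.arraycopy(cells, base, cells, base + 1, row);
        cells[base] = HOLE;
    }

    public static void main(String[] args) {
        FallingGrid grid = new FallingGrid(new int[][] {
            {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 1, 2, 3}, {4, 5, 6, 7}});
        grid.delete(1, 3); // delete the 8
        grid.delete(2, 0); // delete the 9
        System.out.println(grid.get(1, 3)); // prints 4
        System.out.println(grid.get(1, 0)); // prints 1
    }
}
```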
Array access is pretty fast. Accessing adjacent elements is easy, as you just increment the relevant index(s) (being cognizant of boundaries). You could write methods to encapsulate those operations that are well tested. Having elements 'fall down' though might get complicated, but shouldn't be too bad if you modularize it out by writing well tested methods.
All that said, if you don't need the absolute best speed, there are other options.
You also might want to consider a modified circularly linked list. When implementing a sudoku solver, I used the structure outlined here. Looking at the image, you will see that this will allow you to modify your 2d array as you want, since all you need to do is move pointers around.
I'll post a screen shot of relevant picture describing the datastructure here, although I would appreciate it if someone will warn me if I am violating some sort of copy right or other rights of the author, in which case I'll take it down...
Try an array of LinkedLists.
If you want the numbers to auto-fall, I suggest you use a list for each of the columns.
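A possible shape for the list-per-column idea, sketched in Java. Each column is stored bottom to top, so removing an element makes everything above it fall automatically. I use ArrayList here for cheap indexed access; a LinkedList would work through the same List interface. ColumnGrid and its methods are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

class ColumnGrid {
    final int n;
    final List<List<Integer>> columns = new ArrayList<>(); // each column is stored bottom to top

    ColumnGrid(int[][] rows) {
        n = rows.length;
        for (int c = 0; c < n; c++) {
            List<Integer> column = new ArrayList<>();
            for (int r = n - 1; r >= 0; r--) {
                column.add(rows[r][c]);
            }
            columns.add(column);
        }
    }

    /** Returns the value at (row, col), or null if that cell is an empty hole. */
    Integer get(int row, int col) {
        int height = n - 1 - row; // distance from the bottom of the column
        List<Integer> column = columns.get(col);
        return height < column.size() ? column.get(height) : null;
    }

    /** Deletes the value at (row, col); the values above it fall by one automatically. */
    void delete(int row, int col) {
        int height = n - 1 - row;
        List<Integer> column = columns.get(col);
        if (height < column.size()) {
            column.remove(height); // int argument, so this removes by index
        }
    }

    public static void main(String[] args) {
        ColumnGrid grid = new ColumnGrid(new int[][] {
            {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 1, 2, 3}, {4, 5, 6, 7}});
        grid.delete(1, 3); // delete the 8
        grid.delete(2, 0); // delete the 9
        System.out.println(grid.get(0, 0)); // prints null
        System.out.println(grid.get(1, 0)); // prints 1
    }
}
```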

custom partition problem

Could someone guide me on how to solve this problem?
We are given a set S with k number of elements in it.
Now we have to divide the set S into x subsets such that the difference in number of elements in each subset is not more than 1 and the sum of each subset should be as close to each other as possible.
Example 1:
{10, 20, 90, 200, 100} has to be divided into 2 subsets
Solution:{10,200}{20,90,100}
sum is 210 and 210
Example 2:
{1, 1, 2, 1, 1, 1, 1, 1, 1, 6}
Solution:{1,1,1,1,6}{1,2,1,1,1}
Sum is 10 and 6.
I see a possible solution in two stages.
Stage One
Start by selecting the number of subsets, N.
Sort the original set, S, if possible.
Distribute the largest N numbers from S into subsets 1 to N in order.
Distribute the next N largest numbers from S into the subsets in reverse order, N to 1.
Repeat until all numbers are distributed.
If you can't sort S, then distribute each number from S into the subset (or one of the subsets) with the least entries and the smallest total.
You should now have N subsets all sized within 1 of each other and with very roughly similar totals.
Stage Two
Now try to refine the approximate solution you have.
Pick the largest total subset, L, and the smallest total subset, M. Pick a number in L that is larger than a number in M, but not by so much as to increase the absolute difference between the two subsets. Swap the two numbers. Repeat. Not all pairs of subsets will have swappable numbers. Each swap keeps the subsets the same size.
If you have a lot of time you can do a thorough search; if not then just try to pick off a few obvious cases. I would say don't swap numbers if there is no decrease in difference; otherwise you might get an infinite loop.
You could interleave the stages once there are at least two members in some subsets.
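The two stages can be sketched in Java roughly as follows. This is a heuristic under my own reading of the description and does not guarantee an optimal split. Stage one deals the sorted numbers out in alternating order; stage two swaps a larger number from the heaviest subset with a smaller one from the lightest, but only when the swap strictly shrinks the gap between those two subsets, so the loop terminates. BalancedPartition, distribute, refine and sum are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BalancedPartition {
    /** Stage one: deal the values, largest first, into n subsets in alternating order. */
    static List<List<Integer>> distribute(int[] values, int n) {
        int[] sorted = values.clone();
        Arrays.sort(sorted); // ascending, consumed from the end
        List<List<Integer>> subsets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            subsets.add(new ArrayList<>());
        }
        for (int idx = 0; idx < sorted.length; idx++) {
            int round = idx / n, pos = idx % n;
            int target = (round % 2 == 0) ? pos : n - 1 - pos; // 1..N, then N..1
            subsets.get(target).add(sorted[sorted.length - 1 - idx]);
        }
        return subsets;
    }

    static int sum(List<Integer> subset) {
        int total = 0;
        for (int v : subset) {
            total += v;
        }
        return total;
    }

    /** Stage two: swap between the heaviest and lightest subsets while that narrows their gap. */
    static void refine(List<List<Integer>> subsets) {
        while (true) {
            List<Integer> hi = subsets.get(0), lo = subsets.get(0);
            for (List<Integer> s : subsets) {
                if (sum(s) > sum(hi)) hi = s;
                if (sum(s) < sum(lo)) lo = s;
            }
            int gap = sum(hi) - sum(lo);
            int bestI = -1, bestJ = -1, bestGap = gap;
            for (int i = 0; i < hi.size(); i++) {
                for (int j = 0; j < lo.size(); j++) {
                    int delta = hi.get(i) - lo.get(j);
                    // Swapping changes the gap from gap to |gap - 2 * delta|.
                    if (delta > 0 && Math.abs(gap - 2 * delta) < bestGap) {
                        bestGap = Math.abs(gap - 2 * delta);
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            if (bestI < 0) {
                return; // no swap improves the gap any further
            }
            int x = hi.get(bestI);
            hi.set(bestI, lo.get(bestJ));
            lo.set(bestJ, x);
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> p = distribute(new int[] {1, 1, 2, 1, 1, 1, 1, 1, 1, 6}, 2);
        refine(p);
        System.out.println(p); // prints [[6, 1, 1, 1, 1], [2, 1, 1, 1, 1]]
    }
}
```

On Example 2 from the question this reproduces the sums 10 and 6; on Example 1 it stops at 230 and 190 rather than the optimal 210 and 210, which shows the limits of such a local search.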
There is no easy algorithm for this problem.
Check out the partition problem, also known as the easiest hard problem, which is this problem for 2 sets. That problem is NP-complete, and you should be able to find algorithms for it on the web.
I know your problem is a bit different since you can choose the number of sets, but you can take inspiration from solutions to the previous one.
For example :
You can transform this into a series of integer linear programs. Let k be the number of elements in your set.
{a1 ... ak} is your set
For i = 2 to k:
try to solve the following program:
x_jl = 1 if element j of the set is in subset number l (l <= i), and 0 otherwise
minimise max(Abs(sum over p of a_p*x_pn - sum over p of a_p*x_pm)) for all m, n // you minimise the max of the difference between 2 subsets.
s.t.
sum over n of x_pn = 1 for all p // each element is in exactly one subset.
(sum over p of x_pn) - (sum over p of x_pm) <= 1 for all m, n // the sizes of 2 subsets differ by at most one element.
x_pn in {0,1}
if you find a min less than one then stop
otherwise continue
end for
Hope my notations are clear.
The complexity of this program is exponential, and if you found a polynomial way to solve this you would prove P=NP, so I don't think you can do better.
EDIT
I saw your comment; I misunderstood the constraint on the size of the subsets (I thought it was the difference between 2 sets).
I don't have time to update it now; I will do it when I have time.
Sorry.
EDIT 2
I edited the linear program and it should do what it's asked to do. I just added a constraint.
Hope this time the problem is fully understood, even though this solution might not be optimal
I'm no scientist, so I'd try two approaches:
After sorting items, then going from both "ends" and moving first and last to the actual set,then shift to next set, loop;
Then:
Checking the differences of sums of the sets, and shuffling items if it would help.
Coding the resulting sets appropriately and trying genetic algorithms.
