EDIT (31-12-2019) - https://jonathan.overholt.org/projects/cutlist
Above is the link of the free project which is what exactly I am looking for. I am just looking for proper guidance so that I can make it work.
I am working on minimizing the wastage of aluminium extrusion cutting for Aluminium Sliding Window Fabricators and I am not able to figure out which Algorithm/Data Structure I should go with for the problem.
I have done basic research and found that the problem falls in Cutting Stock Problem (Also called One dimensional cutting problem), Linear Programming Problem, Greedy Algorithm. BUT I am unable to decide which one I should go with and also how to start with.
Brief about the problem :
Basically the window fabricators have 3 sizes of material options to purchase.
12 | 15 | 16 (IN FT)
Now the input would be the size of Window.
WIDTH x HEIGHT (IN FT)
1) 6 x 8 - 10 Windows
2) 9 x 3 - 20 Windows
The calculation is (2 x Width) + (2 x Height). So from the above sizes of window, they need extrusion as follow.
1) 6' (FT) Size Pieces -> 2 x 10 = 20
2) 8' (FT) Size Pieces -> 2 x 10 = 20
3) 9' (FT) Size Pieces -> 2 x 20 = 40
4) 3' (FT) Size Pieces -> 2 x 20 = 40
Using knapsack, we can find out the combination but it has restriction of having size to 1 only whereas here in my case I have 3 different sizes available out of which I would like to generate the best optimum combination for the cutting stock problem.
I would need help in how should I proceed for the above problem with respect to data structure and algorithm in Java or any other language. My knowledge in Maths is not up to the mark and that's why I am facing issue in implementing the logic in my project and like to get some help from the community.
There is no algorithm that will guarantee you the optimal solution other than brute-force checking all possible combinations. That's obviously not a good algorithm, at least not if you have large data sets.
You should have a look at heuristic search algorithms like Simulated Annealing, MCTS or the like. It should be no problem finding existing implementations of heuristic search engines that allow you to provide them with
an input set (random distribution of pieces on material),
a transition function (move pieces between materials),
and an evaluation function (the amount of waste)
and compute a near-optimal solution for you.
I'll echo the sentiment of the other answers: while there might be a "most correct" solution to this problem, what you're really looking for is a "correct enough" solution.
That said, I wrote up this little appendix to help make sense of the code in the project you referenced: cutlist generator design notes
Paraphrased:
Each iteration starts by creating a new instance of the longest stock and placing as many pieces into it as will fit. The stock is then reduced to the smallest stock that all of the selected pieces still fit into. This is all repeated until no pieces remain.
Another word of advice: make sure to clearly identify all of the assumptions you're building into the algorithm. Mine assumes that longer stock is cheaper per unit and that it's always preferred, but I've been asked for variations to optimize for the least number of cuts (bundle cutting) or to keep track of and prefer to use offcuts from previous runs first.
As always: Understand your customer's process very clearly before setting the assumptions.
Have you considered the simplex algorithm?
You have a minimization problem which can either be transformed into a maximization problem and then be solved by the algorithm or be solved by the dual simplex algorithm.
You'll find implementations on Google.
Related
I just recently did a sudoku solver in java using backtracking.
Is it possible, given the solutions, to formulate a problem or puzzle?
EDIT
to formulate the original puzzle, is there some way to achieve this?
and an additional question,
given the puzzle and the solutions.
If I am able to solve the puzzle using the solutions (result is puzzles)
and at the same time able to solve the solutions using the puzzle (result is solutions)
Which has the greater number?
the puzzles? or the solutions?
It is possible to formulate one of the multiple possible original states.
Start with the final solution (all numbers are present)
Remove one number (chosen randomly or not)
Check if this number can be found back given the current state of the board (you already have a solver, it should be easy)
If this number can be calculated, everything is OK. Go back to 2.
If this number cannot be found back, put it back where it was. Go back to 2.
If no more numbers can be removed, you have reached one of the original states of the puzzle.
If you chose the numbers you remove randomly (step 2), you can execute this several times, and get different starting points that lead to the same final puzzle.
Creating a puzzle from a solution is very simple. You just apply the solving steps backwards.
For example, if a line contains 8 digits, you can fill in the 9th. If you do that backwards, if a line contains 9 digits, you can remove one. This will yield a very boring puzzle, but still a valid one (a valid puzzle is a puzzle with only one solution).
The more complicated the steps you make are, the more difficult the puzzle will be. In fact, brute forcing the puzzle is the most difficult strategy, executing it backwards probably boils down to randomly removing a digit, and brute force check if there is still only 1 unique solution. Notice you do not have to solve the entire puzzle: It's enough to prove there is only 1 way to add the removed digit back into the puzzle.
As for the second part of your question: This feels a bit like a mathematical question, but let me answer:
A good puzzle only has 1 solution. Since there are multiple puzzles that yield the same solution (like, there are 81 ways to fill in 80 of the 81 squares, all yielding the same solution from a different puzzle), you could say there are many more puzzles than solutions.
If you also allow puzzles with multiple solutions, it changes. For every puzzle, there must be one or more solutions, but all those solutions also belong to that puzzle, so the number of solutions to puzzles is equal to the number of puzzles to solutions. Invalid puzzles do not change this: Since they belong to 0 solutions, you need no additional puzzles belonging to those solutions.
Ps. It's also trivial to create puzzles if they do not need to be uniquely solvable: Just randomly remove some digits and you're done.
If I am able to solve the puzzle using the solutions (result is puzzles)
and at the same time able to solve the solutions using the puzzle (result is solutions)
Which has the greater number?
the puzzles? or the solutions?
There is no well-defined answer.
Each solution has exactly 281 corresponding puzzles, including the trivial one. Not all of these have a unique solution, but many do. Therefore, if the chosen set of solutions contains only one element, then the maximal set of puzzles that jointly correspond to the solution set is vastly larger.
On the other hand, the completely blank puzzle affords 6670903752021072936960 solutions. There are many pairs of those that share only the blank grid as a common puzzle. Thus, if the chosen set of puzzles includes only the blank grid then the set of corresponding solutions is vastly larger.
I was searching the last few days for a stable implementation of the R-Tree with support of unlimited dimensions (20 or so would be enough). I only found this http://sourceforge.net/projects/jsi/ but they only support 2 dimensions.
Another Option would be a multidimensional implementation of an interval-tree.
Maybe I'm completly wrong with the idea of using an R-Tree or Intervall-Tree for my Problem so i state the Problem in short, that you can send me your thoughts about this.
The Problem I need to solve is some kind of nearest-neighbour search. I have a set of Antennas and rooms and for each antenna an interval of Integers. E.g. antenna 1, min -92, max -85. In fact it could be represented as room -> set of antennas -> interval for antenna.
The idea was that each room spans a box in the R-Tree over the dimension of the antennas and in each dimension by the interval.
If I get a query with N-Antennas and values for each antenna I then could just represent the Information as a query point in the room and retrieve the rooms "nearest" to the point.
Hope you got an Idea of the problem and my idea.
Be aware that R-Trees can degrade badly when you have discrete data. The first thing you really need to find out is an appropriate data representation, then test if your queries work on a subset of the data.
R-Trees will only make your queries faster. If they don't work in the first place, it will not help. You should test your approach without using R-Trees first. Unless you hit a large amount of data (say, 100.000 objects), a linear scan in-memory can easily outperform an R-Tree, in particular when you need some adapter layer because it is not well-intergrated with your code.
The obvious approach here is to just use bounding rectangles, and linearly scan over them. If they work, you can then store the MBRs in an R-Tree to get some performance improvements. But if it doesn't work with a linear scan, it won't work with an R-Tree either (it will not work faster.)
I'm not entirely clear on what your exact problem is, but an R-Tree or interval tree would not work well in 20 dimensions. That's not a huge number of dimensions, but it is large enough for the curse of dimensionality to begin showing up.
To see what I mean, consider just trying to look at all of the neighbors of a box, including ones off of corners and edges. With 20 dimensions, you'll have 320 - 1 or 3,486,784,400 neighboring boxes. (You get that by realizing that along each axis a neighbor can be -1 unit, 0 unit, or +1 unit, but (0,0,0) is not a neighbor because it represents the original box.)
I'm sorry, but you either need to accept brute force searching, or else analyze your problem better and come up with a cleverer solution.
I have found this R*-Tree implementation in Java which seems to offer many features:
https://github.com/davidmoten/rtree
You might want to check it out!
Another good implementation in Java is ELKI: https://elki-project.github.io/.
You can use PostgreSQL’s Generalized Search Tree indexing facility.
GiST
Quick demo
The desire to learn more about GA has sparkled again, and instead of reading a lot and doing nothing, I've decided to start the other way around: pick a problem and try to solve it.
I've picked the Magic Square problem.
For encoding the chromosomes I am using Permutation Encoding, and the following methods for Mutation() and NewChild(parent1, parent2, pivot) .
My selection algorithm is a bit weird, and is adapted from examples found on the Internet.
The score is calculated based on the square of the difference of the sum of rows/columns/diagonals and the magic constant, like this.
What I've noticed is that it converges very fast, and stops improving once it reaches a score of 1..7 (less is better).
I am seeing this as: it reaches a local optimum, a potential well, if you can call it that way, and won't jump over the near-by hill because the mutations are not different enough?
I've tried changing the mutation rate 5 - 80%, leaving an elite group of 10-20% in the chromosome population, changing the population size from 16-32 chromosomes, but no luck.
What am I doing wrong? What improvements can I use to make the population score converge to zero?
If required, I can post the full source code, if someone finds it useful or would like to play with it.
Update: Here is what the convergence rate looks like for a cube of size 5, with a crossover rate of 60% and a mutation rate of 10%:
There are no hard and fast answers for this, but I would advise reducing your elitism to no more than 5% and increasing your population size. I don't quite understand how your getPermutation() call is working to select potential candidates but I would also increase the size of your tournament (lines 8-16 in your selection algorithm) as you increase population size. If you're getting stuck in local optima it means either your mutation function isn't jumping far enough or you don't have enough diversity in your population to begin with.
Hope that helps, good luck!
In cases like this I've used subpopulations with good results. You basically run the algorithm n times and pick the best individuals of each run. With these you form an initial population (rather than initializing randomly as in a usual run) and run the ga again. The starting individuals in this last run are usually suboptimal in different ways and since all of them have more or less the same fitness they tend to pull each other out of their local minima when they get combined. Here it is also a good idea to use elitism in order to not lose the already evolved individuals to easily.
I was searching the last few days for a stable implementation of the R-Tree with support of unlimited dimensions (20 or so would be enough). I only found this http://sourceforge.net/projects/jsi/ but they only support 2 dimensions.
Another Option would be a multidimensional implementation of an interval-tree.
Maybe I'm completly wrong with the idea of using an R-Tree or Intervall-Tree for my Problem so i state the Problem in short, that you can send me your thoughts about this.
The Problem I need to solve is some kind of nearest-neighbour search. I have a set of Antennas and rooms and for each antenna an interval of Integers. E.g. antenna 1, min -92, max -85. In fact it could be represented as room -> set of antennas -> interval for antenna.
The idea was that each room spans a box in the R-Tree over the dimension of the antennas and in each dimension by the interval.
If I get a query with N-Antennas and values for each antenna I then could just represent the Information as a query point in the room and retrieve the rooms "nearest" to the point.
Hope you got an Idea of the problem and my idea.
Be aware that R-Trees can degrade badly when you have discrete data. The first thing you really need to find out is an appropriate data representation, then test if your queries work on a subset of the data.
R-Trees will only make your queries faster. If they don't work in the first place, it will not help. You should test your approach without using R-Trees first. Unless you hit a large amount of data (say, 100.000 objects), a linear scan in-memory can easily outperform an R-Tree, in particular when you need some adapter layer because it is not well-intergrated with your code.
The obvious approach here is to just use bounding rectangles, and linearly scan over them. If they work, you can then store the MBRs in an R-Tree to get some performance improvements. But if it doesn't work with a linear scan, it won't work with an R-Tree either (it will not work faster.)
I'm not entirely clear on what your exact problem is, but an R-Tree or interval tree would not work well in 20 dimensions. That's not a huge number of dimensions, but it is large enough for the curse of dimensionality to begin showing up.
To see what I mean, consider just trying to look at all of the neighbors of a box, including ones off of corners and edges. With 20 dimensions, you'll have 320 - 1 or 3,486,784,400 neighboring boxes. (You get that by realizing that along each axis a neighbor can be -1 unit, 0 unit, or +1 unit, but (0,0,0) is not a neighbor because it represents the original box.)
I'm sorry, but you either need to accept brute force searching, or else analyze your problem better and come up with a cleverer solution.
I have found this R*-Tree implementation in Java which seems to offer many features:
https://github.com/davidmoten/rtree
You might want to check it out!
Another good implementation in Java is ELKI: https://elki-project.github.io/.
You can use PostgreSQL’s Generalized Search Tree indexing facility.
GiST
Quick demo
As I have mentioned in previous questions I am writing a maze solving application to help me learn about more theoretical CS subjects, after some trouble I've got a Genetic Algorithm working that can evolve a set of rules (handled by boolean values) in order to find a good solution through a maze.
That being said, the GA alone is okay, but I'd like to beef it up with a Neural Network, even though I have no real working knowledge of Neural Networks (no formal theoretical CS education). After doing a bit of reading on the subject I found that a Neural Network could be used to train a genome in order to improve results. Let's say I have a genome (group of genes), such as
1 0 0 1 0 1 0 1 0 1 1 1 0 0...
How could I use a Neural Network (I'm assuming MLP?) to train and improve my genome?
In addition to this as I know nothing about Neural Networks I've been looking into implementing some form of Reinforcement Learning, using my maze matrix (2 dimensional array), although I'm a bit stuck on what the following algorithm wants from me:
(from http://people.revoledu.com/kardi/tutorial/ReinforcementLearning/Q-Learning-Algorithm.htm)
1. Set parameter , and environment reward matrix R
2. Initialize matrix Q as zero matrix
3. For each episode:
* Select random initial state
* Do while not reach goal state
o Select one among all possible actions for the current state
o Using this possible action, consider to go to the next state
o Get maximum Q value of this next state based on all possible actions
o Compute
o Set the next state as the current state
End Do
End For
The big problem for me is implementing a reward matrix R and what a Q matrix exactly is, and getting the Q value. I use a multi-dimensional array for my maze and enum states for every move. How would this be used in a Q-Learning algorithm?
If someone could help out by explaining what I would need to do to implement the following, preferably in Java although C# would be nice too, possibly with some source code examples it'd be appreciated.
As noted in some comments, your question indeed involves a large set of background knowledge and topics that hardly can be eloquently covered on stackoverflow. However, what we can try here is suggest approaches to get around your problem.
First of all: what does your GA do? I see a set of binary values; what are they? I see them as either:
bad: a sequence of 'turn right' and 'turn left' instructions. Why is this bad? Because you're basically doing a random, brute-force attempt at solving your problem. You're not evolving a genotype: you're refining random guesses.
better: every gene (location in the genome) represents a feature that will be expressed in the phenotype. There should not be a 1-to-1 mapping between genome and phenotype!
Let me give you an example: in our brain there are 10^13ish neurons. But we have only around 10^9 genes (yes, it's not an exact value, bear with me for a second). What does this tell us? That our genotype does not encode every neuron. Our genome encodes the proteins that will then go and make the components of our body.
Hence, evolution works on the genotype directly by selecting features of the phenotype. If I were to have 6 fingers on each hand and if that would made me a better programmer, making me have more kids because I'm more successful in life, well, my genotype would then be selected by evolution because it contains the capability to give me a more fit body (yes, there is a pun there, given the average geekiness-to-reproducibily ratio of most people around here).
Now, think about your GA: what is that you are trying to accomplish? Are you sure that evolving rules would help? In other words -- how would you perform in a maze? What is the most successful thing that can help you: having a different body, or having a memory of the right path to get out? Perhaps you might want to reconsider your genotype and have it encode memorization abilities. Maybe encode in the genotype how much data can be stored, and how fast can your agents access it -- then measure fitness in terms of how fast they get out of the maze.
Another (weaker) approach could be to encode the rules that your agent uses to decide where to go. The take-home message is, encode features that, once expressed, can be selected by fitness.
Now, to the neural network issue. One thing to remember is that NNs are filters. They receive an input. perform operations on it, and return an output. What is this output? Maybe you just need to discriminate a true/false condition; for example, once you feed a maze map to a NN, it can tell you if you can get out from the maze or not. How would you do such a thing? You will need to encode the data properly.
This is the key point about NNs: your input data must be encoded properly. Usually people normalize it, maybe scale it, perhaps you can apply a sigma function to it to avoid values that are too large or too small; those are details that deal with error measures and performance. What you need to understand now is what a NN is, and what you cannot use it for.
To your problem now. You mentioned you want to use NNs as well: what about,
using a neural network to guide the agent, and
using a genetic algorithm to evolve the neural network parameters?
Rephrased like so:
let's suppose you have a robot: your NN is controlling the left and right wheel, and as input it receives the distance of the next wall and how much it has traveled so far (it's just an example)
you start by generating a random genotype
make the genotype into a phenotype: the first gene is the network sensitivity; the second gene encodes the learning ratio; the third gene.. so on and so forth
now that you have a neural network, run the simulation
see how it performs
generate a second random genotype, evolve second NN
see how this second individual performs
get the best individual, then either mutate its genotype or recombinate it with the loser
repeat
there is an excellent reading on the matter here: Inman Harvey Microbial GA.
I hope I did you some insight on such issues. NNs and GA are no silver bullet to solve all problems. In some they can do very much, in others they are just the wrong tool. It's (still!) up to us to get the best one, and to do so we must understand them well.
Have fun in it! It's great to know such things, makes everyday life a bit more entertaining :)
There is probably no 'maze gene' to find,
genetic algorithms are trying to setup a vector of properties and a 'filtering system' to decide by some kind of 'surival of the fittest' algorithm to find out which set of properties would do the best job.
The easiest way to find a way out of a maze is to move always left (or right) along a wall.
The Q-Algorithm seems to have a problem with local maxima this was workaround as I remember by kicking (adding random values to the matrix) if the results didn't improve.
EDIT: As mentioned above a backtracking algorithm suits this task better than GA or NN.
How to combine both algorithm is described here NeuroGen descibes how GA is used for training a NN.
Try using the free open source NerounDotNet C# library for your Neural networks instead of implementing it.
For Reinforcement Learning library, I am currently looking for one, especially for Dot NET framework..