I have a load of vecmath Point objects (Point3d FWIM) which I would like to “group” based on the distance between them. I can probably write the code for this from scratch (I’ve done similar tasks in excel), but I like the idea of using existing libraries where possible. The problem is I can’t find any libraries like this.
I haven’t thought through the exact algorithm fully, but I hope I’ve done enough for the question not to be deleted. Please bear with me, I'm still new here at the time of this post.
I imagine the grouping would work as follows:
decide the distanceLimit
loop 1: for each Point, calculate the distance to each other Point
Make a "Set"
loop 2: for each Point
if the next Point is within the distanceLimit of any previously considered Points up to i, add it to current "Set"
Else make a new "Set".
Edit: ah the power of verbalising one's ideas. The above doesn't the capture the situation where points 1 and 2 are between one and two distanceLimits apart and initiate separate "sets", and point 3 crops up halfway between them meaning that all three should be in one set really. Need to think about this some more!
I’m also not sure yet what data structures I should really use for the input and output (ArrayLists? Sets?).
Ideally I am looking for an existing library that does this or similar; if you’re confident there isn’t one, then any suggestions for the algorithm or the actual code would be more than welcome.
After lots more googling, I found that:
what I was trying to do is called clustering;
this did exactly what I was trying to do; I was impressed with how well it worked for me.
I am currently working on modelling a certain MIP in gurobi (working in java).
This problem requires a lot of computational time and I don't want to waste more than 100s on each problem. However, when this time limit is reached, I want gurobi to print the best feasable solution it has found within these 100s. I tried using the attribute ObjBound, but this always gives me a lower objective than when I let gurobi find the optimale solution (my problem is a minimization problem). So I'm guessing here that this ObjBound gives a lower bound, while I'm trying to find an upper bound.
Is there any way I can accomplish this?
If you are simply looking to get any feasible solution (not the optimal, just feasbile), set the MIPFocus parameter to 1. This tells gurobi to concentrate on finding a feasible solution rather than an optimal solution.
model.setParam('MIPFocus', 1)
model.optimize()
I am trying to do some great Event recommandation using mahout.
For practice I tried following example-
https://github.com/RevBooyah/Static-mahout-recommender-tutorial/blob/master/ItemRecommend.java
I have some doubt that there are 3 things that used in data model UserId, ItemId and Preference as below-
But when I run the code with or without Preferences tha results are same, So my doubt is that what is the use of the Preferences ? If here it is useless then how can it be used for better Recommandation ?
I tried to find it but found nothing.
Can anyone please help me ?
Are you using Tanimoto similarity of the Log Likelihood Ratio? The sample code uses Tanimoto and so should show different recommendation strengths depending on preference strengths. That will attempt to do something like predicting a user's ratings. It won't affect all weights so to test you might want to randomly assign weights and compare to the sample data. But it's not really important enough to bother with IMO.
This is an old method that dates back to when Netflix and others thought they wanted to guess at your item ratings. Netflix and most others have moved away from that because it is really much more important to rank correctly so the user gets the right set of recs in the best order.
Ranking is always better when using the Log Likelihood similarity measure--on all data I've seen and I've measured the difference in quality several times. LLR ignores the preference strength and calculates the recommendations based on a probabilistic method trying to predict what the user is most likely to prefer.
Ted Dunning describes LLR here
I am trying to device an algorithm that performs error correction in names. My approach is having a database with the correct names, compute edit distance between each of them and the name entered and then suggest the 5 or 10 closest.
This task is significantly different from standard error correction in words as some of the names might be replaced by initials. For instance "Jonathan Smith" and "J. Smith" are actually quite close and could easily be considered the same name, so the edit distance should be really small if not 0. Another challenge is that some names might be written differently while sounding the same. For instance Shnaider and Schneider are versions of the same name written by people with different locales(there are better examples for that I guess). And another case - just imagine all the possible errors in writing Jawaharlal Nehru most of which have nothing to do with the real name. Again probably most of them will be similar phonetically.
Obviously Lucene's error correction algorithm will not help me here as it does not handle the above cases.
So my question is: do you know any library capable of doing error correction in names? Can you propose some algorithm for handling the cases mentioned above?
I am interested in libraries in c++ or java. As for algorithm proposals any language or pseudo code will do.
For phonetic matching, see Soundex.
I think modifying a Levenshtein distance algorithm to treat "abbreviate to an initial" and "expand from an initial" as single-distance edits ought to be straightforward, but the details are beyond me at the moment.
You might also look at Metaphone.
Yes, I know this is nothing new and there are many questions already out there (it even has its own tag), but I'd like to create a Sudoku Solver in Java solely for the purpose of training myself to write code that is more efficient.
Probably the easiest way to do this in a program is have a ton of for loops parse through each column and row, collect the possible values of each cell, then weed out the cells with only one possibility (whether they contain only 1 number, or they're the only cell in their row/column that contains this number) until you have a solved puzzle. Of course, a sheer thought of the action should raise a red flag in every programmer's mind.
What I'm looking for is the methodology to go about solving this sucker in the most efficient way possible (please try not to include too much code - I want to figure that part out, myself).
I want to avoid mathematical algorithms if at all possible - those would be too easy and 100% not my work.
If someone could provide a step-by-step, efficient thought process for solving a Sudoku puzzle (whether by a human or computer), I would be most happy :). I'm looking for something that's vague (so it's a challenge), but informative enough (so I'm not totally lost) to get me started.
Many thanks,
Justian Meyer
EDIT:
Looking at my code, I got to thinking: what would be some of the possibilities for storing these solving states (i.e. the Sudoku grid). 2D Arrays and 3D Arrays come to mind. Which might be best? 2D might be easier to manage from the surface, but 3D Arrays would provide the "box"/"cage" number as well.
EDIT:
Nevermind. I'm gonna go with a 3D array.
It depends on how you define efficient.
You can use a brute force method, which searches through each column and row, collects the possible values of each cell, then weeds out the cells with only one possibility.
If you have cells remaining with more than one possibility, save the puzzle state, pick the cell with the fewest possibilities, pick one of the possibilities, and attempt to solve the puzzle. If the possibility you picked leads to a puzzle contradiction, restore the saved puzzle state, go back to the cell and choose a different possibility. If none of the possibilities in the cell you picked solves the puzzle, pick the next cell with the fewest possibilities. Cycle through the remaining possibilities and cells until you've solved the puzzle.
Attempt to solve the puzzle means searching through each column and row, collecting the possible values of each cell, then weeding out the cells with only one possibility. When all of the cells are weeded out, you've solved the puzzle.
You can use a logical / mathematical method, where your code tries different strategies until the puzzle is solved. Search Google with "sudoku strategies" to see the different strategies. Using logical / mathematical methods, your code can "explain" how the puzzle was solved.
When I made mine, I thought I could solve every board using a set of rules without doing any backtracking. This proved impossible as even puzzles targeting human players potentially require making a few hypothesis.
So I starting with implementing the basic "rules" for solving a puzzle, trying to find the next rule to implement that would allow the resolution of where it stopped last time. In the end, I was forced to add a brute forcing recursive algorithm, but most puzzles are actually solved without using that.
I wrote a blog post about my sudoku solver. Just read through the "The algorithm" section and you'll get a pretty good idea how I went about it.
http://www.byteauthor.com/2010/08/sudoku-solver/
Should anyone need a reference Android implementation, I wrote a solution that uses the algorithm from the post above.
Full open-source code here: https://github.com/bizz84/SudokuSolver
Additionally, this solution loads Sudoku Puzzles in JSON format from a web server and posts back the results.
You should think about reducing the Sudoku Problem to a SATisfiability problem.
This method will avoid you to think too mathematically but more logically about the AI.
The goal step by step is basically :
* Find all the constraints that a Sudoku has. (line, column, box).
* Write these constraints as boolean constraints.
* Put all these constraints in a Boolean Satisfiability Problem.
* Run a SAT solver (or write your own ;) ) on this problem.
* Transform the SAT solution into the solution of the initial Sudoku.
It has been done by Ivor Spence by using SAT4J and you can find the Java Applet of his work here : http://www.cs.qub.ac.uk/~I.Spence/SuDoku/SuDoku.html.
You can also download directly the Java code from SAT4J website, to see how it look like : http://sat4j.org/products.php#sudoku.
And finally, the big advantage of this method is : You can solve N*N Sudokus, and not only the typical 9*9, which is I think, much more challenging for an AI :).