I've been staring at this problem for hours. For some reason I understand flood fill and recursion with 2-D arrays, but I can't seem to even get started on this problem:
You have three assistants working for you. One is named Jeff, the other is named Jeff, and the third is named Jeff. They all type with the same speed of one page per minute. You came to your office today with a bunch of papers you need to have typed as soon as possible. You have to distribute the papers among your assistants in such a way that they finish all the papers at the earliest possible time. "Jeff do this," you yell. "Jeff do that," you say. "Jeff, finish the job," you admonish. So Jeff does. But you need to help him. So you have this APT.
Your task is: given an int[] with the number of pages for each paper, return the minimum number of minutes needed for your assistants to type all these papers. Assume that they can't divide a paper into parts; that is, each paper is typed by one person. For example, given {1,2,3,4,5,6,7}, the function should return 10 because 7+3=10, 6+2+1=9, and 5+4=9 (there are also other combinations of these numbers that would yield the same result).
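To make the problem concrete, here is a minimal brute-force Java sketch (class and method names are mine, not from the problem): it tries every way of handing each paper to one of the three assistants and prunes branches that are already worse than the best split found. This is fine for small inputs like the example.

    // Hand each paper to one of the 3 assistants; keep the split whose
    // busiest assistant finishes earliest.
    public class PaperSplit {
        static int best;

        static int minMinutes(int[] pages) {
            best = Integer.MAX_VALUE;
            assign(pages, 0, new int[3]);
            return best;
        }

        static void assign(int[] pages, int i, int[] load) {
            if (i == pages.length) {
                best = Math.min(best, Math.max(load[0], Math.max(load[1], load[2])));
                return;
            }
            for (int a = 0; a < 3; a++) {
                load[a] += pages[i];
                if (load[a] < best) assign(pages, i + 1, load); // prune hopeless branches
                load[a] -= pages[i];
            }
        }

        public static void main(String[] args) {
            System.out.println(minMinutes(new int[]{1, 2, 3, 4, 5, 6, 7})); // prints 10
        }
    }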
Related
My computer science teacher has assigned this problem to us (counting the ways a US presidential election can end in an Electoral College tie), and just about everyone in our class was in an uproar over its complexity. We are only in Advanced Topics of Computer Science in high school, and none of us really has any idea where to start or what algorithms to use. We determined that going straight through every possible combination would mean about 2^50 combinations, which is way, WAY too big for any of us to search through. I'm just curious whether this is even possible at our Computer Science skill level, and whether anyone personally thinks this is a fair problem, because our teacher still hasn't found a solution to his own problem.
Thanks!
The solution space is not really 2^50. A tie (assuming only two candidates) means 269-269. You can't get to 269 with only one state (or even only a handful of states), so you can immediately throw out all small subsets and all large subsets (winning every state also doesn't work). Furthermore, you only need to look for subsets that total 269 (there are 538 votes in total, so the complement of each such set also totals 269).
That said, this still boils down to the subset sum problem (https://en.wikipedia.org/wiki/Subset_sum_problem), so any solution will not scale well (unless you figure out how to do it in polynomial time, in which case you can claim $1,000,000). However, your problem is not to make it scale; for the US electoral college configuration (including vote splits in some states) it is not too large to figure out in a reasonable (< 10 mins, as you say) amount of time.
The solution space is smaller than it seems, since some states have the same number of electoral votes. For example, Florida and New York both have 29 electoral votes, so there are really just three cases, not four: both on the left, both on the right, and one on each side (which should be double-counted since this can happen in two ways). This reduces the number of cases to 6.2 * 10^9, over five orders of magnitude smaller than 2^51 (although, in exchange, there's a slight amount of extra work determining how many cases you're representing). Even without further optimization this is small enough to iterate over fairly quickly.
This PARI/GP script
\\ EV lists the distinct electoral-vote values; count[i] = number of states worth EV[i]
EV=[55,38,29,20,18,16,15,14,13,12,11,10,9,8,7,6,5,4,3]~;
count=[1,1,2,2,1,2,1,1,1,1,4,4,3,2,3,6,3,5,8];
\\ v[i] = how many of the count[i] states one side wins; weight each split by binomials
s=0; forvec(v=vector(#count,i,[0,count[i]]), if(v*EV==269, s+=prod(i=1,#count, binomial(count[i],v[i])))); s
yields the answer; note that it iterates over all ~6.2 * 10^9 cases, so expect minutes rather than milliseconds.
This version doesn't attempt to handle third-party candidates, split state votes, etc.
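If you don't have PARI/GP handy, the same count can be computed with a small dynamic program over (state group, votes won so far). Here is a Java sketch of that idea (my own translation; names are mine); it avoids enumerating all ~6.2 * 10^9 cases and finishes in milliseconds:

    import java.math.BigInteger;
    import java.util.Arrays;

    public class TieCount {
        static final int[] EV    = {55,38,29,20,18,16,15,14,13,12,11,10,9,8,7,6,5,4,3};
        static final int[] COUNT = {1,1,2,2,1,2,1,1,1,1,4,4,3,2,3,6,3,5,8};

        public static void main(String[] args) {
            // ways[t] = weighted number of ways for one side to win exactly t votes so far
            BigInteger[] ways = new BigInteger[270];
            Arrays.fill(ways, BigInteger.ZERO);
            ways[0] = BigInteger.ONE;
            for (int i = 0; i < EV.length; i++) {
                BigInteger[] next = new BigInteger[270];
                Arrays.fill(next, BigInteger.ZERO);
                for (int t = 0; t < 270; t++) {
                    if (ways[t].signum() == 0) continue;
                    // pick k of the COUNT[i] equal-vote states for this side
                    for (int k = 0; k <= COUNT[i] && t + k * EV[i] < 270; k++)
                        next[t + k * EV[i]] = next[t + k * EV[i]]
                            .add(ways[t].multiply(binomial(COUNT[i], k)));
                }
                ways = next;
            }
            System.out.println(ways[269]);
        }

        static BigInteger binomial(int n, int k) {
            BigInteger b = BigInteger.ONE;
            for (int j = 0; j < k; j++)
                b = b.multiply(BigInteger.valueOf(n - j)).divide(BigInteger.valueOf(j + 1));
            return b;
        }
    }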
I just recently wrote a sudoku solver in Java using backtracking.
Is it possible, given the solutions, to formulate a problem or puzzle?
EDIT
Is there some way to formulate the original puzzle?
An additional question: given the puzzles and the solutions, suppose I can go from the solutions back to puzzles (the result is puzzles) and from the puzzles to solutions (the result is solutions). Which has the greater number: the puzzles or the solutions?
It is possible to formulate one of the multiple possible original states.
1. Start with the final solution (all numbers are present).
2. Remove one number (chosen randomly or not).
3. Check whether this number can be recovered given the current state of the board (you already have a solver, so this should be easy).
4. If the number can be recovered, everything is OK. Go back to 2.
5. If the number cannot be recovered, put it back where it was. Go back to 2.
6. When no more numbers can be removed, you have reached one of the original states of the puzzle.
If you choose the numbers you remove randomly (step 2), you can execute this several times and get different starting points that lead to the same final solution.
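In Java, that loop might look like this (a sketch; hasUniqueSolution is a placeholder for your own backtracking solver). One pass over the cells is enough: removing further clues never makes a previously unremovable clue removable again.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class PuzzleDigger {
        // Dig holes in a solved grid while the puzzle stays uniquely solvable.
        static int[][] makePuzzle(int[][] solved) {
            int[][] grid = new int[9][];
            for (int r = 0; r < 9; r++) grid[r] = solved[r].clone();

            List<int[]> cells = new ArrayList<>();
            for (int r = 0; r < 9; r++)
                for (int c = 0; c < 9; c++)
                    cells.add(new int[]{r, c});
            Collections.shuffle(cells); // random removal order => different puzzles

            for (int[] cell : cells) {
                int saved = grid[cell[0]][cell[1]];
                grid[cell[0]][cell[1]] = 0;         // step 2: remove a number
                if (!hasUniqueSolution(grid))       // step 3: can it be recovered?
                    grid[cell[0]][cell[1]] = saved; // step 5: put it back
            }
            return grid; // step 6: one possible original state
        }

        // Placeholder: plug in your backtracking solver; it should return
        // true iff the grid has exactly one completion.
        static boolean hasUniqueSolution(int[][] grid) {
            throw new UnsupportedOperationException("use your solver here");
        }
    }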
Creating a puzzle from a solution is very simple. You just apply the solving steps backwards.
For example, if a line contains 8 digits, you can fill in the 9th. Run backwards: if a line contains 9 digits, you can remove one. This will yield a very boring puzzle, but still a valid one (a valid puzzle is a puzzle with only one solution).
The more complicated the steps you reverse are, the more difficult the puzzle will be. In fact, brute-forcing the puzzle is the most difficult strategy; executing it backwards boils down to randomly removing a digit and brute-force checking whether there is still only one solution. Notice you do not have to solve the entire puzzle: it's enough to prove there is only one way to add the removed digit back into the puzzle.
As for the second part of your question: This feels a bit like a mathematical question, but let me answer:
A good puzzle only has 1 solution. Since there are multiple puzzles that yield the same solution (like, there are 81 ways to fill in 80 of the 81 squares, all yielding the same solution from a different puzzle), you could say there are many more puzzles than solutions.
If you also allow puzzles with multiple solutions, it changes. Every puzzle has one or more solutions, and each of those solutions belongs to that puzzle, so the number of (puzzle, solution) pairs is the same whichever side you count it from. Invalid puzzles do not change this: since they belong to 0 solutions, you need no additional puzzles belonging to those solutions.
PS: It's also trivial to create puzzles if they do not need to be uniquely solvable: just randomly remove some digits and you're done.
An additional question: given the puzzles and the solutions, suppose I can go from the solutions back to puzzles (the result is puzzles) and from the puzzles to solutions (the result is solutions). Which has the greater number: the puzzles or the solutions?
There is no well-defined answer.
Each solution has exactly 2^81 corresponding puzzles (each of the 81 cells is either given or blank), including the trivial one. Not all of these have a unique solution, but many do. Therefore, if the chosen set of solutions contains only one element, then the maximal set of puzzles that jointly correspond to the solution set is vastly larger.
On the other hand, the completely blank puzzle affords 6,670,903,752,021,072,936,960 solutions. There are many pairs of those that share only the blank grid as a common puzzle. Thus, if the chosen set of puzzles includes only the blank grid, then the set of corresponding solutions is vastly larger.
So in my school we have a day where everybody participates in different activities.
Each project can have about 10 members. The whole day is divided into 2 or 3 blocks, and the pupils assigned to an activity change between blocks. (So in block 1 pupil x takes part in activity a, and in the second block in activity d.)
Before this day starts, we hand out lists on which each pupil can tell us his 3 (or 4) favorite activities, ordered from most favorite to least (he will only take part in two of them).
Now our job is to assign the pupils in a way that gives the best overall satisfaction (so everybody more or less gets his/her chosen activities). What would be a good algorithm to solve this? (I'm quite familiar with programming (especially Java), so the approach alone would be enough, although some (pseudo-)code would be great too.)
Is there any way to do this, apart from calculating such a "satisfaction" value for each possible solution?
An optional feature would be that if someone can't get into his/her chosen activity, they would get into a similar one (this sounds kind of sexist, but you could for example rate how "female"/"male" an activity is and choose similar activities according to this scale).
I hope this question fits on Stack Exchange; if it is totally off-topic, I would be happy to hear about a more suitable site.
Looking forward to your suggestions,
John
If the students are ranking each of their favorite activities (1-4), then it's simple to assign those rankings a weight (1-4). You group everyone who rates a certain activity at a certain level and compare the number of students to the number of spots. If there are more students than spots, the method of choosing is up in the air. I would say random for fairness, or if you want to get fancy you can track it from day to day so that everyone gets a chance to participate in a favorite activity.
If there are more spots than students, then you could poll the people who rated it a 3, and so on down the line.
That seems like a fair place to start at least.
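A greedy Java version of this could look as follows (a sketch only; class names, capacities, and the sample data are my assumptions). It handles one block; for 2 or 3 blocks you would run it once per block, skipping each pupil's already-assigned activities.

    import java.util.*;

    // Satisfy first choices, then second choices, etc., breaking ties
    // randomly when an activity is oversubscribed.
    public class ActivityAssigner {
        public static void main(String[] args) {
            // activity name -> remaining capacity (assumption: ~10 per project)
            Map<String, Integer> capacity = new HashMap<>();
            capacity.put("robotics", 10);
            capacity.put("theatre", 10);
            capacity.put("soccer", 10);

            // pupil -> choices ordered from most to least favorite
            Map<String, List<String>> wishes = new LinkedHashMap<>();
            wishes.put("Anna", List.of("robotics", "theatre", "soccer"));
            wishes.put("Ben",  List.of("robotics", "soccer", "theatre"));

            System.out.println(assign(wishes, capacity));
        }

        static Map<String, String> assign(Map<String, List<String>> wishes,
                                          Map<String, Integer> capacity) {
            Map<String, String> result = new HashMap<>();
            int maxRank = wishes.values().stream().mapToInt(List::size).max().orElse(0);
            for (int rank = 0; rank < maxRank; rank++) {
                // group the still-unassigned pupils by their choice at this rank
                Map<String, List<String>> wanters = new HashMap<>();
                for (var e : wishes.entrySet()) {
                    if (result.containsKey(e.getKey()) || rank >= e.getValue().size()) continue;
                    wanters.computeIfAbsent(e.getValue().get(rank), k -> new ArrayList<>())
                           .add(e.getKey());
                }
                for (var e : wanters.entrySet()) {
                    List<String> pupils = e.getValue();
                    Collections.shuffle(pupils); // random tie-break for fairness
                    int free = capacity.getOrDefault(e.getKey(), 0);
                    for (String p : pupils.subList(0, Math.min(free, pupils.size())))
                        result.put(p, e.getKey());
                    capacity.put(e.getKey(), Math.max(0, free - pupils.size()));
                }
            }
            return result;
        }
    }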
I don't have an algorithm for you but there is a package which will do much of the job for you. The site is http://www.optaplanner.org/ and it is part of the Drools project.
Configuring the application requires some work. By the time you finish the configuration you will have gotten some hint as to how hard the task is, and why no simple algorithm will do the job.
Came across this interview programming test recently:
You're given a list of top 50 favorite artists for 1000 users (from last.fm)
Generate a list of all artist pairs that appear together at least 50 times.
The solution can't store in memory, or evaluate, all possible pairs.
The solution should be scalable to larger datasets.
The solution doesn't have to be exact, ie you can report pairs with a high probability of meeting the cutoff.
I feel I have a pretty workable solution, but I'm wondering if they were looking for something specific that I missed.
(In case it makes a difference - this isn't from my own interviewing, so I'm not trying to cheat any prospective employers)
Here are my assumptions:
There's a finite maximum number of artists (622K according to MusicBrainz), while there is no limit on the number of users (well, not more than ~7 billion, I guess).
Artists follow a "long tail" distribution: a few are popular, but most are favorited by a very small number of users.
The cutoff is chosen to select a certain percentage of artists (around 1% with 50 and the given data), so it will increase as the number of users increases.
The third requirement is a little vague - technically, if you have any exact solution you've "evaluated all possible pairs".
Practical Solution
First pass: convert artist names to numeric ids; store the converted favorite data in a temp file; keep a count of user favorites for each artist.
Requires a string->int map to keep track of assigned ids; can use a Patricia tree if space is more important than speed (needed 1/5th the space and twice the time in my, admittedly not very rigorous, tests).
Second pass: iterate over the temp file; throw out artists which didn't, individually, meet the cutoff; keep counts of pairs in a 2D matrix.
Will require n(n-1)/2 bytes (or shorts, or ints, depending on the data size) plus the array reference overhead. Shouldn't be a problem since n is, at most, 0.01-0.05 of 622K.
This seems like it can process any sized real-world dataset using less than 100MB of memory.
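The second pass could look roughly like this in Java (a sketch; it assumes the first pass already produced dense ids 0..n-1 for the artists that individually met the cutoff, and all names are mine):

    import java.util.ArrayList;
    import java.util.List;

    public class PairCounter {
        // 'lists' holds each user's favorites, remapped to dense ids 0..n-1
        // covering only the artists that individually met the cutoff.
        static List<int[]> countPairs(List<int[]> lists, int n, int cutoff) {
            int[] counts = new int[n * (n - 1) / 2]; // flattened upper triangle
            for (int[] favs : lists)
                for (int i = 0; i < favs.length; i++)
                    for (int j = i + 1; j < favs.length; j++) {
                        int a = Math.min(favs[i], favs[j]);
                        int b = Math.max(favs[i], favs[j]);
                        counts[triIndex(a, b, n)]++;
                    }
            List<int[]> result = new ArrayList<>();
            for (int a = 0; a < n; a++)
                for (int b = a + 1; b < n; b++)
                    if (counts[triIndex(a, b, n)] >= cutoff)
                        result.add(new int[]{a, b});
            return result;
        }

        // Index of pair (a, b) with a < b in the flattened upper triangle.
        static int triIndex(int a, int b, int n) {
            return a * n - a * (a + 1) / 2 + (b - a - 1);
        }
    }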
Alternate Solution
If you can't do multiple passes (for whatever contrived reason), use an array of Bloom filters to keep the pair counts: for each pair you encounter, find the highest filter it's (probably) in, and add it to the next one up. So the first time it's added to bf[0], the second time to bf[1], and so on until bf[49]. Or you can revert to keeping actual counts after a certain point.
I haven't run the numbers, but the lowest few filters will be quite sizable - it's not my favorite solution, but it could work.
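For what it's worth, a minimal version of the cascade could be sketched like this (filter sizes and the hash mixing are placeholders, not tuned values):

    import java.util.BitSet;

    // A pair "seen (probably) k times" is in filters 0..k-1; to record one
    // more sighting, add it to filter k.
    class BloomCascade {
        private final BitSet[] filters;
        private final int bits;

        BloomCascade(int levels, int bitsPerFilter) {
            bits = bitsPerFilter;
            filters = new BitSet[levels];
            for (int i = 0; i < levels; i++) filters[i] = new BitSet(bits);
        }

        // Record one occurrence of pair (a, b); returns the estimated count so far.
        int add(int a, int b) {
            long key = ((long) a << 32) | (b & 0xffffffffL);
            int level = 0;
            while (level < filters.length && contains(filters[level], key)) level++;
            if (level < filters.length) {
                insert(filters[level], key);
                return level + 1;
            }
            return filters.length; // saturated: "at least this many"
        }

        private boolean contains(BitSet f, long key) {
            return f.get(hash(key, 1)) && f.get(hash(key, 2));
        }

        private void insert(BitSet f, long key) {
            f.set(hash(key, 1));
            f.set(hash(key, 2));
        }

        // Two cheap hash positions derived from one 64-bit mix (placeholder mixing).
        private int hash(long key, int salt) {
            long h = (key + salt) * 0x9E3779B97F4A7C15L;
            h ^= h >>> 32;
            return (int) ((h & Long.MAX_VALUE) % bits);
        }
    }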
Any other ideas?
You should consider one of the existing approaches for mining association rules. This is a well-researched problem, and it is unlikely that a home-grown solution would be much faster. Some references:
Wikipedia has a non-terrible list of implementations: http://en.wikipedia.org/wiki/Association_rule_learning
Citing a previous answer of mine: What is the most efficient way to access particular elements in a SortedSet?
There is a repository of existing implementations here: http://fimi.ua.ac.be/src/ . These are tools that participated in a performance competition a few years back; many of them come with indicative papers to explain how/when/why they are faster than other algorithms.
With two points of the requirement being about inexact solution, I'm guessing they're looking for a fast shortcut approximation instead of an exhaustive search. So here's my idea:
Suppose that there is absolutely no correlation between a fan's choices of favorite artists. This is, of course, surely false. Someone who likes Rembrandt is far more likely to also like Rubens than he is to also like Pollock. (You did say we were picking favorite artists, right?) I'll get back to this in a moment.
Then make a pass through the data, counting the number of distinct artists, the number of fans, and how often each artist shows up as a favorite. When we're done making this pass: (1) Eliminate any artist who doesn't individually show up the required number of "pair times". If an artist only shows up 40 times, he can't possibly be included in more than 40 pairs. (2) For the remaining artists, convert each "like count" to a percentage, i.e. this artist was liked by, say, 10% of the fans. Then for each pair of artists, multiply their like percentages together and then multiply by the total number of fans. This is the estimated number of times they'd show up as a pair.
For example, suppose that of 1000 fans, 200 say they like Rembrandt and 100 say they like Michelangelo. That means 20% for Rembrandt and 10% for Michelangelo. So if there's no correlation, we'd estimate that 20% * 10% * 1000 = 20 like both. This is below the threshold, so we wouldn't include this pair.
The catch is that there almost surely is a correlation between "likes". My first thought would be to study some real data and see how much of a correlation there is, that is, how the real pair counts differ from the estimates. If we find that, say, the real count is rarely more than twice the estimated count, then we could declare any pair whose estimate is over 1/2 of the threshold a "candidate". Then we do an exhaustive count on the candidates to see how many really meet the condition. This would allow us to eliminate all the pairs that fall well below the threshold as "unlikely" and thus not worth the cost of investigating.
This could miss pairs when the artists almost always occur together. If, say, 100 fans like Picasso, 60 like Van Gogh, and of the 60 who like Van Gogh 50 also like Picasso, their estimate will be MUCH lower than their actual. If this happens rarely enough it may fall into the acceptable "exact answer not required" category. If it happens all the time this approach won't work.
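In code, the candidate filter is a few lines (a sketch; the factor of 2 is the correlation fudge to be calibrated against real data, and all names are mine):

    import java.util.ArrayList;
    import java.util.List;

    public class PairEstimator {
        // Flag pairs whose independence-based estimate reaches half the
        // threshold; only those get an exact recount afterwards.
        static List<int[]> candidatePairs(int[] likeCounts, int totalFans, int threshold) {
            final double fudge = 2.0; // assumed max ratio of real count to estimate
            List<int[]> candidates = new ArrayList<>();
            for (int a = 0; a < likeCounts.length; a++)
                for (int b = a + 1; b < likeCounts.length; b++) {
                    // expected co-likes if tastes were independent:
                    // (likes_a / fans) * (likes_b / fans) * fans
                    double estimate = (double) likeCounts[a] * likeCounts[b] / totalFans;
                    if (estimate * fudge >= threshold)
                        candidates.add(new int[]{a, b});
                }
            return candidates;
        }
    }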
I am trying to write a simple AI for a "Get four" game.
The basic game principles are done, so I can throw in coins of different colors; they stack on each other, fill a 2D array, and so on and so forth.
Until now, this is what the method looks like:
public int insert(int x, int color) // 0 = empty, 1 = player1, 2 = player2
x is the horizontal coordinate; the y coordinate is determined by how many stones are already in that column, so I think the idea is obvious.
Now the problem is that I have to rate specific game situations: find how many new pairs, triplets, and potential four-in-a-rows I can get in a specific situation, and then give each situation a value. With these values I can set up a game tree and decide which move would be best next (later on implementing alpha-beta pruning).
My current problem is that I can't think of an efficient way to implement a rating of the current game situation in a Java method.
Any ideas would be greatly appreciated!
I'm guessing that this is a homework assignment, and that you mean you want to write the evaluation function but don't know what tricks to use?
The game is called "Connect 4" in English, so you can google for
"connect 4 evaluation function".
You will find plenty of people discussing heuristics.
Please don't copy actual source code; it's an important exercise :)
The search space for Connect 4 isn't impossibly large. For a simple implementation, albeit one that'll take a while to run (perhaps tens of minutes), do a minimax search until someone wins or the game ends. Assign +1 or -1 for a win by one player or the other, and 0 for a draw.
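The skeleton of that search might look like this (a sketch; Board and its methods are placeholders for your own 2D-array code, and you would need an undo counterpart to your insert method):

    // Bare-bones minimax: +1 if player 1 can force a win, -1 if player 2
    // can, 0 for a draw.
    static int minimax(Board b, int player) {
        int previous = 3 - player;                   // maps 1 <-> 2
        if (b.hasFourInARow(previous)) return previous == 1 ? 1 : -1;
        if (b.isFull()) return 0;                    // draw
        int best = (player == 1) ? -2 : 2;
        for (int x = 0; x < 7; x++) {                // 7 columns in Connect 4
            if (b.columnFull(x)) continue;
            b.insert(x, player);                     // your existing insert method
            int score = minimax(b, 3 - player);
            b.remove(x);                             // undo the move
            best = (player == 1) ? Math.max(best, score) : Math.min(best, score);
        }
        return best;
    }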
Bollocks. The search space is huge; you would need a precomputed table if you want to do that.