Custom Sorting Algorithm in Java? - java

Hey guys so I want to write a small Java Program that helps me sort a list. Imagine the list looks like this:
Apples, Grapefruit, Bananas, Pineapples, Coconuts
Now I don't want to sort alphabetically or anything like that but for example by what fruit I like the most, so the sorted list could look like this: Coconuts, Bananas, Apples, Pineapples, Grapefruit
My idea so far is that it could go like this: Apples is written into the list. Then Grapefruit and Apples are compared and the user says which he likes more (here Apples), so Grapefruit moves under Apples. Then it compares Bananas with e.g. Apples, and the user tells the program he likes Bananas more, so it goes above Apples and doesn't have to be compared with Grapefruit anymore. The program should handle a few hundred entries in the end, so asking fewer questions will save a lot of time. Am I on the right track? Also, what would be the best way to input the list: an array, an ArrayList, or...?
How should this be implemented? Is there a fitting sorting Algorithm? Thanks in advance!

You should build a Binary Search Tree.
As you're inserting new fruits, you ask the user which they like best, to find where to insert the new fruit node. To keep the number of questions down, keep the tree Balanced.
Once the "preference tree" has been built, you can iterate the tree depth-first (in-order), assigning an incremental "preference value" to each fruit, and build a Map<String, Integer>, so you can quickly look up any fruit's preference value, aka its sort sequence number.
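A minimal sketch of this idea, assuming the user's answers are consistent and no two fruits are tied. Instead of a hand-written balanced tree it uses java.util.TreeSet (a red-black tree), so each insertion needs roughly log2(n) comparisons. The class name PreferenceRanker, the method rank and the parameter moreLikedFirst are hypothetical; the comparator stands in for whatever code asks the user.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of the answer above.
final class PreferenceRanker {
    // 'moreLikedFirst' must return a negative number when its first argument is preferred.
    // Assumption: it never returns 0 for two different items, because a TreeSet
    // treats a 0 result as "same element" and would silently drop the new item.
    static Map<String, Integer> rank(List<String> items, Comparator<String> moreLikedFirst) {
        TreeSet<String> tree = new TreeSet<>(moreLikedFirst);
        for (String item : items) {
            tree.add(item); // each insertion walks one root-to-leaf path of the balanced tree
        }
        Map<String, Integer> rankByItem = new LinkedHashMap<>();
        int next = 1;
        for (String item : tree) { // in-order traversal: most liked first
            rankByItem.put(item, next++);
        }
        return rankByItem;
    }
}
```

In an interactive program the comparator would print both names and read the answer; for a quick check it can be any fixed ordering.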

The simplest way is to use Collections.sort with a custom comparator that asks for user input.
Scanner sc = new Scanner(System.in);
Collections.sort(fruits, (a, b) -> {
System.out.println("Do you prefer " + a + " or " + b + "?");
String preference = sc.next();
return preference.equals(a) ? -1 : preference.equals(b) ? 1 : 0;
});

The accepted answer is fine, but consider also this alternative:
You could store the fruits in a max-heap, which can be done with an ArrayList (when you don't know the number of fruits beforehand) or a plain array.
When you insert a new fruit, it is appended at the end of the array, and as you let it sift up (according to the heap algorithm), you ask the user for the result of the comparison(s), until the fruit is considered less liked than the one compared with (or it becomes the root -- at index 0).
As post processing, you need to pull out all elements from the top of the heap, again using the appropriate heap algorithm. They will come out in sorted order.
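The heap variant could look roughly like this. It is only a sketch: it relies on java.util.PriorityQueue (a binary heap backed by an array) rather than a hand-written heap, and the names HeapRanker, sortByPreference and moreLikedFirst are made up for illustration. The comparator is again a stand-in for asking the user.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of the answer above.
final class HeapRanker {
    // 'moreLikedFirst' returns a negative number when its first argument is preferred,
    // so the most liked item ends up at the head of the heap.
    static List<String> sortByPreference(List<String> items, Comparator<String> moreLikedFirst) {
        PriorityQueue<String> heap = new PriorityQueue<>(moreLikedFirst);
        for (String item : items) {
            heap.add(item); // appended at the end, then sifted up
        }
        List<String> sorted = new ArrayList<>(items.size());
        while (!heap.isEmpty()) {
            sorted.add(heap.poll()); // head removed, heap repaired by sifting down
        }
        return sorted;
    }
}
```

Note that pulling the elements out also calls the comparator, so the user is asked further questions during that phase.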

Related

Find super-string in a set of strings

I have a list of strings, like:
cargo
cargo pants
cargo pants men buy
cargo pants men
cargo pants men melbourne buy
In this, the string that contains all remaining strings is cargo pants men melbourne buy. I'd like to remove all the shorter strings and preserve only the longest "super string".
Note, if 2 queries cargo pants and cargo shorts exist, they will be treated as 2 different queries and won't be combined.
So far, I've been doing this the brute force way - pick a string from set and walk through the same set deleting all other strings that are "substrings" of the current string. Roughly,
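for (Iterator<String> it = big_set.iterator(); it.hasNext(); ) {
String p = it.next();
for (String q : big_set) {
if (!p.equals(q) && has_all_words(p, q)) { /* all words in 'p' are also in 'q' */
it.remove(); /* remove through the iterator to avoid a ConcurrentModificationException */
break;
}
}
}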
Is there an intelligent algorithm to do this in less than O(n^2) time? In this function, has_all_words will preserve the order of words while comparing.
For the curious, I have a massive list of a few billion search queries (like the ones sent to Google/Yahoo/Bing) and I'm trying to find hypernyms for these queries. There's a server that parses this string and produces various interesting categories. I am trying to compress the queries list in the hopes of minimizing compute cost and bandwidth. This method surely reduces bandwidth significantly (because humans can't just think of buy cargo pants melbourne in one go), but the pre-computation cost is prohibitive. And so I've been hunting for algorithms that can do this, but I haven't come across anything that does this yet.
I think what you want is to remove every string that can be found inside a super string. In a case like ["foo bar", "foo baz"] you will have to keep both strings.
If my guess is right, then yes, you can achieve it in less than O(n^2).
Before starting, sort the words of each string alphabetically, so that cases like cargo pants and pants cargo men buy are still matched.
First, sort your strings in decreasing order of their lengths.
Then pick the sub strings of the longest string (we iterate from the first index and the list is sorted in reverse order) and search for them in the rest of the strings.
If a string is found, remove it. Once searching and removing completes, iterate again with the next sub string of the same super string, with the last sub string included.
In the end you will be left with only the strings that are unique (if you consider ["foo bar", "foo baz"] as unique strings).
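A simplified sketch of the "longest first" part of this answer. It keeps the word order, as the question requires, so the alphabetical pre-sort is left out, and it compares each string only against the super strings kept so far instead of enumerating sub strings. That is still quadratic in the worst case (when almost nothing is removed). The names SuperStrings, hasAllWords and keepSuperStrings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the answer above.
final class SuperStrings {
    // True if every word of 'small' occurs in 'big' in the same relative order.
    static boolean hasAllWords(String[] small, String[] big) {
        int matched = 0;
        for (String word : big) {
            if (matched < small.length && small[matched].equals(word)) {
                matched++;
            }
        }
        return matched == small.length;
    }

    static List<String> keepSuperStrings(List<String> queries) {
        List<String[]> byLength = new ArrayList<>();
        for (String query : queries) {
            byLength.add(query.trim().split("\\s+"));
        }
        // Longest (most words) first, so a super string is always seen before its sub strings.
        byLength.sort(Comparator.comparingInt((String[] words) -> words.length).reversed());

        List<String[]> kept = new ArrayList<>();
        for (String[] candidate : byLength) {
            boolean covered = false;
            for (String[] superString : kept) {
                if (hasAllWords(candidate, superString)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                kept.add(candidate);
            }
        }
        List<String> result = new ArrayList<>();
        for (String[] words : kept) {
            result.add(String.join(" ", words));
        }
        return result;
    }
}
```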

Can array contain text as a name of something and in same way number as a value?

I'm creating an app that shows you the name of a Google search, and you press a higher or lower button to answer. Right now I don't know exactly how to do it, so my idea is to make an array that holds the name as the question and the number of searches as the value. I searched on Google but nothing showed me the right way to do this. If it's not possible, can you explain a better way to do it?
You can use a Map<String, Integer> where the key is a question and the value is number of searches.
Map reference
Another option is to use 2 lists with synced indexes, one List will hold the questions and the second list for number of searches.
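As a rough illustration of the Map approach, here is a small sketch. The class SearchCounts, its methods and the sample numbers are invented for the example; a LinkedHashMap is used only so the questions keep their insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; the questions and counts are made-up sample data.
final class SearchCounts {
    static Map<String, Integer> sample() {
        Map<String, Integer> searches = new LinkedHashMap<>();
        searches.put("What is stack overflow?", 5000); // key = question, value = number of searches
        searches.put("bing.com", 10);
        searches.put("Does youtube own google?", 250);
        return searches;
    }

    // True if 'second' was searched more often than 'first' (the "higher" button is correct).
    // Assumes both questions are present in the map.
    static boolean isHigher(Map<String, Integer> searches, String first, String second) {
        return searches.get(second) > searches.get(first);
    }
}
```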
Correct me if I'm wrong, but what you are asking is how to have an array of both a string and number?
If I am correct in understanding what you are wanting to do, you can't do this with just one array (See ronginats' answer for a solution which is close to this). I would make two parallel arrays, one which stores the google search and the other storing the number, but both values share an index. So you could do something like the following:
String[] searchItems = {"What is stack overflow?", "bing.com", "Does youtube own google?"}; //All of the searches
long[] timesSearched = {5000, 10, 250}; //Times each have been searched
System.out.println(searchItems[0] + " has been searched " + timesSearched[0] + " times.");

Writing a simple text game in java. How to display items picked

I am trying to write the ("That's right, you brought: " + threeItems) part so that it specifies each item, and not just a copy of what the input is into the variable 'threeItems'
System.out.println("What items did you bring with again? (pick 3)");
System.out.println("(Crowbar, pistol, knife, key, flask, dynamite, flint, quill & parchment, devilishly good looks)");
String threeItems = s.nextLine();
At this point, I want to display the three items the user picks out without directly printing what the user wrote. I thought about doing an if else statement but the amount of items would cause that part of the code to become extremely nested, which I do not want.
System.out.println("That's right, you brought: " + threeItems);
Thank you everyone for helping me on this! stackoverflow is a great community and I am glad to be a part of it!
You could use a Map to structure your items where the key is the application-specified item name and value a simple boolean which marks an item as being in possession by your character. You would set the boolean flag when the user input is a match to an item in the map.
With this data structure you would then simply iterate through the Map and add any item names that are "in possession" to the threeItems output variable.
How you do the mappings is up to you. Your code currently seems to suggest that the user enter a comma-delimited list of items on a single line and then hits enter. This could work but is error prone and you would have to parse the input.
I would suggest a loop which would ask the user to enter a single item name at a time and provide a way to exit the loop.
Follow up @Krzysztof in response to comment below: I didn't realize you needed to keep the selected list ordered. If you were to stick with the Map then you could use ints instead of booleans where any non-zero value means the item is in possession and the value would be the order of the item. The issue with this solution is that the Map while useful in the earlier scenario now becomes a hindrance.
Now, instead of using a Map I would suggest using Lists or rather, ArrayLists. Others have suggested Lists and arrays but I'll recommend them in a more Java-esque way:
Create a static final ArrayList<String> and populate it with item objects (just the item names really) in a constructor or initialization block. (ArrayList tutorial here.) Create a similar ArrayList for items in possession, which will initially be empty. Populate that list as the user enters the inventory of items. The index of the item + 1 in the list is its place in possession. Don't get cute with 1st, 2nd, 3rd etc. Use KISS: Item 1: ... Item 2: ... etc.
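The two-list idea might be sketched like this. The class Inventory and its methods are hypothetical, the item names are taken from the question's prompt, and matching is assumed to be case-insensitive on the full item name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the "known items" list plus an "in possession" list.
final class Inventory {
    static final List<String> KNOWN_ITEMS = new ArrayList<>(Arrays.asList(
            "crowbar", "pistol", "knife", "key", "flask", "dynamite", "flint",
            "quill & parchment", "devilishly good looks"));

    private final List<String> inPossession = new ArrayList<>();

    // Returns false for unknown items and for items that were already picked.
    boolean pick(String input) {
        String name = input.trim().toLowerCase(Locale.ROOT);
        if (!KNOWN_ITEMS.contains(name) || inPossession.contains(name)) {
            return false;
        }
        inPossession.add(name); // list order = order in which the items were picked
        return true;
    }

    // One line per item: "Item 1: ...", "Item 2: ...", and so on.
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < inPossession.size(); i++) {
            sb.append("Item ").append(i + 1).append(": ").append(inPossession.get(i)).append('\n');
        }
        return sb.toString();
    }
}
```

A game loop would call pick once per line of user input and print describe() after three successful picks.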
There are many possible options for this that really depend on how you want to work with the data.
The items could be an enum and then use something like an EnumSet to keep track of what you have in your inventory.
You could just store the items in a String array. Or a List of Strings (ArrayList or LinkedList), or a Set of Strings (HashSet, TreeSet).
You could have each item be its own class and object. Use annotations to specify which item in the inventory it is and then annotations (but this is an implementation detail). And then you could put the verbs available on the items.
The options are really limitless and depend quite a bit on how you want to have the program work with the items and the inventory. For this, you really do need to think through the problem and the design of the game a bit more.
It depends on how the user enters the information. If they input "knife, pistol, key" you would interpret that differently than "knife pistol key."
If you would like to do it that way, look into using the split function. So, if you wanted to split the first way, it would go something like this:
String[] items = threeItems.split(", ");
where items[0] would equal "knife", the first thing they wrote, items[1] would equal "pistol", etc.
However, it might be better to clarify how the user should enter the input so you don't need to deal with a bunch of edge cases on how the user might format their input. You could have a new line for each item, like this:
System.out.print("First Item: ");
String first = s.next();
System.out.println();
System.out.print("Second Item: ");
String second = s.next(); // each item needs its own variable name
System.out.println();
System.out.print("Third Item: ");
String third = s.next();
System.out.println();
The user enters an item on each line, and now you have their inputs in three separate variables. This reduces the variability in how the user may enter information and allows for a cleaner, less error-prone game.

Anagram Algorithm using a hashtable and/or tries

I have been searching the internet for a while now for steps to find all the anagrams of a string (word) (e.g. Team produces the word tame) using a hashtable and a trie. All I have found here on SO is how to verify that 2 words are anagrams. I would like to take it a step further and find an algorithm in English so that I can program it in Java.
For example,
Loop through all the characters.
For each unique character insert into the hashtable.
and so forth.
I don't want a complete program. Yes, I am practicing for an interview. If this question comes up then I will know it and know how to explain it not just memorize it.
the most succinct answer due to some guy quoted in the "programming pearls" book is (paraphrasing):
"sort it this way (waves hand horizontally left to right), and then that way (waves hand vertically top to bottom)"
this means, starting from a one-column table (word), create a two column table: (sorted_word, word), then sort it on the first column.
now to find anagrams of a word, first compute sorted word and do a binary search for its first occurrence in the first column of the table, and read off the second column values while the first column is the same.
input (does not need to be sorted):
mate
tame
mote
team
tome
sorted "this way" (horizontally):
aemt, mate
aemt, tame
emot, mote
aemt, team
emot, tome
sorted "that way" (vertically):
aemt, mate
aemt, tame
aemt, team
emot, mote
emot, tome
lookup "team" -> "aemt"
aemt, mate
aemt, tame
aemt, team
As far as hashtables/tries they only come into the picture if you want a slightly speedier lookup. Using hash tables you can partition the 2-column vertically sorted table into k-partitions based on the hash of the first column. this will give you a constant factor speedup because you have to do a binary search only within one partition. tries are a different way of optimizing by helping you avoid doing too many string comparisons, you hang off the index of the first row for the appropriate section of the table for each terminal in the trie.
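The two-column table with a binary search for the first matching row could be sketched as follows. The class AnagramTable and its methods are hypothetical names, and lower-casing the input is an extra assumption not mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the (sorted_word, word) table.
final class AnagramTable {
    private final String[][] rows; // each row is {sortedWord, word}, ordered by sortedWord

    AnagramTable(List<String> words) {
        rows = new String[words.size()][];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = new String[] { signature(words.get(i)), words.get(i) };
        }
        // Sort "that way": by the first column. Arrays.sort is stable for object arrays.
        Arrays.sort(rows, (a, b) -> a[0].compareTo(b[0]));
    }

    // Sort "this way": the letters of one word.
    static String signature(String word) {
        char[] letters = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    List<String> anagramsOf(String word) {
        String key = signature(word);
        int lo = 0;
        int hi = rows.length;
        while (lo < hi) { // binary search for the first row whose first column is >= key
            int mid = (lo + hi) >>> 1;
            if (rows[mid][0].compareTo(key) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<String> found = new ArrayList<>();
        for (int i = lo; i < rows.length && rows[i][0].equals(key); i++) {
            found.add(rows[i][1]); // read off the second column while the first stays the same
        }
        return found;
    }
}
```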
Hash tables are not the best solution, so I doubt you would be required to use them.
The simplest approach to finding anagram pairs (that I know of) is as follows:
Map characters as follows:
a -> 2
b -> 3
c -> 5
d -> 7
and so on, such that letters a..z are mapped to the first 26 primes.
Multiply the character values for each character in the word; let's call the result the "anagram number". It's pretty easy to see TEAM and TAME will produce the same number. Indeed, by unique prime factorization, the anagram numbers of two words are the same if and only if they are anagrams (provided the product does not overflow the integer type you use).
Thus the problem of finding anagrams between the two lists reduces to finding anagram numbers that appear on both lists. This is easily done by sorting each list by anagram number and stepping through to find common values, in O(n log n) time.
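A small sketch of the anagram number. It uses BigInteger so that long words cannot overflow, which is a choice made here and not something the answer specifies; the class name AnagramNumber is hypothetical and only the letters a..z are supported.

```java
import java.math.BigInteger;
import java.util.Locale;

// Hypothetical helper: maps a..z to the first 26 primes and multiplies them.
final class AnagramNumber {
    private static final int[] PRIMES = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };

    static BigInteger of(String word) {
        BigInteger product = BigInteger.ONE;
        for (char ch : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch < 'a' || ch > 'z') {
                throw new IllegalArgumentException("only letters a-z are supported: " + word);
            }
            product = product.multiply(BigInteger.valueOf(PRIMES[ch - 'a']));
        }
        return product;
    }
}
```

Two words are anagrams exactly when AnagramNumber.of returns equal values for them.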
String to char[]
sort it char[]
generate String from sorted char[]
use it as key to HashMap<String, List<String>>
insert current original String to list of values associated
for example for
car, acr, rca, abc it would have
acr: car, acr, rca
abc: abc
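The steps above can be sketched as follows; the class name AnagramGroups and the method group are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that groups words by their sorted letters.
final class AnagramGroups {
    static Map<String, List<String>> group(List<String> words) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String word : words) {
            char[] letters = word.toCharArray();   // String to char[]
            Arrays.sort(letters);                  // sort the char[]
            String key = new String(letters);      // sorted letters become the map key
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(word);
        }
        return groups;
    }
}
```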

Extracting a given number of the highest values in a List

I'm seeking to display a fixed number of items on a web page according to their respective weight (represented by an Integer). The List where these items are found can be of virtually any size.
The first solution that comes to mind is to do a Collections.sort() and to get the items one by one by going through the List. Is there a more elegant solution though that could be used to prepare, say, the top eight items?
Just go for Collections.sort(..). It is efficient enough.
This algorithm offers guaranteed n log(n) performance.
You can try to implement something more efficient for your concrete case if you know some distinctive properties of your list, but that would not be justified. Furthermore, if your list comes from a database, for example, you can LIMIT it & order it there instead of in code.
Your options:
Do a linear search, maintaining the top N weights found along the way. This should be quicker than sorting a lengthy list if, for some reason, you can't reuse the sorting results between displays of the page (e.g. the list is changing quickly).
UPDATE: I stand corrected on the linear search necessarily being better than sorting. See Wikipedia article "Selection_algorithm - Selecting k smallest or largest elements" for better selection algorithms.
Manually maintain a List (the original one or a parallel one) sorted in weight order. You can use methods like Collections.binarySearch() to determine where to insert each new item.
Maintain a List (the original one or a parallel one) sorted in weight order by calling Collections.sort() after each modification, batch modifications, or just before display (possibly maintaining a modification flag to avoid sorting an already sorted list).
Use a data structure that maintains sorted weight-order for you: priority queue, tree set, etc. You could also create your own data structure.
Manually maintain a second (possibly weight-ordered) data structure of the top N items. This data structure is updated anytime the original data structure is modified. You could create your own data structure to wrap the original list and this "top N cache" together.
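For the option of keeping a list sorted by hand, the insertion step might look like this. SortedWeights and insertSorted are hypothetical names, and the list is assumed to be in ascending order already.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper for keeping a weight list in ascending order.
final class SortedWeights {
    static void insertSorted(List<Integer> sortedAscending, int weight) {
        int pos = Collections.binarySearch(sortedAscending, weight);
        if (pos < 0) {
            pos = -(pos + 1); // not found: binarySearch returns -(insertion point) - 1
        }
        sortedAscending.add(pos, weight);
    }
}
```

With the list kept in ascending order, the top N items are simply the last N elements. The search is O(log n), but inserting into an ArrayList still shifts elements, so each insertion is O(n) overall.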
You could use a max-heap.
If your data originates from a database, put an index on that column and use ORDER BY and TOP or LIMIT to fetch only the records you need to display.
Or a priority queue.
using dollar:
List<Integer> topTen = $(list).sort().slice(10).toList();
without using dollar you should sort() it using Collections.sort() (in descending order, e.g. with Collections.reverseOrder(), if you want the highest values first), then get the first n items using list.subList(0, n).
Since you say the list of items from which to extract these top N may be of any size, and so may be large I assume, I'd augment the simple sort() answers above (which are entirely appropriate for reasonably-sized input) by suggesting most of the work here is finding the top N -- then sorting those N is trivial. That is:
Queue<Integer> topN = new PriorityQueue<Integer>(n);
for (Integer item : input) {
if (topN.size() < n) {
topN.add(item);
} else if (item > topN.peek()) {
topN.add(item);
topN.poll();
}
}
List<Integer> result = new ArrayList<Integer>(n);
result.addAll(topN);
Collections.sort(result, Collections.reverseOrder());
The heap here (a min-heap) is at least bounded in size. There's no real need to make a heap out of all your items.
No, not really. At least not using Java's built-in methods.
There are clever ways to get the highest (or lowest) N number of items from a list quicker than an O(n*log(n)) operation, but that will require you to code this solution by hand. If the number of items stays relatively low (not more than a couple of hundred), sorting it using Collections.sort() and then grabbing the top N numbers is the way to go IMO.
Depends on how many. Let's define n as the total number of keys, and m as the number you wish to display.
Sorting the entire thing: O(nlogn)
Scanning the array each time for the next highest number: O(n*m)
So the question is - What's the relation between n to m?
If m < log n, scanning will be more efficient.
Otherwise, m >= log n, which means sorting will be better. (For the edge case of m = log n it doesn't actually matter, but sorting will also give you the benefit of, well, sorting the array, which is always nice.)
If the size of the list is N, and the number of items to be retrieved is K, you need to call Heapify on the list, which converts the list (which has to be indexable, e.g. an array) into a priority queue. (See heapify function in http://en.wikipedia.org/wiki/Heapsort)
Retrieving an item on the top of the heap (the max item) takes O (lg N) time. So your overall time would be:
O(N + k lg N)
which is better than O (N lg N) assuming k is much smaller than N.
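A hand-written sketch of this heapify-then-extract approach on an int array, since java.util.PriorityQueue has no constructor that takes both a collection and a comparator. The names TopKByHeapify, topK and siftDown are hypothetical, and k is assumed to be non-negative.

```java
// Hypothetical helper: builds a max-heap in place, then removes the maximum k times.
final class TopKByHeapify {
    // Returns the k largest values in descending order. Reorders 'a' in place.
    static int[] topK(int[] a, int k) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n); // bottom-up heapify, O(N) in total
        }
        int count = Math.min(k, n);
        int[] top = new int[count];
        for (int j = 0; j < count; j++) {
            top[j] = a[0];     // current maximum sits at the root
            a[0] = a[n - 1];   // move the last leaf to the root
            n--;
            siftDown(a, 0, n); // restore the heap property, O(lg N)
        }
        return top;
    }

    private static void siftDown(int[] a, int i, int n) {
        while (true) {
            int largest = i;
            int left = 2 * i + 1;
            int right = left + 1;
            if (left < n && a[left] > a[largest]) largest = left;
            if (right < n && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }
}
```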
If keeping a sorted array or using a different data structure is not an option, you could try something like the following. The O time is similar to sorting the large array but in practice this should be more efficient.
small_array = big_array.slice( number_of_items_to_find );
small_array.sort();
least_found_value = small_array.get(0).value;
for ( item in big_array ) { // needs to skip first few items
if ( item.value > least_found_value ) {
small_array.remove(0);
small_array.insert_sorted(item);
least_found_value = small_array.get(0).value;
}
}
small_array could be an Object[] and the inner loop could be done with swapping instead of actually removing and inserting into an array.
