Extracting top recommendations from multiple recommendation lists - java

I have four lists of recommendations; let's say the lists are A, B, C and D.
Every list has the same number of items, and each item is represented as a key-value pair. I need to give more priority (weight) to the elements of list A than to those of list B, and so on. Ultimately I need to select the best set of items from the four lists for the final recommendation.
Here is a use case:
List_A:
{item1,weight1}
{item2,weight1}
{item3,weight1}
{item4,weight1}
{item5,weight1}
List_B:
{item8,weight2}
{item5,weight2}
{item7,weight2}
{item2,weight2}
{item6,weight2}
List_C:
{item11,weight3}
{item23,weight3}
{item34,weight3}
{item24,weight3}
{item5,weight3}
List_D:
{item9,weight4}
{item7,weight4}
{item3,weight4}
{item2,weight4}
{item5,weight4}
Suppose weight1=10, weight2=5, weight3=3, weight4=2.
According to these lists the final list should have "item5" as the first item, because it exists in all four lists. How can I get the other best recommendations for these four lists?
Thanks.

If I understand you correctly, this should be fairly simple. At a high level you need a data structure such as
Map<Item, Map<List, Integer>>, where the final Integer is number_of_occurrences. Once you have it, it is straightforward to multiply number_of_occurrences * weight for each list and put the item, along with the resulting value, into a TreeMap (a priority queue can also be used here).
You can then read the top-n list from the TreeMap.
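A minimal sketch of that idea, simplified so that each item's score is just the sum of the weights of the lists it appears in (each item occurs at most once per list in the question). The class and method names are made up for illustration, and a sorted list stands in for the TreeMap so that items with equal scores are not collapsed into one entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answer.
class WeightedRecommendations {

    // Adds the weight of list i to the score of every item found in list i.
    static Map<String, Integer> score(List<List<String>> lists, int[] weights) {
        Map<String, Integer> scores = new HashMap<>();
        for (int i = 0; i < lists.size(); i++) {
            for (String item : lists.get(i)) {
                scores.merge(item, weights[i], Integer::sum);
            }
        }
        return scores;
    }

    // Returns the n best items, highest score first; ties are broken by item name.
    static List<String> topN(Map<String, Integer> scores, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(scores.entrySet());
        entries.sort((a, b) -> {
            int byScore = Integer.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("item1", "item2", "item3", "item4", "item5"),
                Arrays.asList("item8", "item5", "item7", "item2", "item6"),
                Arrays.asList("item11", "item23", "item34", "item24", "item5"),
                Arrays.asList("item9", "item7", "item3", "item2", "item5"));
        int[] weights = {10, 5, 3, 2};
        System.out.println(topN(score(lists, weights), 3)); // [item5, item2, item3]
    }
}
```

With the weights from the question, item5 scores 20, item2 scores 17 and item3 scores 12, so item5 comes first as expected.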

Related

1 to 1 association of 2 string lists in java

I am a relatively new programmer and am working on my first project to build a portfolio. In my project I have 2 rather large lists of strings (about 3.1 million) and I need to "associate" the elements in each one with a 1 to 1 relationship from predetermined values (elements are selected according to a set method) not just linearly (from top to bottom). For example:
lista(0) = list1(5);
listb(0) = list2(2);
lista(1) = list1(1);
listb(1) = list2(4);
lista(2) = list1(3);
listb(2) = list2(1);
The point of this is to reorder the lists in a manner that can be recreated at a later time or by a different program by "remembering" a set of values. I am using 2 lists because I need to be able to search one list for a String then pull the value from the corresponding element in the other list.
I have tried many different methods like storing each list in an arrayList then accessing the elements in the preset order and storing them in new arrayLists in the new order, then removing the elements from the old arrayLists. This would be ideal but didn't work because removing elements from a really large arrayList was very slow. I figured that removing an element from the lists will prevent it from being used again.
I tried storing them in String arrays, then accessing each element with the predefined method, storing them in another array and nulling out the used elements so that they won't be used again. But the null gaps made searching a nightmare: whenever the program hit a null element during the predefined "move", I had to add null checks and then more movement, which made things more complicated and harder to reproduce later.
I need an easy, and efficient way to create these associations between these 2 lists and ANY ideas are welcome.
This is my first post to stackoverflow and I apologize if it's formatted improperly or confusing, but please be gentle.
If you need to pull one value from a given string, why not use a Map? The key is the element from the first list and the value is the corresponding element from the second list.
Use Map<String, String>, which stores the key as a String and the value as a String. The best part is that, with a HashMap, removing an element takes O(1) time on average.
As mentioned before, Map is an option, more specifically HashMap; another option could be Hashtable. Make sure you look at what each has to offer. Some major differences: HashMap allows null keys and values but is not synchronized, whereas Hashtable is synchronized and accepts neither null keys nor null values.
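As a rough sketch of the Map suggestion: the two index arrays below play the role of the "remembered" values from the question, and the helper name `associate` is invented for this example. It assumes the selected elements of the first list are unique, since a duplicate key would overwrite the earlier pair.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answers.
class PairedLookup {

    // Pairs list1[order1[i]] with list2[order2[i]] for every i.
    // LinkedHashMap keeps the pairs in the order they were created.
    static Map<String, String> associate(List<String> list1, int[] order1,
                                         List<String> list2, int[] order2) {
        if (order1.length != order2.length) {
            throw new IllegalArgumentException("index arrays must have the same length");
        }
        Map<String, String> pairs = new LinkedHashMap<>();
        for (int i = 0; i < order1.length; i++) {
            pairs.put(list1.get(order1[i]), list2.get(order2[i]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("a0", "a1", "a2", "a3", "a4", "a5");
        List<String> list2 = Arrays.asList("b0", "b1", "b2", "b3", "b4", "b5");
        // Same index pattern as the question: list1(5)/list2(2), list1(1)/list2(4), list1(3)/list2(1)
        Map<String, String> pairs = associate(list1, new int[]{5, 1, 3}, list2, new int[]{2, 4, 1});
        System.out.println(pairs.get("a5")); // b2
    }
}
```

Nothing has to be removed from the source lists, so the slow ArrayList removals and the null gaps described in the question do not come up.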

which data structure to choose Guava library

Here's my problem: I have two data structures. One is countMark = new HashMap<Pair<String, String>, Integer>(); and the other, which is its inverse, is orderMark = new TreeMap<Integer, List<Pair<String, String>>>();. I use the second one to quickly find the maximum value and then select a pair according to some rules.
But in my code I need to call orderMark.containsKey(counter), and that's not very efficient. As I increment the counter I also need to delete the specific pair, so I have to do orderMark.get(count - 1).remove(key);.
My question: I found that I could use Multiset and Multimap from the Guava library, but I couldn't find the complexity of add, contains, remove and get for these data structures. I would also need a sorted map in order to select the pair that has the maximum value.
I hope that was sufficiently clear, and thank you in advance for your answers.

which datastructure for this hashmap scenario

I have a scenario where I store values in a HashMap.
Keys are strings like
fruits
fruits_citrus_orange
fruits_citrus_lemon
fruits_fleshly_apple
fruits_fleshly
fruits_dry
and so on.
Values are some objects. Now, for a given input, say fruits_fleshly, I need to retrieve all entries whose key starts with "fruits_fleshly".
In the above case I need to fetch
fruits_fleshly_apple
fruits_fleshly
One way to do this is by calling String.indexOf on all the keys. Is there a more efficient way to do this than iterating over all the keys in the map?
Though these are strings, to me they look like categories and sub-categories, such as fruit, fruit-freshly, fruit-citrus, etc.
If that is the case, you can instead implement a tree data structure. This would be the most effective for search operations.
Since a tree has a parent-child structure, there is a root node and child nodes. You can have a structure like this:
(0)   (1)         (2)
fruit
|_____citrus
|     |_____lemon
|     |_____orange
|
|_____freshly
      |_____apple
      |_____
In this structure, if you want to search for citrus fruits, you can go to the citrus node and list all its children. Finally, you can construct each full name by concatenating the names along the path from the root to the leaf.
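A small sketch of such a tree, assuming keys are split on underscores as in the question. The class `CategoryTree` and its methods are invented for illustration, and it stores only the keys; a value field could be added to each node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical category tree, not from the original answer.
class CategoryTree {

    private static final class Node {
        final Map<String, Node> children = new TreeMap<>();
        boolean isKey; // true if a complete key ends at this node
    }

    private final Node root = new Node();

    // Inserts a key such as "fruits_citrus_orange", one node per segment.
    void add(String key) {
        Node node = root;
        for (String part : key.split("_")) {
            node = node.children.computeIfAbsent(part, p -> new Node());
        }
        node.isKey = true;
    }

    // Returns every stored key at or below the given path, e.g. "fruits_fleshly".
    List<String> keysUnder(String prefix) {
        Node node = root;
        for (String part : prefix.split("_")) {
            node = node.children.get(part);
            if (node == null) {
                return new ArrayList<>();
            }
        }
        List<String> out = new ArrayList<>();
        collect(node, prefix, out);
        return out;
    }

    // Depth-first walk that rebuilds each full name from the path.
    private static void collect(Node node, String path, List<String> out) {
        if (node.isKey) {
            out.add(path);
        }
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            collect(e.getValue(), path + "_" + e.getKey(), out);
        }
    }

    public static void main(String[] args) {
        CategoryTree tree = new CategoryTree();
        for (String key : new String[]{"fruits", "fruits_citrus_orange", "fruits_citrus_lemon",
                "fruits_fleshly_apple", "fruits_fleshly", "fruits_dry"}) {
            tree.add(key);
        }
        System.out.println(tree.keysUnder("fruits_fleshly")); // [fruits_fleshly, fruits_fleshly_apple]
    }
}
```

Note that this matches whole segments only: a lookup for "fruits_fle" finds nothing, which differs from a plain startsWith check.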
Iterating over the map is a simple and straightforward way of doing this. However, if you don't want to iterate over the keys yourself and are OK with using a third-party library, you can use Guava's Maps#filterKeys.
Here's how it would work:
Map<String, Object> filtered = Maps.filterKeys(
yourMap,
Predicates.containsPattern("^fruits_fleshly"));
But that also iterates over the map behind the scenes, so the iteration is still there if efficiency is your concern.
Since HashMap doesn't maintain any order for its keys it's not a very good choice for this problem. A better choice is the TreeMap: it has methods for retrieving a sub map for a range of keys. These methods run in O(log n) time (n number of entries) so it's better than iterating over the keys.
Map subMap = myMap.subMap("fruits_fleshly", true, "fruits_fleshly\uffff", true);
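A self-contained sketch of that range query, using only java.util. The helper name `withPrefix` is made up here. Keep in mind that O(log n) covers locating the range; walking the returned view still costs time proportional to the number of matches.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper, not from the original answer.
class PrefixLookup {

    // Returns a view of all entries whose key starts with the given prefix.
    // The upper bound is the prefix followed by the largest char value, so every
    // key that extends the prefix with ordinary characters sorts below it.
    static <V> NavigableMap<String, V> withPrefix(NavigableMap<String, V> map, String prefix) {
        return map.subMap(prefix, true, prefix + Character.MAX_VALUE, true);
    }

    public static void main(String[] args) {
        NavigableMap<String, Object> myMap = new TreeMap<>();
        for (String key : new String[]{"fruits", "fruits_citrus_orange", "fruits_citrus_lemon",
                "fruits_fleshly_apple", "fruits_fleshly", "fruits_dry"}) {
            myMap.put(key, new Object());
        }
        System.out.println(withPrefix(myMap, "fruits_fleshly").keySet());
        // [fruits_fleshly, fruits_fleshly_apple]
    }
}
```

This is a character-level prefix match, so a key like "fruits_fleshlyX" would also be returned.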
The nature of a hashmap means that there's no way to do a "like" comparison on keys - you have to iterate over them all to find where key.startsWith(input).
I suppose you could nest hashmaps and split up your keys. E.g.,
{
  "fruits":{
    "citrus":{
      "orange":(value),
      "lemon":(value)
    },
    "fleshly":{
      "apple":(value),
      "":(value)
    }
  }
}
...etc.
The performance implications are probably horrific on a small scale, though that may not matter in a homework context, and they may not be so bad if you're dealing with a lot of data and only a couple of layers of nesting.
Alternatively, create a Category object with a List of Categories (sub-categories) and a List of entries.
I believe a radix trie is what you are looking for. It is a similar idea to @ay89's solution.
You can use this open-source Radix Trie library. It performs better than O(log(N)): with a decent radix trie implementation, you will be able to find the hashmap assigned to a key in average constant time with respect to the number of entries (the cost depends on the number of underscores in your search key string).
fruits
fruits_citrus_orange
fruits_citrus_lemon
fruits_fleshly_apple
fruits_fleshly
fruits_dry
Trie<String, Map> trie = new PatriciaTrie<>();
trie.put("fruits", hashmap1);
trie.put("fruits_citrus_orange", hashmap2);
trie.put("fruits_citrus_lemon", hashmap3);
trie.put("fruits_fleshly_apple", hashmap4);
trie.put("fruits_fleshly", hashmap5);
Map.Entry<String, Map> entry = trie.select("fruits_fleshly");
If you just want one hashmap to be returned by select, you might be able to get slightly better performance if you implement your own radix trie.

convert multi key list into multidimensional sorted map

I am using multi key bags in order to count occurrences of certain combinations of values and I was wondering if there is an elegant way to convert these bags into nested SortedMaps like TreeMaps. The number of nested TreeMaps being equal to the number of components in the multi key. For instance, let's say I have a multi key bag which has a defined key:
multiKey = new String[]{"age", "height", "gender"}
thus, the object I would like to obtain from it would be:
TreeMap<Integer, TreeMap<Integer, TreeMap<Integer, Integer>>>
and populate it with the values from the multi key bag. So, the nested structure would contain the values from the multi key like this:
TreeMap<"age", TreeMap<"height", TreeMap<"gender", count>>>
where "age" is replaced by the corresponding value from the bag, "height" as well and so on.. count is the number of occurrences of that particular combination (which is returned by the multi key bag itself).
Of course, the number of components of the multi key is dynamic.
If the multiKey would have only two components, then the resulting object would be:
TreeMap<Integer, TreeMap<Integer, Integer>>
Retrieving the values from the bag and populating the (nested) TreeMaps does not represent an issue. Only the conversion. Any help is appreciated.
Instead of using a bunch of wrappers, why don't you just create your own class that groups related data together? It seems like this would very much simplify the process.
Nonetheless, if what you actually want is to be able to perform complex queries on your data (pseudo-code):
SELECT ALL (MALES >= 25 && HEIGHT < 6'1) && (FEMALES < 40 && HEIGHT > 5'0)
Then you should probably look into using a database. I'm not saying that a Tree is bad, but if your goal is to be able to easily/quickly perform complex queries, then a database is the way to go. Of course, you could write your own classes/methods to perform these calculations for you, but why reinvent the wheel if you don't have to?
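If you do go the "own class" route, one possible shape is sketched below. Java generics cannot express a nesting depth chosen at runtime, so this hypothetical `NestedCounts` class keeps the inner levels as TreeMap<Integer, Object> and hides the casts; the names and the int-array key format are assumptions, not something taken from the multi key bag API.

```java
import java.util.TreeMap;

// Hypothetical wrapper around nested TreeMaps with a depth chosen at runtime.
class NestedCounts {

    private final int depth;
    private final TreeMap<Integer, Object> root = new TreeMap<>();

    NestedCounts(int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be at least 1");
        }
        this.depth = depth;
    }

    // Stores the count under key[0] -> key[1] -> ... -> key[depth - 1].
    @SuppressWarnings("unchecked")
    void put(int[] key, int count) {
        checkLength(key);
        TreeMap<Integer, Object> level = root;
        for (int i = 0; i < depth - 1; i++) {
            // Inner levels are TreeMaps; only the last level holds Integer counts.
            level = (TreeMap<Integer, Object>) level.computeIfAbsent(
                    key[i], k -> new TreeMap<Integer, Object>());
        }
        level.put(key[depth - 1], count);
    }

    // Returns the stored count, or null if the combination was never added.
    @SuppressWarnings("unchecked")
    Integer get(int... key) {
        checkLength(key);
        TreeMap<Integer, Object> level = root;
        for (int i = 0; i < depth - 1; i++) {
            Object next = level.get(key[i]);
            if (next == null) {
                return null;
            }
            level = (TreeMap<Integer, Object>) next;
        }
        return (Integer) level.get(key[depth - 1]);
    }

    private void checkLength(int[] key) {
        if (key.length != depth) {
            throw new IllegalArgumentException("expected " + depth + " key components");
        }
    }

    public static void main(String[] args) {
        NestedCounts counts = new NestedCounts(3); // age, height, gender
        counts.put(new int[]{30, 180, 1}, 4);
        System.out.println(counts.get(30, 180, 1)); // 4
    }
}
```

Populating it is then one put call per distinct multi key in the bag, with the bag's count for that key as the value.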

nth item of hashmap

HashMap selections = new HashMap<Integer, Float>();
How can I get the Integer key of the 3rd smallest Float value in the whole HashMap?
Edit
I'm using the HashMap for this:
for (InflatedRunner runner : prices.getRunners()) {
for (InflatedMarketPrices.InflatedPrice price : runner.getLayPrices()) {
if (price.getDepth() == 1) {
selections.put(new Integer(runner.getSelectionId()), new Float(price.getPrice()));
}
}
}
I need the runner with the 3rd smallest price at depth 1.
Maybe I should implement this in another way?
Michael Mrozek nails it with his question of whether you're using HashMap right: this is a highly atypical scenario for HashMap. That said, you can do something like this:
get the Set<Map.Entry<K,V>> from the HashMap<K,V>.entrySet().
addAll to List<Map.Entry<K,V>>
Collections.sort the list with a custom Comparator<Map.Entry<K,V>> that sorts based on V.
If you need only the 3rd Map.Entry<K,V>, then an O(N) selection algorithm may suffice.
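The sort-based steps above can be sketched roughly as follows. The helper name `keyOfNthSmallest` and the sample selection IDs are invented for the example; if several entries share a value, which of them lands in position n depends on the map's iteration order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answer.
class NthByValue {

    // Returns the key whose value is the n-th smallest (n is 1-based).
    static <K, V extends Comparable<? super V>> K keyOfNthSmallest(Map<K, V> map, int n) {
        if (n < 1 || n > map.size()) {
            throw new IllegalArgumentException("n must be between 1 and " + map.size());
        }
        // Copy the entries into a list and sort them by value.
        List<Map.Entry<K, V>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.<K, V>comparingByValue());
        return entries.get(n - 1).getKey();
    }

    public static void main(String[] args) {
        Map<Integer, Float> selections = new HashMap<>();
        selections.put(101, 4.5f);
        selections.put(102, 2.0f);
        selections.put(103, 3.2f);
        selections.put(104, 7.0f);
        selections.put(105, 2.8f);
        System.out.println(keyOfNthSmallest(selections, 3)); // 103
    }
}
```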
//after edit
It looks like selections should really be a SortedMap<Float, InflatedRunner>. You should look at java.util.TreeMap.
Here's an example of how TreeMap can be used to get the 3rd lowest key:
TreeMap<Integer,String> map = new TreeMap<Integer,String>();
map.put(33, "Three");
map.put(44, "Four");
map.put(11, "One");
map.put(22, "Two");
int thirdKey = map.higherKey(map.higherKey(map.firstKey()));
System.out.println(thirdKey); // prints "33"
Also note how I take advantage of Java's auto-boxing/unboxing feature between int and Integer. I noticed that you used new Integer and new Float in your original code; this is unnecessary.
//another edit
It should be noted that if you have multiple InflatedRunner with the same price, only one will be kept. If this is a problem, and you want to keep all runners, then you can do one of a few things:
If you really need a multi-map (one key can map to multiple values), then you can:
have TreeMap<Float,Set<InflatedRunner>>
Use Multimap from Google Collections
If you don't need the map functionality, then just have a List<RunnerPricePair> (sorry, I'm not familiar with the domain to name it appropriately), where RunnerPricePair implements Comparable<RunnerPricePair> that compares on prices. You can just add all the pairs to the list, then either:
Collections.sort the list and get the 3rd pair
Use an O(N) selection algorithm
Are you sure you're using hashmaps right? They're used to quickly look up a value given a key; it's highly unusual to sort the values and then try to find a corresponding key. If anything, you should be mapping the float to the int, so you could at least sort the float keys and get the integer value of the third smallest that way.
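Here is a rough sketch of the TreeMap-of-collections option. Since InflatedRunner is not shown in the question, it stores plain integer selection IDs in a List instead of a Set of runners; the class name `PriceBuckets` is likewise made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical multi-map built on TreeMap, not from the original answer.
class PriceBuckets {

    // Prices in ascending order, each mapped to every selection ID at that price.
    private final TreeMap<Float, List<Integer>> byPrice = new TreeMap<>();

    void add(int selectionId, float price) {
        byPrice.computeIfAbsent(price, p -> new ArrayList<>()).add(selectionId);
    }

    // Returns the selection IDs at the n-th lowest distinct price (1-based),
    // or an empty list if there are fewer than n distinct prices.
    List<Integer> atNthLowestPrice(int n) {
        int position = 1;
        for (List<Integer> ids : byPrice.values()) {
            if (position++ == n) {
                return ids;
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        PriceBuckets buckets = new PriceBuckets();
        buckets.add(1, 2.0f);
        buckets.add(2, 3.5f);
        buckets.add(3, 2.0f); // same price as selection 1, both are kept
        buckets.add(4, 5.0f);
        System.out.println(buckets.atNthLowestPrice(1)); // [1, 3]
        System.out.println(buckets.atNthLowestPrice(3)); // [4]
    }
}
```

Here "3rd lowest" means the 3rd distinct price; if tied runners should each count as a separate position, the sorted-list approach is the better fit.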
You have to do it in steps:
Get the Collection<V> of values from the Map
Sort the values
Choose the index of the nth smallest
Think about how you want to handle ties.
You could do it with the google collections BiMap, assuming that the Floats are unique.
If you regularly need to get the key of the nth item, consider:
using a TreeMap, which efficiently keeps keys in sorted order
then using a double map (i.e. one TreeMap mapping integer -> float, the other mapping float -> integer)
You have to weigh the inelegance and potential risk of bugs from needing to maintain two maps against the scalability benefit of having a structure that efficiently keeps the keys in order.
You may need to think about two keys mapping to the same float...
P.S. Forgot to mention: if this is an occasional function, and you just need to find the nth largest item of a large number of items, you could consider implementing a selection algorithm (effectively, you do a sort, but don't actually bother sorting subparts of the list that you realise you don't need to sort because their order makes no difference to the position of the item you're looking for).
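For reference, a minimal quickselect sketch of such a selection algorithm, applied to the Integer-to-Float map from the question. It runs in O(N) time on average; with the simple last-element pivot used here the worst case is quadratic. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical quickselect helper, not from the original answers.
class QuickSelect {

    // Returns the key whose value is the n-th smallest (n is 1-based).
    static int keyOfNthSmallest(Map<Integer, Float> map, int n) {
        if (n < 1 || n > map.size()) {
            throw new IllegalArgumentException("n must be between 1 and " + map.size());
        }
        List<Map.Entry<Integer, Float>> a = new ArrayList<>(map.entrySet());
        int lo = 0, hi = a.size() - 1, target = n - 1;
        // Only the side of the partition that contains the target index is processed.
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == target) {
                break;
            } else if (p < target) {
                lo = p + 1;
            } else {
                hi = p - 1;
            }
        }
        return a.get(target).getKey();
    }

    // Lomuto partition around the last element; returns the pivot's final index.
    private static int partition(List<Map.Entry<Integer, Float>> a, int lo, int hi) {
        float pivot = a.get(hi).getValue();
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a.get(i).getValue() < pivot) {
                Collections.swap(a, i, store);
                store++;
            }
        }
        Collections.swap(a, store, hi);
        return store;
    }

    public static void main(String[] args) {
        Map<Integer, Float> selections = new HashMap<>();
        selections.put(101, 4.5f);
        selections.put(102, 2.0f);
        selections.put(103, 3.2f);
        selections.put(104, 7.0f);
        selections.put(105, 2.8f);
        System.out.println(keyOfNthSmallest(selections, 3)); // 103
    }
}
```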
