Optimize source code to decrease execution time - Java

I'm working on the following exercise from HackerRank: https://www.hackerrank.com/challenges/migratory-birds/problem?isFullScreen=false
I need to optimize my source code in order to pass the tests related to execution time.
This is my source code:
class Result {
/*
* Complete the 'migratoryBirds' function below.
*
* The function is expected to return an INTEGER.
* The function accepts INTEGER_ARRAY arr as parameter.
*/
public static int migratoryBirds(List<Integer> arr) {
// Write your code here
int coincidences = 0;
int maxValuesPerCategory = 0;
//I'm using TreeMap because sorting is the key in this exercise
Map<Integer, Integer> results = new TreeMap<>();
List<Integer> targetKeys = new ArrayList<>();
//1. classifying values by coincidences
for(Integer element: arr){
coincidences = Collections.frequency(arr, element);
results.put(element, coincidences);
}
/*
2. filtering categories by highest coincidences,
if there are more than 1, choose the label with the lowest value
example: 4=5; 3=5 ->output= 3
*/
//getting the value with most coincidences
maxValuesPerCategory = Collections.max(results.values());
//iterate the map to identify which keys have the maxvalue
Set<Integer> keySet = results.keySet();
for(Integer key : keySet){
if(results.get(key) == maxValuesPerCategory){
targetKeys.add(key);
}
}
//3. sorting the list ascending to obtain the lowest value
Collections.sort(targetKeys);
//get the first value (it should be the lowest label category)
return targetKeys.get(0);
}
}
I would like suggestions on how to optimize stages 2 and 3. From my point of view, the first stage is efficient in terms of execution, but if you have suggestions about it, please let me know.
Thanks a lot in advance.
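One observation on stage 1: Collections.frequency scans the whole list on every call, so calling it once per element makes that stage quadratic in the input size. Below is a minimal sketch of a single-pass alternative, assuming the same method signature as in the question. The class name MigratoryBirdsSketch is hypothetical, and this is only one possible approach, not necessarily the intended solution.

```java
import java.util.*;

class MigratoryBirdsSketch {
    // Counts each id once, then picks the id with the highest count,
    // breaking ties in favour of the smallest id.
    static int migratoryBirds(List<Integer> arr) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int id : arr) {
            counts.merge(id, 1, Integer::sum); // one pass over the input
        }
        int bestId = -1;   // returned as-is for an empty list (assumption)
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            int id = e.getKey();
            int count = e.getValue();
            if (count > bestCount || (count == bestCount && id < bestId)) {
                bestId = id;
                bestCount = count;
            }
        }
        return bestId;
    }
}
```

Because ties are resolved while scanning the counts, stages 2 and 3 collapse into one loop and no sorting is needed.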

Related

Should I sort a hashmap that contains frequency with bucketsort or heapsort?

I have a hashmap in Java in this form HashMap<String, Integer> frequency. The key is a string where I hold the name of a movie and the value is the frequency of the said movie.
My program takes input from users so whenever someone is adding a video to favorite I go in the hashmap and I increment its frequency.
Now the problem is that at some point I need to get the k most frequent movies. I've found that I could use bucketsort or heapsort in this leetcode problem (check the first comment), however I am not sure if it is more efficient in my case. My hashmap is constantly updated, so I would need to run the sorting algorithm again every time a frequency changes.
From my understanding, it takes O(N) time to build the map, where 'N' is the number of movies even with duplicates as it needs to add to the frequency, which gets me 'M' unique movie titles. Would that mean that heapsort will result in O(M * log(k)) and bucketsort O(M) for any given k?
Having a map that sorts on values (the thing you map to) isn't a thing, unfortunately. You could instead have a set whose keys sort themselves on frequency, but given that frequency is the key at that point, you couldn't look up entries in this set without knowing the frequency beforehand which eliminates the point of the exercise.
One strategy that comes to mind is to have 2 separate data structures. One serves to let you look up the actual object based on the name of the movie, the other is to be self-sorting:
@Data
public class MovieFrequencyTuple implements Comparable<MovieFrequencyTuple> {
@NonNull private final String name;
private int frequency;
public void incrementFrequency() {
frequency++;
}
@Override public int compareTo(MovieFrequencyTuple other) {
int c = Integer.compare(frequency, other.frequency);
if (c != 0) return -c;
return name.compareTo(other.name);
}
}
and with that available to you:
SortedSet<MovieFrequencyTuple> frequencies = new TreeSet<>();
Map<String, MovieFrequencyTuple> movies = new HashMap<>();
public int increment(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) {
tuple = new MovieFrequencyTuple(movieName);
movies.put(movieName, tuple);
}
// Self-sorting data structures will just fail
// to do the job if you modify a sorting order on
// an object already in the collection. Thus,
// we take it out, modify, put it back in.
frequencies.remove(tuple);
tuple.incrementFrequency();
frequencies.add(tuple);
return tuple.getFrequency();
}
public int get(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) return 0;
return tuple.getFrequency();
}
public List<String> getTop10() {
var out = new ArrayList<String>();
for (MovieFrequencyTuple tuple : frequencies) {
out.add(tuple.getName());
if (out.size() == 10) break;
}
return out;
}
Each operation is amortized O(1) or O(logn), even the top10 operation. So, if you run a million times 'increment a movie's frequency, then obtain the top 10', with n = # of times we do that, then the worst case scenario is O(nlogn) performance.
NB: Uses lombok for constructors, getters, etc - if you don't like that, have your IDE generate these things.
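For readers who prefer to avoid Lombok, the same idea can be sketched in plain Java. The names MovieCounter, Tally, and top(k) are hypothetical and chosen only for illustration; the structure mirrors the two-collection approach described above (a HashMap for lookup by name plus a TreeSet that stays sorted by frequency).

```java
import java.util.*;

class MovieCounter {
    // Plain-Java stand-in for the Lombok-generated tuple class.
    static final class Tally implements Comparable<Tally> {
        final String name;
        int frequency;

        Tally(String name) {
            this.name = Objects.requireNonNull(name);
        }

        @Override
        public int compareTo(Tally other) {
            // Higher frequency sorts first; ties are broken by name.
            int c = Integer.compare(other.frequency, frequency);
            return c != 0 ? c : name.compareTo(other.name);
        }
    }

    private final SortedSet<Tally> frequencies = new TreeSet<>();
    private final Map<String, Tally> movies = new HashMap<>();

    int increment(String movieName) {
        Tally tally = movies.computeIfAbsent(movieName, Tally::new);
        // Remove before mutating: changing the sort key of an element
        // that is still inside a TreeSet would corrupt its ordering.
        frequencies.remove(tally);
        tally.frequency++;
        frequencies.add(tally);
        return tally.frequency;
    }

    List<String> top(int k) {
        List<String> out = new ArrayList<>();
        for (Tally tally : frequencies) {
            if (out.size() == k) break;
            out.add(tally.name);
        }
        return out;
    }
}
```

Calling increment costs O(log n) per update, and top(k) only walks the first k entries of the sorted set.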

How to return all subsets of size k of a set using recursion?

I am completing a lab assignment in which I need to return all of the subsets of a certain size k using recursion. The function accepts the set S and this value of k. I am doing this in Java; I have already seen answers to this question, but they're in C and I am struggling to make the connections between the two languages.
I have already written a function to find the powerset of a given set S using recursion and understand how that code works (shown below). I am struggling the most with figuring out base and recursive cases for this problem, so I haven't really had any success in writing code that works. For the problem, we are not allowed to create a powerset from the function we already wrote and then choose the subsets of correct size; we must do it more efficiently.
public static Set<Set<String>> allSubsets(Set<String> s) {
Set<Set<String>> pSet = new HashSet<>();
Set<String> temp = new HashSet<>();
temp.addAll(s);
// base case
// if temp is empty set, add the empty set to the powerset
if (temp.isEmpty()) {
pSet.add(temp);
}
// recursive case
else {
Iterator<String> itr = temp.iterator();
String current = itr.next();
temp.remove(current);
Set<Set<String>> pSetTemp = allSubsets(temp);
for (Set<String> x : pSetTemp) {
pSet.add(x);
Set<String> copySubset = new HashSet<>();
copySubset.addAll(x);
copySubset.add(current);
pSet.add(copySubset);
}
}
return pSet;
}
Like I said, this code works, I just cannot solve the second part of the lab asking for a function that finds subsets of specific size k.
Here's an example in JavaScript that should give you some hints.
function f(S, k){
if (S.size == k)
return [S]
if (k == 0)
return [new Set]
let result = []
for (let e of S){
S.delete(e)
let smaller = f(new Set(S), k - 1)
for (let s of smaller){
s.add(e)
result.push(s)
}
}
return result
}
var S = new Set([1, 2, 3, 4, 5])
console.log(JSON.stringify(
f(S, 3).map(s => Array.from(s))))
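Since the question asks for Java, here is a hedged sketch of the same recursion in that language. It follows the shape of allSubsets from the question: remove one element, then combine the subsets of size k - 1 that will contain it with the subsets of size k that do not. The class and method names (KSubsets, subsetsOfSize) are hypothetical.

```java
import java.util.*;

class KSubsets {
    static Set<Set<String>> subsetsOfSize(Set<String> s, int k) {
        Set<Set<String>> result = new HashSet<>();
        // Base case: exactly one subset of size 0, the empty set.
        if (k == 0) {
            result.add(new HashSet<>());
            return result;
        }
        // Base case: not enough elements left (or an invalid k).
        if (k < 0 || s.size() < k) {
            return result;
        }
        // Work on a copy so the caller's set is left untouched.
        Set<String> rest = new HashSet<>(s);
        String current = rest.iterator().next();
        rest.remove(current);
        // Subsets that contain 'current': choose k - 1 from the rest.
        for (Set<String> smaller : subsetsOfSize(rest, k - 1)) {
            Set<String> withCurrent = new HashSet<>(smaller);
            withCurrent.add(current);
            result.add(withCurrent);
        }
        // Subsets that do not contain 'current': choose k from the rest.
        result.addAll(subsetsOfSize(rest, k));
        return result;
    }
}
```

The early return when s.size() < k is what avoids building the full powerset: branches that can no longer reach size k are cut off immediately.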

Genetic Algorithm using Roulette Wheel Selection

I'm trying to create different selection methods for a genetic algorithm I'm working on, but one problem I run into with all selection methods is that the fitness of each node must be different. This is a problem for me, as my fitness calculator is quite basic and will yield several identical fitness values.
public static Map<String, Double> calculateRouletteSelection(Map<String, Double> population) {
String[] keys = new String[population.size()];
Double[] values = new Double[population.size()];
Double[] unsortedValues = new Double[population.size()];
int index = 0;
for(Map.Entry<String, Double> mapEntry : population.entrySet()) {
keys[index] = mapEntry.getKey();
values[index] = mapEntry.getValue();
unsortedValues[index] = mapEntry.getValue();
index++;
}
Arrays.sort(values);
ArrayList<Integer> numbers = new ArrayList<>();
while(numbers.size() < values.length/2) {
int random = rnd.nextInt(values.length);
if (!numbers.contains(random)) {
numbers.add(random);
}
}
HashMap<String, Double> finalHashMap = new HashMap<>();
for(int i = 0; i<numbers.size(); i++) {
for(int j = 0; j<values.length; j++) {
if(values[numbers.get(i)] == unsortedValues[j]) {
finalHashMap.put(keys[j], unsortedValues[j]);
}
}
}
return finalHashMap;
}
90% of all my different selection methods are the same so I'm sure if I could solve it for one I can solve it for all.
Any help on what I'm doing wrong would be appreciated
EDIT: I saw I'm meant to post the general behavior of what's happening so essentially the method takes in a HashMap<>, sorts the values based on their fitness, picks half sorted values randomly and adds these to a new HashMap<> with their corresponding chromosomes.
I think you'd be much better off using collection classes.
List<Map.Entry<String, Double>> sorted = new ArrayList<>(population.entrySet());
// sort by fitness
Collections.sort(sorted, Comparator.comparing(Map.Entry::getValue));
Set<Integer> usedIndices = new HashSet<>(); // keep track of used indices
Map<String, Double> result = new HashMap<>();
while (result.size() < sorted.size()/2) {
int index = rnd.nextInt(sorted.size());
if (!usedIndices.add(index)) {
continue; // was already used
}
Map.Entry<String,Double> survivor = sorted.get(index);
result.put(survivor.getKey(), survivor.getValue());
}
return result;
But, as Sergey stated, I don't believe this is what you need for your algorithm; you do need to favor the individuals with higher fitness.
As mentioned in the comments, in roulette wheel selection order is not important, only weights are. A roulette wheel is like a pie chart with different sections occupying different portions of the disk, but in the end they all sum up to unit area (the area of the disk).
I'm not sure if there is an equivalent in Java, but in C++ you have std::discrete_distribution. It generates a distribution [0,n) which you initialise with weights representing the probability of each of those integers being picked. So what I normally do is have the IDs of my agents in an array and their corresponding fitness values in another array. Order is not important as long as indices match. I pass the array of fitness values to the discrete distribution, which returns an integer interpretable as an array index. I then use that integer to select the individual from the other array.
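As far as I know the JDK has no direct counterpart to std::discrete_distribution, but the same behaviour can be sketched with a cumulative sum over the weights. The class and method names (RouletteWheel, spin) are hypothetical, and the sketch assumes non-negative fitness values with a positive total.

```java
import java.util.*;

class RouletteWheel {
    // Returns an index i with probability weights[i] / sum(weights).
    // Assumes all weights are >= 0 and at least one is > 0.
    static int spin(double[] weights, Random rnd) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        // A point on the wheel, in [0, total).
        double r = rnd.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (r < cumulative) {
                return i;
            }
        }
        // Only reachable through floating-point rounding.
        return weights.length - 1;
    }
}
```

The returned integer is used as an index into the array of individuals, exactly as described above. Identical fitness values are not a problem here: equal weights simply get equal slices of the wheel.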

PairWise matching millions of records

I have an algorithmic problem at hand. To easily explain the problem, I will be using a simple analogy.
I have an input file
Country,Exports
Austrailia,Sheep
US, Apple
Austrialia,Beef
End Goal:
I have to find the common products between the pairs of countries so
{"Austrailia,New Zealand"}:{"apple","sheep"}
{"Austrialia,US"}:{"apple"}
{"New Zealand","US"}:{"apple","milk"}
Process :
I read in the input and store it in a TreeMap<String, List<String>>, where the strings in the List are interned due to many duplicates.
Essentially, I am aggregating by country.
where Key is country, Values are its Exports.
{"austrailia":{"apple","sheep","koalas"}}
{"new zealand":{"apple","sheep","milk"}}
{"US":{"apple","beef","milk"}}
I have about 1200 keys (countries) and total number of values(exports) is 80 million altogether.
I sort all the values of each key:
{"austrailia":{"apple","sheep","koalas"}} -- > {"austrailia":{"apple","koalas","sheep"}}
This is fast as there are only 1200 Lists to sort.
for(k1:keys)
for(k2:keys)
if(k1.compareTo(k2) <0){ //Dont want to double compare
List<String> intersectList = intersectList_func(k1's exports,k2's exports);
countriespair.put({k1,k2},intersectList)
}
This code block takes a very long time. I realise it is O(n^2), with around 1200*1200 comparisons, and it has been running for almost 3 hours so far.
Is there any way I can speed it up or optimise it?
Is a better algorithm the best option, or are there other technologies to consider?
Edit:
Since both lists are sorted beforehand, intersectList is O(n), where n is the length of floor(listOne.length,listTwo.length), and NOT O(n^2) as discussed below.
private static List<String> intersectList(List<String> listOne,List<String> listTwo){
int i=0,j=0;
List<String> listResult = new LinkedList<String>();
while(i!=listOne.size() && j!=listTwo.size()){
int compareVal = listOne.get(i).compareTo(listTwo.get(j));
if(compareVal==0){
listResult.add(listOne.get(i));
i++;j++;}
else if(compareVal < 0) i++;
else if (compareVal >0) j++;
}
return listResult;
}
Update 22 Nov
My current implementation is still running for almost 18 hours. :|
Update 25 Nov
I ran the new implementation as suggested by Vikram and a few others. It has been running since Friday.
My question is: how does grouping by exports rather than by country reduce the computational complexity? I find that the complexity is the same. As Groo mentioned, I find that the complexity of the second part is O(E*C^2), where E is the number of exports and C is the number of countries.
This can be done in one statement as a self-join using SQL:
Test data. First create a test data set:
Lines <- "Country,Exports
Austrailia,Sheep
Austrailia,Apple
New Zealand,Apple
New Zealand,Sheep
New Zealand,Milk
US,Apple
US,Milk
"
DF <- read.csv(text = Lines, as.is = TRUE)
sqldf. Now that we have DF, issue this command:
library(sqldf)
sqldf("select a.Country, b.Country, group_concat(Exports) Exports
from DF a, DF b using (Exports)
where a.Country < b.Country
group by a.Country, b.Country
")
giving this output:
Country Country Exports
1 Austrailia New Zealand Sheep,Apple
2 Austrailia US Apple
3 New Zealand US Apple,Milk
With index. If it's too slow, add an index to the Country column (and be sure not to forget the main. parts):
sqldf(c("create index idx on DF(Country)",
"select a.Country, b.Country, group_concat(Exports) Exports
from main.DF a, main.DF b using (Exports)
where a.Country < b.Country
group by a.Country, b.Country
"))
If you run out of memory, then add the dbname = tempfile() sqldf argument so that it uses disk.
Store something like the following data structure (the following is pseudocode):
ValuesSet ={
apple = {"Austrailia","New Zealand"..}
sheep = {"Austrailia","New Zealand"..}
}
for k in ValuesSet
for k1 in k.values()
for k2 in k.values()
if(k1<k2)
Set(k1,k2).add(k)
Time complexity: O(number of distinct pairs with similar products)
Note: I might be wrong, but I do not think you can reduce this time complexity.
The following is a Java implementation for your problem:
public class PairMatching {
HashMap Country;
ArrayList CountNames;
HashMap ProdtoIndex;
ArrayList ProdtoCount;
ArrayList ProdNames;
ArrayList[][] Pairs;
int products=0;
int countries=0;
public void readfile(String filename) {
try {
BufferedReader br = new BufferedReader(new FileReader(new File(filename)));
String line;
CountNames = new ArrayList();
Country = new HashMap<String,Integer>();
ProdtoIndex = new HashMap<String,Integer>();
ProdtoCount = new ArrayList<ArrayList>();
ProdNames = new ArrayList();
products = countries = 0;
while((line=br.readLine())!=null) {
String[] s = line.split(",");
s[0] = s[0].trim();
s[1] = s[1].trim();
int k;
if(!Country.containsKey(s[0])) {
CountNames.add(s[0]);
Country.put(s[0],countries);
k = countries;
countries++;
}
else {
k =(Integer) Country.get(s[0]);
}
if(!ProdtoIndex.containsKey(s[1])) {
ProdNames.add(s[1]);
ArrayList n = new ArrayList();
ProdtoIndex.put(s[1],products);
n.add(k);
ProdtoCount.add(n);
products++;
}
else {
int ind =(Integer)ProdtoIndex.get(s[1]);
ArrayList c =(ArrayList) ProdtoCount.get(ind);
c.add(k);
}
}
System.out.println(CountNames);
System.out.println(ProdtoCount);
System.out.println(ProdNames);
} catch (FileNotFoundException ex) {
Logger.getLogger(PairMatching.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(PairMatching.class.getName()).log(Level.SEVERE, null, ex);
}
}
void FindPairs() {
Pairs = new ArrayList[countries][countries];
for(int i=0;i<ProdNames.size();i++) {
ArrayList curr = (ArrayList)ProdtoCount.get(i);
for(int j=0;j<curr.size();j++) {
for(int k=j+1;k<curr.size();k++) {
int u =(Integer)curr.get(j);
int v = (Integer)curr.get(k);
//System.out.println(u+","+v);
if(Pairs[u][v]==null) {
if(Pairs[v][u]!=null)
Pairs[v][u].add(i);
else {
Pairs[u][v] = new ArrayList();
Pairs[u][v].add(i);
}
}
else Pairs[u][v].add(i);
}
}
}
for(int i=0;i<countries;i++) {
for(int j=0;j<countries;j++) {
if(Pairs[i][j]==null)
continue;
ArrayList a = Pairs[i][j];
System.out.print("\n{"+CountNames.get(i)+","+CountNames.get(j)+"} : ");
for(int k=0;k<a.size();k++) {
System.out.print(ProdNames.get((Integer)a.get(k))+" ");
}
}
}
}
public static void main(String[] args) {
PairMatching pm = new PairMatching();
pm.readfile("Input data/BigData.txt");
pm.FindPairs();
}
}
[Update] The algorithm presented here shouldn't improve time complexity compared to the OP's original algorithm. Both algorithms have the same asymptotic complexity, and iterating through sorted lists (as OP does) should generally perform better than using a hash table.
You need to group the items by product, not by country, in order to be able to quickly fetch all countries belonging to a certain product.
This would be the pseudocode:
inputList contains a list of pairs {country, product}
// group by product
prepare mapA (product) => (list_of_countries)
for each {country, product} in inputList
{
if mapA does not contain (product)
create a new empty (list_of_countries)
and add it to mapA with (product) as key
add this (country) to the (list_of_countries)
}
// now group by country_pair
prepare mapB (country_pair) => (list_of_products)
for each {product, list_of_countries} in mapA
{
for each pair {countryA, countryB} in list_of_countries
{
if mapB does not contain country_pair {countryA, countryB}
create a new empty (list_of_products)
and add it to mapB with country_pair {countryA, countryB} as key
add this (product) to the (list_of_products)
}
}
If your input list is length N, and you have C distinct countries and P distinct products, then the running time of this algorithm should be O(N) for the first part and O(P*C^2) for the second part. Since your final list needs to have pairs of countries mapping to lists of products, I don't think you will be able to lose the P*C^2 complexity in any case.
I don't code in Java too much, so I added a C# example which I believe you'll be able to port pretty easily:
// mapA maps each product to a list of countries
var mapA = new Dictionary<string, List<string>>();
foreach (var t in inputList)
{
List<string> countries = null;
if (!mapA.TryGetValue(t.Product, out countries))
{
countries = new List<string>();
mapA[t.Product] = countries;
}
countries.Add(t.Country);
}
// note (this is very important):
// CountryPair tuple must have value-type comparison semantics,
// i.e. you need to ensure that two CountryPairs are compared
// by value to allow hashing (mapping) to work correctly, in O(1).
// In C# you can also simply use a Tuple<string,string> to
// represent a pair of countries (which implements this correctly),
// but I used a custom class to emphasize the algorithm
// mapB maps each CountryPair to a list of products
var mapB = new Dictionary<CountryPair, List<string>>();
foreach (var kvp in mapA)
{
var product = kvp.Key;
var countries = kvp.Value;
for (int i = 0; i < countries.Count; i++)
{
for (int j = i + 1; j < countries.Count; j++)
{
var pair = CountryPair.Create(countries[i], countries[j]);
List<string> productsForCountryPair = null;
if (!mapB.TryGetValue(pair, out productsForCountryPair))
{
productsForCountryPair = new List<string>();
mapB[pair] = productsForCountryPair;
}
productsForCountryPair.Add(product);
}
}
}
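For anyone who wants a starting point for the Java port, here is a rough sketch of the same two passes. It is not the original author's code: the names ExportPairs and commonExports are hypothetical, each input row is assumed to be a {country, product} array, and a two-element List stands in for the CountryPair class because List already compares and hashes by value.

```java
import java.util.*;

class ExportPairs {
    // rows: each element is {country, product} (assumed input shape).
    static Map<List<String>, List<String>> commonExports(List<String[]> rows) {
        // Pass 1: group by product. A TreeSet removes duplicate rows and
        // keeps countries sorted, so every pair comes out in one order only.
        Map<String, SortedSet<String>> countriesByProduct = new HashMap<>();
        for (String[] row : rows) {
            countriesByProduct
                .computeIfAbsent(row[1], p -> new TreeSet<>())
                .add(row[0]);
        }
        // Pass 2: for each product, add it to every pair of its countries.
        Map<List<String>, List<String>> productsByPair = new HashMap<>();
        for (Map.Entry<String, SortedSet<String>> e : countriesByProduct.entrySet()) {
            List<String> countries = new ArrayList<>(e.getValue());
            for (int i = 0; i < countries.size(); i++) {
                for (int j = i + 1; j < countries.size(); j++) {
                    List<String> pair = Arrays.asList(countries.get(i), countries.get(j));
                    productsByPair
                        .computeIfAbsent(pair, k -> new ArrayList<>())
                        .add(e.getKey());
                }
            }
        }
        return productsByPair;
    }
}
```

Pairs of countries with no product in common never appear in the result, so the work done is proportional to the output size rather than to all country pairs.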
This is a great use case for MapReduce.
In the map phase you just collect all the exports that belong to each country.
Then the reducer sorts the products (the products belong to the same country because of the mapper).
You will benefit from a parallel algorithm that can be distributed across a cluster.
You are actually taking O(n^2 * time required for 1 intersect).
Let's see if we can improve the time for an intersect. We can maintain a map for every country which stores the corresponding products, so you have n hash maps for n countries. You just need to iterate through all products once for initialization. If you want quick lookup, maintain a map of maps as:
HashMap<String,HashMap<String,Boolean>> countryMap = new HashMap<String, HashMap<String,Boolean>>();
Now if you want to find the common products for countries str1 and str2, do:
HashMap<String,Boolean> map1 = countryMap.get(str1);
HashMap<String,Boolean> map2 = countryMap.get(str2);
ArrayList<String > common = new ArrayList<String>();
Iterator it = map1.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<String,Boolean> pairs = (Map.Entry)it.next();
//Add to common if it is there in other map
if(map2.containsKey(pairs.getKey()))
common.add(pairs.getKey());
}
So, in total it will be O(n^2 * k) if there are k entries in one map, assuming the hash map lookup is O(1) (which is the expected cost for Java's HashMap; the worst case is O(log k) for heavily colliding keys since Java 8).
Using hashmaps where necessary to speed things up:
1) Go through the data and create a map with keys Items and values a list of countries associated with that item. So e.g. Sheep:Australia, US, UK, New Zealand....
2) Create a hashmap with keys each pair of countries and (initially) an empty list as values.
3) For each Item retrieve the list of countries associated with it and for each pair of countries within that list, add that item to the list created for that pair in step (2).
4) Now output the updated list for each pair of countries.
The largest costs are in steps (3) and (4) and both of these costs are linear in the amount of output produced, so I think this is not too far from optimal.

Is there a way to get the value of a HashMap randomly in Java?

Is there a way to get the value of a HashMap randomly in Java?
This works:
Random generator = new Random();
Object[] values = myHashMap.values().toArray();
Object randomValue = values[generator.nextInt(values.length)];
If you want the random value to be a type other than an Object simply add a cast to the last line. So if myHashMap was declared as:
Map<Integer,String> myHashMap = new HashMap<Integer,String>();
The last line can be:
String randomValue = (String) values[generator.nextInt(values.length)];
The below doesn't work, Set.toArray() always returns an array of Objects, which can't be coerced into an array of Map.Entry.
Random generator = new Random();
Map.Entry[] entries = myHashMap.entrySet().toArray();
randomValue = entries[generator.nextInt(entries.length)].getValue();
Since the requirement only asks for a random value from the HashMap, here's the approach:
The HashMap has a values method which returns a Collection of the values in the map.
The Collection is used to create a List.
The size method is used to find the size of the List, which is used by the Random.nextInt method to get a random index of the List.
Finally, the value is retrieved from the List get method with the random index.
Implementation:
HashMap<String, Integer> map = new HashMap<String, Integer>();
map.put("Hello", 10);
map.put("Answer", 42);
List<Integer> valuesList = new ArrayList<Integer>(map.values());
int randomIndex = new Random().nextInt(valuesList.size());
Integer randomValue = valuesList.get(randomIndex);
The nice part about this approach is that all the methods are generic -- there is no need for typecasting.
Should you need to draw further values from the map without repeating any elements, you can put the map's values into a List and then shuffle it.
List<Object> valuesList = new ArrayList<Object>(map.values());
Collections.shuffle( valuesList );
for ( Object obj : valuesList ) {
System.out.println( obj );
}
Generate a random number between 0 and the number of keys in your HashMap. Get the key at the random number. Get the value from that key.
Pseudocode:
int n = random(map.keys().length());
String key = map.keys().at(n);
Object value = map.at(key);
If it's hard to implement this in Java, then you could create an array from this code using the toArray() function in Set.
Object[] values = map.values().toArray(new Object[map.size()]);
Object random_value = values[random(values.length)];
I'm not really sure how to do the random number.
Converting it to an array and then getting the value is too slow when it's in the hot path.
So get the set (either the key set or the entry set) and do something like:
public class SetUtility {
public static<Type> Type getRandomElementFromSet(final Set<Type> set, Random random) {
final int index = random.nextInt(set.size());
Iterator<Type> iterator = set.iterator();
// Skip the first 'index' elements, then return the next one.
for( int i = 0; i < index; i++ ) {
iterator.next();
}
return iterator.next();
}
}
A good answer depends slightly on the circumstances, in particular how often you need to get a random key for a given map (N.B. the technique is essentially the same whether you take key or value).
If you need various random keys from a given map, without the map changing in between getting the random keys, then use the random sampling method as you iterate through the key set. Effectively what you do is iterate over the set returned by keySet(), and on each item calculate the probability of wanting to take that key, given how many you will need overall and the number you've taken so far. Then generate a random number and see if that number is lower than the probability. (N.B. This method will always work, even if you only need 1 key; it's just not necessarily the most efficient way in that case.)
The keys in a HashMap are effectively in pseudo-random order already. In an extreme case where you will only ever need one random key for a given possible map, you could even just pull out the first element of the keySet().
In other cases (where you either need multiple possible random keys for a given possible map, or the map will change between you taking random keys), you essentially have to create or maintain an array/list of the keys from which you select a random key.
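The random sampling method from the first case can be sketched roughly as follows. This is my reading of the description rather than a reference implementation, and the names KeySampler and sampleKeys are hypothetical. Each key is taken with probability (keys still needed) / (keys not yet visited), which selects exactly the requested number of distinct keys in a single pass.

```java
import java.util.*;

class KeySampler {
    // Picks 'needed' distinct keys in one pass over keySet().
    // If 'needed' exceeds the map size, every key is returned.
    static <K> List<K> sampleKeys(Map<K, ?> map, int needed, Random rnd) {
        List<K> picked = new ArrayList<>();
        int remaining = map.size(); // keys not yet visited
        for (K key : map.keySet()) {
            int stillNeeded = needed - picked.size();
            if (stillNeeded == 0) {
                break;
            }
            // Take this key with probability stillNeeded / remaining.
            if (rnd.nextInt(remaining) < stillNeeded) {
                picked.add(key);
            }
            remaining--;
        }
        return picked;
    }
}
```

When stillNeeded equals remaining the test always succeeds, so the method cannot run out of keys before the sample is complete.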
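If you are using Java 8, the findAny function is a pretty solution. Note that it returns an Optional, and that it yields an arbitrary element rather than a uniformly random one:
MyEntityClass myRandomlyPickedObject = myHashMap.values().stream().findAny().orElse(null);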
I really don't know why you want to do this... but if it helps, I've created a RandomMap that automatically randomizes the values when you call values(); the following runnable demo application might do the job...
package random;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
public class Main {
public static void main(String[] args) {
Map hashMap = makeHashMap();
// you can make any Map random by making them a RandomMap
// better if you can just create the Map as a RandomMap instead of HashMap
Map randomMap = new RandomMap(hashMap);
// just call values() and iterate through them, they will be random
Iterator iter = randomMap.values().iterator();
while (iter.hasNext()) {
String value = (String) iter.next();
System.out.println(value);
}
}
private static Map makeHashMap() {
Map retVal;
// HashMap is not ordered, and not exactly random (read the javadocs)
retVal = new HashMap();
// TreeMap sorts your map based on Comparable of keys
retVal = new TreeMap();
// RandomMap - a map that returns stuff randomly
// use this, don't have to create RandomMap after function returns
// retVal = new HashMap();
for (int i = 0; i < 20; i++) {
retVal.put("key" + i, "value" + i);
}
return retVal;
}
}
/**
* An implementation of Map that shuffles the Collection returned by values().
* Similar approach can be applied to its entrySet() and keySet() methods.
*/
class RandomMap extends HashMap {
public RandomMap() {
super();
}
public RandomMap(Map map) {
super(map);
}
/**
* Randomize the values on every call to values()
*
* @return randomized Collection
*/
@Override
public Collection values() {
List randomList = new ArrayList(super.values());
Collections.shuffle(randomList);
return randomList;
}
}
Here is an example of how to use the arrays approach described by Peter Stuifzand, also through the values()-method:
// Populate the map
// ...
Object[] keys = map.keySet().toArray();
Object[] values = map.values().toArray();
Random rand = new Random();
// Get random key (and value, as an example)
String randKey = (String) keys[ rand.nextInt(keys.length) ];
String randValue = (String) values[ rand.nextInt(values.length) ];
// Use the random key
System.out.println( map.get(randKey) );
Usually you do not really want a random value but rather just any value, and then it's nice doing this:
Object selectedObj = null;
for (Object obj : map.values()) {
selectedObj = obj;
break;
}
I wrote a utility to retrieve a random entry, key, or value from a map, entry set, or iterator.
Since you cannot and should not be able to figure out the size of an iterator (Guava can do this) you will have to overload the randEntry() method to accept a size which should be the length of the entries.
package util;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
public class MapUtils {
public static void main(String[] args) {
Map<String, Integer> map = new HashMap<String, Integer>() {
private static final long serialVersionUID = 1L;
{
put("Foo", 1);
put("Bar", 2);
put("Baz", 3);
}
};
System.out.println(randEntryValue(map));
}
static <K, V> Entry<K, V> randEntry(Iterator<Entry<K, V>> it, int count) {
int index = (int) (Math.random() * count);
while (index > 0 && it.hasNext()) {
it.next();
index--;
}
return it.next();
}
static <K, V> Entry<K, V> randEntry(Set<Entry<K, V>> entries) {
return randEntry(entries.iterator(), entries.size());
}
static <K, V> Entry<K, V> randEntry(Map<K, V> map) {
return randEntry(map.entrySet());
}
static <K, V> K randEntryKey(Map<K, V> map) {
return randEntry(map).getKey();
}
static <K, V> V randEntryValue(Map<K, V> map) {
return randEntry(map).getValue();
}
}
If you are fine with O(n) time complexity you can use methods like values() or values().toArray() but if you look for a constant O(1) getRandom() operation one great alternative is to use a custom data structure. ArrayList and HashMap can be combined to attain O(1) time for insert(), remove() and getRandom(). Here is an example implementation:
class RandomizedSet {
List<Integer> nums = new ArrayList<>();
Map<Integer, Integer> valToIdx = new HashMap<>();
Random rand = new Random();
public RandomizedSet() { }
/**
* Inserts a value to the set. Returns true if the set did not already contain
* the specified element.
*/
public boolean insert(int val) {
if (!valToIdx.containsKey(val)) {
valToIdx.put(val, nums.size());
nums.add(val);
return true;
}
return false;
}
/**
* Removes a value from the set. Returns true if the set contained the specified
* element.
*/
public boolean remove(int val) {
if (valToIdx.containsKey(val)) {
int idx = valToIdx.get(val);
int lastVal = nums.get(nums.size() - 1);
nums.set(idx, lastVal);
valToIdx.put(lastVal, idx);
nums.remove(nums.size() - 1);
valToIdx.remove(val);
return true;
}
return false;
}
/** Get a random element from the set. */
public int getRandom() {
return nums.get(rand.nextInt(nums.size()));
}
}
The idea comes from this problem from leetcode.com.
It seems that all other high voted answers iterate over all the elements. Here, at least, not all elements must be iterated over:
Random generator = new Random();
return myHashMap.values().stream()
.skip(generator.nextInt(myHashMap.size()))
.findFirst().get();
It depends on what your key is - the nature of a hashmap doesn't allow for this to happen easily.
The way I can think of off the top of my head is to select a random number between 1 and the size of the hashmap, and then start iterating over it, maintaining a count as you go - when count is equal to that random number you chose, that is your random element.
