Java 8 - Collectors.groupingBy result order

I'm preparing for a Java exam and one question has given me a hard time. Despite studying it closely, I can't work out what determines the order of the result.
Have a look, please:
class Country {
public enum Continent {
ASIA, EUROPE
}
String name;
Continent region;
public Country(String na, Continent reg) {
name = na;
region = reg;
}
public String getName() {
return name;
}
public Continent getRegion() {
return region;
}
}
public class OrderQuestion {
public static void main(String[] args) {
List<Country> couList = Arrays.asList(
new Country("Japan", Country.Continent.ASIA),
new Country("Italy", Country.Continent.EUROPE),
new Country("Germany", Country.Continent.EUROPE));
Map<Country.Continent, List<String>> regionNames = couList.stream()
.collect(Collectors.groupingBy(Country::getRegion,
Collectors.mapping(Country::getName, Collectors.toList())));
System.out.println(regionNames);
}
}
What is the result?
A. {EUROPE = [Italy, Germany], ASIA = [Japan]}
B. {ASIA = [Japan], EUROPE = [Italy, Germany]}
C. {EUROPE = [Germany, Italy], ASIA = [Japan]}
D. {EUROPE = [Germany], EUROPE = [Italy], ASIA = [Japan]}
And, most importantly, what determines that specific result rather than another?

We can eliminate D because keys in a Map must be unique, and EUROPE appears twice.
We can eliminate C because of the order in [Germany, Italy]. Italy comes before Germany in the source list, so it is also stored first in the result list.
But how do we decide between A and B? Well, we can't.
Map doesn't guarantee any specific order of key-value pairs. Some maps remember insertion order (LinkedHashMap) and some order entries by key (TreeMap), but no such behaviour is specified for Collectors.groupingBy.
This is consistent with the current implementation, which uses a HashMap. A HashMap arranges entries according to the key's hashCode() (here the Country.Continent enum) and the table's current capacity. Enum.hashCode() is final and returns the identity hash code inherited from Object, which is not derived from the constant's name or ordinal and can differ from one JVM run to the next, so no order can be assumed (which fits the fact that it is unspecified).
So, because nothing is specified about the Map returned by groupingBy, both orders of entries are possible, and both A and B are possible answers.
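If a predictable key order is wanted, one option is the three-argument overload of Collectors.groupingBy, which accepts a map factory. The following is only a sketch of that idea: the class name GroupingOrderDemo and the trimmed-down Country and Continent types are stand-ins invented for this example, and an EnumMap is just one possible choice (it iterates keys in enum declaration order).

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingOrderDemo {
    enum Continent { ASIA, EUROPE }

    // Minimal stand-in for the Country class from the question.
    static final class Country {
        final String name;
        final Continent region;

        Country(String name, Continent region) {
            this.name = name;
            this.region = region;
        }

        String getName() { return name; }
        Continent getRegion() { return region; }
    }

    // Groups country names by region; the map factory fixes the key order.
    static Map<Continent, List<String>> groupByRegion(List<Country> countries) {
        return countries.stream().collect(Collectors.groupingBy(
                Country::getRegion,
                () -> new EnumMap<Continent, List<String>>(Continent.class),
                Collectors.mapping(Country::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Country> countries = Arrays.asList(
                new Country("Italy", Continent.EUROPE),
                new Country("Japan", Continent.ASIA),
                new Country("Germany", Continent.EUROPE));
        // prints {ASIA=[Japan], EUROPE=[Italy, Germany]}
        System.out.println(groupByRegion(countries));
    }
}
```

Passing TreeMap::new or LinkedHashMap::new as the factory would instead give natural key order or first-encounter order, respectively.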

Related

Should I sort a hashmap that contains frequency with bucketsort or heapsort?

I have a hashmap in Java in this form HashMap<String, Integer> frequency. The key is a string where I hold the name of a movie and the value is the frequency of the said movie.
My program takes input from users so whenever someone is adding a video to favorite I go in the hashmap and I increment its frequency.
Now the problem is that at some point I need to get the k most frequent movies. I've found that I could use bucketsort or heapsort, as in this leetcode problem (check the first comment), but I am not sure which is more efficient in my case. My hashmap is updated constantly, so I would need to re-run the sorting algorithm every time a frequency changes.
From my understanding, it takes O(N) time to build the map, where 'N' is the number of movies even with duplicates as it needs to add to the frequency, which gets me 'M' unique movie titles. Would that mean that heapsort will result in O(M * log(k)) and bucketsort O(M) for any given k?
Having a map that sorts on values (the thing you map to) isn't a thing, unfortunately. You could instead have a set whose keys sort themselves on frequency, but given that frequency is the key at that point, you couldn't look up entries in this set without knowing the frequency beforehand which eliminates the point of the exercise.
One strategy that comes to mind is to have 2 separate data structures. One serves to let you look up the actual object based on the name of the movie, the other is to be self-sorting:
@Data
public class MovieFrequencyTuple implements Comparable<MovieFrequencyTuple> {
@NonNull private final String name;
private int frequency;
public void incrementFrequency() {
frequency++;
}
// Higher frequency sorts first; ties are broken by name.
@Override public int compareTo(MovieFrequencyTuple other) {
int c = Integer.compare(frequency, other.frequency);
if (c != 0) return -c;
return name.compareTo(other.name);
}
}
and with that available to you:
SortedSet<MovieFrequencyTuple> frequencies = new TreeSet<>();
Map<String, MovieFrequencyTuple> movies = new HashMap<>();
public int increment(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) {
tuple = new MovieFrequencyTuple(movieName);
movies.put(movieName, tuple);
}
// Self-sorting data structures will just fail
// to do the job if you modify a sorting order on
// an object already in the collection. Thus,
// we take it out, modify, put it back in.
frequencies.remove(tuple);
tuple.incrementFrequency();
frequencies.add(tuple);
return tuple.getFrequency();
}
public int get(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) return 0;
return tuple.getFrequency();
}
public List<String> getTop10() {
var out = new ArrayList<String>();
for (MovieFrequencyTuple tuple : frequencies) {
out.add(tuple.getName());
if (out.size() == 10) break;
}
return out;
}
Each operation is amortized O(1) or O(log n), even the top-10 operation, which only reads the first 10 elements of the already-sorted set. So, if you run 'increment a movie's frequency, then obtain the top 10' a million times, with n = the number of times we do that, the worst-case performance is O(n log n).
NB: Uses lombok for constructors, getters, etc - if you don't like that, have your IDE generate these things.
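For comparison, the heap-based approach the question asks about can be sketched as follows. This is an illustrative version only, assuming the plain HashMap<String, Integer> from the question; the class name TopKByHeap and the tie-breaking rule (alphabetical by title) are choices made for this example. It keeps a min-heap of at most k entries, so one query costs O(M log k) for M distinct titles, but it has to be re-run whenever the counts change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopKByHeap {
    // Returns the k most frequent titles, most frequent first.
    static List<String> topK(Map<String, Integer> frequency, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // "Weakest" entry first: lowest count, and on equal counts the
        // alphabetically later title, so the heap head is what gets evicted.
        Comparator<Map.Entry<String, Integer>> weakestFirst = (a, b) -> {
            int c = Integer.compare(a.getValue(), b.getValue());
            return c != 0 ? c : b.getKey().compareTo(a.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(weakestFirst);
        for (Map.Entry<String, Integer> entry : frequency.entrySet()) {
            heap.offer(entry);
            if (heap.size() > k) {
                heap.poll(); // drop the current weakest, keeping only k entries
            }
        }
        // The heap yields weakest first, so collect and then reverse.
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }
}
```

If top-k is requested after almost every update, the TreeSet approach above avoids rescanning all M titles each time; the heap version is simpler when queries are rare.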

how do I combine values that have the same key into one value in a hashMap?

I am tasked with the following problem: given an array of IngredientPortions (an object I made), if two of the elements in the array have the same Ingredient (Ingredient is another object that is a component of IngredientPortion objects), I am supposed to combine those IngredientPortion elements.
For example, if the IngredientPortion array I am given has two elements, such as an avocado portion of 1.5 oz and another avocado portion of 2.0 oz, I should produce a new IngredientPortion array of 1 element: an avocado portion of 3.5 oz.
I am not sure how to do this, but I was thinking of using a hashmap with Strings as keys, representing the ingredient name, and IngredientPortion objects as values. If the hashmap already has a key for the given ingredientPortion.getName(), I would put that specific ingredientPortion in for that key, but I'm not sure how to combine the ingredientPortion amounts. Would it automatically combine them, or would it store them as two different ingredientPortions under that one key? Thanks in advance!
If your hashmap stores the amounts themselves as values (for example a Map<String, Double>), then you can use this to add a value to it. It works whether it's the first portion of that ingredient (i.e. the key is not in the map yet) or a later one (which needs to be added to the value already in the map):
map.put(ingredientKey, map.getOrDefault(ingredientKey, 0.0) + ingredientAmount);
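The same accumulation can also be written with Map.merge, which inserts the value when the key is absent and otherwise combines it with the existing one. A minimal sketch, assuming amounts are kept as Double values keyed by ingredient name; the class name MergeAmounts and the parallel-array input are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MergeAmounts {
    // Sums the amounts per ingredient name.
    static Map<String, Double> total(String[] names, double[] amounts) {
        Map<String, Double> totals = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            // Absent key: store amounts[i]. Present key: old value + amounts[i].
            totals.merge(names[i], amounts[i], Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Double> totals = total(
                new String[] {"avocado", "lime", "avocado"},
                new double[] {1.5, 2.0, 2.0});
        System.out.println(totals.get("avocado")); // prints 3.5
    }
}
```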
I implemented two solutions for your problem using Java 8 streams and the groupingBy method. I hope this helps.
import java.util.*;
import java.util.stream.Collectors;
public class Main {
public static class Ingredient {
private final String name;
private final int quantity;
public Ingredient(String name, int quantity) {
Objects.requireNonNull(name);
this.name = name;
this.quantity = quantity;
}
public String getName() {
return name;
}
public int getQuantity() {
return quantity;
}
@Override
public String toString() {
return "Ingredient{" +
"name='" + name + '\'' +
", quantity=" + quantity +
'}';
}
}
public static void main(String[] args) {
Ingredient[] ingredients = new Ingredient[]{
new Ingredient("banana", 1),
new Ingredient("cherry", 5),
new Ingredient("banana", 3),
new Ingredient("flour", 1)
};
// First solution: Group all quantities
Map<String, List<Integer>> collect = Arrays.stream(ingredients)
.collect(Collectors.groupingBy(Ingredient::getName,
Collectors.mapping(Ingredient::getQuantity, Collectors.toList())
));
System.out.println(collect);
// Second solution: Sum all quantities
Map<String, Integer> sum = Arrays.stream(ingredients)
.collect(
Collectors.groupingBy(Ingredient::getName,
Collectors.summingInt(Ingredient::getQuantity)
));
System.out.println(sum);
}
}
I am not sure how to do this, but I was thinking of using a hashmap of Strings as keys, representing the ingredient name, and values of IngredientPortion objects.
Sounds good so far.
If the hashmap already has a key of the given ingredientPortion.getName(), i would put in that specific ingredientPortion for that key, but I'm not sure how to combine the ingredientPortion amounts. Would it automatically combine it or would it store it as two different ingredientPortions under that one key???
Obviously, there's no problem when handling an IngredientPortion whose name is not already enrolled as a key in your map. When the key is already enrolled, however, putting a new value into the map with that same key will simply replace the previous value assigned to that key, not somehow combine the values. The documentation for the Map interface should be clear about that.
Indeed, how can you even hope a Map would automatically combine values when how to do so and even whether it's possible to do so depends on the type of the values? It sounds like you might not be certain how to form such a combination. If that's so, then figuring that out must be your first priority. That's, again, a question specific to the type, so we've no way to advise you about the details.
Once you know how to combine these objects in a suitable way, use that. For example, look up each name in the map, and depending on whether it is found, either insert or replace its value.
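The lookup-then-insert-or-replace approach described above might look like the sketch below. Because the real IngredientPortion class isn't shown, Portion is a hypothetical stand-in with a name and an amount in ounces, and plus() is an assumed way of combining two portions of the same ingredient.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PortionCombiner {
    // Hypothetical stand-in for IngredientPortion; not the asker's class.
    static final class Portion {
        final String name;
        final double ounces;

        Portion(String name, double ounces) {
            this.name = name;
            this.ounces = ounces;
        }

        // Assumed combination rule: same ingredient, amounts added.
        Portion plus(Portion other) {
            return new Portion(name, ounces + other.ounces);
        }
    }

    static Portion[] combine(Portion[] portions) {
        // LinkedHashMap keeps ingredients in first-seen order.
        Map<String, Portion> byName = new LinkedHashMap<>();
        for (Portion p : portions) {
            Portion existing = byName.get(p.name);
            // Not found: insert as is. Found: replace with the combined portion.
            byName.put(p.name, existing == null ? p : existing.plus(p));
        }
        return byName.values().toArray(new Portion[0]);
    }
}
```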

Is there an aggregateBy method in the stream Java 8 api?

I ran across this very interesting but one-year-old presentation by Brian Goetz. In the slide linked he presents an aggregateBy() method, supposedly in the Stream API, which is meant to aggregate the elements of a list (?) into a map, given a default initial value and a method that manipulates the value (for duplicate keys too; see the next slide in the presentation).
Apparently there is no such method in the Stream API. Is there another method that does something analogous in Java 8 ?
The aggregate operation can be done using the Collectors class. So in the video, the example would be equivalent to :
Map<String, Integer> map =
documents.stream().collect(Collectors.groupingBy(Document::getAuthor, Collectors.summingInt(Document::getPageCount)));
With only a classifier, the groupingBy method would give you a Map<String, List<Document>>. To get totals instead, you need a downstream collector that sums the page counts of the documents associated with each key.
This is done by providing a downstream collector to groupingBy, which is summingInt, resulting in a Map<String, Integer>.
They give basically the same example in the documentation where they compute the sum of the employees' salary by department.
I think that they removed this operation and created the Collectors class instead to have a useful class that contains a lot of reductions that you will use commonly.
Let's say we have a list of employees with their department and salary and we want the total salary paid by each department.
There are several ways to do it and you could for example use a toMap collector to aggregate the data per department:
the first argument is the key mapper (your aggregation axis = the department),
the second is the value mapper (the data you want to aggregate = salaries), and
the third is the merging function (how you want to aggregate data = sum the values).
Example:
import static java.util.stream.Collectors.*;
public static void main(String[] args) {
List<Person> persons = Arrays.asList(new Person("John", "Sales", 10000),
new Person("Helena", "Sales", 10000),
new Person("Somebody", "Marketing", 15000));
Map<String, Double> salaryByDepartment = persons.stream()
.collect(toMap(Person::department, Person::salary, (s1, s2) -> s1 + s2));
System.out.println("salary by department = " + salaryByDepartment);
}
As often with streams, there are several ways to get the desired result, for example:
import static java.util.stream.Collectors.*;
Map<String, Double> salaryByDepartment = persons.stream()
.collect(groupingBy(Person::department, summingDouble(Person::salary)));
For reference, the Person class:
static class Person {
private final String name, department;
private final double salary;
public Person(String name, String department, double salary) {
this.name = name;
this.department = department;
this.salary = salary;
}
public String name() { return name; }
public String department() { return department; }
public double salary() { return salary; }
}
This particular Javadoc entry is about the closest thing I could find on this piece of aggregation in Java 8. Even though it's a third party API, the signatures seem to line up pretty well - you provide some function to get values from, some terminal function for values (zero, in this case), and some function to combine the function and the values together.
It feels a lot like a Collector, which would offer us the ability to do this.
Map<String, Integer> strIntMap =
documents.stream()
.collect(Collectors
.groupingBy(Document::getAuthor,
Collectors.summingInt(Document::getPageCount)));
The idea then is that we group on the author's name for each entry in our list, and add up the total page numbers that the author has into a Map<String, Integer>.

Which collections to use?

Suppose I want to store phone numbers of persons. Which kind of collection should I use for key value pairs? And it should be helpful for searching. The name may get repeated, so there may be the same name having different phone numbers.
If you want to store key-value pairs, a Map is a better choice than a plain collection.
So what should that map store?
Start with the key. The first thing to ensure is that the key is unique, so that different persons don't end up under the same key.
class Person {
long uniqueID;
String name;
String lastname;
}
So we will use the uniqueID of Person for key.
What about value ?
This is harder, because a single Person can have many phone numbers. But for a simple task, let's assume that a person can have only one phone number. Then what you are looking for is
class PhoneNumberRegistry {
Map<Long,String> phoneRegistry = new HashMap<>();
}
Where the Long is the uniqueID taken from the person. If you use your own class (such as Person) as the key instead, it should implement the hashCode and equals methods.
Then your registry could look like
class PhoneNumberRegistry {
Map<Person,String> phoneRegistry = new HashMap<>();
}
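What implementing hashCode and equals could look like for such a key is sketched below. It assumes that two Person objects with the same uniqueID represent the same person; the constructor and the class name PersonKeyDemo are additions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class PersonKeyDemo {
    static final class Person {
        final long uniqueID;
        final String name;
        final String lastname;

        Person(long uniqueID, String name, String lastname) {
            this.uniqueID = uniqueID;
            this.name = name;
            this.lastname = lastname;
        }

        // Assumption: identity is defined by uniqueID alone.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            return uniqueID == ((Person) o).uniqueID;
        }

        // Must agree with equals: equal persons produce equal hash codes.
        @Override
        public int hashCode() {
            return Long.hashCode(uniqueID);
        }
    }

    public static void main(String[] args) {
        Map<Person, String> phoneRegistry = new HashMap<>();
        phoneRegistry.put(new Person(1L, "Tom", "Butterfly"), "+0490301234567");
        // A different instance with the same uniqueID finds the same entry.
        System.out.println(phoneRegistry.get(new Person(1L, "Tom", "Butterfly")));
    }
}
```

Without these two methods, HashMap falls back to object identity, and the lookup with a new Person instance would return null.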
If you want to store more than one number per person, you will need to change the type of the value in the map.
You can use a Set<String> to store multiple numbers without duplicates. But to have full control, you should introduce a new type that stores not only the number but also what kind of number it is.
class PhoneNumberRegistry {
Map<Person,HashSet<String>> phoneRegistry = new HashMap<>();
}
But then you will have to solve further problems, such as which phone number to return.
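Adding numbers to such a map of sets can be done with computeIfAbsent, as in this sketch. It keys on the person's uniqueID rather than on Person to stay self-contained; the class name MultiNumberRegistry and its method names are invented for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class MultiNumberRegistry {
    private final Map<Long, Set<String>> numbersByPersonId = new HashMap<>();

    void addNumber(long personId, String number) {
        // Create the set on first use, then add; the set ignores duplicates.
        numbersByPersonId
                .computeIfAbsent(personId, id -> new LinkedHashSet<>())
                .add(number);
    }

    // Read-only view; an empty set if the person has no numbers.
    Set<String> numbersOf(long personId) {
        return Collections.unmodifiableSet(
                numbersByPersonId.getOrDefault(personId, Collections.emptySet()));
    }
}
```

A LinkedHashSet keeps the numbers in the order they were added, so "the first number" is at least well defined if a single one has to be returned.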
Your problem has different solutions. For example, I'll go with a LIST: List<Person>, where Person is a class like this:
public class Person{
private String name;
private List<String> phoneNumbers;
// ...
}
For collections searching/filtering I suggest Guava Collections2.filter method.
You could use this:
Hashtable<String, ArrayList<String>> addressbook = new Hashtable<>();
ArrayList<String> persons = new ArrayList<String>();
persons.add("Tom Butterfly");
persons.add("Maria Wanderlust");
addressbook.put("+0490301234567", persons);
addressbook.put("+0490301234560", persons);
A Hashtable is safe in that it rejects null keys and values, and an ArrayList is fast for small collections. Note that multiple persons with different names may share the same number.
Note also that 2 persons can have the same number and the same name!
String name = "Tom Butterfly";
// A Hashtable's keys are unordered, so binary search does not apply here;
// scan the entries for the first number whose person list contains the name.
String number = null;
for (Map.Entry<String, ArrayList<String>> entry : addressbook.entrySet()) {
if (entry.getValue().contains(name)) {
number = entry.getKey();
break;
}
}
System.out.println("Number is " + number);
Maybe
List<Pair<String, String>> (for one number per person)
or
List<Pair<String, String[]>> (for multiple numbers per person)
will fit your needs.

Which collection to use?

What kind of collection should I use if I need to create a collection that will allow me to store books and how many copies there are in circulation (for a library)?
I would use an ArrayList, but I also want to be able to sort the books by order of issue year.
You can create a Book class with all the attributes you have for a book, have it implement Comparable, and write the sorting logic in its compareTo method.
Maintain a List<Book>, and use the Collections.sort method to sort your List according to that sorting logic.
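A minimal sketch of that suggestion, ordering books by issue year: the field names (title, issueYear, copies), the tie-break on title, and the class name LibrarySortDemo are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LibrarySortDemo {
    static final class Book implements Comparable<Book> {
        final String title;
        final int issueYear;
        final int copies; // copies in circulation

        Book(String title, int issueYear, int copies) {
            this.title = title;
            this.issueYear = issueYear;
            this.copies = copies;
        }

        // Natural order: oldest issue year first, then by title.
        @Override
        public int compareTo(Book other) {
            int c = Integer.compare(issueYear, other.issueYear);
            return c != 0 ? c : title.compareTo(other.title);
        }
    }

    // Returns the titles in issue-year order without modifying the input list.
    static List<String> titlesByYear(List<Book> books) {
        List<Book> sorted = new ArrayList<>(books);
        Collections.sort(sorted);
        List<String> titles = new ArrayList<>();
        for (Book book : sorted) {
            titles.add(book.title);
        }
        return titles;
    }
}
```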
UPDATE:
As far as fast look-up is concerned, a Map is usually the best bet, and it is appropriate for a dictionary-style look-up structure. For that, you need some attribute that uniquely identifies each book. You can then store your books in a Map<String, Book>, where the key might be an id of type String.
In this case your sorting logic also changes a little: you now have to sort on the basis of your Map's values, i.e. on attributes of Book.
Here's some sample code you can make use of. I have only considered sorting on the basis of id; you can change the sorting logic as needed:
class Book {
private int id;
private String title;
public Book() {
}
public Book(int id, String title) {
this.id = id;
this.title = title;
}
@Override
public String toString() {
return "Book[Title:" + this.getTitle() + ", Id:" + this.getId() + "]";
}
// Getters and Setters
}
public class Demo {
public static void main(String[] args) {
final Map<String, Book> map = new HashMap<String, Book>() {
{
put("b1", new Book(3, "abc"));
put("b2", new Book(2, "c"));
}
};
List<Map.Entry<String, Book>> keyList = new LinkedList<Map.Entry<String, Book>>(map.entrySet());
Collections.sort(keyList, new Comparator<Map.Entry<String, Book>>() {
@Override
public int compare(Map.Entry<String, Book> o1, Map.Entry<String, Book> o2) {
// Integer.compare avoids the overflow that plain subtraction can cause.
return Integer.compare(o1.getValue().getId(), o2.getValue().getId());
}
});
Map<String, Book> result = new LinkedHashMap<String, Book>();
for (Iterator<Map.Entry<String, Book>> it = keyList.iterator(); it.hasNext();) {
Map.Entry<String, Book> entry = it.next();
result.put(entry.getKey(), entry.getValue());
}
System.out.println(result);
}
}
OUTPUT: -
"{b2=Book[Title:c, Id:2], b1=Book[Title:abc, Id:3]}"
Well, if the entire purpose of your collection is to store the counts of the books, then use a dictionary/map, or whatever Java's key-value collection is called.
It would probably have the title as the key and the count as the value.
Now I suspect that your collection might be a little more complicated than that, so you might want to make a Book class which has Count as a field, and then I'd probably have a string -> Book dictionary/map anyway, with the string being its Dewey Decimal number or some other unique identifier.
Beyond a simple educational or toy project, you'd want to use a database rather than an in-memory collection. (Not really an answer, but I think worth stating.)
java.util.TreeMap can be used to index and sort data for this kind of requirement.
Check http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html for more details.
You can use your Book object as the key, mapped to the number of copies as the value. For that, Book must implement Comparable, or the TreeMap must be constructed with a Comparator.
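As a rough illustration of the TreeMap idea, the sketch below keys the map on a Book and supplies a Comparator that orders by issue year and then by title. The Book fields, the class name BookCopiesIndex, and the newIndex() helper are hypothetical.

```java
import java.util.Comparator;
import java.util.TreeMap;

class BookCopiesIndex {
    static final class Book {
        private final String title;
        private final int issueYear;

        Book(String title, int issueYear) {
            this.title = title;
            this.issueYear = issueYear;
        }

        String getTitle() { return title; }
        int getIssueYear() { return issueYear; }
    }

    // Maps each book to its number of copies, iterated in issue-year order.
    static TreeMap<Book, Integer> newIndex() {
        return new TreeMap<>(
                Comparator.comparingInt(Book::getIssueYear)
                        .thenComparing(Book::getTitle));
    }
}
```

Note that a TreeMap treats two keys as the same entry when the comparator returns 0, so here two books with the same year and title would share one count; a real catalogue would probably add a unique identifier to the comparison.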
