Groupby counts in java

I am pretty new to Java, moving from C#. I have the following class:
class Resource {
String name;
String category;
String component;
String group;
}
I want to know the following numbers:
1. Count of resources in the category.
2. Distinct count of components in each category. (component names can be duplicate)
3. Count of resources grouped by category and group.
I was able to achieve a little bit of success using Collectors.groupingBy. However, the result is always like this.
Map<String, List<Resource>>
To get the counts I have to parse the keyset and compute the sizes.
Using C# LINQ, I can easily compute all the above metrics.
I assume there is a better way to do this in Java as well. Please advise.

For #1, I'd use Collectors.groupingBy along with Collectors.counting:
Map<String, Long> resourcesByCategoryCount = resources.stream()
.collect(Collectors.groupingBy(
Resource::getCategory,
Collectors.counting()));
This groups Resource elements by category, counting how many of them belong to each category.
For #2, I wouldn't use streams. Instead, I'd use the Map.computeIfAbsent operation (introduced in Java 8):
Map<String, Set<String>> distinctComponentsByCategory = new LinkedHashMap<>();
resources.forEach(r -> distinctComponentsByCategory.computeIfAbsent(
        r.getCategory(),
        k -> new HashSet<>())
    .add(r.getComponent()));
This first creates a LinkedHashMap (which preserves insertion order). Then the Resource elements are iterated, and each one's component is added to a HashSet that is mapped to its category. As sets don't allow duplicates, no category ends up with duplicated components. The distinct count of components for a category is then the size of its set.
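If a map of counts is wanted directly rather than a map of sets, one possible stream-based sketch for #2 groups by category, maps each resource to its component, collects the components into a set, and keeps only the set's size. The Resource stand-in with getters and the sample data are assumptions made for illustration; the question's class only shows fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class DistinctComponentCount {

    // Minimal stand-in for the Resource class from the question (getters are assumed).
    static class Resource {
        final String name, category, component, group;
        Resource(String name, String category, String component, String group) {
            this.name = name;
            this.category = category;
            this.component = component;
            this.group = group;
        }
        String getCategory() { return category; }
        String getComponent() { return component; }
    }

    // Number of distinct component names per category.
    static Map<String, Integer> distinctComponents(List<Resource> resources) {
        return resources.stream()
                .collect(Collectors.groupingBy(
                        Resource::getCategory,
                        Collectors.mapping(
                                Resource::getComponent,
                                Collectors.collectingAndThen(Collectors.toSet(), Set::size))));
    }

    public static void main(String[] args) {
        // Made-up sample data: "db" has two distinct components, "web" has one.
        List<Resource> resources = Arrays.asList(
                new Resource("r1", "db", "mysql", "g1"),
                new Resource("r2", "db", "mysql", "g2"),
                new Resource("r3", "db", "redis", "g1"),
                new Resource("r4", "web", "nginx", "g1"));
        System.out.println(distinctComponents(resources));
    }
}
```

The intermediate sets are still built internally; collectingAndThen only hides them from the result type.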
For #3, I'd again use Collectors.groupingBy along with Collectors.counting, but I'd use a composite key to group by:
Map<List<String>, Long> resourcesByCategoryAndGroup = resources.stream()
.collect(Collectors.groupingBy(
r -> Arrays.asList(r.getCategory(), r.getGroup()), // or List.of
Collectors.counting()));
This groups Resource elements by category and group, counting how many of them belong to each (category, group) pair. The grouping key is a two-element List<String>, with the category as its first element and the group as its second element.
Or, instead of using a composite key, you could use nested grouping:
Map<String, Map<String, Long>> resourcesByCategoryAndGroup = resources.stream()
.collect(Collectors.groupingBy(
Resource::getCategory,
Collectors.groupingBy(
Resource::getGroup,
Collectors.counting())));
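The two result shapes are read differently, which may help when choosing between them. The sketch below runs both groupings on made-up sample data; the trimmed-down Resource stand-in and its getters are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CategoryGroupCounts {

    // Minimal stand-in for the Resource class (only the fields needed here).
    static class Resource {
        final String category, group;
        Resource(String category, String group) {
            this.category = category;
            this.group = group;
        }
        String getCategory() { return category; }
        String getGroup() { return group; }
    }

    // Composite key: a two-element list [category, group].
    static Map<List<String>, Long> byCompositeKey(List<Resource> resources) {
        return resources.stream()
                .collect(Collectors.groupingBy(
                        r -> Arrays.asList(r.getCategory(), r.getGroup()),
                        Collectors.counting()));
    }

    // Nested maps: category -> (group -> count).
    static Map<String, Map<String, Long>> nested(List<Resource> resources) {
        return resources.stream()
                .collect(Collectors.groupingBy(
                        Resource::getCategory,
                        Collectors.groupingBy(Resource::getGroup, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Resource> resources = Arrays.asList(
                new Resource("db", "g1"), new Resource("db", "g1"),
                new Resource("db", "g2"), new Resource("web", "g1"));

        // Composite key: look up with an equal list (List equality is element-wise).
        System.out.println(byCompositeKey(resources).get(Arrays.asList("db", "g1"))); // 2

        // Nested maps: look up the category first, then the group.
        System.out.println(nested(resources).get("db").get("g2")); // 1
    }
}
```

The nested form makes it easy to list all groups of one category, while the composite key keeps the result flat.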

Thanks Fedrico for the detailed response. #1 and #3 worked great. For #2, I would like to see an output of Map. Here's the code that I am currently using to get that count. It is written in the old style, without collectors.
HashMap<String, HashSet<String>> map = new HashMap<>();
for (Resource resource : resources) {
    if (map.containsKey(resource.getCategory())) {
        map.get(resource.getCategory()).add(resource.getGroup());
    } else {
        HashSet<String> componentSet = new HashSet<>();
        componentSet.add(resource.getGroup());
        map.put(resource.getCategory(), componentSet);
    }
}
log.info("Group count in each category");
for (Map.Entry<String, HashSet<String>> entry : map.entrySet()) {
    log.info("{} - {}", entry.getKey(), entry.getValue().size());
}

Related

Java List<E> to Map<P, List<E>> where key is some property of E and value is E with that property

I would like to know how to convert a Java List to a Map, where the key is some property of the list element (different elements might have the same property) and the value is a list of the elements that have that property.
E.g. List<Owner> --> Map<Item, List<Owner>>. I found a few List-to-Map questions, but none of them did what I want.
What I came up with is:
List<Owner> owners = new ArrayList<>(); // populate from file
Map<Item, List<Owner>> map = new HashMap<>();
owners.parallelStream()
.map(Owner::getPairStream)
.flatMap(Function.identity())
.forEach(pair -> {
map.computeIfPresent(pair.getItem(), (k,v)-> {
v.add(pair.getOwner());
return v;
});
map.computeIfAbsent(pair.getItem(), (k) -> {
List<Owner> list = new ArrayList<>();
list.add(pair.getOwner());
return list;
});
});
I can move the forEach part into a separate method, but it still feels too verbose. Plus, I made a Pair class just to make it work. I tried to look into Collectors but couldn't work out how to do what I wanted.

Starting from what you have, you can simplify your code by using groupingBy:
Map<Item, List<Owner>> map = owners.stream()
.flatMap(Owner::getPairStream)
.collect(Collectors.groupingBy(Pair::getItem,
Collectors.mapping(Pair::getOwner,
Collectors.toList())));
You can also dispense with the Pair class by using SimpleEntry:
Map<Item, List<Owner>> map = owners.stream()
.flatMap(owner -> owner.getItems()
.stream()
.map(item -> new AbstractMap.SimpleEntry<>(item, owner)))
.collect(Collectors.groupingBy(Entry::getKey,
Collectors.mapping(Entry::getValue,
Collectors.toList())));
Note that I'm assuming that Item has equals and hashCode overridden accordingly.
Side notes:
You can use map.merge instead of successively calling map.computeIfPresent and map.computeIfAbsent
HashMap and parallelStream make a bad combination (HashMap isn't thread-safe)
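As a rough illustration of the first side note, the sketch below replaces the computeIfPresent/computeIfAbsent pair with a single map.merge call, and also shows computeIfAbsent on its own, which avoids creating a temporary list. The class name, helper names, and the use of plain strings for items and owners are assumptions for illustration; both helpers are meant for single-threaded use with a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeIntoLists {

    // Adds owner to the list stored under item, creating the entry on first use.
    static void addWithMerge(Map<String, List<String>> map, String item, String owner) {
        List<String> single = new ArrayList<>();
        single.add(owner);
        map.merge(item, single, (existing, added) -> {
            existing.addAll(added);
            return existing;
        });
    }

    // Same effect with computeIfAbsent alone, without the temporary list.
    static void addWithComputeIfAbsent(Map<String, List<String>> map, String item, String owner) {
        map.computeIfAbsent(item, k -> new ArrayList<>()).add(owner);
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        addWithMerge(map, "book", "alice");
        addWithMerge(map, "book", "bob");
        addWithComputeIfAbsent(map, "pen", "alice");
        addWithComputeIfAbsent(map, "book", "carol");
        System.out.println(map.get("book")); // [alice, bob, carol]
        System.out.println(map.get("pen"));  // [alice]
    }
}
```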

Java Stream or Map Merge by key

What would be the simplest way to merge map entries whose keys share a prefix, such as "55", "55004", "550009", "550012", into one key "55" holding the sum of all their values?
I'm trying to think of ways to use containsKey or to trim the key, but I'm finding it hard to work out.
Maybe a flatMap to flatten the map, followed by a reduce.
@Test
public void TestM(){
Map<String,Object> map1 = new HashMap();
map1.put("55", 3453.34);
map1.put("55001", 5322.44);
map1.put("55003", 10112.44);
map1.put("55004", 15555.74);
map1.put("77", 1000.74); // instead of 1000 it should be ~1500
map1.put("77004", 444.74);
map1.put("77003", 66.74);
// in real example I'll need "77" and "88" and "101" etc.
// All of which has little pieces like 77004, 77006
Map<String,Double> SumMap = new HashMap<String, Double>();
SumMap = map1.entrySet().stream().map
(e->e.getValue()).reduce(0d, Double::sum);
// INCORRECT
// REDUCE INTO ONE KEY startsWith 55
System.out.println("Map: " + SumMap);
// RESULT should be :
// Map<String, Double> result = { "55": TOTAL }
// real example might be "77": TOTAL, "88": TOTAL, "101": TOTAL
//(reducing away the "77004", "88005" etc.)
}
Basically this code reduces and rolls subitem totals into a bigger key.

It looks like you could use Collectors.groupingBy.
It requires a Function that decides which elements belong to the same group. For elements of the same group, this function must always return the same value, which is used as the key in the resulting map. In your case it looks like you want to group elements whose keys share the same first two characters, which suggests mapping each key to substring(0, 2).
Once we have a way to determine which elements belong to the same group, we can specify how the map should collect them. By default it collects them in a list, so we get a key -> [element0, element1, ...] mapping.
But we can specify our own way of handling elements from the same group by providing our own Collector. Since we want the sum of the values, we can use Collectors.summingDouble(mappingToDouble).
DEMO:
Map<String, Double> map1 = new HashMap<>();
map1.put("661", 123d);
map1.put("662", 321d);
map1.put("55", 3453.34);
map1.put("55001", 5322.44);
map1.put("55003", 10112.44);
map1.put("55004", 15555.74);
Map<String, Double> map = map1.entrySet()
.stream()
.collect(
Collectors.groupingBy(
entry -> entry.getKey().substring(0, 2),
Collectors.summingDouble(Map.Entry::getValue)
)
);
System.out.println(map);
Output: {66=444.0, 55=34443.96}
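One caveat with substring(0, 2) is that it throws for keys shorter than two characters, and a fixed two-character prefix would put a key such as "101" under "10". The sketch below is one possible variation under the assumption that two-character prefixes are what is wanted: it guards against short keys and collects into a TreeMap so that the result is sorted by key. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PrefixSums {

    // First two characters of the key, or the whole key if it is shorter.
    static String prefix(String key) {
        return key.length() <= 2 ? key : key.substring(0, 2);
    }

    // Sums the values of all entries whose keys share the same prefix.
    static Map<String, Double> sumByPrefix(Map<String, Double> source) {
        return source.entrySet().stream()
                .collect(Collectors.groupingBy(
                        e -> prefix(e.getKey()),
                        TreeMap::new, // sorted keys in the result
                        Collectors.summingDouble(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        Map<String, Double> map1 = new HashMap<>();
        map1.put("55", 3453.34);
        map1.put("55001", 5322.44);
        map1.put("77", 1000.74);
        map1.put("77004", 444.74);
        map1.put("77003", 66.74);
        // Prints one total for "55" and one for "77".
        System.out.println(sumByPrefix(map1));
    }
}
```

If the prefixes have varying lengths, the prefix function would need a different rule, for example matching against a known set of parent keys.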

Simplifying loop with Java 8

I have a method that adds maps to a cache and I was wondering what I could do more to simplify this loop with Java 8.
What I have done so far:
Standard looping we all know:
for(int i = 0; i < catalogNames.size(); i++){
List<GenericCatalog> list = DummyData.getCatalog(catalogNames.get(i));
Map<String, GenericCatalog> map = new LinkedHashMap<>();
for(GenericCatalog item : list){
map.put(item.name.get(), item);
}
catalogCache.put(catalogNames.get(i), map);};
Second iteration using forEach:
catalogNames.forEach(e -> {
Map<String, GenericCatalog> map = new LinkedHashMap<>();
DummyData.getCatalog(e).forEach(d -> {
map.put(d.name.get(), d);
});
catalogCache.put(e, map);});
And a third iteration that removes unnecessary braces:
catalogNames.forEach(objName -> {
Map<String, GenericCatalog> map = new LinkedHashMap<>();
DummyData.getCatalog(objName).forEach(obj -> map.put(obj.name.get(), obj));
catalogCache.put(objName, map);});
My question now is: what else can be done to simplify this?
I do understand that it's not really necessary to do anything more with this method at this point, but I was curious about the possibilities.

There is a small issue with solutions 2 and 3: they rely on side effects. As the java.util.stream package documentation puts it:
Side-effects in behavioral parameters to stream operations are, in
general, discouraged, as they can often lead to unwitting violations
of the statelessness requirement, as well as other thread-safety
hazards.
As an example of how to transform a stream pipeline that
inappropriately uses side-effects to one that does not, the following
code searches a stream of strings for those matching a given regular
expression, and puts the matches in a list.
ArrayList<String> results = new ArrayList<>();
stream.filter(s -> pattern.matcher(s).matches())
.forEach(s -> results.add(s)); // Unnecessary use of side-effects!
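The side-effect-free counterpart of that snippet collects the matches with Collectors.toList() instead of adding to an external list. A minimal runnable sketch follows; the class name, method name, and sample pattern are chosen only for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoSideEffects {

    // Returns the strings that fully match the pattern, without mutating shared state.
    static List<String> matching(Stream<String> stream, Pattern pattern) {
        return stream.filter(s -> pattern.matcher(s).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern letterThenDigits = Pattern.compile("[a-z]\\d+");
        System.out.println(matching(Stream.of("a1", "b", "c22"), letterThenDigits)); // [a1, c22]
    }
}
```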
So instead of using forEach to populate the HashMap it is better to use Collectors.toMap(..). I am not 100% sure about your data structure, but I hope it is close enough.
There is a List and corresponding Map:
List<Integer> ints = Arrays.asList(1,2,3);
Map<Integer,List<Double>> catalog = new HashMap<>();
catalog.put(1,Arrays.asList(1.1,2.2,3.3,4.4));
catalog.put(2,Arrays.asList(1.1,2.2,3.3));
catalog.put(3,Arrays.asList(1.1,2.2));
Now we would like to get a new Map whose keys are the elements of the original List and whose values are themselves Maps. Each nested Map's key is a transformed element of the corresponding catalog List, and its value is that List element itself. A convoluted description, and even more convoluted code below:
Map<Integer, Map<Integer, Double>> result = ints.stream().collect(
Collectors.toMap(
el -> el,
el -> catalog.get(el).stream().
collect(Collectors.toMap(
c -> c.intValue(),
c -> c
))
)
);
System.out.println(result);
// {1={1=1.1, 2=2.2, 3=3.3, 4=4.4}, 2={1=1.1, 2=2.2, 3=3.3}, 3={1=1.1, 2=2.2}}
I hope this helps.

How about utilizing Collectors from the stream API? Specifically, Collectors#toMap
Map<String, Map<String, GenericCatalog>> cache = catalogNames.stream().collect(Collectors.toMap(Function.identity(),
    name -> DummyData.getCatalog(name).stream().collect(Collectors.toMap(t -> t.name.get(), Function.identity(),
        //these two lines only needed if HashMap can't be used
        (o, t) -> /* merge function */,
        LinkedHashMap::new))));
This avoids mutating an existing collection, and provides you your own individual copy of a map (which you can use to update a cache, or whatever you desire).
Also I would disagree with arbitrarily putting end braces at the end of a line of code - most style guides would also be against this as it somewhat disturbs the flow of the code to most readers.

How can I do the following using Java 8 Lambdas

I have a set of Strings as follows
Set<String> ids;
Each id is of the form #userId:#sessionId, e.g. 1:2, where 1 is the userId and 2 is the sessionId.
I want to split these so that the userId becomes the key in a HashMap. Each userId is unique, but it can have multiple sessions. So how do I get the values from a Set<String> into a Map<String, List<String>>?
For example:
If the set contains the following values {1:2, 2:2, 1:3}
The map should contain
key=1 value=<2,3>
key=2 value=<2>

By "lambdas" I'm assuming you mean streams, because a straightforward loop to build a map wouldn't really require lambdas. If so, you can get close, but not quite there, with some of the built-in Collectors.
Map<String, List<String>> map = ids.stream()
.collect(Collectors.groupingBy(id -> id.split(":")[0]));
// result: {"1": ["1:2", "1:3"], "2": ["2:2"]}
This will group by the left number, but will store the full strings in the map values rather than just the right-hand portion.
Map<String, List<String>> map = ids.stream()
    .collect(Collectors.toMap(
        id -> id.split(":")[0],
        id -> new ArrayList<>(Arrays.asList(id.split(":")[1])),
        (l1, l2) -> {
            List<String> l3 = new ArrayList<>(l1);
            l3.addAll(l2);
            return l3;
        }
    ));
// result: {"1": ["2", "3"], "2": ["2"]}
This will return exactly what you want, but suffers from severe inefficiency. Rather than adding all equal elements to a single list, it will create many temporary lists and join them together. That turns what should be an O(n) operation into an O(n^2) one.
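One way to keep only the session part without the temporary lists is to combine groupingBy with a downstream Collectors.mapping. This is a sketch under the assumption that every id is well formed (contains a colon); a malformed id would throw an ArrayIndexOutOfBoundsException. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class SessionsByUser {

    // Splits each "userId:sessionId" once, groups by userId, keeps only the sessionId.
    static Map<String, List<String>> group(Set<String> ids) {
        return ids.stream()
                .map(id -> id.split(":", 2))
                .collect(Collectors.groupingBy(
                        parts -> parts[0],
                        Collectors.mapping(parts -> parts[1], Collectors.toList())));
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order, so the printed lists are predictable.
        Set<String> ids = new LinkedHashSet<>(Arrays.asList("1:2", "2:2", "1:3"));
        Map<String, List<String>> map = group(ids);
        System.out.println(map.get("1")); // [2, 3]
        System.out.println(map.get("2")); // [2]
    }
}
```

Each element is appended to one list per key, so the work stays linear in the number of ids.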

You could use HashMap<Integer,Set<T>> or HashMap<Integer,List<T>>, where T is the type of value1, value2, etc..
I think you're asking for something similar to this question.
Also, here is a link for several solutions on how to proceed with this problem.

Java 8 re-map with modified value

I'm about to get a grasp on the new Java 8 stream and lambda options, but there are still a few subtleties that I haven't yet wrapped my mind around.
Let's say that I have a map where the keys are the names of people. The value for each name is a map of ages and Person instances. Further assume that there does not exist more than one person with the same name and age.
Map<String, NavigableMap<Long, Person>> names2PeopleAges = new HashMap<String, NavigableMap<Long, Person>>();
After populating that map (elsewhere), I want to produce another map of the oldest person for each name. I want to wind up with a Map<String, Person> in which the keys are identical to those in the first map, but the value for each entry is the value of the value map for which the key of the value map has the highest number.
Taking advantage of the fact that a NavigableMap sorts its keys, I can do this:
Map<String, Person> oldestPeopleByName = new HashMap<String, Person>();
names2PeopleAges.forEach((name, peopleAges) -> {
oldestPeopleByName.put(name, peopleAges.lastEntry().getValue());
});
Question: Can I replace the last bit of code above with a single Java 8 stream/collect/map/flatten/etc. operation to produce the same result? In pseudo-code, my first inclination would be:
Map<String, Person> oldestPeopleByName = names2PeopleAges.forEachEntry().mapValue(value->value.lastEntry().getValue());
This question is meant to be straightforward without any tricks or oddities---just a simple question of how I can fully leverage Java 8!
Bonus: Let's say that the NavigableMap<Long, Person> above is instead merely a Map<Long, Person>. Could you extend the first answer so that it collects the person with the highest age value, now that NavigableMap.lastEntry() is not available?

You can create a Stream of the entries and collect it to a Map :
Map<String, Person> oldestPeopleByName =
names2PeopleAges.entrySet()
.stream()
.collect (Collectors.toMap(e->e.getKey(),
e->e.getValue().lastEntry().getValue())
);
Now, without lastEntry :
Map<String, Person> oldestPeopleByName =
    names2PeopleAges.entrySet()
        .stream()
        .collect (Collectors.toMap(e->e.getKey(),
            // max returns an Optional<Long>; get() unwraps it (assumes no inner map is empty)
            e->e.getValue().get(e.getValue().keySet().stream().max(Long::compareTo).get()))
        );
Here, instead of relying on lastEntry, we search for the max key in each of the internal Maps, and get the corresponding Person of the max key.
I might have some silly typos, since I haven't actually tested it, but in principle it should work.
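Another option for the plain Map<Long, Person> case is to take the maximum over the inner map's entries rather than over its keys, which avoids the second lookup. The sketch below uses Collections.max with Map.Entry.comparingByKey(); the Person stand-in and sample data are assumptions, and each inner map is assumed to be non-empty, since Collections.max throws on an empty collection.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class OldestByName {

    // Minimal stand-in for the Person class from the question.
    static class Person {
        final String name;
        final long age;
        Person(String name, long age) {
            this.name = name;
            this.age = age;
        }
    }

    // For each name, picks the Person stored under the largest age key.
    static Map<String, Person> oldest(Map<String, Map<Long, Person>> names2PeopleAges) {
        return names2PeopleAges.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> Collections.max(e.getValue().entrySet(),
                                Map.Entry.comparingByKey()).getValue()));
    }

    public static void main(String[] args) {
        Map<Long, Person> anns = new HashMap<>();
        anns.put(30L, new Person("Ann", 30));
        anns.put(72L, new Person("Ann", 72));
        Map<String, Map<Long, Person>> names2PeopleAges = new HashMap<>();
        names2PeopleAges.put("Ann", anns);
        System.out.println(oldest(names2PeopleAges).get("Ann").age); // 72
    }
}
```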

A similar question has been asked before, and I posted an answer there: https://stackoverflow.com/a/75004577/6777695. That answer is slightly different from the one accepted here. I think remapping a key and/or value belongs in the map step of a stream rather than in collect, since the stream's map function is designed for the actual transformation of data. For your use case I would suggest the following code snippet:
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;
public class App {
    public static void main(String[] args) {
        // names2PeopleAges is the populated map from the question; Map.entry requires Java 9+
        Map<String, Person> oldestPeopleByName = names2PeopleAges.entrySet().stream()
            .map(entry -> Map.entry(entry.getKey(), entry.getValue().lastEntry().getValue()))
            .collect(Collectors.toMap(Entry::getKey, Entry::getValue));
    }
}
