Filtering a map

Filtering a map - java

I have a list of entries, where each entry has studentId and subjectId attributes.
List<Candidate> candidates
class Candidate {
...
String studentId;
String subjectId;
}
The objective is to derive a map of subjectId to list of studentIds, for those subjects which have been subscribed to by MORE than one student.
I can obviously create a temporary map by iterating over the candidates (a big count) and remove the single entries later, which seems a costly route.
Any other suggestions?
We are using Java 1.7

Following this answer (corrected and completed):
Map<String, Integer> map = candidates.stream()
.collect(Collectors.groupingBy(Candidate::getSubjectId))
.entrySet().stream().filter(x -> x.getValue().size()>1)
.collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue().size()));
You could also use candidates.parallelStream().
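The snippet above yields the number of students per subject. If the list of studentIds itself is wanted, as the question asks, a downstream mapping collector can be added. A minimal sketch, assuming Java 8 or later and that Candidate exposes getStudentId() and getSubjectId() getters (the question only shows the fields); the class name SharedSubjects is made up for illustration:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SharedSubjects {

    // Minimal stand-in for the question's Candidate class; the getters are assumed.
    static class Candidate {
        private final String studentId;
        private final String subjectId;

        Candidate(String studentId, String subjectId) {
            this.studentId = studentId;
            this.subjectId = subjectId;
        }

        String getStudentId() { return studentId; }
        String getSubjectId() { return subjectId; }
    }

    // Maps each subjectId to its studentIds, keeping only subjects with more than one student.
    static Map<String, List<String>> sharedSubjects(List<Candidate> candidates) {
        Map<String, List<String>> bySubject = candidates.stream()
            .collect(Collectors.groupingBy(
                Candidate::getSubjectId,
                HashMap::new, // explicit map factory, so the result is known to be mutable
                Collectors.mapping(Candidate::getStudentId, Collectors.toList())));
        // Drop single-student subjects in place instead of streaming into a second map.
        bySubject.values().removeIf(students -> students.size() <= 1);
        return bySubject;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("s1", "math"),
            new Candidate("s2", "math"),
            new Candidate("s3", "art"));
        System.out.println(sharedSubjects(candidates));
    }
}
```

Removing the unwanted entries with removeIf avoids the second stream over the entry set, at the cost of mutating the grouped map.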

When you say 'costly', it's not exactly clear what you mean, but I'll make an attempt.
With Java 8, you can use the groupingBy collector to create a
Map<String, List<Candidate>> groupedResults = candidates.stream().collect(Collectors.groupingBy(Candidate::getSubjectId));
and then simply filter out the entries whose list size is <= 1. This is rather simple, but note that groupingBy still builds the complete intermediate map in memory before anything is filtered, so streams by themselves do not reduce the memory footprint.
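Since the question mentions Java 1.7, where streams are not available, the same result can be obtained with one pass over the list plus an in-place cleanup through the map's iterator, so no second map is needed. A sketch under the assumption that Candidate has getStudentId() and getSubjectId() getters; the class name Java7Grouping is hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class Java7Grouping {

    // Minimal stand-in for the question's Candidate class; the getters are assumed.
    static class Candidate {
        private final String studentId;
        private final String subjectId;

        Candidate(String studentId, String subjectId) {
            this.studentId = studentId;
            this.subjectId = subjectId;
        }

        String getStudentId() { return studentId; }
        String getSubjectId() { return subjectId; }
    }

    static Map<String, List<String>> subjectsWithSeveralStudents(List<Candidate> candidates) {
        Map<String, List<String>> bySubject = new HashMap<>();
        // Pass 1: collect the student ids of every subject.
        for (Candidate c : candidates) {
            List<String> students = bySubject.get(c.getSubjectId());
            if (students == null) {
                students = new ArrayList<>();
                bySubject.put(c.getSubjectId(), students);
            }
            students.add(c.getStudentId());
        }
        // Pass 2: remove single-student subjects through the iterator,
        // which is the safe way to remove entries while iterating.
        Iterator<Map.Entry<String, List<String>>> it = bySubject.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().size() <= 1) {
                it.remove();
            }
        }
        return bySubject;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("s1", "math"),
            new Candidate("s2", "math"),
            new Candidate("s3", "art"));
        System.out.println(subjectsWithSeveralStudents(candidates));
    }
}
```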

Related

GroupBy on ArrayList of HashMap in java

I want to do a "group-by" on an ArrayList of HashMaps. As my data is not fixed, I don't have any fixed classes.
Data is shown as below.
[{"name":"laxman","state":"Karnataka","Mobile":9034782882},
{"name":"rahul","state":"Kerala","Mobile":9034782882},
{"name":"laxman","state":"karnataka","Mobile":9034782882},
{"name":"ram","state":"delhi","Mobile":9034782882}]
The above keys are not fixed, So, I can't have classes for it.
Data and formulas will be dynamic. But for now, I am taking this example to understand Stream.Collector on this data.
Now, I want to get the count on basis of name and state,
So basically I want to group-by on name and state and want to get count.
I tried to use Stream.Collector but am not able to achieve what I want.

You can accomplish this with Collectors.groupingBy, using a List as the key of the returned Map:
Map<List<String>, Long> result = yourListOfMaps.stream()
.collect(Collectors.groupingBy(
m -> Arrays.asList(String.valueOf(m.get("name")), String.valueOf(m.get("state"))),
Collectors.counting()));
This works well because all implementations of List in Java implement hashCode and equals consistently, which is a must for every class that is to be used as the key of any Map implementation.
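Note that the sample data contains both "Karnataka" and "karnataka"; with the snippet above they end up in different groups, because String equality is case-sensitive. If they should be counted together, the key components can be normalized first. A self-contained sketch; the class name GroupByNameAndState, the helper newRow and the choice to lowercase with Locale.ROOT are assumptions made for illustration, not part of the original answer:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByNameAndState {

    // Counts rows per (name, state) pair, ignoring case differences in both values.
    static Map<List<String>, Long> countByNameAndState(List<Map<String, Object>> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(
                r -> Arrays.asList(normalize(r.get("name")), normalize(r.get("state"))),
                Collectors.counting()));
    }

    // String.valueOf turns a missing (null) value into the text "null" instead of failing.
    private static String normalize(Object value) {
        return String.valueOf(value).toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper that builds one row of the question's sample data.
    static Map<String, Object> newRow(String name, String state, long mobile) {
        Map<String, Object> row = new HashMap<>();
        row.put("name", name);
        row.put("state", state);
        row.put("Mobile", mobile);
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = Arrays.asList(
            newRow("laxman", "Karnataka", 9034782882L),
            newRow("rahul", "Kerala", 9034782882L),
            newRow("laxman", "karnataka", 9034782882L),
            newRow("ram", "delhi", 9034782882L));
        System.out.println(countByNameAndState(rows));
    }
}
```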

You have to do groupingBy twice: once on the key and once again on the value.
Map<String, Map<Object, Long>> map = listOfMap.stream().flatMap(a -> a.entrySet().stream())
.collect(Collectors.groupingBy(Map.Entry<String, String>::getKey,
Collectors.groupingBy(Map.Entry::getValue, Collectors.counting())));
Output
{mobile={9034782882=4}, name={rahul=1, laxman=2, ram=1}, state={Karnataka=2, delhi=1, Kerala=1}}

How to create a nested Map using Collectors.groupingBy?

I have a list of class say ProductDto
public class ProductDto {
private String Id;
private String status;
private Booker booker;
private String category;
private String type;
}
I want to have a Map as below:-
Map<String,Map<String,Map<String,Booker>>>
The properties are to be mapped as below:
Map<status,Map<category,Map<type,Booker>>>
I know one level of grouping could be done easily without any hassles using Collectors.groupingBy.
I tried to use this for nested level but it failed for me when same values started coming for fields that are keys.
My code is something like below:-
list.stream()
.collect(Collectors.groupingBy(
(FenergoProductDto productDto) ->
productDto.getStatus()
,
Collectors.toMap(k -> k.getProductCategory(), fProductDto -> {
Map<String, Booker> productTypeMap = new ProductTypes();
productTypeMap.put(fProductDto.getProductTypeName(),
createBooker(fProductDto.getBookingEntityName()));
return productTypeMap;
})
));
If anyone knows a good approach to do this by using streams, please share!

Abstract / Brief discussion
Having a map of maps of maps is questionable when seen from an object-oriented perspective, as it might seem that you're lacking some abstraction (i.e. you could create a class Result that encapsulates the results of the nested grouping). However, it's perfectly reasonable when considered exclusively from a pure data-oriented approach.
So here I present two approaches: the first one is purely data-oriented (with nested groupingBy calls, hence nested maps), while the second one is more OO-friendly and does a better job of abstracting the grouping criteria. Just pick the one which better represents your intentions and coding standards/traditions and, more importantly, the one you like most.
Data-oriented approach
For the first approach, you can just nest the groupingBy calls:
Map<String, Map<String, Map<String, List<Booker>>>> result = list.stream()
.collect(Collectors.groupingBy(ProductDto::getStatus,
Collectors.groupingBy(ProductDto::getCategory,
Collectors.groupingBy(ProductDto::getType,
Collectors.mapping(
ProductDto::getBooker,
Collectors.toList())))));
As you see, the result is a Map<String, Map<String, Map<String, List<Booker>>>>. This is because there might be more than one ProductDto instance with the same (status, category, type) combination.
Also, as you need Booker instances instead of ProductDto instances, I'm adapting the last groupingBy collector so that it returns Bookers instead of productDtos.
About reduction
If you need to have only one Booker instance instead of a List<Booker> as the value of the innermost map, you would need a way to reduce Booker instances, i.e. convert many instances into one by means of an associative operation (accumulating the sum of some attribute being the most common one).
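One way to sketch such a reduction is to replace the innermost groupingBy with the three-argument Collectors.toMap, whose merge function decides what happens when two Bookers share the same (status, category, type) combination. The example below simply keeps the first Booker encountered, which is an arbitrary choice made for illustration; ProductDto and Booker are reduced to minimal stand-ins with assumed getters, and the class name NestedBookers is hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedBookers {

    // Minimal stand-in; the real Booker class is not shown in the question.
    static class Booker {
        final String name;
        Booker(String name) { this.name = name; }
    }

    // Minimal stand-in for ProductDto with the getters the answer relies on.
    static class ProductDto {
        private final String status;
        private final String category;
        private final String type;
        private final Booker booker;

        ProductDto(String status, String category, String type, Booker booker) {
            this.status = status;
            this.category = category;
            this.type = type;
            this.booker = booker;
        }

        String getStatus() { return status; }
        String getCategory() { return category; }
        String getType() { return type; }
        Booker getBooker() { return booker; }
    }

    // status -> category -> type -> single Booker (the first one encountered).
    static Map<String, Map<String, Map<String, Booker>>> group(List<ProductDto> list) {
        return list.stream()
            .collect(Collectors.groupingBy(ProductDto::getStatus,
                Collectors.groupingBy(ProductDto::getCategory,
                    Collectors.toMap(
                        ProductDto::getType,
                        ProductDto::getBooker,
                        (first, second) -> first)))); // merge rule: keep the earlier Booker
    }

    public static void main(String[] args) {
        List<ProductDto> list = Arrays.asList(
            new ProductDto("ACTIVE", "loan", "fixed", new Booker("alice")),
            new ProductDto("ACTIVE", "loan", "fixed", new Booker("bob")),
            new ProductDto("CLOSED", "loan", "fixed", new Booker("carol")));
        System.out.println(group(list).get("ACTIVE").get("loan").get("fixed").name);
    }
}
```

Without the merge function, toMap would throw an IllegalStateException on the first duplicate key, which is likely what the question's attempt ran into.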
Object-oriented friendly approach
For the second approach, having a Map<String, Map<String, Map<String, List<Booker>>>> might be seen as bad practice or even as pure evil. So, instead of having a map of maps of maps of lists, you could have only one map of lists whose keys represent the combination of the 3 properties you want to group by.
The easiest way to do this is to use a List as the key, as lists already provide hashCode and equals implementations:
Map<List<String>, List<Booker>> result = list.stream()
    .collect(Collectors.groupingBy(
        dto -> Arrays.asList(dto.getStatus(), dto.getCategory(), dto.getType()),
        Collectors.mapping(ProductDto::getBooker, Collectors.toList())));
If you are on Java 9+, you can use List.of instead of Arrays.asList, as List.of returns a fully immutable and highly optimized list.
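Looking up a group then only requires building an equal list, because List equality is element-wise and order-sensitive. The sketch below shows the grouping together with such a lookup. It sticks to Arrays.asList so that it also runs on Java 8; one difference worth knowing is that List.of rejects null elements with a NullPointerException, while Arrays.asList accepts them. ProductDto and Booker are minimal stand-ins with assumed getters, and the class name CompositeKeyGrouping is hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyGrouping {

    // Minimal stand-in; the real Booker class is not shown in the question.
    static class Booker {
        final String name;
        Booker(String name) { this.name = name; }
    }

    // Minimal stand-in for ProductDto with the getters the answer relies on.
    static class ProductDto {
        private final String status;
        private final String category;
        private final String type;
        private final Booker booker;

        ProductDto(String status, String category, String type, Booker booker) {
            this.status = status;
            this.category = category;
            this.type = type;
            this.booker = booker;
        }

        String getStatus() { return status; }
        String getCategory() { return category; }
        String getType() { return type; }
        Booker getBooker() { return booker; }
    }

    // Groups Bookers under a composite [status, category, type] key.
    static Map<List<String>, List<Booker>> group(List<ProductDto> list) {
        return list.stream()
            .collect(Collectors.groupingBy(
                dto -> Arrays.asList(dto.getStatus(), dto.getCategory(), dto.getType()),
                Collectors.mapping(ProductDto::getBooker, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<ProductDto> list = Arrays.asList(
            new ProductDto("ACTIVE", "loan", "fixed", new Booker("alice")),
            new ProductDto("ACTIVE", "loan", "fixed", new Booker("bob")),
            new ProductDto("CLOSED", "loan", "fixed", new Booker("carol")));
        Map<List<String>, List<Booker>> grouped = group(list);
        // The lookup key is a fresh list with the same elements in the same order.
        System.out.println(grouped.get(Arrays.asList("ACTIVE", "loan", "fixed")).size());
    }
}
```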

nested groupingBy questions and solutions:
q. print all male and female dept-wise(nested groupingBy):
ans:
employeeList.stream().collect(Collectors.groupingBy(Employee::getDepartment,Collectors.groupingBy(Employee::getGender)))
.entrySet().stream().forEach(System.out::println);
q. print the employees more than 25 and not - male and female - dept-wise
ans:
employeeList.stream().collect(
Collectors.groupingBy(Employee::getDepartment, Collectors.groupingBy(Employee::getGender, Collectors.partitioningBy(emp -> emp.getAge() > 25))))
.entrySet().stream().forEach(System.out::println);
q. eldest male and female from each department
ans:
employeeList.stream().collect(Collectors.groupingBy(Employee::getDepartment,Collectors.groupingBy(Employee::getGender,Collectors.maxBy(Comparator.comparing(Employee::getAge)))))
.entrySet().stream().forEach(System.out::println);
Some more helpful questions: https://www.youtube.com/watch?v=AFmyV43UBgc

Lowercase all HashMap keys

I've run into a scenario where I want to lowercase all the keys of a HashMap (don't ask why, I just have to do this). The HashMap has some millions of entries.
At first, I thought I'd just create a new Map, iterate over the entries of the map that is to be lowercased, and add the respective values. This task should run only once per day or so, so I thought I could bear this.
Map<String, Long> lowerCaseMap = new HashMap<>(myMap.size());
for (Map.Entry<String, Long> entry : myMap.entrySet()) {
lowerCaseMap.put(entry.getKey().toLowerCase(), entry.getValue());
}
This, however, caused some OutOfMemory errors when my server was overloaded at the moment I was about to copy the Map.
Now my question is, how can I accomplish this task with the smallest memory footprint?
Would it help to remove each key from the old Map once its lowercased version has been added to the new one?
Could I utilize Java 8 streams to make this faster (e.g. something like this)?
Map<String, Long> lowerCaseMap = myMap.entrySet().parallelStream().collect(Collectors.toMap(entry -> entry.getKey().toLowerCase(), Map.Entry::getValue));
Update
It seems that it's a Collections.unmodifiableMap, so I don't have the option of removing each key once it has been added, lowercased, to the new Map.

Instead of using HashMap, you could try using a TreeMap with case-insensitive ordering. This would avoid the need to create a lower-case version of each key:
Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
map.putAll(myMap);
Once you've constructed this map, put() and get() will behave case-insensitively, so you can save and fetch values using all-lowercase keys. Iterating over keys will return them in their original, possibly upper-case forms.
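A small sketch of how such a map behaves; the class and method names are made up for illustration. Two caveats apply: lookups in a TreeMap take logarithmic rather than constant time, and keys that differ only by case collapse into a single entry, with the surviving value depending on the iteration order of the source map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CaseInsensitiveLookup {

    // Copies the entries into a map whose key comparison ignores case.
    // The key strings themselves are reused, not re-created in lower case.
    static Map<String, Long> caseInsensitiveCopy(Map<String, Long> source) {
        Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.putAll(source);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Long> myMap = new HashMap<>();
        myMap.put("Foo", 1L);
        myMap.put("BAR", 2L);

        Map<String, Long> map = caseInsensitiveCopy(myMap);
        System.out.println(map.get("foo"));         // found although the stored key is "Foo"
        System.out.println(map.containsKey("bar")); // found although the stored key is "BAR"
        System.out.println(map.keySet());           // keys keep their original spelling
    }
}
```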
Here are some similar questions:
Case insensitive string as HashMap key
Is there a good way to have a Map<String, ?> get and put ignoring case?

You cannot remove an entry directly from the map while iterating over it; you will get a ConcurrentModificationException if you try. (Removal during iteration is only safe through the iterator's own remove() method.)
As the issue is an OutOfMemoryError, not a performance problem, using a parallel stream will not help either.
Although some Stream API operations are evaluated lazily, this still leads to two maps being in memory at some point, so you will still have the issue.
To work around it, I see only two ways:
Give more memory to your process (by increasing -Xmx on the Java command line). Memory is cheap these days ;)
Split the map and work in chunks: for example, divide the size of the map by ten, process one chunk at a time, and delete the processed entries before processing the next chunk. This way, instead of holding two copies of the map in memory, you hold roughly 1.1 times the map.
For the split algorithm, you can try something like this using the Stream API:
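// fromMap must be a mutable map; an unmodifiable view would throw on remove()
Map<String, String> toMap = new HashMap<>();
// at least 1, so that maps with fewer than ten entries are still processed
int chunk = Math.max(1, fromMap.size() / 10);
// loop until the source is empty, so the remainder of the division is not left behind
while (!fromMap.isEmpty()) {
    // copy the next chunk of entries out of the source map
    List<Entry<String, String>> subEntries = fromMap.entrySet().stream().limit(chunk)
        .collect(Collectors.toList());
    for (Entry<String, String> entry : subEntries) {
        toMap.put(entry.getKey().toLowerCase(), entry.getValue());
        fromMap.remove(entry.getKey());
    }
}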
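A variation on the same idea moves the entries one at a time through the map's iterator, so no intermediate chunk list is needed. This is only a sketch and it assumes the source map is mutable, which, per the question's update, is not the case for a Collections.unmodifiableMap view (the underlying map would have to be accessible). The class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Locale;
import java.util.Map;

class LowercaseKeysByMoving {

    // Empties the source map while filling a new map with lowercased keys.
    static Map<String, Long> moveWithLowercaseKeys(Map<String, Long> source) {
        Map<String, Long> target = new HashMap<>();
        Iterator<Map.Entry<String, Long>> it = source.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            // Read key and value first: the behaviour of an entry is not
            // specified once its backing map has been modified.
            String key = entry.getKey();
            Long value = entry.getValue();
            it.remove(); // safe removal during iteration
            target.put(key.toLowerCase(Locale.ROOT), value);
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, Long> source = new HashMap<>();
        source.put("Foo", 1L);
        source.put("BAR", 2L);
        System.out.println(moveWithLowercaseKeys(source));
        System.out.println(source.isEmpty());
    }
}
```

Whether this actually keeps peak memory lower depends on the map implementation and the garbage collector; the saving comes from releasing the old entries and key strings as they are moved, since the source map's internal table is typically not shrunk on removal.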

The concerns in the above answers are correct, and you might need to reconsider the data structure you are using.
In my case, I had a simple map whose keys I needed to change to lower case.
Take a look at my snippet; it's a trivial solution and performs badly.
private void convertAllFilterKeysToLowerCase() {
HashSet keysToRemove = new HashSet();
getFilters().keySet().forEach(o -> {
if(!o.equals(((String) o).toLowerCase()))
keysToRemove.add(o);
});
keysToRemove.forEach(o -> getFilters().put(((String) o).toLowerCase(), getFilters().remove(o)));
}

Not sure about the memory footprint. If using Kotlin, you can try the following.
val lowerCaseMap = myMap.mapKeys { it.key.toLowerCase() }
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/map-keys.html

Convert CSV string value to Hashmap using Stream lambda

I am trying to get a HashMap<String,String> from a CSV String value using the Java 8 Stream API. I am able to get the values etc., but how do I add the index of the List as the key in my HashMap?
(HashMap<String, String>) Arrays.asList(sContent.split(",")).stream()
.collect(Collectors.toMap(??????,i->i );
So my map will contain like Key ,Value as below.
0->Value1
1->Value2
2->Value3
...
Using plain Java I can do it easily, but I want to use the Java 8 Stream API.

That’s a strange requirement. When you call Arrays.asList(sContent.split(",")), you already have a data structure which maps int numbers to their Strings. The result is a List<String>, something on which you can invoke .get(intNumber) to get the desired value as you can with a Map<Integer,String>…
However, if it really has to be a Map and you want to use the stream API, you may use
Map<Integer,String> map=new HashMap<>();
Pattern.compile(",").splitAsStream(sContent).forEachOrdered(s->map.put(map.size(), s));
To explain it, Pattern.compile(separator).splitAsStream(string) does the same as Arrays.stream(string.split(separator)) but doesn’t create an intermediate array, so it’s preferable. And you don’t need a separate counter as the map intrinsically maintains such a counter, its size.
The code above is the simplest code for creating such a map ad hoc, whereas a clean solution would avoid mutable state outside of the stream operation itself and return a new map on completion. But the clean solution is not always the most concise:
Map<Integer,String> map=Pattern.compile(",").splitAsStream(sContent)
.collect(HashMap::new, (m,s)->m.put(m.size(), s),
(m1,m2)->{ int off=m1.size(); m2.forEach((k,v)->m1.put(k+off, v)); }
);
While the first two arguments to collect define an operation similar to the previous solution, the biggest obstacle is the third argument, a function only used when requesting parallel processing though a single csv line is unlikely to ever benefit from parallel processing. But omitting it is not supported. If used, it will merge two maps which are the result of two parallel operations. Since both used their own counter, the indices of the second map have to be adapted by adding the size of the first map.
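To see the combiner at work, the same collect call can be wrapped in a method and run both sequentially and in parallel; both runs should produce the same index-to-value map, because the stream is ordered and the combiner shifts the indices of the right-hand map. A sketch, with the class name CsvIndexMap and the boolean switch being assumptions added for the demonstration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class CsvIndexMap {

    private static final Pattern COMMA = Pattern.compile(",");

    // Builds index -> value from a comma-separated line, optionally in parallel.
    static Map<Integer, String> toIndexMap(String csv, boolean parallel) {
        Stream<String> parts = COMMA.splitAsStream(csv);
        if (parallel) {
            parts = parts.parallel();
        }
        return parts.collect(
            HashMap::new,                  // supplier: one fresh map per partial result
            (m, s) -> m.put(m.size(), s),  // accumulator: next index is the current size
            (m1, m2) -> {                  // combiner: append m2 after m1, shifting its indices
                int off = m1.size();
                m2.forEach((k, v) -> m1.put(k + off, v));
            });
    }

    public static void main(String[] args) {
        System.out.println(toIndexMap("Value1,Value2,Value3", false));
        System.out.println(toIndexMap("Value1,Value2,Value3", true));
    }
}
```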

You can use below approach to get you the required output
private Map<Integer, String> getMapFromCSVString(String csvString) {
AtomicInteger integer = new AtomicInteger();
return Arrays.stream(csvString.split(","))
.collect(Collectors.toMap(splittedStr -> integer.getAndAdd(1), splittedStr -> splittedStr));
}
I have written below test to verify the output.
@Test
public void getCsvValuesIntoMap(){
String csvString ="shirish,vilas,Nikhil";
Map<Integer,String> expected = new HashMap<Integer,String>(){{
put(0,"shirish");
put(1,"vilas");
put(2,"Nikhil");
}};
Map<Integer,String> result = getMapFromCSVString(csvString);
System.out.println(result);
assertEquals(expected,result);
}

You can do it creating a range of indices like this:
String[] values = sContent.split(",");
Map<Integer, String> result = IntStream.range(0, values.length)
.boxed()
.collect(Collectors.toMap(Function.identity(), i -> values[i]));

Java 8 List<T> into Map<K, V>

I want to convert a List of Objects to a Map, where the Map's key and value are attributes of the Objects in the List.
Here is a Java 7 snippet of such a conversion:
private Map<String, Child> getChildren(List<Family> families ) {
Map<String, Child> convertedMap = new HashMap<String, Child>();
for (Family family : families) {
convertedMap.put(family.getId(), family.getParent().getChild());
}
return convertedMap;
}

It should be something similar to...
Map<String, Child> m = families.stream()
.collect(Collectors.toMap(Family::getId, f -> f.getParent().getChild()));

Jason gave a decent answer (+1) but I should point out that it has different semantics from the OP's Java 7 code. The issue concerns the behavior if two family instances in the input list have duplicate IDs. Maybe they're guaranteed unique, in which case there is no difference. If there are duplicates, though, with the OP's original code, a Family later in the list will overwrite the map entry for a Family earlier in the list that has the same ID.
With Jason's code (shown below, slightly modified):
Map<String, Child> getChildren(List<Family> families) {
return families.stream()
.collect(Collectors.toMap(Family::getId, f -> f.getParent().getChild()));
}
the Collectors.toMap operation will throw IllegalStateException if there are any duplicate keys. This is somewhat unpleasant, but at least it notifies you that there are duplicates instead of potentially losing data silently. The rule for Collectors.toMap(keyMapper, valueMapper) is that you need to be sure that the key mapper function returns a unique key for every element of the stream.
What you need to do about this -- if anything -- depends on the problem domain. One possibility is to use the three-arg version: Collectors.toMap(keyMapper, valueMapper, mergeFunction). This specifies an extra function that gets called in the case of duplicates. If you want to have later entries overwrite earlier ones (matching the original Java 7 code), you'd do this:
Map<String, Child> getChildren(List<Family> families) {
return families.stream()
.collect(Collectors.toMap(Family::getId, f -> f.getParent().getChild(),
(child1, child2) -> child2));
}
An alternative would be to build up a list of children for each family instead of having just one child. You could write a more complicated merging function that created a list for the first child and appended to this list for the second and subsequent children. This is so common that there is a special groupingBy collector that does this automatically. By itself this would produce a list of families grouped by ID. We don't want a list of families but instead we want a list of children, so we add a downstream mapping operation to map from family to child, and then collect the children into a list. The code would look like this:
Map<String, List<Child>> getChildren(List<Family> families) {
return families.stream()
.collect(Collectors.groupingBy(Family::getId,
Collectors.mapping(f -> f.getParent().getChild(),
Collectors.toList())));
}
Note that the return type has changed from Map<String, Child> to Map<String, List<Child>>.
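The three behaviours can be compared side by side on a list that contains a duplicate ID. The sketch below is a simplification: Family is reduced to an id plus a child name (a hypothetical getChildName() standing in for getParent().getChild()), and the class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDuplicates {

    // Simplified stand-in for Family: the child is represented by its name only.
    static class Family {
        private final String id;
        private final String childName;

        Family(String id, String childName) {
            this.id = id;
            this.childName = childName;
        }

        String getId() { return id; }
        String getChildName() { return childName; }
    }

    // Two-argument toMap: throws IllegalStateException on a duplicate key.
    static Map<String, String> strict(List<Family> families) {
        return families.stream()
            .collect(Collectors.toMap(Family::getId, Family::getChildName));
    }

    // Three-argument toMap: the later entry overwrites the earlier one.
    static Map<String, String> lastWins(List<Family> families) {
        return families.stream()
            .collect(Collectors.toMap(Family::getId, Family::getChildName,
                (child1, child2) -> child2));
    }

    // groupingBy with a downstream mapping: all children per ID are kept.
    static Map<String, List<String>> allChildren(List<Family> families) {
        return families.stream()
            .collect(Collectors.groupingBy(Family::getId,
                Collectors.mapping(Family::getChildName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Family> families = Arrays.asList(
            new Family("f1", "anna"),
            new Family("f2", "ben"),
            new Family("f1", "carl")); // duplicate ID

        try {
            strict(families);
        } catch (IllegalStateException e) {
            System.out.println("strict failed: " + e.getMessage());
        }
        System.out.println(lastWins(families));
        System.out.println(allChildren(families));
    }
}
```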
