Unexpected closing of a Stream while concatenating them - java

Using Java 8 (if that matters), I have a behavior I struggle to understand.
Let's say I have an Entry class as such :
static class Entry {
String key;
List<String> values;
public Entry(String key, String... values) {
this.key = key;
this.values = Arrays.asList(values);
}
}
And a list of instances :
List<Entry> entries = Arrays.asList(
new Entry("a", "a1"),
new Entry("b", "b1"),
new Entry("a", "a2")
);
Now I want to collect all entries having the same key (and keep distinct values), and I stumbled upon an "IllegalStateException: stream has already been operated upon or closed".
The minimal code for producing it is :
entries.stream().collect(
Collectors.groupingBy(
e -> e.key,
Collectors.mapping(
e -> e.values.stream(),
Collectors.reducing(Stream.<String>empty(), Stream::concat))
)
);
(I'd add a collectingAndThen to meet my requirement, but it's not the point of my question)
I fail to see which part of the code consumes / acts on the streams. Furthermore, if I change the code to the following, it works :
entries.stream().collect(
Collectors.groupingBy(
e -> e.key,
Collectors.mapping(
e -> e.values.stream(),
Collectors.reducing(Stream::concat))
)
);
I'd rather use the former code, because the latter gives me a Map<K, Optional<V>> while the former gives a Map<K, V>.
But the question is: what difference does using a neutral element make in the reduction, such that it ultimately causes (at least) one of the streams to be consumed?

The main problem can be reduced to this similar example:
Stream<String> identity = Stream.empty();
Stream<String> stream1 = Stream.of("1");
Stream<String> stream2 = Stream.of("2");
Stream.concat(identity, stream1); //works
Stream.concat(identity, stream2); //java.lang.IllegalStateException
In other words,
Collectors.reducing(Stream.<String>empty(), Stream::concat)
creates a single stream object with Stream.<String>empty() and reuses that same object as the identity value for every group in your multi-level reduction. The first group consumes it, so the next group to use it fails. Fortunately, you already have a workaround.
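To see the reuse in isolation, here is a minimal sketch (the class and method names are made up for illustration). It relies on Stream.concat obtaining the spliterators of both arguments as soon as it is called, which is what marks the shared stream as already used:

```java
import java.util.stream.Stream;

class StreamReuseDemo {

    // Returns true if passing the same Stream instance to Stream.concat a second time fails.
    static boolean secondConcatFails() {
        Stream<String> identity = Stream.empty();
        Stream.concat(identity, Stream.of("1")); // first use: identity is now linked
        try {
            Stream.concat(identity, Stream.of("2")); // second use of the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second concat failed: " + secondConcatFails());
    }
}
```

Collectors.reducing(identity, op) behaves the same way because it hands the very same identity object to each group rather than creating a fresh one.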
As warned against in the docs, and also pointed out in comments, repeated stream concatenation is discouraged:
Use caution when constructing streams from repeated concatenation. Accessing an element of a deeply concatenated stream can result in deep call chains, or even StackOverflowException.
One alternative approach I can think of is to flatten the stream before grouping:
//This yields a Map<String, List<String>>
entries.stream()
.flatMap(v -> v.values.stream().map(val -> new SimpleEntry<>(v.key, val)))
.collect(Collectors.groupingBy(
Map.Entry::getKey,
Collectors.mapping(Map.Entry::getValue,
Collectors.toList())));

The main problem is that you cannot use a stream as the identity element, because a stream cannot be reused; when the reduction tries to use it a second time, it throws, reporting that the stream has already been operated upon or closed.
This is an alternative approach (returning a List instead of an Optional); Collectors.flatMapping is available from Java 9 on:
Map<String, List<String>> collect = entries.stream().collect(
Collectors.groupingBy(
e -> e.key,
Collectors.flatMapping(e -> e.values.stream(), Collectors.toList())));
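On Java 8, which the question targets, a comparable result can be sketched with Collector.of. The GroupDistinctValues class name and the choice of a LinkedHashSet (to keep distinct values in encounter order) are assumptions, not part of the original code:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class GroupDistinctValues {

    // Same shape as the Entry class from the question.
    static class Entry {
        final String key;
        final List<String> values;

        Entry(String key, String... values) {
            this.key = key;
            this.values = Arrays.asList(values);
        }
    }

    // Groups by key and merges the value lists into one set per key.
    // The supplier creates a fresh set for every group, so nothing is shared between groups.
    static Map<String, Set<String>> group(List<Entry> entries) {
        return entries.stream().collect(Collectors.groupingBy(
                e -> e.key,
                Collector.<Entry, Set<String>>of(
                        LinkedHashSet::new,
                        (set, e) -> set.addAll(e.values),
                        (left, right) -> { left.addAll(right); return left; })));
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("a", "a1"),
                new Entry("b", "b1"),
                new Entry("a", "a2", "a1"));
        System.out.println(group(entries));
    }
}
```

Unlike reducing with a stream identity, the container here comes from a supplier, so each key gets its own instance.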

Java Streams - Are you able to have multiple terminal operations (e.g. forEach)

I'm fairly new to Java and trying to learn how to use streams for easier code writing. If I can code like this:
Map<String, SomeConfig> temp = new HashMap<>();
resultStorage.forEach((key, value) -> key.getUsers().forEach(user -> {
if (!temp.containsKey(user.getMeta())) {
SomeConfig emailConfiguration = key
.withCheck1(masterAccountId)
.withCheck2(getClientTimezone())
.withCheck3(user.getMeta());
temp.put(user.getMeta(), emailConfiguration);
}
temp.get(user.getMeta()).getStreams().add(value);
}));
return new ArrayList<>(temp.values());
resultStorage declaration:
private Map<SomeConfig, byte[]> resultStorage = new ConcurrentHashMap<>();
getStreams is a getter on SomeConfig that returns a List<byte[]> as here:
private List<byte[]> attachmentStreams = new ArrayList<>();
public List<byte[]> getAttachmentStreams() {
return attachmentStreams;
}
My first attempt was something similar to this:
resultStorage.entrySet().stream()
.forEach(entry -> entry.getKey().getUsers().forEach(user -> {
}));
Are we able to use a forEach within a stream's terminal operation, forEach? How would a stream help in this case? I saw documentation saying that streams can significantly improve the readability and performance of older pre-Java 8 code.
Edit:
resultStorage holds a ConcurrentHashMap. It will contain Map<SomeConfig, byte[]> for email and attachments. Using another HashMap temp that is initially empty - we analyze resultStorage , see if temp contains a specific email key, and then put or add based on the existence of a user's email
The terminal operation of entrySet().stream().forEach(…) is entirely unrelated to the getUsers().forEach(…) call within the Consumer. So there’s no problem of “multiple terminal operations” here.
However, replacing the Map operation forEach((key, value) -> … with an entrySet().stream().forEach(entry -> …) rarely adds a benefit. So far, you've not only made the code longer, you've also introduced the need to deal with a Map.Entry instead of just using key and value.
But you can simplify your operation by using a single computeIfAbsent instead of containsKey, put, and get:
resultStorage.forEach((key, value) -> key.getUsers().forEach(user ->
temp.computeIfAbsent(user.getMeta(), meta ->
key.withCheck1(masterAccountId).withCheck2(getClientTimezone()).withCheck3(meta))
.getStreams().add(value)));
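To try the computeIfAbsent pattern in isolation, here is a small sketch with stand-in types: plain strings replace SomeConfig, the users and the byte arrays, and the method name attachmentsByUser is made up. The point is only that a single call both creates the missing value and returns it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentDemo {

    // Inverts "attachment -> users" into "user -> attachments".
    static Map<String, List<String>> attachmentsByUser(Map<String, List<String>> usersByAttachment) {
        Map<String, List<String>> result = new HashMap<>();
        usersByAttachment.forEach((attachment, users) -> users.forEach(user ->
                // creates the list the first time a user is seen, and returns it either way
                result.computeIfAbsent(user, u -> new ArrayList<>()).add(attachment)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("report.pdf", Arrays.asList("alice", "bob"));
        input.put("log.txt", Arrays.asList("alice"));
        System.out.println(attachmentsByUser(input));
    }
}
```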
Notes after the code.
Map<String, SomeConfig> temp = resultStorage.keySet()
.stream()
.flatMap(key -> key.getUsers()
.stream()
.map(user -> new AbstractMap.SimpleEntry<>(user, key)))
.collect(Collectors.toMap(e -> e.getKey().getMeta(),
e -> e.getValue()
.withCheck1(masterAccountId)
.withCheck2(getClientTimezone())
.withCheck3(e.getKey().getMeta())));
resultStorage.keySet()
This returns Set<SomeConfig>.
stream()
This returns a stream where every element in the stream is an instance of SomeConfig.
.flatMap(key -> key.getUsers()
.stream()
.map(user -> new AbstractMap.SimpleEntry<>(user, key)))
Method flatMap() must return a Stream. The above code returns a Stream where every element is an instance of AbstractMap.SimpleEntry. The "entry" key is the user and the entry value is the key from resultStorage.
Finally I create a Map<String, SomeConfig> via [static] method toMap of class Collectors.
The first argument to method toMap is the key mapper, i.e. a method that extracts the [map] key from the AbstractMap.SimpleEntry. In your case this is the value returned by method getMeta() of the user – which is the key from AbstractMap.SimpleEntry, i.e. e.getKey() returns a user object.
The second argument to toMap is the value mapper. e.getValue() returns a SomeConfig object and the rest is your code, i.e. the withChecks.
There is no way I can test the above code because not only did you not post a minimal, reproducible example, you also did not post any sample data. Hence the above may be way off what you actually require.
Also note that the above code simply creates your Map<String, SomeConfig> temp. I could not understand the code in your question that processes that Map so I did not try to implement that part at all.
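One caveat with the two-argument Collectors.toMap used above: it throws an IllegalStateException when two elements map to the same key, which would happen here if two users share the same meta value. Below is a minimal sketch of the three-argument overload, with strings standing in for the real types; the keep-the-first merge rule is an assumption chosen to mirror the containsKey check in the question:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeDemo {

    // Keys each (non-empty) word by its first letter; on a key collision the first word wins.
    static Map<String, String> byFirstLetter(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                w -> w.substring(0, 1),
                Function.identity(),
                (first, second) -> first));
    }

    // Returns true if the two-argument toMap rejects the same input because of a duplicate key.
    static boolean failsWithoutMerge(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(w -> w.substring(0, 1), Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(byFirstLetter(words));
        System.out.println("fails without merge function: " + failsWithoutMerge(words));
    }
}
```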

Java Map with List value to list using streams?

I am trying to rewrite the method below using streams but I am not sure what the best approach is? If I use flatMap on the values of the entrySet(), I lose the reference to the current key.
private List<String> asList(final Map<String, List<String>> map) {
final List<String> result = new ArrayList<>();
for (final Entry<String, List<String>> entry : map.entrySet()) {
final List<String> values = entry.getValue();
values.forEach(value -> result.add(String.format("%s-%s", entry.getKey(), value)));
}
return result;
}
The best I managed to do is the following:
return map.keySet().stream()
.flatMap(key -> map.get(key).stream()
.map(value -> new AbstractMap.SimpleEntry<>(key, value)))
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Is there a simpler way without resorting to creating new Entry objects?
A stream is a sequence of values (possibly unordered / parallel). map() is what you use when you want to map a single value in the sequence to some single other value. Say, map "alturkovic" to "ALTURKOVIC". flatMap() is what you use when you want to map a single value in the sequence to 0, 1, or many other values. Hence why a flatMap lambda needs to turn a value into a stream of values. flatMap can thus be used to take, say, a list of lists of string, and turn that into a stream of just strings.
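The difference can be sketched with two tiny helpers (the class and method names are illustrative only): upperCase produces exactly one output per input, while flatten lets each input contribute zero or more outputs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class MapVsFlatMap {

    // map: exactly one output element per input element.
    static List<String> upperCase(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // flatMap: each input element becomes a stream of zero or more output elements.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCase(Arrays.asList("alturkovic")));
        System.out.println(flatten(Arrays.asList(
                Arrays.asList("a", "b"),
                Collections.<String>emptyList(),
                Arrays.asList("c"))));
    }
}
```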
Here, you want to map a single entry from your map (a single key/value pair) into a single element (a string describing it). 1 value to 1 value. That means flatMap is not appropriate. You're looking for just map.
Furthermore, you need both key and value to perform your mapping op, so keySet() is also not appropriate. You're looking for entrySet(), which gives you a set of all k/v pairs, just what we need.
That gets us to:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), e.getValue()))
.collect(Collectors.toList());
Your original code makes no effort to treat a single value from a map (which is a List<String>) as separate values; you just call .toString() on the entire ordeal, and be done with it. This means the produced string looks like, say, [Hello, World] given a map value of List.of("Hello", "World"). If you don't want this, you still don't want flatmap, because streams are also homogenous - the values in a stream are all of the same kind, and thus a stream of 'key1 value1 value2 key2 valueA valueB' is not what you'd want:
map.entrySet().stream()
.map(e -> String.format("%s-%s", e.getKey(), myPrint(e.getValue())))
.collect(Collectors.toList());
public static String myPrint(List<String> in) {
// write your own algorithm here
}
Stream API just isn't the right tool to replace that myPrint method.
A third alternative is that you want to smear out the map; you want each string in a mapvalue's List<String> to first be matched with the key (so that's re-stating that key rather a lot), and then do something to that. NOW flatMap IS appropriate - you want a stream of k/v pairs first, and then do something to that, and each element is now of the same kind. You want to turn the map:
key1 = [value1, value2]
key2 = [value3, value4]
first into a stream:
key1:value1
key1:value2
key2:value3
key2:value4
and take it from there. This explodes a single k/v entry in your map into more than one, thus, flatmapping needed:
return map.entrySet().stream()
.flatMap(e -> e.getValue().stream()
.map(v -> String.format("%s-%s", e.getKey(), v)))
.collect(Collectors.toList());
Going inside-out, it maps a single entry within a list that belongs to a single k/v pair into the string Key-SingleItemFromItsList.
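As a self-contained sketch of this third alternative, wrapped in a class so it can be run directly (the class name KeyValueLines is made up; a LinkedHashMap is used in main only to make the iteration order predictable):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class KeyValueLines {

    // Pairs every value in each list with its key and formats the pair as "key-value".
    static List<String> asList(Map<String, List<String>> map) {
        return map.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> String.format("%s-%s", e.getKey(), v)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("key1", Arrays.asList("value1", "value2"));
        map.put("key2", Arrays.asList("value3", "value4"));
        asList(map).forEach(System.out::println);
    }
}
```

The lambda passed to flatMap still sees e, so the key stays in scope without creating intermediate Entry objects.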
Adding my two cents to the excellent answer by @rzwitserloot, which already explains flatMap and map.
List<String> resultLists = myMap.entrySet().stream()
.flatMap(mapEntry -> printEntries(mapEntry.getKey(),mapEntry.getValue())).collect(Collectors.toList());
System.out.println(resultLists);
Splitting this out into a separate method improves readability, IMO:
private static Stream<String> printEntries(String key, List<String> values) {
return values.stream().map(val -> String.format("%s-%s",key,val));
}

Elegant way to flatMap Set of Sets inside groupingBy

So I have a piece of code where I'm iterating over a list of data. Each one is a ReportData that contains a case with a Long caseId and one Ruling. Each Ruling has one or more Payment. I want to have a Map with the caseId as keys and sets of payments as values (i.e. a Map<Long, Set<Payments>>).
Cases are not unique across rows, but rulings are.
In other words, I can have several rows with the same case, but they will have unique rulings.
The following code gets me a Map<Long, Set<Set<Payments>>> which is almost what I want, but I've been struggling to find the correct way to flatMap the final set in the given context. I've been doing workarounds to make the logic work correctly using this map as is, but I'd very much like to fix the algorithm to correctly combine the set of payments into one single set instead of creating a set of sets.
I've searched around and couldn't find a problem with the same kind of iteration, although flatMapping with Java streams seems like a somewhat popular topic.
rowData.stream()
.collect(Collectors.groupingBy(
r -> r.case.getCaseId(),
Collectors.mapping(
r -> r.getRuling(),
Collectors.mapping(ruling->
ruling.getPayments(),
Collectors.toSet()
)
)));
Another JDK8 solution:
Map<Long, Set<Payment>> resultSet =
rowData.stream()
.collect(Collectors.toMap(p -> p.Case.getCaseId(),
p -> new HashSet<>(p.getRuling().getPayments()),
(l, r) -> { l.addAll(r);return l;}));
or as of JDK9 you can use the flatMapping collector:
rowData.stream()
.collect(Collectors.groupingBy(r -> r.Case.getCaseId(),
Collectors.flatMapping(e -> e.getRuling().getPayments().stream(),
Collectors.toSet())));
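The toMap variant can be exercised without the real ReportData, Ruling and Payment classes. In the sketch below each row is reduced to a hypothetical Row holding a case id and a set of payment labels; only the merging behaviour is meant to carry over:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MergePaymentsDemo {

    // Hypothetical stand-in for a ReportData row: one case id plus the payments of its ruling.
    static class Row {
        final long caseId;
        final Set<String> payments;

        Row(long caseId, String... payments) {
            this.caseId = caseId;
            this.payments = new HashSet<>(Arrays.asList(payments));
        }
    }

    static Map<Long, Set<String>> paymentsByCase(List<Row> rows) {
        return rows.stream().collect(Collectors.toMap(
                r -> r.caseId,
                r -> new HashSet<>(r.payments), // copy, so merging never mutates a row's own set
                (left, right) -> { left.addAll(right); return left; }));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1L, "p1", "p2"),
                new Row(2L, "p3"),
                new Row(1L, "p2", "p4"));
        System.out.println(paymentsByCase(rows));
    }
}
```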
The cleanest solution is to define your own collector:
Map<Long, Set<Payment>> result = rowData.stream()
.collect(Collectors.groupingBy(
ReportData::getCaseId,
Collector.of(HashSet::new,
(s, r) -> s.addAll(r.getRuling().getPayments()),
(s1, s2) -> { s1.addAll(s2); return s1; })
));
Two other solutions that I thought of first are actually less efficient and less readable, but they still avoid constructing the intermediate Map:
Merging the inner sets using Collectors.reducing():
Map<Long, Set<Payment>> result = rowData.stream()
.collect(Collectors.groupingBy(
ReportData::getCaseId,
Collectors.reducing(Collections.emptySet(),
r -> r.getRuling().getPayments(),
(s1, s2) -> {
Set<Payment> r = new HashSet<>(s1);
r.addAll(s2);
return r;
})
));
where the reducing operation will merge the Set<Payment> of entries with the same caseId. This can however cause a lot of copies of the sets if you have a lot of merges needed.
Another solution is with a downstream collector that flatmaps the nested collections:
Map<Long, Set<Payment>> result = rowData.stream()
.collect(Collectors.groupingBy(
ReportData::getCaseId,
Collectors.collectingAndThen(
Collectors.mapping(r -> r.getRuling().getPayments(), Collectors.toList()),
s -> s.stream().flatMap(Set::stream).collect(Collectors.toSet())))
);
Basically it puts all sets of matching caseId together in a List, then flatmaps that list into a single Set.
There are probably better ways to do this, but this is the best I found:
Map<Long, Set<Payment>> result =
rowData.stream()
// First group by caseIds.
.collect(Collectors.groupingBy(r -> r.Case.getCaseId()))
.entrySet().stream()
// By streaming over the entrySet, I map the values to the set of payments.
.collect(Collectors.toMap(
Map.Entry::getKey,
entry -> entry.getValue().stream()
.flatMap(r -> r.getRuling().getPayments().stream())
.collect(Collectors.toSet())));

Simplifying loop with Java 8

I have a method that adds maps to a cache and I was wondering what I could do more to simplify this loop with Java 8.
What I have done so far:
Standard looping we all know:
for(int i = 0; i < catalogNames.size(); i++){
List<GenericCatalog> list = DummyData.getCatalog(catalogNames.get(i));
Map<String, GenericCatalog> map = new LinkedHashMap<>();
for(GenericCatalog item : list){
map.put(item.name.get(), item);
}
catalogCache.put(catalogNames.get(i), map);};
Second iteration using forEach:
catalogNames.forEach(e -> {
Map<String, GenericCatalog> map = new LinkedHashMap<>();
DummyData.getCatalog(e).forEach(d -> {
map.put(d.name.get(), d);
});
catalogCache.put(e, map);});
And a third iteration that removes unnecessary braces:
catalogNames.forEach(objName -> {
Map<String, GenericCatalog> map = new LinkedHashMap<>();
DummyData.getCatalog(objName).forEach(obj -> map.put(obj.name.get(), obj));
catalogCache.put(objName, map);});
My question now is what can be further done to simplify this?
I do understand that it's not really necessary to do anything else with this method at this point, but I was curious about the possibilities.
There is a small issue with solutions 2 and 3: they rely on side effects.
Side-effects in behavioral parameters to stream operations are, in
general, discouraged, as they can often lead to unwitting violations
of the statelessness requirement, as well as other thread-safety
hazards.
As an example of how to transform a stream pipeline that
inappropriately uses side-effects to one that does not, the following
code searches a stream of strings for those matching a given regular
expression, and puts the matches in a list.
ArrayList<String> results = new ArrayList<>();
stream.filter(s -> pattern.matcher(s).matches())
.forEach(s -> results.add(s)); // Unnecessary use of side-effects!
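The side-effect-free counterpart lets a collector build the list instead of adding to an outer one. A minimal sketch (the class and method names are illustrative):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoSideEffectsDemo {

    // Returns the strings that fully match the pattern, without touching any outer state.
    static List<String> matching(Stream<String> stream, Pattern pattern) {
        return stream.filter(s -> pattern.matcher(s).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(matching(Stream.of("a1", "b", "a2"), Pattern.compile("a\\d")));
    }
}
```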
So instead of using forEach to populate the HashMap it is better to use Collectors.toMap(..). I am not 100% sure about your data structure, but I hope it is close enough.
There is a List and corresponding Map:
List<Integer> ints = Arrays.asList(1,2,3);
Map<Integer,List<Double>> catalog = new HashMap<>();
catalog.put(1,Arrays.asList(1.1,2.2,3.3,4.4));
catalog.put(2,Arrays.asList(1.1,2.2,3.3));
catalog.put(3,Arrays.asList(1.1,2.2));
Now we would like to get a new Map whose keys are the elements of the original List and whose values are Maps themselves. Each nested Map's key is a transformed element of the corresponding catalog List (here, its int value), and its value is the List element itself. A convoluted description, and even more convoluted code below:
Map<Integer, Map<Integer, Double>> result = ints.stream().collect(
Collectors.toMap(
el -> el,
el -> catalog.get(el).stream().
collect(Collectors.toMap(
c -> c.intValue(),
c -> c
))
)
);
System.out.println(result);
// {1={1=1.1, 2=2.2, 3=3.3, 4=4.4}, 2={1=1.1, 2=2.2, 3=3.3}, 3={1=1.1, 2=2.2}}
I hope this helps.
How about utilizing Collectors from the stream API? Specifically, Collectors#toMap
Map<String, Map<String, GenericCatalog>> cache = catalogNames.stream().collect(Collectors.toMap(Function.identity(),
name -> DummyData.getCatalog(name).stream().collect(Collectors.toMap(t -> t.name.get(), Function.identity(),
//these two lines only needed if HashMap can't be used
(o, t) -> /* merge function */,
LinkedHashMap::new))));
This avoids mutating an existing collection, and provides you your own individual copy of a map (which you can use to update a cache, or whatever you desire).
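A runnable sketch of the same nesting, with stand-ins for the parts the question does not show: items are plain strings keyed by their lower-cased form, loader is a hypothetical replacement for DummyData.getCatalog, and the merge function keeps the later item, an assumption made to mimic what Map.put does in the original loop:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CatalogCacheDemo {

    // Indexes items by lower-cased name in insertion order; on duplicates the later item wins.
    static Map<String, String> index(List<String> items) {
        return items.stream().collect(Collectors.toMap(
                item -> item.toLowerCase(),
                Function.identity(),
                (first, second) -> second,
                LinkedHashMap::new));
    }

    // Builds one index per catalog name; the loader supplies the items for a given name.
    static Map<String, Map<String, String>> buildCache(List<String> catalogNames,
                                                       Function<String, List<String>> loader) {
        return catalogNames.stream().collect(Collectors.toMap(
                Function.identity(),
                name -> index(loader.apply(name))));
    }

    public static void main(String[] args) {
        System.out.println(buildCache(
                Arrays.asList("fruit"),
                name -> Arrays.asList("Apple", "apple", "Banana")));
    }
}
```

Pulling the inner collect into its own method also keeps the nesting of parentheses manageable.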
Also I would disagree with arbitrarily putting end braces at the end of a line of code - most style guides would also be against this as it somewhat disturbs the flow of the code to most readers.

Explanation of this Lambda Expression

I am creating a Word Comparison class and it will count the occurrences of words as well. (This is Java)
This was my original method:
/**
* @param map The map of words to search
* @param num The number of words you want printed
* @return list of words
*/
public static List<String> findMaxOccurrence(Map<String, Integer> map, int num) {
List<WordComparable> l = new ArrayList<>();
for (Map.Entry<String, Integer> entry : map.entrySet())
l.add(new WordComparable(entry.getKey(), entry.getValue()));
My IDE suggested that the loop and list assignment could be replaced with a "collect call": "stream api calls"
In which it generated this code:
List<WordComparable> l =
map.entrySet().stream()
.map(entry -> new WordComparable
(entry.getKey(), entry.getValue())).collect(Collectors.toList());
I am kinda confused on how the lambda math works. If my memory serves correctly, the -> is the for each loop, but the other calls are completely confusing.
My IDE can also expand the code into these two snippets:
List<WordComparable> l =
map.entrySet().stream()
.map(entry -> {
return new WordComparable
(entry.getKey(), entry.getValue());
}).collect(Collectors.toList());
And
List<WordComparable> l =
map.entrySet().stream()
.map(new Function<Map.Entry<String, Integer>, WordComparable>() {
@Override
public WordComparable apply(Map.Entry<String, Integer> entry) {
return new WordComparable
(entry.getKey(), entry.getValue());
}
}).collect(Collectors.toList());
Any light-shedding would be awesome.
Let's take a look at the for loop a bit closer to see how we can write it functionally:
List<WordComparable> l = new ArrayList<>();
for (Map.Entry<String, Integer> entry : map.entrySet())
l.add(new WordComparable(entry.getKey(), entry.getValue()));
If we read that code in plain English, we might say "for each entry of my map, let's convert it to a WordComparable and add it to a list".
Now, we can rephrase that sentence to "for each entry of my map, let's convert it to a WordComparable, and when we have converted it all, let's make a list out of it".
Using that sentence, we see that we need to create a function: one that takes an entry of the map and converts it to a WordComparable. So let's build one! Java 8 introduces a new type named Function, which has one important method: apply. This method takes one input, transforms it and returns one output.
Writing good old Java, since Function is an interface, we can implement it to write our conversion code:
public class EntryConverter implements Function<Map.Entry<String, Integer>, WordComparable> {
public WordComparable apply(Map.Entry<String, Integer> entry) {
return new WordComparable(entry.getKey(), entry.getValue());
}
}
Now that we have this converter, we need to use it on all the entries. Java 8 also introduces the notion of Stream, that is to say, a sequence of elements (note that this sequence can be infinite). Using this sequence, we can finally write into code what we said earlier, i.e. "for each entry, let's convert it to a WordComparable". We make use of the map method, whose goal is to apply a method on each element of the stream.
We have the method: EntryConverter, and we build a Stream of our entries using the stream method.
So, we get:
map.entrySet().stream().map(new EntryConverter());
What remains is the last part of the sentence: "make a List out of it", i.e. collect all the elements into a List. This is done using the collect method. This method takes a Collector as argument, i.e. an object capable of reducing a stream into a final container. Java 8 comes with a lot of prebuilt collectors; one of them being Collectors.toList().
Finally, we get:
map.entrySet().stream().map(new EntryConverter()).collect(Collectors.toList());
Now, if we remove the temporary class EntryConverter and make it anonymous, we get what your IDE is proposing:
List<WordComparable> l = map.entrySet()
.stream() //make a Stream of our entries
.map(new Function<Map.Entry<String, Integer>, WordComparable>() {
@Override
public WordComparable apply(Map.Entry<String, Integer> entry) {
return new WordComparable(entry.getKey(), entry.getValue());
}
}) //let's convert each entry to a WordComparable
.collect(Collectors.toList()); //and make a List out of it
Now, writing all that code is a bit cumbersome, especially the declaration of the anonymous class. Java 8 comes to the rescue with the new -> operator. This operator allows the creation of a Function much more painlessly than before: the left side corresponds to the argument of the function and the right side corresponds to the result. This is called a lambda expression.
In our case, we get:
entry -> new WordComparable(entry.getKey(), entry.getValue())
It is also possible to write this lambda expression using a block body and a return statement:
entry -> {
return new WordComparable(entry.getKey(), entry.getValue());
}
Notice how that corresponds to what we had written earlier in EntryConverter.
This means we can refactor our code to:
List<WordComparable> l = map.entrySet()
.stream()
.map(entry -> new WordComparable(entry.getKey(), entry.getValue()))
.collect(Collectors.toList());
which is much more readable, and is what your IDE proposes.
You can find more about lambda expressions on the Oracle site.
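To check that the three spellings really are interchangeable, here is a small sketch. Since WordComparable is not shown in the question, formatting each entry as a "word=count" string stands in for constructing one, and the class and method names are made up:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class LambdaFormsDemo {

    // Anonymous class: the pre-lambda way to supply a Function.
    static List<String> withAnonymousClass(Map<String, Integer> map) {
        return map.entrySet().stream()
                .map(new Function<Map.Entry<String, Integer>, String>() {
                    @Override
                    public String apply(Map.Entry<String, Integer> entry) {
                        return entry.getKey() + "=" + entry.getValue();
                    }
                })
                .collect(Collectors.toList());
    }

    // Lambda with a block body and an explicit return statement.
    static List<String> withBlockLambda(Map<String, Integer> map) {
        return map.entrySet().stream()
                .map(entry -> {
                    return entry.getKey() + "=" + entry.getValue();
                })
                .collect(Collectors.toList());
    }

    // Lambda with an expression body: the value of the expression is the result.
    static List<String> withExpressionLambda(Map<String, Integer> map) {
        return map.entrySet().stream()
                .map(entry -> entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hello", 2);
        counts.put("world", 1);
        System.out.println(withAnonymousClass(counts));
        System.out.println(withBlockLambda(counts));
        System.out.println(withExpressionLambda(counts));
    }
}
```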
This is a lambda expression for a Function. It takes an object and returns an object. In this case, it takes a Map.Entry<String, Integer>, and returns a WordComparable.
entry -> new WordComparable(entry.getKey(), entry.getValue())
You could write the equivalent code by hand:
final class ConversionFunction
implements Function<Map.Entry<String, Integer>, WordComparable>
{
@Override
public WordComparable apply(Map.Entry<String, Integer> entry) {
return new WordComparable(entry.getKey(), entry.getValue());
}
}
map.entrySet().stream().map(new ConversionFunction()).collect(...);
The Stream.map() method takes a Function that can be applied to each element (Map.Entry) in the stream, and produces a stream of elements of a new type (WordComparable).
The Stream.collect() method uses a Collector to condense all elements of a stream to a single object. Usually it's a collection, like it is here, but it could be any sort of aggregate function.
List<WordComparable> l = map.entrySet().stream()
.map(entry -> new WordComparable(entry.getKey(), entry.getValue()))
.collect(Collectors.toList());
"->" is a part of lambda itself.
In this snippet, .stream() plays a role similar to a for-each loop over the entries; after it comes the chain of data-processing "directives" (map, collect, etc.).
map means that each element of the current stream is mapped to a new element according to some rule:
entry -> new WordComparable(entry.getKey(), entry.getValue())
Your rule says that each element (given the alias "entry") is used to create a new element of the map() result.
Then you collect those elements into an appropriate collection by using a suitable collector.
Note that collect is applied to the result of map().
