permutate two streams without the need of materialization - java

I have two streams which I must materialize into two lists to get the permutations from both streams:
public Stream<Permutation> getAll() {
Stream.Builder<Permutation> all = Stream.builder();
// unfortunately, I must collect it into a list
var list1 = IntStream.iterate(0, d -> d - 1).limit(1000000).boxed().collect(Collectors.toList());
var list2 = IntStream.iterate(0, d -> d + 1).limit(1000000).boxed().collect(Collectors.toList());
// I must use a classic loop (and cannot operate on those streams) to avoid java.lang.IllegalStateException
for(var l1: list1) {
for(var l2: list2) {
// the permutation class consists of two int properties
all.add(new Permutation(l1, l2));
}
}
return all.build();
}
Is there a way to avoid materializing list1 and list2 and operate only on those streams to return the permutations? I have tried it but I get
java.lang.IllegalStateException: stream has already been operated upon or closed
Therefore I have materialized the two lists and used a classic loop. However, I would like to improve the performance by doing the following steps:
avoid the materialization of list1 and list2
and maybe also use parallelStream for list1 and list2 to get the permutations faster
Is this possible? If so, how?
EDIT:
Thanks to @Andreas for the solution, which works so far. However, I wonder how I can create a permutation from two getAll() streams without the need to materialize them in between:
// The `Permutations` class holds two `Permutation` instances.
Stream<Permutations> allPermutations(){
Stream<Permutation> stream1 = getAll();
Stream<Permutation> stream2 = getAll();
// returns java.lang.IllegalStateException: stream has already been operated upon or closed
return stream1.flatMap(s1->stream2.map(s2->new Permutations(s1,s2)));
}

You can do it like this:
public Stream<Permutation> getAll() {
return IntStream.iterate(0, d -> d - 1).limit(1000000).boxed()
.flatMap(l1 -> IntStream.iterate(0, d -> d + 1).limit(1000000)
.mapToObj(l2 -> new Permutation(l1, l2)));
}
The caller can decide whether or not to use parallel processing:
// Sequential
Stream<Permutation> stream = getAll();
// Parallel
Stream<Permutation> stream = getAll().parallel();
No need to call sequential(), since iterate() returns a new sequential IntStream.
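For the follow-up in the question's edit, one option is to call getAll() again inside the flatMap lambda, so that every outer element gets its own fresh inner stream. The following is only a minimal sketch: the Permutation and Permutations classes are simplified stand-ins, and the size parameter n is an assumption that replaces the hard-coded 1000000.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

class NestedPermutationsSketch {

    // Hypothetical stand-ins for the question's classes.
    static final class Permutation {
        final int left, right;
        Permutation(int left, int right) { this.left = left; this.right = right; }
    }

    static final class Permutations {
        final Permutation first, second;
        Permutations(Permutation first, Permutation second) {
            this.first = first;
            this.second = second;
        }
    }

    // Same shape as getAll() above; n replaces the hard-coded limit.
    static Stream<Permutation> getAll(int n) {
        return IntStream.iterate(0, d -> d - 1).limit(n).boxed()
                .flatMap(l1 -> IntStream.iterate(0, d -> d + 1).limit(n)
                        .mapToObj(l2 -> new Permutation(l1, l2)));
    }

    // getAll(n) is invoked once per outer element, so no stream object is used twice.
    static Stream<Permutations> allPermutations(int n) {
        return getAll(n).flatMap(s1 -> getAll(n).map(s2 -> new Permutations(s1, s2)));
    }

    public static void main(String[] args) {
        // 2 x 2 = 4 pairs per getAll(2), so 4 x 4 = 16 combinations.
        System.out.println(allPermutations(2).count());
    }
}
```

The trade-off is that the inner pipeline is rebuilt and re-run for every outer element; nothing is buffered, but the generation work is repeated.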

Related

Java 8 Streams : Count the occurrence of elements(List<String> list1) from list of text data(List<String> list2)

Input :
List<String> elements= new ArrayList<>();
elements.add("Oranges");
elements.add("Figs");
elements.add("Mangoes");
elements.add("Apple");
List<String> listofComments = new ArrayList<>();
listofComments.add("Apples are better than Oranges");
listofComments.add("I love Mangoes and Oranges");
listofComments.add("I don't know like Figs. Mangoes are my favorites");
listofComments.add("I love Mangoes and Apples");
Output : [Mangoes, Apples, Oranges, Figs] -> Output must be in descending order of the number of occurrences of the elements. If elements appear equal no. of times then they must be arranged alphabetically.
I am new to Java 8 and came across this problem. I tried solving it partially; I couldn't sort it. Can anyone help me with a better code?
My piece of code:
Function<String, Map<String, Long>> function = f -> {
Long count = listofComments.stream()
.filter(e -> e.toLowerCase().contains(f.toLowerCase())).count();
Map<String, Long> map = new HashMap<>(); //creates map for every element. Is it right?
map.put(f, count);
return map;
};
elements.stream().sorted().map(function).forEach(e-> System.out.print(e));
Output: {Apple=2}{Figs=1}{Mangoes=3}{Oranges=2}
In real-life scenarios you would have to consider that applying an arbitrary number of match operations to an arbitrary number of comments can become quite expensive when the numbers grow, so it’s worth doing some preparation:
Map<String,Predicate<String>> filters = elements.stream()
.sorted(String.CASE_INSENSITIVE_ORDER)
.map(s -> Pattern.compile(s, Pattern.LITERAL|Pattern.CASE_INSENSITIVE))
.collect(Collectors.toMap(Pattern::pattern, Pattern::asPredicate,
(a,b) -> { throw new AssertionError("duplicates"); }, LinkedHashMap::new));
The Pattern class is quite valuable even when not doing regex matching. The combination of the LITERAL and CASE_INSENSITIVE flags enables searches with the intended semantics without the need to convert entire strings to lower case (which, by the way, is not sufficient for all possible scenarios). For this kind of matching, the preparation will include building the necessary data structure for the Boyer–Moore algorithm for more efficient search, internally.
This map can be reused.
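To illustrate what the two flags do in isolation, here is a small sketch. The helper name containsIgnoreCase and the sample strings are made up for the illustration; only Pattern.compile, the two flags and asPredicate come from the answer.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class LiteralMatchSketch {

    // Builds a case-insensitive "contains" test.
    // LITERAL makes regex metacharacters in the needle match themselves.
    static Predicate<String> containsIgnoreCase(String needle) {
        return Pattern.compile(needle, Pattern.LITERAL | Pattern.CASE_INSENSITIVE)
                .asPredicate();
    }

    public static void main(String[] args) {
        Predicate<String> figs = containsIgnoreCase("Figs");
        System.out.println(figs.test("I don't know like FIGS."));  // true

        Predicate<String> dotted = containsIgnoreCase("a.b");
        System.out.println(dotted.test("xA.By"));  // true
        System.out.println(dotted.test("aXb"));    // false, the dot is literal
    }
}
```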
For your specific task, one way to use it would be
filters.entrySet().stream()
.map(e -> Map.entry(e.getKey(), listofComments.stream().filter(e.getValue()).count()))
.sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
.forEachOrdered(e -> System.out.printf("%-7s%3d%n", e.getKey(), e.getValue()));
which will print for your example data:
Mangoes 3
Apple 2
Oranges 2
Figs 1
Note that the filters map is already sorted alphabetically, and the sorted operation of the second stream is stable for streams with a defined encounter order. It therefore only needs to sort by occurrences: entries with equal counts keep their relative order, which is the alphabetical order from the source map.
Map.entry(…) requires Java 9 or newer. For Java 8, you’d have to use something like
new AbstractMap.SimpleEntry<>(…) instead.
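The stable-sort argument can be checked with a compact, Java 8 compatible sketch. It uses a plain toLowerCase().contains(…) test instead of the prepared Pattern predicates, and the class and method names are invented for the example.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StableRankingSketch {

    static List<String> rank(List<String> elements, List<String> comments) {
        return elements.stream()
                // first pass: alphabetical order, which becomes the tie-breaker
                .sorted(String.CASE_INSENSITIVE_ORDER)
                .map(e -> {
                    long hits = comments.stream()
                            .filter(c -> c.toLowerCase().contains(e.toLowerCase()))
                            .count();
                    return new AbstractMap.SimpleEntry<String, Long>(e, hits);
                })
                // second pass: by count only; the stable sort keeps ties alphabetical
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("Oranges", "Figs", "Mangoes", "Apple");
        List<String> comments = Arrays.asList(
                "Apples are better than Oranges",
                "I love Mangoes and Oranges",
                "I don't know like Figs. Mangoes are my favorites",
                "I love Mangoes and Apples");
        System.out.println(rank(elements, comments)); // [Mangoes, Apple, Oranges, Figs]
    }
}
```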
You can still modify your function to store Map.Entry instead of a complete Map
Function<String, Map.Entry<String, Long>> function = f -> Map.entry(f, listOfComments.stream()
.filter(e -> e.toLowerCase().contains(f.toLowerCase())).count());
and then sort these entries before performing a terminal operation forEach in your case to print
elements.stream()
.map(function)
.sorted(Comparator.comparing(Map.Entry<String, Long>::getValue)
.reversed().thenComparing(Map.Entry::getKey))
.forEach(System.out::println);
This will then give you as output the following:
Mangoes=3
Apple=2
Oranges=2
Figs=1
First thing is to declare an additional class. It'll hold element and count:
class ElementWithCount {
private final String element;
private final long count;
ElementWithCount(String element, long count) {
this.element = element;
this.count = count;
}
String element() {
return element;
}
long count() {
return count;
}
}
To compute count let's declare an additional function:
static long getElementCount(List<String> listOfComments, String element) {
return listOfComments.stream()
.filter(comment -> comment.contains(element))
.count();
}
So now to find the result we need to transform stream of elements to stream of ElementWithCount objects, then sort that stream by count, then transform it back to stream of elements and collect it into result list.
To make this task easier, let's define comparator as a separate variable:
Comparator<ElementWithCount> comparator = Comparator
.comparing(ElementWithCount::count).reversed()
.thenComparing(ElementWithCount::element);
and now as all parts are ready, final computation is easy:
List<String> result = elements.stream()
.map(element -> new ElementWithCount(element, getElementCount(listOfComments, element)))
.sorted(comparator)
.map(ElementWithCount::element)
.collect(Collectors.toList());
You can use Map.Entry instead of a separate class and inline getElementCount, so it'll be "one-line" solution:
List<String> result = elements.stream()
.map(element ->
new AbstractMap.SimpleImmutableEntry<>(element,
listOfComments.stream()
.filter(comment -> comment.contains(element))
.count()))
.sorted(Map.Entry.<String, Long>comparingByValue().reversed().thenComparing(Map.Entry.comparingByKey()))
.map(Map.Entry::getKey)
.collect(Collectors.toList());
But it's much harder to understand in this form, so I recommend splitting it into logical parts.

How to apply some changes to each element in List<List<>> structure using Java 8 approach

I have a structure like List<List<Double>> listOfDoubles. I need to transform it to List<List<Integer>> listOfInteger. I wrote the code without using Java 8, and it looks like:
List<List<Integer>> integerList = new ArrayList<>();
for (int i = 0; i < doubleList.size(); i++) {
for (Double doubleValue : doubleList.get(i)) {
integerList.get(i).add(doubleValue.intValue());
}
}
I tried to replace second for by foreach but can't because i should be final. How to write this code using Java 8 approach?
You could stream the "outer" list, generating a Stream<List<Double>>. Then, stream each of the "inner" lists, convert each element to an Integer and collect the result. Then just collect the "outer" stream. If you bring it all together, you'd get something like this:
List<List<Integer>> integerList =
doubleList.stream()
.map(l -> l.stream()
.map(Double::intValue)
.collect(Collectors.toList()))
.collect(Collectors.toList());
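As a runnable check of the nested-stream approach, the sketch below wraps it in a method and feeds it sample data (the class name, method name and values are made up). Note that Double::intValue truncates toward zero rather than rounding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TruncationSketch {

    static List<List<Integer>> toIntegers(List<List<Double>> doubleList) {
        return doubleList.stream()
                // convert each inner list on its own, then collect the outer list
                .map(inner -> inner.stream()
                        .map(Double::intValue)   // drops the fractional part
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Double>> input = Arrays.asList(
                Arrays.asList(1.9, -2.7),
                Arrays.asList(3.0));
        System.out.println(toIntegers(input)); // [[1, -2], [3]]
    }
}
```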
In your code you lack the initialisation of the content. I would do it like this.
public List<List<Integer>> convert(List<List<Double>> doubles){
List<List<Integer>> toReturn = new ArrayList<>();
for(List<Double> ld: doubles){
List<Integer> temp = new ArrayList<>();
for(Double d: ld){
temp.add(d.intValue());
}
toReturn.add(temp);
}
return toReturn;
}
Basically, for each list in the double list of lists ( not really a matrix, but you could view it as one), you create a list in the integer list of lists and convert the single content.
P.S. this approach is more like Java 7 though, since a pure Java 8 would use streams as suggested in another answer.
We can get rid of the boilerplate code collect(Collectors.toList()) with StreamEx:
StreamEx.of(listOfDoubles)
.map(e -> StreamEx.of(e).map(Double::intValue).toList())
.toList();
To me, the readability and maintainability of the above code is much better than the code with the JDK stream API:
List<List<Integer>> integerList =
doubleList.stream()
.map(l -> l.stream()
.map(Double::intValue)
.collect(Collectors.toList()))
.collect(Collectors.toList());

Is there any way to reuse a Stream? [duplicate]

I'm learning the new Java 8 features, and while experimenting with streams (java.util.stream.Stream) and collectors, I realized that a stream can't be used twice.
Is there any way to reuse it?
If you want to have the effect of reusing a stream, you might wrap the stream expression in a Supplier and call myStreamSupplier.get() whenever you want a fresh one. For example:
Supplier<Stream<String>> sup = () -> someList.stream();
List<String> nonEmptyStrings = sup.get().filter(s -> !s.isEmpty()).collect(Collectors.toList());
Set<String> uniqueStrings = sup.get().collect(Collectors.toSet());
From the documentation:
A stream should be operated on (invoking an intermediate or terminal stream operation) only once.
A stream implementation may throw IllegalStateException if it detects that the stream is being reused.
So the answer is no, streams are not meant to be reused.
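A quick way to see this behaviour is to run two terminal operations on the same stream object. The documentation only says an implementation may throw, so treat the following as a sketch of what I would expect on a standard OpenJDK runtime, not a guarantee; the class and method names are invented.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleUseSketch {

    // Returns true if the second use of the same stream object is rejected.
    static boolean secondUseFails() {
        Stream<String> stream = Stream.of("a", "b", "c");
        stream.collect(Collectors.toList());      // first terminal operation
        try {
            stream.collect(Collectors.toList());  // second use of the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());
    }
}
```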
As others have said, "no you can't".
But it's useful to remember the handy summaryStatistics() for many basic operations:
So instead of:
List<Person> personList = getPersons();
personList.stream().mapToInt(p -> p.getAge()).average().getAsDouble();
personList.stream().mapToInt(p -> p.getAge()).min().getAsInt();
personList.stream().mapToInt(p -> p.getAge()).max().getAsInt();
You can:
// Can also be DoubleSummaryStatistics from mapToDouble()
IntSummaryStatistics stats = personList.stream()
.mapToInt(p-> p.getAge())
.summaryStatistics();
stats.getAverage();
stats.getMin();
stats.getMax();
The whole idea of the Stream is that it's once-off. This allows you to create non-reenterable sources (for example, reading the lines from the network connection) without intermediate storage. If you, however, want to reuse the Stream content, you may dump it into the intermediate collection to get the "hard copy":
Stream<MyType> stream = // get the stream from somewhere
List<MyType> list = stream.collect(Collectors.toList()); // materialize the stream contents
list.stream().doSomething // create a new stream from the list
list.stream().doSomethingElse // create one more stream from the list
If you don't want to materialize the stream, in some cases there are ways to do several things with the same stream at once. For example, you may refer to this or this question for details.
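One such way, assuming Java 12 or newer is available, is Collectors.teeing, which feeds every element to two collectors in a single pass and merges their results. This is a sketch with invented names and is not taken from the questions referred to above.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TeeingSketch {

    // Counts and joins the elements while traversing the stream only once.
    // Requires Java 12+ for Collectors.teeing.
    static String countAndJoin(Stream<String> words) {
        return words.collect(Collectors.teeing(
                Collectors.counting(),
                Collectors.joining(","),
                (count, joined) -> count + ":" + joined));
    }

    public static void main(String[] args) {
        System.out.println(countAndJoin(Stream.of("a", "b", "c"))); // 3:a,b,c
    }
}
```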
As others have noted the stream object itself cannot be reused.
But one way to get the effect of reusing a stream is to extract the stream creation code to a function.
You can do this by creating a method or a function object which contains the stream creation code. You can then use it multiple times.
Example:
public static void main(String[] args) {
List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
// The normal way to use a stream:
List<String> result1 = list.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
// The stream operation can be extracted to a local function to
// be reused on multiple sources:
Function<List<Integer>, List<String>> listOperation = l -> l.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
List<String> result2 = listOperation.apply(list);
List<String> result3 = listOperation.apply(Arrays.asList(1, 2, 3));
// Or the stream operation can be extracted to a static method,
// if it doesn't refer to any local variables:
List<String> result4 = streamMethod(list);
// The stream operation can also have Stream as argument and return value,
// so that it can be used as a component of a longer stream pipeline:
Function<Stream<Integer>, Stream<String>> streamOperation = s -> s
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i);
List<String> result5 = streamOperation.apply(list.stream().map(i -> i * 2))
.filter(s -> s.length() < 7)
.sorted()
.collect(toCollection(LinkedList::new));
}
public static List<String> streamMethod(List<Integer> l) {
return l.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
}
If, on the other hand, you already have a stream object which you want to iterate over multiple times, then you must save the content of the stream in some collection object.
You can then get multiple streams with the same content from that collection.
Example:
public void test(Stream<Integer> stream) {
// Create a copy of the stream elements
List<Integer> streamCopy = stream.collect(toList());
// Use the copy to get multiple streams
List<Integer> result1 = streamCopy.stream() ...
List<Integer> result2 = streamCopy.stream() ...
}
Come to think of it, this wish to "reuse" a stream is really just the wish to get the desired result with a nice inline operation. So, basically, the question is: what can we do to keep on processing after we have written a terminal operation?
1) if your terminal operation returns a collection, the problem is solved right away, since every collection can be turned back into a stream (JDK 8).
List<Integer> l=Arrays.asList(5,10,14);
l.stream()
.filter(nth-> nth>5)
.collect(Collectors.toList())
.stream()
.filter(nth-> nth%2==0).forEach(nth-> System.out.println(nth));
2) if your terminal operation returns an Optional, with the JDK 9 enhancements to the Optional class, you can turn the Optional result into a stream, and obtain the desired nice inline operation:
List<Integer> l=Arrays.asList(5,10,14);
l.stream()
.filter(nth-> nth>5)
.findAny()
.stream()
.filter(nth-> nth%2==0).forEach(nth-> System.out.println(nth));
3) if your terminal operation returns something else, I really doubt that you should consider a stream to process such a result:
List<Integer> l=Arrays.asList(5,10,14);
boolean allEven=l.stream()
.filter(nth-> nth>5)
.allMatch(nth-> nth%2==0);
if(allEven){
...
}
The Functional Java library provides its own streams that do what you are asking for, i.e. they're memoized and lazy. You can use its conversion methods to convert between Java SDK objects and FJ objects, e.g. Java8.JavaStream_Stream(stream) will return a reusable FJ stream given a JDK 8 stream.

Cartesian product of streams in Java 8 as stream (using streams only)

I would like to create a method which creates a stream of elements which are cartesian products of multiple given streams (aggregated to the same type at the end by a binary operator). Please note that both arguments and results are streams, not collections.
For example, for two streams of {A, B} and {X, Y} I would like it to produce a stream of values {AX, AY, BX, BY} (simple concatenation is used for aggregating the strings). So far, I have come up with this code:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, Stream<T>... streams) {
Stream<T> result = null;
for (Stream<T> stream : streams) {
if (result == null) {
result = stream;
} else {
result = result.flatMap(m -> stream.map(n -> aggregator.apply(m, n)));
}
}
return result;
}
This is my desired use case:
Stream<String> result = cartesian(
(a, b) -> a + b,
Stream.of("A", "B"),
Stream.of("X", "Y")
);
System.out.println(result.collect(Collectors.toList()));
Expected result: AX, AY, BX, BY.
Another example:
Stream<String> result = cartesian(
(a, b) -> a + b,
Stream.of("A", "B"),
Stream.of("K", "L"),
Stream.of("X", "Y")
);
Expected result: AKX, AKY, ALX, ALY, BKX, BKY, BLX, BLY.
However, if I run the code, I get this error:
IllegalStateException: stream has already been operated upon or closed
Where is the stream consumed? By flatMap? Can it be easily fixed?
Passing the streams in your example is never better than passing Lists:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, List<T>... lists) {
...
}
And use it like this:
Stream<String> result = cartesian(
(a, b) -> a + b,
Arrays.asList("A", "B"),
Arrays.asList("K", "L"),
Arrays.asList("X", "Y")
);
In both cases you create an implicit array from varargs and use it as data source, thus the laziness is imaginary. Your data is actually stored in the arrays.
In most cases the resulting Cartesian product stream is much longer than the inputs, so there's practically no reason to make the inputs lazy. For example, with five lists of five elements (25 in total), the resulting stream has 3125 elements. So storing 25 elements in memory is not a big problem. Actually, in most practical cases they are already stored in memory.
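The body of the List-based variant is elided above. One possible way to write it, shown here only as a sketch and not necessarily what the answer had in mind, reuses the loop from the question and asks each list for a fresh stream inside the lambda:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ListCartesianSketch {

    @SafeVarargs
    static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, List<T>... lists) {
        Stream<T> result = null;
        for (List<T> list : lists) {
            if (result == null) {
                result = list.stream();
            } else {
                // list.stream() yields a new stream for every element m,
                // so nothing is consumed twice
                result = result.flatMap(m -> list.stream().map(n -> aggregator.apply(m, n)));
            }
        }
        return result == null ? Stream.empty() : result;
    }

    public static void main(String[] args) {
        List<String> product = cartesian(
                (a, b) -> a + b,
                Arrays.asList("A", "B"),
                Arrays.asList("K", "L"),
                Arrays.asList("X", "Y"))
                .collect(Collectors.toList());
        System.out.println(product); // [AKX, AKY, ALX, ALY, BKX, BKY, BLX, BLY]
    }
}
```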
In order to generate the stream of Cartesian products you need to constantly "rewind" all the streams (except the first one). To rewind, the streams should be able to retrieve the original data again and again, either buffering them somehow (which you don't like) or grabbing them again from the source (collection, array, file, network, random numbers, etc.) and performing all the intermediate operations again and again. If your source and intermediate operations are slow, then the lazy solution may be much slower than the buffering solution. If your source is unable to produce the data again (for example, a random number generator which cannot reproduce the numbers it produced before), your solution will be incorrect.
Nevertheless, a totally lazy solution is possible. Just use stream suppliers instead of streams:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator,
Supplier<Stream<T>>... streams) {
return Arrays.stream(streams)
.reduce((s1, s2) ->
() -> s1.get().flatMap(t1 -> s2.get().map(t2 -> aggregator.apply(t1, t2))))
.orElse(Stream::empty).get();
}
The solution is interesting as we create and reduce the stream of suppliers to get the resulting supplier and finally call it. Usage:
Stream<String> result = cartesian(
(a, b) -> a + b,
() -> Stream.of("A", "B"),
() -> Stream.of("K", "L"),
() -> Stream.of("X", "Y")
);
result.forEach(System.out::println);
The stream is consumed in the flatMap operation in the second iteration. So you have to create a new stream every time you map your result. Therefore you have to collect the stream in advance to get a new stream in every iteration.
private static <T> Stream<T> cartesian(BiFunction<T, T, T> aggregator, Stream<T>... streams) {
Stream<T> result = null;
for (Stream<T> stream : streams) {
if (result == null) {
result = stream;
} else {
Collection<T> s = stream.collect(Collectors.toList());
result = result.flatMap(m -> s.stream().map(n -> aggregator.apply(m, n)));
}
}
return result;
}
Or even shorter:
private static <T> Stream<T> cartesian(BiFunction<T, T, T> aggregator, Stream<T>... streams) {
return Arrays.stream(streams).reduce((r, s) -> {
List<T> collect = s.collect(Collectors.toList());
return r.flatMap(m -> collect.stream().map(n -> aggregator.apply(m, n)));
}).orElse(Stream.empty());
}
You can create a method that returns a stream of List<T> of objects and does not aggregate them. The algorithm is the same: at each step, collect the elements of the second stream to a list and then append them to the elements of the first stream.
The aggregator is outside the method.
@SuppressWarnings("unchecked")
public static <T> Stream<List<T>> cartesianProduct(Stream<T>... streams) {
// incorrect incoming data
if (streams == null) return Stream.empty();
return Arrays.stream(streams)
// non-null streams
.filter(Objects::nonNull)
// represent each list element as SingletonList<Object>
.map(stream -> stream.map(Collections::singletonList))
// summation of pairs of inner lists
.reduce((stream1, stream2) -> {
// list of lists from second stream
List<List<T>> list2 = stream2.collect(Collectors.toList());
// append to the first stream
return stream1.flatMap(inner1 -> list2.stream()
// combinations of inner lists
.map(inner2 -> {
List<T> list = new ArrayList<>();
list.addAll(inner1);
list.addAll(inner2);
return list;
}));
}).orElse(Stream.empty());
}
public static void main(String[] args) {
Stream<String> stream1 = Stream.of("A", "B");
Stream<String> stream2 = Stream.of("K", "L");
Stream<String> stream3 = Stream.of("X", "Y");
@SuppressWarnings("unchecked")
Stream<List<String>> stream4 = cartesianProduct(stream1, stream2, stream3);
// output
stream4.map(list -> String.join("", list)).forEach(System.out::println);
}
String.join is a kind of aggregator in this case.
Output:
AKX
AKY
ALX
ALY
BKX
BKY
BLX
BLY
See also: Stream of cartesian product of other streams, each element as a List?

Java 8 Stream Closes when using map inside map

I am trying to join two streams together. I ran into the problem that my stream closes and I don't understand why. Could anyone please explain to me why the following happens?
The code below doesn't work. I get an exception on the flatMap function that the stream is already closed.
private Stream<KeyValuePair<T, U>> joinStreams(Stream<T> first, Stream<U> second) {
return
first
.map(x -> second
.map(y -> new KeyValuePair<T, U>(x, y))
)
.flatMap(x -> x);
}
When I first collect a list from the second stream and then grab a stream from that list it does work. See the example below.
private Stream<KeyValuePair<T, U>> joinStreams(Stream<T> first, Stream<U> second) {
List<U> secondList = second.collect(Collectors.toList());
return
first
.map(x -> secondList.stream()
.map(y -> new KeyValuePair<T, U>(x, y))
)
.flatMap(x -> x);
}
I can't figure out why this happens. Could anyone please explain this?
Edit:
Example of the code calling this function.
List<Integer> numbers1 = Arrays.asList(1, 2);
List<Integer> numbers2 = Arrays.asList(3, 4);
List<KeyValuePair<Integer, Integer>> combined = joinStreams(numbers1.stream(), numbers2.stream())
.collect(Collectors.toList());
// Expected result
// 1 3
// 1 4
// 2 3
// 2 4
The problem is that your code attempts to process the second Stream twice (once for each element of the first Stream). A Stream can only be processed once, just like an Iterator can only iterate over the elements of the underlying class once.
If your first Stream had only one element, the code would work, since the second Stream would only be processed once.
In the code that does work, you produce a new Stream (from secondList) for each element of the first Stream, so each Stream is processed once, regardless of how many elements are in the first Stream.
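If collecting the second stream into a list is not wanted, another option is to pass a Supplier that creates a new second stream on demand. The sketch below is an assumption-laden variant of joinStreams: it changes the second parameter to a Supplier and uses Map.Entry as a stand-in for the question's KeyValuePair class.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Stream;

class JoinStreamsSketch {

    // second.get() is called once per element of first, so each inner
    // stream object is processed exactly once.
    static <T, U> Stream<Map.Entry<T, U>> joinStreams(Stream<T> first,
                                                      Supplier<Stream<U>> second) {
        return first.flatMap(x -> second.get()
                .<Map.Entry<T, U>>map(y -> new AbstractMap.SimpleImmutableEntry<>(x, y)));
    }

    public static void main(String[] args) {
        List<Integer> numbers2 = Arrays.asList(3, 4);
        joinStreams(Stream.of(1, 2), numbers2::stream)
                .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
}
```

With the sample lists from the question this should print the four expected pairs, 1 3 through 2 4, in that order.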
