I am learning about the Java Stream API and got stuck on one problem:
I have a use case where I want to return a newly created HashMap using a stream. Currently I use the traditional approach of defining a HashMap in the function and adding values to it.
I am interested in better ways to achieve this.
public Map<String,String> constructMap(List<CustomObject> lists){
Map<String,String> newMap = new HashMap<>();
lists.stream().filter(x->x!=null).forEach(obj -> newMap.putAll(obj.getSomeMapping(studentId)));
return newMap;
}
Can I achieve this using reduce or some other way, without having to create the HashMap myself (ideally returning directly from a stream one-liner)?
Edit:
for Example:
CustomObject c1 = new CustomObject("bookId1", "book1");
CustomObject c2 = new CustomObject("bookId2", "book2");
List<CustomObject> lists = new ArrayList<>();
lists.add(c1); lists.add(c2);
The getter in class CustomObject is getSomeMapping(input),
which returns a Map<BookID, Book>.
Expected output:
{"bookId1" : "book1", "bookId2" : "book2"}
Edit2:
One more thing to clarify, the CustomObject class does not have any other getters defined. The only function I have access to is getSomeMapping(input) which returns a mapping
thank you for any help.
Assuming CustomObject has the following structure and getter getSomeMapping which returns a map:
class CustomObject {
private Map<String, String> someMapping;
public CustomObject(String key, String value) {
this.someMapping = new HashMap<>();
someMapping.put(key, value);
}
public Map<String, String> getSomeMapping() {
return someMapping;
}
}
Then constructMap can use Collectors.toMap after flattening the entries of each someMapping:
public static Map<String, String> constructMap(List<CustomObject> list) {
return list.stream()
.filter(Objects::nonNull)
.map(CustomObject::getSomeMapping)
.flatMap(map -> map.entrySet().stream())
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue,
(v1, v2) -> v1, // merge function to handle possible duplicates
LinkedHashMap::new
));
}
Test
CustomObject c1 = new CustomObject("bookId1", "book1");
CustomObject c2 = new CustomObject("bookId2", "book2");
List<CustomObject> lists = Arrays.asList(c1, c2);
Map<String, String> result = constructMap(lists);
System.out.println(result);
Output:
{bookId1=book1, bookId2=book2}
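Since the question asks about a reduce-style one-liner, here is a minimal sketch of another option: the three-argument Stream.collect, which merges each returned map with putAll. The CustomObject class below is a hypothetical stand-in (the real one is not shown in the question), and the explicit type witness on collect is only there to keep type inference unambiguous.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class PutAllCollectSketch {
    // Hypothetical stand-in for CustomObject: exposes only a map-returning getter.
    static class CustomObject {
        private final Map<String, String> someMapping = new HashMap<>();
        CustomObject(String key, String value) { someMapping.put(key, value); }
        Map<String, String> getSomeMapping() { return someMapping; }
    }

    static Map<String, String> constructMap(List<CustomObject> list) {
        return list.stream()
                .filter(Objects::nonNull)
                .map(CustomObject::getSomeMapping)
                // supplier, accumulator, combiner: on duplicate keys the later entry wins
                .<Map<String, String>>collect(HashMap::new, Map::putAll, Map::putAll);
    }

    public static void main(String[] args) {
        List<CustomObject> list = Arrays.asList(
                new CustomObject("bookId1", "book1"), null, new CustomObject("bookId2", "book2"));
        System.out.println(constructMap(list));
    }
}
```

Unlike the toMap version with (v1, v2) -> v1, putAll silently lets the last value for a key win, and the HashMap gives no ordering guarantee.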
You can use Collectors#toMap(Function<? super T,? extends K> keyMapper, Function<? super T,? extends U> valueMapper, BinaryOperator<U> mergeFunction, Supplier<M> mapSupplier) to create a LinkedHashMap using the bookId as the key, and bookName as the value.
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;
class CustomObject {
private String bookId;
private String bookName;
public CustomObject(String bookId, String bookName) {
this.bookId = bookId;
this.bookName = bookName;
}
public String getBookId() {
return bookId;
}
public String getBookName() {
return bookName;
}
// Other stuff e.g. equals, hashCode etc.
}
public class Main {
public static void main(String[] args) {
List<CustomObject> list = List.of(new CustomObject("bookId1", "book1"), new CustomObject("bookId2", "book2"));
System.out.println(constructMap(list));
}
public static Map<String, String> constructMap(List<CustomObject> list) {
return list.stream()
.filter(Objects::nonNull)
.collect(Collectors.toMap(CustomObject::getBookId, CustomObject::getBookName, (a, b) -> a, LinkedHashMap::new));
}
}
Output:
{bookId1=book1, bookId2=book2}
Note: The mergeFunction, (a, b) -> a resolves the collision between values associated with the same key e.g. in this case, we have defined it to select a out of a and b having the same key. If the order of elements does not matter, you can use Collectors#toMap(Function<? super T,? extends K> keyMapper, Function<? super T,? extends U> valueMapper) as shown below:
public static Map<String, String> constructMap(List<CustomObject> list) {
return list.stream()
.filter(Objects::nonNull)
.collect(Collectors.toMap(CustomObject::getBookId, CustomObject::getBookName));
}
A sample output:
{bookId2=book2, bookId1=book1}
To turn a stream into a map you're better off using collect(). For instance:
public Map<String,String> toMap(List<Entry<String,String>> entries) {
return entries.stream().collect(Collectors.toMap(Entry::getKey, Entry::getValue));
}
Or if your keys are non-unique and you want the values to be combined as a list:
public Map<String,List<CustomObject>> toMap(List<CustomObject> entries) {
return entries.stream().collect(Collectors.groupingBy(CustomObject::getKey));
}
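If you want the grouped lists to hold values rather than whole objects, a downstream Collectors.mapping can be passed to groupingBy. This is only a sketch: the Item class and its getKey/getValue accessors are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupValuesSketch {
    // Hypothetical element type whose key may repeat.
    static class Item {
        private final String key;
        private final String value;
        Item(String key, String value) { this.key = key; this.value = value; }
        String getKey() { return key; }
        String getValue() { return value; }
    }

    static Map<String, List<String>> toMultiMap(List<Item> items) {
        return items.stream()
                .collect(Collectors.groupingBy(
                        Item::getKey,
                        // collect only the values of each group, in encounter order
                        Collectors.mapping(Item::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("a", "1"), new Item("b", "2"), new Item("a", "3"));
        System.out.println(toMultiMap(items).get("a")); // prints [1, 3]
    }
}
```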
Look into Collectors.toMap(). This can return the items as a new Map.
lists.stream().filter(x->x!=null).collect(Collectors.toMap(CustomObject::getMapKey, CustomObject::getMapValue));
Here getMapKey and getMapValue are methods returning the key and value of the CustomObject for the map. Instead of simple getters, it might also be necessary to execute some more advanced logic:
lists.stream().filter(x->x!=null).collect(Collectors.toMap(l -> {...; return key;}, l -> { ...; return value;}));
Let's assume your CustomObject class has getters to retrieve a school ID and a name. You could do it like this. I declared it static as it does not appear to depend on instance fields.
public static Map<String,String> constructMap(List<CustomObject> lists){
return lists.stream()
.filter(Objects::nonNull)
.collect(Collectors.toMap(CustomObject::getName, CustomObject::getID));
}
This presumes that names and IDs are one-to-one, as this does not handle duplicate keys.
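To make the duplicate-key caveat concrete, here is a small sketch. It uses plain strings keyed by their first letter (an arbitrary choice, just to force a collision): the two-argument toMap throws IllegalStateException, while a merge function resolves the clash.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateKeySketch {
    // No merge function: a repeated key makes the collector throw IllegalStateException.
    static Map<String, String> withoutMerge(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(w -> w.substring(0, 1), Function.identity()));
    }

    // Merge function keeps the value that was seen first for a repeated key.
    static Map<String, String> keepFirst(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(
                        w -> w.substring(0, 1),
                        Function.identity(),
                        (first, second) -> first));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(keepFirst(words).get("a")); // prints apple
        try {
            withoutMerge(words);
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected");
        }
    }
}
```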
I use this code to generate a chart with data:
@GetMapping("/terminals")
public ResponseEntity<Map<String, List<TopTerminalsDTO>>> getTopTerminals(
@RequestParam(value = "start_date", required = true) String start_date,
@RequestParam(value = "end_date", required = true) String end_date) {
final List<PaymentTransactionsDailyFacts> list = dashboardService.findTop_Terminals(start_dateTime, end_dateTime);
final Collector<PaymentTransactionsDailyFacts, List<TopTerminalsDTO>, List<TopTerminalsDTO>> terminalsCollector = Collector
.of(ArrayList::new, (terminals, p) -> terminals.add(mapper.toTopTerminalsDTO(p)),
(accumulator, terminals) -> {
accumulator.addAll(terminals);
return accumulator;
});
final Map<String, List<TopTerminalsDTO>> final_map = list.stream().filter(p -> p.getTerminal_id() != null)
.collect(Collectors.groupingBy(p -> getTerminalName(p.getTerminal_id()), terminalsCollector));
return ResponseEntity.ok(final_map);
}
private String getTerminalName(Integer id) {
Optional<Terminals> obj = terminalService.findById(id);
return obj.map(Terminals::getName).orElse("");
}
But I noticed that getTerminalName is called more than 10 times to translate the terminal id into a name. Do you know how I can reduce the number of calls with some optimization?
Modify findTop_Terminals and PaymentTransactionsDailyFacts to include the name (using a SQL LEFT JOIN clause).
Alternatively, scan the list for all terminal ids, then call a List<Terminals> list = terminalService.findByIds(idList); method to get all those terminals using a SQL IN clause.
Note: Beware of the limit on how many ? markers can be in a SQL statement.
Then build a Map<Integer, String> mapping terminal id to name, and replace getTerminalName method with map lookup.
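The in-memory part of that second approach could look roughly like the sketch below. The Terminal class is a hypothetical stand-in for the Terminals entity, and the batch query itself (findByIds) is left out because its real signature is not shown; only the id collection, the map building and the lookup are illustrated.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class TerminalNameLookupSketch {
    // Hypothetical stand-in for the Terminals entity from the question.
    static class Terminal {
        private final Integer id;
        private final String name;
        Terminal(Integer id, String name) { this.id = id; this.name = name; }
        Integer getId() { return id; }
        String getName() { return name; }
    }

    // Distinct, non-null ids: the input for a single IN query.
    static Set<Integer> distinctIds(List<Integer> terminalIds) {
        return terminalIds.stream().filter(Objects::nonNull).collect(Collectors.toSet());
    }

    // Build id -> name from the batch result (assumes non-null names; toMap rejects null values).
    static Map<Integer, String> nameById(List<Terminal> terminals) {
        return terminals.stream()
                .collect(Collectors.toMap(Terminal::getId, Terminal::getName, (a, b) -> a));
    }

    // Map lookup replacing the per-row service call; "" for unknown ids, as in the question.
    static String terminalName(Map<Integer, String> names, Integer id) {
        return names.getOrDefault(id, "");
    }
}
```

With this shape the database is hit once per request instead of once per row, and the groupingBy classifier becomes a pure map lookup.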
Sounds like a case for a temporary cache, scoped to just this request, or perhaps longer if the terminal names are stable enough.
Clearly something like ehCache behind the scenes would suit this well, but I am often tempted with a tactical bit of caching, especially if I don't want to keep the cached values beyond this immediate request.
For example:
TerminalNameCache cache = new TerminalNameCache();
final Map<String, List<TopTerminalsDTO>> final_map = list.stream()
.filter(p -> p.getTerminal_id() != null)
.collect(Collectors.groupingBy(
p -> cache.getTerminalName(p.getTerminal_id()),
terminalsCollector));
Then the TerminalNameCache is just an inner class in the parent Controller class. This is to allow it to call the existing private String getTerminalName(Integer id) method from the question (shown as on ParentControllerClass below):
private class TerminalNameCache {
private final Map<Integer, String> cache = new ConcurrentHashMap<>();
private String getTerminalName(Integer id) {
return cache.computeIfAbsent(id,
id2 -> ParentControllerClass.this.getTerminalName(id2));
}
}
If this looks like it is emerging into a pattern, it would be worth refactoring the caching into something more reusable. E.g. this could be based around caching calls to a generic function:
public class CachedFunction<T, R> implements Function<T, R> {
private final Function<T, R> function;
private final Map<T, R> cache = new ConcurrentHashMap<>();
public CachedFunction(Function<T, R> function) {
this.function = function;
}
@Override
public R apply(T t) {
return cache.computeIfAbsent(t, t2 -> function.apply(t2));
}
}
This would then be used like this:
CachedFunction<Integer, String> cachedFunction = new CachedFunction<>(
id -> getTerminalName(id));
final Map<String, List<TopTerminalsDTO>> final_map = list.stream()
.filter(p -> p.getTerminal_id() != null)
.collect(Collectors.groupingBy(
p -> cachedFunction.apply(p.getTerminal_id()),
terminalsCollector));
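To check that caching of this kind really cuts the number of lookups, a compact variation can count the calls. This is a sketch only: memoize and slowName are illustrative names, and slowName stands in for the real terminalService lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MemoizeSketch {
    // Wraps a function so that each distinct argument is computed at most once.
    static <T, R> Function<T, R> memoize(Function<T, R> function) {
        Map<T, R> cache = new ConcurrentHashMap<>();
        return t -> cache.computeIfAbsent(t, function);
    }

    static final AtomicInteger lookups = new AtomicInteger();

    // Hypothetical expensive lookup; counts how often it is really invoked.
    static String slowName(Integer id) {
        lookups.incrementAndGet();
        return "terminal-" + id;
    }

    public static void main(String[] args) {
        Function<Integer, String> cached = memoize(MemoizeSketch::slowName);
        Map<String, Long> counts = Stream.of(1, 2, 1, 1, 2)
                .collect(Collectors.groupingBy(cached, Collectors.counting()));
        System.out.println(counts.get("terminal-1")); // prints 3
        System.out.println(lookups.get());            // prints 2
    }
}
```

Two properties of ConcurrentHashMap are worth keeping in mind: it rejects null keys (the stream above filters null ids first), and computeIfAbsent stores nothing when the function returns null, so null results are recomputed on every call.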
In JDK 8 with lambda b93 there was a method java.util.stream.Streams.zip which could be used to zip streams (this is illustrated in the tutorial Exploring Java8 Lambdas. Part 1 by Dhananjay Nene). This function:
Creates a lazy and sequential combined Stream whose elements are the
result of combining the elements of two streams.
However, in b98 this has disappeared. In fact, the Streams class is not even accessible in java.util.stream in b98.
Has this functionality been moved, and if so how do I zip streams concisely using b98?
The application I have in mind is in this java implementation of Shen, where I replaced the zip functionality in the
static <T> boolean every(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
static <T> T find(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
functions with rather verbose code (which doesn't use functionality from b98).
I needed this as well so I just took the source code from b93 and put it in a "util" class. I had to modify it slightly to work with the current API.
For reference here's the working code (take it at your own risk...):
public static<A, B, C> Stream<C> zip(Stream<? extends A> a,
Stream<? extends B> b,
BiFunction<? super A, ? super B, ? extends C> zipper) {
Objects.requireNonNull(zipper);
Spliterator<? extends A> aSpliterator = Objects.requireNonNull(a).spliterator();
Spliterator<? extends B> bSpliterator = Objects.requireNonNull(b).spliterator();
// Zipping loses DISTINCT and SORTED characteristics
int characteristics = aSpliterator.characteristics() & bSpliterator.characteristics() &
~(Spliterator.DISTINCT | Spliterator.SORTED);
long zipSize = ((characteristics & Spliterator.SIZED) != 0)
? Math.min(aSpliterator.getExactSizeIfKnown(), bSpliterator.getExactSizeIfKnown())
: -1;
Iterator<A> aIterator = Spliterators.iterator(aSpliterator);
Iterator<B> bIterator = Spliterators.iterator(bSpliterator);
Iterator<C> cIterator = new Iterator<C>() {
@Override
public boolean hasNext() {
return aIterator.hasNext() && bIterator.hasNext();
}
@Override
public C next() {
return zipper.apply(aIterator.next(), bIterator.next());
}
};
Spliterator<C> split = Spliterators.spliterator(cIterator, zipSize, characteristics);
return (a.isParallel() || b.isParallel())
? StreamSupport.stream(split, true)
: StreamSupport.stream(split, false);
}
zip is one of the functions provided by the protonpack library.
Stream<String> streamA = Stream.of("A", "B", "C");
Stream<String> streamB = Stream.of("Apple", "Banana", "Carrot", "Doughnut");
List<String> zipped = StreamUtils.zip(streamA,
streamB,
(a, b) -> a + " is for " + b)
.collect(Collectors.toList());
assertThat(zipped,
contains("A is for Apple", "B is for Banana", "C is for Carrot"));
If you have Guava in your project, you can use the Streams.zip method (was added in Guava 21):
Returns a stream in which each element is the result of passing the corresponding element of each of streamA and streamB to function. The resulting stream will only be as long as the shorter of the two input streams; if one stream is longer, its extra elements will be ignored. The resulting stream is not efficiently splittable. This may harm parallel performance.
public class Streams {
...
public static <A, B, R> Stream<R> zip(Stream<A> streamA,
Stream<B> streamB, BiFunction<? super A, ? super B, R> function) {
...
}
}
Zipping two streams using JDK8 with lambda (gist).
public static <A, B, C> Stream<C> zip(Stream<A> streamA, Stream<B> streamB, BiFunction<A, B, C> zipper) {
final Iterator<A> iteratorA = streamA.iterator();
final Iterator<B> iteratorB = streamB.iterator();
final Iterator<C> iteratorC = new Iterator<C>() {
@Override
public boolean hasNext() {
return iteratorA.hasNext() && iteratorB.hasNext();
}
@Override
public C next() {
return zipper.apply(iteratorA.next(), iteratorB.next());
}
};
final boolean parallel = streamA.isParallel() || streamB.isParallel();
return iteratorToFiniteStream(iteratorC, parallel);
}
public static <T> Stream<T> iteratorToFiniteStream(Iterator<T> iterator, boolean parallel) {
final Iterable<T> iterable = () -> iterator;
return StreamSupport.stream(iterable.spliterator(), parallel);
}
Since I can't conceive any use of zipping on collections other than indexed ones (Lists) and I am a big fan of simplicity, this would be my solution:
<A,B,C> Stream<C> zipped(List<A> lista, List<B> listb, BiFunction<A,B,C> zipper){
int shortestLength = Math.min(lista.size(),listb.size());
return IntStream.range(0,shortestLength).mapToObj( i -> {
return zipper.apply(lista.get(i), listb.get(i));
});
}
The methods of the class you mentioned have been moved to the Stream interface itself in favor of default methods. But it seems that the zip method has been removed, maybe because it is not clear what the default behavior for different-sized streams should be. But implementing the desired behavior is straightforward:
static <T> boolean every(
Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred) {
Iterator<T> it=c2.iterator();
return c1.stream().allMatch(x->!it.hasNext()||pred.test(x, it.next()));
}
static <T> T find(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred) {
Iterator<T> it=c2.iterator();
return c1.stream().filter(x->it.hasNext()&&pred.test(x, it.next()))
.findFirst().orElse(null);
}
I humbly suggest this implementation. The resulting stream is truncated to the shorter of the two input streams.
public static <L, R, T> Stream<T> zip(Stream<L> leftStream, Stream<R> rightStream, BiFunction<L, R, T> combiner) {
Spliterator<L> lefts = leftStream.spliterator();
Spliterator<R> rights = rightStream.spliterator();
return StreamSupport.stream(new AbstractSpliterator<T>(Long.min(lefts.estimateSize(), rights.estimateSize()), lefts.characteristics() & rights.characteristics()) {
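@Override
public boolean tryAdvance(Consumer<? super T> action) {
// Report success only when both sides advanced, so traversal ends as soon as either stream is exhausted.
boolean[] paired = {false};
lefts.tryAdvance(left -> paired[0] = rights.tryAdvance(right -> action.accept(combiner.apply(left, right))));
return paired[0];
}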
}, leftStream.isParallel() || rightStream.isParallel());
}
Using the latest Guava library (for the Streams class) you should be able to do
final Map<String, String> result =
Streams.zip(
collection1.stream(),
collection2.stream(),
AbstractMap.SimpleEntry::new)
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
The Lazy-Seq library provides zip functionality.
https://github.com/nurkiewicz/LazySeq
This library is heavily inspired by scala.collection.immutable.Stream and aims to provide immutable, thread-safe and easy to use lazy sequence implementation, possibly infinite.
Would this work for you? It's a short function, which lazily evaluates over the streams it's zipping, so you can supply it with infinite streams (it doesn't need to take the size of the streams being zipped).
If the streams are finite it stops as soon as one of the streams runs out of elements.
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.stream.Stream;
class StreamUtils {
static <ARG1, ARG2, RESULT> Stream<RESULT> zip(
Stream<ARG1> s1,
Stream<ARG2> s2,
BiFunction<ARG1, ARG2, RESULT> combiner) {
final var i2 = s2.iterator();
return s1.map(x1 -> i2.hasNext() ? combiner.apply(x1, i2.next()) : null)
.takeWhile(Objects::nonNull);
}
}
Here is some unit test code (much longer than the code itself!)
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import static org.junit.jupiter.api.Assertions.assertEquals;
class StreamUtilsTest {
@ParameterizedTest
@MethodSource("shouldZipTestCases")
<ARG1, ARG2, RESULT>
void shouldZip(
String testName,
Stream<ARG1> s1,
Stream<ARG2> s2,
BiFunction<ARG1, ARG2, RESULT> combiner,
Stream<RESULT> expected) {
var actual = StreamUtils.zip(s1, s2, combiner);
assertEquals(
expected.collect(Collectors.toList()),
actual.collect(Collectors.toList()),
testName);
}
private static Stream<Arguments> shouldZipTestCases() {
return Stream.of(
Arguments.of(
"Two empty streams",
Stream.empty(),
Stream.empty(),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"One singleton and one empty stream",
Stream.of(1),
Stream.empty(),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"One empty and one singleton stream",
Stream.empty(),
Stream.of(1),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"Two singleton streams",
Stream.of("blah"),
Stream.of(1),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("blah", 1))),
Arguments.of(
"One singleton, one multiple stream",
Stream.of("blob"),
Stream.of(2, 3),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("blob", 2))),
Arguments.of(
"One multiple, one singleton stream",
Stream.of("foo", "bar"),
Stream.of(4),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("foo", 4))),
Arguments.of(
"Two multiple streams",
Stream.of("nine", "eleven"),
Stream.of(10, 12),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("nine", 10), pair("eleven", 12)))
);
}
private static List<Object> pair(Object o1, Object o2) {
return List.of(o1, o2);
}
static private <T1, T2> List<Object> combine(T1 o1, T2 o2) {
return List.of(o1, o2);
}
@Test
void shouldLazilyEvaluateInZip() {
final var a = new AtomicInteger();
final var b = new AtomicInteger();
final var zipped = StreamUtils.zip(
Stream.generate(a::incrementAndGet),
Stream.generate(b::decrementAndGet),
(xa, xb) -> xb + 3 * xa);
assertEquals(0, a.get(), "Should not have evaluated a at start");
assertEquals(0, b.get(), "Should not have evaluated b at start");
final var takeTwo = zipped.limit(2);
assertEquals(0, a.get(), "Should not have evaluated a at take");
assertEquals(0, b.get(), "Should not have evaluated b at take");
final var list = takeTwo.collect(Collectors.toList());
assertEquals(2, a.get(), "Should have evaluated a after collect");
assertEquals(-2, b.get(), "Should have evaluated b after collect");
assertEquals(List.of(2, 4), list);
}
}
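One caveat with the takeWhile-based zip in this answer: it uses null as the "second stream is finished" signal, so a combiner that legitimately returns null also cuts the result short. A possible variation, sketched here under the same assumptions (sequential streams, Java 9+ for takeWhile), uses a private marker object instead. The class name and the END field are illustrative.

```java
import java.util.Iterator;
import java.util.function.BiFunction;
import java.util.stream.Stream;

class SentinelZipSketch {
    // Private marker meaning "second stream is exhausted"; never handed to callers.
    private static final Object END = new Object();

    @SuppressWarnings("unchecked")
    static <A, B, R> Stream<R> zip(Stream<A> s1, Stream<B> s2, BiFunction<A, B, R> combiner) {
        Iterator<B> i2 = s2.iterator();
        return s1
                .<Object>map(x1 -> i2.hasNext() ? combiner.apply(x1, i2.next()) : END)
                .takeWhile(o -> o != END) // stop at the marker, not at a null result
                .map(o -> (R) o);         // only combiner results remain at this point
    }
}
```

With this version, zipping 1, 2, 3 with "a", "b" and a combiner that returns null for the first pair still yields two elements, the first of which is null.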
public class Tuple<S,T> {
private final S object1;
private final T object2;
public Tuple(S object1, T object2) {
this.object1 = object1;
this.object2 = object2;
}
public S getObject1() {
return object1;
}
public T getObject2() {
return object2;
}
}
public class StreamUtils {
private StreamUtils() {
}
public static <T> Stream<Tuple<Integer,T>> zipWithIndex(Stream<T> stream) {
Stream<Integer> integerStream = IntStream.range(0, Integer.MAX_VALUE).boxed();
Iterator<Integer> integerIterator = integerStream.iterator();
return stream.map(x -> new Tuple<>(integerIterator.next(), x));
}
}
AOL's cyclops-react, to which I contribute, also provides zipping functionality, both via an extended Stream implementation, that also implements the reactive-streams interface ReactiveSeq, and via StreamUtils that offers much of the same functionality via static methods to standard Java Streams.
List<Tuple2<Integer,Integer>> list = ReactiveSeq.of(1,2,3,4,5,6)
.zip(Stream.of(100,200,300,400));
List<Tuple2<Integer,Integer>> list = StreamUtils.zip(Stream.of(1,2,3,4,5,6),
Stream.of(100,200,300,400));
It also offers more generalized Applicative based zipping. E.g.
ReactiveSeq.of("a","b","c")
.ap3(this::concat)
.ap(of("1","2","3"))
.ap(of(".","?","!"))
.toList();
//List("a1.","b2?","c3!");
private String concat(String a, String b, String c){
return a+b+c;
}
And even the ability to pair every item in one stream with every item in another
ReactiveSeq.of("a","b","c")
.forEach2(str->Stream.of(str+"!","2"), a->b->a+"_"+b);
//ReactiveSeq("a_a!","a_2","b_b!","b_2","c_c!","c_2")
If anyone still needs this, there is the StreamEx.zipWith function in the streamex library:
StreamEx<String> givenNames = StreamEx.of("Leo", "Fyodor");
StreamEx<String> familyNames = StreamEx.of("Tolstoy", "Dostoevsky");
StreamEx<String> fullNames = givenNames.zipWith(familyNames, (gn, fn) -> gn + " " + fn);
fullNames.forEach(System.out::println); // prints: "Leo Tolstoy\nFyodor Dostoevsky\n"
This is great. I had to zip two streams into a Map, with one stream supplying the keys and the other the values:
Stream<String> streamA = Stream.of("A", "B", "C");
Stream<String> streamB = Stream.of("Apple", "Banana", "Carrot", "Doughnut");
final Stream<Map.Entry<String, String>> s = StreamUtils.zip(streamA,
streamB,
(a, b) -> {
final Map.Entry<String, String> entry = new AbstractMap.SimpleEntry<String, String>(a, b);
return entry;
});
System.out.println(s.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue())));
Output:
{A=Apple, B=Banana, C=Carrot}
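When both inputs are already lists, the same key/value pairing can be done without any zip helper by streaming over the indices, along the lines of the list-based answer earlier. A minimal sketch follows; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ZipToMapSketch {
    static <K, V> Map<K, V> zipToMap(List<K> keys, List<V> values) {
        int n = Math.min(keys.size(), values.size()); // truncate to the shorter list
        return IntStream.range(0, n)
                .boxed()
                .collect(Collectors.toMap(
                        keys::get,
                        values::get,
                        (first, second) -> first, // keep the first value on duplicate keys
                        LinkedHashMap::new));     // preserve the pairing order
    }

    public static void main(String[] args) {
        System.out.println(zipToMap(
                Arrays.asList("A", "B", "C"),
                Arrays.asList("Apple", "Banana", "Carrot", "Doughnut")));
        // prints {A=Apple, B=Banana, C=Carrot}
    }
}
```

Index access is only cheap for random-access lists such as ArrayList; for a LinkedList the iterator-based zips above are a better fit.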