Lazy flatMap implementation algorithm in java 10 - java

I know Java streams, and I have tried to implement map, filter, and fold (with a custom function as argument) in both strict and lazy evaluation styles.
However, I could not come up with a lazy implementation of flatMap in Java.
Plain map, filter, and fold are just composed functions which run on the main iterator (if the source is a list), and the application of a function is discarded if the incoming value is null.
However, the function passed to flatMap produces another list (stream), which needs to be flattened.
How is the lazy flatMap implemented in Java 10? Is there any document on the algorithm?
Thanks.

If you want to implement lazy flatMap, the most important part is to provide a correct implementation of Iterator. This implementation can look like this:
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

final class FlatMappedIterator<A, B> implements Iterator<B> {
    private final Iterator<A> iterator;
    private final Function<A, Iterable<B>> f;
    private Iterator<B> targetIterator; // iterator obtained by applying `f` to an element of type A

    FlatMappedIterator(Iterator<A> iterator, Function<A, Iterable<B>> f) {
        this.iterator = iterator;
        this.f = f;
    }

    @Override
    public boolean hasNext() {
        if (targetIterator != null && targetIterator.hasNext()) {
            return true;
        }
        // Keep pulling from the source until `f` yields a non-empty Iterable,
        // so that empty results do not end the iteration prematurely.
        while (iterator.hasNext()) {
            A next = iterator.next();
            Iterable<B> targetIterable = f.apply(next);
            targetIterator = targetIterable.iterator();
            if (targetIterator.hasNext()) {
                return true;
            }
        }
        return false;
    }

    @Override
    public B next() {
        if (hasNext()) {
            return targetIterator.next();
        } else {
            throw new NoSuchElementException();
        }
    }
}
So the retrieval of the next element is postponed to the moment when hasNext or next is called.
Then you need to implement the flatMap function itself. But this is easy. I'm leaving it as an exercise for the reader :)
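For completeness, here is a minimal sketch of that flatMap function, assuming the lazy collection is exposed as a plain Iterable (the helper below is illustrative and not tied to any particular library):

static <A, B> Iterable<B> flatMap(Iterable<A> source, Function<A, Iterable<B>> f) {
    // Nothing is evaluated here; each call to iterator() builds a fresh lazy iterator.
    return () -> new FlatMappedIterator<A, B>(source.iterator(), f);
}

// Usage: elements are flattened one at a time as the result is iterated.
Iterable<Integer> flat = flatMap(List.of(1, 2, 3), i -> List.of(i, i * 10));
flat.forEach(System.out::println); // 1, 10, 2, 20, 3, 30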

Related

Solution: Iterator which doesn't know if it has a next element

I wrote an iterator which returns subgraphs of a fixed size of another given undirected simple graph.
It maintains an internal graph, which is the currently calculated subgraph, and has private stacks and lists from which it calculates the next subgraph.
It is not possible to know in advance whether the iterator can return another element, because the algorithm may only discover that there are no more subgraphs while it is trying to find the next one.
With this design, the next()/hasNext() pattern which Java offers doesn't work out. I currently have my own interface BlindIterator with the following abstract methods:
/**
 * @return True iff the current element is a valid return.
 */
public boolean hasCurrent();

/**
 * @return Returns the current element, but does NOT generate the next element. This method can be called
 * as often as wanted, without any side-effects.
 */
public T getCurrent();

/**
 * Generates the next element, which can then be retrieved with getCurrent(). This method thus only provides
 * this side-effect. If it is called while the current element is invalid, it may throw an exception,
 * depending on the implementation of the iterator.
 */
public void generateNext();
Is this a common pattern and are there better designs than mine?
I believe what you have created is equivalent to the Iterator interface. Here is an implementation of Iterator using your BlindIterator:
class BlindIteratorIterator<T> implements Iterator<T> {
    private BlindIterator<T> iterator;

    public BlindIteratorIterator(BlindIterator<T> iterator) {
        this.iterator = iterator;
        iterator.generateNext();
    }

    @Override
    public boolean hasNext() {
        return iterator.hasCurrent();
    }

    @Override
    public T next() {
        T next = iterator.getCurrent();
        iterator.generateNext();
        return next;
    }
}
You implement the iterator to preload/cache the next element (subgraph).
For example, if your elements are sourced from a Supplier, where the only method is a get() method that returns the next element, or null if no more elements are available, you would implement the Iterator like this:
public final class SupplierIterator<E> implements Iterator<E> {
    private final Supplier<E> supplier;
    private E next;

    SupplierIterator(Supplier<E> supplier) {
        this.supplier = supplier;
        this.next = supplier.get(); // cache first (preload)
    }

    @Override
    public boolean hasNext() {
        return (this.next != null);
    }

    @Override
    public E next() {
        if (this.next == null)
            throw new NoSuchElementException();
        E elem = this.next;
        this.next = supplier.get(); // cache next
        return elem;
    }
}
The answer by Joni has a good Iterator implementation that can use your intended BlindIterator as the source of elements.
But since you only invented BlindIterator to work around a perceived limitation of Iterator, I'd recommend not doing that. Make the Iterator implementation call the underlying "generate" logic directly.
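For example, a minimal sketch of such an Iterator that calls a (hypothetical) computeNext() search method directly and caches its result; Graph stands in for your own subgraph type and the method body is only a stub for your actual search over the internal stacks and lists:

import java.util.Iterator;
import java.util.NoSuchElementException;

final class SubgraphIterator implements Iterator<Graph> {
    // Your stacks/lists describing the search state would live here.
    private Graph next = computeNext(); // preload the first subgraph

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public Graph next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        Graph current = next;
        next = computeNext(); // compute and cache the following subgraph
        return current;
    }

    // The actual subgraph search; returns null once the algorithm terminates.
    private Graph computeNext() {
        return null; // stub - replace with the real search logic
    }
}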

Use Java lambda instead of 'if else'

With Java 8, I have this code:
if(element.exist()){
// Do something
}
I want to convert to lambda style,
element.ifExist(el -> {
// Do something
});
with an ifExist method like this:
public void ifExist(Consumer<Element> consumer) {
    if (exist()) {
        consumer.accept(this);
    }
}
But now I have else cases to call:
element.ifExist(el -> {
// Do something
}).ifNotExist(el -> {
// Do something
});
I can write a similar ifNotExist, and I want the two to be mutually exclusive (if the exist condition is true, there is no need to check ifNotExist, because the exist() check can be quite expensive), but this way I always have to check twice. How can I avoid that?
Maybe the "exist" word make someone misunderstand my idea. You can imagine that I also need some methods:
ifVisible()
ifEmpty()
ifHasAttribute()
Many people have said that this is a bad idea, but:
In Java 8 we can use a lambda with forEach instead of a traditional for loop. In programming, for and if are two basic flow controls. If we can use a lambda for a for loop, why is using a lambda for if a bad idea?
for (Element element : list) {
element.doSomething();
}
list.forEach(Element::doSomething);
In Java 8, there's Optional with ifPresent, similar to my idea of ifExist:
Optional<Elem> element = ...
element.ifPresent(el -> System.out.println("Present " + el));
And about code maintenance and readability, what do you think if I have the following code with many repeating simple if clauses?
if (e0.exist()) {
    e0.actionA();
} else {
    e0.actionB();
}
if (e1.exist()) {
    e0.actionC();
}
if (e2.exist()) {
    e2.actionD();
}
if (e3.exist()) {
    e3.actionB();
}
Compare to:
e0.ifExist(Element::actionA).ifNotExist(Element::actionB);
e1.ifExist(Element::actionC);
e2.ifExist(Element::actionD);
e3.ifExist(Element::actionB);
Which is better? And, oops, do you notice that in the traditional if clause code, there's a mistake in:
if (e1.exist()) {
e0.actionC(); // Actually e1
}
I think if we use lambda, we can avoid this mistake!
As this almost, but not quite, matches Optional, maybe you should reconsider the logic:
Java 8 has limited expressiveness here:
Optional<Elem> element = ...
element.ifPresent(el -> System.out.println("Present " + el));
System.out.println(element.orElse(DEFAULT_ELEM));
Here map might restrict the view on the element:
element.map(el -> el.mySpecialView()).ifPresent(System.out::println);
Java 9 adds ifPresentOrElse:
element.ifPresentOrElse(el -> System.out.println("Present " + el),
        () -> System.out.println("Not present"));
In general the two branches are asymmetric.
It's called a 'fluent interface'. Simply change the return type and return this; to allow you to chain the methods:
public Element ifExist(Consumer<Element> consumer) {
    if (exist()) {
        consumer.accept(this);
    }
    return this;
}

public Element ifNotExist(Consumer<Element> consumer) {
    if (!exist()) {
        consumer.accept(this);
    }
    return this;
}
You could get a bit fancier and return an intermediate type:
interface Else<T> {
    public void otherwise(Consumer<T> consumer); // 'else' is a keyword
}

class DefaultElse<T> implements Else<T> {
    private final T item;

    DefaultElse(final T item) { this.item = item; }

    @Override
    public void otherwise(Consumer<T> consumer) {
        consumer.accept(item);
    }
}

class NoopElse<T> implements Else<T> {
    @Override
    public void otherwise(Consumer<T> consumer) { }
}

public Else<Element> ifExist(Consumer<Element> consumer) {
    if (exist()) {
        consumer.accept(this);
        return new NoopElse<>();
    }
    return new DefaultElse<>(this);
}
Sample usage:
element.ifExist(el -> {
//do something
})
.otherwise(el -> {
//do something else
});
You can use a single method that takes two consumers:
public void ifExistOrElse(Consumer<Element> ifExist, Consumer<Element> orElse) {
    if (exist()) {
        ifExist.accept(this);
    } else {
        orElse.accept(this);
    }
}
Then call it with:
element.ifExistOrElse(
el -> {
// Do something
},
el -> {
// Do something else
});
The problem
(1) You seem to mix up different aspects - control flow and domain logic.
element.ifExist(() -> { ... }).otherElementMethod();
        ^                      ^
        control flow method    business logic method
(2) It is unclear how methods after a control flow method (like ifExist, ifNotExist) should behave. Should they be always executed or be called only under the condition (similar to ifExist)?
(3) The name ifExist implies a terminal operation, so there is nothing to return - void. A good example is void ifPresent(Consumer) from Optional.
The solution
I would write a fully separated class that would be independent of any concrete class and any specific condition.
The interface is simple, and consists of two contextless control flow methods - ifTrue and ifFalse.
There can be a few ways to create a Condition object. I wrote a static factory method for your instance (e.g. element) and condition (e.g. Element::exist).
public class Condition<E> {
    private final Predicate<E> condition;
    private final E operand;
    private Boolean result;

    private Condition(E operand, Predicate<E> condition) {
        this.condition = condition;
        this.operand = operand;
    }

    public static <E> Condition<E> of(E element, Predicate<E> condition) {
        return new Condition<>(element, condition);
    }

    public Condition<E> ifTrue(Consumer<E> consumer) {
        if (result == null)
            result = condition.test(operand);
        if (result)
            consumer.accept(operand);
        return this;
    }

    public Condition<E> ifFalse(Consumer<E> consumer) {
        if (result == null)
            result = condition.test(operand);
        if (!result)
            consumer.accept(operand);
        return this;
    }

    public E getOperand() {
        return operand;
    }
}
Moreover, we can integrate Condition into Element:
class Element {
    ...
    public Condition<Element> formCondition(Predicate<Element> condition) {
        return Condition.of(this, condition);
    }
}
The pattern I am promoting is:
work with an Element;
obtain a Condition;
control the flow by the Condition;
switch back to the Element;
continue working with the Element.
The result
Obtaining a Condition by Condition.of:
Element element = new Element();
Condition.of(element, Element::exist)
.ifTrue(e -> { ... })
.ifFalse(e -> { ... })
.getOperand()
.otherElementMethod();
Obtaining a Condition by Element#formCondition:
Element element = new Element();
element.formCondition(Element::exist)
.ifTrue(e -> { ... })
.ifFalse(e -> { ... })
.getOperand()
.otherElementMethod();
Update 1:
For other test methods, the idea remains the same.
Element element = new Element();
element.formCondition(Element::isVisible);
element.formCondition(Element::isEmpty);
element.formCondition(e -> e.hasAttribute(ATTRIBUTE));
Update 2:
This is a good reason to rethink the code design; neither of the two snippets is great.
Imagine you need actionC within the e0.exist() branch. How would the method reference Element::actionA have to change?
It would be turned back into a lambda:
e0.ifExist(e -> { e.actionA(); e.actionC(); });
unless you wrap actionA and actionC in a single method (which sounds awful):
e0.ifExist(Element::actionAAndC);
The lambda is now even less 'readable' than the if was.
e0.ifExist(e -> {
e0.actionA();
e0.actionC();
});
But how much effort would that take? And how much effort will we put into maintaining it all?
if(e0.exist()) {
e0.actionA();
e0.actionC();
}
If you are performing a simple check on an object and then executing some statements based on the condition, one approach is to have a Map with a Predicate as the key and the desired expression as the value.
For example:
Map<Predicate<Integer>, Supplier<String>> ruleMap = new LinkedHashMap<Predicate<Integer>, Supplier<String>>() {{
    put(i -> i < 10,   () -> "Less than 10!");
    put(i -> i < 100,  () -> "Less than 100!");
    put(i -> i < 1000, () -> "Less than 1000!");
}};
We can then stream the map's keys to find the first Predicate that returns true, which can replace all the if/else code:
ruleMap.keySet()
       .stream()
       .filter(keyCondition -> keyCondition.test(numItems))
       .findFirst()
       .ifPresent(key -> System.out.print(ruleMap.get(key).get()));
Since we are using findFirst(), this is equivalent to an if / else if / else if chain.
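For completeness, a minimal self-contained sketch of this rule-map pattern; the classify method and the fallback message are illustrative additions, not part of the answer above:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.Supplier;

static String classify(int numItems) {
    Map<Predicate<Integer>, Supplier<String>> ruleMap = new LinkedHashMap<>();
    ruleMap.put(i -> i < 10,   () -> "Less than 10!");
    ruleMap.put(i -> i < 100,  () -> "Less than 100!");
    ruleMap.put(i -> i < 1000, () -> "Less than 1000!");
    return ruleMap.keySet().stream()
            .filter(rule -> rule.test(numItems))
            .findFirst()
            .map(rule -> ruleMap.get(rule).get())
            .orElse("1000 or more!"); // plays the role of a trailing else
}

// classify(42) returns "Less than 100!"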

Special behavior of a stream if there are no elements

How can I express this with the Java 8 streams API?
I want to perform itemConsumer for every item of a stream. If there are no items, I want to perform emptyAction.
Of course I could write something like this:
Consumer<Object> itemConsumer = System.out::println;
Runnable emptyAction = () -> { System.out.println("no elements"); };
Stream<Object> stream = Stream.of("a", "b"); // or Stream.empty()

List<Object> list = stream.collect(Collectors.toList());
if (list.isEmpty())
    emptyAction.run();
else
    list.stream().forEach(itemConsumer);
But I would prefer to avoid any Lists.
I also thought about setting a flag in a peek method - but that flag would be non-final and therefore not allowed. Using a boolean container also seems to be too much of a workaround.
You could coerce reduce to do this. The logic would be to reduce on false, setting the value to true if any useful data is encountered.
If the result of the reduce is false, then no items have been encountered; if any items were encountered, the result is true:
boolean hasItems = stream.reduce(false, (o, i) -> {
    itemConsumer.accept(i);
    return true;
}, (l, r) -> l | r);

if (!hasItems) {
    emptyAction.run();
}
This should work fine for parallel streams, as any stream encountering an item would set the value to true.
I'm not sure, however, that I like this as it's a slightly obtuse use of the reduce operation.
An alternative would be to use AtomicBoolean as a mutable boolean container:
final AtomicBoolean hasItems = new AtomicBoolean(false);
stream.forEach(i -> {
    itemConsumer.accept(i);
    hasItems.set(true);
});

if (!hasItems.get()) {
    emptyAction.run();
}
I don't know if I like that more or less however.
Finally, you could have your itemConsumer remember state:
class ItemConsumer implements Consumer<Object> {
    private volatile boolean hasConsumedAny;

    @Override
    public void accept(Object o) {
        hasConsumedAny = true;
        //magic magic
    }

    public boolean isHasConsumedAny() {
        return hasConsumedAny;
    }
}
final ItemConsumer itemConsumer = new ItemConsumer();
stream.forEach(itemConsumer::accept);
if (!itemConsumer.isHasConsumedAny()) {
emptyAction.run();
}
This seems a bit neater, but might not be practical. So maybe a decorator pattern -
class ItemConsumer<T> implements Consumer<T> {
    private volatile boolean hasConsumedAny;
    private final Consumer<T> delegate;

    ItemConsumer(final Consumer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void accept(T t) {
        hasConsumedAny = true;
        delegate.accept(t);
    }

    public boolean isHasConsumedAny() {
        return hasConsumedAny;
    }
}

final ItemConsumer<Object> consumer = new ItemConsumer<>(o -> { /* magic */ });
TL;DR: something has to remember whether you encountered anything during the consumption of the Stream, be it:
the Stream itself in case of reduce;
AtomicBoolean; or
the consumer
I think the consumer is probably best placed, from a logic point of view.
A solution without any additional variables:
stream.peek(itemConsumer).reduce((a, b) -> a).orElseGet(() -> {
emptyAction.run();
return null;
});
Note that if the stream is parallel, then itemConsumer could be called simultaneously for different elements in different threads (like in forEach, not in forEachOrdered). Also this solution will fail if the first stream element is null.
There’s a simple straight-forward solution:
Spliterator<Object> sp = stream.spliterator();
if (!sp.tryAdvance(itemConsumer))
    emptyAction.run();
else
    sp.forEachRemaining(itemConsumer);
You can even keep parallel support for the elements after the first, if you wish:
Spliterator<Object> sp = stream.parallel().spliterator();
if (!sp.tryAdvance(itemConsumer))
    emptyAction.run();
else
    StreamSupport.stream(sp, true).forEach(itemConsumer);
In my opinion, it is much easier to understand than a reduce-based solution.
You could do this:
if(stream.peek(itemConsumer).count() == 0){
emptyAction.run();
}
But it seems that in Java 9 count may skip the upstream operations (including the peek) when it already knows the size of the Stream (see here), so if you want this to keep working in the future you could use:
if(stream.peek(itemConsumer).mapToLong(e -> 1).sum() == 0){
emptyAction.run();
}
Another attempt to use reduce:
Stream<Object> stream = Stream.of("a", "b", "c");
//Stream<Object> stream = Stream.empty();
Runnable defaultRunnable = () -> System.out.println("empty Stream");
Consumer<Object> printConsumer = System.out::println;

Runnable runnable = stream.map(x -> toRunnable(x, printConsumer))
        .reduce((a, b) -> () -> {
            a.run();
            b.run();
        })
        .orElse(defaultRunnable);
runnable.run(); // prints a, b, c (or "empty Stream" when the stream is empty)

// for type inference
static <T> Runnable toRunnable(T t, Consumer<T> cons) {
    return () -> cons.accept(t);
}
This approach does not use peek(), which according to the Javadoc "mainly exists to support debugging".

Method transform() in Type Lists Not Applicable

I am trying to use a Guava Function to remove duplicates from a List. The reason is that a "duplicate" is based on a comparison between two items in the list, and deciding whether two objects are "duplicates" requires a fair amount of logic.
Here is my attempt at the function:
private Function<List<BaseRecord>, List<BaseRecord>> removeDuplicates =
        new Function<List<BaseRecord>, List<BaseRecord>>() {
    public List<BaseRecord> apply(List<BaseRecord> records) {
        List<BaseRecord> out = Lists.newArrayList();
        PeekingIterator<BaseRecord> i = Iterators.peekingIterator(records.iterator());
        while (i.hasNext()) {
            BaseRecord current = i.next();
            boolean isDuplicate = false;
            if (i.hasNext()) {
                BaseRecord next = i.peek();
                // use a ComparisonChain to compare certain fields, removed
                isDuplicate = compareCertainObjects(current, next);
            }
            if (!isDuplicate) {
                out.add(current);
            }
        }
        return out;
    }
};
I then try to call it with Lists.transform(originalRecords, removeDuplicates)
Unfortunately, Eclipse isn't happy:
The method transform(List<F>, Function<? super F,? extends T>) in the type Lists is not applicable for the arguments (List<BaseRecord>, Function<List<BaseRecord>,List<BaseRecord>>).
BaseRecord is an abstract class with at least two subtypes. The fields being compared are all declared in the parent BaseRecord, not in the child classes.
Did I just make a dumb mistake?
The reason is that you're using Function from the java.util.function package, but Guava wants its own Function here:
import com.google.common.base.Function;
Lists.transform is not intended for transforming the List itself, but its elements. You could try wrapping originalRecords in another List, but that doesn't seem like a "pretty" solution.
However, maybe you can refactor your Function into a Predicate and use Collections2.filter, as sketched below.
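For illustration, a hedged sketch of that idea: the predicate keeps every record accepted so far and rejects anything that compareCertainObjects flags as a duplicate of one of them, which is slightly stronger than the adjacent-pair comparison in your Function. Guava warns against stateful predicates in live filtered views, so the result is copied immediately:

import com.google.common.base.Predicate;
import com.google.common.collect.Collections2;
import com.google.common.collect.ImmutableList;
import java.util.ArrayList;
import java.util.List;

Predicate<BaseRecord> notDuplicate = new Predicate<BaseRecord>() {
    private final List<BaseRecord> accepted = new ArrayList<>();

    @Override
    public boolean apply(BaseRecord record) {
        for (BaseRecord earlier : accepted) {
            if (compareCertainObjects(earlier, record)) {
                return false; // duplicate of an already-accepted record
            }
        }
        accepted.add(record);
        return true;
    }
};

// Copy once, because the predicate is stateful and the filtered view is lazy.
List<BaseRecord> deduped = ImmutableList.copyOf(Collections2.filter(originalRecords, notDuplicate));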
It doesn't look like Guava is really necessary here at all, except maybe for newArrayList():
List<BaseRecord> dedupe(List<BaseRecord> records) {
    List<BaseRecord> out = Lists.newArrayList();
    for (BaseRecord current : records) {
        boolean isDuplicate = false;
        for (BaseRecord other : out) {
            if (compareCertainObjects(current, other)) {
                isDuplicate = true;
                break;
            }
        }
        if (!isDuplicate) {
            out.add(current);
        }
    }
    return out;
}
...and then just call dedupe(records), without going through a Function.

Remove from a collection during iteration

I have a set of connection objects (library code I cannot change) that have a send method. If sending fails, they call back a generic onClosed listener, which I implement, and which calls removeConnection() in my code to remove the connection from the collection.
The onClosed callback is generic and can be called at any time; it is called when the peer closes the connection, for example, and not just when a write fails.
However, if I have some code that loops over my connections and sends, then the onClosed callback will attempt to modify the collection during iteration.
My current code creates a copy of the connections list before each iteration over it; however, profiling has shown this to be very expensive.
Set<Connection> connections = new ....;

public void addConnection(Connection conn) {
    connections.add(conn);
    conn.addClosedListener(this);
}

@Override
void onClosed(Connection conn) {
    connections.remove(conn);
}

void send(Message msg) {
    // how to make this so that the onClosed callback can be safely invoked, and efficient?
    for (Connection conn : connections)
        conn.send(msg);
}
How can I efficiently cope with modifying collections during iteration?
To iterate a collection and remove elements during the iteration without any exceptions, use a ListIterator.
http://www.mkyong.com/java/how-do-loop-iterate-a-list-in-java/ - example
If you use a simple for or for-each loop and remove elements while looping, you will get a ConcurrentModificationException - be careful about that.
In addition, you could wrap the iterator with your own and add the needed logic; just implement the java.util.Iterator interface. A sketch of the basic pattern follows.
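For example, a minimal sketch of that pattern, removing through the iterator itself rather than through the collection (connectionFailed is a placeholder for your own check; note that this only helps when the removal happens in the loop itself, not inside the onClosed callback):

Iterator<Connection> it = connections.iterator();
while (it.hasNext()) {
    Connection conn = it.next();
    conn.send(msg);
    if (connectionFailed(conn)) { // hypothetical check instead of the onClosed callback
        it.remove();              // safe structural removal during iteration
    }
}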
A ConcurrentSkipListSet is probably what you want.
You could also use a CopyOnWriteArraySet. This of course will still make a copy, however, it will only do so when the set is modified. So as long as Connection objects are not added or removed regularly, this would be more efficient.
You can also use ConcurrentHashMap.
ConcurrentHashMap is thread-safe, so you don't need to make a copy in order to be able to iterate.
Take a look at this implementation: http://www.java2s.com/Tutorial/Java/0140__Collections/Concurrentset.htm
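For example, a minimal sketch of a concurrent Set view backed by a ConcurrentHashMap, using the standard Collections.newSetFromMap factory; its iterator is weakly consistent, so removals from the onClosed callback will not throw ConcurrentModificationException:

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

Set<Connection> connections = Collections.newSetFromMap(new ConcurrentHashMap<Connection, Boolean>());

void send(Message msg) {
    for (Connection conn : connections) { // weakly consistent iteration, no copy needed
        conn.send(msg);                   // onClosed may remove connections concurrently; that is safe here
    }
}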
I would write a collection wrapper that:
Keeps a set of objects that are to be removed. If the iteration across the underlying collection comes across one of these it is skipped.
On completion of iteration, takes a second pass across the list to remove all of the gathered objects.
Perhaps something like this:
class ModifiableIterator<T> implements Iterator<T> {
    // My iterable.
    final Iterable<T> it;
    // The Iterator we are walking.
    final Iterator<T> i;
    // The removed objects.
    Set<T> removed = new HashSet<>();
    // The next actual one to return.
    T next = null;

    public ModifiableIterator(Iterable<T> it) {
        this.it = it;
        i = it.iterator();
    }

    @Override
    public boolean hasNext() {
        while (next == null && i.hasNext()) {
            // Pull a new one.
            next = i.next();
            if (removed.contains(next)) {
                // Not that one.
                next = null;
            }
        }
        if (next == null) {
            // Finished! Close.
            close();
        }
        return next != null;
    }

    @Override
    public T next() {
        T n = next;
        next = null;
        return n;
    }

    // Close down - remove all removed.
    public void close() {
        if (!removed.isEmpty()) {
            Iterator<T> i = it.iterator();
            while (i.hasNext()) {
                if (removed.contains(i.next())) {
                    i.remove();
                }
            }
            // Clear down.
            removed.clear();
        }
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("Not supported.");
    }

    public void remove(T t) {
        removed.add(t);
    }
}
public void test() {
    List<String> test = new ArrayList<>(Arrays.asList("A", "B", "C", "D", "E"));
    ModifiableIterator<String> i = new ModifiableIterator<>(test);
    i.remove("A");
    i.remove("E");
    System.out.println(test);
    while (i.hasNext()) {
        System.out.println(i.next());
    }
    System.out.println(test);
}
You may need to consider whether your list could contain null values, in which case you will need to tweak it somewhat.
Please remember to close the iterator if you abandon the iteration before it completes.
