Validate beginning of stream in Reactor Flux - java

Using Reactor, I'm trying to validate the beginning of a cold Flux stream and then become a pass-through.
For example, say I need to validate the first N elements. If (and only if) it passes, these and further elements are forwarded. If it fails, only an error is emitted.
This is what I have so far. It works, but is there a better or more correct way to do this? I was tempted to implement my own operator, but I'm told it's complicated and not recommended.
flux
    .bufferUntil(new Predicate<T>() {
        private int count = 0;

        @Override
        public boolean test(T next) {
            return ++count >= N;
        }
    })
    // Zip with index to know the first element
    .zipWith(Flux.<Integer, Integer>generate(() -> 0, (cur, s) -> {
        s.next(cur);
        return cur + 1;
    }))
    .map(t -> {
        if (t.getT2() == 0 && !validate(t.getT1()))
            throw new RuntimeException("Invalid");
        return t.getT1();
    })
    // Flatten buffered elements
    .flatMapIterable(identity())
I could have used doOnNext instead of the second map since it doesn't map anything, but I'm not sure it's an acceptable use of the peek methods.
I could also have used a stateful mapper in the second map to run only once instead of zipping with index, I guess that's acceptable since I'm already using a stateful predicate...

Your requirement sounds interesting! We have switchOnFirst, which could be useful for validating the first element. But if you have N elements to validate, we can try something like this.
Here I assume that the first 5 elements must all be <= 5 for the stream to be valid. Otherwise we simply emit an error saying validation failed.
Flux<Integer> integerFlux = Flux.range(1, 10).delayElements(Duration.ofSeconds(1));
integerFlux
    .buffer(5)
    .switchOnFirst((signal, flux) -> {
        // first 5 elements are <= 5, then it is a valid stream
        return signal.get().stream().allMatch(i -> i <= 5) ? flux : Flux.error(new RuntimeException("validation failed"));
    })
    .flatMapIterable(Function.identity())
    .subscribe(System.out::println,
               System.out::println);
However this approach is not ideal, as it keeps collecting 5 elements at a time even after the first validation has passed, which we might not want.
To avoid buffering N elements after the validation, we can use bufferUntil. Once the first N elements have been collected and validated, it simply passes each subsequent element downstream as it arrives.
AtomicInteger atomicInteger = new AtomicInteger(1);
integerFlux
    .bufferUntil(i -> {
        if (atomicInteger.get() < 5) {
            atomicInteger.incrementAndGet();
            return false;
        }
        return true;
    })
    .switchOnFirst((signal, flux) -> {
        return signal.get().stream().allMatch(i -> i <= 5) ? flux : Flux.error(new RuntimeException("validation failed"));
    })
    .flatMapIterable(Function.identity())
    .subscribe(System.out::println,
               System.out::println);

Related

Converting enhanced loop into Java 8 Streams, filter

I am trying to learn Java 8. Is there a way to turn the method below into Java 8 streams, using filter and forEach? If so, how?
String[] couponList = coupons.split(",");
for (String coupon : couponList) {
    singleCouponUsageCount = getSingleCouponUsageCount(coupon);
    if (singleCouponUsageCount >= totalUsageCount)
        return 0;
}
return 1;
//
for (String coupon : couponList) {
    singleCouponUsageCount = getSingleCouponUsageCount(coupon);
    if (singleCouponUsageCount >= totalUsageCount)
        return singleCouponUsageCount;
}
return singleCouponUsageCount;
You can Stream over the elements of the array, map them to their usage count, and use anyMatch to determine if any of the usage counts meets the criteria that should result in returning 0:
return Arrays.stream(coupons.split(","))
    .map(coupon -> getSingleCouponUsageCount(coupon))
    .anyMatch(count -> count >= totalUsageCount) ? 0 : 1;
EDIT:
For your second snippet, if you want to return the first count that matches the condition, you can write:
return Arrays.stream(coupons.split(","))
    .map(coupon -> getSingleCouponUsageCount(coupon))
    .filter(count -> count >= totalUsageCount)
    .findFirst()
    .orElse(someDefaultValue);
Usually, you want a search operation to be short-circuiting, in other words, to return immediately when a match has been found. But unlike operations like collect, the short-circuiting operations of the Stream API can’t be customized easily.
For your specific operation, you can split the operation into two, which can still be formulated as a single expression:
String[] couponList = coupons.split(",");
return Arrays.stream(couponList, 0, couponList.length - 1)
    .map(coupon -> getSingleCouponUsageCount(coupon))
    .filter(singleCouponUsageCount -> singleCouponUsageCount >= totalUsageCount)
    .findFirst()
    .orElseGet(() -> getSingleCouponUsageCount(couponList[couponList.length - 1]));
This does a short-circuiting search over all but the last element, returning immediately when a match has been found. Only if no match has been found there, the last element will be processed and its result returned unconditionally.
You can do it like this:
return Stream.of(coupons.split(","))
    .anyMatch(coupon -> getSingleCouponUsageCount(coupon) >= totalUsageCount) ? 0 : 1;
Yes, you can do it with streams:
List<Integer> results = Arrays.stream(couponList)
    .map(coupon -> getSingleCouponUsageCount(coupon))
    .filter(count -> count >= totalUsageCount)
    .collect(Collectors.toList());
You can also use Pattern.splitAsStream() for this, which directly returns a Stream:
// as a constant
private static final Pattern COMMA = Pattern.compile(",");
// somewhere else in a method
boolean found = COMMA.splitAsStream(coupons)
    // effectively the same as coupon -> getSingleCouponUsageCount(coupon)
    .map(this::getSingleCouponUsageCount)
    .anyMatch(count -> count >= totalUsageCount);
return found ? 0 : 1;
Given the code that you've shared, an important utility for you would be a lookup map of coupon usage counts:
Map<String, Long> couponUsageCount(String[] couponList) {
    return Arrays.stream(couponList)
        .collect(Collectors.toMap(Function.identity(),
                coupon -> getSingleCouponUsageCount(coupon)));
}
Further, this makes it easy to incorporate the map into the other two implementations:
// note: boolean instead of 0 and 1
boolean countUsageExceeds(String[] couponList, Long totalUsageCount) {
    return couponUsageCount(couponList).values()
        .stream()
        .anyMatch(usage -> usage >= totalUsageCount);
}
// better to use Optional since you might not find any such value
// same as above method returning false
Optional<Long> exceededValue(String[] couponList, Long totalUsageCount) {
    Map<String, Long> couponUsageCount = couponUsageCount(couponList);
    // use this with orElse if you want to return an absolute value from this method
    long lastValue = couponUsageCount.get(couponList[couponList.length - 1]);
    return couponUsageCount.values()
        .stream()
        .filter(usage -> usage >= totalUsageCount)
        .findFirst();
}

How to validate that a Java 8 Stream has two specific elements in it?

Let's say I have List<Car> and I want to search through that list to verify that I have both a Civic AND a Focus. If it's an OR it's very easy in that I can just apply an OR on the .filter(). Keep in mind that I can't do filter().filter() for this type of AND.
A working solution would be to do:
boolean hasCivic = reportElements.stream()
    .filter(car -> "Civic".equals(car.getModel()))
    .findFirst()
    .isPresent();
boolean hasFocus = reportElements.stream()
    .filter(car -> "Focus".equals(car.getModel()))
    .findFirst()
    .isPresent();
return hasCivic && hasFocus;
But then I'm basically processing the list twice. I can't apply an && in the filter nor can I do filter().filter().
Is there a way to process the stream once to find if the list contains both a Civic and a Focus car?
IMPORTANT UPDATE: The key problem with the solutions provided is that they all guarantee O(n) whereas my solution could be done after just two comparisons. If my list of cars is say 10 million cars then there would be a very significant performance cost. My solution however doesn't feel right, but maybe it is the best solution performance wise...
You could filter the stream on "Civic" or "Focus", and then run a collector on getModel() returning a Set<String>. Then you could test if your set contains both keys.
Set<String> models = reportElements.stream()
    .map(Car::getModel)
    .filter(model -> model.equals("Focus") || model.equals("Civic"))
    .collect(Collectors.toSet());
return models.contains("Focus") && models.contains("Civic");
However, this would process the entire stream; it wouldn't "fast succeed" when both have been found.
The following is a "fast succeed" short-circuiting method. (Updated to include comments and clarifications from comments, below)
return reportElements.stream()
    .map(Car::getModel)
    .filter(model -> model.equals("Focus") || model.equals("Civic"))
    .distinct()
    .limit(2)
    .count() == 2;
Breaking the stream operations down one at a time, we have:
.map(Car::getModel)
This operation transforms the stream of cars into a stream of car models.
We do this for efficiency.
Instead of calling car.getModel() multiple times in various places in the remainder of the pipeline (twice in the filter(...) to test against each of the desired models, and again for the distinct() operation), we apply this mapping operation once.
Note that this does not create the "temporary map" mentioned in the comments;
it merely translates the car into the car's model for the next stage of the pipeline.
.filter(model -> model.equals("Focus") || model.equals("Civic"))
This filters the stream of car models, allowing only the "Focus" and "Civic" car models to pass.
.distinct()
This pipeline operation is a stateful intermediate operation.
It remembers each car model that it sees in a temporary Set.
(This is likely the "temporary map" mentioned in the comments.)
Only if the model does not exist in the temporary set,
will it be (a) added to the set, and (b) passed on to the next stage of the pipeline.
At this point in the pipeline, there can only be at most two elements in the stream: "Focus" or "Civic" or neither or both.
We know this because we know the filter(...) will only ever pass those two models, and we know that distinct() will remove any duplicates.
However, this stream pipeline itself does not know that.
It would continue to pass car objects to the map stage to be converted into model strings, pass these models to the filter stage, and send on any matching items to the distinct stage.
It cannot tell that this is futile, because it doesn't understand that nothing else can pass through the algorithm; it simply executes the instructions.
But we do understand.
At most two distinct models can pass through the distinct() stage.
So, we follow this with:
.limit(2)
This is a short-circuiting stateful intermediate operation.
It maintains a count of the number of items which pass through, and
after the indicated amount, it terminates the stream, causing all subsequent items to be discarded without even starting down the pipeline.
At this point in the pipeline, there can only be at most two elements in the stream: "Focus" or "Civic" or neither or both.
But if both, then the stream has been truncated and is at the end.
.count() == 2;
Count up the number of items that made it through the pipeline,
and test against the desired number.
If we found both models, the stream will immediately terminate, count() will return 2, and true will be returned.
If both models are not present, of course, the stream is processed until the bitter end, count() will return a value less than two, and false will result.
Example, using an infinite stream of models.
Every third model is a "Civic", every 7th model is a "Focus", the remainder are all "Model #":
boolean matched = IntStream.iterate(1, i -> i + 1)
    .mapToObj(i -> i % 3 == 0 ? "Civic" : i % 7 == 0 ? "Focus" : "Model " + i)
    .peek(System.out::println)
    .filter(model -> model.equals("Civic") || model.equals("Focus"))
    .peek(model -> System.out.println(" After filter: " + model))
    .distinct()
    .peek(model -> System.out.println(" After distinct: " + model))
    .limit(2)
    .peek(model -> System.out.println(" After limit: " + model))
    .count() == 2;
System.out.println("Matched = " + matched);
Output:
Model 1
Model 2
Civic
After filter: Civic
After distinct: Civic
After limit: Civic
Model 4
Model 5
Civic
After filter: Civic
Focus
After filter: Focus
After distinct: Focus
After limit: Focus
Matched = true
Notice that 3 models got through the filter(), but only 2 made it past distinct() and limit().
More importantly, notice that true was returned long before the end of the infinite stream of models was reached.
Generalizing the solution, since the OP wants something that could work with people, or credit cards, or IP addresses, etc., and the search criteria is probably not a fixed set of two items:
Set<String> models = Set.of("Focus", "Civic");
return reportElements.stream()
    .map(Car::getModel)
    .filter(models::contains)
    .distinct()
    .limit(models.size())
    .count() == models.size();
Here, given an arbitrary models set, existence of any particular set of car models may be obtained, not limited to just 2.
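To make the generalized pipeline concrete, here is a minimal self-contained sketch; the Car record and the containsAllModels helper are stand-ins of mine, not types from the question:

```java
import java.util.List;
import java.util.Set;

class ContainsAllModelsDemo {
    // minimal stand-in for the question's Car type (assumption)
    record Car(String model) {
        String getModel() { return model; }
    }

    // true iff every model in the set occurs at least once in the list;
    // short-circuits once all wanted models have been seen
    static boolean containsAllModels(List<Car> cars, Set<String> models) {
        return cars.stream()
                .map(Car::getModel)
                .filter(models::contains)
                .distinct()
                .limit(models.size())
                .count() == models.size();
    }

    public static void main(String[] args) {
        List<Car> lot = List.of(new Car("Civic"), new Car("Accord"), new Car("Focus"));
        System.out.println(containsAllModels(lot, Set.of("Civic", "Focus")));  // true
        System.out.println(containsAllModels(lot, Set.of("Civic", "Fiesta"))); // false
    }
}
```

The same shape works for people, credit cards, IP addresses, and so on, since nothing in the pipeline is specific to cars.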
You can do:
reportElements.stream()
    .filter(car -> "Civic".equals(car.getModel()) || "Focus".equals(car.getModel()))
    .collect(Collectors.toMap(
        c -> c.getModel(),
        c -> c,
        (c1, c2) -> c1
    )).size() == 2;
or even with Set
reportElements.stream()
    .filter(car -> "Civic".equals(car.getModel()) || "Focus".equals(car.getModel()))
    .map(car -> car.getModel())
    .collect(Collectors.toSet())
    .size() == 2;
and with distinct
reportElements.stream()
    .filter(car -> "Civic".equals(car.getModel()) || "Focus".equals(car.getModel()))
    .map(car -> car.getModel())
    .distinct()
    .count() == 2L;
The reason it "doesn't feel right" is because you are forcing the stream API to do something it doesn't want to do. You would almost surely be better off with a traditional loop:
boolean hasFocus = false, hasCivic = false;
for (Car c : reportElements) {
    if ("Focus".equals(c.getModel())) hasFocus = true;
    if ("Civic".equals(c.getModel())) hasCivic = true;
    if (hasFocus && hasCivic) return true;
}
return false;

Java Streams — How to perform an intermediate function every nth item

I am looking for an operation on a Stream that enables me to perform a non-terminal (and/or terminal) operation every nth item. Although I use a stream of primes for example, the stream could just as easily be web-requests, user actions, or some other cold data or live feed being produced.
From this:
Duration start = Duration.ofNanos(System.nanoTime());
IntStream.iterate(2, n -> n + 1)
    .filter(Findprimes::isPrime)
    .limit(1_000_000 * 10)
    .forEach(System.out::println);
System.out.println("Duration: " + Duration.ofNanos(System.nanoTime()).minus(start));
To a stream function like this:
IntStream.iterate(2, n -> n + 1)
    .filter(Findprimes::isPrime)
    .limit(1_000_000 * 10)
    .peekEvery(10, System.out::println)
    .forEach(it -> {});
Create a helper method to wrap the peek() consumer:
public static IntConsumer every(int count, IntConsumer consumer) {
    if (count <= 0)
        throw new IllegalArgumentException("Count must be > 0: Got " + count);
    return new IntConsumer() {
        private int i;

        @Override
        public void accept(int value) {
            if (++this.i == count) {
                consumer.accept(value);
                this.i = 0;
            }
        }
    };
}
You can now use it almost exactly like you wanted:
IntStream.rangeClosed(1, 20)
    .peek(every(5, System.out::println))
    .count();
Output
5
10
15
20
The helper method can be put in a utility class and statically imported, similar to how the Collectors class is nothing but static helper methods.
As noted by @user140547 in a comment, this code is not thread-safe, so it cannot be used with parallel streams. Besides, the output order would be messed up, so it doesn't really make sense to use it with parallel streams anyway.
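To make the utility-class suggestion concrete, here is a self-contained sketch; the StreamUtils name is an assumption of mine, and the pipeline is driven with forEach rather than count() so the consumer is guaranteed to run:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;

// Hypothetical holder class for the helper, so call sites can
// statically import StreamUtils.every (the class name is mine).
class StreamUtils {
    static IntConsumer every(int count, IntConsumer consumer) {
        if (count <= 0)
            throw new IllegalArgumentException("Count must be > 0: Got " + count);
        return new IntConsumer() {
            private int i;

            @Override
            public void accept(int value) {
                // invoke the wrapped consumer on every count-th element
                if (++this.i == count) {
                    consumer.accept(value);
                    this.i = 0;
                }
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        // forEach (not count()) guarantees the consumer is invoked
        IntStream.rangeClosed(1, 20).forEach(StreamUtils.every(5, seen::add));
        System.out.println(seen); // [5, 10, 15, 20]
    }
}
```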
It is not a good idea to rely on peek() and count() as it is possible that the operation is not invoked at all if count() can be calculated without going over the whole stream. Even if it works now, it does not mean that it is also going to work in future. See the javadoc of Stream.count() in Java 9.
Better use forEach().
For the problem itself: in special cases like a simple iteration, you could just filter your elements:
Stream.iterate(2, n -> n + 1)
    .limit(20)
    .filter(n -> (n - 2) % 5 == 0 && n != 2)
    .forEach(System.out::println);
This of course won't work for other cases, where you might use a stateful IntConsumer. If iterate() is used, it is probably not that useful to use parallel streams anyway.
If you want a generic solution, you could also try to use a "normal" Stream, which may not be as efficient as an IntStream, but should still suffice in many cases:
class Tuple { // ctor, getter/setter omitted
    int index;
    int value;
}
Then you could do:
Stream.iterate(new Tuple(1, 2), t -> new Tuple(t.index + 1, t.value * 2))
    .limit(30)
    .filter(t -> t.index % 5 == 0)
    .forEach(System.out::println);
If you have to use peek(), you can also do
.peek(t -> { if (t.index % 5 == 0) System.out.println(t); })
Or if you add methods
static Tuple initialTuple(int value) {
    return new Tuple(1, value);
}
static UnaryOperator<Tuple> createNextTuple(IntUnaryOperator f) {
    return current -> new Tuple(current.index + 1, f.applyAsInt(current.value));
}
static Consumer<Tuple> every(int n, IntConsumer consumer) {
    return tuple -> { if (tuple.index % n == 0) consumer.accept(tuple.value); };
}
you can also do (with static imports):
Stream.iterate(initialTuple(2), createNextTuple(x -> x * 2))
    .limit(30)
    .peek(every(5, System.out::println))
    .forEach(System.out::println);
Try this.
int[] counter = {0};
long result = IntStream.iterate(2, n -> n + 1)
    .filter(Findprimes::isPrime)
    .limit(100)
    .peek(x -> { if (counter[0]++ % 10 == 0) System.out.print(x + " "); })
    .count();
result:
2 31 73 127 179 233 283 353 419 467

Mutate elements in a Stream

Is there a 'best practice' for mutating elements within a Stream? I'm specifically referring to elements within the stream pipeline, not outside of it.
For example, consider the case where I want to get a list of Users, set a default value for a null property and print it to the console.
Assuming the User class:
class User {
    String name;

    static User next(int i) {
        User u = new User();
        if (i % 3 != 0) {
            u.name = "user " + i;
        }
        return u;
    }
}
In Java 7 it'd be something along the lines of:
for (int i = 0; i < 7; i++) {
    User user = User.next(i);
    if (user.name == null) {
        user.name = "defaultName";
    }
    System.out.println(user.name);
}
In Java 8 it would seem like I'd use .map() and return a reference to the mutated object:
IntStream.range(0, 7)
    .mapToObj(User::next)
    .map(user -> {
        if (user.name == null) {
            user.name = "defaultName";
        }
        return user;
    })
    //other non-terminal operations
    //before a terminal such as .forEach or .collect
    .forEach(it -> System.out.println(it.name));
Is there a better way to achieve this? Perhaps using .filter() to handle the null mutation and then concat the unfiltered stream and the filtered stream? Some clever use of Optional? The goal being the ability to use other non-terminal operations before the terminal .forEach().
In the 'spirit' of streams I'm trying to do this without intermediary collections and simple 'pure' operations that don't depend on side effects outside the pipeline.
Edit: The official Stream java doc states 'A small number of stream operations, such as forEach() and peek(), can operate only via side-effects; these should be used with care.' Given that this would be a non-interfering operation, what specifically makes it dangerous? The examples I've seen reach outside the pipeline, which is clearly sketchy.
Don't mutate the object, map to the name directly:
IntStream.range(0, 7)
    .mapToObj(User::next)
    .map(user -> user.name)
    .map(name -> name == null ? "defaultName" : name)
    .forEach(System.out::println);
It sounds like you're looking for peek:
.peek(user -> {
    if (user.name == null) {
        user.name = "defaultName";
    }
})
...though it's not clear that your operation actually requires modifying the stream elements instead of just passing through the field you want:
.map(user -> (user.name == null) ? "defaultName" : user.name)
It would seem that Streams can't handle this in one pipeline. The 'best practice' would be to create multiple streams:
List<User> users = IntStream.range(0, 7)
    .mapToObj(User::next)
    .collect(Collectors.toList());
users.stream()
    .filter(it -> it.name == null)
    .forEach(it -> it.name = "defaultValue");
users.stream()
    //other non-terminal operations
    //before terminal operation
    .forEach(it -> System.out.println(it.name));

How to dynamically do filtering in Java 8?

I know in Java 8, I can do filtering like this :
List<User> olderUsers = users.stream().filter(u -> u.age > 30).collect(Collectors.toList());
But what if I have a collection and half a dozen filtering criteria, and I want to test the combination of the criteria ?
For example I have a collection of objects and the following criteria :
<1> Size
<2> Weight
<3> Length
<4> Top 50% by a certain order
<5> Top 20% by a another certain ratio
<6> True or false by yet another criteria
And I want to test the combination of the above criteria, something like :
<1> -> <2> -> <3> -> <4> -> <5>
<1> -> <2> -> <3> -> <5> -> <4>
<1> -> <2> -> <5> -> <4> -> <3>
...
<1> -> <5> -> <3> -> <4> -> <2>
<3> -> <2> -> <1> -> <4> -> <5>
...
<5> -> <4> -> <3> -> <2> -> <1>
If each testing order may give me different results, how to write a loop to automatically filter through all the combinations ?
What I can think of is to use another method that generates the testing order like the following :
int[][] getTestOrder(int criteriaCount)
{
    ...
}
So if the criteriaCount is 2, it will return : {{1,2},{2,1}}
If the criteriaCount is 3, it will return : {{1,2,3},{1,3,2},{2,1,3},{2,3,1},{3,1,2},{3,2,1}}
...
But then how to most efficiently implement it with the filtering mechanism in concise expressions that comes with Java 8 ?
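(As a reference point for the getTestOrder helper sketched above: generating all orderings of 1..criteriaCount is a permutation problem, and Heap's algorithm is one standard way to do it. The class name below is mine, and this is only a sketch.)

```java
import java.util.ArrayList;
import java.util.List;

class TestOrders {
    // Returns every permutation of {1, ..., criteriaCount},
    // e.g. criteriaCount = 2 gives {{1,2},{2,1}}.
    static int[][] getTestOrder(int criteriaCount) {
        List<int[]> result = new ArrayList<>();
        int[] order = new int[criteriaCount];
        for (int i = 0; i < criteriaCount; i++) order[i] = i + 1;
        permute(order, criteriaCount, result);
        return result.toArray(new int[0][]);
    }

    // Heap's algorithm: recursively permute the first k positions,
    // swapping a different element into the last slot each round
    private static void permute(int[] a, int k, List<int[]> out) {
        if (k == 1) {
            out.add(a.clone());
            return;
        }
        for (int i = 0; i < k; i++) {
            permute(a, k - 1, out);
            int j = (k % 2 == 0) ? i : 0; // even k: swap i-th; odd k: swap first
            int tmp = a[j]; a[j] = a[k - 1]; a[k - 1] = tmp;
        }
    }
}
```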
Interesting problem. There are several things going on here. No doubt this could be solved in less than half a page of Haskell or Lisp, but this is Java, so here we go....
One issue is that we have a variable number of filters, whereas most of the examples that have been shown illustrate fixed pipelines.
Another issue is that some of the OP's "filters" are context sensitive, such as "top 50% by a certain order". This can't be done with a simple filter(predicate) construct on a stream.
The key is to realize that, while lambdas allow functions to be passed as arguments (to good effect) it also means that they can be stored in data structures and computations can be performed on them. The most common computation is to take multiple functions and compose them.
Assume that the values being operated on are instances of Widget, which is a POJO that has some obvious getters:
class Widget {
    String name() { ... }
    int length() { ... }
    double weight() { ... }
    // constructors, fields, toString(), etc.
}
Let's start off with the first issue and figure out how to operate with a variable number of simple predicates. We can create a list of predicates like this:
List<Predicate<Widget>> allPredicates = Arrays.asList(
    w -> w.length() >= 10,
    w -> w.weight() > 40.0,
    w -> w.name().compareTo("c") > 0);
Given this list, we can permute them (probably not useful, since they're order independent) or select any subset we want. Let's say we just want to apply all of them. How do we apply a variable number of predicates to a stream? There is a Predicate.and() method that will take two predicates and combine them using a logical and, returning a single predicate. So we could take the first predicate and write a loop that combines it with the successive predicates to build up a single predicate that's a composite and of them all:
Predicate<Widget> compositePredicate = allPredicates.get(0);
for (int i = 1; i < allPredicates.size(); i++) {
    compositePredicate = compositePredicate.and(allPredicates.get(i));
}
This works, but it fails if the list is empty, and since we're doing functional programming now, mutating a variable in a loop is déclassé. But lo! This is a reduction! We can reduce all the predicates over the and operator to get a single composite predicate, like this:
Predicate<Widget> compositePredicate =
    allPredicates.stream()
        .reduce(w -> true, Predicate::and);
(Credit: I learned this technique from @venkat_s. If you ever get a chance, go see him speak at a conference. He's good.)
Note the use of w -> true as the identity value of the reduction. (This could also be used as the initial value of compositePredicate for the loop, which would fix the zero-length list case.)
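That zero-length fix can be seen in a tiny self-contained sketch; here Integer predicates stand in for the Widget predicates, since Widget is only sketched above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class CompositePredicateDemo {
    public static void main(String[] args) {
        // Integer predicates stand in for the Widget predicates
        List<Predicate<Integer>> allPredicates = Arrays.asList(
            n -> n >= 10,
            n -> n % 2 == 0);

        // Seeding the loop with the identity (n -> true) handles an
        // empty predicate list, exactly like the reduce version does
        Predicate<Integer> composite = n -> true;
        for (Predicate<Integer> p : allPredicates) {
            composite = composite.and(p);
        }

        System.out.println(composite.test(12)); // true:  >= 10 and even
        System.out.println(composite.test(7));  // false: < 10
    }
}
```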
Now that we have our composite predicate, we can write out a short pipeline that simply applies the composite predicate to the widgets:
widgetList.stream()
    .filter(compositePredicate)
    .forEach(System.out::println);
Context Sensitive Filters
Now let's consider what I referred to as a "context sensitive" filter, which is represented by the example like "top 50% in a certain order", say the top 50% of widgets by weight. "Context sensitive" isn't the best term for this but it's what I've got at the moment, and it is somewhat descriptive in that it's relative to the number of elements in the stream up to this point.
How would we implement something like this using streams? Unless somebody comes up with something really clever, I think we have to collect the elements somewhere first (say, in a list) before we can emit the first element to the output. It's kind of like sorted() in a pipeline which can't tell which is the first element to output until it has read every single input element and has sorted them.
The straightforward approach to finding the top 50% of widgets by weight, using streams, would look something like this:
List<Widget> temp =
    list.stream()
        .sorted(comparing(Widget::weight).reversed())
        .collect(toList());
temp.stream()
    .limit((long) (temp.size() * 0.5))
    .forEach(System.out::println);
This isn't complicated, but it's a bit cumbersome as we have to collect the elements into a list and assign it to a variable, in order to use the list's size in the 50% computation.
This is limiting, though, in that it's a "static" representation of this kind of filtering. How would we chain this into a stream with a variable number of elements (other filters or criteria) like we did with the predicates?
An important observation is that this code does its actual work in between the consumption of a stream and the emitting of a stream. It happens to have a collector in the middle, but if you chain a stream to its front and chain stuff off its back end, nobody is the wiser. In fact, the standard stream pipeline operations like map and filter each take a stream as input and emit a stream as output. So we can write a function kind of like this ourselves:
Stream<Widget> top50PercentByWeight(Stream<Widget> stream) {
    List<Widget> temp =
        stream.sorted(comparing(Widget::weight).reversed())
            .collect(toList());
    return temp.stream()
        .limit((long) (temp.size() * 0.5));
}
A similar example might be to find the shortest three widgets:
Stream<Widget> shortestThree(Stream<Widget> stream) {
    return stream.sorted(comparing(Widget::length))
        .limit(3);
}
Now we can write something that combines these stateful filters with ordinary stream operations:
shortestThree(
    top50PercentByWeight(
        widgetList.stream()
            .filter(w -> w.length() >= 10)))
    .forEach(System.out::println);
This works, but is kind of lousy because it reads "inside-out" and backwards. The stream source is widgetList which is streamed and filtered through an ordinary predicate. Now, going backwards, the top 50% filter is applied, then the shortest-three filter is applied, and finally the stream operation forEach is applied at the end. This works but is quite confusing to read. And it's still static. What we really want is to have a way to put these new filters inside a data structure that we can manipulate, for example, to run all the permutations, as in the original question.
A key insight at this point is that these new kinds of filters are really just functions, and we have functional interface types in Java which let us represent functions as objects, to manipulate them, store them in data structures, compose them, etc. The functional interface type that takes an argument of some type and returns a value of the same type is UnaryOperator. The argument and return type in this case is Stream<Widget>. If we were to take method references such as this::shortestThree or this::top50PercentByWeight, the types of the resulting objects would be
UnaryOperator<Stream<Widget>>
If we were to put these into a list, the type of that list would be
List<UnaryOperator<Stream<Widget>>>
Ugh! Three levels of nested generics is too much for me. (But Aleksey Shipilev did once show me some code that used four levels of nested generics.) The solution for too much generics is to define our own type. Let's call one of our new things a Criterion. It turns out that there's little value to be gained by making our new functional interface type be related to UnaryOperator, so our definition can simply be:
@FunctionalInterface
public interface Criterion {
    Stream<Widget> apply(Stream<Widget> s);
}
Now we can create a list of criteria like this:
List<Criterion> criteria = Arrays.asList(
    this::shortestThree,
    this::lengthGreaterThan20
);
(We'll figure out how to use this list below.) This is a step forward, since we can now manipulate the list dynamically, but it's still somewhat limiting. First, it can't be combined with ordinary predicates. Second, there's a lot of hard-coded values here, such as the shortest three: how about two or four? How about a different criterion than length? What we really want is a function that creates these Criterion objects for us. This is easy with lambdas.
This creates a criterion that selects the top N widgets, given a comparator:
Criterion topN(Comparator<Widget> cmp, long n) {
    return stream -> stream.sorted(cmp).limit(n);
}
This creates a criterion that selects the top p percent of widgets, given a comparator:
Criterion topPercent(Comparator<Widget> cmp, double pct) {
    return stream -> {
        List<Widget> temp =
            stream.sorted(cmp).collect(toList());
        return temp.stream()
            .limit((long) (temp.size() * pct));
    };
}
And this creates a criterion from an ordinary predicate:
Criterion fromPredicate(Predicate<Widget> pred) {
    return stream -> stream.filter(pred);
}
Now we have a very flexible way of creating criteria and putting them into a list, where they can be subsetted or permuted or whatever:
List<Criterion> criteria = Arrays.asList(
    fromPredicate(w -> w.length() > 10),                   // longer than 10
    topN(comparing(Widget::length), 4L),                   // longest 4
    topPercent(comparing(Widget::weight).reversed(), 0.50) // heaviest 50%
);
Once we have a list of Criterion objects, we need to figure out a way to apply all of them. Once again, we can use our friend reduce to combine all of them into a single Criterion object:
Criterion allCriteria =
    criteria.stream()
        .reduce(c -> c, (c1, c2) -> (s -> c2.apply(c1.apply(s))));
The identity function c -> c is clear, but the second arg is a bit tricky. Given a stream s we first apply Criterion c1, then Criterion c2, and this is wrapped in a lambda that takes two Criterion objects c1 and c2 and returns a lambda that applies the composition of c1 and c2 to a stream and returns the resulting stream.
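To see that combiner in isolation, here is a self-contained sketch using plain UnaryOperator<Stream<Integer>> criteria instead of the Widget-based Criterion (the names are mine):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ComposeCriteriaDemo {
    public static void main(String[] args) {
        // each "criterion" is a whole-stream transformation
        List<UnaryOperator<Stream<Integer>>> criteria = Arrays.asList(
            s -> s.filter(n -> n % 2 == 0), // keep evens
            s -> s.limit(3));               // then keep the first 3

        // same reduction shape: identity, then apply c1 before c2
        UnaryOperator<Stream<Integer>> all =
            criteria.stream()
                .reduce(c -> c, (c1, c2) -> (s -> c2.apply(c1.apply(s))));

        List<Integer> result =
            all.apply(Stream.iterate(1, n -> n + 1).limit(100))
                .collect(Collectors.toList());
        System.out.println(result); // [2, 4, 6]
    }
}
```

Note that the order matters: reversing the two criteria would limit to the first three numbers and only then filter for evens, yielding [2] instead.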
Now that we've composed all the criteria, we can apply it to a stream of widgets like so:
allCriteria.apply(widgetList.stream())
    .forEach(System.out::println);
This is still a bit inside-out, but it's fairly well controlled. Most importantly, it addresses the original question, which is how to combine criteria dynamically. Once the Criterion objects are in a data structure, they can be selected, subsetted, permuted, or whatever as necessary, and they can all be combined in a single criterion and applied to a stream using the above techniques.
The functional programming gurus are probably saying "He just reinvented ... !" which is probably true. I'm sure this has probably been invented somewhere already, but it's new to Java, because prior to lambda, it just wasn't feasible to write Java code that uses these techniques.
Update 2014-04-07
I've cleaned up and posted the complete sample code in a gist.
We could add a counter with a map so we know how many elements we have after the filters. I created a helper class that has a method that counts and returns the same object passed:
class DoNothingButCount<T> {
    AtomicInteger i;

    public DoNothingButCount() {
        i = new AtomicInteger(0);
    }

    public T pass(T p) {
        i.incrementAndGet();
        return p;
    }
}
public void runDemo() {
    List<Person> persons = create(100);
    DoNothingButCount<Person> counter = new DoNothingButCount<>();
    persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
        map((p) -> counter.pass(p)).
        sorted((p1, p2) -> p1.age - p2.age).
        collect(Collectors.toList()).stream().
        limit((int) (counter.i.intValue() * 0.5)).
        sorted((p1, p2) -> p2.length - p1.length).
        limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
}
I had to convert the stream to a list and back to a stream in the middle because the limit would use the initial count otherwise. It's all a bit "hackish", but it's all I could think of.
I could do it a bit differently using a Function for my mapped class:
class DoNothingButCount<T> implements Function<T, T> {
    AtomicInteger i;

    public DoNothingButCount() {
        i = new AtomicInteger(0);
    }

    public T apply(T p) {
        i.incrementAndGet();
        return p;
    }
}
The only thing that will change in the stream is:
map((p) -> counter.pass(p)).
will become:
map(counter).
My complete test class including the two examples:
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Demo2 {
    Random r = new Random();

    class Person {
        public int size, weitght, length, age;

        public Person(int s, int w, int l, int a) {
            this.size = s;
            this.weitght = w;
            this.length = l;
            this.age = a;
        }

        public String toString() {
            return "P: " + this.size + ", " + this.weitght + ", " + this.length + ", " + this.age + ".";
        }
    }

    public List<Person> create(int size) {
        List<Person> persons = new ArrayList<>();
        while (persons.size() < size) {
            persons.add(new Person(r.nextInt(10) + 10, r.nextInt(10) + 10, r.nextInt(10) + 10, r.nextInt(20) + 14));
        }
        return persons;
    }

    class DoNothingButCount<T> {
        AtomicInteger i;

        public DoNothingButCount() {
            i = new AtomicInteger(0);
        }

        public T pass(T p) {
            i.incrementAndGet();
            return p;
        }
    }

    class PDoNothingButCount<T> implements Function<T, T> {
        AtomicInteger i;

        public PDoNothingButCount() {
            i = new AtomicInteger(0);
        }

        public T apply(T p) {
            i.incrementAndGet();
            return p;
        }
    }

    public void runDemo() {
        List<Person> persons = create(100);
        PDoNothingButCount<Person> counter = new PDoNothingButCount<>();
        persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
            map(counter).
            sorted((p1, p2) -> p1.age - p2.age).
            collect(Collectors.toList()).stream().
            limit((int) (counter.i.intValue() * 0.5)).
            sorted((p1, p2) -> p2.length - p1.length).
            limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
    }

    public void runDemo2() {
        List<Person> persons = create(100);
        DoNothingButCount<Person> counter = new DoNothingButCount<>();
        persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
            map((p) -> counter.pass(p)).
            sorted((p1, p2) -> p1.age - p2.age).
            collect(Collectors.toList()).stream().
            limit((int) (counter.i.intValue() * 0.5)).
            sorted((p1, p2) -> p2.length - p1.length).
            limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
    }

    public static void main(String str[]) {
        Demo2 demo = new Demo2();
        System.out.println("Demo 2:");
        demo.runDemo2();
        System.out.println("Demo 1:");
        demo.runDemo();
    }
}
