I have an interface that takes a string and returns a transformed string.
I have some classes that implement it and transform the string in different ways.
Is there any way in Java to create a stream of those classes and apply them all to a string?
For example:
class MyClass implements MyOperation {
String execute(String s) { return doSomething(s); }
}
class MyClass2 implements MyOperation {
String execute(String s) { return doSomething(s); }
}
ArrayList<MyClass> operations = new ArrayList<>();
operations.add(new MyClass());
operations.add(new MyClass2());
...
operations.stream()...
Can I make a stream of that in order to make lots of transformations for a single string? I thought about .reduce() but it is strict about the data types.
Your classes all implement methods that transform a String to a String. In other words, they can be represented by a Function<String,String>. They can be combined as follows and applied on a single String:
List<Function<String,String>> ops = new ArrayList<> ();
ops.add (s -> s + "0"); // these lambda expressions can be replaced with your methods:
// for example - ops.add((new MyClass())::execute);
ops.add (s -> "1" + s);
ops.add (s -> s + " 2");
// here we combine them
Function<String,String> combined =
ops.stream ()
.reduce (Function.identity(), Function::andThen);
// and here we apply them all on a String
System.out.println (combined.apply ("dididi"));
Output:
1dididi0 2
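As a minimal sketch of the same idea applied to the classes from the question, the operations can be mapped to Function<String,String> via a method reference and then combined with andThen. The names ComposeOperations, AppendZero, PrependOne and applyAll are assumptions made up for this illustration; only MyOperation and execute come from the question.

```java
import java.util.List;
import java.util.function.Function;

class ComposeOperations {
    // Interface from the question: String in, String out.
    interface MyOperation {
        String execute(String s);
    }

    // Hypothetical implementations, used only for illustration.
    static class AppendZero implements MyOperation {
        public String execute(String s) { return s + "0"; }
    }

    static class PrependOne implements MyOperation {
        public String execute(String s) { return "1" + s; }
    }

    // Turns every operation into a Function and chains them in list order.
    static String applyAll(List<MyOperation> operations, String input) {
        Function<String, String> combined = operations.stream()
                .<Function<String, String>>map(op -> op::execute)
                .reduce(Function.identity(), Function::andThen);
        return combined.apply(input);
    }

    public static void main(String[] args) {
        List<MyOperation> operations = List.of(new AppendZero(), new PrependOne());
        System.out.println(applyAll(operations, "dididi")); // prints 1dididi0
    }
}
```

With an empty list the reduction returns the identity function, so the input comes back unchanged.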
The ArrayList<MyClass> should be ArrayList<MyOperation> else the call to operations.add(new MyClass2()); would yield a compilation error.
That said you're looking for this overload of reduce:
String result = operations.stream().reduce("myString",
(x, y) -> y.execute(x),
(a, b) -> {
throw new RuntimeException("unimplemented");
});
"myString" is the identity value.
(x, y) -> y.execute(x) is the accumulator function to be applied.
(a, b) -> {... is the combiner function used only when the stream is parallel. So you need not worry about it for a sequential stream.
You may also want to read up on an answer I posted a while back, "Deciphering Stream reduce function".
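The three-argument overload can be sketched as a small self-contained program. This assumes the operations are modelled as UnaryOperator<String> (a stand-in for MyOperation) and that the stream stays sequential; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.function.UnaryOperator;

class ReduceWithAccumulator {
    // Threads the input string through every operation, in list order.
    static String applyAll(List<UnaryOperator<String>> operations, String input) {
        return operations.stream() // sequential stream
                .reduce(input,
                        (acc, op) -> op.apply(acc), // accumulator: apply the next operation
                        (a, b) -> {                 // combiner: only needed for parallel streams
                            throw new UnsupportedOperationException("parallel execution not supported");
                        });
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> operations = List.of(s -> s + "0", s -> "1" + s);
        System.out.println(applyAll(operations, "x")); // prints 1x0
    }
}
```

Strictly speaking, the input string is not an identity for this accumulator, which is one more reason not to run this reduction on a parallel stream.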
A simplified example of what I am trying to do:
Suppose I have a list of strings that need to be grouped into 4 groups according to whether or not they contain a specific substring. If a string contains Foo it should fall in the group FOO, if it contains Bar it should fall in the group BAR, and if it contains both it should appear in both groups.
List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");
A naive approach for the above input doesn't work as expected since the string is grouped into the first matching group:
Map<String,List<String>> result1 =
strings.stream()
.collect(Collectors.groupingBy(
str -> str.contains("Foo") ? "FOO" :
str.contains("Bar") ? "BAR" :
str.contains("Baz") ? "BAZ" : "DEFAULT"));
result1 is
{FOO=[Foo, FooBar, FooBarBaz], DEFAULT=[XXX]}
whereas the desired result is
{FOO=[Foo, FooBar, FooBarBaz], BAR=[FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}
After searching for a while I found another approach, which comes near to my desired result, but not quite fully
Map<String,List<String>> result2 =
List.of("Foo", "Bar", "Baz", "Default").stream()
.flatMap(str -> strings.stream().filter(s -> s.contains(str)).map(s -> new String[]{str.toUpperCase(), s}))
.collect(Collectors.groupingBy(arr -> arr[0], Collectors.mapping(arr -> arr[1], Collectors.toList())));
System.out.println(result2);
result2 is
{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz]}
While this correctly groups strings containing the substrings into the needed groups, the strings which don't contain any of the substrings, and therefore should fall into the default group, are ignored. The desired result is, as already mentioned above (order doesn't matter):
{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}
For now I'm using both result maps and doing an extra:
result2.put("DEFAULT", result1.get("DEFAULT"));
Can the above be done in one step? Is there a better approach than what I have above?
This is an ideal use case for mapMulti (available since Java 16). mapMulti takes a BiConsumer of the streamed value and a consumer.
The consumer is used to place something back on the stream. This was added to Java since flatMap can incur undesirable overhead.
This works by building a String array, as you did before, of the token and the containing string, and collecting (also as you did before). If the key was found in the string, accept a String array with it and the containing string. Otherwise, accept a String array with the default key and the string.
List<String> strings =
List.of("Foo", "FooBar", "FooBarBaz", "XXX", "YYY");
Map<String, List<String>> result = strings.stream()
.<String[]>mapMulti((str, consumer) -> {
boolean found = false;
String temp = str.toUpperCase();
for (String token : List.of("FOO", "BAR",
"BAZ")) {
if (temp.contains(token)) {
consumer.accept(
new String[] { token, str });
found = true;
}
}
if (!found) {
consumer.accept(
new String[] { "DEFAULT", str });
}
})
.collect(Collectors.groupingBy(arr -> arr[0],
Collectors.mapping(arr -> arr[1],
Collectors.toList())));
result.entrySet().forEach(System.out::println);
prints
BAR=[FooBar, FooBarBaz]
FOO=[Foo, FooBar, FooBarBaz]
BAZ=[FooBarBaz]
DEFAULT=[XXX, YYY]
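For comparison, here is a sketch of the same grouping written with flatMap, which may be useful on Java 9 to 15 where mapMulti is not available. The class name GroupByTokensFlatMap, the TOKENS constant and the use of Map.entry instead of a String array are choices made for this illustration, not part of the answer above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByTokensFlatMap {
    static final List<String> TOKENS = List.of("FOO", "BAR", "BAZ");

    static Map<String, List<String>> group(List<String> strings) {
        return strings.stream()
                // emit one (key, string) pair per matching token, or one DEFAULT pair
                .<Map.Entry<String, String>>flatMap(str -> {
                    String upper = str.toUpperCase();
                    List<String> keys = TOKENS.stream()
                            .filter(upper::contains)
                            .collect(Collectors.toList());
                    if (keys.isEmpty()) {
                        keys = List.of("DEFAULT");
                    }
                    return keys.stream().map(key -> Map.entry(key, str));
                })
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        group(List.of("Foo", "FooBar", "FooBarBaz", "XXX", "YYY"))
                .entrySet().forEach(System.out::println);
    }
}
```

Each input string creates a small inner stream here, which is the overhead that mapMulti avoids.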
Keep in mind that streams are meant to make your coding world easier. But sometimes, a regular loop using some Java 8 constructs is all that is needed. Outside of an academic exercise, I would probably do the task like so.
Map<String,List<String>> result2 = new HashMap<>();
for (String str : strings) {
boolean added = false;
String temp = str.toUpperCase();
for (String token : List.of("FOO","BAR","BAZ")) {
if(temp.contains(token)) {
result2.computeIfAbsent(token, v->new ArrayList<>()).add(str);
added = true;
}
}
if (!added) {
result2.computeIfAbsent("DEFAULT", v-> new ArrayList<>()).add(str);
}
}
Instead of operating with strings "Foo", "Bar", etc. and their corresponding uppercase versions, it would be more convenient and cleaner to define an enum.
Let's call it Keys:
public enum Keys {
FOO("Foo"), BAR("Bar"), BAZ("Baz"), DEFAULT("");
// Set of the non-default constants (DEFAULT excluded), cached to avoid creating an EnumSet
// or an array of constants via values() at every invocation of getKeys()
private static final Set<Keys> nonDefaultKeys = EnumSet.range(FOO, BAZ);
private String keyName;
Keys(String keyName) {
this.keyName = keyName;
}
public static List<String> getKeys(String str) {
List<String> keys = nonDefaultKeys.stream()
.filter(key -> str.contains(key.keyName))
.map(Enum::name)
.toList();
// if non-default keys not found, i.e. keys.isEmpty() - return the DEFAULT
return keys.isEmpty() ? List.of(DEFAULT.name()) : keys;
}
}
It has a method getKeys(String) which expects a string and returns a list of keys to which the given string should be mapped.
By using the functionality encapsulated in the Keys enum we can create a map of strings split into groups which correspond to the names of Keys-constants by using collect(supplier,accumulator,combiner).
main()
public static void main(String[] args) {
List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");
Map<String, List<String>> stringsByGroup = strings.stream()
.collect(
HashMap::new, // mutable container - which will contain results of mutable reduction
(Map<String, List<String>> map, String next) -> Keys.getKeys(next)
.forEach(key -> map.computeIfAbsent(key, k -> new ArrayList<>()).add(next)), // accumulator function - defines how to store stream elements into the container
(left, right) -> right.forEach((k, v) ->
left.merge(k, v, (oldV, newV) -> { oldV.addAll(newV); return oldV; }) // combiner function - defines how to merge container while executing the stream in parallel
));
stringsByGroup.forEach((k, v) -> System.out.println(k + " -> " + v));
}
Output:
BAR -> [FooBar, FooBarBaz]
FOO -> [Foo, FooBar, FooBarBaz]
BAZ -> [FooBarBaz]
DEFAULT -> [XXX]
First of all, I need a very efficient solution, as I am comparing collections with more than 300k elements.
At the beginning we have two different classes
class A {
    String keyA;
    String keyB;
    String keyC;
}
class B {
    String keyA;
    String keyB;
    String keyC;
    String name;
    String code;
    A toA() {
        return new A(keyA, keyB, keyC);
    }
}
Both of them contain several fields that form a composite key (in this example, a key of three columns: keyA, keyB, keyC).
This composite key makes the calculation very slow for a naive brute-force approach using nested loops.
So I figured that the most efficient way would be to transform the second class to the first one using the toA method,
and then I can safely compare them, for example with Guava's Sets utilities, to benefit from the efficiency of Sets:
Set<A> collectionA = <300k of elements>
Set<B> collectionB = <300k of elements>
Set<A> collectionBConvertedToA = collectionB.stream().map(item -> item.toA()).collect(toSet());
Set<A> result = Sets.difference(collectionBConvertedToA, collectionA); // very fast first full scan comparison
Set<String> changedNames = result.stream()
.map(outer -> collectionB.stream()
// very slow second full scan comparison
.filter(inner -> inner.getKeyA().equals(outer.getKeyA())
&& inner.getKeyB().equals(outer.getKeyB())
&& inner.getKeyC().equals(outer.getKeyC()))
.findFirst()
.map(item -> item.getName()))
.collect(toSet());
log.info("changed names" + changedNames);
Guava's Sets.difference can find the differences between Sets of more than 300k elements in less than 1/10 of a second, but afterwards I still need a full scan anyway to collect the names.
I am just guessing, but is there something like
Set<B> result = Sets.differences(setA, setB, a -> a.customHashCode(), b -> b.customHashCode(), (a, b) -> a.customEquals(b))
with custom hashCode and custom equals methods to keep the efficiency of Sets, or is there some better pattern for such a comparison? I believe this is a common problem.
EDIT
I just figured out that I can flip the conversion to the extended class:
B toB() {
    return new B(keyA, keyB, keyC, null, null);
}
but then I need to override hashCode and equals to use only those 3 fields, and I still believe there is a more elegant way.
This is O(n^2) since you are streaming collectionB for each element in result. The following should work pretty fast:
Set<String> changedNames = collectionB.stream()
.filter(b -> !collectionA.contains(b.toA()))
.map(item -> item.getName()).collect(toSet());
We could stream the first set and for each A object, concatenate the three fields of A by a delimiter and collect it as a Set (Set<String>).
Then we go over the elements of the second set, compose a string based on the key fields of A and check if the above-computed set has it or not.
Set<String> keysOfA = collectionA.stream()
.map(a -> compose(a.getKeyA(), a.getKeyB(), a.getKeyC()))
.collect(Collectors.toSet());
Set<String> changedNames = collectionB.stream()
.filter(b -> !keysOfA.contains(compose(b.getKeyA(), b.getKeyB(), b.getKeyC())))
.map(b -> b.getName())
.collect(Collectors.toSet());
static String compose(String keyA, String keyB, String keyC) {
return keyA + "|" + keyB + "|" + keyC; //any other delimiter would work
}
With this you don't need the toA() method.
Second approach:
If class A implements equals and hashCode, then you can do it like this:
Set<String> changedNames = collectionB.stream()
.filter(b -> !collectionA.contains(b.toA()))
.map(b -> b.getName())
.collect(Collectors.toSet());
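A minimal sketch of this second approach, assuming A compares by its three key fields. The constructors, the getName accessor and the wrapper class KeyDiff are assumptions added for this illustration and do not come from the question.

```java
import java.util.Collection;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class KeyDiff {
    static final class A {
        final String keyA, keyB, keyC;

        A(String keyA, String keyB, String keyC) {
            this.keyA = keyA;
            this.keyB = keyB;
            this.keyC = keyC;
        }

        // Equality and hash code use exactly the three key fields.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof A)) return false;
            A other = (A) o;
            return keyA.equals(other.keyA) && keyB.equals(other.keyB) && keyC.equals(other.keyC);
        }

        @Override
        public int hashCode() {
            return Objects.hash(keyA, keyB, keyC);
        }
    }

    static final class B {
        final String keyA, keyB, keyC, name, code;

        B(String keyA, String keyB, String keyC, String name, String code) {
            this.keyA = keyA;
            this.keyB = keyB;
            this.keyC = keyC;
            this.name = name;
            this.code = code;
        }

        A toA() { return new A(keyA, keyB, keyC); }

        String getName() { return name; }
    }

    // One pass over collectionB; each lookup in a hash-based set is O(1) on average.
    static Set<String> changedNames(Set<A> collectionA, Collection<B> collectionB) {
        return collectionB.stream()
                .filter(b -> !collectionA.contains(b.toA()))
                .map(B::getName)
                .collect(Collectors.toSet());
    }
}
```

The lookup is only fast if collectionA is a hash-based set such as HashSet; a list or a tree-based set would change the cost of contains.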
I have a cartesian product function in JavaScript:
function cartesianProduct(arr) {
return arr.reduce(function(a,b) {
return a.map(function(x) {
return b.map(function(y) {
return x.concat(y);
});
}).reduce(function(a,b) { return a.concat(b); }, []);
}, [[]]);
}
So that if I have a 3D array:
var data = [[['D']], [['E'],['L','M','N']]];
The result of cartesianProduct(data) would be the 2D array:
[['D','E'], ['D','L','M','N']]
What I'm trying to do is write this cartesian product function in Java using Streams.
So far I have the following in Java:
public Collection<Collection<String>> cartesianProduct(Collection<Collection<Collection<String>>> arr) {
return arr.stream().reduce(new ArrayList<Collection<String>>(), (a, b) -> {
return a.stream().map(x -> {
return b.stream().map(y -> {
return Stream.concat(x.stream(), y.stream());
});
}).reduce(new ArrayList<String>(), (c, d) -> {
return Stream.concat(c, d);
});
});
}
I have a type checking error that states:
ArrayList<String> is not compatible with Stream<Stream<String>>
My guesses as to what is wrong:
I need to use a collector somewhere (maybe after the Stream.concat)
The data type for the identity is wrong
This is possible with a bit of functional programming magic. Here's a method which accepts Collection<Collection<Collection<T>>> and produces Stream<Collection<T>>:
static <T> Stream<Collection<T>> cartesianProduct(Collection<Collection<Collection<T>>> arr)
{
return arr.stream()
.<Supplier<Stream<Collection<T>>>> map(c -> c::stream)
.reduce((s1, s2) -> () -> s1.get().flatMap(
a -> s2.get().map(b -> Stream.concat(a.stream(), b.stream())
.collect(Collectors.toList()))))
.orElseGet(() -> () -> Stream.<Collection<T>>of(Collections.emptyList()))
.get();
}
Usage example:
cartesianProduct(
Arrays.asList(Arrays.asList(Arrays.asList("D")),
Arrays.asList(Arrays.asList("E"), Arrays.asList("L", "M", "N"))))
.forEach(System.out::println);
Output:
[D, E]
[D, L, M, N]
Of course, instead of .forEach() you can collect the results to a List if you want to return Collection<Collection<T>> instead, but returning a Stream seems more flexible to me.
A bit of explanation:
Here we create a stream of stream suppliers via map(c -> c::stream). Each supplier in this stream can produce, on demand, a stream of the corresponding collection's elements. We do this because streams are single-use (otherwise a stream of streams would be enough). After that we reduce this stream of suppliers, creating for each pair a new supplier that flatMaps the two streams and maps their elements to the concatenated lists. The orElseGet part is necessary to handle empty input. The last .get() just calls the final stream supplier to get the resulting stream.
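The single-use restriction that motivates the suppliers can be seen in a small sketch. The class and method names below are hypothetical and exist only for this demonstration.

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuse {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("A", "B");
        stream.forEach(x -> { });      // first terminal operation consumes the stream
        try {
            stream.forEach(x -> { });  // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;               // "stream has already been operated upon or closed"
        }
    }

    // A supplier hands out a fresh stream on every call, so it can be traversed repeatedly.
    static long countTwice(Supplier<Stream<String>> supplier) {
        return supplier.get().count() + supplier.get().count();
    }

    public static void main(String[] args) {
        System.out.println(reuseFails());
        System.out.println(countTwice(() -> Stream.of("A", "B")));
    }
}
```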
I would like to create a method which creates a stream of elements which are cartesian products of multiple given streams (aggregated to the same type at the end by a binary operator). Please note that both arguments and results are streams, not collections.
For example, for two streams of {A, B} and {X, Y} I would like it to produce a stream of values {AX, AY, BX, BY} (simple concatenation is used for aggregating the strings). So far, I have come up with this code:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, Stream<T>... streams) {
Stream<T> result = null;
for (Stream<T> stream : streams) {
if (result == null) {
result = stream;
} else {
result = result.flatMap(m -> stream.map(n -> aggregator.apply(m, n)));
}
}
return result;
}
This is my desired use case:
Stream<String> result = cartesian(
(a, b) -> a + b,
Stream.of("A", "B"),
Stream.of("X", "Y")
);
System.out.println(result.collect(Collectors.toList()));
Expected result: AX, AY, BX, BY.
Another example:
Stream<String> result = cartesian(
(a, b) -> a + b,
Stream.of("A", "B"),
Stream.of("K", "L"),
Stream.of("X", "Y")
);
Expected result: AKX, AKY, ALX, ALY, BKX, BKY, BLX, BLY.
However, if I run the code, I get this error:
IllegalStateException: stream has already been operated upon or closed
Where is the stream consumed? By flatMap? Can it be easily fixed?
Passing the streams in your example is never better than passing Lists:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, List<T>... lists) {
...
}
And use it like this:
Stream<String> result = cartesian(
(a, b) -> a + b,
Arrays.asList("A", "B"),
Arrays.asList("K", "L"),
Arrays.asList("X", "Y")
);
In both cases you create an implicit array from varargs and use it as data source, thus the laziness is imaginary. Your data is actually stored in the arrays.
In most of the cases the resulting Cartesian product stream is much longer than the inputs, thus there's practically no reason to make the inputs lazy. For example, having five lists of five elements (25 in total), you will have the resulting stream of 3125 elements. So storing 25 elements in the memory is not very big problem. Actually in most of the practical cases they are already stored in the memory.
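One possible body for a list-based signature like the one above might look like the following sketch. It is an assumption about how such a method could be written, not the answer's own implementation; the class name CartesianLists is made up.

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CartesianLists {
    @SafeVarargs
    static <T> Stream<T> cartesian(BinaryOperator<T> aggregator, List<T>... lists) {
        if (lists.length == 0) {
            return Stream.empty();
        }
        Stream<T> result = lists[0].stream();
        for (int i = 1; i < lists.length; i++) {
            List<T> next = lists[i];
            // A list can be streamed again for every element of the previous result.
            result = result.flatMap(m -> next.stream().map(n -> aggregator.apply(m, n)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> product = cartesian((a, b) -> a + b,
                List.of("A", "B"), List.of("X", "Y"))
                .collect(Collectors.toList());
        System.out.println(product); // prints [AX, AY, BX, BY]
    }
}
```

Because lists can be traversed any number of times, the IllegalStateException from the question does not occur here.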
In order to generate the stream of Cartesian products you need to constantly "rewind" all the streams (except the first one). To rewind, the streams should be able to retrieve the original data again and again, either buffering them somehow (which you don't like) or grabbing them again from the source (collection, array, file, network, random numbers, etc.) and performing all the intermediate operations again and again. If your source and intermediate operations are slow, then the lazy solution may be much slower than the buffering solution. If your source is unable to produce the data again (for example, a random number generator which cannot reproduce the numbers it produced before), your solution will be incorrect.
Nevertheless, a totally lazy solution is possible. Just use stream suppliers instead of streams:
private static <T> Stream<T> cartesian(BinaryOperator<T> aggregator,
Supplier<Stream<T>>... streams) {
return Arrays.stream(streams)
.reduce((s1, s2) ->
() -> s1.get().flatMap(t1 -> s2.get().map(t2 -> aggregator.apply(t1, t2))))
.orElse(Stream::empty).get();
}
The solution is interesting as we create and reduce the stream of suppliers to get the resulting supplier and finally call it. Usage:
Stream<String> result = cartesian(
(a, b) -> a + b,
() -> Stream.of("A", "B"),
() -> Stream.of("K", "L"),
() -> Stream.of("X", "Y")
);
result.forEach(System.out::println);
The stream is consumed in the flatMap operation in the second iteration. So you have to create a new stream every time you map your result. Therefore you have to collect the stream in advance to get a new stream in every iteration.
private static <T> Stream<T> cartesian(BiFunction<T, T, T> aggregator, Stream<T>... streams) {
Stream<T> result = null;
for (Stream<T> stream : streams) {
if (result == null) {
result = stream;
} else {
Collection<T> s = stream.collect(Collectors.toList());
result = result.flatMap(m -> s.stream().map(n -> aggregator.apply(m, n)));
}
}
return result;
}
Or even shorter:
private static <T> Stream<T> cartesian(BiFunction<T, T, T> aggregator, Stream<T>... streams) {
return Arrays.stream(streams).reduce((r, s) -> {
List<T> collect = s.collect(Collectors.toList());
return r.flatMap(m -> collect.stream().map(n -> aggregator.apply(m, n)));
}).orElse(Stream.empty());
}
You can create a method that returns a stream of List<T> of objects and does not aggregate them. The algorithm is the same: at each step, collect the elements of the second stream to a list and then append them to the elements of the first stream.
The aggregator is outside the method.
@SuppressWarnings("unchecked")
public static <T> Stream<List<T>> cartesianProduct(Stream<T>... streams) {
// incorrect incoming data
if (streams == null) return Stream.empty();
return Arrays.stream(streams)
// non-null streams
.filter(Objects::nonNull)
// represent each list element as SingletonList<Object>
.map(stream -> stream.map(Collections::singletonList))
// summation of pairs of inner lists
.reduce((stream1, stream2) -> {
// list of lists from second stream
List<List<T>> list2 = stream2.collect(Collectors.toList());
// append to the first stream
return stream1.flatMap(inner1 -> list2.stream()
// combinations of inner lists
.map(inner2 -> {
List<T> list = new ArrayList<>();
list.addAll(inner1);
list.addAll(inner2);
return list;
}));
}).orElse(Stream.empty());
}
public static void main(String[] args) {
Stream<String> stream1 = Stream.of("A", "B");
Stream<String> stream2 = Stream.of("K", "L");
Stream<String> stream3 = Stream.of("X", "Y");
@SuppressWarnings("unchecked")
Stream<List<String>> stream4 = cartesianProduct(stream1, stream2, stream3);
// output
stream4.map(list -> String.join("", list)).forEach(System.out::println);
}
String.join is a kind of aggregator in this case.
Output:
AKX
AKY
ALX
ALY
BKX
BKY
BLX
BLY
See also: Stream of cartesian product of other streams, each element as a List?
I have to program regular expressions with lambda expressions for university. I got stuck on combining two functions inside another function.
Here is my code:
static String ausdruck = "abcd";
public static Function<String, String> Char = (c) -> {
return (ausdruck.startsWith(c)) ? ausdruck = ausdruck.substring(1,
ausdruck.length()) : "Value Error";
};
public static BiFunction<Function<String, String>,
Function<String, String>,
Function<String, String>>
And = (f1, f2) -> {return null;};
What I want to do in the And method is Char(Char.apply("a")): I want to call the function f2 with f1 as a parameter.
The call of the And method has to look like:
And.apply(Char.apply("a"), Char.apply("b"));
I guess this is what you want
Function<String, String> f = and(consume("a"), consume("b"));
f.apply("abcd"); // "cd"
static Function<String, String> consume(String x) {
    return input -> {
        if (input.startsWith(x)) return input.substring(x.length());
        throw new IllegalArgumentException("input does not start with " + x);
    };
}
static Function<String, String> and(Function<String, String> f1, Function<String, String> f2) {
    return input -> f2.apply(f1.apply(input));
}
and is not necessary though; see the Function.andThen method:
f = consume("a").andThen(consume("b"));
Unfortunately, there is no "curry"; otherwise, we could do this
f = consume2.curry("a") .andThen ( consume2.curry("b") );
static BiFunction<String, String, String> consume2 = (input, x) -> { ... return input.substring(x.length()); ... };
It's better if you design your own functional interfaces, with the methods you need, like curry.
interface F1 {
    String apply(String input);
    F1 and(F1 next);
}
interface F2 {
    String apply(String input, String x);
    F1 curry(String x);
}
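A possible way to flesh out such interfaces is to implement and and curry as default methods, so that each interface keeps a single abstract method and can still be written as a lambda. This is only a sketch under that assumption; CurryDemo, CONSUME and run are hypothetical names.

```java
class CurryDemo {
    @FunctionalInterface
    interface F1 {
        String apply(String input);

        // Runs this function first, then feeds its result to next.
        default F1 and(F1 next) {
            return input -> next.apply(apply(input));
        }
    }

    @FunctionalInterface
    interface F2 {
        String apply(String input, String x);

        // Fixes the second argument and leaves a one-argument function.
        default F1 curry(String x) {
            return input -> apply(input, x);
        }
    }

    // Strips the prefix x from the input, or fails if the input does not start with x.
    static final F2 CONSUME = (input, x) -> {
        if (!input.startsWith(x)) {
            throw new IllegalArgumentException("input does not start with " + x);
        }
        return input.substring(x.length());
    };

    static String run(String input) {
        return CONSUME.curry("a").and(CONSUME.curry("b")).apply(input);
    }

    public static void main(String[] args) {
        System.out.println(run("abcd")); // prints cd
    }
}
```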
If I understand the question correctly, you want to create a function that composes a new function, executing one function with the result of another function. The best way to do this in a lambda would be to return a new lambda.
Try something like this:
BiFunction<Function<String, String>, Function<String, String>, Function<String, String>> compose =
(f1, f2) -> (a -> f2.apply(f1.apply(a)));
Example:
Function<String, String> upper = s -> s.toUpperCase();
Function<String, String> twice = s -> s + s;
Function<String, String> upperTwice = compose.apply(upper, twice);
System.out.println(upperTwice.apply("foo"));
Output is FOOFOO.
Concerning your concrete example
The call of the And method has to look like:
And.apply(Char.apply("a"), Char.apply("b"));
I do not know exactly what you are trying to do, but I don't think this will work, given your current implementation of Char. It seems like you want to compose a lambda to remove a with another to remove b, but instead Char.apply("a") will not create another function, but actually remove "a" from your ausdruck String! Instead, your Char lambda should probably also return another lambda, and that lambda should not modify some static variable, but take and return another String parameter.
Function<String, Function<String, String>> stripChar =
c -> (s -> s.startsWith(c) ? s.substring(1) : "ERROR");
Function<String, String> stripAandC = compose.apply(stripChar.apply("c"), stripChar.apply("a"));
System.out.println(stripAandC.apply("cash"));
Output is sh.
Finally, in case you want to use this with anything other than just String, it might make sense to make compose an actual method instead of a lambda, so you can make use of generics. Also, you can make this a bit simpler by using andThen:
public static <A, B, C> Function<A, C> compose(Function<A, B> f1, Function<B,C> f2){
return f1.andThen(f2);
}
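A short usage sketch of the generic method with different input and output types follows. The wrapper class GenericCompose and the two sample functions are assumptions made for illustration.

```java
import java.util.function.Function;

class GenericCompose {
    static <A, B, C> Function<A, C> compose(Function<A, B> f1, Function<B, C> f2) {
        return f1.andThen(f2);
    }

    public static void main(String[] args) {
        Function<String, Integer> length = String::length;
        Function<Integer, String> describe = n -> "length=" + n;
        // String -> Integer -> String
        System.out.println(compose(length, describe).apply("foo")); // prints length=3
    }
}
```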