I'm working on improving the speed of a program where performance is critical. Currently it fails to process large data sets. There are many nested for loops and so I thought it would be worth trying parallel streams. I have access to a high performance cluster so potentially have many cores available.
I have the method below:
public MinSpecSetFamily getMinDomSpecSets() {
MinSpecSetFamily result = new MinSpecSetFamily();
ResourceType minRT = this.getFirstEssentialResourceType();
if (minRT == null || minRT.noSpecies()) {
System.out.println("Problem in getMinDomSpecSets()");
}
for (Species spec : minRT.specList) {
SpecTree minTree = this.getMinimalConstSpecTreeRootedAt(spec);
ArrayList<SpecTreeNode> leafList = minTree.getLeaves();
for (SpecTreeNode leaf : leafList) {
ArrayList<Species> sp = leaf.getAncestors();
SpecSet tmpSet = new SpecSet(sp);
result.addSpecSet(tmpSet);
}
}
return result;
}
I understand that I can turn a nested for loop into a parallel stream with something like:
minRT.specList.parallelStream().flatMap(leaf -> leaflist.parallelStream())
However, I cannot find examples showing how to deal with the actions inside each for loop and I'm not at all confident about how this is supposed to work. I'd really appreciate some assistance and explanation of how to convert this method so that I can translate the solution to other methods in the program too.
Thanks.
Here's one way of doing it (hopefully I have no typos):
MinSpecSetFamily result =
minRT.specList
.parallelStream()
.flatMap(spec -> getMinimalConstSpecTreeRootedAt(spec).getLeaves().stream())
.map(leaf -> new SpecSet(leaf.getAncestors()))
.reduce(new MinSpecSetFamily (),
(fam,set)-> {
fam.addSpecSet(set);
return fam;
},
(f1, f2) -> new MinSpecSetFamily(f1, f2));
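EDIT: Following Holger's comment, you should use collect instead of reduce. reduce must not modify its identity value, and in a parallel stream the same MinSpecSetFamily instance would be shared between threads; collect creates a separate container for each chunk of work and merges them afterwards. (The third argument assumes MinSpecSetFamily has an add method that merges another family into it.)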
MinSpecSetFamily result =
minRT.specList
.parallelStream()
.flatMap(spec -> getMinimalConstSpecTreeRootedAt(spec).getLeaves().stream())
.map(leaf -> new SpecSet(leaf.getAncestors()))
.collect(MinSpecSetFamily::new,MinSpecSetFamily::addSpecSet,MinSpecSetFamily::add);
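To make the three arguments of collect concrete, here is a minimal, self-contained sketch. SetFamily is a hypothetical stand-in for MinSpecSetFamily (the real class is not shown in the question), with addSet playing the role of addSpecSet and addAll the role of the merge method:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for MinSpecSetFamily: a mutable container of sets.
class SetFamily {
    private final Set<Set<String>> members = new HashSet<>();

    // Accumulator: add one element to this container.
    void addSet(Set<String> s) { members.add(s); }

    // Combiner: merge another partial container into this one.
    void addAll(SetFamily other) { members.addAll(other.members); }

    int size() { return members.size(); }

    boolean contains(Set<String> s) { return members.contains(s); }
}

class CollectDemo {
    static SetFamily build(List<List<String>> paths) {
        return paths.parallelStream()
                .map(path -> new HashSet<String>(path))
                // supplier, accumulator, combiner: each chunk of the parallel
                // stream fills its own SetFamily, and the partial results are
                // merged with addAll at the end.
                .collect(SetFamily::new, SetFamily::addSet, SetFamily::addAll);
    }

    public static void main(String[] args) {
        SetFamily fam = build(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("b", "a"),
                Arrays.asList("c")));
        System.out.println(fam.size()); // prints 2, since {a, b} and {b, a} are equal sets
    }
}
```

Because every partial container is only touched by one thread at a time, the container itself does not need to be thread-safe.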
I'm trying to filter on two conditions in a stream that is nested inside another stream. I need to check whether a matching object exists by its "name" parameter and then check whether its Boolean property "isGoldplated" is true. I tried this code, but it didn't work because it wasn't filtering on the isGoldplated property:
List<CompressorModel> filteredCompressors = pack.getSet().getCompressors().stream()
.peek(p -> goldData.stream().map(GoldPlateData::getCompressorSerialNo).anyMatch(name -> name.equals(p.getGcsn())))
.peek(p -> goldData.stream().map(GoldPlateData::getIsGoldplated))
.collect(Collectors.toList());
so I ended up using two loops instead:
List<CompressorModel> filteredCompressors = new ArrayList<>();
for (CompressorModel cmp : pack.getSet().getCompressors()) {
for(GoldPlateData gold: goldData) {
if( StringUtils.equalsIgnoreCase(cmp.getGcsn(), gold.getCompressorSerialNo()) && Boolean.TRUE.equals(gold.getIsGoldplated())) {
filteredCompressors.add(cmp);
}
}
}
so my request is, how could I convert these two loops into a working stream?
thanks in advance
You could use filter() on the pack.getSet().getCompressors() stream and then look for a match in goldData, like this:
List<CompressorModel> filteredCompressors = pack.getSet()
.getCompressors().stream()
.filter(cmp -> goldData.stream().anyMatch(gd ->
        // null-safe checks, as in the original loops
        StringUtils.equalsIgnoreCase(cmp.getGcsn(), gd.getCompressorSerialNo())
                && Boolean.TRUE.equals(gd.getIsGoldplated())))
.collect(Collectors.toList());
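If goldData is large, scanning it once per compressor can get expensive. A possible variation, sketched here under assumptions, collects the gold-plated serial numbers into a Set first. The two classes are hypothetical minimal stand-ins for the ones in the question, and lower-casing with Locale.ROOT only approximates equalsIgnoreCase for unusual characters:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical minimal stand-ins for the classes in the question.
class GoldPlateData {
    private final String compressorSerialNo;
    private final Boolean isGoldplated;
    GoldPlateData(String compressorSerialNo, Boolean isGoldplated) {
        this.compressorSerialNo = compressorSerialNo;
        this.isGoldplated = isGoldplated;
    }
    String getCompressorSerialNo() { return compressorSerialNo; }
    Boolean getIsGoldplated() { return isGoldplated; }
}

class CompressorModel {
    private final String gcsn;
    CompressorModel(String gcsn) { this.gcsn = gcsn; }
    String getGcsn() { return gcsn; }
}

class GoldFilterDemo {
    static List<CompressorModel> goldPlated(List<CompressorModel> compressors,
                                            List<GoldPlateData> goldData) {
        // Pass 1: serial numbers of all gold-plated entries, lower-cased.
        Set<String> goldSerials = goldData.stream()
                .filter(gd -> Boolean.TRUE.equals(gd.getIsGoldplated()))
                .map(GoldPlateData::getCompressorSerialNo)
                .filter(Objects::nonNull)
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
        // Pass 2: keep the compressors whose serial number is in that set.
        return compressors.stream()
                .filter(cmp -> cmp.getGcsn() != null
                        && goldSerials.contains(cmp.getGcsn().toLowerCase(Locale.ROOT)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<CompressorModel> result = goldPlated(
                Arrays.asList(new CompressorModel("ab1"), new CompressorModel("CD2")),
                Arrays.asList(new GoldPlateData("AB1", true), new GoldPlateData("cd2", false)));
        result.forEach(c -> System.out.println(c.getGcsn())); // prints ab1
    }
}
```

Note that a filter-based version keeps each compressor at most once, whereas the nested loops in the question add a compressor once per matching GoldPlateData entry.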
peek is intended mainly for debugging. It is an intermediate operation whose action must not influence the result, and whatever the lambda evaluates to is ignored, so it cannot filter anything. The implementation may even skip the action entirely when it can compute the result without traversing the elements.
So you may modify your implementation to use filter instead, checking both conditions against the same GoldPlateData entry:
List<CompressorModel> filteredCompressors = pack.getSet().getCompressors().stream()
        .filter(p -> goldData.stream()
                .anyMatch(gd -> gd.getCompressorSerialNo().equals(p.getGcsn())
                        && Boolean.TRUE.equals(gd.getIsGoldplated())))
        .collect(Collectors.toList());
I have some code that I wrote with for loops and would like to express with streams, but I have been unable to do so. Please advise on a stream-based or otherwise better approach.
for (Student student : students) {
List<Laptop> filteredLaptop = new ArrayList<>();
for (Laptop laptop: student.getLaptopList()) {
if(laptop.getColour().endsWith("RED")){
filteredLaptop.add(laptop);
}
}
student.setLaptopList(filteredLaptop);
}
How can I reduce the complexity of this code, or how can I perform the same operation using a Stream?
Something like this (I renamed the laptopList accessors to getLaptops/setLaptops, which I think reads better; toList is a static import of Collectors.toList):
students.forEach(student ->
student.setLaptops(
student.getLaptops().stream()
.filter(laptop -> laptop.getColour().endsWith("RED"))
.collect(toList())
)
);
I have the following block; processRule() removes entries from the differences list.
public List<Difference> process(List<Rule> rules, List<Difference> differences) {
for (Rule rule : rules) {
differences = processRule(rule, differences);
}
return differences;
}
How can this be done with the Stream API? I can't just use flatMap because each new call to processRule() needs the reduced differences from the previous call as its argument.
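Maybe something like this, using the three-argument form of stream reduce.
Note: not tested, posting from my mobile
return rules
        .stream()
        .reduce(differences,
                // accumulator: feed the result so far into the next rule
                (diffs, rule) -> processRule(rule, diffs),
                // combiner: only needed for parallel streams, which this approach does not support
                (a, b) -> { throw new UnsupportedOperationException("sequential streams only"); });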
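Another way to express "each call gets the previous call's result" is to turn every rule into a function and chain the functions with Function::andThen. The sketch below is self-contained and uses a hypothetical processRule in which a rule is just an int and processing removes its multiples; the real Rule and Difference types are not shown in the question:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class RuleChainDemo {
    // Hypothetical stand-in for processRule(rule, differences):
    // removes every entry that is a multiple of the rule.
    static List<Integer> processRule(int rule, List<Integer> differences) {
        return differences.stream()
                .filter(d -> d % rule != 0)
                .collect(Collectors.toList());
    }

    static List<Integer> process(List<Integer> rules, List<Integer> differences) {
        // Turn each rule into a List -> List function, then compose them in
        // encounter order. Function composition is associative, so this is a
        // well-formed reduction.
        Function<List<Integer>, List<Integer>> pipeline = rules.stream()
                .map(rule -> (Function<List<Integer>, List<Integer>>)
                        diffs -> processRule(rule, diffs))
                .reduce(Function.identity(), Function::andThen);
        return pipeline.apply(differences);
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList(2, 3),
                Arrays.asList(1, 2, 3, 4, 5, 6, 7))); // prints [1, 5, 7]
    }
}
```

Whether this reads better than the plain for loop is debatable; for a strictly sequential dependency like this one, the loop is arguably the clearest form.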
I'm reading the book 'Java in Action'.
And I saw an example code of Stream in the book.
List<String> names = menu.stream()
.filter(d -> {
System.out.println("filtering" + d.getName());
return d.getCalories() > 300;
})
.map(d -> {
System.out.println("mapping" + d.getName());
return d.getName();
})
.limit(3)
.collect(toList());
When the code is executed, the result is as follows.
filtering __1__.
mapping __1__.
filtering __2__.
mapping __2__.
filtering __3__.
mapping __3__.
That is, because of limit(3), the log messages are printed only 3 times!
In the book, this is called "loop fusion".
But I don't understand this.
To know whether an object passes the filter, you have to evaluate the filtering function, so I would expect a "filtering ..." message to be printed for every element.
Please explain how loop fusion works internally.
“Because, if you [want to] know whether an object is filtered, you have to calculate the filtering function”, is right, but perhaps your sample data wasn’t sufficient to illustrate the point. If you try
List<String> result = Stream.of("java", "streams", "are", "great", "stuff")
.filter(s -> {
System.out.println("filtering " + s);
return s.length()>=4;
})
.map(s -> {
System.out.println("mapping " + s);
return s.toUpperCase();
})
.limit(3)
.collect(Collectors.toList());
System.out.println("Result:");
result.forEach(System.out::println);
it will print
filtering java
mapping java
filtering streams
mapping streams
filtering are
filtering great
mapping great
Result:
JAVA
STREAMS
GREAT
showing that:
In order to find three elements matching the filter, you might have to evaluate more than three elements (here, four elements are evaluated), but you don’t need to evaluate subsequent elements once you have three matches.
The subsequent mapping function only needs to be applied to matching elements. This allows us to conclude that it is irrelevant whether .map(…).limit(…) or .limit(…).map(…) was specified.
This differs from the relative position of .filter and .limit, which is relevant.
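A small sketch of why the position of filter relative to limit matters, using the same sample data (the class and method names are only for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitOrderDemo {
    // Keep the first three elements that pass the filter.
    static List<String> filterThenLimit() {
        return Stream.of("java", "streams", "are", "great", "stuff")
                .filter(s -> s.length() >= 4)
                .limit(3)
                .collect(Collectors.toList());
    }

    // Take the first three elements, then filter those.
    static List<String> limitThenFilter() {
        return Stream.of("java", "streams", "are", "great", "stuff")
                .limit(3)
                .filter(s -> s.length() >= 4)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(filterThenLimit()); // prints [java, streams, great]
        System.out.println(limitThenFilter()); // prints [java, streams]
    }
}
```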
The term “loop fusion” implies that there is not a filtering loop, followed by a mapping loop, followed by a limit operation, but only one loop (conceptually), performing the entire work, equivalent to the following single loop:
String[] source = { "java", "streams", "are", "great", "stuff"};
List<String> result = new ArrayList<>();
int limit = 3;
for(String s: source) {
    System.out.println("filtering " + s);
    if(s.length()>=4) {
        System.out.println("mapping " + s);
        String t = s.toUpperCase();
        result.add(t);
        // stop as soon as the limit is reached, so "stuff" is never examined
        if(--limit == 0) break;
    }
}
I think you got it wrong. What limit does is called short-circuiting: the pipeline stops pulling elements once 3 of them have reached limit, so the remaining elements are never filtered or mapped.
Loop fusion, on the other hand, means that filter and map are executed in a single pass. The two operations were merged into one that is executed for each element.
You do not see output like this:
filtering
filtering
filtering
mapping
mapping
mapping
Instead you see filter followed immediately by a map; so these two operations were merged into a single one.
Generally you should not care how that is done internally (it builds a pipeline of these operations), because this might change and it is implementation-specific.
I am looking for some help in converting some code I have to use the really nifty Java 8 Stream library. Essentially I have a bunch of student objects and I would like to get back a list of filtered objects as seen below:
List<Integer> classRoomList;
Set<ScienceStudent> filteredStudents = new HashSet<>();
//Return only 5 students in the end
int limit = 5;
for (MathStudent s : mathStudents)
{
// Get the scienceStudent with the same id as the math student
ScienceStudent ss = scienceStudents.get(s.getId());
if (classRoomList.contains(ss.getClassroomId()))
{
if (!exclusionStudents.contains(ss))
{
if (limit > 0)
{
filteredStudents.add(ss);
limit--;
}
}
}
}
Of course, the above is a super contrived example I made up for the sake of learning more Java 8. Assume all students extend a Student object with studentId and classRoomId. An additional requirement is that the result be an immutable set.
A quite literal translation (and the required classes to play around with)
interface ScienceStudent {
String getClassroomId();
}
interface MathStudent {
String getId();
}
Set<ScienceStudent> filter(
Collection<MathStudent> mathStudents,
Map<String, ScienceStudent> scienceStudents,
Set<ScienceStudent> exclusionStudents,
List<String> classRoomList) {
return mathStudents.stream()
.map(s -> scienceStudents.get(s.getId()))
.filter(ss -> classRoomList.contains(ss.getClassroomId()))
.filter(ss -> !exclusionStudents.contains(ss))
.limit(5)
.collect(Collectors.toSet());
}
Multiple filter conditions really just translate into multiple .filter calls, or into one big combined filter like ss -> classRoomList.contains(ss.getClassroomId()) && !exclusion...
Regarding the immutable set: collect with Collectors.toSet() fills a mutable collection from the stream and returns it once finished, so the simplest option is to wrap the result afterwards, for example with Collections.unmodifiableSet. Collectors.collectingAndThen can apply the same wrapper as a finishing step inside the collect call.
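As a sketch of the wrapping step, Collectors.collectingAndThen from the standard library can apply Collections.unmodifiableSet once the set has been collected. The example below is self-contained and uses plain strings instead of the student classes; firstMatches and its parameters are hypothetical names for illustration:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ImmutableSetDemo {
    // Returns an unmodifiable set of the first `limit` ids that are not excluded.
    static Set<String> firstMatches(List<String> ids, Set<String> excluded, int limit) {
        return ids.stream()
                .filter(id -> !excluded.contains(id))
                .limit(limit)
                .collect(Collectors.collectingAndThen(
                        Collectors.toSet(),             // collect into a mutable set first
                        Collections::unmodifiableSet)); // then wrap it as a finishing step
    }

    public static void main(String[] args) {
        Set<String> result = firstMatches(
                Arrays.asList("a", "b", "c", "d"), Collections.singleton("b"), 2);
        System.out.println(result.size()); // prints 2 (the set holds "a" and "c")
        try {
            result.add("x");
        } catch (UnsupportedOperationException e) {
            System.out.println("result is read-only"); // this line is printed
        }
    }
}
```

The wrapper is a read-only view of the collected set; since no other reference to the underlying set escapes, callers cannot modify it.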
The null paranoid version
return mathStudents.stream().filter(Objects::nonNull) // math students could be null
.map(MathStudent::getId).filter(Objects::nonNull) // their id could be null
.map(scienceStudents::get).filter(Objects::nonNull) // and the mapped science student
.filter(ss -> classRoomList.contains(ss.getClassroomId()))
.filter(ss -> !exclusionStudents.contains(ss))
.limit(5)
.collect(Collectors.toSet());