Consider I have a list with two types of data, one valid and the other invalid.
If I start filtering through this list, can I collect two lists at the end?
Can we collect two lists from Java 8 streams?
You cannot, unless you have a way to group the elements of the list according to a condition.
In that case you could, for example, use Collectors.groupingBy(), which returns a Map<Foo, List<Bar>> whose values are the two lists.
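For example, a minimal sketch grouping strings by a hypothetical length-based validity check (standing in for Foo/Bar), assuming the usual java.util and java.util.stream imports:

Map<Boolean, List<String>> grouped = Stream.of("one", "two", "three", "four")
        .collect(Collectors.groupingBy(s -> s.length() == 3)); // key: "is this element valid?"
List<String> valid = grouped.getOrDefault(true, List.of());    // [one, two]
List<String> invalid = grouped.getOrDefault(false, List.of()); // [three, four]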
Note that in your case you don't need a stream just to filter.
Filter the invalid list with removeIf() and add all its remaining elements to the first list:
invalidList.removeIf(o -> conditionToRemove);
goodList.addAll(invalidList);
If you don't want to change the state of goodList, you can make a shallow copy of it first:
invalidList.removeIf(o -> conditionToRemove);
List<Foo> terminalList = new ArrayList<>(goodList);
terminalList.addAll(invalidList);
Here is a way using the Java 8 Stream API. Consider I have a List of String elements: the input list has strings of various lengths; only strings with length 3 are valid.
List<String> input = Arrays.asList("one", "two", "three", "four", "five");
Map<Boolean, List<String>> map = input.stream().collect(Collectors.partitioningBy(s -> s.length() == 3));
System.out.println(map); // {false=[three, four, five], true=[one, two]}
The resulting Map has only two entries: one with the valid and the other with the invalid input list elements. The entry with key=true holds the valid strings as a List: one, two. The entry with key=false holds the invalid strings: three, four, five.
Note that Collectors.partitioningBy always produces two entries in the resulting map, regardless of whether any valid or invalid values exist.
Another suggestion: filter using a lambda that adds the elements that you want to filter out of the stream to a separate list.
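A quick sketch of that idea, reusing the input list from the example above (this works for sequential streams; for a parallel stream the extra list would have to be thread-safe, and side effects in filter are generally discouraged):

List<String> invalid = new ArrayList<>();
List<String> valid = input.stream()
        .filter(s -> {
            boolean ok = s.length() == 3;  // hypothetical validity check
            if (!ok) {
                invalid.add(s);            // side effect: remember the rejected element
            }
            return ok;
        })
        .collect(Collectors.toList());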
The collector can only return a single object!
But you could create a custom collector that simply puts the stream elements into two lists and then returns a list of lists containing those two lists.
There are many examples of how to do that.
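A minimal sketch of such a collector built with Collector.of() (the splitBy name and the length-3 validity check are purely illustrative):

static <T> Collector<T, ?, List<List<T>>> splitBy(Predicate<T> isValid) {
    return Collector.of(
            () -> {                                  // supplier: [validList, invalidList]
                List<List<T>> acc = new ArrayList<>();
                acc.add(new ArrayList<>());
                acc.add(new ArrayList<>());
                return acc;
            },
            (acc, t) -> acc.get(isValid.test(t) ? 0 : 1).add(t), // accumulate into the right list
            (left, right) -> {                       // combiner for parallel streams
                left.get(0).addAll(right.get(0));
                left.get(1).addAll(right.get(1));
                return left;
            });
}

List<List<String>> result = Stream.of("one", "two", "three", "four")
        .collect(splitBy(s -> s.length() == 3));
System.out.println(result); // [[one, two], [three, four]]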
If you need to classify binary (true|false), you can use Collectors.partitioningBy, which returns a Map<Boolean, List> (or another downstream collection such as a Set, if you additionally specify one).
If you need more than two categories, use Collectors.groupingBy or just Collectors.toMap of collections.
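For instance, a small sketch grouping into more than two categories (here by string length):

Map<Integer, List<String>> byLength = Stream.of("one", "two", "three", "four")
        .collect(Collectors.groupingBy(String::length));
System.out.println(byLength); // {3=[one, two], 4=[four], 5=[three]} (map order not guaranteed)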
Related
I have two lists containing names. The first list l1 has selected names, and the second list l2 has all the names.
Example:
List<String> l1 = List.of("en", "cu", "eu");
List<String> l2 = List.of("ch", "cu", "en", "eu", "pe");
All the elements in l1 always exist in l2.
I want to obtain the output where all the common names are placed at the beginning followed by the remaining names, and I need them to be sorted in the ascending order.
Like this:
output -> {cu, en, eu, ch, pe};
I've shown what I was trying to do below, but I can't figure out how to place the common elements at the starting positions (the 2nd list is already sorted) and the remaining elements after them:
List<String> common = l2.stream().filter(l1::contains).collect(Collectors.toList());
If I understood correctly, you want to sort the elements of the second list l2 so that the values present in the first list l1 come at the beginning, and both parts should be sorted in alphabetical order.
You need to define a custom Comparator for that purpose. Since Java 8, the recommended way to create a Comparator is to use static factory methods like Comparator.comparing().
While implementing the Comparator, we first need to check whether a particular value is present in the first list and order the elements by the result of this check (reminder: the natural ordering of boolean values is false -> true). And since contains() on a list is costly (it iterates over the list under the hood and therefore runs in O(n)), it is better performance-wise to dump the data from the first list into a HashSet and perform the checks against this Set.
Then, in order to sort both parts of the list alphabetically, we can chain a second comparator by applying Comparator.thenComparing() and providing Comparator.naturalOrder() as the argument.
Here's how it might be implemented:
List<String> l1 = List.of("en", "cu", "eu");
List<String> l2 = List.of("ch", "cu", "en", "eu", "pe");
// expected output -> [cu, en, eu, ch, pe]
Set<String> toCheck = new HashSet<>(l1);
Comparator<String> comparator =
    Comparator.<String, Boolean>comparing(toCheck::contains).reversed()
        .thenComparing(Comparator.naturalOrder());
List<String> sorted = l2.stream()
    .sorted(comparator)
    .toList(); // Java 16+; use collect(Collectors.toList()) on earlier versions
System.out.println(sorted);
Output:
[cu, en, eu, ch, pe]
I have a List<Stream<String>> that I get by doing a series of transactions.
The list size is dynamic (maximum 3 elements), so I can't do:
Stream<String> finalStream = Stream.concat(list.get(0), Stream.concat(list.get(1), list.get(2)));
I need to concatenate the list of Streams into one single Stream<String>.
Is there any simple way to do this?
If you have a list of lists, a stream of streams, or any collection of collections, you can use flatMap to, well, flatten them. flatMap applies a mapping function, which must return a stream, to each input element and streams each element of the result of that mapping function.
In your case, you could do:
var finalStream = list.stream().flatMap(x -> x);
x -> x is the identity function, which returns its input unmodified. If you prefer, you can replace it with the expression Function.identity().
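For completeness, a small self-contained sketch of the same idea:

List<Stream<String>> list = List.of(Stream.of("a", "b"), Stream.of("c"), Stream.of("d", "e"));
Stream<String> finalStream = list.stream().flatMap(Function.identity());
System.out.println(finalStream.collect(Collectors.toList())); // [a, b, c, d, e]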
Is there any way I can assert that 2 lists of Maps are equal while ignoring the order? Thanks.
Example:
List<Map<String, Object>> dataList1 = new ArrayList<>();
List<String> headers1 = new ArrayList<>();
headers1.add("Header1");
headers1.add("Header2");
headers1.add("Header3");
Map<String, Object> dataMap1 = new LinkedHashMap<>();
dataMap1.put(headers1.get(0), "testData1");
dataMap1.put(headers1.get(1), "testData2");
dataMap1.put(headers1.get(2), "testData3");
Map<String, Object> dataMap2 = new LinkedHashMap<>();
dataMap2.put(headers1.get(0), "testData4");
dataMap2.put(headers1.get(1), "testData5");
dataMap2.put(headers1.get(2), "testData6");
dataList1.add(dataMap1);
dataList1.add(dataMap2);
List<Map<String, Object>> dataList2 = new ArrayList<>();
List<String> headers3 = new ArrayList<>();
headers3.add("Header1");
headers3.add("Header2");
headers3.add("Header3");
Map<String, Object> dataMap3 = new LinkedHashMap<>();
dataMap3.put(headers3.get(0), "testData1");
dataMap3.put(headers3.get(1), "testData2");
dataMap3.put(headers3.get(2), "testData3");
Map<String, Object> dataMap4 = new LinkedHashMap<>();
dataMap4.put(headers3.get(0), "testData4");
dataMap4.put(headers3.get(1), "testData5");
dataMap4.put(headers3.get(2), "testData6");
dataList2.add(dataMap4);
dataList2.add(dataMap3);
System.out.println(dataList1);
System.out.println(dataList2);
and the results would be:
[{Header1=testData1, Header2=testData2, Header3=testData3}, {Header1=testData4, Header2=testData5, Header3=testData6}]
[{Header1=testData4, Header2=testData5, Header3=testData6}, {Header1=testData1, Header2=testData2, Header3=testData3}]
I want to get a TRUE result since they are actually the same, just in a different order. Thank you in advance!
EDIT:
Just to add: I am trying to check whether the 2 lists of maps from 2 different sources (Excel file vs. database data) are equal, so there's a chance that the lists of data contain duplicates.
That is not what a list is for. If you don't care about the order, you should use a Set instead.
However, if you need to keep the Lists but just want to make sure they contain the same Maps, you could convert them to Sets for the assertion:
new HashSet<Map<String, Object>>(dataList1).equals(new HashSet<Map<String, Object>>(dataList2))
In the case you're describing, the value lists would rather have "bag" semantics, i.e. set-like collections that can contain duplicates.
To compare those you need to write your own comparison logic (or find a library that provides it, e.g. Apache Commons Collections' CollectionUtils#isEqualCollection() or Hamcrest's Matchers#containsInAnyOrder()), since the default methods won't help. The assert would then be something like assertTrue(mapsAreEqual(actual, expected)) or assertEquals(new MapEqualWrapper(actual), new MapEqualWrapper(expected)), where MapEqualWrapper implements the logic in its equals().
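For example, with Apache Commons Collections (assuming commons-collections4 is on the classpath) the assertion could be as simple as:

assertTrue(CollectionUtils.isEqualCollection(dataList1, dataList2)); // cardinality-aware comparison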
For the check you could sort the lists (or copies of them) and do a traditional comparison, or use a frequency map (after other checks, of course):
first check the sizes - if they differ, the lists aren't equal
build a frequency map for the first list, i.e. increment the value by 1 for each occurrence
check the elements of the second list by decreasing the occurrences and removing any entry whose count hits 0
if you hit an element that has no entry in the frequency map, you can stop right away since the bags aren't equal
Sorting and comparing would be easier to implement, with just a couple of lines, but the time complexity would be O(n*log(n)) due to the sorting.
On the other hand, the time complexity of the frequency-map approach is basically O(n) (the iterations are O(n) and map put/get/remove should be O(1) in theory). This, however, shouldn't matter unless you need to quickly compare large lists, so I'd go with the sort-and-compare method first.
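Still, here is a minimal sketch of the frequency-map variant described above (the bagsAreEqual name is just illustrative):

static <T> boolean bagsAreEqual(List<T> first, List<T> second) {
    if (first.size() != second.size()) {
        return false;                                 // different sizes -> not equal
    }
    Map<T, Integer> frequencies = new HashMap<>();
    for (T element : first) {
        frequencies.merge(element, 1, Integer::sum);  // count occurrences in the first list
    }
    for (T element : second) {
        Integer count = frequencies.get(element);
        if (count == null) {
            return false;                             // element occurs more often than in the first list
        }
        if (count == 1) {
            frequencies.remove(element);              // last expected occurrence consumed
        } else {
            frequencies.put(element, count - 1);
        }
    }
    return true;                                      // same size and every element accounted for
}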
Lately I came across a problem while working with nested collections (values of Maps inside a List):
List<Map<String, Object>> items
This list in my case contains 10-20 Maps.
At some point I had to replace the value "Calculation" of the key "description" with "Rating". So I came up with this solution:
items.forEach(e -> e.replace("description","Calculation","Rating"));
It would be quite a fine and efficient solution if all the maps in this list contained the key-value pair ["description", "Calculation"]. Unfortunately, I know that there will be only one such pair in the whole List<Map<String, Object>>.
The question is:
Is there a better (more efficient) solution for finding and replacing this one value, instead of iterating through all the List elements, using Java 8 streams?
Perfection would be to have it done in one stream without any complex/obfuscating operations on it.
items.stream()
.filter(map -> map.containsKey("description"))
.findFirst()
.ifPresent(map -> map.replace("description", "Calculation", "Rating"));
You will have to iterate over the list until a map with the key "description" is found. Pick the first such map and try to replace the value.
As pointed out by @Holger, if the key "description" isn't unique to a single map, but the pair ("description", "Calculation") is unique:
items.stream()
.anyMatch(m -> m.replace("description", "Calculation", "Rating"));
Suppose I read whole files:
JavaPairRDD<String, String> filesRDD = sc.wholeTextFiles(inputDataPath);
Then, I have the following mapper:
JavaRDD<List<String>> processingFiles = filesRDD.map(fileNameContent -> {
List<String> results = new ArrayList<String>();
for ( some loop ) {
if (condition) {
results.add(someString);
}
}
. . .
return results;
});
For the sake of argument, suppose that inside the mapper I need to make a list of strings, which I return for each file. Now, each string in each list can be viewed independently and needs to be processed independently later on. I don't want Spark to process each list at once, but each string of each list individually. Later, when I use collect(), I get a list of lists.
One way to put this is: how do I parallelize this list of lists per string rather than per list?
Instead of mapping filesRDD to get a list of lists, flatMap it and you can get an RDD of strings.
EDIT: Adding my comment on request
map is a 1:1 function where 1 input row -> 1 output row. flatMap is a 1:N function where 1 input row -> many (or 0) output rows. If you use flatMap, you can design it so that your output RDD is an RDD of strings, whereas currently your output RDD is an RDD of lists of strings. It sounds like this is what you want. I'm not a Java Spark user, so I can't give you syntax specifics. Check here for help on the syntax.
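I can't test it here, but a rough sketch of what that could look like with the Java API (the line-splitting and the isEmpty check are stand-ins for your loop and condition; note that in Spark 2.x the FlatMapFunction returns an Iterator, while in 1.x it returns an Iterable):

JavaRDD<String> processedStrings = filesRDD.flatMap(fileNameContent -> {
    List<String> results = new ArrayList<>();
    for (String line : fileNameContent._2().split("\n")) { // _2() holds the file's content
        if (!line.isEmpty()) {                             // stand-in for the real condition
            results.add(line);
        }
    }
    return results.iterator(); // Spark 2.x signature
});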