Java Stream distinct() with other chained operations generates duplicate result

I was experimenting with Java Stream and distinct() and came across a scenario where, if I change an object of the Stream after distinct() has been executed, the final result contains duplicate items.
Does it make sense for an object to move on to the next operation before distinct() has finished? How can I ensure uniqueness without iterating through the entire list?
Note: Lombok's @Data annotation implies @EqualsAndHashCode on the Dto class, so equals() and hashCode() are generated automatically.
package br.com.marcusvoltolim;
import lombok.AllArgsConstructor;
import lombok.Data;
import reactor.core.publisher.Flux;
import java.util.*;
import java.util.stream.Collectors;
@Data
@AllArgsConstructor
class Dto {
    private Long id;
}
public class Main {
    public static void main(String[] args) {
        Dto dto0 = new Dto(0L);
        Dto dto1 = new Dto(1L);
        Dto dto2 = new Dto(1L);
        List<Dto> list = Arrays.asList(dto0, dto1, dto2);
        System.out.println("Original list: " + list);
        System.out.println("List with only distinct: " + list.stream().distinct().collect(Collectors.toList()));
        List<Dto> streamList = list.stream()
                .distinct()
                .map(dto -> {
                    if (dto.getId() == 1L) {
                        dto.setId(3L);
                    }
                    return dto;
                })
                .collect(Collectors.toList());
        System.out.println("Java Stream with map after distinct: " + streamList);
    }
}
Result:
Original list: [Dto(id=0), Dto(id=1), Dto(id=1)]
List with only distinct: [Dto(id=0), Dto(id=1)]
Java Stream with map after distinct: [Dto(id=0), Dto(id=3), Dto(id=3)]
I expected the result: [Dto(id=0), Dto(id=3)]

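To get your expected result, you must not change the state of the list elements while they are still passing through the stream.
You have:
    if (dto.getId() == 1L) {
        dto.setId(3L); // <--- changes the contents of the object
    }
Stream operations are lazy and process one element at a time, so dto1 is mutated by map() before distinct() has even looked at dto2. The mutation changes what equals() and hashCode() return for dto1, so when dto2 (still id=1) arrives, distinct() no longer recognises it as a duplicate and lets it through.
For it to work correctly, you have to really MAP and not MODIFY:
    if (dto.getId() == 1L) {
        return new Dto(3L); // <--- map to a new object
    } else {
        return new Dto(dto.getId()); // <--- map to a new object
    }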
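A minimal, self-contained sketch of the "map, don't modify" idea. It assumes a hand-written immutable Dto in place of the Lombok class; the names DistinctThenMap and remap are made up for illustration and are not from the question:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctThenMap {

    // Hypothetical immutable stand-in for the Lombok-generated Dto.
    static final class Dto {
        private final Long id;

        Dto(Long id) {
            this.id = id;
        }

        Long getId() {
            return id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Dto && Objects.equals(id, ((Dto) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }

        @Override
        public String toString() {
            return "Dto(id=" + id + ")";
        }
    }

    // distinct() first, then map to NEW objects instead of mutating the old ones.
    static List<Dto> remap(List<Dto> input) {
        return input.stream()
                .distinct()
                .map(dto -> dto.getId() == 1L ? new Dto(3L) : dto)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Dto> list = Arrays.asList(new Dto(0L), new Dto(1L), new Dto(1L));
        System.out.println(remap(list)); // [Dto(id=0), Dto(id=3)]
    }
}
```

Because the field is final, nothing that distinct() has already seen can change afterwards, so the second Dto(1) is still recognised as a duplicate.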

Related

Get unexpected result while using Java 8 Lambda Expression

I have changed my test to make the problem easier to reproduce.
Minimal test:
public class test {
    public static void main(String[] args) {
        List<TestBean> obj_list = Arrays.asList(new TestBean("aa"), new TestBean("bb"), new TestBean("bb")).stream()
                .distinct().map(tt -> {
                    tt.col = tt.col + "_t";
                    return tt;
                }).collect(Collectors.toList());
        System.out.println(obj_list);
        List<String> string_obj_list = Arrays.asList(new String("1"), new String("2"), new String("2")).stream()
                .distinct().map(t -> t + "_t").collect(Collectors.toList());
        System.out.println(string_obj_list);
        List<String> string_list = Arrays.asList("1", "2", "2").stream()
                .distinct().map(t -> t + "_t").collect(Collectors.toList());
        System.out.println(string_list);
    }
}
@Data
@AllArgsConstructor
@EqualsAndHashCode
class TestBean {
    String col;
}
The result is below; I cannot understand the first line:
[TestBean(col=aa_t), TestBean(col=bb_t), TestBean(col=bb_t)]
[1_t, 2_t]
[1_t, 2_t]
---------- original question below ----------
My logic, step by step:
1. produce a stream from a list
2. map each element to a list
3. flatten the lists into one stream
4. remove duplicate elements with distinct()
5. apply a map function to each element and collect the result into a list
However, the result does not reflect the distinct step (step 4), which I cannot understand.
public class test {
    public static void main(String[] args) {
        List<TestBean> stage1 = Arrays.asList(new TestBean("aa", null), new TestBean("bb", null), new TestBean("bb", null)).stream()
                .map(tt -> {
                    return Arrays.asList(tt);
                })
                .flatMap(Collection::stream).distinct().collect(Collectors.toList());
        List<Object> stage2 = stage1.stream().map(tt -> {
            tt.setCol2(tt.col1);
            return tt;
        }).collect(Collectors.toList());
        System.out.println(stage1);
        System.out.println(stage2);
        List<TestBean> stage_all = Arrays.asList(new TestBean("aa", null), new TestBean("bb", null), new TestBean("bb", null)).stream()
                .map(tt -> {
                    return Arrays.asList(tt);
                })
                .flatMap(Collection::stream).distinct().map(tt -> {
                    tt.setCol2(tt.col1);
                    return tt;
                }).collect(Collectors.toList());
        System.out.println(stage_all);
    }
}
@Data
@AllArgsConstructor
@EqualsAndHashCode
class TestBean {
    String col1;
    String col2;
}
The result is:
[TestBean(col1=aa, col2=aa), TestBean(col1=bb, col2=bb)]
[TestBean(col1=aa, col2=aa), TestBean(col1=bb, col2=bb)]
[TestBean(col1=aa, col2=aa), TestBean(col1=bb, col2=bb), TestBean(col1=bb, col2=bb)]
The third line looks wrong to me.

The distinct() operation filters the items in the stream using Object.equals(Object).
However, you are mutating the items as they are streamed, which is a very bad idea for Set-like operations. Elements travel through the pipeline one at a time, so the first TestBean(col=bb) has already been changed to TestBean(col=bb_t) before the final TestBean(col=bb) reaches distinct(). At that moment distinct() has seen three items that are unequal to each other, so the last .map() receives all three.
You can verify this by re-processing the stream without the side effect in .map(), or by adding .distinct() after .map().
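The element-by-element flow described above can be made visible with a small trace. This is only a sketch for a sequential stream on the standard JDK implementation; the class name PipelineOrder and the log messages are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PipelineOrder {

    // Records the order in which elements leave distinct() and enter map().
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of("aa", "bb", "bb")
                .distinct()
                .peek(s -> log.add("distinct passed " + s))
                .map(s -> {
                    log.add("map " + s);
                    return s + "_t";
                })
                .collect(Collectors.toList());
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The log shows "map aa" before "distinct passed bb": each element reaches map() before distinct() has looked at the next one. With immutable strings the second "bb" is still dropped; with a mutated bean it would no longer match.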
Takeaway: don't use distinct() or other set-like operations on objects whose fields used in equals() or hashCode() change while they are being processed, because set.add() then gives misleading results and lets duplicates through. This is where Java records are useful: they cannot be changed, which eliminates errors from this type of side effect:
record TestBean(String col) {}
Example
The @EqualsAndHashCode annotation on TestBean generates the equals() and hashCode() methods that Set / distinct() operations rely on. If hashCode()/equals() change after an item has been added, the set no longer works properly, because it fails to recognise a newly added element as a duplicate of the earlier one. Consider this simpler definition of TestBean, where the same instance is added 5 times:
public static void main(String... args) {
    class TestBean {
        String col;
        TestBean(String col) {
            this.col = col;
        }
        // This is a bad choice of hashCode, as it changes if "col" is changed:
        public int hashCode() {
            return col.hashCode();
        }
    }
    Set<TestBean> set = new HashSet<>();
    TestBean x = new TestBean("bb");
    for (int i = 0; i < 5; i++) {
        System.out.println("set.add(x)=" + set.add(x));
        System.out.println("set.size()=" + set.size());
        // Comment out the next line or the whole hashCode method:
        x.col += "_t";
    }
}
Run the above, which adds the same element to a set 5 times. You will see that set.size() may be 5, not 1. Comment out the line that causes the hash code to change, or the hashCode() method itself, and set.size() is 1 as expected.

How to find the max number of an unique string element in a alphanumeric Array list in java

I have a list that contains alphanumeric elements. I want to find the maximum number for each distinct prefix individually.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
public class Collect {
    public static void main(String[] args) {
        List<String> alphaNumericList = new ArrayList<String>();
        alphaNumericList.add("Demo.23");
        alphaNumericList.add("Demo.1000");
        alphaNumericList.add("Demo.12");
        alphaNumericList.add("Demo.12");
        alphaNumericList.add("Test.01");
        alphaNumericList.add("Test.02");
        alphaNumericList.add("Test.100");
        alphaNumericList.add("Test.99");
        Collections.sort(alphaNumericList);
        System.out.println("Output "+Arrays.asList(alphaNumericList));
    }
}
I need to keep only the values below. I tried sorting the list, but it sorts by the string rather than by the int value. I would like an efficient way to do this. Please advise.
Demo.1000
Test.100
Current output:
Output [[Demo.1000, Demo.12, Demo.12, Demo.23, Test.01, Test.02, Test.100, Test.99]]

You can either create a special AlphaNumericList type, wrapping the array list or whatever collection(s) you want to use internally, giving it a nice public interface to work with, or for the simplest case if you want to stick to the ArrayList<String>, just use a Comparator for sort(..):
package de.scrum_master.stackoverflow.q60482676;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import static java.lang.Integer.parseInt;
public class Collect {
    public static void main(String[] args) {
        List<String> alphaNumericList = Arrays.asList(
            "Demo.23", "Demo.1000", "Demo.12", "Demo.12",
            "Test.01", "Test.02", "Test.100", "Test.99"
        );
        Collections.sort(
            alphaNumericList,
            (o1, o2) ->
                ((Integer) parseInt(o1.split("[.]")[1])).compareTo(parseInt(o2.split("[.]")[1]))
        );
        System.out.println("Output " + alphaNumericList);
    }
}
This will yield the following console log:
Output [Test.01, Test.02, Demo.12, Demo.12, Demo.23, Test.99, Test.100, Demo.1000]
Please let me know if you don't understand lambda syntax. You can also use an anonymous class instead like in pre-8 versions of Java.
Update 1: If you want to refactor the one-line lambda for better readability, maybe you prefer this:
Collections.sort(
    alphaNumericList,
    (text1, text2) -> {
        Integer number1 = parseInt(text1.split("[.]")[1]);
        int number2 = parseInt(text2.split("[.]")[1]);
        return number1.compareTo(number2);
    }
);
Update 2: If more than one dot "." character can occur in your strings, you need to get the numeric substring in a different way via regex match, still not complicated:
Collections.sort(
    alphaNumericList,
    (text1, text2) -> {
        Integer number1 = parseInt(text1.replaceFirst(".*[.]", ""));
        int number2 = parseInt(text2.replaceFirst(".*[.]", ""));
        return number1.compareTo(number2);
    }
);
Update 3: I just noticed that for some weird reason you put the sorted list into another list via Arrays.asList(alphaNumericList) when printing. I have replaced that by just alphaNumericList in the code above and also updated the console log. Before the output was like [[foo, bar, zot]], i.e. a nested list with one element.
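As a variation on the comparator approach, the same ordering can be sketched with Comparator.comparingInt. This assumes every element ends in ".<number>"; the class and method names (NumericSuffixSort, suffix, sortBySuffix) are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NumericSuffixSort {

    // Parses the digits after the last '.' (assumes the input is well-formed).
    static int suffix(String s) {
        return Integer.parseInt(s.substring(s.lastIndexOf('.') + 1));
    }

    // Returns a sorted copy, ordered by the numeric suffix; the input list is left untouched.
    static List<String> sortBySuffix(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        copy.sort(Comparator.comparingInt(NumericSuffixSort::suffix));
        return copy;
    }

    public static void main(String[] args) {
        List<String> alphaNumericList = Arrays.asList(
                "Demo.23", "Demo.1000", "Demo.12", "Demo.12",
                "Test.01", "Test.02", "Test.100", "Test.99");
        System.out.println("Output " + sortBySuffix(alphaNumericList));
    }
}
```

Like the lambda versions, this sorts by number only and ignores the prefix, so it orders the list but does not by itself pick the maximum per prefix.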

Check the answer below:
public static void main(String[] args) {
    List<String> alphaNumericList = new ArrayList<String>();
    alphaNumericList.add("Demo.23");
    alphaNumericList.add("Demo.1000");
    alphaNumericList.add("Demo.12");
    alphaNumericList.add("Demo.12");
    alphaNumericList.add("Test.01");
    alphaNumericList.add("Test.02");
    alphaNumericList.add("Test.100");
    alphaNumericList.add("Test.99");
    Map<String, List<Integer>> map = new HashMap<>();
    for (String val : alphaNumericList) {
        String key = val.split("\\.")[0];
        Integer value = Integer.valueOf(val.split("\\.")[1]);
        if (map.containsKey(key)) {
            map.get(key).add(value);
        } else {
            List<Integer> intList = new ArrayList<>();
            intList.add(value);
            map.put(key, intList);
        }
    }
    for (Map.Entry<String, List<Integer>> entry : map.entrySet()) {
        List<Integer> valueList = entry.getValue();
        Collections.sort(valueList, Collections.reverseOrder());
        System.out.print(entry.getKey() + "." + valueList.get(0) + " ");
    }
}

Using a stream and the toMap() collector (with static imports of Collectors.toMap and BinaryOperator.maxBy):
Map<String, Long> result = alphaNumericList.stream().collect(
    toMap(k -> k.split("\\.")[0], v -> Long.parseLong(v.split("\\.")[1]), maxBy(Long::compare)));
The result map will contain the word part as the key and the maximum number as the value (in your example the map will contain {Demo=1000, Test=100}).
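For reference, here is the toMap() approach as a complete program, a sketch only. The class and method name MaxPerPrefix / maxPerPrefix are invented, and the imports are written out instead of static imports:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class MaxPerPrefix {

    // Key: text before the dot. Value: number after the dot.
    // When two elements share a key, the merge function keeps the larger value.
    static Map<String, Long> maxPerPrefix(List<String> items) {
        return items.stream().collect(Collectors.toMap(
                s -> s.split("\\.")[0],
                s -> Long.parseLong(s.split("\\.")[1]),
                BinaryOperator.maxBy(Long::compare)));
    }

    public static void main(String[] args) {
        List<String> alphaNumericList = Arrays.asList(
                "Demo.23", "Demo.1000", "Demo.12", "Demo.12",
                "Test.01", "Test.02", "Test.100", "Test.99");
        Map<String, Long> result = maxPerPrefix(alphaNumericList);
        result.forEach((prefix, max) -> System.out.println(prefix + "." + max));
    }
}
```

Note that the numbers are parsed, so a leading zero such as in "Test.01" is not preserved in the map values.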

a. Assume the ArrayList contains strings of the form Demo.<number> and Test.<number>.
b. It is then straightforward to filter the elements that start with Demo. and extract the maximum integer among them.
c. The same applies to extracting the maximum number associated with Test.
The following snippet does this:
Set<String> uniqueString = alphaNumericList.stream()
    .map(c -> c.replaceAll("\\.[0-9]*", ""))
    .collect(Collectors.toSet());
Map<String,Integer> map = new HashMap<>();
for (String s : uniqueString) {
    int max = alphaNumericList.stream()
        .filter(c -> c.startsWith(s + "."))
        .map(c -> c.replaceAll(s + "\\.", ""))
        .map(c -> Integer.parseInt(c))
        .max(Integer::compare)
        .get();
    map.put(s, max);
}

Java 8: How to stream a list into a list of lists?

Given a list of objects that need to be sorted and grouped:
static class Widget {
    // ...
    public String getCode() { return widgetCode; }
    public String getName() { return widgetName; }
}
List<Widget> widgetList = Arrays.asList(
    // several widgets with codes and names
);
I want to group the list into a list-of-lists, grouped by widgetCode, with the elements of each sub-list in the order they were encountered in the original list. I know that I can group them into a Map of lists using the groupingBy Collector:
Map<String,List<Widget>> widgetMap = widgetList.stream()
    .collect(groupingBy(Widget::getCode));
I do not take for granted that the keys are sorted, so I've taken the extra step of loading the whole thing into a SortedMap type:
SortedMap<String,List<Widget>> sortedWidgetMap = new TreeMap<String,List<Widget>>(
    widgetList.stream()
        .collect(groupingBy(Widget::getCode))
);
I know I can get a Collection from sortedWidgetMap by using .values(), and I guess it is an ordered collection because it comes from an ordered map type, so that's my current solution:
Collection<List<Widget>> widgetListList = new TreeMap<String,List<Widget>>(
    widgetList.stream()
        .collect(groupingBy(Widget::getCode))
).values();
widgetListList.forEach(System.out::println); // do something with the data
This works so far, but I'm not confident that the resulting widgetListList is actually guaranteed to be in the right order (i.e. by widgetCode) or that the sub-lists will be built in the order they were found in the original list. Also, I think it must be possible to use the Stream API alone to achieve the output I want. So, how can I do this better?

As mentioned in a comment, referring to a question that is very similar (in fact, I nearly considered it to be a duplicate...), the groupingBy call comes in different flavors, and one of them allows passing in a factory for the map that is to be created.
So there is no need to explicitly wrap the result of the "simple" groupingBy call into the creation of a new TreeMap, because you can create the TreeMap directly. This map (and its values() collection!) will be ordered by the key. The values of the map are lists, which are created using the downstream collector toList(), which explicitly says that it will collect the results in encounter order.
So the following should indeed be a simple, correct and efficient solution:
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;
public class CollectToListOfList {
    static class Widget {
        String code;
        String name;
        Widget(String code, String name) {
            this.code = code;
            this.name = name;
        }
        public String getCode() {
            return code;
        }
        public String getName() {
            return name;
        }
        @Override
        public String toString() {
            return code + ": " + name;
        }
    }
    public static void main(String[] args) {
        List<Widget> widgetList = Arrays.asList(
            new Widget("0", "A"),
            new Widget("1", "B"),
            new Widget("2", "C"),
            new Widget("3", "D"),
            new Widget("0", "E"),
            new Widget("1", "F"),
            new Widget("2", "G"),
            new Widget("3", "H"),
            new Widget("0", "I"),
            new Widget("1", "J"));
        Collection<List<Widget>> result = widgetList.stream()
            .collect(Collectors.groupingBy(Widget::getCode, TreeMap::new, Collectors.toList()))
            .values();
        for (List<Widget> list : result) {
            System.out.println(list);
        }
    }
}

Edited to correct my previous post based on the clarification. The only difference from the other answer(s) is that the values() result is fed into an ArrayList constructor to create a List of Lists.
// Create some data
Random r = new Random(29);
String codes = "ABCD";
// Counter for the widget names (java.util.concurrent.atomic.AtomicInteger);
// a plain local int cannot be incremented inside a lambda.
AtomicInteger i = new AtomicInteger();
List<Widget> widgetList = r.ints(10, 0, 4)
    .mapToObj(n -> new Widget(codes.substring(n, n + 1), "N" + i.getAndIncrement()))
    .collect(Collectors.toList());
// Now create the list of lists.
List<List<Widget>> listofWidgets = new ArrayList<>(
    widgetList.stream()
        .collect(Collectors.groupingBy(Widget::getCode, TreeMap::new, Collectors.toList()))
        .values());
// Display them
for (List<?> list : listofWidgets) {
    System.out.println(list);
}
Widget class
class Widget {
    String widgetCode;
    String widgetName;
    public Widget(String wc, String wn) {
        widgetCode = wc;
        widgetName = wn;
    }
    public String getCode() {
        return widgetCode;
    }
    public String getName() {
        return widgetName;
    }
    public String toString() {
        return "{" + widgetCode + ":" + widgetName + "}";
    }
}

Possible Java1.8 stream anomaly

Can someone explain the behavior of the following code? In particular why does the forEach in the stream change the original List?:
import java.util.ArrayList;
import java.util.List;
public class foreachIssue {
    class justanInt {
        public int anint;
        public justanInt(int t){
            anint=t;
        }
    }
    public static void main(String[] args){
        new foreachIssue();
    }
    public foreachIssue(){
        System.out.println("The Stream Output:");
        List<justanInt> lst = new ArrayList<>();
        justanInt j1=new justanInt(2);
        justanInt j2=new justanInt(5);
        lst.add(j1);lst.add(j2);
        lst.stream()
            .map((s)->{
                s.anint=s.anint*s.anint;
                return s;
            })
            .forEach((s)->System.out.println("Anything"));
        System.out.println(" lst after the stream:");
        for(justanInt il:lst)
            System.out.println(il.anint);
        List<justanInt> lst1 = new ArrayList<>();
        justanInt j3=new justanInt(2);
        justanInt j4=new justanInt(5);
        lst1.add(j3);lst1.add(j4);
        lst1.stream()
            .map((s)->{
                s.anint=s.anint*s.anint;
                return s;
            });
        System.out.println(" lst1 after the stream without forEach:");
        for(justanInt il:lst1)
            System.out.println(il.anint);
    }
}
The output is:
The Stream Output:
Anything
Anything
lst after the stream:
4
25
lst1 after the stream without forEach:
2
5

map is an intermediate operation. From the documentation:
"Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy."
So the Function you've provided to map doesn't get applied until you consume the Stream. In the first case, you do that with forEach, which is a terminal operation. In the second, you don't.
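The laziness can be observed directly by counting how often the mapping function runs. This is a small sketch; the class and method names (LazyMapDemo, callsWithoutTerminalOp, callsWithTerminalOp) are made up for illustration:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyMapDemo {

    // Builds a pipeline with map() but never consumes it.
    static int callsWithoutTerminalOp() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pending = Stream.of(2, 5).map(n -> {
            calls.incrementAndGet();
            return n * n;
        });
        // 'pending' is never consumed, so the lambda above has not run.
        return calls.get();
    }

    // Same pipeline, but forEach() (a terminal operation) pulls the elements through.
    static int callsWithTerminalOp() {
        AtomicInteger calls = new AtomicInteger();
        Stream.of(2, 5).map(n -> {
            calls.incrementAndGet();
            return n * n;
        }).forEach(n -> { });
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("without terminal op: " + callsWithoutTerminalOp()); // 0
        System.out.println("with terminal op: " + callsWithTerminalOp());       // 2
    }
}
```

This mirrors the question: the list elements only change in the first case because forEach triggers the map function.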

Comparing Maps of Objects

I have set up a test that:
retrieves data concerning several court cases: each court case is stored in a CourtCase object
a set of CourtCase objects is then stored in a Map
I retrieve these data twice (from two different sources) so I end up with two Maps
The data within the objects should be the same between the Maps, but the order of the objects within the Maps may not be:
Map1:
A, case1 - B, case2 - C, case3
Map2:
B, case2 - A, case1 - C, case3
How can I best compare these two Maps?

Map#equals does not care about the order. As long as your two maps contain the same mappings, it will return true.
Map#equals uses the Set#equals method, applied to the entry set. The Set#equals contract:
Returns true if the specified object is also a set, the two sets have the same size, and every member of the specified set is contained in this set (or equivalently, every member of this set is contained in the specified set).
Note: this assumes that your CourtCase objects have proper equals and hashCode methods, so that they can be compared.
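A short sketch of this behaviour, assuming a simplified CourtCase with a single field. The field caseNumber, the helper mapOf and the class name MapEqualsDemo are hypothetical, since the real class is not shown in the question:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class MapEqualsDemo {

    // Hypothetical, simplified CourtCase with value-based equals/hashCode.
    static final class CourtCase {
        private final String caseNumber;

        CourtCase(String caseNumber) {
            this.caseNumber = caseNumber;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof CourtCase && Objects.equals(caseNumber, ((CourtCase) o).caseNumber);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(caseNumber);
        }
    }

    // Builds a map that remembers insertion order, so the two maps below really differ in order.
    static Map<String, CourtCase> mapOf(String... keys) {
        Map<String, CourtCase> map = new LinkedHashMap<>();
        for (String key : keys) {
            map.put(key, new CourtCase("case-" + key));
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, CourtCase> map1 = mapOf("A", "B", "C");
        Map<String, CourtCase> map2 = mapOf("B", "A", "C");
        System.out.println(map1.equals(map2));            // true: same mappings, different order
        System.out.println(map1.equals(mapOf("A", "B"))); // false: one mapping is missing
    }
}
```

Without the equals/hashCode overrides in CourtCase, the first comparison would be false, because each map holds separately created CourtCase instances.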

Map implementations provide an equals method which does the trick: Map.equals

@user973718 The best way to compare two map objects in Java is to add the keys of each map to a list. With those two lists you can use retainAll() and removeAll() to build a list of common keys and a list of differing keys. You can then iterate over the maps using the keys of the common list and the differing list, and compare the values with equals.
The code below gives this output:
Before {b=2, c=3, a=1}
After {c=333, a=1}
Unequal: Before- 3 After- 333
Equal: Before- 1 After- 1
Values present only in before map: 2
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.commons.collections.CollectionUtils;
public class Demo
{
    public static void main(String[] args)
    {
        Map<String, String> beforeMap = new HashMap<String, String>();
        beforeMap.put("a", "1");
        beforeMap.put("b", "2");
        beforeMap.put("c", "3");
        Map<String, String> afterMap = new HashMap<String, String>();
        afterMap.put("a", "1");
        afterMap.put("c", "333");
        System.out.println("Before "+beforeMap);
        System.out.println("After "+afterMap);
        List<String> beforeList = getAllKeys(beforeMap);
        List<String> afterList = getAllKeys(afterMap);
        List<String> commonList1 = beforeList;
        List<String> commonList2 = afterList;
        List<String> diffList1 = getAllKeys(beforeMap);
        List<String> diffList2 = getAllKeys(afterMap);
        commonList1.retainAll(afterList);
        commonList2.retainAll(beforeList);
        diffList1.removeAll(commonList1);
        diffList2.removeAll(commonList2);
        if (commonList1 != null && commonList2 != null) // both lists have the same size
        {
            for (int i = 0; i < commonList1.size(); i++)
            {
                if ((beforeMap.get(commonList1.get(i))).equals(afterMap.get(commonList1.get(i))))
                {
                    System.out.println("Equal: Before- "+ beforeMap.get(commonList1.get(i))+" After- "+afterMap.get(commonList1.get(i)));
                }
                else
                {
                    System.out.println("Unequal: Before- "+ beforeMap.get(commonList1.get(i))+" After- "+afterMap.get(commonList1.get(i)));
                }
            }
        }
        if (CollectionUtils.isNotEmpty(diffList1))
        {
            for (int i = 0; i < diffList1.size(); i++)
            {
                System.out.println("Values present only in before map: "+beforeMap.get(diffList1.get(i)));
            }
        }
        if (CollectionUtils.isNotEmpty(diffList2))
        {
            for (int i = 0; i < diffList2.size(); i++)
            {
                System.out.println("Values present only in after map: "+afterMap.get(diffList2.get(i)));
            }
        }
    }
    /** getAllKeys adds the keys of the map to a list */
    private static List<String> getAllKeys(Map<String, String> map1)
    {
        List<String> key = new ArrayList<String>();
        if (map1 != null)
        {
            Iterator<String> mapIterator = map1.keySet().iterator();
            while (mapIterator.hasNext())
            {
                key.add(mapIterator.next());
            }
        }
        return key;
    }
}
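The same retainAll()/removeAll() idea can be sketched more compactly with sets of keys and without the Commons Collections dependency. This is an alternative illustration, not the answer's code; the names MapDiff, onlyIn and changed are invented:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class MapDiff {

    // Keys that are present in 'a' but missing from 'b'.
    static Set<String> onlyIn(Map<String, String> a, Map<String, String> b) {
        Set<String> keys = new TreeSet<>(a.keySet());
        keys.removeAll(b.keySet());
        return keys;
    }

    // Keys present in both maps whose values differ.
    static Set<String> changed(Map<String, String> a, Map<String, String> b) {
        Set<String> keys = new TreeSet<>(a.keySet());
        keys.retainAll(b.keySet());
        keys.removeIf(k -> Objects.equals(a.get(k), b.get(k)));
        return keys;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("a", "1");
        before.put("b", "2");
        before.put("c", "3");
        Map<String, String> after = new HashMap<>();
        after.put("a", "1");
        after.put("c", "333");
        System.out.println("Only in before: " + onlyIn(before, after)); // [b]
        System.out.println("Only in after: " + onlyIn(after, before));  // []
        System.out.println("Changed: " + changed(before, after));       // [c]
    }
}
```

Copying the key sets into new TreeSet instances keeps the original maps untouched, since keySet() is a live view and removing from it would remove entries from the map.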
