Transform a narrow table into a wide one in Java

I have a txt file containing a long (narrow) table of strings. I want to transform it into a wide list of lists based on the ID column. The data looks like this:
Long format has 3 columns: country, key, value
- M*N rows.
e.g.
'USA', 'President', 'Obama'
...
'USA', 'Currency', 'Dollar'
Wide format has N+1 columns (here N = 16): country, key1, ..., keyN
- M rows
example:
country, President, ... , Currency
'USA', 'Obama', ... , 'Dollar'
I want to know the equivalent of
SELECT country,
MAX( IF( key='President', value, NULL ) ) AS President,
MAX( IF( key='Currency', value, NULL ) ) AS Currency,
...
FROM table
GROUP BY country;
in java!

I think you can make it a little bit easier with some Collectors.groupingBy(), but a simpler version would be this:
List<String[]> list = new ArrayList<>();
list.add(new String[] { "USA", "President", "Obama" });
list.add(new String[] { "USA", "Currency", "Dollar" });
list.add(new String[] { "Germany", "President", "Steinmeier" });
list.add(new String[] { "Germany", "Currency", "Euro" });
list.add(new String[] { "United Kingdom", "President", "Queen Elisabeth" });
list.add(new String[] { "United Kingdom", "Currency", "Pound" });
Map<String, Map<String, String>> map = new HashMap<>();
list.forEach(s -> {
map.putIfAbsent(s[0], new HashMap<>());
map.get(s[0]).put(s[1], s[2]);
});
List<String[]> wideList = map.entrySet().stream()
.map(m -> new String[] { m.getKey(), m.getValue().get("President"), m.getValue().get("Currency") })
.collect(Collectors.toList());
System.out.println("country, President, Currency");
wideList.forEach(s -> System.out.println(s[0] + ", " + s[1] + ", " + s[2]));
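A minimal sketch of the Collectors.groupingBy() variant mentioned above, assuming the same String[] row layout (country, key, value); the class and method names are illustrative:

```java
import java.util.*;
import java.util.stream.Collectors;

public class NarrowToWide {
    // Pivot narrow (country, key, value) rows into one key->value map per country
    static Map<String, Map<String, String>> pivot(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        r -> r[0],                                 // group by country
                        Collectors.toMap(r -> r[1], r -> r[2]))); // key -> value within group
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] { "USA", "President", "Obama" },
                new String[] { "USA", "Currency", "Dollar" });
        Map<String, Map<String, String>> wide = pivot(rows);
        System.out.println(wide.get("USA").get("Currency")); // Dollar
    }
}
```

Note that Collectors.toMap throws on duplicate (country, key) pairs, which doubles as a sanity check on the input.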

Related

Group list of objects by key into list with sublists of unique objects (with java streams)

I have a list whose elements combine a key with some additional objects that are not otherwise related to each other.
Considering this structure:
record A0(String id, String name, B b, C c) {}
record A(String id, String name, Set<B> bs, Set<C> cs) {}
record B(String id, String name) {}
record C(String id, String name) {}
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")));
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")));
a0s.add(new A0("1", "n1", new B("2", "nb2"), new C("3", "nc3")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
a0s.add(new A0("2", "n2", new B("1", "nb1"), new C("5", "nc5")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("6", "nc6")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("7", "nc7")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("8", "nc8")));
a0s.add(new A0("4", "n4", new B("4", "nb4"), new C("9", "nc9")));
a0s.add(new A0("4", "n4", new B("5", "nb5"), new C("10", "nc10")));
I want to achieve this with java-streams:
[ {
"id" : "1",
"name" : "n1",
"bs" : [ {
"id" : "1",
"name" : "nb1"
}, {
"id" : "2",
"name" : "nb2"
} ],
"cs" : [ {
"id" : "1",
"name" : "nc1"
}, {
"id" : "2",
"name" : "nc2"
}, {
"id" : "3",
"name" : "nc3"
} ]
}, {
"id" : "2",
"name" : "n2",
"bs" : [ {
"id" : "2",
"name" : "nb2"
}, {
"id" : "1",
"name" : "nb1"
} ],
"cs" : [ {
"id" : "4",
"name" : "nc4"
}, {
"id" : "5",
"name" : "nc5"
}, {
"id" : "6",
"name" : "nc6"
} ]
}, {
"id" : "3",
"name" : "n3",
"bs" : [ {
"id" : "3",
"name" : "nb3"
} ],
"cs" : [ {
"id" : "7",
"name" : "nc7"
}, {
"id" : "8",
"name" : "nc8"
} ]
}, {
"id" : "4",
"name" : "n4",
"bs" : [ {
"id" : "4",
"name" : "nb4"
}, {
"id" : "5",
"name" : "nb5"
} ],
"cs" : [ {
"id" : "10",
"name" : "nc10"
}, {
"id" : "9",
"name" : "nc9"
} ]
} ]
Here is my code without (obviously) java-streams:
import java.util.*;
import java.util.stream.Collectors;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
class Scratch {
record A0(String id, String name, B b, C c) {}
record A(String id, String name, Set<B> bs, Set<C> cs) {}
record B(String id, String name) {}
record C(String id, String name) {}
public static void main(String[] args) throws JsonProcessingException {
List<A0> a0s = new ArrayList<>();
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")));
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")));
a0s.add(new A0("1", "n1", new B("2", "nb2"), new C("3", "nc3")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
a0s.add(new A0("2", "n2", new B("1", "nb1"), new C("5", "nc5")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("6", "nc6")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("7", "nc7")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("8", "nc8")));
a0s.add(new A0("4", "n4", new B("4", "nb4"), new C("9", "nc9")));
a0s.add(new A0("4", "n4", new B("5", "nb5"), new C("10", "nc10")));
Set<A> collectA = new HashSet<>();
Map<String, Set<B>> mapAB = new HashMap<>();
Map<String, Set<C>> mapAC = new HashMap<>();
a0s.forEach(
a0 -> {
mapAB.computeIfAbsent(a0.id, k -> new HashSet<>());
mapAC.computeIfAbsent(a0.id, k -> new HashSet<>());
mapAB.get(a0.id).add(a0.b);
mapAC.get(a0.id).add(a0.c);
collectA.add(new A(a0.id, a0.name, new HashSet<>(), new HashSet<>()));
});
Set<A> outA = new HashSet<>();
collectA.forEach(
a -> {
outA.add(new A(a.id, a.name, mapAB.get(a.id), mapAC.get(a.id)));
});
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.enable(SerializationFeature.INDENT_OUTPUT);
String json =
objectMapper.writeValueAsString(
outA.stream()
.sorted(Comparator.comparing(A::id))
.collect(Collectors.toList()));
System.out.println(json);
}
}
I have read posts and docs, but was unable to achieve it.
This pointed me in some direction, but I was unable to continue combining it with other solutions and reading the API docs.
What "bugs" me is that I have multiple repeated objects to group (collect) and make unique. I am using Set to take advantage of the uniqueness, but it could be a List as well.
groupingBy + teeing
One of the ways to do that is to build the solution around standard Collectors.
For convenience, we can introduce a couple of custom types.
A record which is meant to hold the unique properties id and name:
record IdName(String id, String name) {}
And another record for storing sets Set<B>, Set<C> associated with the same id:
record BCSets(Set<B> bs, Set<C> cs) {}
The logic of the stream:
Group the data using IdName as a Key by utilizing Collector groupingBy()
Make use of Collector teeing() as downstream of grouping. teeing() expects three arguments: two Collectors and a function combining the results produced by them. As both downstream Collectors of teeing() we can make use of the combination of mapping() and toSet(), and combine their results by generating an auxiliary record BCSets.
Then create a stream over the map entries and transform each entry into an instance of type A.
Sort the stream elements and collect them into a list.
List<A> listA = a0s.stream()
.collect(Collectors.groupingBy(
a0 -> new IdName(a0.id(), a0.name()),
Collectors.teeing(
Collectors.mapping(A0::b, Collectors.toSet()),
Collectors.mapping(A0::c, Collectors.toSet()),
BCSets::new
)
))
.entrySet().stream()
.map(e -> new A(e.getKey().id(), e.getKey().name(), e.getValue().bs(), e.getValue().cs()))
.sorted(Comparator.comparing(A::id))
.toList();
groupingBy + custom Collector
Another option would be to create a custom Collector which would be used as the downstream of groupingBy().
For that, we need to define a custom accumulation type to consume elements from the stream and collect instances of B and C into sets. For convenience, I've implemented the Consumer interface:
public static class ABCAccumulator implements Consumer<A0> {
private Set<B> bs = new HashSet<>();
private Set<C> cs = new HashSet<>();
@Override
public void accept(A0 a0) {
bs.add(a0.b());
cs.add(a0.c());
}
public ABCAccumulator merge(ABCAccumulator other) {
bs.addAll(other.bs);
cs.addAll(other.cs);
return this;
}
// getters
}
To create a custom Collector, we can use static factory method Collector.of().
The overall logic remains the same, with one difference: now we have only two collectors, and the value type of the auxiliary map is different as well (it would be ABCAccumulator).
List<A> listA = a0s.stream()
.collect(Collectors.groupingBy(
a0 -> new IdName(a0.id(), a0.name()),
Collector.of(
ABCAccumulator::new,
ABCAccumulator::accept,
ABCAccumulator::merge
)
))
.entrySet().stream()
.map(e -> new A(e.getKey().id(), e.getKey().name(), e.getValue().getBs(), e.getValue().getCs()))
.sorted(Comparator.comparing(A::id))
.toList();
Until I can think of a better approach....
I was writing a solution using Collectors.teeing, but @Alexander Ivanchenko beat me to it. You can refer to that answer for how to achieve this using Collectors.teeing.
My initial code without using Collectors.teeing:
First, we group the elements in source list (a0s) by their id.
Map<String, List<A0>> groupById = a0s.stream()
.collect(Collectors.groupingBy(A0::id));
Next, we stream the entries in the previous map and build A objects.
Set<A> outAResult = groupById.entrySet()
.stream()
.map(entry -> new A(entry.getKey(),
entry.getValue().get(0).name(), //since grouped by A0's id - name will be same for all elements
transform(entry.getValue(), A0::b),
transform(entry.getValue(), A0::c)))
.collect(Collectors.toSet());
private <T> Set<T> transform(List<A0> a0s, Function<A0, T> mapper) {
return a0s.stream()
.map(mapper)
.collect(Collectors.toSet());
}
One issue is that we have to stream the elements in List<A0> (each value in the groupById map) twice to extract the Set<B> and Set<C>.
Note: Extracting the name of A0 object by entry.getValue().get(0).name() doesn't look great. To avoid this, you can create a temporary object (a record) which captures the id and name and group by that.
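A minimal sketch of that record-key idea, assuming a simplified A0 with plain-string b/c fields; all names here are illustrative:

```java
import java.util.*;
import java.util.stream.Collectors;

public class GroupByRecord {
    record A0(String id, String name, String b, String c) {} // simplified A0 for illustration
    record IdName(String id, String name) {}                 // composite grouping key

    // Group by a record key so both id and name travel with each group
    static Map<IdName, List<A0>> group(List<A0> a0s) {
        return a0s.stream()
                .collect(Collectors.groupingBy(a0 -> new IdName(a0.id(), a0.name())));
    }

    public static void main(String[] args) {
        List<A0> a0s = List.of(
                new A0("1", "n1", "nb1", "nc1"),
                new A0("1", "n1", "nb2", "nc2"),
                new A0("2", "n2", "nb3", "nc3"));
        // Each key carries the name, so no entry.getValue().get(0).name() lookup is needed
        group(a0s).forEach((k, v) -> System.out.println(k + " -> " + v.size()));
    }
}
```

Records get equals() and hashCode() for free, which is what makes them safe map keys here.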

Group by multiple fields in stream java 8

I have a class:
public class PublicationDTO {
final String publicationName;
final String publicationID;
final String locale;
final Integer views;
final Integer shares;
}
I need to get the sum of views and shares. For testing I created a list:
PublicationDTO publicationDTO1 = new PublicationDTO("Name1", "name1", "CA", 5, 6);
PublicationDTO publicationDTO2 = new PublicationDTO("Name2", "name2", "US", 6, 3);
PublicationDTO publicationDTO3 = new PublicationDTO("Name1", "name1", "CA", 10, 1);
PublicationDTO publicationDTO4 = new PublicationDTO("Name2", "name2", "CA", 2, 3);
List<PublicationDTO> publicationDTOS = List.of(publicationDTO1, publicationDTO2, publicationDTO3, publicationDTO4);
I want to group objects in list by publicationName, publicationId and locale and get result list like:
List.of(new PublicationDTO("Name1", "name1", "CA", 15, 7),
new PublicationDTO("Name2", "name2", "CA", 2, 3),
new PublicationDTO("Name2", "name2", "US", 6, 3));
I found a solution like:
List<PublicationDTO> collect = publicationDTOS.stream()
.collect(groupingBy(PublicationDTO::getPublicationID))
.values().stream()
.map(dtos -> dtos.stream()
.reduce((f1, f2) -> new PublicationDTO(f1.publicationName, f1.publicationID, f1.locale, f1.views + f2.views, f1.shares + f2.shares)))
.map(Optional::get)
.collect(toList());
but the result is not grouped by locale, and I'm not sure it groups by publicationID. How do I properly use collectors in such a case?
You are only grouping by getPublicationID:
.collect(groupingBy(PublicationDTO::getPublicationID))
Since the fields you are interested in for grouping are all strings, you could concatenate them and use that as a grouping classifier:
.collect(Collectors.groupingBy(p -> p.getPublicationName() + p.getPublicationID() + p.getLocale()))
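Note that plain concatenation can collide ("ab" + "c" equals "a" + "bc"). A sketch that instead uses a List as a collision-free composite key and merges the sums in one pass; modelling PublicationDTO as a record with accessor methods is an assumption for the example:

```java
import java.util.*;
import java.util.stream.Collectors;

public class GroupBySum {
    record PublicationDTO(String publicationName, String publicationID,
                          String locale, Integer views, Integer shares) {}

    static List<PublicationDTO> sum(List<PublicationDTO> dtos) {
        return new ArrayList<>(dtos.stream()
                .collect(Collectors.toMap(
                        // a List has proper equals/hashCode, so it is a safe composite key
                        p -> List.of(p.publicationName(), p.publicationID(), p.locale()),
                        p -> p,
                        // merge function adds up views and shares of colliding entries
                        (a, b) -> new PublicationDTO(a.publicationName(), a.publicationID(),
                                a.locale(), a.views() + b.views(), a.shares() + b.shares())))
                .values());
    }

    public static void main(String[] args) {
        List<PublicationDTO> in = List.of(
                new PublicationDTO("Name1", "name1", "CA", 5, 6),
                new PublicationDTO("Name1", "name1", "CA", 10, 1),
                new PublicationDTO("Name2", "name2", "US", 6, 3));
        System.out.println(sum(in)); // Name1/CA merged to views=15, shares=7
    }
}
```

The toMap merge function replaces the reduce + Optional::get dance from the question entirely.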

Check if each map of the list contains all key-values combination from two sources via Stream api

I have a small service which takes two arrays person and document and then return some array with combined data from them
If request like this:
{
"documents": [
{
"id": "A",
},
{
"id": "B",
}
],
"persons": [
{
"lastName": "C",
},
{
"lastName": "D",
}
]
}
Then the response has an array like this:
{
"documents": [
{
"id": "A",
"lastName": "C"
},
{
"id": "A",
"lastName": "D"
},
{
"id": "B",
"lastName": "C"
},
{
"id": "B",
"lastName": "D"
}
]
}
If I get a List<Map<String,String>> from the resulting array, how do I check that each map has each key-value combination via the Stream API? I managed to do this with brute-force loops but am struggling with the Stream API.
Update: my method eventually looks like this
private void checkResultList(List<Map<String, String>> resultList) {
List<String> persons = Arrays.asList("C", "D");
List<String> documents = Arrays.asList("A", "B");
List<Map<String, String>> expectedList = new ArrayList<>();
List<Map<String, String>> actualList = new ArrayList<>();
for (String name : persons){
for (String id : documents){
Map<String,String> expectedElement = new HashMap<>();
expectedElement.put("lastName", name);
expectedElement.put("id", id);
expectedList.add(expectedElement);
}
}
resultList.stream().forEach(i -> {
Map<String,String> actualElement = new HashMap<>();
actualElement.put("lastName", i.get("lastName"));
actualElement.put("id", i.get("id"));
actualList.add(actualElement);
});
Assertions.assertEquals(actualList, expectedList);
}
Update 2
So far I have managed to implement it like this, but I still have two steps and can't figure out whether it can be written any shorter.
private void checkResultList(List<Map<String, String>> resultList) {
List<String> persons = Arrays.asList("C", "D");
List<String> documents = Arrays.asList("A", "B");
List<Map<String, String>> expectedList = new ArrayList<>();
List<Map<String, String>> actualList = new ArrayList<>();
persons.forEach(i -> documents.forEach(k -> {
HashMap<String, String> e = new HashMap<>();
e.put("id", k);
e.put("lastName", i);
expectedList.add(e);
}));
resultList.stream().forEach(i -> {
Map<String,String> actualElement = new HashMap<>();
actualElement.put("lastName", i.get("lastName"));
actualElement.put("id", i.get("id"));
actualList.add(actualElement);
});
Assertions.assertEquals(actualList, expectedList);
}
I just learned this: how to flatMap the combinations into a list of maps:
List<Map<String, String>> expectedList =
documents
.stream()
.flatMap(d -> persons.stream().map(p -> Map.of("id", d, "lastName", p)))
.collect(toList());

Find the missing elements of a list not in the values of a hashmap

I want to group a list of hashmaps by year and then, based on a list of customer_ids, find the customer_ids that are missing from each group.
This is an example of the dataset.
List tagList = new ArrayList<>();
// Customer
HashMap<String, Object> customerMap = new HashMap<>();
// feeding data example
customerMap.put("date", "2018");
customerMap.put("name", "John");
customerMap.put("customer_no", "1a");
tagList.add(customerMap);
customer_id_list = ['1a', '2b', '3c']
customer_list = [
{
"date": "2019",
"name": "John",
"customer_id": "1a"
},
{
"date": "2019",
"name": "David",
"customer_id": "2b"
},
{
"date": "2020",
"name": "John",
"customer_id": "1a"
},
{
"date": "2020",
"name": "Alex",
"customer_id": "3c"
},
{
"date": "2021",
"name": "John",
"customer_id": "1a"
}
]
This is a sample output that I want.
missing_customer_list = [
{
"date": "2019",
"name": "Alex",
"customer_id": "3c"
},
{
"date": "2020",
"name": "David",
"customer_id": "2b"
},
{
"date": "2021",
"name": "David",
"customer_id": "2b"
},
{
"date": "2021",
"name": "Alex",
"customer_id": "3c"
}
]
Do you have any ideas how I can get this sample output using the Stream API?
If I cannot filter the list directly with streams, using a for loop is fine too.
I found out how to group by year, but I don't know how to handle the rest of the filtering:
List<Customer> result = customer_list.stream()
.collect(Collectors.groupingBy(Customer::getDate))
UPDATE
Referring to the current answers, I've got stuck at the point where I couldn't create the new instance of Customer.
So I'm planning to use for-loops to find the missing_customer_id while iterating over customer_id_list and filtered_customer_list.
Once I get the missing_customer_ids in a list, I will try to re-create the Customers manually and add them to a new list to print.
There are many different ways to do this. These solutions can become inefficient for large lists, because calling .contains() on a List is a linear-time operation, meaning doing so n times is quadratic.
List<String> customerIdList = Arrays.asList("a", "b", "c");
List<Customer> customerList = new ArrayList<>();
customerList.add(new Customer("2018", "a"));
customerList.add(new Customer("2018", "b"));
customerList.add(new Customer("2019", "b"));
customerList.add(new Customer("2019", "c"));
customerList.add(new Customer("2020", "a"));
customerList.add(new Customer("2020", "c"));
customerList.add(new Customer("2021", "a"));
Map<String, List<Customer>> collect = customerList.stream()
.collect(Collectors.groupingBy(Customer::getDate))
.entrySet().stream().map(notExistCustomerByYear(customerIdList))
.flatMap(Collection::stream)
.collect(Collectors.groupingBy(Customer::getDate));
The method that accomplishes what I really want is as follows:
private Function<Map.Entry<String, List<Customer>>, List<Customer>> notExistCustomerByYear(List<String> customerIdList) {
return e -> {
List<String> customerIds = e.getValue().stream().map(Customer::getCustomerId).collect(Collectors.toList());
List<String> notExistCustomerIdsInYear = customerIdList.stream().filter(id -> !customerIds.contains(id)).collect(Collectors.toList());
return notExistCustomerIdsInYear.stream().map(id -> new Customer(e.getKey(), id)).collect(Collectors.toList());
};
}
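To sidestep the quadratic .contains() cost mentioned above, one sketch keeps the seen ids per year in a Set; the minimal Customer record here is an assumption standing in for the class used in the answers:

```java
import java.util.*;
import java.util.stream.Collectors;

public class MissingCustomers {
    record Customer(String date, String customerId) {}

    static List<Customer> missing(List<Customer> customers, List<String> allIds) {
        // Collect the seen ids per year into Sets for O(1) membership checks
        Map<String, Set<String>> seenPerYear = customers.stream()
                .collect(Collectors.groupingBy(
                        Customer::date,
                        Collectors.mapping(Customer::customerId, Collectors.toSet())));
        return seenPerYear.entrySet().stream()
                .flatMap(e -> allIds.stream()
                        .filter(id -> !e.getValue().contains(id)) // constant-time lookup
                        .map(id -> new Customer(e.getKey(), id)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
                new Customer("2019", "1a"), new Customer("2019", "2b"),
                new Customer("2020", "1a"));
        System.out.println(missing(customers, List.of("1a", "2b", "3c")));
    }
}
```

The structure mirrors the grouping + flatMap approach below; only the downstream collector changes from toList() to toSet().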
What you describe is not filtering. According to your question and comments, you want a completely new list with combinations of year and customer-id which were not present in the original list.
First, you can group customer-ids per year to have a better representation of your original list.
Map<Integer, List<String>> customerIdPerYear = customerList.stream()
.collect(Collectors.groupingBy(
Customer::getYear,
Collectors.mapping(
Customer::getCustomerId,
Collectors.toList())));
System.out.println(customerIdPerYear);
// output: {2018=[a, b], 2019=[b, c], 2020=[a, c], 2021=[a]}
In a second step, you create a new list per year with the customer-ids not found in the original list. Finally, you can create new Customer objects and return a flattened list with flatMap.
List<Customer> missingCustomersPerYear = customerIdPerYear.entrySet().stream()
.flatMap(e -> customerIdList.stream()
.filter(id -> !e.getValue().contains(id))
.map(id -> new Customer(e.getKey(), id)))
.collect(Collectors.toList());
System.out.println(missingCustomersPerYear);
// output: [(2018, c), (2019, a), (2020, b), (2021, b), (2021, c)]
To be complete, here is the Customer class used for the above examples:
class Customer {
private int year;
private String customerId;
public Customer(final int year, final String customerId) {
this.year = year;
this.customerId = customerId;
}
public int getYear() {
return year;
}
public String getCustomerId() {
return customerId;
}
@Override
public String toString() {
return "(" + year + ", " + customerId + ")";
}
}

How to convert list of map to array of hash

I am using Java for my application. Here is my input data.
[{
"id": 1,
"firstname": "one",
"lastname": "1"
},
{
"id": 2,
"firstname": "two",
"lastname": "2"
},
{
"id": 3,
"firstname": "three",
"lastname": "3"
}
]
I want to convert the above input into the output below. How can I achieve this in an efficient manner?
{
["id", "firstname", "lastname"], [1, "one", "1"], [2, "two", "2"], [3, "three", "3"]
}
Update:
I have tried the code below, but the result differs from what I expected:
Expected:
result => {[lastname, id, firstname]=[[1, 1, one], [2, 2, two], [3, 3, three]]}
Actual:
result => {[lastname, id, firstname], [1, 1, one], [2, 2, two], [3, 3, three]}
Code:
Map<String, Object> one = Map.of("id", 1, "firstname", "one", "lastname", "1");
Map<String, Object> two = Map.of("id", 2, "firstname", "two", "lastname", "2");
Map<String, Object> three = Map.of("id", 3, "firstname", "three", "lastname", "3");
ArrayList<Map<String,Object>> list = new ArrayList<>();
list.add(one);
list.add(two);
list.add(three);
MultiValueMap<Object, Object> result = new LinkedMultiValueMap<>();
Set<String> strings = list.get(0).keySet();
ArrayList<Object> objects = new ArrayList<>();
for(Map<String,Object> map: list) {
objects.add(map.values());
}
result.put(strings, objects);
The input JSON may be read into a list of maps (data) sharing the same key sets, and this list is to be converted into a list of object arrays, with the first element of the result being the keys of a map.
So, at first an array of the field names should be created, converted to a Stream, and then merged with the Stream<Object[]> retrieved from the values of each map in the data list:
// using Jackson JSON to read the input
ObjectMapper mapper = new ObjectMapper();
String input = "[{\"id\":1, \"firstname\":\"First\", \"lastname\": \"Last\"}]";
List<Map<String, Object>> data = mapper.readValue(input, new TypeReference<>() {});
List<Object[]> output = Stream.concat(
Stream.<Object[]>of(data.get(0).keySet().toArray()),
data.stream().map(m -> m.values().toArray())
)
.collect(Collectors.toList());
System.out.printf("Result: {%n\t%s%n}%n",
output.stream().map(Arrays::toString).collect(Collectors.joining(", ")));
System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(output));
Output:
Result: {
[id, firstname, lastname], [1, First, Last]
}
[ [ "id", "firstname", "lastname" ], [ 1, "First", "Last" ] ]
Or, similarly, the result could be a list of raw collections based on the maps' keySet() and values(), created like this:
List<Collection> result = Stream.concat(
Stream.of(data.get(0).keySet()),
data.stream().map(Map::values)
).collect(Collectors.toList());
