how to format output of Join RDD using java


JavaPairRDD<String, Tuple2<Tuple2<String, Integer>, Double>> accountNew =
accountRecPair.join(accountCnt).join(accountSum);
( Key, (value))
------------------------------
(12,(ID1,12,1062.0,2),68605.0))
I would like my output without "(" and ")":
ID1,12,1062.0,2,68605.0

Since tuples are not collections (they are more like case classes), there is no easy way to flatten the structure. You have to explicitly map your result after each join to extract the data from the nested tuple structure and put it into a flat structure.
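The explicit unpacking can be sketched without a Spark dependency. This is only an illustration under assumptions: Pair is a hypothetical stand-in for scala.Tuple2, and FlattenJoinSketch and flatten are made-up names. With Spark, the same logic would sit inside the function passed to map.

```java
// Hypothetical stand-in for scala.Tuple2 so the sketch runs without Spark.
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

class FlattenJoinSketch {
    // Unpacks (key, ((id, count), sum)) into the flat line "id,count,sum".
    static String flatten(Pair<String, Pair<Pair<String, Integer>, Double>> rec) {
        Pair<Pair<String, Integer>, Double> outer = rec.second;
        Pair<String, Integer> inner = outer.first;
        return inner.first + "," + inner.second + "," + outer.second;
    }

    public static void main(String[] args) {
        Pair<String, Integer> inner = new Pair<>("ID1", 12);
        Pair<Pair<String, Integer>, Double> outer = new Pair<>(inner, 68605.0);
        Pair<String, Pair<Pair<String, Integer>, Double>> rec = new Pair<>("12", outer);
        System.out.println(flatten(rec)); // ID1,12,68605.0
    }
}
```

The join key (rec.first) is dropped here; prepend it to the returned string if it should appear in the output.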

Here is what I did to format the output:

JavaRDD<String> outputFile = accountNew.map(
    new Function<Tuple2<String, Tuple2<Tuple2<String, Integer>, Double>>, String>() {
        @Override
        public String call(Tuple2<String, Tuple2<Tuple2<String, Integer>, Double>> rec) {
            // rec._1() is the join key; it is left out of the output line.
            Tuple2<Tuple2<String, Integer>, Double> rec1 = rec._2();
            Tuple2<String, Integer> rec2 = rec1._1();
            return rec2._1() + "," + rec2._2() + "," + rec1._2();
        }
    });

Related

Store HashMap keys and values to two separate string variables in Java

I need to store all keys in a single String variable, with each key separated by a comma, and I need to do the same for all values.
Here is my code:
HashMap<String, Object> yourHashMap = new Gson().fromJson(dynamicJson, HashMap.class);
yourHashMap.forEach((k, v) -> {
//System.out.println("Key: " + k );
String result = k + ",";
System.out.println("Keys : "+result);
});
Actual output Keys : name,
message,
Expected output : Keys : name, message
Values : "Message1", "Message Content"
Using these outputs I'm going to create a CSV file that uses the keys as the header and the values as rows.

You can use Collectors.joining() to add comma separation
String keys = map.keySet().stream().collect(Collectors.joining(", "));
String values = map.values().stream().map(obj -> String.valueOf(obj)).collect(Collectors.joining(", "));
Main function:
public static void main(String[] args) {
    Map<String, String> map = new HashMap<>();
    map.put("key1", "val1");
    map.put("key2", "val2");
    map.put("key3", "val3");
    map.put("key4", "val4");
    String keys = map.keySet().stream().collect(Collectors.joining(", "));
    String values = map.values().stream().map(obj -> String.valueOf(obj)).collect(Collectors.joining(", "));
    System.out.println("Keys: " + keys);
    System.out.println("Values: " + values);
}
Output:
Keys: key1, key2, key3, key4
Values: val1, val2, val3, val4

HashMap<String, Object> yourHashMap = new Gson().fromJson(dynamicJson, HashMap.class);
LinkedHashSet<String> keys = new LinkedHashSet<>();
// a List rather than a Set, so that repeated values are not dropped
List<String> values = new ArrayList<>();
yourHashMap.forEach((k, v) -> {
    keys.add(k);
    values.add(String.valueOf(v)); // values are Objects, so convert them explicitly
});
System.out.println("keys: " + String.join(",", keys)
        + "\nvalues: " + String.join(",", values));

Use String.join
HashMap<String, String> yourHashMap = ....
String keys = String.join(",", yourHashMap.keySet());
String values = String.join(",", yourHashMap.values());
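For the CSV use case mentioned in the question, a minimal sketch might look like the following. The class name CsvLineSketch and its two helper methods are made up for illustration, and a LinkedHashMap is assumed so that the header and the row come out in insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class CsvLineSketch {
    // Joins the map's keys into a comma-separated header line.
    static String header(Map<String, Object> row) {
        return String.join(",", row.keySet());
    }

    // Joins the map's values into a data line, converting each value to a String.
    static String dataLine(Map<String, Object> row) {
        return row.values().stream()
            .map(v -> String.valueOf(v))
            .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Message1");
        row.put("message", "Message Content");
        System.out.println(header(row));   // name,message
        System.out.println(dataLine(row)); // Message1,Message Content
    }
}
```

This sketch does no quoting or escaping, so values that themselves contain commas, quotes or newlines would need extra handling before being written to a real CSV file.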

Convert nested for loop into java8 stream

I am trying to convert the code below, which uses nested for loops, to Java 8 streams.
I have tried to stream the outer loop, but I am not sure how to write the condition and assign a variable there.
final Map<String, String> events = new HashMap<>();
for (final Event s : result.getEvents()) {
String eventDetail = "";
for (final Data d : s.getData()) {
if (StringUtils.isNotEmpty(d.getValue()) && StringUtils.isNotEmpty(eventDetail)) {
eventDetail = eventDetail + "-" + d.getValue();
} else {
eventDetail = eventDetail + d.getValue();
}
}
events.put(s.getReferenceID(), eventDetail);
}
The result should be a map.

It looks like your goal is to concatenate the value members of the Data instances of each
Event into a "-" separated String, and map this String to the Event's reference ID.
This can be done with Collectors.joining():
Map<String, String> events = result.getEvents()
    .stream()
    .map(s -> new SimpleEntry<>(
        s.getReferenceID(),
        s.getData().stream().map(Data::getValue).collect(Collectors.joining("-"))))
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
or, if you wish to eliminate empty values:
Map<String, String> events = result.getEvents()
    .stream()
    .map(s -> new SimpleEntry<>(
        s.getReferenceID(),
        s.getData().stream().map(Data::getValue).filter(StringUtils::isNotEmpty).collect(Collectors.joining("-"))))
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

Is this what you want?
Function<Event, String> mapper = event -> event.getData().stream()
    .map(Data::getValue)
    .filter(StringUtils::isNotEmpty)
    // joining("-") puts the separator only between values;
    // reduce("", (a, b) -> a + "-" + b) would produce a leading "-"
    .collect(Collectors.joining("-"));
final Map<String, String> events = result.getEvents().stream()
    .collect(Collectors.toUnmodifiableMap(Event::getReferenceID, mapper)); // requires Java 10+
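To try the Collectors.joining() approach in isolation, here is a self-contained sketch. The Event and Data classes below are hypothetical minimal stand-ins for the question's types (only the getters used in the question are modelled), and a plain null/empty check replaces StringUtils so that no extra library is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EventJoinSketch {
    // Hypothetical stand-in for the question's Data type.
    static final class Data {
        private final String value;
        Data(String value) { this.value = value; }
        String getValue() { return value; }
    }

    // Hypothetical stand-in for the question's Event type.
    static final class Event {
        private final String referenceID;
        private final List<Data> data;
        Event(String referenceID, List<Data> data) {
            this.referenceID = referenceID;
            this.data = data;
        }
        String getReferenceID() { return referenceID; }
        List<Data> getData() { return data; }
    }

    // Maps each reference ID to its non-empty values joined with "-".
    static Map<String, String> toEventMap(List<Event> events) {
        return events.stream().collect(Collectors.toMap(
            Event::getReferenceID,
            e -> e.getData().stream()
                .map(Data::getValue)
                .filter(v -> v != null && !v.isEmpty())
                .collect(Collectors.joining("-"))));
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("A", Arrays.asList(new Data("x"), new Data(""), new Data("y"))),
            new Event("B", Arrays.asList(new Data("z"))));
        Map<String, String> result = toEventMap(events);
        System.out.println(result.get("A")); // x-y
        System.out.println(result.get("B")); // z
    }
}
```

One difference from the original loop: the two-argument Collectors.toMap throws an IllegalStateException when two events share a reference ID, whereas events.put(...) silently overwrites. A merge function can be passed as a third argument if overwriting is wanted.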

Swapping keys of nested maps in a list using Java 8

I have an object with structure
List<Map<String(k1), Map<String(k2), String(v2)>>>
I need to convert the above list to
List<Map<String(k2), Map<String(k1), String(v2)>>>
I am stuck on how to get at the nested map using a construct like
serviceResults.stream().map((k, v) -> ????)
that would allow me to swap the keys. Is it possible to do this with Java 8 streams, without using loops?
Additional Info
This is the code that uses loop construct
List<Map<String, Map<String, String>>> serviceResults = new ArrayList<>();
//Populate the above list
Map<String, Map<String, String>> swpMapOuter = new HashMap<>();
Map<String, String> swpMapInner = new HashMap<>();
for (Map<String, Map<String, String>> stringMapMap : serviceResults) {
for (Map.Entry<String, Map<String, String>> s : stringMapMap.entrySet()) {
String key1 = s.getKey();
Map<String, String> value1 = s.getValue();
for (Map.Entry<String, String> s1 : value1.entrySet()) {
String key2 = s1.getKey();
String value2 = s1.getValue();
swpMapInner.put(key1, value2);
swpMapOuter.put(key2, swpMapInner);
}
}
}
System.out.println("swpMapOuter " + swpMapOuter);
Below is the code with forEach instead of for loops; I was wondering whether it could be implemented using Stream constructs.
Map<String, Map<String, String>> swpMapOuter2 = new HashMap<>();
Map<String, String> swpMapInner2 = new HashMap<>();
serviceResults.forEach((stringMapMap) -> {
stringMapMap.entrySet().forEach((s) -> {
String key1 = s.getKey();
Map<String, String> value1 = s.getValue();
value1.entrySet().forEach((s1) -> {
String key2 = s1.getKey();
String value2 = s1.getValue();
swpMapInner2.put(key1, value2);
swpMapOuter2.put(key2, swpMapInner2);
});
});
});
System.out.println("swpMapOuter2 " + swpMapOuter2);
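One possible stream-based approach, offered as a sketch rather than a definitive answer: flatten every (k1, k2, v2) combination into a small array, then group by k2 and collect k1 -> v2 into the inner map. The class name SwapKeysSketch and the method swap are made up for illustration, and the result is assumed to be a single merged map, as in the loop code above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SwapKeysSketch {
    // Turns a list of {k1 -> {k2 -> v2}} maps into one {k2 -> {k1 -> v2}} map.
    static Map<String, Map<String, String>> swap(List<Map<String, Map<String, String>>> input) {
        return input.stream()
            .flatMap(outer -> outer.entrySet().stream())
            // one {k2, k1, v2} triple per innermost entry
            .flatMap(e1 -> e1.getValue().entrySet().stream()
                .map(e2 -> new String[] {e2.getKey(), e1.getKey(), e2.getValue()}))
            .collect(Collectors.groupingBy(
                t -> t[0],
                // on a repeated (k2, k1) pair, keep the later value, like put() would
                Collectors.toMap(t -> t[1], t -> t[2], (first, second) -> second)));
    }

    public static void main(String[] args) {
        Map<String, String> innerA = new HashMap<>();
        innerA.put("x", "1");
        innerA.put("y", "2");
        Map<String, Map<String, String>> first = new HashMap<>();
        first.put("a", innerA);

        Map<String, String> innerB = new HashMap<>();
        innerB.put("x", "3");
        Map<String, Map<String, String>> second = new HashMap<>();
        second.put("b", innerB);

        Map<String, Map<String, String>> swapped = swap(Arrays.asList(first, second));
        System.out.println(swapped.get("x")); // inner map holding a -> 1 and b -> 3
        System.out.println(swapped.get("y")); // inner map holding a -> 2
    }
}
```

Note that this builds a separate inner map for each k2, whereas the loop versions above reuse one swpMapInner instance for every key, so the two are not strictly equivalent. Collectors.toMap also rejects null values, so the sketch assumes v2 is never null.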

Using StringJoiner in complex HashMaps

I have a list of Maps as below:
List<Map<String,Object>> someObjectsList = new ArrayList<Map<String,Object>>();
I am storing the following data in each HashMap
key value
2017-07-21 2017-07-21-07.33.28.429340
2017-07-24 2017-07-24-01.23.33.591340
2017-07-24 2017-07-24-01.23.33.492340
2017-07-21 2017-07-21-07.33.28.429540
I want to iterate through the list of HashMaps and check whether a key matches the first 10 characters of any HashMap value; if it does, I want to store those keys and values in the following format, using a comma as the delimiter. The ultimate aim is to group the unique keys of the HashMaps and their related values (those whose first 10 characters match the key) in a new HashMap.
key value
2017-07-21 2017-07-21-07.33.28.429340,2017-07-21-07.33.28.429540
2017-07-24 2017-07-24-01.23.33.591340,2017-07-24-01.23.33.492340
I am trying the following Java code using StringJoiner, but I am not getting the expected results. Any clue on how to frame the logic here?
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
public class SampleOne {
public static void main(String[] args) {
// TODO Auto-generated method stub
List<Map<String, Object>> someObjectsList = new ArrayList<Map<String, Object>>();
Map<String, Object> mapOne = new HashMap<String, Object>();
mapOne.put("2017-07-21", "2017-07-21-07.33.28.429340");
Map<String, Object> mapTwo = new HashMap<String, Object>();
mapTwo.put("2017-07-24", "2017-07-24-01.23.33.591340");
Map<String, Object> mapThree = new HashMap<String, Object>();
mapThree.put("2017-07-24", "2017-07-24-01.23.33.492340");
Map<String, Object> mapFour = new HashMap<String, Object>();
mapFour.put("2017-07-21", "2017-07-21-07.33.28.429540");
someObjectsList.add(mapOne);
someObjectsList.add(mapTwo);
someObjectsList.add(mapThree);
someObjectsList.add(mapFour);
for (Map map : someObjectsList) {
StringJoiner sj = new StringJoiner(",");
for (Object key : map.keySet()) {
String value = ((String) map.get(key));
String date = value.substring(0, Math.min(value.length(), 10));
//System.out.println(str);
//System.out.println(value);
if(key.equals(date)) {
sj.add(value);
System.out.println(sj.toString());
}
}
}
}
}
output:
2017-07-21-07.33.28.429340
2017-07-24-01.23.33.591340
2017-07-24-01.23.33.492340
2017-07-21-07.33.28.429540

Make use of the .merge function:
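Map<String, Object> finalMap = new HashMap<String, Object>();
for (Map<String, Object> map : someObjectsList) {
    for (Map.Entry<String, Object> entry : map.entrySet()) {
        // the remapping function receives the value already stored and the new value
        finalMap.merge(entry.getKey(), entry.getValue(),
                (oldValue, newValue) -> oldValue + "," + newValue);
    }
}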
which outputs:
{2017-07-21=2017-07-21-07.33.28.429340,2017-07-21-07.33.28.429540,
2017-07-24=2017-07-24-01.23.33.591340,2017-07-24-01.23.33.492340}
The same can be achieved by the following one-liner:
someObjectsList.stream()
    .flatMap(i -> i.entrySet().stream())
    .collect(Collectors.toMap(Entry::getKey, Entry::getValue,
        (oldValue, newValue) -> oldValue + "," + newValue));

In your code, a new StringJoiner instance is created for each map, so values from different maps are never joined together.
You can accumulate the values per key in a separate map instead. Example code:
(Edit: I did not remove your StringJoiner part.)
public static void main(String[] args) {
// TODO Auto-generated method stub
List<Map<String, Object>> someObjectsList = new ArrayList<Map<String, Object>>();
Map<String, Object> mapOne = new HashMap<String, Object>();
mapOne.put("2017-07-21", "2017-07-21-07.33.28.429340");
Map<String, Object> mapTwo = new HashMap<String, Object>();
mapTwo.put("2017-07-24", "2017-07-24-01.23.33.591340");
Map<String, Object> mapThree = new HashMap<String, Object>();
mapThree.put("2017-07-24", "2017-07-24-01.23.33.492340");
Map<String, Object> mapFour = new HashMap<String, Object>();
mapFour.put("2017-07-21", "2017-07-21-07.33.28.429540");
someObjectsList.add(mapOne);
someObjectsList.add(mapTwo);
someObjectsList.add(mapThree);
someObjectsList.add(mapFour);
Map<String, Object> outputMap = new HashMap<String, Object>();
for (Map map : someObjectsList) {
StringJoiner sj = new StringJoiner(",");
for (Object key : map.keySet()) {
String value = ((String) map.get(key));
String date = value.substring(0, Math.min(value.length(), 10));
//System.out.println(str);
//System.out.println(value);
if(key.equals(date)) {
sj.add(value);
System.out.println(sj.toString());
if(outputMap.containsKey(key)) {
// append to the value already stored in outputMap, not to the one in the current map
String str = (String) outputMap.get(key);
str = str + "," + value;
outputMap.put((String)key, str);
} else {
outputMap.put((String)key, value);
}
}
}
}
for (String map : outputMap.keySet()) {
System.out.println(map + " " + outputMap.get(map));
}
}

You are looking for grouping behaviour when processing a List. Since Java 8 you can use the Stream API for this. In either case, you need a new Map to store the grouped values before printing them:
Map<String, List<Object>> map = someObjectsList.stream()
    .flatMap(i -> i.entrySet().stream()) // flat-mapping to entries
    .collect(Collectors.groupingBy(Entry::getKey, // grouping them by key
        Collectors.mapping(Entry::getValue, Collectors.toList()))); // keeping only the values
If you prefer for-loops, it takes more code, since each List item may contain several entries:
final Map<String, List<Object>> map = new HashMap<>();
for (Map<String, Object> m: someObjectsList) { // iterate List<Map>
for (Entry<String, Object> entry: m.entrySet()) { // iterate entries of each Map
List<Object> list;
final String key = entry.getKey(); // key of the entry
final Object value = entry.getValue(); // value of the entry
if (map.containsKey(key)) { // if the key exists
list = map.get(key); // ... use it
} else {
list = new ArrayList<>(); // ... or else create a new one
}
list.add(value); // add the new value
map.put(key, list); // and add/update the entry
}
}
Printing the Map<String, List<Object>> map will produce the following output in both cases:
2017-07-21=[2017-07-21-07.33.28.429340, 2017-07-21-07.33.28.429540],
2017-07-24=[2017-07-24-01.23.33.591340, 2017-07-24-01.23.33.492340]
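If the comma-separated String from the question is wanted rather than a List, the grouping can be combined with Collectors.joining(). The following is a minimal sketch; GroupJoinSketch and groupAndJoin are made-up names, and the TreeMap is only an assumption to get a predictable key order when printing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupJoinSketch {
    // Groups all entries by key and joins the values of each key with commas.
    static Map<String, String> groupAndJoin(List<Map<String, Object>> maps) {
        return maps.stream()
            .flatMap(m -> m.entrySet().stream())
            .collect(Collectors.groupingBy(
                Map.Entry::getKey,
                TreeMap::new, // sorted keys, chosen here only for stable printing
                Collectors.mapping(
                    e -> String.valueOf(e.getValue()),
                    Collectors.joining(","))));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> someObjectsList = Arrays.asList(
            Collections.<String, Object>singletonMap("2017-07-21", "2017-07-21-07.33.28.429340"),
            Collections.<String, Object>singletonMap("2017-07-24", "2017-07-24-01.23.33.591340"),
            Collections.<String, Object>singletonMap("2017-07-24", "2017-07-24-01.23.33.492340"),
            Collections.<String, Object>singletonMap("2017-07-21", "2017-07-21-07.33.28.429540"));
        groupAndJoin(someObjectsList)
            .forEach((key, joined) -> System.out.println(key + " " + joined));
    }
}
```

The sketch groups purely by key. If only values that start with their key should be kept, a filter on the entries could be added before the collect step.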

Any reason you're using Object over String and avoiding safety checks? That said, it's not "the first 10 characters": you want to check whether the value starts with the key, full stop (all your keys happen to be 10 characters long). In that case you can just use if (value.startsWith(key)) { ... }. Don't forget your newlines if the StringJoiner wasn't full. Lastly, you don't need a List; a Map can hold multiple keys at once. An alternative way of doing it:
// LinkedHashMap will preserve our insertion order
Map<String, String> map = new LinkedHashMap<>();
map.put("2017-07-21", "2017-07-21-07.33.28.429340");
map.put("2017-07-24", "2017-07-24-01.23.33.591340");
// note: putting an existing key again overwrites the earlier value
map.put("2017-07-24", "2017-07-24-01.23.33.492340");
map.put("2017-07-21", "2017-07-21-07.33.28.429540");
// You can also use Java 8 streams for the concatenation
// but I left it simple
List<String> matches = map.entrySet().stream()
    .filter(e -> e.getValue().startsWith(e.getKey()))
    .map(Map.Entry::getValue)
    .collect(Collectors.toList());
String concatenated = String.join("\n", matches);
If you wanted to generate that string without streams, it would look like this (again, not using #entrySet for simplicity, although it would be more efficient here):
StringJoiner joiner = new StringJoiner("\n");
for (String key : map.keySet()) {
    String value = map.get(key);
    if (value.startsWith(key)) {
        joiner.add(value);
    }
}
// joiner#toString will give the expected result

Convert Spark Java to Spark scala

I am trying to convert my Java code to scala in Spark, but found it very complicated. Is it possible to convert the following Java code to scala? Thanks!
JavaPairRDD<String,Tuple2<String,String>> newDataPair = newRecords.mapToPair(new PairFunction<String, String, Tuple2<String, String>>() {
private static final long serialVersionUID = 1L;
@Override
public Tuple2<String, Tuple2<String, String>> call(String t) throws Exception {
MyPerson p = (new Gson()).fromJson(t, MyPerson.class);
String nameAgeKey = p.getName() + "_" + p.getAge() ;
Tuple2<String, String> value = new Tuple2<String, String>(p.getNationality(), t);
Tuple2<String, Tuple2<String, String>> kvp =
new Tuple2<String, Tuple2<String, String>>(nameAgeKey.toLowerCase(), value);
return kvp;
}
});
I tried the following, but I am sure I have missed many things. And actually it is not clear to me how to do the override function in scala ... Please suggest or share some examples. Thank you!
val newDataPair = newRecords.mapToPair(new PairFunction<String, String, Tuple2<String, String>>() {
@Override
public val call(String t) throws Exception {
val p = (new Gson()).fromJson(t, MyPerson.class);
val nameAgeKey = p.getName() + "_" + p.getAge() ;
val value = new Tuple2<String, String>(p.getNationality(), t);
val kvp =
new Tuple2<String, Tuple2<String, String>>(nameAgeKey.toLowerCase(), value);
return kvp;
}
});

Literal translations from Spark-Java to Spark-Scala typically don't work, because Spark-Java introduces many artifacts to cope with the limited type system in Java. Examples in this case: mapToPair in Java is just map in Scala, and Tuple2 has the terser syntax (a, b).
Applying that (and some more) to the snippet:
val newDataPair = newRecords.map{t =>
val p = (new Gson()).fromJson(t, classOf[MyPerson])
val nameAgeKey = p.getName + "_" + p.getAge
val value = (p.getNationality(), t)
(nameAgeKey.toLowerCase(), value)
}
It could be made a bit more concise but I wanted to keep the same structure as the Java counterpart to facilitate the understanding of it.
