How to load data in nested array using dataflow - java

I am trying to load data into the table below. I can load data into "array_data",
but how do I load data into the nested array "inside_array"? I tried the commented-out part to populate inside_array, but it did not work.
Here is my code:
Pipeline p = Pipeline.create(options);
org.apache.beam.sdk.values.PCollection<TableRow> output = p.apply(org.apache.beam.sdk.transforms.Create.of("temp"))
.apply("O/P",ParDo.of(new DoFn<String, TableRow>() {
/**
*
*/
private static final long serialVersionUID = 307542945272055650L;
@ProcessElement
public void processElement(ProcessContext c) {
TableRow row = new TableRow();
row.set("name","Jack");
row.set("phone","9874563210");
TableRow ip = new TableRow().set("address", "M G Road").set("email","abc@gmail.com");
TableRow ip1 = new TableRow().set("address","F C Road").set("email","xyz@gmail.com");
java.util.List<TableRow> metadata = new ArrayList<TableRow>();
metadata.add(ip);
metadata.add(ip1);
row.set("array_data",metadata);
LOG.info("O/P:"+row);
c.output(row);
}}));
output.apply("Write to table",BigQueryIO.writeTableRows().withoutValidation().to("AA.nested_array")
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
p.run();
Does anyone have a clue or a suggestion? Thanks in advance.

To handle a nested array in Dataflow, build a separate List<TableRow> for the inner array and set it as a field on each TableRow of the outer array.
I tried it this way and got the expected output:
Pipeline p = Pipeline.create(options);
org.apache.beam.sdk.values.PCollection<TableRow> output = p.apply(org.apache.beam.sdk.transforms.Create.of("temp"))
.apply("O/P",ParDo.of(new DoFn<String, TableRow>() {
@ProcessElement
public void processElement(ProcessContext c) {
TableRow row = new TableRow();
row.set("name","Jack");
row.set("phone","9874563210");
List<TableRow> listDest = new ArrayList<>();
TableRow t=new TableRow().set("detail1","one" ).set("detail2", "two");
TableRow t1=new TableRow().set("detail1","three" ).set("detail2", "four");
listDest.add(t);
listDest.add(t1);
TableRow ip = new TableRow().set("address", "M G Road").set("email","abc@gmail.com").set("inside_array", listDest);
TableRow ip1 = new TableRow().set("address","F C Road").set("email","xyz@gmail.com").set("inside_array", listDest);
java.util.List<TableRow> metadata = new ArrayList<TableRow>();
metadata.add(ip);
metadata.add(ip1);
row.set("array_data",metadata);
LOG.info("O/P:"+row);
c.output(row);
}}));
I hope this helps anyone working with the same kind of table.
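To make the shape of such a row explicit, here is a minimal sketch of the same nesting using only plain Java collections. It does not use Beam at all: `LinkedHashMap` stands in for `TableRow` (which is itself a map from field names to values), and the names `NestedRowSketch`, `record`, and `buildRow` are hypothetical. A repeated record is a `List` of such maps, and a nested repeated record is simply a `List` stored as a field of each element of the outer list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedRowSketch {

    // Builds one record from alternating field names and values.
    static Map<String, Object> record(Object... fieldsAndValues) {
        Map<String, Object> rec = new LinkedHashMap<>();
        for (int i = 0; i < fieldsAndValues.length; i += 2) {
            rec.put((String) fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return rec;
    }

    static Map<String, Object> buildRow() {
        // Innermost repeated record: the elements of "inside_array".
        List<Map<String, Object>> inside = new ArrayList<>();
        inside.add(record("detail1", "one", "detail2", "two"));
        inside.add(record("detail1", "three", "detail2", "four"));

        // Outer repeated record: each element carries its own copy of the inner list.
        List<Map<String, Object>> arrayData = new ArrayList<>();
        arrayData.add(record("address", "M G Road", "email", "abc@gmail.com",
                "inside_array", new ArrayList<Map<String, Object>>(inside)));
        arrayData.add(record("address", "F C Road", "email", "xyz@gmail.com",
                "inside_array", new ArrayList<Map<String, Object>>(inside)));

        // Top-level row: scalar fields plus the outer repeated record.
        return record("name", "Jack", "phone", "9874563210", "array_data", arrayData);
    }

    public static void main(String[] args) {
        System.out.println(buildRow());
    }
}
```

In the pipeline itself the same structure would be built with `TableRow` and `List<TableRow>`, as in the answer above.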

Related

Getting latest data from AWS custom Cloudwatch in Java

I have a custom metric in AWS CloudWatch, and I am putting data into it through the AWS Java API.
for(int i =0;i<collection.size();i++){
String[] cell = collection.get(i).split("\\|\\|");
List<Dimension> dimensions = new ArrayList<>();
dimensions.add(new Dimension().withName(dimension[0]).withValue(cell[0]));
dimensions.add(new Dimension().withName(dimension[1]).withValue(cell[1]));
MetricDatum datum = new MetricDatum().withMetricName(metricName)
.withUnit(StandardUnit.None)
.withValue(Double.valueOf(cell[2]))
.withDimensions(dimensions);
PutMetricDataRequest request = new PutMetricDataRequest().withNamespace(namespace+"_"+cell[3]).withMetricData(datum);
String response = String.valueOf(cw.putMetricData(request));
GetMetricDataRequest res = new GetMetricDataRequest().withMetricDataQueries();
//cw.getMetricData();
com.amazonaws.services.cloudwatch.model.Metric m = new com.amazonaws.services.cloudwatch.model.Metric();
m.setMetricName(metricName);
m.setDimensions(dimensions);
m.setNamespace(namespace);
MetricStat ms = new MetricStat().withMetric(m);
MetricDataQuery metricDataQuery = new MetricDataQuery();
metricDataQuery.withMetricStat(ms);
metricDataQuery.withId("m1");
List<MetricDataQuery> mqList = new ArrayList<MetricDataQuery>();
mqList.add(metricDataQuery);
res.withMetricDataQueries(mqList);
GetMetricDataResult result1= cw.getMetricData(res);
}
Now I want to fetch the latest data entered for a particular combination of namespace, metric name, and dimensions through the Java API. I cannot find appropriate AWS documentation for this. Can anyone please help?
I got the results from CloudWatch with the code below.
GetMetricDataRequest getMetricDataRequest = new GetMetricDataRequest().withMetricDataQueries();
Integer integer = Integer.valueOf(300); // period in seconds; new Integer(int) is deprecated
Iterator<Map.Entry<String, String>> entries = dimensions.entrySet().iterator();
List<Dimension> dList = new ArrayList<Dimension>();
while (entries.hasNext()) {
Map.Entry<String, String> entry = entries.next();
dList.add(new Dimension().withName(entry.getKey()).withValue(entry.getValue()));
}
com.amazonaws.services.cloudwatch.model.Metric metric = new com.amazonaws.services.cloudwatch.model.Metric();
metric.setNamespace(namespace);
metric.setMetricName(metricName);
metric.setDimensions(dList);
MetricStat ms = new MetricStat().withMetric(metric)
.withPeriod(integer)
.withUnit(StandardUnit.None)
.withStat("Average");
MetricDataQuery metricDataQuery = new MetricDataQuery().withMetricStat(ms)
.withId("m1");
List<MetricDataQuery> mqList = new ArrayList<>();
mqList.add(metricDataQuery);
getMetricDataRequest.withMetricDataQueries(mqList);
long timestamp = 1536962700000L;
long timestampEnd = 1536963000000L;
Date d = new Date(timestamp );
Date dEnd = new Date(timestampEnd );
getMetricDataRequest.withStartTime(d);
getMetricDataRequest.withEndTime(dEnd);
GetMetricDataResult result1= cw.getMetricData(getMetricDataRequest);
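The answer above hard-codes the time window and stops at the raw result. As a rough sketch of the remaining two steps, using only the JDK: `windowStart` derives the start time from an end time and a period instead of fixed epoch values, and `latestValue` picks the most recent point from a pair of parallel timestamp/value lists. The class and method names are hypothetical; with the AWS SDK the two lists would presumably come from `getTimestamps()` and `getValues()` on the `MetricDataResult` for query id `m1`. The helper does not assume the points arrive in any particular order.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.OptionalDouble;

class LatestDatapoint {

    // Start of a window that covers `periods` periods of `periodSeconds` each, ending at `end`.
    static Date windowStart(Date end, int periodSeconds, int periods) {
        return new Date(end.getTime() - 1000L * periodSeconds * periods);
    }

    // Returns the value whose timestamp is the most recent, or empty if there are no points.
    static OptionalDouble latestValue(List<Date> timestamps, List<Double> values) {
        int n = Math.min(timestamps.size(), values.size());
        int latest = -1;
        for (int i = 0; i < n; i++) {
            if (latest < 0 || timestamps.get(i).after(timestamps.get(latest))) {
                latest = i;
            }
        }
        return latest < 0 ? OptionalDouble.empty() : OptionalDouble.of(values.get(latest));
    }

    public static void main(String[] args) {
        Date end = new Date();
        Date start = windowStart(end, 300, 12); // last hour in 5-minute periods
        System.out.println("Query window: " + start + " .. " + end);

        // Stand-in data for the lists a metric query would return.
        List<Date> ts = Arrays.asList(new Date(1000L), new Date(3000L), new Date(2000L));
        List<Double> vs = Arrays.asList(1.0, 3.0, 2.0);
        System.out.println("Latest value: " + latestValue(ts, vs));
    }
}
```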

Java BigQuery API to list table data

I am trying to list table data from BigQuery using Java, but I cannot find how to configure the API to set the maximum number of rows returned per call.
public class QuickstartSample {
public static void main(String... args) throws Exception {
GoogleCredentials credentials;
File credentialsPath = new File("/Users/gaurang.shah/Downloads/fb3735b731b9.json"); // TODO: update to your key path.
FileInputStream serviceAccountStream = new FileInputStream(credentialsPath);
credentials = ServiceAccountCredentials.fromStream(serviceAccountStream);
BigQuery bigquery = BigQueryOptions.newBuilder().
setCredentials(credentials).
setProjectId("bigquery-public-data").
build().
getService();
Dataset hacker_news = bigquery.getDataset("hacker_news");
Table comments = hacker_news.get("comments");
TableResult result = comments.list();
for (FieldValueList row : result.iterateAll()) {
// do something with the row
System.out.println(row);
}
}
}
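To limit the number of rows returned per call, use the listTableData method with the TableDataListOption.pageSize(n) option.
The following example fetches 100 rows per page. Note that iterateAll() keeps requesting further pages until the table is exhausted; to process only the current page, iterate over tableData.getValues() instead: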
String datasetName = "my_dataset_name";
String tableName = "my_table_name";
TableId tableIdObject = TableId.of(datasetName, tableName);
TableResult tableData =
bigquery.listTableData(tableIdObject, TableDataListOption.pageSize(100));
for (FieldValueList row : tableData.iterateAll()) {
// do something with the row
}

Custom DataProvider Nattable

I create a NatTable as shown below, but I can only access the cells through the getters and setters of my Student class. How else can I access cells? Should I write my own body data provider, that is, implement IDataProvider myself? If so, could someone give an example of implementing such a provider?
final ColumnGroupModel columnGroupModel = new ColumnGroupModel();
ColumnHeaderLayer columnHeaderLayer;
String[] propertyNames = { "name", "groupNumber", "examName", "examMark" };
Map<String, String> propertyToLabelMap = new HashMap<String, String>();
propertyToLabelMap.put("name", "Full Name");
propertyToLabelMap.put("groupNumber", "Group");
propertyToLabelMap.put("examName", "Name");
propertyToLabelMap.put("examMark", "Mark");
DefaultBodyDataProvider<Student> bodyDataProvider = new DefaultBodyDataProvider<Student>(students,
propertyNames);
ColumnGroupBodyLayerStack bodyLayer = new ColumnGroupBodyLayerStack(new DataLayer(bodyDataProvider),
columnGroupModel);
DefaultColumnHeaderDataProvider defaultColumnHeaderDataProvider = new DefaultColumnHeaderDataProvider(
propertyNames, propertyToLabelMap);
DefaultColumnHeaderDataLayer columnHeaderDataLayer = new DefaultColumnHeaderDataLayer(
defaultColumnHeaderDataProvider);
columnHeaderLayer = new ColumnHeaderLayer(columnHeaderDataLayer, bodyLayer, bodyLayer.getSelectionLayer());
ColumnGroupHeaderLayer columnGroupHeaderLayer = new ColumnGroupHeaderLayer(columnHeaderLayer,
bodyLayer.getSelectionLayer(), columnGroupModel);
columnGroupHeaderLayer.addColumnsIndexesToGroup("Exams", 2, 3);
columnGroupHeaderLayer.setGroupUnbreakable(2);
final DefaultRowHeaderDataProvider rowHeaderDataProvider = new DefaultRowHeaderDataProvider(bodyDataProvider);
DefaultRowHeaderDataLayer rowHeaderDataLayer = new DefaultRowHeaderDataLayer(rowHeaderDataProvider);
ILayer rowHeaderLayer = new RowHeaderLayer(rowHeaderDataLayer, bodyLayer, bodyLayer.getSelectionLayer());
final DefaultCornerDataProvider cornerDataProvider = new DefaultCornerDataProvider(
defaultColumnHeaderDataProvider, rowHeaderDataProvider);
DataLayer cornerDataLayer = new DataLayer(cornerDataProvider);
ILayer cornerLayer = new CornerLayer(cornerDataLayer, rowHeaderLayer, columnGroupHeaderLayer);
GridLayer gridLayer = new GridLayer(bodyLayer, columnGroupHeaderLayer, rowHeaderLayer, cornerLayer);
NatTable table = new NatTable(shell, gridLayer, true);
As answered in your previous question, "How do I fix NullPointerException and putting data into NatTable", this is explained in the NatTable Getting Started Tutorial.
If you need sample code, try the NatTable Examples Application.
Also, judging from your previous question, your data structure does not fit a table: you have nested objects whose child objects are stored in an array, so it is a tree rather than a table.
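One possible way to make such nested data fit a table, sketched here under stated assumptions, is to flatten it so that each student/exam pair becomes its own row. The `Student`, `Exam`, and `ExamRow` classes below are hypothetical stand-ins (the original `Student` class is not shown), and nothing here uses the NatTable API. The result is a flat list with one bean per row and one getter per column property, using the same property names as the question (`name`, `groupNumber`, `examName`, `examMark`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FlattenStudents {

    // Hypothetical nested model: a student owns an array of exams.
    static class Exam {
        final String name;
        final int mark;
        Exam(String name, int mark) { this.name = name; this.mark = mark; }
    }

    static class Student {
        final String name;
        final int groupNumber;
        final Exam[] exams;
        Student(String name, int groupNumber, Exam... exams) {
            this.name = name;
            this.groupNumber = groupNumber;
            this.exams = exams;
        }
    }

    // Flat row bean: one instance per student/exam pair, one getter per column.
    static class ExamRow {
        private final String name;
        private final int groupNumber;
        private final String examName;
        private final int examMark;
        ExamRow(Student s, Exam e) {
            this.name = s.name;
            this.groupNumber = s.groupNumber;
            this.examName = e.name;
            this.examMark = e.mark;
        }
        public String getName() { return name; }
        public int getGroupNumber() { return groupNumber; }
        public String getExamName() { return examName; }
        public int getExamMark() { return examMark; }
    }

    // Turns the tree (students with exams) into table rows.
    static List<ExamRow> flatten(List<Student> students) {
        List<ExamRow> rows = new ArrayList<>();
        for (Student s : students) {
            for (Exam e : s.exams) {
                rows.add(new ExamRow(s, e));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Ann", 1, new Exam("Math", 5), new Exam("Physics", 4)),
                new Student("Bob", 2, new Exam("Math", 3)));
        for (ExamRow r : flatten(students)) {
            System.out.println(r.getName() + " | " + r.getGroupNumber()
                    + " | " + r.getExamName() + " | " + r.getExamMark());
        }
    }
}
```

The trade-off is that student fields repeat on every exam row; if that repetition is unwanted, a tree-style presentation is the better fit, as the answer suggests.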

How to pass csv mapped bean class to Dataset

I wrote code that reads a CSV file and maps all the columns to a bean class.
Now I am trying to turn these values into a Dataset, and I am getting this error:
7/08/30 16:33:58 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: object is not an instance of declaring class
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
If I set the values manually, it works fine:
public void run(String t, String u) throws FileNotFoundException {
JavaRDD<String> pairRDD = sparkContext.textFile("C:/temp/L1_result.csv");
JavaPairRDD<String,String> rowJavaRDD = pairRDD.mapToPair(new PairFunction<String, String, String>() {
public Tuple2<String,String> call(String rec) throws FileNotFoundException {
String[] tokens = rec.split(";");
String[] vals = new String[tokens.length];
for(int i= 0; i < tokens.length; i++){
vals[i] =tokens[i];
}
return new Tuple2<String, String>(tokens[0], tokens[1]);
}
});
ColumnPositionMappingStrategy cpm = new ColumnPositionMappingStrategy();
cpm.setType(funds.class);
String[] csvcolumns = new String[]{"portfolio_id", "portfolio_code"};
cpm.setColumnMapping(csvcolumns);
CSVReader csvReader = new CSVReader(new FileReader("C:/temp/L1_result.csv"));
CsvToBean csvtobean = new CsvToBean();
List csvDataList = csvtobean.parse(cpm, csvReader);
for (Object dataobject : csvDataList) {
funds fund = (funds) dataobject;
System.out.println("Portfolio:"+fund.getPortfolio_id()+ " code:"+fund.getPortfolio_code());
}
/* funds b0 = new funds();
b0.setK("k0");
b0.setSomething("sth0");
funds b1 = new funds();
b1.setK("k1");
b1.setSomething("sth1");
List<funds> data = new ArrayList<funds>();
data.add(b0);
data.add(b1);*/
System.out.println("Portfolio:" + rowJavaRDD.values());
//manual set works fine ///
// Dataset<Row> fundDf = SQLContext.createDataFrame(data, funds.class);
Dataset<Row> fundDf = SQLContext.createDataFrame(rowJavaRDD.values(), funds.class);
fundDf.printSchema();
fundDf.write().option("mergeschema", true).parquet("C:/test");
}
The line below, which uses rowJavaRDD.values(), is the one that fails:
Dataset<Row> fundDf = SQLContext.createDataFrame(rowJavaRDD.values(), funds.class);
What is the resolution? The values I mapped from the CSV columns should be passed here, but I do not know how to do that. Any ideas would help.
Dataset<Row> fundDf = SQLContext.createDataFrame(csvDataList, funds.class);
Passing the list of parsed beans (csvDataList) instead of rowJavaRDD.values() worked!
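The exception message points at reflection: when given a bean class, `createDataFrame` reads each element through that class's getters, but `rowJavaRDD.values()` holds `String` elements rather than `funds` objects. Below is a minimal JDK-only sketch of the same kind of failure, with a hypothetical `Fund` bean standing in for `funds`. Invoking a bean getter reflectively on an object of another type throws `IllegalArgumentException`, which is presumably what happened inside the Spark task.

```java
import java.lang.reflect.Method;

class BeanGetterMismatch {

    // Hypothetical bean standing in for the `funds` class of the question.
    public static class Fund {
        private final String portfolioId;
        public Fund(String portfolioId) { this.portfolioId = portfolioId; }
        public String getPortfolioId() { return portfolioId; }
    }

    // Calls Fund.getPortfolioId() reflectively on an arbitrary element.
    static Object readPortfolioId(Object element) {
        try {
            Method getter = Fund.class.getMethod("getPortfolioId");
            return getter.invoke(element);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the reflective call rejects the element with IllegalArgumentException.
    static boolean failsWithIllegalArgument(Object element) {
        try {
            readPortfolioId(element);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readPortfolioId(new Fund("P1")));      // element is a Fund: getter works
        System.out.println(failsWithIllegalArgument("P1;CODE1")); // element is a String: getter is rejected
    }
}
```

This is why passing a collection whose elements really are instances of the bean class, such as `csvDataList`, resolves the error.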

jTable get data from filtered rows

I want to retrieve some data from a filtered row.
This is how I filter my table:
String makeText = makeFilterCombo.getSelectedItem().toString();
if ("All".equals(makeText)) { // compare string contents, not references
makeText = "";
}
String numar = getEssRegex();
String impact = impactBox.getSelectedItem().toString();
if ("All".equals(impact)) { // compare string contents, not references
impact = "";
}
TableModel model;
model = jTable1.getModel();
final TableRowSorter<TableModel> sorter = new TableRowSorter<TableModel>(model);
jTable1.setRowSorter(sorter);
List<RowFilter<Object, Object>> rfs = new ArrayList<RowFilter<Object, Object>>(2);
rfs.add(RowFilter.regexFilter(makeText, 2));
rfs.add(RowFilter.regexFilter(numar, 5));
rfs.add(RowFilter.regexFilter(impact, 9));
RowFilter<Object, Object> af = RowFilter.andFilter(rfs);
sorter.setRowFilter(af);
And this is how I try to get a value from a filtered row:
int f = search(connectedCarIndex);
connectedImage1 = jTable1.getModel().getValueAt(jTable1.convertRowIndexToModel(f), 10).toString();
connectedImage2 = jTable1.getModel().getValueAt(jTable1.convertRowIndexToModel(f), 11).toString();
connectedImage3 = jTable1.getModel().getValueAt(jTable1.convertRowIndexToModel(f), 12).toString();
System.out.println(connectedImage1 + "-------" + connectedImage2 + "------" + connectedImage3);
But none of this works.
Can anybody help me?
The code works, and I can see the connected image names, as long as the rows are shown.
int f = search(connectedCarIndex);
I have no idea what the search(...) method does.
If you are searching the data that is displayed in the table then you would just use:
table.getValueAt(...);
If you are searching all the data that is stored in the TableModel then you would use:
table.getModel().getValueAt(...);
There is no need to convert the index if you know which of the two you are searching.
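The distinction matters because `JTable.getValueAt` takes view indices while `getModel().getValueAt` takes model indices; `convertRowIndexToModel` is only needed when you start from a view index and read through the model. Here is a small sketch of that conversion, assuming a two-column model made up for illustration. It uses a `TableRowSorter` directly (the object a `JTable` delegates to when a row sorter is installed), so it runs without creating any UI; `FilteredRowLookup`, `sampleModel`, and `visibleImages` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.RowFilter;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;
import javax.swing.table.TableRowSorter;

class FilteredRowLookup {

    // Made-up data: column 0 is the make, column 1 is an image name.
    static TableModel sampleModel() {
        return new DefaultTableModel(
                new Object[][] {
                    {"Audi", "a.png"},
                    {"BMW", "b.png"},
                    {"Audi", "c.png"}
                },
                new Object[] {"make", "image"});
    }

    // Returns the image value of every row that is still visible after filtering on the make.
    static List<Object> visibleImages(TableModel model, String makeRegex) {
        TableRowSorter<TableModel> sorter = new TableRowSorter<>(model);
        sorter.setRowFilter(RowFilter.regexFilter(makeRegex, 0));

        List<Object> images = new ArrayList<>();
        for (int viewRow = 0; viewRow < sorter.getViewRowCount(); viewRow++) {
            // View index -> model index, then read from the model.
            int modelRow = sorter.convertRowIndexToModel(viewRow);
            images.add(model.getValueAt(modelRow, 1));
        }
        return images;
    }

    public static void main(String[] args) {
        System.out.println(visibleImages(sampleModel(), "Audi"));
    }
}
```

If `search(...)` already returns a model index, pass it straight to the model; converting it again would point at the wrong row once a filter is active.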
