Writing/appending data to a CSV file, column wise, in JAVA

I want to write/append data to a CSV file, column by column, in the following fashion:
query1 query2 query3
data_item1 data_item7 data_item12
data_item2 data_item8 data_item13
data_item3 data_item9 data_item14
data_item4 data_item10
data_item5 data_item11
data_item6
I have the data in a HashMap, with the query ID (i.e. query1, query2) as the key and the data_items for the
corresponding query as the values.
The values (the data_items for each query) are in a list.
Therefore, my hash map looks like this:
HashMap<String,List<String>> hm = new HashMap<String,List<String>>();
How can I write this data column by column to a CSV file, as demonstrated above, using Java?
I tried CSVWriter, but couldn't get it to work. Can anyone please help me out?

CSV files are mostly used to persist data structured like a table, meaning data whose columns and rows are closely related.
In your example there seems to be only a very loose connection between query1, query2 and query3, and no horizontal connection between items 1, 7 and 12, or 2, 8 and 13, and so on.
On top of that, files are usually written row by row, or line by line: you open the file, write one line, then another, and so on.
So to write the data column-wise as you are asking, you either have to restructure your data in code beforehand, so that everything that belongs on one line is available when you write that line, or run through the CSV file and its lines several times, each time appending another item to each row. The latter option is very time-consuming and would not make much sense.
So I would suggest that, if there is really no connection between the data of the three queries, you write your data into three different CSV files: query1.csv, query2.csv and query3.csv.
Or, if there is a horizontal connection, i.e. between items 1, 7 and 12 and so on, you write everything into one CSV file, organizing the data into rows and columns. Something like:
queryNo columnX columnY columnZ
1 item1 item2 item3
2 item7 item8 item9
3 item12 item13 item14
How to do that is well described in this thread: Java - Writing strings to a CSV file.
You can also find other examples here: https://mkyong.com/java/how-to-export-data-to-csv-file-java/

After days of tinkering around, I finally succeeded. Here is the implementation:
for (int k = 0; k < maxRows; k++) {
    // Build row k: one cell per query column.
    List<String> rowValues = new ArrayList<String>();
    for (int i = 0; i < queryIdListArr.length; i++) {
        // Take the i-th value list; this is equivalent to qValuesList.get(i).
        subList = qValuesList.subList(i, i + 1);
        List<String> subList2 = subList.stream().flatMap(List::stream).collect(Collectors.toList());
        if (subList2.size() <= k) {
            // This column has no value for row k.
            rowValues.add("");
        } else {
            rowValues.add(subList2.get(k));
        }
    }
    String[] rowValuesArr = new String[rowValues.size()];
    rowValuesArr = rowValues.toArray(rowValuesArr);
    // System.out.println(rowValues);
    writer.writeNext(rowValuesArr);
}
maxRows: the size of the longest value list. I have a list of values for each key; my hash map looks like this:
HashMap<String,List<String>> hm = new HashMap<String,List<String>>();
queryIdListArr: array of all the query IDs (keys) obtained from the hash map.
qValuesList: list of all the value lists.
List<List<String>> qValuesList = new ArrayList<List<String>>();
subList2: the i-th value list, obtained by flattening the one-element sublist returned by:
qValuesList.subList(i, i+1);
rowValuesArr: array holding, for the current row index, the value at that index from each list in qValuesList.
The idea is to fetch the value at each index from all the value lists and write those values as one row. If a list has no value at that index, write an empty string.
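The same idea can be sketched without the subList/flatMap step. This is only a minimal illustration, assuming a LinkedHashMap so that the column order is predictable. The class name ColumnWiseCsv and the method toRows are made up for this example, and the rows are joined with a plain comma, with no quoting or escaping. Each String[] could equally be passed to writer.writeNext as in the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of any CSV library.
class ColumnWiseCsv {

    // Turns a map of column name -> column values into rows, header row first.
    // Columns that are shorter than the longest one are padded with empty cells.
    static List<String[]> toRows(Map<String, List<String>> columns) {
        int maxRows = 0;
        for (List<String> values : columns.values()) {
            maxRows = Math.max(maxRows, values.size());
        }

        List<String[]> rows = new ArrayList<>();
        rows.add(columns.keySet().toArray(new String[0]));

        for (int k = 0; k < maxRows; k++) {
            String[] row = new String[columns.size()];
            int i = 0;
            for (List<String> values : columns.values()) {
                row[i++] = k < values.size() ? values.get(k) : "";
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order; a plain HashMap does not guarantee any order.
        Map<String, List<String>> hm = new LinkedHashMap<>();
        hm.put("query1", Arrays.asList("data_item1", "data_item2", "data_item3"));
        hm.put("query2", Arrays.asList("data_item7", "data_item8"));
        hm.put("query3", Arrays.asList("data_item12"));

        for (String[] row : toRows(hm)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Computing the rows first and writing them afterwards keeps the file access strictly row by row, which is how CSV writers expect to be used.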

Related

How to update multiple rows using a single query with a mutable collection

I want to update rows in a table that contains the following columns:
`parameter_name`(PRIMARY KEY),
`option_order`,
`value`.
I have a collection called parameterCollection which contains "parameterNames", "optionOrders" and "values". The collection does not have a fixed size; it can hold as many parameters as you want.
Imagine I have 5 parameters in my collection (I could just as well have 28, or 10204) and I am trying to update the rows in the database using the following query:
UPDATE insight_app_parameter_option
SET option_order IN (1,2,3,4,5), value IN ('a','b','c','d','e')
WHERE parameter_name IN ('name1', 'name2', 'name3', 'name4', 'name5')
But this doesn't do the job; instead it returns an error that says: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IN (1,2,3,4,5), value IN ('a','b','c','d','e') WHERE parameter_name IN ('name1'' at line 2
1,2,3,4,5 -> Represent the option orders inside parameterCollection.
'a','b','c','d','e' -> Represent the values inside parameterCollection.
'name1', 'name2', 'name3', 'name4', 'name5' -> Represent the names inside parameterCollection.
I know how to update each parameter separately, but I would like to do it all together. Here are some links I visited where people asked the same question, but they used a fixed collection of objects, not a mutable one.
MySQL - UPDATE multiple rows with different values in one query
Multiple rows update into a single query
SQL - Update multiple records in one query

That's not possible with this syntax in MySQL. The error you are receiving is a syntax error: SET takes assignments of the form col_name = value, so you cannot assign a list of values with IN. This is the syntax of an UPDATE statement: (ref)
UPDATE [LOW_PRIORITY] [IGNORE] table_reference
SET assignment_list
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
value:
{expr | DEFAULT}
assignment:
col_name = value
assignment_list:
assignment [, assignment] ...
You need to create a separate UPDATE for each row. I suggest executing them all in a single transaction, if that fits your case.
The correct syntax for your example is:
UPDATE insight_app_parameter_option
SET option_order = 1, value = 'a'
WHERE parameter_name = 'name1';
UPDATE insight_app_parameter_option
SET option_order = 2, value = 'b'
WHERE parameter_name = 'name2';
UPDATE insight_app_parameter_option
SET option_order = 3, value = 'c'
WHERE parameter_name = 'name3';
...
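If the collection lives in Java code, the per-row updates can be issued as one JDBC batch inside a single transaction. The following is only a sketch under that assumption: the question does not say how the database is accessed, and the class ParameterOptionUpdater and the holder ParameterOption are hypothetical names introduced for illustration. Because the statement is prepared once and executed per element, the size of the collection does not matter.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper, assuming the parameters are available as a Java list.
class ParameterOptionUpdater {

    // Hypothetical holder for one entry of parameterCollection.
    static final class ParameterOption {
        final String parameterName;
        final int optionOrder;
        final String value;

        ParameterOption(String parameterName, int optionOrder, String value) {
            this.parameterName = parameterName;
            this.optionOrder = optionOrder;
            this.value = value;
        }
    }

    // One parameterized UPDATE, reused for every row.
    static final String UPDATE_SQL =
            "UPDATE insight_app_parameter_option "
          + "SET option_order = ?, value = ? "
          + "WHERE parameter_name = ?";

    // Runs one UPDATE per element as a batch and commits them together.
    static int[] updateAll(Connection conn, List<ParameterOption> options) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
            for (ParameterOption o : options) {
                ps.setInt(1, o.optionOrder);
                ps.setString(2, o.value);
                ps.setString(3, o.parameterName);
                ps.addBatch();
            }
            int[] counts = ps.executeBatch();
            conn.commit();
            return counts;
        } catch (SQLException e) {
            // Undo every update of this batch if any of them failed.
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```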

Can we get the following output using a SQL stored procedure, or should it be handled in the code itself?

Image 1 is the current data.
Image 2 is the data I need to store in a new table. I want to combine all rows with the same ITEM_NO, put their values into one comma-separated value, and insert the result into a new table.

Whilst I don't think storing data like this is a good idea at all (see what others have said in the comments) it is possible by doing:
SELECT REFERENCE_NO,
       ITEM_NO,
       ROLES = STUFF((SELECT N', ' + ENTITY_ROLE
                      FROM dbo.MyTable AS p2
                      WHERE p2.ITEM_NO = p.ITEM_NO
                      ORDER BY ENTITY_ROLE
                      FOR XML PATH(N'')), 1, 2, N'')
FROM dbo.MyTable AS p
GROUP BY REFERENCE_NO, ITEM_NO
ORDER BY ITEM_NO;
A demo of this in action: SQL Fiddle

Compare two Maps & identify the modified value of an existing key

I'm working on a problem where, the first time, I upload an Excel file, read it and store it in a MySQL DB. I'm done with this part and everything is working as expected.
From the second time onwards, whenever I upload an Excel file again (the file can contain exactly the same data as the DB, modified existing data, or both modified existing data and newly added data), I have to compare it with the data in MySQL and identify the changes.
I'm reading the Excel file using the Apache POI library and storing it in a Map<Integer, List<MyCell>>, where the key is a row number and the value is a List<MyCell>, which is basically the list of columns.
I'm able to identify the newly added records (newly added keys and their values in
the Map) with this logic:
Map<Integer, List<MyCell>> filteredMap = new HashMap<>();
for (Integer key : datafromExcel.keySet()) {
    if (datafromDB.containsKey(key)) {
        datafromDB.remove(key);
    } else {
        filteredMap.put(key, datafromExcel.get(key));
    }
}
But I didn't succeed in finding existing modified records (a modified value for an existing key in the Map).
How can I do this?

To check for modified values, you have to compare the list coming from the DB with the list coming from Excel for the same key.
Here I'm showing just the comparison of the two lists:
// Elements present in both lists (sets ignore order and duplicates).
Collection<String> similar = new HashSet<String>(fromDB.get(key));
// Elements present in only one of the two lists.
Collection<String> different = new HashSet<String>();
different.addAll(fromDB.get(key));
different.addAll(fromExcel.get(key));
similar.retainAll(fromExcel.get(key));
different.removeAll(similar);
if (different.isEmpty()) {
    System.out.println("NO CHANGE");
}
Adapt this to your logic. If MyCell is a class (I have used String here), you can use a comparator.
Hope this helps.

For a key that exists in your database, you are calling datafromDB.remove(key);.
So as soon as you find a possibly modified record, you remove it and no longer know that you had such a record. Clearly you have to do something more than datafromDB.remove(key);, or not remove the data for that key at all.
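Putting both answers together, one possible approach is to leave the DB map untouched and classify each Excel key as either added or modified. This is a minimal sketch, not the asker's code: it uses String in place of MyCell, and the class MapDiff and its methods are names invented for the example. It relies on List.equals, which compares elements in order, so a real MyCell class would need to override equals (and hashCode) for this to work.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; String stands in for MyCell.
class MapDiff {

    // Rows whose key exists in the Excel data but not in the DB data.
    static Map<Integer, List<String>> added(Map<Integer, List<String>> db,
                                            Map<Integer, List<String>> excel) {
        Map<Integer, List<String>> result = new HashMap<>();
        for (Map.Entry<Integer, List<String>> e : excel.entrySet()) {
            if (!db.containsKey(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Rows whose key exists in both maps but whose cell lists differ.
    static Map<Integer, List<String>> modified(Map<Integer, List<String>> db,
                                               Map<Integer, List<String>> excel) {
        Map<Integer, List<String>> result = new HashMap<>();
        for (Map.Entry<Integer, List<String>> e : excel.entrySet()) {
            Integer key = e.getKey();
            if (db.containsKey(key) && !e.getValue().equals(db.get(key))) {
                result.put(key, e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> db = new HashMap<>();
        db.put(1, Arrays.asList("a", "b"));
        db.put(2, Arrays.asList("c", "d"));

        Map<Integer, List<String>> excel = new HashMap<>();
        excel.put(1, Arrays.asList("a", "b"));  // unchanged
        excel.put(2, Arrays.asList("c", "X"));  // modified
        excel.put(3, Arrays.asList("e", "f"));  // newly added

        System.out.println("added: " + added(db, excel).keySet());
        System.out.println("modified: " + modified(db, excel).keySet());
    }
}
```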

How to pipe() a grouped by key RDD?

I have built the following workflow so far:
1) JavaPairRDD< Integer, String > aRDD = fooRDD.mapToPair( )
2) JavaPairRDD< Integer, Iterable< String > > bRDD = aRDD.groupByKey( )
3) JavaPairRDD< Integer, List<String> > cRDD = bRDD.mapToPair( )
Now I have a problem: I need to call cRDD.pipe('myscript.sh'), but I noticed that myscript.sh receives the whole list for each key at once.
The long version: there is a bash script that takes each group of lines and creates a PDF from the data. So bRDD groups the lines by key, cRDD sorts each group and removes some undesirable data from it, and the next step is to create one PDF report per data group.
I'm thinking of converting the List<String> representing each group's content into a new JavaPairRDD< Integer, String > per group, but I don't know how to do this, or even whether it is the correct way to proceed.
Example:
(1,'foo,b,tom'), (1,'bar,c,city'), (1,'fly,Marty'), (2,'newFoo,Jerry'), (2,'newBar,zed,Mark'), (2,'newFly,boring,data'), (2,'jack,big,deal')
After groupBy:
(1, 'foo,b,tom','bar,c,city','fly,Marty')
(2, 'newFoo,Jerry','newBar,zed,Mark','newFly,boring,data','jack,big,deal')
How myscript.sh receives the data (note the single String for the entire group):
(1,['foo,b,tom,bar,c,city,fly,Marty'])
(2,['newFoo,Jerry,newBar,zed,Mark,newFly,boring,data,jack,big,deal'])
How I expect to receive it:
For partition 1 or worker 1:
1,'foo,b,tom'
1,'bar,c,city'
1,'fly,Marty'
For partition 2 or worker 2:
2,'newFoo,Jerry'
2,'newBar,zed,Mark'
2,'newFly,boring,data'
2,'jack,big,deal'
That way I can process one line at a time while still keeping the groups, and can ensure that group 1 goes to one PDF report and group 2 to another. The major problem is that each data line is already comma-separated, so I can't determine where a new line value starts, because all the lines are merged into one comma-separated line too.
I'm working with Java. Please give your answer in Java too.

You can't create an RDD inside an RDD. If you want to process all records that belong to a particular key together, you shouldn't flatMap the grouped RDDs (bRDD, cRDD) again. Instead, I would suggest changing the separator between the values of the grouped RDDs (bRDD, cRDD) to some other character.
e.g.
cRDD.map(s -> {
    // Join the group's lines with a colon (:) or some other character
    // that does not occur in the data.
    StringBuilder sb = new StringBuilder();
    Iterator<String> ite = s._2().iterator();
    while (ite.hasNext()) {
        sb.append(ite.next());
        if (ite.hasNext()) {
            sb.append(":");
        }
    }
    // The key type is Integer, matching cRDD.
    return new Tuple2<Integer, String>(s._1(), sb.toString());
}).pipe("myscript.sh");
In myscript.sh, split the records on the colon (:). I hope this helps.

Google Trends API results in Java

I am using Google Trends to get trends for a particular keyword. It returns JSON, but the main problem is that I want to create a class that holds the data and can be used in Java code as an ArrayList.
I am confused about what the class structure should be when the result looks like this:
{"version":"0.6","status":"ok","sig":"1248242565",
"table":
{ "cols":
[{"id":"date","label":"Date","type":"date","pattern":""},
{"id":"query0","label":"linkedin","type":"number","pattern":""},
{"id":"query1","label":"facebook","type":"number","pattern":""}],
"rows":[{"c":[{"v":new Date(2004,0,1),"f":"January 2004"},{"v":0.0,"f":"0"},{"v":0.0,"f":"0"}]},
{"c":[{"v":new Date(2004,5,1),"f":"June 2004"},{"v":0.0,"f":"0"}, {"v":0.0,"f":"0"}]},
{"c":[{"v":new Date(2004,8,1),"f":"September 2004"},{"v":0.0,"f":"0"},{"v":0.0,"f":"0"}]},
{"c":[{"v":new Date(2013,9,1),"f":"October 2013"},{"v":1.0,"f":"1"},{"v":83.0,"f":"83"}]}]
}
}
It returns rows and cols for the search query; if I search for two individual words, the result looks like the JSON above. Any idea how I can write a class Trend.java and a list object that holds all this information?

How would you represent those values? I'd go for a List<HashMap<String, String>> implementation.
You can assign each item in a row to a HashMap with the column header as the key. So:
HashMap<String, String> row = new HashMap<String, String>();
row.put("id", "c");
// add the rest.
Then you can cycle through each row, and request the column data by name. This will also make for some very semantically nice code!
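As a rough sketch of that structure, the following builds one map per row, keyed by the column id from "cols" and holding the formatted "f" value of each cell. It assumes the column ids and cell values have already been extracted from the response by whatever parser you use; note that the response shown contains new Date(...) expressions, which are not strict JSON, so the parsing step is deliberately left out. The class TrendRows and the method toRows are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; parsing of the response is assumed to happen elsewhere.
class TrendRows {

    // Builds one map per row, with the column id as key and the cell text as value.
    static List<Map<String, String>> toRows(List<String> columnIds, List<List<String>> cells) {
        List<Map<String, String>> rows = new ArrayList<>();
        for (List<String> rowCells : cells) {
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < columnIds.size() && i < rowCells.size(); i++) {
                row.put(columnIds.get(i), rowCells.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Column ids and "f" values taken from the response in the question.
        List<String> columnIds = Arrays.asList("date", "query0", "query1");
        List<List<String>> cells = Arrays.asList(
                Arrays.asList("January 2004", "0", "0"),
                Arrays.asList("October 2013", "1", "83"));

        for (Map<String, String> row : toRows(columnIds, cells)) {
            // Request the column data by name.
            System.out.println(row.get("date") + ": " + row.get("query0") + ", " + row.get("query1"));
        }
    }
}
```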
