How to use CSV HeaderColumnNameTranslateMappingStrategy without knowing the class before

I have a problem: I want to map a CSV file to my entities dynamically. The frontend sends me a file together with the user's mapping, which contains the header column, the class, and the attribute for each entry. I iterate over that mapping as a JSONArray. I can't know the class in advance, because the user might map header column "abc" to Article.articleName and, in the same CSV, header column "def" to Competitor.location.
My attempt:
List<Class> classes = new ArrayList<>();
for (int i = 0; i < jsonArray.length(); i++) {
Class<?> cls = Class.forName(pathToEntities + jsonArray.getJSONObject(i).get(classKey).toString());
maps.put(cls, columnMapping);
}
for (int i = 0; i < classes.size(); i++) {
Class cls = classes.get(i);
CsvToBean<cls> csvToBean = new CsvToBean<cls>();
columnMapping.put("LANGU", "id");
columnMapping.put("TXTMD", "fname");
columnMapping.put("Lname", "lname");
HeaderColumnNameTranslateMappingStrategy<cls> strategy = new HeaderColumnNameTranslateMappingStrategy<cls>();
strategy.setType(cls);
strategy.setColumnMapping(columnMapping);
}
I hope you can follow. The idea was to run the whole mapping process once per class, passing each class its own columnMapping map. But the compiler rejects this: when I give CsvToBean a variable instead of a class as its type argument, it reports: cls cannot be resolved to a type
The library here is OpenCSV, but I really don't care which CSV library I use; I just want it to work, and I would switch to any of them. Right now this seems unfixable to me.

You can try using instanceof to check which entity you get from Class.forName():
CSVReader csvReader = new CSVReader(new FileReader("Simple.csv"));
Class aClass = Class.forName("com.ishikawa.csvparser.entity.SimpleEntity");
CsvToBean ctb = new CsvToBean();
HashMap<String, String> columnMapping = new HashMap<>();
HeaderColumnNameTranslateMappingStrategy headerStrategy = new HeaderColumnNameTranslateMappingStrategy();
columnMapping.put("LANGU", "name");
headerStrategy.setType(aClass);
headerStrategy.setColumnMapping(columnMapping);
ctb.setMappingStrategy(headerStrategy);
ctb.setCsvReader(csvReader);
List parse = ctb.parse();
parse.stream().forEach(e->{
if (e instanceof SimpleEntity) {
System.out.println(((SimpleEntity)e).getName());
}
});
with Simple.csv
LANGU
asd
asdf
and this entity as an example
package com.ishikawa.csvparser.entity;
import lombok.Data;
@Data
public class SimpleEntity {
public String name;
}
Hope it helps.
Update 1:
If you need to change the separator, build a CSVParser and pass it to the CSVReader:
CSVParser parser = new CSVParserBuilder()
.withSeparator(';')
.withIgnoreQuotations(true)
.build();
CSVReader csvReader = new CSVReaderBuilder(reader)
.withSkipLines(0)
.withCSVParser(parser)
.build();
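On the compile error in the question: a type argument such as CsvToBean<cls> has to be a type the compiler knows, so a Class variable cannot be used there. A common workaround is a generic method that takes a Class<T>; calling it with a Class<?> lets the compiler capture the unknown type. The following is only a minimal sketch of that idea using plain reflection instead of OpenCSV. DynamicMapper, mapRows and the Article entity are hypothetical names, the comma split is naive (no quoting), and only String fields are handled.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of OpenCSV: maps CSV rows onto a class chosen at runtime.
final class DynamicMapper {

    // T is bound from the Class object at the call site, so no concrete type name is needed.
    static <T> List<T> mapRows(Class<T> type, List<String> lines, Map<String, String> columnToField) {
        try {
            String[] header = lines.get(0).split(",");        // naive split: no quoting support
            List<T> beans = new ArrayList<>();
            for (String line : lines.subList(1, lines.size())) {
                String[] cells = line.split(",", -1);
                T bean = type.getDeclaredConstructor().newInstance();
                for (int i = 0; i < header.length && i < cells.length; i++) {
                    String fieldName = columnToField.get(header[i].trim());
                    if (fieldName == null) {
                        continue;                             // column is not mapped to this class
                    }
                    Field field = type.getDeclaredField(fieldName);
                    field.setAccessible(true);
                    field.set(bean, cells[i].trim());         // sketch handles String fields only
                }
                beans.add(bean);
            }
            return beans;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map rows onto " + type.getName(), e);
        }
    }

    // Example entity; in the question it would come from Class.forName(...).
    static class Article {
        String articleName;
    }

    public static void main(String[] args) {
        Class<?> cls = Article.class;                         // type unknown to the compiler from here on
        List<?> result = mapRows(cls, List.of("abc,def", "Widget,Berlin"), Map.of("abc", "articleName"));
        System.out.println(((Article) result.get(0)).articleName);
    }
}
```

The same shape should carry over to a library: put the CsvToBean and strategy setup inside a generic method with a Class<T> parameter, and call that method once per class in the loop.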

Related

mapping particular column of a csv file with particular POJO's field

I have to map particular CSV columns, selected by index, to particular POJO attributes. The mapping comes from a JSON file that contains a columnIndex and an attribute name, meaning that the CSV column at that index must be mapped to that attribute of the POJO class.
Below is a sample of the JSON file, showing the column mapping strategy with POJO attributes.
[{"index":0,"columnname":"date"},{"index":1,"columnname":"deviceAddress"},{"index":7,"columnname":"iPAddress"},{"index":3,"columnname":"userName"},{"index":10,"columnname":"group"},{"index":5,"columnname":"eventCategoryName"},{"index":6,"columnname":"message"}]
I tried the OpenCSV library, but the problem I ran into is that I can't read only a subset of the columns with it. As you can see in the JSON above, indexes 2 and 4 are skipped when reading from the CSV file. Below is the OpenCSV code.
public static List<BaseDataModel> readCSVFile(String filePath,List<String> columnListBasedOnIndex) {
List<BaseDataModel> csvDataModels = null;
File myFile = new File(filePath);
try (FileInputStream fis = new FileInputStream(myFile)) {
final ColumnPositionMappingStrategy<BaseDataModel> strategy = new ColumnPositionMappingStrategy<BaseDataModel>();
strategy.setType(BaseDataModel.class);
strategy.setColumnMapping(columnListBasedOnIndex.toArray(new String[0]));
final CsvToBeanBuilder<BaseDataModel> beanBuilder = new CsvToBeanBuilder<>(new InputStreamReader(fis));
beanBuilder.withMappingStrategy(strategy);
csvDataModels = beanBuilder.build().parse();
} catch (Exception e) {
e.printStackTrace();
}
return csvDataModels;
}
List<ColumnIndexMapping> columnIndexMappingList = dataSourceModel.getColumnMappingStrategy();
List<String> columnNameList = columnIndexMappingList.stream().map(ColumnIndexMapping::getColumnname)
.collect(Collectors.toList());
List<BaseDataModel> DataModels = Utility
.readCSVFile(file.getAbsolutePath() + File.separator + fileName, columnNameList);
I have also tried univocity, but with that library I don't see how to map CSV columns to particular attributes. Below is the code:
CsvParserSettings settings = new CsvParserSettings();
settings.detectFormatAutomatically(); //detects the format
settings.getFormat().setLineSeparator("\n");
//extracts the headers from the input
settings.setHeaderExtractionEnabled(true);
settings.selectIndexes(0, 2); //rows will contain only values of columns at position 0 and 2
CsvRoutines routines = new CsvRoutines(settings); // Can also use TSV and Fixed-width routines
routines.parseAll(BaseDataModel.class, new File("/path/to/your.csv"));
List<String[]> rows = new CsvParser(settings).parseAll(new File("/path/to/your.csv"), "UTF-8");
Please have a look if someone can help me in this case.

Author of univocity-parsers here. You can define mappings to your class attributes in code instead of annotations. Something like this:
public class BaseDataModel {
private String a;
private int b;
private String c;
private Date d;
}
Then in your code, map the attributes to whatever column names you need:
ColumnMapper mapper = routines.getColumnMapper();
mapper.attributeToColumnName("a", "col1");
mapper.attributeToColumnName("b", "col2");
mapper.attributeToColumnName("c", "col3");
mapper.attributeToColumnName("d", "col4");
You can also use mapper.attributeToIndex("d", 3); to map attributes to a given column index.
Hope this helps.
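For illustration, the index-to-attribute mapping described in the question can also be sketched without any CSV library. This is an assumption-laden sketch, not the univocity API: IndexMappingSketch and mapRow are hypothetical names, the BaseDataModel here is a cut-down stand-in with String fields only, and the row is assumed to be already split into cells.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

final class IndexMappingSketch {

    // Stand-in for the question's BaseDataModel; only three attributes are shown.
    static class BaseDataModel {
        String date;
        String userName;
        String message;
    }

    // Copies only the mapped column indexes of one CSV row into a new bean.
    static <T> T mapRow(Class<T> type, String[] row, Map<Integer, String> indexToAttribute) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<Integer, String> entry : indexToAttribute.entrySet()) {
                int index = entry.getKey();
                if (index >= row.length) {
                    continue;                                 // short row: leave the attribute unset
                }
                Field field = type.getDeclaredField(entry.getValue());
                field.setAccessible(true);
                field.set(bean, row[index]);
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map row onto " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // Mirrors part of the JSON mapping: indexes that are not listed are simply skipped.
        Map<Integer, String> mapping = new LinkedHashMap<>();
        mapping.put(0, "date");
        mapping.put(3, "userName");
        mapping.put(6, "message");
        String[] row = "2020-01-01,10.0.0.1,x,alice,y,login,ok".split(",", -1);
        BaseDataModel model = mapRow(BaseDataModel.class, row, mapping);
        System.out.println(model.date + " " + model.userName + " " + model.message);
    }
}
```

Because the mapping is keyed by index, columns that are absent from the JSON never reach the bean, which is the partial read the question asks about.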

Write multiple classes to one CSV file

I have a list of objects that are instances of a number of sub-classes of a base class. I've been trying to write these objects out together into one CSV file.
Each class contains the fields of the base class and adds a couple of extra fields of its own.
What I am trying to achieve is to write out a csv having the base class fields first and then the columns coming from the rest of the sub-classes. This of course means that the sub-classes that don't contain a particular column name should have that field empty.
I have tried achieving this using OpenCSV and SuperCSV but have not managed to configure them to do this. Looking at the library's code, I am pretty sure OpenCSV will not do this. Using SuperCSV with Dozer I got multiple classes to write into one file, but I can't get the empty columns in place where a class is missing a particular column field.
I can obviously write my own custom CSV writer to achieve this but I was wondering if anyone could help me reach a solution based off an existing CSV writer library.
Edit: SuperCSV code added below per commenter's request
private static final String[] FIELD_MAPPING = new String[] { "documentNumber", "lineOfBusiness", "clientId", "childClass1Field", };
private static final String[] FIELD_MAPPING2 = new String[] { "documentNumber", "lineOfBusiness", "clientId", "childClass2Field1", "childClass2Field2"};
public static void writeWithCsvBeanWriter(PrintWriter writer, List<ParentClass> documents) throws Exception {
CsvDozerBeanWriter beanWriter = null;
try {
beanWriter = new CsvDozerBeanWriter(writer, CsvPreference.STANDARD_PREFERENCE);
final String[] header = new String[] { "documentNumber", "lineOfBusiness", "clientId", "childClass1Field", "childClass2Field1", "childClass2Field2"};
beanWriter.configureBeanMapping(ChildClass1.class, FIELD_MAPPING);
beanWriter.configureBeanMapping(ChildClass2.class, FIELD_MAPPING2);
final CellProcessor[] processors = new CellProcessor[] { new Optional(), new Optional(), new Optional(), new Optional() };
final CellProcessor[] processors2 = new CellProcessor[] { new Optional(), new Optional(), new Optional(), new Optional(), new Optional() };
beanWriter.writeHeader(header);
for (final ParentClass document : documents) {
if (document instanceof ChildClass1) {
beanWriter.write(document, processors);
} else {
beanWriter.write(document, processors2);
}
}
} finally {
if (beanWriter != null) {
beanWriter.close();
}
}
}
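As a point of comparison, the layout described above (one shared header, empty cells where a sub-class lacks a field) can be sketched with plain reflection. This is not SuperCSV or OpenCSV code: UnionCsvSketch, valueOf and toCsvLines are hypothetical names, the classes are cut-down stand-ins for the ones in the question, and no quoting or escaping is done.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class UnionCsvSketch {

    static class ParentClass { String documentNumber; String clientId; }
    static class ChildClass1 extends ParentClass { String childClass1Field; }
    static class ChildClass2 extends ParentClass { String childClass2Field1; }

    // Looks the field up on the object's class and its superclasses; "" when absent or null.
    static String valueOf(Object bean, String fieldName) {
        for (Class<?> c = bean.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(fieldName);
                field.setAccessible(true);
                Object value = field.get(bean);
                return value == null ? "" : value.toString();
            } catch (NoSuchFieldException e) {
                // not declared at this level; try the superclass
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return "";
    }

    // Writes every document against the same header, so missing columns become empty cells.
    static List<String> toCsvLines(String[] header, List<? extends ParentClass> documents) {
        List<String> lines = new ArrayList<>();
        lines.add(String.join(",", header));
        for (ParentClass document : documents) {
            List<String> cells = new ArrayList<>();
            for (String column : header) {
                cells.add(valueOf(document, column));
            }
            lines.add(String.join(",", cells));               // no quoting/escaping in this sketch
        }
        return lines;
    }

    public static void main(String[] args) {
        ChildClass1 first = new ChildClass1();
        first.documentNumber = "D1"; first.clientId = "C1"; first.childClass1Field = "x";
        ChildClass2 second = new ChildClass2();
        second.documentNumber = "D2"; second.clientId = "C2"; second.childClass2Field1 = "y";
        String[] header = { "documentNumber", "clientId", "childClass1Field", "childClass2Field1" };
        toCsvLines(header, List.of(first, second)).forEach(System.out::println);
    }
}
```

The key design choice is that the header, not the bean class, drives the loop; each bean is asked for every header column and answers with an empty string when it has no such field.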

Modifying JSON output for two different functions

I have two functions that each work with an ArrayList of descriptors. I am trying to print a different JSON output for each function, and I am using the Gson library to do this. I use a ClientData model object to help format the JSON correctly. Its getters and setters are below.
import java.util.ArrayList;
import java.util.List;
import com.google.gson.annotations.SerializedName;
public class ClientData {
@SerializedName("TrialCountryCodes")
private List<String> trialCountryCodes;
@SerializedName("CancerGenePanel")
private String cancerGenePanel;
public ClientData() {
this.trialCountryCodes = new ArrayList<String>();
}
public List<String> getTrialCountryCodes() {
return trialCountryCodes;
}
public void setTrialCountryCodes(List<String> trialCountryCodes) {
this.trialCountryCodes = trialCountryCodes;
}
public String getCancerGenePanel() {
return cancerGenePanel;
}
public void setCancerGenePanel(String cancerGenePanel) {
this.cancerGenePanel = cancerGenePanel;
}
}
The problem is with the trial country codes. When I call one function I want TrialCountryCodes to appear in the JSON output; when I call the other, I don't. The two calls are below: one takes one file and the other takes two. With one file, TrialCountryCodes should not appear; with two files, it should.
descriptors = HelperMethods.getBreastCarcinomaDescriptorsFromCsvFile("/Users/edgarjohnson/eclipse-workspace/CsvToJson/src/in.csv");
descriptors = HelperMethods.getBreastCarcinomaDescriptorsFromCsvFile("/Users/edgarjohnson/eclipse-workspace/CsvToJson/src/in.csv", "/Users/edgarjohnson/eclipse-workspace/CsvToJson/src/EU.csv");
HelperMethods.writeJsonFile(descriptors, "JsonOutput.json");
More background: I read these values from a CSV file and write the JSON output to multiple files. This is the code I use to build the descriptors for the JSON file:
public static List<BreastCarcinomaDescriptor> getBreastCarcinomaDescriptorsFromCsvFile(String fileName, String fileName2) {
List<BreastCarcinomaDescriptor> descriptorsAndCountrycodes = new ArrayList<BreastCarcinomaDescriptor>();
BufferedReader bufferedCsvFile = HelperMethods
.getCsvFileBuffer(fileName);
BufferedReader bufferedCsvFile2 = HelperMethods
.getCsvFileBuffer(fileName2);
List<String> lines = new ArrayList<String>();
List<String> line2 = new ArrayList<String>();
HelperMethods.readCsvToStrings(lines, bufferedCsvFile);
HelperMethods.readCsvToStrings(line2, bufferedCsvFile2);
List<String> countryList = new ArrayList<String>();
System.out.println(line2);
//populate the country list using file2
countryList = Arrays.asList(line2.get(0).split(","));
System.out.println(countryList);
for (String line : lines) {
BreastCarcinomaDescriptor descriptor= getBreastCarcinomaDescriptorFromCsvLine(line);
//enrich this object with country code property
descriptor.getClientData().setTrialCountryCodes(countryList);
descriptorsAndCountrycodes.add(descriptor);
}
return descriptorsAndCountrycodes;
}
private static BreastCarcinomaDescriptor getBreastCarcinomaDescriptorFromCsvLine(String line) {
BreastCarcinomaDescriptor breastCarcinomaDescriptor = new BreastCarcinomaDescriptor();
String[] data = line.split(",");
breastCarcinomaDescriptor.setBatchName(data[0]);
breastCarcinomaDescriptor.getMetadata().setCharset("utf-8");
breastCarcinomaDescriptor.getMetadata().setSchemaVersion("1.5");
if(data.length > 5) {
breastCarcinomaDescriptor.getSampleInfo().setAge(Integer.valueOf(data[5].trim()));
}
breastCarcinomaDescriptor.getSampleInfo().setCancerType(data[3].trim());
if(data.length>4) {
breastCarcinomaDescriptor.getSampleInfo().setGender(data[4].trim());
}
breastCarcinomaDescriptor.getFiles().add(data[1].concat(".*"));
// breastCarcinomaDescriptor.getClientData().getTrialCountryCodes().add(descriptorsAndCountrycodes[]);
//breastCarcinomaDescriptor.getClientData().getTrialCountryCodes().add("20");
breastCarcinomaDescriptor.getClientData().setCancerGenePanel("");
breastCarcinomaDescriptor.setCaseName(data[1]);
return breastCarcinomaDescriptor;
}
What I've Tried: I tried using custom serialization to only display Trial Country Codes when we take in one file but I am having trouble with this.
Does anyone have any ideas on how I can accomplish this? I feel like the solution is trivial, but I don't know the Gson library well and I am new to Java.
How formatted output should look for function that takes in 1 file:
How formatted output should look for function that takes in 2 files:

You can register two different TypeAdapters, each of which serializes into the format you want, and pick one depending on which function gets called. Each of your functions then uses its own type adapter and can control the details of the transformation.
First function:
GsonBuilder builder = new GsonBuilder();
builder.registerTypeAdapter(ClientData.class, new ClientDataWithCancerGenePanelAdapter());
Gson gson = builder.create();
Second function:
GsonBuilder builder = new GsonBuilder();
builder.registerTypeAdapter(ClientData.class, new ClientDataWithTrialCountryCodesAdapter());
Gson gson = builder.create();

Supercsv - unable to find method exception

I have the below implementation.
csvReader = new CsvBeanReader(new InputStreamReader(stream), CsvPreference.STANDARD_PREFERENCE);
lastReadIdentity = (T) csvReader.read(Packages.class, Packages.COLS);
In my Packages class, the unitCount variable has this getter and setter:
public String getUnitCount() {
return unitCount;
}
public void setUnitCount(String unitCount) {
this.unitCount = unitCount;
}
This works fine when unitCount is a String, but when I declare it as an int, it throws the exception below. Please help.
private int unitCount;
public int getUnitCount() {
return unitCount;
}
public void setUnitCount(int unitCount) {
this.unitCount = unitCount;
}
Exception:
org.supercsv.exception.SuperCsvReflectionException: unable to find method setUnitCount(java.lang.String) in class com.directv.sms.data.SubscriberPackages - check that the corresponding nameMapping element matches the field name in the bean, and the cell processor returns a type compatible with the field
context=null
at org.supercsv.util.ReflectionUtils.findSetter(ReflectionUtils.java:139)
at org.supercsv.util.MethodCache.getSetMethod(MethodCache.java:95)

I'm not sure about SuperCsv, but univocity-parsers should be able to handle this without a hitch, not to mention it is at least 3 times faster to parse your input.
Just annotate your class:
public class SubscriberPackages {
@Parsed(defaultNullRead = "0") // if the file contains nulls, then they will be converted to 0.
private int unitCount; // The attribute name will be matched against the column header in the file automatically.
}
To parse the CSV into beans:
// BeanListProcessor converts each parsed row to an instance of a given class, then stores each instance into a list.
BeanListProcessor<SubscriberPackages> rowProcessor = new BeanListProcessor<SubscriberPackages>(SubscriberPackages.class);
CsvParserSettings parserSettings = new CsvParserSettings(); //many options here, check the tutorial.
parserSettings.setRowProcessor(rowProcessor); //uses the bean processor to handle your input rows
parserSettings.setHeaderExtractionEnabled(true); // extracts header names from the input file.
CsvParser parser = new CsvParser(parserSettings); //creates a parser with your settings.
parser.parse(new FileReader(new File("/path/to/file.csv"))); //all rows parsed here go straight to the bean processor
// The BeanListProcessor provides a list of objects extracted from the input.
List<SubscriberPackages> beans = rowProcessor.getBeans();
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
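As for why the exception in the question appears: the reader hands each cell over as a String, so it looks for setUnitCount(java.lang.String), and that method no longer exists once the setter takes an int. The lookup failure can be reproduced with plain reflection, as in the sketch below. SetterLookupSketch, hasSetter and fromCell are hypothetical names and the Packages class is a cut-down stand-in; none of this is Super CSV code. In Super CSV itself, the usual way to get the conversion is to pass cell processors (for example ParseInt) to read(...).

```java
import java.lang.reflect.Method;

final class SetterLookupSketch {

    // Cut-down stand-in for the bean in the question, with the int version of the property.
    static class Packages {
        private int unitCount;
        public int getUnitCount() { return unitCount; }
        public void setUnitCount(int unitCount) { this.unitCount = unitCount; }
    }

    // True if the bean class has a public setter accepting exactly the given parameter type.
    static boolean hasSetter(Class<?> beanType, String name, Class<?> parameterType) {
        try {
            beanType.getMethod(name, parameterType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Converts the raw cell text to int before invoking the int setter.
    static Packages fromCell(String rawCell) {
        try {
            Packages bean = new Packages();
            Method setter = Packages.class.getMethod("setUnitCount", int.class);
            setter.invoke(bean, Integer.parseInt(rawCell.trim()));
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSetter(Packages.class, "setUnitCount", String.class)); // no String setter
        System.out.println(hasSetter(Packages.class, "setUnitCount", int.class));    // int setter exists
        System.out.println(fromCell("42").getUnitCount());
    }
}
```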

How to rename Columns via Lambda function - fasterXML

I'm using the FasterXML library to parse my CSV file. The CSV file has the column names in its first line. Unfortunately I need the columns to be renamed. I have a lambda function for this: I pass in the value read from the CSV file and get the new value back.
My code looks like this, but it does not work.
CsvSchema csvSchema =CsvSchema.emptySchema().withHeader();
ArrayList<HashMap<String, String>> result = new ArrayList<HashMap<String, String>>();
MappingIterator<HashMap<String,String>> it = new CsvMapper().reader(HashMap.class)
.with(csvSchema )
.readValues(new File(fileName));
while (it.hasNext())
result.add(it.next());
System.out.println("changing the schema columns.");
for (int i=0; i < csvSchema.size();i++) {
String name = csvSchema.column(i).getName();
String newName = getNewName(name);
csvSchema.builder().renameColumn(i, newName);
}
csvSchema.rebuild();
When I print the columns later, they are still the same as in the first line of my CSV file.
Additionally, I noticed that csvSchema.size() equals 0. Why?

You could instead use uniVocity-parsers for that. The following solution streams the input rows to the output so you don't need to load everything in memory to then write your data back with new headers. It will be much faster:
public static void main(String ... args) throws Exception{
Writer output = new StringWriter(); // use a FileWriter for your case
CsvWriterSettings writerSettings = new CsvWriterSettings(); //many options here - check the documentation
final CsvWriter writer = new CsvWriter(output, writerSettings);
CsvParserSettings parserSettings = new CsvParserSettings(); //many options here as well
parserSettings.setHeaderExtractionEnabled(true); // indicates the first row of the input are headers
parserSettings.setRowProcessor(new AbstractRowProcessor(){
public void processStarted(ParsingContext context) {
writer.writeHeaders("Column A", "Column B", "... etc");
}
public void rowProcessed(String[] row, ParsingContext context) {
writer.writeRow(row);
}
public void processEnded(ParsingContext context) {
writer.close();
}
});
CsvParser parser = new CsvParser(parserSettings);
Reader reader = new StringReader("A,B,C\n1,2,3\n4,5,6"); // use a FileReader for your case
parser.parse(reader); // all rows are parsed and submitted to the RowProcessor implementation of the parserSettings.
System.out.println(output.toString());
//nothing else to do. All resources are closed automatically in case of errors.
}
You can easily select the columns by using parserSettings.selectFields("B", "A") in case you want to reorder/eliminate columns.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
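On the two observations in the question, as far as I can tell: CsvSchema objects are immutable, so builder calls whose result is discarded leave csvSchema unchanged, and with emptySchema().withHeader() the column names are only read from the file during parsing, which would explain size() being 0. If the rows are already loaded as maps, one library-independent option is to rename the keys afterwards. The sketch below assumes that approach; RenameColumnsSketch and renameColumns are hypothetical names, and the Function stands in for the question's getNewName.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class RenameColumnsSketch {

    // Returns copies of the rows whose keys have been passed through the rename function.
    static List<Map<String, String>> renameColumns(List<Map<String, String>> rows,
                                                   Function<String, String> rename) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> renamed = new LinkedHashMap<>();   // keeps column order
            for (Map.Entry<String, String> cell : row.entrySet()) {
                renamed.put(rename.apply(cell.getKey()), cell.getValue());
            }
            result.add(renamed);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("A", "1");
        row.put("B", "2");
        // The lambda plays the role of getNewName from the question.
        System.out.println(renameColumns(List.of(row), name -> "col_" + name.toLowerCase()));
    }
}
```

This trades memory for simplicity, since every row is copied; the streaming approach in the answer above avoids that.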
