Let's say I have a class, Car, and I'm trying to import a large set of data to create multiple instances of "Car".
My CSV file is laid out like so:
Car Manufacturer,Model,Color,Owner,MPG,License Plate,Country of Origin,VIN,... etc
The point is, there is a lot of data that needs to go into the constructor. If there were only a few of these, it wouldn't be that bad to instantiate them manually by writing Car FordFocus = new Car(Ford,Focus,Blue,John Doe,108-J1AZ,USA,194241-12e1...), but if I have hundreds of them, is there any way to import all this data and create the instances?
As George mentions, you need a tool. I have used opencsv before to achieve this.
opencsv provides three mapping strategies (which can be further extended) for mapping a CSV row to a bean. The simplest is ColumnPositionMappingStrategy. So if your CSV format is fixed, e.g. the header row looks like:
Car Manufacturer,Model,Color,Owner,MPG,License Plate,Country of Origin,VIN,... etc
then the code snippet that follows will help you. I have also used HeaderColumnNameTranslateMappingStrategy, which lets you map CSV header names to bean field names, e.g. "Car Manufacturer" -> carManufacturer.
CSVReader csvReader = new CSVReader(new FileReader(csvFile));
ColumnPositionMappingStrategy<Car> strategy = new ColumnPositionMappingStrategy<Car>();
strategy.setType(Car.class);
String[] columns = new String[] {"CarManufacturer","Model","Color","Owner","MPG","LicensePlate","CountryOfOrigin","VIN"}; // the fields to bind to in your JavaBean
strategy.setColumnMapping(columns);
CsvToBean<Car> csv = new CsvToBean<Car>();
List<Car> list = csv.parse(strategy, csvReader);
A self contained sample program can be found here
Reflection is a possibility.
You can associate an attribute with a position in your CSV file (a column).
For an example of setting a field value with reflection, see: https://docs.oracle.com/javase/tutorial/reflect/member/fieldValues.html
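A minimal sketch of the reflection idea, assuming a hypothetical Car bean with three String fields and a CSV without quoted values or embedded commas. The class name ReflectionCsvMapper, the field names and the COLUMN_TO_FIELD array are illustrative assumptions, not part of any library:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ReflectionCsvMapper {

    // Hypothetical bean; the field names are assumptions made for this sketch.
    static class Car {
        String manufacturer;
        String model;
        String color;
    }

    // Position i in this array names the field that receives CSV column i.
    static final String[] COLUMN_TO_FIELD = {"manufacturer", "model", "color"};

    static Car fromLine(String line) {
        // Naive split: assumes no quoted fields and no commas inside values.
        String[] values = line.split(",", -1);
        try {
            Car car = new Car();
            for (int i = 0; i < COLUMN_TO_FIELD.length && i < values.length; i++) {
                Field field = Car.class.getDeclaredField(COLUMN_TO_FIELD[i]);
                field.setAccessible(true);
                field.set(car, values[i].trim());
            }
            return car;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map line: " + line, e);
        }
    }

    static List<Car> fromLines(List<String> lines) {
        List<Car> cars = new ArrayList<>();
        for (String line : lines) {
            cars.add(fromLine(line));
        }
        return cars;
    }
}
```

Adding a column then only means adding a field and one entry in the array. Non-String fields would need a conversion step before field.set, which this sketch leaves out.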
You can read the CSV file line by line and create each Car object through its constructor in a loop.
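A rough sketch of that loop, assuming a hypothetical Car class that takes only the first three columns, a header row on the first line, and values without quotes or embedded commas (the names CarCsvLoader and load are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CarCsvLoader {

    // Hypothetical Car limited to three columns to keep the sketch short.
    static class Car {
        final String manufacturer;
        final String model;
        final String color;

        Car(String manufacturer, String model, String color) {
            this.manufacturer = manufacturer;
            this.model = model;
            this.color = color;
        }
    }

    static List<Car> load(Reader source) {
        List<Car> cars = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine(); // the first line is the header row; skip it
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines
                }
                // Naive split; a row with fewer than three columns would fail here.
                String[] c = line.split(",", -1);
                cars.add(new Car(c[0].trim(), c[1].trim(), c[2].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cars;
    }
}
```

For a real file, a java.io.FileReader pointing at the CSV can be passed to load. Once quoted fields or embedded commas appear, a CSV library is the safer choice.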
I am writing a small java method that needs to read test data from a file on my win10 laptop.
The test data has not been formed yet but it will be text based.
I need to write a method that reads the data and analyses it character by character.
My questions are:
What is the simplest format for creating and reading the file? I was looking at JSON, which does not look particularly complex, but is it the best choice for a very simple application?
My second question (I am a novice): if the data is in a text file on my laptop, how do I tell my Java code where to find it? How do I ask Java to navigate the Win10 file system?
You can also map the text file into Java objects (how exactly depends on your text file).
For example, say we have a text file that contains a person's name and family name on each line, like:
Foo,bar
John,doe
To parse the text file above and map it into Java objects, we can:
1- Create a Person class
2- Read and parse the file (line by line)
Create the Person class
public class Person {
    private String name;
    private String family;
    //setters and getters
}
Read the file and parse it line by line
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import com.google.common.base.Splitter; // Splitter comes from Google Guava

public static void main(String[] args) throws IOException {
    //Read the file, parse it line by line and map each line into a Person object
    List<Person> personList = Files
            .lines(Paths.get("D:\\Project\\Code\\src\\main\\resources\\person.txt"))
            .map(line -> {
                //Split the line on "," into a list of strings, e.g. "John,Doe" -> [John, Doe]
                List<String> nameAndFamily = Splitter.on(",").trimResults().omitEmptyStrings().splitToList(line);
                //Create a new Person from the two values
                Person person = new Person();
                person.setName(nameAndFamily.get(0));
                person.setFamily(nameAndFamily.get(1));
                return person;
            })
            .collect(Collectors.toList());
    //Process the person list
    personList.forEach(person -> {
        //You can do whatever you want with each person; here we just print it
        System.out.println(person.getName());
        System.out.println(person.getFamily());
    });
}
Regarding your first question, I can't say much without knowing anything about the data you would like to write and read.
For your second question, you would normally do something like this:
String pathToFile = "C:/Users/SomeUser/Documents/testdata.txt";
InputStream in = new FileInputStream(pathToFile);
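Since the question asks for character-by-character analysis, here is a small sketch of what that could look like once the file is open. The class name CharacterScanner and the choice of counting characters are assumptions made only to have something concrete to analyse:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

class CharacterScanner {

    // Counts how often each character occurs in the input.
    static Map<Character, Integer> countCharacters(Reader source) {
        Map<Character, Integer> counts = new TreeMap<>();
        try (Reader reader = source) {
            int c;
            while ((c = reader.read()) != -1) { // read() returns -1 at end of input
                counts.merge((char) c, 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }
}
```

A reader for a real file can be obtained with Files.newBufferedReader(Paths.get("C:/Users/SomeUser/Documents/testdata.txt")), which decodes the bytes into characters (UTF-8 by default).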
As your data gains complexity, you should probably think about using a defined format such as JSON or YAML, if that is possible.
Hope this helps a bit. Good luck with your project.
As for the format the text file needs to take, you should elaborate a bit on the kind of data. So I can't say much there.
But to navigate the file system, you just need to write the path a bit differently:
Keep the drive letter and its colon (e.g. C:) at the beginning of the path
Replace each backslash with a forward slash (or escape each backslash as \\ inside a Java string literal)
Then you should be set.
So for example...
C:\users\johndoe\documents\projectfiles\mydatafile.txt
becomes
C:/users/johndoe/documents/projectfiles/mydatafile.txt
With this path, you can use all the IO classes for file manipulation.
I am writing a method to look up a specific ID that is stored in a txt file.
These details are assigned to an ArrayList named list. If the lookup string matches the data stored in list, the method reads the id, firstname and surname (i.e. the whole line of the txt file) and then creates an instance of another class, Profile.
I then want to add this lookup data to a new ArrayList named lookup and then output it. I have the method below; however, it does not work and just jumps to my else clause.
Could anyone tell me where I'm going wrong and how to fix it? Thanks.
Could you instead use a TupleMap for the same effect?
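// create our map (Tuple2 is not a JDK class; a simple pair type with public fields Item1 and Item2 is assumed)
Map<String, Tuple2<Person, Person>> peopleByForename = new HashMap<String, Tuple2<Person, Person>>();
// populate it
peopleByForename.put("Bob", new Tuple2<Person, Person>(new Person("Bob Smith"), new Person("Bob Jones")));
// read from it
Tuple2<Person, Person> bobs = peopleByForename.get("Bob");
Person bob1 = bobs.Item1;
Person bob2 = bobs.Item2;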
Then an example of reading the key:value from the txt file can be found here : Java read txt file to hashmap, split by ":" using Buffered Reader.
If you are using Java 8, you can use a lambda to help you. Just replace this line:
if(list.contains(IDlookup))
with this one:
boolean containsId = list.stream().anyMatch((Profile p) -> p.getId().equals(IDlookup));
if (containsId)
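If list holds Profile objects, list.contains(IDlookup) compares a String against Profile instances and so never matches, which would explain why the else clause always runs. A self-contained sketch of the whole lookup follows; the Profile class, the "id,firstname,surname" line format and the method names are assumptions, since the question does not show the real code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ProfileLookup {

    // Hypothetical Profile; the real class from the question is not shown.
    static class Profile {
        private final String id;
        private final String firstName;
        private final String surname;

        Profile(String id, String firstName, String surname) {
            this.id = id;
            this.firstName = firstName;
            this.surname = surname;
        }

        String getId() { return id; }
        String getFirstName() { return firstName; }
        String getSurname() { return surname; }
    }

    // Parses lines of the assumed form "id,firstname,surname".
    static List<Profile> parse(List<String> lines) {
        List<Profile> profiles = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",", -1);
            profiles.add(new Profile(parts[0].trim(), parts[1].trim(), parts[2].trim()));
        }
        return profiles;
    }

    // Collects every profile whose id equals the lookup string.
    static List<Profile> findById(List<Profile> list, String idLookup) {
        return list.stream()
                .filter(p -> p.getId().equals(idLookup))
                .collect(Collectors.toList());
    }
}
```

The list returned by findById plays the role of the lookup list from the question; an empty result corresponds to the else branch.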
I have many CSV files with different column headers. Currently I read those CSV files and map them to different POJO classes based on their column headers. Some of the CSV files have around 100 columns, which makes it difficult to create a POJO class.
Is there any technique that lets me use a single POJO, so that all of those CSV files map to one class? Or should I read each CSV file line by line and parse it accordingly, or create the POJO at runtime (Javassist)?
If I understand your problem correctly, you can use uniVocity-parsers to process this and get the data in a map:
//First create a configuration object - there are many options
//available and the tutorial has a lot of examples
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true);
CsvParser parser = new CsvParser(settings);
parser.beginParsing(new File("/path/to/your.csv"));
// you can also apply some transformations:
// NULL year should become 0000
parser.getRecordMetadata().setDefaultValueOfColumns("0000", "Year");
// decimal separator in prices will be replaced by comma
parser.getRecordMetadata().convertFields(Conversions.replace("\\.00", ",00")).set("Price");
Record record;
while ((record = parser.parseNextRecord()) != null) {
Map<String, String> map = record.toFieldMap(/*you can pass a list of column names of interest here*/);
//for performance, you can also reuse the map and call record.fillFieldMap(map);
}
Or you can even parse the file and get beans of different types in a single step. Here's how you do it:
CsvParserSettings settings = new CsvParserSettings();
//Create a row processor to process input rows. In this case we want
//multiple instances of different classes:
MultiBeanListProcessor processor = new MultiBeanListProcessor(TestBean.class, AmountBean.class, QuantityBean.class);
// we also need to grab the headers from our input file
settings.setHeaderExtractionEnabled(true);
// configure the parser to use the MultiBeanProcessor
settings.setRowProcessor(processor);
// create the parser and run
CsvParser parser = new CsvParser(settings);
parser.parse(new File("/path/to/your.csv"));
// get the beans:
List<TestBean> testBeans = processor.getBeans(TestBean.class);
List<AmountBean> amountBeans = processor.getBeans(AmountBean.class);
List<QuantityBean> quantityBeans = processor.getBeans(QuantityBean.class);
See an example here and here
If your data is too big and you can't hold everything in memory, you can stream the input row by row by using the MultiBeanRowProcessor instead. The method rowProcessed(Map<Class<?>, Object> row, ParsingContext context) will give you a map of instances created for each class in the current row. Inside the method, just call:
AmountBean amountBean = (AmountBean) row.get(AmountBean.class);
QuantityBean quantityBean = (QuantityBean) row.get(QuantityBean.class);
...
//perform something with the instances parsed in a row.
Hope this helps.
Disclaimer: I'm the author of this library. It's open-source and free (Apache 2.0 license)
To me, creating a POJO class is not a good idea in this case, as neither the number of columns nor the number of files is constant.
Therefore, it is better to use something more dynamic, so that you do not have to change your code much just to support more columns or files.
I would go for a List (or Map) of Maps, e.g. List<Map<String, String>>, for a given CSV file,
where each map represents one row of the CSV file, with the column name as the key.
You can easily extend it to multiple csv files.
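A minimal sketch of the list-of-maps idea in plain Java, assuming the first line is the header and that values contain no quotes or embedded commas (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvRows {

    // The first line is the header; every following line becomes one map keyed by column name.
    static List<Map<String, String>> toRows(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String[] header = lines.get(0).split(",", -1);
        for (String line : lines.subList(1, lines.size())) {
            String[] values = line.split(",", -1); // naive split, no quote handling
            Map<String, String> row = new LinkedHashMap<>(); // keeps the column order
            for (int i = 0; i < header.length; i++) {
                // Missing trailing values become empty strings.
                row.put(header[i].trim(), i < values.length ? values[i].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Nothing here depends on the number or names of the columns, so the same method works for every file. To handle several files, the results could be kept in a Map from file name to its list of rows.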
I am going to build an application that compares two .csv lists using OpenCSV. It should work like this:
Open 2 .csv files (each file has the columns: Name,Emails)
Save the results (and here is the problem: I don't know whether they should be saved to a table or something else)
Compare the values of the "Emails" column from List1 and List2.
If an email from List1 appears in List2, delete it (from List1)
Export the results to a new .csv file
I don't know if this is a good algorithm. Please tell me which option for storing the results of reading the .csv files is best in this case.
Kind Regards
You can get around this more easily with univocity-parsers as it can read your data into columns:
CsvParserSettings parserSettings = new CsvParserSettings(); //parser config with many options, check the tutorial
parserSettings.setHeaderExtractionEnabled(true); // uses the first row as headers
// To get the values of all columns, use a column processor
ColumnProcessor rowProcessor = new ColumnProcessor();
parserSettings.setRowProcessor(rowProcessor);
CsvParser parser = new CsvParser(parserSettings);
//This will parse everything and pass the data to the column processor
parser.parse(new FileReader(new File("/path/to/your/file.csv")));
//Finally, we can get the column values:
Map<String, List<String>> columnValues = rowProcessor.getColumnValuesAsMapOfNames();
Let's say you parsed the second CSV with that. Just grab the emails and create a set:
Set<String> emails = new HashSet<>(columnValues.get("Email"));
Now just iterate over the first CSV and check if the emails are in the emails set.
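That last step could look roughly like this in plain Java, independent of which parser produced the rows. The names EmailFilter and removeKnownEmails are made up, and the case-insensitive comparison is an assumption about how emails should be matched:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class EmailFilter {

    // Keeps the rows of list 1 whose email (at index emailColumn) does not occur in list 2.
    static List<String[]> removeKnownEmails(List<String[]> list1Rows,
                                            List<String> list2Emails,
                                            int emailColumn) {
        Set<String> known = new HashSet<>();
        for (String email : list2Emails) {
            known.add(normalize(email));
        }
        List<String[]> kept = new ArrayList<>();
        for (String[] row : list1Rows) {
            if (!known.contains(normalize(row[emailColumn]))) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Assumption: emails match case-insensitively and ignoring surrounding whitespace.
    static String normalize(String email) {
        return email.trim().toLowerCase(Locale.ROOT);
    }
}
```

Because the set lookup is constant time on average, the comparison stays linear in the size of the two files. The kept rows are what would then be written to the new .csv file.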
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
If you have a hard requirement to use openCSV then here is what I believe is the easiest solution:
First off, I like Jeronimo's suggestion about the HashSet. Read the second CSV file first using CsvToBean and save the email addresses in the HashSet.
Then create a filter class that implements the CsvToBeanFilter interface. Pass the set in through the constructor, and in the allowLine method look up the email address and return true if it is not in the set (so you have a quick lookup).
Then pass the filter to CsvToBean.parse when reading/parsing the first file, and all you will get are the records from the first file whose email addresses are not in the second file. The CsvToBeanFilter javadoc has a good example that shows how this works.
Lastly, use BeanToCsv to create a file from the filtered list.
In the interest of fairness, I am the maintainer of the openCSV project, and it is also open source and free (Apache V2.0 license).
For a project I need to deal with CSV files whose columns I do not know before runtime. The CSV files are perfectly valid; I only need to perform a simple task on several different files over and over again. I do need to analyse the values of the columns, which is why I need a library for working with CSV files. For simplicity, let's assume that I need to do something simple like appending a date column to all files, regardless of how many columns they have. I want to do that with Super CSV, because I use the library for other tasks as well.
What I am struggling with is more of a conceptual issue. I am not sure how to deal with the files if I do not know in advance how many columns there are. I am not sure how I should define POJOs that map arbitrary CSV files, or how I should define the cell processors if I do not know which and how many columns will be in the file. How can I dynamically create cell processors that match the number of columns? How would I define POJOs, for instance, based on the header of the CSV file?
Consider the case where I have two CSV files: product.csv and address.csv. Let's assume I want to append a date column with today's date to both files, without having to write two different methods (e.g. addDateColumnToProduct() and addDateColumnToAddress()) which do the same thing.
product.csv:
name, description, price
"Apple", "red apple from Italy","2.5€"
"Orange", "orange from Spain","3€"
address.csv:
firstname, lastname
"John", "Doe"
"Coole", "Piet"
Based on the header information of the CSV files, how could I define a POJO that maps the product CSV? The same question for Cell Processors? How could I define even a very simple cell processor that just basically has the right amount of parameters for the constructor, e.g. for the product.csv
CellProcessor[] processor = new CellProcessor[] {
null,
null,
null
};
and for the address.csv:
CellProcessor[] processor = new CellProcessor[] {
null,
null
};
Is this even possible? Am I on the wrong track to achieve this?
Edit 1:
I am not looking for a solution that can deal with CSV files having variable columns in one file. I try to figure out if it is possible dealing with arbitrary CSV files during runtime, i.e. can I create POJOs based only on the header information which is contained in the CSV file during runtime. Without knowing in advance how many columns a csv file will have.
Solution
Based on the answer and comments from @baba
private static void readWithCsvListReader() throws Exception {
    ICsvListReader listReader = null;
    try {
        listReader = new CsvListReader(new FileReader(fileName), CsvPreference.TAB_PREFERENCE);
        listReader.getHeader(true); // skip the header (can't be used with CsvListReader)
        int amountOfColumns = listReader.length();
        CellProcessor[] processor = new CellProcessor[amountOfColumns];
        List<Object> customerList;
        while( (customerList = listReader.read(processor)) != null ) {
            System.out.println(String.format("lineNo=%s, rowNo=%s, customerList=%s", listReader.getLineNumber(),
                    listReader.getRowNumber(), customerList));
        }
    }
    finally {
        if( listReader != null ) {
            listReader.close();
        }
    }
}
Maybe a little bit late but could be helpful...
CellProcessor[] processors = new CellProcessor[properties.size()];
for (int i = 0; i < properties.size(); i++) {
    processors[i] = new Optional();
}
return processors;
This is a very common issue and there are multiple tutorials on the internet, including the Super CSV page:
http://supercsv.sourceforge.net/examples_reading_variable_cols.html
As this line says:
As shown below you can execute the cell processors after calling
read() by calling the executeProcessors() method. Because it's done
after reading the line of CSV, you have an opportunity to check how
many columns there are (using listReader.length()) and supplying the
correct number of processors.
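The same "read first, size the processors afterwards" pattern can be sketched without any library. This is only an analogue of the idea, not Super CSV's API: the class DynamicProcessors, its methods and the naive comma split are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class DynamicProcessors {

    // Builds one processor per column once the column count is known.
    // Here every processor simply trims its value.
    static List<UnaryOperator<String>> processorsFor(int columnCount) {
        List<UnaryOperator<String>> processors = new ArrayList<>();
        for (int i = 0; i < columnCount; i++) {
            processors.add(String::trim);
        }
        return processors;
    }

    // Reads the row first, then creates matching processors, applies them,
    // and appends one extra column (for example today's date).
    static List<String> processAndAppend(String line, String extraValue) {
        String[] columns = line.split(",", -1); // naive split, no quote handling
        List<UnaryOperator<String>> processors = processorsFor(columns.length);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < columns.length; i++) {
            out.add(processors.get(i).apply(columns[i]));
        }
        out.add(extraValue);
        return out;
    }
}
```

Since the processors are created from the row that was just read, the same method handles product.csv and address.csv alike; LocalDate.now().toString() could be passed as the extra value.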