I have been looking for the past two hours for a solution to my problem, in vain. I'm trying to read a CSV file using Apache Commons CSV. I am able to read the whole file, but my problem is: how do I extract only the header of the CSV into an array?
I looked everywhere and even the solution above didn't work.
For anyone else with this issue, this does.
Reader in = new FileReader(fileLocation);
Iterable<CSVRecord> records = CSVFormat.EXCEL.withHeader().withSkipHeaderRecord(false).parse(in);
Set<String> headers = records.iterator().next().toMap().keySet();
Note that your use of .next() has consumed one row of the CSV.
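If consuming a row is a problem, a hedged alternative with the same library (Commons CSV 1.7 or later) is to ask the parser itself for the header names, which does not advance past any data record:
Reader in = new FileReader(fileLocation);
CSVParser parser = CSVFormat.EXCEL.withFirstRecordAsHeader().parse(in);
List<String> headers = parser.getHeaderNames(); // header row only, no data record consumed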
By default, the first record read by CSVParser will be the header record, as in the example below:
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
FileReader fileReader = new FileReader("file");
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
List<CSVRecord> csvRecords = csvFileParser.getRecords();
csvRecords.get(0) will return the header record.
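If you only want the header names rather than a data record, a small hedged variation on the snippet above is to read the header map (column name to index) from the parser itself:
Map<String, Integer> headerMap = csvFileParser.getHeaderMap(); // name -> column index
Set<String> headers = headerMap.keySet();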
BufferedReader br = new BufferedReader(new FileReader(filename));
CSVParser parser = CSVParser.parse(br, CSVFormat.EXCEL.withFirstRecordAsHeader());
List<String> headers = parser.getHeaderNames();
This worked for me. The last line is what you need: it extracts the headers found by the parser into a List of Strings.
Since Apache Commons CSV 1.9.0, the withSkipHeaderRecord() and withFirstRecordAsHeader() methods are deprecated; a builder interface is provided instead. Use it like this:
CSVFormat format = CSVFormat.DEFAULT.builder()
.setHeader()
.setSkipHeaderRecord(true)
.build();
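A short hedged usage sketch with the format built above (the file name is a placeholder):
try (CSVParser parser = format.parse(new FileReader("file.csv"))) {
    System.out.println(parser.getHeaderNames()); // header columns only
    for (CSVRecord record : parser) {
        // data records; the header row itself is skipped because of setSkipHeaderRecord(true)
    }
}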
In Kotlin:
val reader = File(path).bufferedReader()
val records = CSVFormat.DEFAULT.withFirstRecordAsHeader()
.withIgnoreHeaderCase()
.withTrim()
.parse(reader)
println(records.headerNames)
The code below works for me:
import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import org.apache.commons.csv.*;
public static String[] headersInCSVFile (String csvFilePath) throws IOException {
//reading file
CSVFormat csvFileFormat = CSVFormat.DEFAULT;
FileReader fileReader = new FileReader(csvFilePath);
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
List<CSVRecord> csvRecords = csvFileParser.getRecords();
csvFileParser.close();
//Obtaining the first record and splitting its toString() into an array using delimiters,
//then dropping the leading metadata tokens (hence the offset of 6)
String[] headers = csvRecords.get(0).toString().split("[,'=\\]\\[]+");
String[] result = new String[headers.length - 6];
for (int i = 6; i < headers.length; i++) {
//.replaceAll("\\s", "") removes whitespace
result[i - 6] = headers[i].replaceAll("\\s", "");
}
return result;
}
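The string splitting works, but as a simpler sketch of the same idea (hedged, using CSVRecord's own accessors), the header values can be read from the first record directly, which avoids depending on the toString() layout:
CSVRecord headerRecord = csvRecords.get(0);
String[] headers = new String[headerRecord.size()];
for (int i = 0; i < headerRecord.size(); i++) {
    headers[i] = headerRecord.get(i).trim();
}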
I'm using the Apache Commons CSV 1.9.0 library to parse a CSV file. The problem is that I cannot set a comment marker "#" to populate the comment field of the records so that comment lines are skipped when looping through the file.
This is the code I'm using:
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;
Reader reader = Files.newBufferedReader((Paths.get(filename)), StandardCharsets.UTF_16LE);
CSVFormat csvFormat = CSVFormat.DEFAULT;
csvFormat.builder().setCommentMarker('#');
Iterable<CSVRecord> records = csvFormat.parse(reader);
char marker = csvFormat.getCommentMarker(); // marker is for test and it is empty.
for (CSVRecord record : records)
{
if (record.isSet(SHEET_COLUMN_1))
{
// TODO
}
}
Can you please help me with this?
Kind regards,
Maan
CSVFormat.builder() creates a new builder instance, but you're still using the old csvFormat instance.
Use:
CSVFormat csvFormat = CSVFormat.DEFAULT
.builder()
.setCommentMarker('#')
.build();
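A minimal hedged sketch of the corrected flow, reusing the reader from the question; CSVFormat objects are immutable, so the value returned by build() is the one that carries the comment marker:
CSVFormat csvFormat = CSVFormat.DEFAULT
    .builder()
    .setCommentMarker('#')
    .build();
System.out.println(csvFormat.getCommentMarker()); // prints #
for (CSVRecord record : csvFormat.parse(reader)) {
    // lines starting with '#' are treated as comments and are not returned here
}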
I am having an issue when reading from a file and splitting the rows. I can read the file like this:
CSVReader reader = new CSVReader(new FileReader(FileName), '|' , '"' , 0);
Then when I want to get the individual values, I can read them like this:
String[] record = rowString.split(",");
The issue, of course, is that a comma is not the most reliable way to split a row. Is there any way to split the string on the pipe delimiter, like this?
String[] record = rowString.split("\\|");
This is how I am reading the lines; it may be in this code that I need to make the adjustment:
for(String[] row : allRows){
String rowString = Arrays.toString(row).toString();
String[] record = rowString.split(",");
}
Thank you.
I don't know if this answers the question, but in my case this solved the problem:
val reader: Reader = Files.newBufferedReader(path)
val csvToBean = CsvToBeanBuilder<MyCsvSchema>(reader)
.withType(MyCsvSchema::class.java)
.withSeparator('|')
.withIgnoreLeadingWhiteSpace(true)
.build()
val list = csvToBean.parse()
This is Kotlin code.
I am using Apache Commons CSV to write CSV files, and I want to stick with this library. When I write a CSV file, the first column of the generated file contains double quotes as the quote character, while the other columns are generated as expected.
I really want to get rid of the double quotes here. Please find the code below.
CSVFormat format = CSVFormat.DEFAULT;
FileWriter fw = new FileWriter("Temp.csv");
CSVPrinter printer = new CSVPrinter(fw, format);
String[] temp = new String[4];
for(int i=0; i<4; i++) {
if(i==1)
temp[0] = "#";
else
temp[0] = "";
temp[1] = "hello" + (i+1);
temp[2] = "";
temp[3] = "test";
Object[] temp1 = temp;
printer.printRecord(temp1);
}
printer.close();
fw.close();
Temp.csv
"",hello1,,test
"#",hello2,,test
"",hello3,,test
"",hello4,,test
I don't want a quote character at the beginning of every row. I just want an empty string without quotes, same as in column 3. Can anyone help?
As mentioned in the issue tracker, try setting the CSVFormat to the following:
final CSVFormat csvFileFormat = CSVFormat.DEFAULT.withEscape('\\').withQuoteMode(QuoteMode.NONE);
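Applied to the printer from the question, a hedged sketch (QuoteMode is org.apache.commons.csv.QuoteMode): with QuoteMode.NONE nothing is quoted, and characters that would otherwise require quoting are escaped with the configured escape character, so the first column comes out bare:
CSVFormat format = CSVFormat.DEFAULT.withEscape('\\').withQuoteMode(QuoteMode.NONE);
CSVPrinter printer = new CSVPrinter(new FileWriter("Temp.csv"), format);
printer.printRecord("", "hello1", "", "test");
printer.printRecord("#", "hello2", "", "test");
printer.close();
// Expected Temp.csv contents (no surrounding quotes):
// ,hello1,,test
// #,hello2,,test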
This is a known issue. You can vote for it in the Apache Commons CSV issue tracker:
https://issues.apache.org/jira/browse/CSV-63
I'm using the FasterXML library to parse my CSV file. The CSV file has the column names in its first line. Unfortunately, I need the columns to be renamed. I have a lambda function for this, to which I can pass the value read from the CSV file and get the new value back.
My code looks like this, but it does not work:
CsvSchema csvSchema =CsvSchema.emptySchema().withHeader();
ArrayList<HashMap<String, String>> result = new ArrayList<HashMap<String, String>>();
MappingIterator<HashMap<String,String>> it = new CsvMapper().reader(HashMap.class)
.with(csvSchema )
.readValues(new File(fileName));
while (it.hasNext())
result.add(it.next());
System.out.println("changing the schema columns.");
for (int i=0; i < csvSchema.size();i++) {
String name = csvSchema.column(i).getName();
String newName = getNewName(name);
csvSchema.builder().renameColumn(i, newName);
}
csvSchema.rebuild();
When I try to print out the columns later, they are still the same as in the top line of my CSV file.
Additionally, I noticed that csvSchema.size() equals 0. Why is that?
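As a hedged sketch of the immediate problem in the snippet above, using only the calls that already appear there: CsvSchema is immutable, so the builder returned by rebuild() and the schema returned by build() must be captured and used. Note also that a schema created with emptySchema().withHeader() reports size() == 0 until the header has actually been read, so the loop below only does something once csvSchema really contains columns:
CsvSchema.Builder builder = csvSchema.rebuild();   // keep the builder instance
for (int i = 0; i < csvSchema.size(); i++) {
    String newName = getNewName(csvSchema.column(i).getName());
    builder.renameColumn(i, newName);              // renameColumn as used in the question
}
CsvSchema renamedSchema = builder.build();         // keep and use the rebuilt schema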
You could instead use uniVocity-parsers for that. The following solution streams the input rows to the output so you don't need to load everything in memory to then write your data back with new headers. It will be much faster:
public static void main(String ... args) throws Exception{
Writer output = new StringWriter(); // use a FileWriter for your case
CsvWriterSettings writerSettings = new CsvWriterSettings(); //many options here - check the documentation
final CsvWriter writer = new CsvWriter(output, writerSettings);
CsvParserSettings parserSettings = new CsvParserSettings(); //many options here as well
parserSettings.setHeaderExtractionEnabled(true); // indicates the first row of the input are headers
parserSettings.setRowProcessor(new AbstractRowProcessor(){
public void processStarted(ParsingContext context) {
writer.writeHeaders("Column A", "Column B", "... etc");
}
public void rowProcessed(String[] row, ParsingContext context) {
writer.writeRow(row);
}
public void processEnded(ParsingContext context) {
writer.close();
}
});
CsvParser parser = new CsvParser(parserSettings);
Reader reader = new StringReader("A,B,C\n1,2,3\n4,5,6"); // use a FileReader for your case
parser.parse(reader); // all rows are parsed and submitted to the RowProcessor implementation of the parserSettings.
System.out.println(output.toString());
//nothing else to do. All resources are closed automatically in case of errors.
}
You can easily select the columns by using parserSettings.selectFields("B", "A") in case you want to reorder/eliminate columns.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
Here is the line I am currently using:
File booleanTopicFile;
// booleanTopicFile is csv file uploaded from form
CSVReader csvReader = new CSVReader(new InputStreamReader(new FileInputStream(booleanTopicFile), "UTF-8"));
I want to skip the first line of the CSV, which contains the headings.
I don't want to use any separator except the default comma (,), which is already used by the default constructor.
In the parameterized constructor there is an option to skip a number of lines, but how do I deal with the 2nd and 3rd parameters of this constructor?
CSVReader(Reader reader, char c, char c1, int index)
--
Thanks
This constructor of the CSVReader class will skip the first line of the CSV while reading the file:
CSVReader reader = new CSVReader(new FileReader(file), ',', '\'', 1);
At least since version 3.8 you can use the CSVReaderBuilder and set it to skip the first line.
Example:
CSVReader reader = new CSVReaderBuilder(inputStreamReader)
.withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
// Skip the header
.withSkipLines(1)
.build();
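For completeness, a small hedged sketch of how the built reader might then be consumed; readNext() also throws CsvValidationException in opencsv 5.x, hence the broad throws clause:
static void printRows(CSVReader reader) throws Exception {
    String[] row;
    while ((row = reader.readNext()) != null) {
        // the header was already skipped by withSkipLines(1)
        System.out.println(String.join(" | ", row));
    }
    reader.close();
}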
I found this question and the responses helpful; I'd like to expand on Christophe Roussy's comment. In the latest opencsv (2.3 as of this writing), the actual line of code is:
new CSVReader( new StringReader(csvText), CSVParser.DEFAULT_SEPARATOR,
CSVParser.DEFAULT_QUOTE_CHARACTER, 1);
Note that the default constants come from CSVParser rather than CSVReader.
With the latest opencsv version, use:
CSVReader csvReader = new CSVReaderBuilder(new FileReader("book.csv")).withSkipLines(1).build();
You can also use withFilter:
watFileCsvBeans = new CsvToBeanBuilder<ClassType>(isr)
.withType(ClassType.class)
.withIgnoreLeadingWhiteSpace(true)
// CsvToBeanFilter with a custom allowLine implementation
.withFilter(line -> !line[0].equals("skipme"))
.build()
.parse();
This works in my case; withSkipLines did not work for me.
opencsv version: 5.5.2