How to read an Excel CSV file using Apache Commons CSV (Java)

I have a CSV file exported from Excel, in which each value sits in its own cell. I'm using Apache Commons CSV to parse this Excel-type CSV. While parsing, every character of the data comes out separated by a null character. I get this CSV file from an external source; when I copy the same data into an Excel CSV file that I create manually, the code below produces the desired output.
My code:
InputStream is = null;
Reader in = null;
is = new ByteArrayInputStream(excelCSVFile);
in = new InputStreamReader(is);
CSVParser parser = new CSVParser(in, CSVFormat.EXCEL
        .withHeader("EMPLOYEE_ID", "NAME", "DOJ", "MOBILE_NO")
        .withSkipHeaderRecord(true)
        .withTrim(true));
List<CSVRecord> records = parser.getRecords();
for (CSVRecord record : records) {
    System.out.println("Employee Id::" + record.get("EMPLOYEE_ID"));
    System.out.println("Employee Name::" + record.get("NAME"));
}
In the output of the code above, a blank character appears between the characters of each value.
When I checked the ASCII value of that blank character, I got 0, which means it is the null character.
I was looking to get the output without those characters.

Use
is = new FileInputStream(excelCSVFile);
instead of
is = new ByteArrayInputStream(excelCSVFile);
Note also that the field names are case-sensitive.
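A null character after every character is the typical symptom of UTF-16 text being decoded with a single-byte charset. That is an assumption about this particular file; the question does not confirm its encoding. A minimal sketch, using only the JDK and a hypothetical class name, of decoding the bytes with an explicit charset before the Reader is handed to the parser:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Apache Commons CSV.
final class CsvBytesDecoder {

    // Wraps the raw bytes in a Reader that decodes them with an explicit charset.
    static Reader open(byte[] csvBytes, Charset charset) {
        return new InputStreamReader(new ByteArrayInputStream(csvBytes), charset);
    }

    // Drains a Reader into a String; only used here to inspect the decoded text.
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Simulated input: assumes the source system wrote the file as UTF-16LE.
        byte[] bytes = "EMPLOYEE_ID,NAME\r\n1,Ann\r\n".getBytes(StandardCharsets.UTF_16LE);
        String wrong = readAll(open(bytes, StandardCharsets.ISO_8859_1));
        String right = readAll(open(bytes, StandardCharsets.UTF_16LE));
        System.out.println("NULs with single-byte charset: " + (wrong.indexOf('\0') >= 0));
        System.out.println("NULs with UTF-16LE: " + (right.indexOf('\0') >= 0));
    }
}
```

If the assumption holds, the Reader returned by open(...) can be passed to new CSVParser(in, format) exactly as in the question. Whether the right charset is UTF-16LE, UTF-16BE or something else has to be checked against the actual file.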

Related

Using string builder to place XML in single csv/xls column not working

I am having an issue with an XML response that I need to format into CSV, so that it can later be opened as an XLS with the entire response in a single cell. I know, it's not how I would do it either, but they get what they ask for.
So far I have tried to use a string builder. This has been successful in formatting the response into a single-line string; I tested this by writing it to a text file and copying it into Eclipse, where it becomes a string once I place single quotes around the XML.
When I take this same single-line response and put it into a CSV file, the CSV breaks on the commas in the XML string and spreads the response across several dozen cells.
BufferedReader br = new BufferedReader(new FileReader(new File('responseXml.txt')))
String l
StringBuilder sb = new StringBuilder()
// join all lines of the response into one single-line string
while ((l = br.readLine()) != null) {
    sb.append(l.trim())
}
br.close()
File respfile = new File("outresp.txt")
respfile.append(sb.toString())
println respfile.text
// verified single line string
def respContents = respfile.text
File file = new File('outXML.csv')
file.append(respContents)
println file.text
// open csv still broke across many lines
What I would like is a single XML string in a single XLS cell.
To fit a value into a single column in CSV (Excel) you have two choices:
remove all new lines (\r\n) and commas (,)
replace each double quote (") with two double quotes ("") and wrap the whole value in double quotes.
The second variant allows you to keep the original (multiline) string format in one Excel cell.
Here is the code for the second variant:
//assume you have whole xml in responseXml variable
def responseXml = '''<?xml version="1.0"?>
<aaa text="hey, you">
<hello name="world"/>
</aaa>
'''
//take xml string in double quotes and escape all doublequotes
responseXml = '"'+ responseXml.replaceAll(/"/,'""') + '"'
def csv = new File('/11/1.csv')
csv.setText("col1,col2,xml\nfoo,bar") // emulate some existing file
csv.append(",${responseXml}\n")
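As a result, the whole XML value ends up in the third column, in a single cell, when the file is opened in Excel.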
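The same quoting rule (double every quote, then wrap the value in quotes) can be sketched in Java. The class and method names below are hypothetical and only illustrate the second variant; they are not part of any CSV library.

```java
// Hypothetical helper illustrating the second variant: double the quotes, then wrap.
final class CsvFieldQuoter {

    // Returns the value as one quoted CSV field; embedded commas and
    // line breaks stay inside the quotes and therefore inside one cell.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String xml = "<aaa text=\"hey, you\">\n<hello name=\"world\"/>\n</aaa>";
        // Appends the XML as the third field of an existing "foo,bar" row.
        System.out.println("foo,bar," + quote(xml));
    }
}
```

A CSV library that quotes fields automatically does the same thing internally, so the manual version is mainly useful when the file is written with plain string concatenation.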

How to keep leading zeros in strings when exporting data with the opencsv library

I am using the opencsv library in Java to export CSV, but I have a problem. When a string begins with zero, such as 0123456, the export removes the 0 and my CSV shows 123456. The zero is missing. I tried this:
"\"\t"+"0123456"+ "\"";
but then the exported CSV shows "0123456", which I don't want. I want 0123456. I don't want to fix it by editing in Excel, because some end users don't know how to do that. How can I export CSV using opencsv and keep the leading 0? Please help.
I think the problem is not really in generating the CSV but in the way Excel treats the data when the file is opened from Explorer.
I tried the code below and viewed the CSV in a text editor (not Excel): the values show up correctly there, though when the file is opened in Excel the leading 0s are lost!
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"));
// feed in your array (or convert your data to an array)
String[] entries = "0123131#21212#021213".split("#");
List<String[]> a = new ArrayList<>();
a.add(entries);
//don't apply quotes
writer.writeAll(a,false);
writer.close();
If you are really sure that you want the leading 0s of numeric values to be visible when the user opens the file in Excel, then each cell entry must be in the format ="dataHere"; see the code below:
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"));
// feed in your array (or convert your data to an array)
String[] entries = "=\"0123131\"#=\"21212\"#=\"021213\"".split("#");
List<String[]> a = new ArrayList<>();
a.add(entries);
writer.writeAll(a);
writer.close();
This is how Excel now shows the file when it is opened from Windows Explorer (by double-clicking):
But if we now view the CSV in a text editor, the data modified to "suit" Excel shows as:
See also:
format-number-as-text-in-csv-when-open-in-both-excel-and-notepad
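The ="dataHere" wrapping can also be built from the raw value instead of being typed into the string literal. The sketch below uses only the JDK; the class and method names are hypothetical, and toCsvField assumes the usual CSV convention of doubling embedded quotes inside a quoted field.

```java
// Hypothetical helper; shows how a raw value becomes an Excel "text formula" cell.
final class ExcelTextCell {

    // Wraps a value as ="value" so that Excel treats it as text and keeps leading zeros.
    static String asTextFormula(String value) {
        return "=\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Quotes a cell for CSV output: embedded quotes are doubled, the whole field is wrapped.
    static String toCsvField(String cell) {
        return "\"" + cell.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String[] raw = {"0123131", "21212", "021213"};
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(toCsvField(asTextFormula(raw[i])));
        }
        // The line as it would appear in a text editor.
        System.out.println(line);
    }
}
```

The trade-off is the one described above: the file is tuned for Excel, and any other consumer of the CSV sees the literal ="..." text.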
Have you tried using a string like "'"+"0123456"? The ' character marks the number as text when it is parsed by Excel.
For me opencsv works correctly (version 5.6).
For example, my CSV file has a row like the following extract:
"999739059";;;"abcdefgh";"001024";
and opencsv correctly reads the field "001024" as 001024, with the leading zeros. Of course, I have mapped the field to a String, not a Double.
But if you still have problems, you can grab a simple yet powerful parser that fully adheres to the RFC 4180 standard:
mkyong.com
Mkyong shows some examples that use opencsv directly and, at the end, writes a simple parser that you can use if you don't want to import opencsv. The parser works very well.
Its source code is easy to understand, so you can modify it or customize it for your needs.

CSVPrinter with line break characters

I'm using org.apache.commons.csv.CSVPrinter (Java 8) to produce a CSV text file from a DB RecordSet. My DB table has a description field in which the user can insert whatever they want, including a new line!
When I import the CSV into Excel or Google Spreadsheet, each record with a new line character in the description corrupts the CSV structure, obviously.
Should I replace/remove these characters manually, or is there a way to configure CSVPrinter to remove them automatically?
Thank you all in advance.
Edit: here is a code snippet:
CSVFormat csvFormat = CSVFormat.DEFAULT.withRecordSeparator("\n").withQuoteMode(QuoteMode.ALL).withQuote('"');
CSVPrinter csvPrinter = new CSVPrinter(csvContent, csvFormat);
// prepare a list of string gathered from the DB. I explicitly use a String array because I need to perform some text editing to DB content before writing it in the CSV
List fasciaOrariaRecord = new ArrayList();
fasciaOrariaRecord.add(...);
fasciaOrariaRecord.add(...);
// ...
csvPrinter.printRecord(csvHeader);
// more rows...
csvPrinter.close();
Any value with line endings should be escaped with quotes. If your CSV library is not doing this for you automatically I'd recommend using univocity-parsers. In your particular case, there is a pre-built routine you can use to dump database contents into CSV.
Try this:
ResultSet resultSet = statement.executeQuery("SELECT * FROM table");
//Get a CSV writer settings object pre-configured for Excel
CsvWriterSettings writerSettings = Csv.writeExcel();
writerSettings.setHeaderWritingEnabled(true); //writes the column names to the output file
CsvRoutines routines = new CsvRoutines(writerSettings);
//use an encoding Excel likes
routines.write(resultSet, new File("/path/to/output.csv"), "windows-1252");
Hope this helps.
Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license)
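If the line breaks are not wanted in the output at all, the manual route mentioned in the question is a small step before each value is added to the record. A minimal sketch with a hypothetical helper, assuming that a single space is an acceptable replacement for a line break:

```java
// Hypothetical helper; flattens free-text values before they are written to CSV.
final class LineBreakCleaner {

    // Replaces every run of CR/LF characters with one space and trims
    // leading and trailing whitespace; null becomes an empty string.
    static String flatten(String value) {
        if (value == null) {
            return "";
        }
        return value.replaceAll("[\\r\\n]+", " ").trim();
    }

    public static void main(String[] args) {
        String description = "first line\r\nsecond line\n";
        System.out.println(flatten(description));
    }
}
```

Each value would be passed through flatten(...) before being added to the list that is handed to printRecord. This loses the original line structure, which quoting the value preserves.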

opencsv content - values after comma

I am using Java and opencsv (2.3) to create CSV files.
The file is created properly, but when I open it, all the data appears in a single column.
To split the values into separate columns:
1. I select "Text to Columns" in the Data tab of Excel.
2. I select ";" as the delimiter.
All the values are then split into separate columns properly, but the values after the comma vanish.
The CSVWriter code I use to create the CSV files:
File file = new File(fileName);
CSVWriter writer = new CSVWriter(new FileWriter(fileName, true), ';');
// five columns: name, id, birth date, registration fee, registration place
String[] col = new String[5];
for (Customer c : CustomerList) {
    col[0] = c.getCustomerName();
    col[1] = c.getCustomerId();
    col[2] = c.getCustomerBirthDate();
    col[3] = c.getRegFee(); // e.g. 145,65
    col[4] = c.getRegPlace();
    writer.writeNext(col);
}
writer.close();
CSV File - Actual content:
"Micky";"1";"19901220";"455,56";"Place1"
"Grace";"2";"19901231";"465,87";"Place2"
CSV File - when opened in Excel:
"Micky";"1";"19901220";"455" // ,56 and Place1 have vanished
"Grace";"2";"19901231";"465" // ,87 and Place2 have vanished
I think the problem is to do with the way you're importing it to Excel.
Using your sample above, I've created a CSV file and opened it in Notepad to verify the content.
If you double-click a CSV file (and have Excel associated with that file type) it will open in Excel and it looks like Excel is attempting to use the comma as a delimiter by default. It displays the data across 2 columns.
If you open Excel, then import the CSV file you can tell Excel that your file is delimited and that the semi-colon is the delimiter. Import using the From Text menu item from the Data tab:
It will then display correctly:
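The difference between the two ways of opening the file comes down to which delimiter is applied to each line. A rough sketch with a hypothetical helper shows the effect on the first sample line. It splits naively and ignores CSV quoting, so it only illustrates the idea and does not reproduce Excel's actual parsing.

```java
import java.util.regex.Pattern;

// Hypothetical helper; counts the pieces a delimiter produces on one line.
final class DelimiterDemo {

    // Naive split that ignores CSV quoting rules.
    static int pieces(String line, char delimiter) {
        return line.split(Pattern.quote(String.valueOf(delimiter)), -1).length;
    }

    public static void main(String[] args) {
        String line = "\"Micky\";\"1\";\"19901220\";\"455,56\";\"Place1\"";
        System.out.println("split on comma: " + pieces(line, ','));
        System.out.println("split on semicolon: " + pieces(line, ';'));
    }
}
```

Splitting on the comma cuts the line inside the fee value, while splitting on the semicolon yields the five intended fields.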

Error Parsing due to CSV Differences Before/After Saving (Java w/ Apache Commons CSV)

I have a 37 column CSV file that I am parsing in Java with Apache Commons CSV 1.2. My setup code is as follows:
//initialize FileReader object
FileReader fileReader = new FileReader(file);
//initialize CSVFormat object
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
//initialize CSVParser object
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
//Get a list of CSV file records
List<CSVRecord> csvRecords = csvFileParser.getRecords();
// process accordingly
My problem is that when I copy the CSV to be processed to my target directory and run my parsing program, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Index for header 'Title' is 7 but CSVRecord only has 6 values!
at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:110)
at launcher.QualysImport.createQualysRecords(Unknown Source)
at launcher.QualysImport.importQualysRecords(Unknown Source)
at launcher.Main.main(Unknown Source)
However, if I copy the file to my target directory, open and save it, then try the program again, it works. Opening and saving the CSV adds back the commas needed at the end, so my program won't complain about not having enough values for the headers it reads.
For context, here is a sample line of before/after saving:
Before (failing): "data","data","data","data"
After (working): "data","data",,,,"data",,,"data",,,,,,
So my question: why does the CSV format change when I open and save it? I'm not changing any values or encoding, and the behavior is the same for MS-DOS or regular .csv format when saving. Also, I'm using Excel to copy/open/save in my testing.
Is there some encoding or format setting I need to be using? Can I solve this programmatically?
Thanks in advance!
EDIT #1:
For additional context, when I first view an empty line in the original file, it just has the new line ^M character like this:
^M
After opening in Excel and saving, it looks like this with all 37 of my empty fields:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,^M
Is this a Windows encoding discrepancy?
Maybe that's a compatibility issue with whatever generated the file in the first place. It seems that Excel accepts a blank line as a valid row with empty strings in each column, with the number of columns to match some other row(s). Then it saves it according to CSV conventions with the column delimiter.
(the ^M is the Carriage Return character; on Microsoft systems it precedes the Line Feed character at the end of a line in text files)
Perhaps you can deal with it by creating your own Reader subclass to sit between the FileReader and the CSVParser. Your reader will read a line, and if it is blank then return a line with the correct number of commas. Otherwise just return the line as-is.
For example:
// needs: import java.io.BufferedReader; import java.io.FileReader; import java.io.IOException;
class MyCSVCompatibilityReader extends BufferedReader
{
    public MyCSVCompatibilityReader(final FileReader fileReader)
    {
        // BufferedReader has no no-arg constructor, so wrap the reader via super
        super(fileReader);
    }
    @Override
    public String readLine() throws IOException
    {
        final String line = super.readLine();
        // null means end of file; only blank lines are replaced
        if (line != null && "".equals(line.trim()))
        { return ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"; }
        else
        { return line; }
    }
}
There are a lot of other details to get right when implementing such a Reader. You'll need to handle all the other methods (close, ready, reset, skip, etc.), and ensure that each of the various read() methods works correctly. If the file fits in memory easily, it might be easier to just read the file, write the fixed version to a new StringWriter, and then pass a StringReader over the result to the CSVParser.
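The in-memory alternative can be sketched with the JDK alone. The class name, the method name and the column-count parameter are hypothetical, and the sketch assumes that no quoted field contains a blank line of its own, because such a line would be padded as well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical helper; rewrites blank lines as empty records before parsing.
final class BlankLinePadder {

    // Copies the input and replaces each blank line with (columns - 1) commas.
    static String padBlankLines(Reader source, int columns) {
        StringBuilder commas = new StringBuilder();
        for (int i = 1; i < columns; i++) {
            commas.append(',');
        }
        String emptyRecord = commas.toString();

        StringWriter fixed = new StringWriter();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                fixed.write(line.trim().isEmpty() ? emptyRecord : line);
                fixed.write("\r\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fixed.toString();
    }

    public static void main(String[] args) {
        String original = "a,b,c\r\n\r\nd,e,f\r\n";
        String padded = padBlankLines(new StringReader(original), 3);
        // A StringReader over the padded text is what would go to the CSVParser.
        System.out.print(padded);
    }
}
```

For the 37-column file in the question the call would use 37 as the column count, and new StringReader(padded) would replace the FileReader passed to the CSVParser.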
Maybe try CSVParser.parse, which creates a parser for the given File:
parse(File file, Charset charset, CSVFormat format)
// import java.nio.charset.StandardCharsets;
// e.g. pass StandardCharsets.UTF_8 as the charset
Note: This method internally creates a FileReader using FileReader.FileReader(java.io.File) which in turn relies on the default encoding of the JVM that is executing the code.
Or maybe try withAllowMissingColumnNames?
//initialize CSVFormat object
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING).withAllowMissingColumnNames();
