I am displaying a csv file (with a UTF-8 encoding and french accents) in a JTable. To read the file I use the following lines (s_path is a String which corresponds to the path of the csv file):
reader = new CSVReader(new FileReader(s_path),',');
do{
String currentLine = reader.readNext();
...
}
To display the content of a cell in the JTable I use html (i.e., "html" and "body" markup and eventually "br" markups if a cell is displayed on several lines)
It works fine when I execute the project in Eclipse. However, if I export a .jar file from my project (and use the exact same csv files), the accent are not displayed (e.g., see the following image).
I really don't have any idea how to solve this problem. Do you have any suggestions?
Don't hesitate to ask for more details if necessary.
Thanks in advance.
If the file really is encoded using UTF8, you should use
reader = new CSVReader(new InputStreamReader(new FileInputStream(s_path), StandardCharsets.UTF_8), ',');
to read it. The way you're doing it, it uses the default charset of your platform, which is probably not UTF8.
Related
i'm trying to delete the first 3 rows of a tif file content generated by a scanner, because i cant open correctly.
example of rows to delete:
------=_Part_23XX49_-1XXXX3073.1XXXXX20715
ID: documento<br>
MimeType: image/tiff
I have no problem about change the content, but when i save the new file, i cant open correctly again.
System.out.println(new InputStreamReader(in).getEncoding());
this method tell me that the encoding of source file is "Cp1252", so i've put an argument in the JVM (-Dfile.encoding=Cp1252), but nothing appear to change.
This is what i do:
StringBuilder fileContent = new StringBuilder();
// working with content and save result content in fileContent variable
// save the file again
FileWriter fstreamWrite = new FileWriter(f.getAbsolutePath());
out = new BufferedWriter(fstreamWrite);
out.write(fileContent.toString());
Is possible that something is going wrong with Encoding?
if i do the operation with notepad++, i obtain a correct tiff that i can open without problem.
I found the TIFF Java library that maybe gonna be useful for your requirements.
Please take a look at the readme how to read and how to write a tiff file.
Hope this can help you
I using opencsv library in java and export csv. But i have problem. When i used string begin zero look like : 0123456 , when i export it remove 0 and my csv look like : 123456. Zero is missing. I using way :
"\"\t"+"0123456"+ "\""; but when csv export it look like : "0123456" . I don't want it. I want 0123456. I don't want edit from excel because some end user don't know how to edit. How to export csv using open csv and keep 0 begin string. Please help
I think it is not really the problem when generating CSV but the way excel treats the data when opened via explorer.
Tried this code, and viewed the CSV in a text editor ( not excel ), notice that it shows up correctly, though when opened in excel, leading 0s are lost !
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"));
// feed in your array (or convert your data to an array)
String[] entries = "0123131#21212#021213".split("#");
List<String[]> a = new ArrayList<>();
a.add(entries);
//don't apply quotes
writer.writeAll(a,false);
writer.close();
If you are really sure that you want to see the leading 0s for numeric values when excel is opened by user, then each cell entry be in format ="dataHere" format; see code below:
CSVWriter writer = new CSVWriter(new FileWriter("yourfile.csv"));
// feed in your array (or convert your data to an array)
String[] entries = "=\"0123131\"#=\"21212\"#=\"021213\"".split("#");
List<String[]> a = new ArrayList<>();
a.add(entries);
writer.writeAll(a);
writer.close();
This is how now excel shows when opening excel from windows explorer ( double clicking ):
But now, if we see the CSV in a text editor, with the modified data to "suit" excel viewing, it shows as :
Also see link :
format-number-as-text-in-csv-when-open-in-both-excel-and-notepad
have you tried to use String like this "'"+"0123456". ' char will mark number as text when parse into excel
For me OpenCsv works correctly ( vers. 5.6 ).
for example my csv file has a row as the following extract:
"999739059";;;"abcdefgh";"001024";
and opencsv reads the field "1024" as 001024 corretly. Of course I have mapped the field in a string, not in a Double.
But, if you still have problems, you can grab a simple yet powerful parser that fully adheres with RFC 4180 standard:
mykong.com
Mykong shows you some examples using opencsv directly and, in the end, he writes a simple parser to use if you don't want to import OpenCSV , and the parser works very well , and you can use it if you still have any problems.
So you have an easy-to-understand source code of a simple parser that you can modify as you want if you still have any problem or if you want to customize it for your needs.
if I have a delimited text file with apostrophes in, like ' as in:
BB;Art’s Tavern;6487 Western Ave., Glen Arbor, MI 49636;
what do I need to do to allow those to be parsed correctly through a BufferedReader in Java?
the code Im currently using to open the file for reading is thus in an android application:
StringBuffer buf = new StringBuffer();
InputStream is = context.getResources().openRawResource(R.raw.lvpa);
BufferedReader reader = new BufferedReader(new InputStreamReader(is,"UTF-8"));
Currently the apostrophes are being returned as question marks ? in a black box.
The contents of the file are then parsed into a model.
any help would be appreciated:)
Thanks
The file you are reading is not recorded in UTF-8. You need to know which encoding your file is in before you attempt to read it. If possible open it in whatever text editor you use to examine it and save it off in UTF-8 and try reading it again. (Some text editors will give the option of setting the encoding when you save the file.)
I am building an app where users have to guess a secret word. I have *.txt files in assets folder. The problem is that words are in Albanian language. Our language uses letters like "ë" and "ç", so whenever I try to read from the file some word containing any of those characters I get some wicked symbol and I can not implement string.compare() for these characters. I have tried many options with UTF-8, changed Eclipse setting but still the same error.
I wold really appreciate if someone has got any advice.
The code I use to read the files is:
AssetManager am = getAssets();
strOpenFile = "fjalet.txt";
InputStream fins = am.open(strOpenFile);
reader = new BufferedReader(new InputStreamReader(fins));
ArrayList<String> stringList = new ArrayList<String>();
while ((aDataRow = reader.readLine()) != null) {
aBuffer += aDataRow + "\n";
stringList.add(aDataRow);
}
Otherwise the code works fine, except for mentioned characters
It seems pretty clear that the default encoding that is in force when you create the InputStreamReader does not match the file.
If the file you are trying to read is UTF-8, then this should work:
reader = new BufferedReader(new InputStreamReader(fins, "UTF-8"));
If the file is not UTF-8, then that won't work. Instead you should use the name of the file's true encoding. (My guess is that it is in ISO/IEC_8859-1 or ISO/IEC_8859-16.)
Once you have figured out what the file's encoding really is, you need to try to understand why it does not correspond to your Java platform's default encoding ... and then make a pragmatic decision on what to do about it. (Should you hard-wire the encoding into your application ... as above? Should you make it a configuration property or command parameter? Should you change the default encoding? Should you change the file?)
You need to determine the character encoding that was used when creating the file, and specify this encoding when reading it. If it's UTF-8, for example, use
reader = new BufferedReader(new InputStreamReader(fins, "UTF-8"));
or
reader = new BufferedReader(new InputStreamReader(fins, StandardCharsets.UTF_8));
if you're under Java 7.
Text editors like Notepad++ have good heuristics to guess what the encoding of a file is. Try opening it with such an editor and see which encoding it has guessed (if the characters appear correctly).
You should know encoding of the file.
InputStream class reads file binary. Although you can interpet input as character, it will be implicit guessing, which may be wrong.
InputStreamReader class converts binary to chars. But it should know character set.
You should use the following version to feed it by character set.
UPDATE
Don't suggest you have UTF-8 encoded file, which may be wrong. Here in Russia we have such encodings as CP866, WIN1251 and KOI8, which are all differ from UTF8. Probably you have some popular Albanian encoding of text files. Check your OS setting to guess.
I am creating a CSV and writing content in UTF-8 to support German and English by specifying encoding as below
BufferedWriter outFile = new BufferedWriter( new OutputStreamWriter( outputStream, "UTF-8" ) );
The above is working fine till I add the below separator indication (;) in the header of CSV
outFile.write( "sep=;" );
outFile.newLine();
Without this delimiter ; my CSV will be wrong but when I inclde this the encoding is failing and UTf-8 not in place.
Is there any other keyword like "sep=" to specify in header of CSV to specify encoding?
I tried encoding="UTF-8" and it is not working.
Thanks.
You cannot open a UTF8 csv file with Excel 2007. Microsft have no understanding of the word "standards". Because of this, it is notoriously difficult to generate a csv file which opens in every possible application that reads .csv files and keeps the correct encoding.
If you must use Excel 2007, I would suggest using encoding with Microsofts own "windows 1252" as it supports German characters. Don't use the header, and also look in to using tab as a separator. Yes I know the c stands for comma, but tab seems to be more consistent with Excel 2007 if you save the file back again.