I want to write Burmese text into a .csv file. After writing the Burmese text, I open the file with MS Office, but it does not show the Burmese text.
A Burmese font is installed on my PC.
Below is my code:
OutputStreamWriter char_output = new OutputStreamWriter(new
FileOutputStream(CSV.getAbsolutePath().toString()),
Charset.forName("UTF-8").newEncoder());
char_output.write(message + str);
char_output.write("\n");
for (int i = 0; i < pList.size(); i++)
{
StringBuilder sb = new StringBuilder();
sb.append(pList.get(i).getOrderNumber()).append(",").append(pList.get(i).getProductName()).append(",");
sb.append(pList.get(i).getProductDiscription()).append(",").append(pList.get(i).getWeightKg()).append(",");
sb.append(pList.get(i).getWeightViss()).append(",").append(pList.get(i).getQty()).append(",");
sb.append(pList.get(i).getDate()).append("\n");
char_output.write(sb.toString());
}
char_output.close();
Many would say: use a CSV library.
Instead of Charset.forName("UTF-8").newEncoder() it suffices to pass "UTF-8", though what you have is not wrong.
Instead of "\n", "\r\n" might be more convenient under Windows. (System.getProperty("line.separator") would give Android's "\n".)
You need to handle commas and quotes. If a string value contains a comma, it should be enclosed in double quotes, and every inner double quote escaped by doubling it.
Instead of a comma, a semicolon is also used. Even better is a tab character, "\t".
To let the reader detect UTF-8, you may write a BOM character at the beginning of the file. This is the zero-width no-break space, U+FEFF. BOM = Byte Order Mark, referring to UTF-16LE and UTF-16BE (reversed byte pair).
char_output.write("\uFEFF"); // Write a BOM at the beginning of the file, before any other output.
The file extension might be ".csv", but you may also lie and give it the extension ".xls" so that Excel opens it on double-click.
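Put together, the points above could look like the following minimal sketch. It assumes plain Java I/O rather than a CSV library; the class name CsvSketch and the methods escape and write are made up for illustration and are not part of the question's code.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from the question.
final class CsvSketch {

    // Wrap a field in double quotes when it contains a comma, a quote,
    // or a line break, and double every inner quote.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        return needsQuotes ? "\"" + field.replace("\"", "\"\"") + "\"" : field;
    }

    // Write all rows as UTF-8 with a leading BOM and CR LF line endings.
    static void write(File target, List<List<String>> rows) throws IOException {
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(target), StandardCharsets.UTF_8))) {
            out.write('\uFEFF'); // BOM (U+FEFF), written once, before the first row
            for (List<String> row : rows) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < row.size(); i++) {
                    if (i > 0) {
                        sb.append(',');
                    }
                    sb.append(escape(row.get(i)));
                }
                out.write(sb.append("\r\n").toString());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("orders", ".csv");
        write(target, Arrays.asList(
                Arrays.asList("OrderNumber", "ProductName"),
                Arrays.asList("1", "Rice, 5 \"viss\"")));
        System.out.println("Wrote " + target.getAbsolutePath());
    }
}
```

Whether Excel then renders the Burmese text still depends on the installed font; the sketch only addresses the encoding and the field quoting.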
I am having an issue with an XML response and formatting it into CSV, which can later be opened as an XLS so that the entire response appears in a single cell. I know, it's not how I would do it either, but they get what they ask for.
So far I have tried to use a StringBuilder. This has been successful in formatting the response into a single-line string; I have tested this by writing it to a text file and copying it to Eclipse, where placing quotes around the XML turns it into a string.
When I try to put this same single-line response into a CSV file, the file breaks on the commas in the XML string and spreads the response across several dozen cells.
BufferedReader br = new BufferedReader(new FileReader(new File('responseXml.txt')));
String l;
StringBuilder sb = new StringBuilder();
while((l=br.readLine())!= null){sb.append(l.trim());
File respfile = new File("outresp.txt")
respfile.append(l)
println respfile.text
//verified single line string
respContents = new File("outresp.txt").text
}
File file = new File('outXML.csv')
file.append(respContents)
println file.text
// open csv still broke across many lines
What I would like is a single XML string in a single XLS cell.
To fit a value into a single column in CSV (Excel) you have two choices:
remove all newlines (\r\n) and commas (,)
replace each double quote (") with two ("") and wrap the whole value in double quotes.
The second variant allows you to keep the original (multiline) string format in one Excel cell.
Here is the code for the second variant:
//assume you have whole xml in responseXml variable
def responseXml = '''<?xml version="1.0"?>
<aaa text="hey, you">
<hello name="world"/>
</aaa>
'''
//take xml string in double quotes and escape all doublequotes
responseXml = '"'+ responseXml.replaceAll(/"/,'""') + '"'
def csv = new File('/11/1.csv')
csv.setText("col1,col2,xml\nfoo,bar") // emulate some existing file
csv.append(",${responseXml}\n")
As a result, the whole XML value ends up in a single cell, in the third column of the second row.
I am building an app where users have to guess a secret word. I have *.txt files in the assets folder. The problem is that the words are in Albanian. Our language uses letters like "ë" and "ç", so whenever I read a word containing any of those characters from the file I get garbled symbols, and I cannot compare strings containing these characters. I have tried many options with UTF-8 and changed the Eclipse settings, but I still get the same error.
I would really appreciate any advice.
The code I use to read the files is:
AssetManager am = getAssets();
strOpenFile = "fjalet.txt";
InputStream fins = am.open(strOpenFile);
reader = new BufferedReader(new InputStreamReader(fins));
ArrayList<String> stringList = new ArrayList<String>();
while ((aDataRow = reader.readLine()) != null) {
aBuffer += aDataRow + "\n";
stringList.add(aDataRow);
}
Otherwise the code works fine, except for the characters mentioned.
It seems pretty clear that the default encoding that is in force when you create the InputStreamReader does not match the file.
If the file you are trying to read is UTF-8, then this should work:
reader = new BufferedReader(new InputStreamReader(fins, "UTF-8"));
If the file is not UTF-8, then that won't work. Instead you should use the name of the file's true encoding. (My guess is that it is in ISO/IEC_8859-1 or ISO/IEC_8859-16.)
Once you have figured out what the file's encoding really is, you need to try to understand why it does not correspond to your Java platform's default encoding ... and then make a pragmatic decision on what to do about it. (Should you hard-wire the encoding into your application ... as above? Should you make it a configuration property or command parameter? Should you change the default encoding? Should you change the file?)
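To see why a mismatch produces garbled symbols, here is a small sketch that decodes the same bytes twice. It assumes, purely for illustration, that the file is UTF-8 and the reader falls back to ISO-8859-1; the class name CharsetMismatchDemo and the method decode are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the question.
final class CharsetMismatchDemo {

    // Decode raw bytes with an explicitly named charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00EB is the letter e with diaeresis; UTF-8 stores it as two bytes (0xC3 0xAB).
        byte[] utf8Bytes = "\u00EB".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // 1: one character, as intended
        System.out.println(wrong.length()); // 2: each byte became its own character
    }
}
```

With the wrong charset the single letter turns into two unrelated characters (U+00C3 and U+00AB), which is why comparisons against the expected word fail.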
You need to determine the character encoding that was used when creating the file, and specify this encoding when reading it. If it's UTF-8, for example, use
reader = new BufferedReader(new InputStreamReader(fins, "UTF-8"));
or
reader = new BufferedReader(new InputStreamReader(fins, StandardCharsets.UTF_8));
if you're on Java 7 or later.
Text editors like Notepad++ have good heuristics to guess what the encoding of a file is. Try opening it with such an editor and see which encoding it has guessed (if the characters appear correctly).
You need to know the encoding of the file.
The InputStream class reads the file as binary. Although you can interpret the input as characters, that is implicit guessing, which may be wrong.
The InputStreamReader class converts the binary data to chars, but it needs to know the character set.
Use the constructor that takes a character set, such as InputStreamReader(InputStream, Charset), to tell it which one to use.
UPDATE
Don't assume that you have a UTF-8 encoded file; that may be wrong. Here in Russia we have encodings such as CP866, WIN1251 and KOI8, which all differ from UTF-8. There is probably some popular Albanian encoding for text files. Check your OS settings to guess.
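One way to test a guess rather than trust it is to decode strictly and see whether the bytes are even valid in the candidate encoding. The following is only a sketch; the class name EncodingProbe and the method decodesCleanly are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any answer above.
final class EncodingProbe {

    // Returns true if the bytes form a valid sequence in the given charset.
    // REPORT makes the decoder throw instead of silently substituting characters.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A single byte 0xEB is a complete character in ISO-8859-1,
        // but an incomplete sequence in UTF-8.
        byte[] latin1Bytes = {(byte) 0xEB};
        System.out.println(decodesCleanly(latin1Bytes, StandardCharsets.UTF_8));
        System.out.println(decodesCleanly(latin1Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

This can rule UTF-8 out, but it cannot confirm a single-byte encoding: ISO-8859-1 accepts every byte, so a clean decode there proves nothing about whether the characters are the right ones.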
I'm loading string resources from a text file (so as to not have to rebuild if I need to change them) which when appended to the JTextArea displays as "Some sentence,\n on the same line."
When I hard code the exact same String, it appends fine.
Where could this be going wrong?
What does your text file look like? If "\n" is in the text file it's probably copied literally, i.e. it's not treated as an escape sequence.
EDIT: You could try reading the text file as a properties file, which automatically parses e.g. \n as a newline.
Properties p = new Properties();
InputStream fileStream = new FileInputStream("myfile.txt");
p.load(fileStream);
String value = p.getProperty(key);
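A small self-contained sketch of that behaviour, reading from a string instead of a file so it can be run as is. The class name PropertiesEscapeDemo, the method load and the key "greeting" are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class, not taken from the question.
final class PropertiesEscapeDemo {

    // Parse the text in properties format and return the value stored under key.
    static String load(String fileContent, String key) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(fileContent));
        return p.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // The Java literal below holds a backslash followed by the letter n,
        // exactly what a text file containing the two characters would hold.
        String fileContent = "greeting=Some sentence,\\n on the next line.";
        String value = load(fileContent, "greeting");
        System.out.println(value.contains("\n")); // true: the escape became a real newline
    }
}
```

Note that the file then has to follow the key=value layout of a properties file; a plain list of sentences would not load this way.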
In the text file do this...
"1st_Half_of_String"+"\n"+"2nd_Half_of_String"
StringBuffer contents=new StringBuffer();
BufferedReader input = new BufferedReader(new FileReader("/home/xyz/abc.txt"));
String line = null; //not declared within while loop
while (( line = input.readLine()) != null){
contents.append(line);
}
System.out.println(contents.toString());
File abc.txt contains
\u0905\u092d\u0940 \u0938\u092e\u092f \u0939\u0948 \u091c\u0928\u0924\u093e \u091c\u094b \u091a\u093e\u0939\u0924\u0940 \u0939\u0948 \u092
I want to display Hindi text in the console using Java.
If I simply print it like this:
String str="\u0905\u092d\u0940 \u0938\u092e\u092f \u0939\u0948 \u091c\u0928\u0924\u093e \u091c\u094b \u091a\u093e\u0939\u0924\u0940 \u0939\u0948 \u092";
System.out.println(str);
then it works fine, but when I try to read it from a file it doesn't work.
Please help me out.
Use Apache Commons Lang.
import org.apache.commons.lang3.StringEscapeUtils;
// open the file as ASCII, read it into a string, then
String escapedStr; // = "\u0905\u092d\u0940 \u0938\u092e\u092f \u0939\u0948 ..."
// (to include such a string in a Java program you would have to double each \)
String hindiStr = StringEscapeUtils.unescapeJava( escapedStr );
System.out.println(hindiStr);
(Make sure your console is set up to display Hindi (correct fonts, etc) and the console's encoding matches your Java encoding. The Java code above is just the bare bones.)
You should store the contents in the file as UTF-8 encoded Hindi characters. For instance, in your case it would be अभी समय है जनता जो चाहती है. That is, instead of saving unicode escapes, directly save the raw Hindi characters. You can then simply read like normal.
You just have to make sure that the editor you use saves it using UTF-8 encoding. See Spanish language chars are not displayed properly?
Otherwise, you'll have to make the file a .properties file and read using java.util.Properties as it offers unicode unescaping support inherently.
Also read Reading unicode character in java
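If neither Apache Commons Lang nor a .properties file is an option, the escapes can also be decoded by hand. This is only a sketch that handles four-digit backslash-u escapes and nothing else; the class name UnicodeUnescape and the method unescape are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a replacement for StringEscapeUtils.
final class UnicodeUnescape {

    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replace every complete escape with the character it names;
    // anything else, including truncated escapes, is left untouched.
    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes keep the escapes literal, as they would be in the file.
        String fromFile = "\\u0905\\u092d\\u0940";
        System.out.println(unescape(fromFile).length()); // 3: three Devanagari characters
    }
}
```

Each line returned by readLine() would be passed through unescape before printing. The console still needs a suitable font and a matching encoding to show the result.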
I am developing a small Java application. At some point I write some data to a plain text file, using the following code:
Writer Candidateoutput = null;
File Candidatefile = new File("Candidates.txt");
Candidateoutput = new BufferedWriter(new FileWriter(Candidatefile));
Candidateoutput.write("\n Write this text on next line");
Candidateoutput.write("\t This is indented text");
Candidateoutput.close();
Now everything goes fine; the file is created with the expected text. The only problem is that the text is not formatted: all of it is on a single line. But if I copy and paste the text into MS Word, it is formatted automatically.
Is there any way to preserve text formatting in a plain text file as well?
Note: By text formatting I am referring to \n and \t only
Use System.getProperty("line.separator") for new lines; this is the platform-independent way of getting the newline separator (on Windows it is \r\n, on Linux it is \n).
Also, if this is going to be run on non-Windows machines, avoid using \t and use four spaces instead.
You can use line.separator system property to solve your issue.
E.g.
String separator = System.getProperty("line.separator");
Writer Candidateoutput = null;
File Candidatefile = new File("Candidates.txt");
Candidateoutput = new BufferedWriter(new FileWriter(Candidatefile));
Candidateoutput.write(separator + " Write this text on next line");
Candidateoutput.write("\t This is indented text");
Candidateoutput.close();
The line.separator system property is a platform-independent way of getting a newline from your environment.
A PrintWriter does this in a platform-independent way: use its println() methods.
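As a quick illustration of the println() approach, the sketch below writes into a StringWriter instead of a file so the result can be inspected; for a real file the PrintWriter would wrap a FileWriter instead. The class name LineSeparatorDemo and the method render are made up.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical demo class, not taken from the question.
final class LineSeparatorDemo {

    // Write each line with println(), which appends the platform's line separator.
    static String render(String... lines) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            for (String line : lines) {
                out.println(line);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = render("Write this text on next line", "\t This is indented text");
        // True on every platform, whatever the separator happens to be.
        System.out.println(text.endsWith(System.lineSeparator()));
    }
}
```

System.lineSeparator() (Java 7 and later) returns the same value as System.getProperty("line.separator") had at startup.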
You would have to use the Java utility Formatter which can be found here: java.util.Formatter
Then all you would have to do is create an object of Formatter type such as this:
private Formatter output;
In this case, output will be the output file you are writing to.
Then you have to pass the file name to the output object like this:
output = new Formatter("name.of.your.file.txt");
Once that's done, you can either hard-code the file contents to your output file using the output.format command which is similar to the System.out.println or printf commands.
Or use the Scanner utility to input the data into memory then use output.format to output this data to the output object or file.
This is an example on how to write a record to output:
output.format("%d %s %s %2f\n", field1, field2, field3, field4); // field1 is an int, field2 and field3 are Strings, field4 is a double
There is a little bit more to it than this, but this sure beats parsing data, or using a bunch of complicated third party plugins.
To read this file you would redirect the Scanner utility to read a file instead of the console:
input = new Scanner(new File("name.of.your.file.txt"));
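The write-then-read round trip described above could be sketched as follows. The record layout (an int, two words, a double) and the names FormatterRoundTrip, writeRecord and readBalance are assumptions made for illustration; the sketch also uses %n, which expands to the platform line separator, and pins the locale so the decimal point is read back the way it was written.

```java
import java.io.File;
import java.io.IOException;
import java.util.Formatter;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class, not taken from the answer.
final class FormatterRoundTrip {

    // Write one record as a single line of space-separated fields.
    static void writeRecord(File file, int id, String first, String last, double balance)
            throws IOException {
        try (Formatter output = new Formatter(file, "UTF-8", Locale.ROOT)) {
            output.format("%d %s %s %.2f%n", id, first, last, balance);
        }
    }

    // Read the record back field by field and return the last field.
    static double readBalance(File file) throws IOException {
        try (Scanner input = new Scanner(file, "UTF-8")) {
            input.useLocale(Locale.ROOT);
            input.nextInt();  // id
            input.next();     // first name
            input.next();     // last name
            return input.nextDouble();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".txt");
        writeRecord(file, 7, "Ada", "Lovelace", 12.5);
        System.out.println(readBalance(file));
    }
}
```

This layout only works while the string fields contain no whitespace, since Scanner.next() splits on it.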
Windows Notepad needs \r\n to display a new line correctly; a lone \n is ignored by Notepad.
Windows expects a carriage return followed by a line feed to indicate a new line, so you need to write \r\n to make it work.