I'm trying to create a Java method that will write to a CSV file that has already been created. Each time the method is called, it should append all the fields from that class, separated by commas, and then skip to the next line.
Overall, my code looks something like:
Student bob = new Student("Bob", "Johnson", "10111990", "B+");
// call addStudentInfo on bob:
bob.addStudentInfo();
// this should append a line containing the 4 fields to the file student.txt.
EDIT: Whoops, didn't frame a question, though y'all really felt like answering it in the most condescending way possible, thanks for that.
I want to know what line of code does that. I don't want to copy-paste my whole program, since I can't share it all. Basically, I already have the parts that create the file, write to it, etc.; I just need the line of code that skips to the next line in that file.
The following appends a student record to the file, irrespective of the old contents.
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class Student {

    private static final char SEPARATOR = ',';
    private static final String NEWLINE = "\r\n";
    private static final Charset CHARSET = StandardCharsets.UTF_8;

    public void addStudentInfo() throws IOException {
        Path path = Paths.get(".../student.txt");
        StringBuilder sb = new StringBuilder();
        if (!Files.exists(path)) {
            // New file: start with a BOM and a header line.
            sb.append("\uFEFF")
              .append("First name").append(SEPARATOR)
              ...
              .append(NEWLINE);
        }
        // One record per line, fields separated by commas.
        sb.append(...).append(SEPARATOR).append(...).append(NEWLINE);
        Files.write(path,
                sb.toString().getBytes(CHARSET),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
This uses a Windows line ending and UTF-8 encoding, so the application produces the same file format on every computer, whatever the operating system or locale.
When the file does not yet exist, a header line is written first.
In that case the first character written is a Unicode BOM, U+FEFF (a zero-width no-break space).
This marks the file as Unicode, as otherwise Windows would assume the current Windows ANSI encoding, which varies regionally. (However, the BOM is optional, ugly, invisible, and can cause trouble when reading the file back in.)
sb.toString().getBytes(CHARSET) turns the buffered text into UTF-8 bytes, and this byte-based overload of Files.write writes them exactly as given. (The Iterable<CharSequence> overload would terminate each element with the platform's line separator, which would double up with the explicit \r\n.) StandardOpenOption.CREATE creates the file if it is missing, and APPEND adds to the end instead of overwriting.
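For example, usage from the question would then look like this (the constructor is the question's own; the exact line written is my reading of the elided appends):

Student bob = new Student("Bob", "Johnson", "10111990", "B+");
bob.addStudentInfo(); // appends something like: Bob,Johnson,10111990,B+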
Related
I have a 37-column CSV file that I am parsing in Java with Apache Commons CSV 1.2. My setup code is as follows:
//initialize FileReader object
FileReader fileReader = new FileReader(file);
//initialize CSVFormat object
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
//initialize CSVParser object
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
//Get a list of CSV file records
List<CSVRecord> csvRecords = csvFileParser.getRecords();
// process accordingly
My problem is that when I copy the CSV to be processed to my target directory and run my parsing program, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Index for header 'Title' is 7 but CSVRecord only has 6 values!
at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:110)
at launcher.QualysImport.createQualysRecords(Unknown Source)
at launcher.QualysImport.importQualysRecords(Unknown Source)
at launcher.Main.main(Unknown Source)
However, if I copy the file to my target directory, open and save it, then try the program again, it works. Opening and saving the CSV adds back the trailing commas, so my program won't complain about records having fewer values than headers.
For context, here is a sample line of before/after saving:
Before (failing): "data","data","data","data"
After (working): "data","data",,,,"data",,,"data",,,,,,
So my question: why does the CSV format change when I open and save it? I'm not changing any values or encoding, and the behavior is the same for MS-DOS or regular .csv format when saving. Also, I'm using Excel to copy/open/save in my testing.
Is there some encoding or format setting I need to be using? Can I solve this programmatically?
Thanks in advance!
EDIT #1:
For additional context, when I first view an empty line in the original file, it just has the new line ^M character like this:
^M
After opening in Excel and saving, it looks like this with all 37 of my empty fields:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,^M
Is this a Windows encoding discrepancy?
Maybe that's a compatibility issue with whatever generated the file in the first place. It seems that Excel accepts a blank line as a valid row in which every column holds an empty string, with the column count matching the other row(s). When it saves, it writes the row back out according to CSV conventions, putting in the column delimiters.
(the ^M is the Carriage Return character; on Microsoft systems it precedes the Line Feed character at the end of a line in text files)
Perhaps you can deal with it by creating your own Reader subclass to sit between the FileReader and the CSVParser. Your reader will read a line, and if it is blank then return a line with the correct number of commas. Otherwise just return the line as-is.
For example:
class MyCSVCompatibilityReader extends BufferedReader
{
    public MyCSVCompatibilityReader(final FileReader fileReader)
    {
        // BufferedReader has no no-arg constructor, so hand it the underlying reader.
        super(fileReader);
    }

    @Override
    public String readLine() throws IOException
    {
        final String line = super.readLine();
        // Replace a blank line with a row of 37 empty fields (36 commas).
        if (line != null && line.trim().isEmpty())
            { return ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"; }
        else
            { return line; }
    }
}
There are a lot of other details to get right when doing this for real: the parser may pull characters through the various read() methods rather than readLine(), so those would need to behave consistently as well. It might be easier, if the file fits comfortably in memory, to read the whole file, write a fixed version out, and hand the CSVParser a StringReader.
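A minimal sketch of that read-everything-first approach (reusing the question's file and csvFileFormat; needs java.io.BufferedReader, FileReader, and StringReader; the 36-comma padding assumes the 37-column layout from the question):

StringBuilder fixed = new StringBuilder();
try (BufferedReader in = new BufferedReader(new FileReader(file))) {
    String line;
    while ((line = in.readLine()) != null) {
        // Pad blank lines out to 37 empty fields; keep other lines as-is.
        fixed.append(line.trim().isEmpty()
                ? ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
                : line)
             .append("\r\n");
    }
}
CSVParser csvFileParser = new CSVParser(new StringReader(fixed.toString()), csvFileFormat);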
Maybe try this:
Creates a parser for the given File.
parse(File file, Charset charset, CSVFormat format)
//import java.nio.charset.StandardCharsets;
//e.g. StandardCharsets.UTF_8
Note: This method internally creates a FileReader using FileReader.FileReader(java.io.File) which in turn relies on the default encoding of the JVM that is executing the code.
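For example, reusing the question's file and csvFileFormat (a sketch; UTF-8 is an assumption about the file's actual encoding):

CSVParser csvFileParser = CSVParser.parse(file, StandardCharsets.UTF_8, csvFileFormat);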
Or maybe try withAllowMissingColumnNames?
//initialize CSVFormat object
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING).withAllowMissingColumnNames();
I am trying to read in .properties files in many different languages, add new entries to them, sort them, and print them back to file. I have the encoding set to UTF-8, and it works for all my current languages except Russian. When reading the file in, I get all question marks from the Russian file. When it prints back out, it has a lot of the correct text, but random question marks here and there. Here is my code for reading in the file.
Properties translation = new Properties() {
    private static final long serialVersionUID = 1L;

    @Override
    public synchronized Enumeration<Object> keys() {
        // Sort keys so entries print back out in order.
        return Collections.enumeration(new TreeSet<Object>(super.keySet()));
    }
};

byte[] readIn = Files.readAllBytes(Paths.get(filePath));
String replacer = new String(readIn).replace("\\", "\\\\");
translation.load(new InputStreamReader(new ByteArrayInputStream(replacer.getBytes()), "UTF-8"));
new String(readIn) and replacer.getBytes() don't use UTF-8. They use your platform default encoding. Pass StandardCharsets.UTF_8 as an additional argument to both calls.
BTW, transforming a String to a byte array only to transform the bytes back to characters and read them is a waste of time and resources. Just do
translation.load(new StringReader(replacer));
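Putting both fixes together, a minimal sketch of the corrected read (reusing filePath and translation from the question):

byte[] readIn = Files.readAllBytes(Paths.get(filePath));
String replacer = new String(readIn, StandardCharsets.UTF_8).replace("\\", "\\\\");
translation.load(new StringReader(replacer)); // no byte round-trip needed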
Sometimes just changing the encoding to UTF-8 is not enough; it can give rise to errors such as extra characters, or do nothing at all.
The question How can I read a Russian file in Java? may help you.
I'm having a strange issue trying to write strings which contain characters like "ñ", "á", and so on into text files. Let me first show you my little piece of code:
import java.io.*;

public class test {
    public static void main(String[] args) throws Exception {
        String content = "whatever";
        int c;
        c = System.in.read();
        content = content + (char) c;
        FileWriter fw = new FileWriter("filename.txt");
        BufferedWriter bw = new BufferedWriter(fw);
        bw.write(content);
        bw.close();
    }
}
In this example, I'm just reading a char from the keyboard input and appending it to a given string, then writing the final string into a txt file. The problem is that if I type an "ñ", for example (I have a Spanish-layout keyboard), when I check the txt it shows a strange char "¤" where there should be an "ñ"; that is, the content of the file is "whatever¤". The same happens with "ç", "ú", etc. However, it writes fine ("whateverñ") if I just forget about the keyboard input and write:
...
String content = "whateverñ";
...
or
...
content = content + "ñ";
...
It makes me think that there might be something wrong with the read() method? Or maybe I'm using it wrongly? Or should I use a different method to get the keyboard input? I'm a bit lost here.
(I'm using JDK 7u45 on Windows 7 Pro x64)
So ...
It works (i.e. you can read the accented characters on the output file) if you write them as literal strings.
It doesn't work when you read them from System.in and then write them.
This suggests that the problem is on the input side. Specifically, I think your console / keyboard must be using a character encoding for the input stream that does not match the encoding that Java thinks should be used.
You should be able to confirm this tentative diagnosis by outputting the characters you are reading in hexadecimal, and then checking the codes against the unicode tables (which you can find at unicode.org for example).
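For example, a quick diagnostic sketch (inside a main that declares throws Exception, like the question's):

int c = System.in.read();
// ñ is U+00F1; if the printed value differs, the console's input encoding
// does not match what Java assumes.
System.out.printf("read: 0x%02X%n", c);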
It strikes me as "odd" that the "platform default encoding" appears to be working on the output side, but not the input side. Maybe someone else can explain ... and offer a concrete suggestion for fixing it. My gut feeling is that the problem is in the way your keyboard is configured, not in Java or your application.
Files do not remember their encoding; when you look at a .txt file, the text editor makes a "best guess" at the encoding used.
If you try to read the file back into your program, the text should be back to normal.
Also, try printing the "strange" character directly.
I have a minor problem: the \n's in my file aren't working in my output. I tried two methods:
PLEASE NOTE:
*The text in the file here is a much-simplified example. That is why I do not just use output.append("\n\n"); in the second method. Also, the \n's in the file are not always at the END of the line, i.e. a line in the file could be Stipulation 1.1\nUnder this Stipulation...etc.*
The \n's in the file need to work. Also, both JOptionPane.showMessageDialog(null,rules); and System.out.println(rules); give the same formatted output.
Text in File:
A\n
B\n
C\n
D\n
Method 1:
private static void setGameRules(File f) throws FileNotFoundException, IOException
{
    rules = Files.readAllLines(f.toPath(), Charset.defaultCharset());
    JOptionPane.showMessageDialog(null, rules);
}
Output 1:
A\nB\nC\nD\n
Method 2:
private static void setGameRules(File f) throws FileNotFoundException, IOException
{
    rules = Files.readAllLines(f.toPath(), Charset.defaultCharset());
    StringBuilder output = new StringBuilder();
    for (String s : rules)
    {
        output.append(s);
        output.append("\n\n"); // these \n work, but the ones in my file do not
    }
    System.out.println(output);
}
Output 2:
A\n
B\n
C\n
D\n
The character sequence \n is simply a human-readable representation of an unprintable character.
When reading it from a file, you get two characters, a '\' and an 'n', not the line break character.
As such, you'll need to replace the placeholders in your file with a 'real' line break character.
Using the method I mentioned earlier: s = s.replaceAll( "\\\\n", System.lineSeparator() ); is one way, I'm sure there are others.
Perhaps after readAllLines you can apply the above line of code to do the replacement before, or as, you put each line into the rules list.
Edit:
The reason this doesn't work the way you expect is that you're reading it from a file. If it were hardcoded into your class, the compiler would see the '\n' sequence and say "Oh boy! A line separator! I'll just replace that with (char) 0x0A".
What do you mean by "it is not working"? In what way is it not working? Do you expect to see a line break? I am not sure whether you actually have the characters '\' and 'n' at the end of each line, or the line feed character (0x0A). The reason '\n' works in Java source is that it is an escape sequence for the line feed character. Tell us a little about your input file: how is it generated?
The second thing I notice is that you print the text to the console in the second method. I am not certain that JOptionPane will even display line breaks this way; I think it uses a JLabel, see Java: Linebreaks in JLabels? for that. The console does interpret \n as a line break.
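As an aside, Swing components render basic HTML, so one way to force visible line breaks in the dialog (a sketch, independent of the asker's file handling) is:

JOptionPane.showMessageDialog(null, "<html>A<br>B<br>C</html>");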
The final Answer looks like this:
private static void setGameRules(File f) throws FileNotFoundException, IOException {
    rules = Files.readAllLines(f.toPath(), Charset.defaultCharset());
    for (int i = 0; i != rules.size(); i++) {
        rules.set(i, rules.get(i).replaceAll("\\\\n", "\n"));
    }
}
As @Ray said, the \n in the file was just being read as the chars '\' and 'n', not as the line separator \n.
I just added a for-loop to run through the list and replace them using:
rules.set(i, rules.get(i).replaceAll("\\\\n", "\n"));
I have a text.txt file which contains the following text.
Kontagent Announces Partnership with Global Latino Social Network Quepasa
Released By Kontagent
I read this text file into a string documentText.
documentText.substring(0, 9) gives Kontagent, which is good.
But documentText.substring(87, 96) gives y Kontage on Windows (IntelliJ IDEA) and gives Kontagent in a Unix environment. I am guessing this is happening because of the blank line in the file (after which the offset gets screwed up). But I cannot understand why I get two different results; I need to get the same result in both environments.
To read the file as a string I used all the functions discussed in How do I create a Java string from the contents of a file?, but I still get the same results with any of them.
Currently I am using this function to read the file into documentText String:
public static String readFileAsString(String fileName)
{
    File file = new File(fileName);
    StringBuilder fileContents = new StringBuilder((int) file.length());
    Scanner scanner = null;
    try {
        scanner = new Scanner(file);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    String lineSeparator = System.getProperty("line.separator");
    try {
        while (scanner.hasNextLine()) {
            // nextLine() strips the file's own EOL; this appends the
            // platform-specific separator in its place
            fileContents.append(scanner.nextLine() + lineSeparator);
        }
        return fileContents.toString();
    } finally {
        scanner.close();
    }
}
EDIT: Is there a way to write a general function which will work in both Windows and Unix environments, even if the file is copied in text mode?
Because, unfortunately, I cannot guarantee that everyone who is working on this project will always copy files in binary mode.
The Unix file probably uses the native Unix EOL char: \n, whereas the Windows file uses the native Windows EOL sequence: \r\n. Since there are two EOLs before your target offset, there is a difference of 2 chars. Make sure to use a binary file transfer, and all the bytes will be preserved, and everything will run the same way on both OSes.
EDIT: in fact, you are the one who appends an OS-specific EOL (System.getProperty("line.separator")) at the end of each line. Just read the file as a char array using a Reader, and everything will be fine. Or use Guava's method, which does it for you:
String s = CharStreams.toString(new FileReader(fileName));
On Windows, a newline character \n is preceded by a carriage return character \r; this pair does not exist on Linux. Transferring the file from one operating system to the other will not strip or append such characters, but occasionally text editors will auto-format them for you.
Because your file does not include \r characters (presumably it was transferred straight from Linux), System.getProperty("line.separator") returns \r\n on Windows and inserts \r characters that were not in the original file. This is why your output is 2 characters off.
Good luck!
Based on the input you guys provided, I wrote something like this:
documentText = CharStreams.toString(new FileReader("text.txt"));
documentText = documentText.replaceAll("\\r", "");
to strip off the extra \r's if the file has any.
Now I am getting the expected result in the Windows environment as well as Unix. Problem solved!!!
It works fine irrespective of the mode the file was copied in.
:) I wish I could choose both of your answers, but Stack Overflow doesn't allow it.