Parsing in Java - unexpected quotes

I'm trying to understand a piece of code. I didn't write it, I'm just trying to make it work.
It's meant to transform a .csv file.
The code is this:
import java.io.*;
import au.com.bytecode.opencsv.*;
public class StockParser
{
public static void main(String[] args) throws FileNotFoundException, IOException
{
CSVReader reader = new CSVReader(new FileReader("/home/cloudera/Desktop/training.csv"));
String [] nextLine;
String [] previousLine;
String [] headernew = new String [reader.readNext().length +1];
CSVWriter writer = new CSVWriter(new FileWriter("/home/cloudera/Desktop/final.csv"), ',');
nextLine = reader.readNext();
for (int i = 0; i < nextLine.length;i++)
{
headernew[i] = nextLine[i];
}
headernew[headernew.length-1] = "action";
writer.writeNext(headernew);
previousLine = reader.readNext();
while ((nextLine = reader.readNext()) != null)
{
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + " etc...");
headernew = new String [nextLine.length + 1];
for (int i = 0; i < headernew.length-1;i++)
{
headernew[i] = nextLine[i];
}
if (Double.parseDouble(previousLine[4]) < Double.parseDouble(nextLine[4]))
{
headernew[headernew.length-1] = "SELL";
}
else
{
headernew[headernew.length-1] = "BUY";
}
writer.writeNext(headernew);
previousLine = nextLine;
}
reader.close();
writer.close();
}
}
It works generally, but there's a problem: the input file's first line is
Open,High,Low,Close,Volume,Adj Close
followed by lines like
59.30,60.05,58.88,59.41,3373800,59.41
The output file should have the same first line plus an action column, followed by similar lines ending in BUY or SELL. But when I run this code, it somehow loses the
Open,High,Low,Close,Volume,Adj Close,action
line,
and the next lines look like
"59.64","60.26","58.88","59.83","3069100","59.83","BUY"
Where did the quotes come from, and what should I do to get rid of them?

This is an answer to:
Where did the quotes come from, and what should I do to get rid of
them?
The constructor described in the documentation of CSVWriter allows specifying a quote-character. Try the following:
CSVWriter writer = new CSVWriter(new FileWriter("/home/cloudera/Desktop/final.csv"), ',', CSVWriter.NO_QUOTE_CHARACTER);
The last parameter should suppress all quoting characters.

The short answer is to use
CSVWriter writer = new CSVWriter(new FileWriter(
"/home/cloudera/Desktop/final.csv"), ',',CSVWriter.NO_QUOTE_CHARACTER);
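Your code also has another problem, which I tried to correct based on my assumptions: it calls reader.readNext() once just to size headernew and then again to fill it, so the real header line is thrown away and the first data row is written in its place.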
public static void main(String[] args) throws FileNotFoundException,
IOException {
CSVReader reader = new CSVReader(new FileReader(
"training.csv"));
String[] nextLine;
String[] previousLine;
nextLine = reader.readNext();
String[] headernew = new String[nextLine.length + 1];
CSVWriter writer = new CSVWriter(new FileWriter(
"final.csv"), ',', CSVWriter.NO_QUOTE_CHARACTER);
for (int i = 0; i < nextLine.length; i++) {
headernew[i] = nextLine[i];
}
headernew[headernew.length - 1] = "action";
writer.writeNext(headernew);
previousLine = reader.readNext();
while ((nextLine = reader.readNext()) != null) {
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + " etc...");
headernew = new String[nextLine.length + 1];
for (int i = 0; i < headernew.length - 1; i++) {
headernew[i] = nextLine[i];
}
if (Double.parseDouble(previousLine[4]) < Double
.parseDouble(nextLine[4])) {
headernew[headernew.length - 1] = "SELL";
} else {
headernew[headernew.length - 1] = "BUY";
}
writer.writeNext(headernew);
previousLine = nextLine;
}
reader.close();
writer.close();
}

OK, EddyG pointed me in the right direction. The opencsv jar has a class CSVWriter with several constructor variants, among them public CSVWriter(Writer writer, char separator, char quotechar). Setting quotechar to CSVWriter.NO_QUOTE_CHARACTER removed the quotes.
Source: opencsv documentation
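
For comparison, here is a minimal sketch of the same transformation using only the JDK, so no quoting can be introduced by a writer class. It assumes the fields never contain commas or quotes (true for the sample data, but not for CSV in general). The class name PlainCsvAction and the method addAction are hypothetical, and the column index 4 is taken from previousLine[4] in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of opencsv: appends an "action" column
// using plain string handling, so no quote characters are ever added.
class PlainCsvAction {
    // Column compared between consecutive rows; 4 matches previousLine[4] in the question.
    static final int COLUMN = 4;

    static List<String> addAction(List<String> lines) {
        List<String> out = new ArrayList<>();
        if (lines.isEmpty()) {
            return out;
        }
        // Read the header exactly once and reuse it.
        out.add(lines.get(0) + ",action");
        // The first data row only seeds the comparison, as in the question's code,
        // so output rows start at the second data row (index 2).
        for (int i = 2; i < lines.size(); i++) {
            double previous = Double.parseDouble(lines.get(i - 1).split(",")[COLUMN]);
            double current = Double.parseDouble(lines.get(i).split(",")[COLUMN]);
            out.add(lines.get(i) + "," + (previous < current ? "SELL" : "BUY"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "Open,High,Low,Close,Volume,Adj Close",
                "59.30,60.05,58.88,59.41,3373800,59.41",
                "59.64,60.26,58.88,59.83,3069100,59.83");
        addAction(input).forEach(System.out::println);
    }
}
```

For real-world CSV with embedded separators or quotes, a library such as opencsv remains the safer choice; this sketch only shows where the header is read and how the extra column is appended.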

Related

Java Program Add double Quote in all the line

I'm writing a simple program that takes a CSV file and produces a CSV file with a new column. My problem is that the program puts quotes around every column, including the existing ones.
Below my code
public class CSVmodifier {
public static void modify(String Path,String Escape) throws IOException {
int i=0;
String filename = new File(Path).getName().toString();
//setName(string) modify the original filename
String fileoutname = setName(filename);
File file= new File(fileoutname);
try {
FileWriter out = new FileWriter(file);
Reader reader =
Files.newBufferedReader(Paths.get(Path));
CSVReader csvReader = new CSVReader(reader);
CSVWriter csvWriter = new CSVWriter(out);
String[] nextRecord;
while ((nextRecord = csvReader.readNext()) != null) {
int dimension = nextRecord.length;
String[] newline = new String[dimension+1];
int y = 0;
//formatNumber create a string number with 9
//zero in the front-> 1 > "000000001"
newline[0]=formatNumber(i+1);
while(y<dimension) {
newline[y+1] = nextRecord[y];
y++;
}
i+=1;
csvWriter.writeNext(newline);
}
csvWriter.close();
} finally {
}
}
public static String formatNumber(int i) {
String formatted = String.format("%09d", i);
return formatted;
}
}
My sample input is:
"John","Doe","120 jefferson st.","Riverside", "NJ", "08075"
The wrong output is:
"000000001","""John""",""Doe"",""120 jefferson st."",""Riverside"", ""NJ"", ""08075"""
I cannot upload the file, but here are sample input lines that give the same problem:
'1231';'02512710795';'+142142';'2019/12/12';'statale';'blablabla';'iradsasad';'-123131';'+414214141';
'003';'08206810965';'+000000001492106';'2019/06/23';'Scuola statale elemetare';'Ola!'
The output lines:
'000000001';"'1231';'02512710795';'+142142';'2019/12/12';'statale';'blablabla';'iradsasad';'-123131';'+414214141'; "
'000000002';"'003';'08206810965';'+000000001492106';'2019/06/23';'Scuola statale'; "
I assume your CSVWriter class is com.opencsv.CSVWriter.
You are using csvWriter.writeNext(String[]), which is a shortcut for writeNext(nextLine, true); (JavaDoc).
The second parameter, which is always true in this case, is applyQuotesToAll:
True if all values are to be quoted. False applies quotes only to values which contain the separator, escape, quote, or new line characters.
So all you need to do is change this line:
csvWriter.writeNext(newline)
to this:
csvWriter.writeNext(newline, false);
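
To make the rule quoted above concrete, here is a small JDK-only sketch of "quote only when needed". It is not opencsv's implementation: QuoteRule and quoteIfNeeded are hypothetical names, the escape character is left out for brevity, and doubling embedded quotes is an assumption based on common CSV practice.

```java
// Hypothetical illustration of the applyQuotesToAll=false rule; not opencsv code.
class QuoteRule {
    static String quoteIfNeeded(String value, char separator, char quote) {
        // A value needs quotes only if it contains the separator,
        // the quote character, or a line break.
        boolean needsQuotes = value.indexOf(separator) >= 0
                || value.indexOf(quote) >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Assumption: embedded quote characters are escaped by doubling them.
        String q = String.valueOf(quote);
        return q + value.replace(q, q + q) + q;
    }

    public static void main(String[] args) {
        System.out.println(quoteIfNeeded("John", ',', '"'));
        System.out.println(quoteIfNeeded("Riverside, NJ", ',', '"'));
    }
}
```

With this rule a plain value such as John is written as-is, while a value containing the separator is still wrapped in quotes so the file stays parseable.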
This works for me; please check and let me know. I only changed the logic in the while block: it collapses doubled double quotes into a single one and strips the quotes surrounding each value.
public class CSVmodifier {
public static void modify(String Path, String Escape) throws IOException {
int i = 0;
String filename = new File(Path).getName().toString();
//setName(string) modify the original filename
String fileoutname = setName(filename);
File file = new File(fileoutname);
try {
FileWriter out = new FileWriter(file);
Reader reader
= Files.newBufferedReader(Paths.get(Path));
CSVReader csvReader = new CSVReader(reader);
CSVWriter csvWriter = new CSVWriter(out);
String[] nextRecord;
while ((nextRecord = csvReader.readNext()) != null) {
int dimension = nextRecord.length;
String[] newline = new String[dimension + 1];
int y = 0;
//formatNumber create a string number with 9
//zero in the front-> 1 > "000000001"
newline[0] = formatNumber(i + 1);
while (y < dimension) {
newline[y + 1] = nextRecord[y].replace("\"\"", "\"");
if (newline[y + 1].startsWith("\"") && newline[y + 1].endsWith("\"")) {
newline[y + 1] = newline[y + 1].substring(1, newline[y + 1].length() - 1);
}
y++;
}
i += 1;
csvWriter.writeNext(newline);
}
csvWriter.close();
} finally {
}
}
public static String formatNumber(int i) {
String formatted = String.format("%09d", i);
return formatted;
}
}

Editing a text File in Java and saving it as a new text file

I have a text file with 5 lines, I wish to read in those lines and be able to number them 1 - 5, and save them in a different file. The numbers begin before the start of the line. I have tried to hard code in a loop to read in the number but I keep getting errors.
public class TemplateLab5Bronze {
static final String INPUT_FILE = "testLab5Bronze.txt";
static final String OUTPUT_FILE = "outputLab5Bronze.txt";
public static void main(String[] args) {
try {
FileReader in = new FileReader(INPUT_FILE);
PrintWriter out = new PrintWriter(OUTPUT_FILE);
System.out.println("Working");
BufferedReader inFile = new BufferedReader(in);
PrintWriter outFile = new PrintWriter(out);
outFile.print("Does this print?\n");
String trial = "Tatot";
outFile.println(trial);
System.out.format("%d. This is the top line\n", (int) 1.);
System.out.format("%d. \n", (int) 2.);
System.out.format("%d. The previous one is blank.\n", (int) 3.);
System.out.format("%d. Short one\n", (int) 4.);
System.out.format("%d. This is the last one.\n", (int) 5.);
/*if(int j = 1; j < 6; j++){
outFile.print( i + trial);
}*/
String line;
do {
line = inFile.readLine();
if (line != null) {
}
} while (line != null);
inFile.close();
in.close();
outFile.close();
} catch (IOException e) {
System.out.println("Doesnt Work");
}
System.out.print("Done stuff!");
}
}
This is all the code I have so far, excluding the import statements. The commented-out for loop is what I was trying to use. Is there another way to do this?
One way to do it is to add to the printWriter while looping through the existing file:
// Both resources are declared in the try header so they are closed automatically
try (BufferedReader br = new BufferedReader(new FileReader("//your//path//to//lines.txt"));
PrintWriter writer = new PrintWriter("//your//other//path//newlines.txt", "UTF-8")) {
String line;
int num = 1;
while ((line = br.readLine()) != null) {
writer.println(num + ". " + line);
num++;
}
}
Note: I didn't put in any catch statements, but you might want to catch some/all of the following: FileNotFoundException, UnsupportedEncodingException, IOException
You don't need two PrintWriters. Use only one.
PrintWriter outFile = new PrintWriter(OUTPUT_FILE);
You can simply use a counter instead of a for loop (which you have incorrectly written as an if, as mentioned by @Shirkam):
String line;
int count=1;
do {
line = inFile.readLine();
if (line != null) {
outFile.println( count++ +"." + line);
}
} while (line != null);
inFile.close();
This works fine at my end.
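
As a self-contained variant of the counter approach above, the following sketch numbers the lines of an in-memory string instead of a file, which makes it easy to try out. LineNumberer and number are hypothetical names; swapping the StringReader and StringWriter for a FileReader and a file-backed PrintWriter should give the file-to-file behaviour the question asks for.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical helper: prefixes every line of the input with "1. ", "2. ", and so on.
class LineNumberer {
    static String number(String text) {
        StringWriter buffer = new StringWriter();
        // try-with-resources closes (and flushes) both the reader and the writer.
        try (BufferedReader in = new BufferedReader(new StringReader(text));
             PrintWriter out = new PrintWriter(buffer)) {
            String line;
            int count = 1;
            while ((line = in.readLine()) != null) {
                // Blank lines are numbered too, matching the expected output in the question.
                out.print(count++ + ". " + line + "\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(number("This is the top line\n\nThe previous one is blank."));
    }
}
```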

How to append multiple text in text file

I want the results from 'name' and 'code' to be written to the log.txt file, but when I run this program only the name results are written; the code results are not appended below them. If I call System.out.println(name) and System.out.println(code), the results are printed to the console, but they are not written to the file. Can someone tell me what I am doing wrong?
Scanner sc = new Scanner(file, "UTF-8");
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter out = new PrintWriter(new FileWriter("log.txt", true));
while ((line = br.readLine()) != null) {
if (line.contains("text1")) {
String[] splits = line.split("=");
String name = splits[2];
for (int i = 0; i < name.length(); i++) {
out.println(name);
}
}
if (line.contains("text2")) {
String[] splits = line.split("=");
String code = splits[2];
for (int i = 0; i < code.length(); i++) {
out.println(code);
}
}
out.close()
}
File looks like:
Name=111111111
Code=333,5555
Category-Warranty
Name=2222222
Code=111,22
Category-Warranty
Have a look at this code. Does that work for you?
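// Keys are compared case-sensitively, so they must match the file ("Name", "Code")
final String NAME = "Name";
final String CODE = "Code";
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter out = new PrintWriter(new FileWriter("log.txt", true));
String line;
while ((line = br.readLine()) != null) {
// Lines without '=' (e.g. "Category-Warranty") yield a single element; skip them
String[] splits = line.split("=", 2);
if (splits.length < 2) {
continue;
}
String key = splits[0];
String value = splits[1];
if (key.equals(NAME) || key.equals(CODE)) {
out.println(value);
}
}
br.close();
out.close();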
You have a couple of problems in your code:
you never actually assign the variables name and code.
you close() your PrintWriter inside the while loop, so the writer is already closed when the second line is processed. PrintWriter does not throw in that case; the later println calls are silently dropped.
I don't see why this wouldn't work, without seeing more of what you are doing:
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter out = new PrintWriter(new FileWriter("log.txt", true));
while ((line = br.readLine()) != null) {
if (line.contains("=")) {
if (line.contains("text1")) {
String[] splits = line.split("=");
if (splits.length >= 2) {
out.println(splits[1]);
}
}
if (line.contains("text2")) {
String[] splits = line.split("=");
if (splits.length >= 2) {
out.println(splits[1]);
}
}
}
}
out.flush();
out.close();
Make sure the second if condition is satisfied i.e. the line String contains "text2".
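
The guarded split used in the answers can be isolated into a small, testable method. This is only a sketch: KeyValueLines and valuesFor are hypothetical names, and it assumes the keys in the file are exactly "Name" and "Code" as in the sample shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the values of selected keys from "key=value" lines.
class KeyValueLines {
    static List<String> valuesFor(List<String> lines, String... keys) {
        List<String> wanted = List.of(keys);
        List<String> values = new ArrayList<>();
        for (String line : lines) {
            // Limit 2 keeps any further '=' characters inside the value.
            String[] parts = line.split("=", 2);
            // Lines without '=' (e.g. "Category-Warranty") have only one part and are skipped.
            if (parts.length == 2 && wanted.contains(parts[0])) {
                values.add(parts[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Name=111111111", "Code=333,5555", "Category-Warranty");
        valuesFor(lines, "Name", "Code").forEach(System.out::println);
    }
}
```

Writing the returned values with a single PrintWriter that is closed once, after the loop, avoids the closed-writer problem described above.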

ArrayIndexOutOfBoundsException - when parsing a csv file

I want to transform a csv file. My file looks like that:
I am using the opencsv library to parse my CSV. This is my run method that parses the file:
public void run() throws Exception {
CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
String [] nextLine;
int i = -1;
String fileName = "";
String companyName = "";
String currency = "";
String writerPath;
List<String> returnList = null;
List<String> dateList = null;
while ((nextLine = reader.readNext()) != null && i < 10) {
String[] line = nextLine;
System.out.println(line[0]);
System.out.println(line);
i++;
//fileName of the String
if(!line[0].contains("NULL")) {
fileName = line[0];
}
writerPath = "C:\\Users\\Desktop\\CSVOutput\\" + fileName + ".csv";
//write csv file
CSVWriter writer = new CSVWriter(new FileWriter(writerPath), ';');
//write Header
String[] entries = "Name;Date;TotalReturn;Currency".split(";");
writer.writeNext(entries);
//create Content
//companyName of the String
if(!line[1].contains("Name")) {
companyName = line[1];
System.out.println(companyName);
}
//currency
if(!line[2].contains("CURRENCY")) {
currency = line[2];
}
//total returns
returnList = new ArrayList<String>();
if(line[0].contains("NULL")) {
for(int j = 3; j <= line.length; j++) {
returnList.add(line[j]); // EXCEPTION COMES HERE!
}
}
//"Name;Date;TotalReturn;Currency"
List<String[]> data = new ArrayList<String[]>();
for(int m = 0; m <= line.length; m++) {
data.add(new String[] {companyName, "lolo", "hereComesTheDateLater", currency});
}
writer.writeAll(data);
//close Writer
writer.close();
}
System.out.println("Done");
}
}
I am getting an
java.lang.ArrayIndexOutOfBoundsException: 3039
at com.TransformCSV.main.ParseCSV.run(ParseCSV.java:78)
at com.TransformCSV.main.ParseCSV.main(ParseCSV.java:20)
at this line: returnList.add(line[j]);?
Why? What are possible ways to fix that?
I really appreciate your answer!
You want j < line.length and not <=. If there are 10 elements in an array, there is no item at index 10; the valid indices are 0-9.
Furthermore, using loads of variables and assigning them one by one is not the preferred way to parse CSV. Java is an object-oriented language.
Use an object to represent each line and bind the line using the opencsv JavaBean API.
Your loop condition uses <= against line.length, so the last iteration reads line[line.length], which does not exist. Use < instead; the loop then stops at index line.length - 1.
Replace with this
for(int j = 3; j <line.length; j++) {
returnList.add(line[j]);
}
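
The corrected bound is easy to check in isolation. In the sketch below, ColumnTail and from are hypothetical names; the method copies every element from a start index to the end of the array, and simply returns an empty list when the row is shorter than the start index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: copies the elements of a parsed row from index 'start' onward.
class ColumnTail {
    static List<String> from(String[] line, int start) {
        List<String> result = new ArrayList<>();
        // Valid indices run from 0 to line.length - 1, so the condition must be '<'.
        for (int j = start; j < line.length; j++) {
            result.add(line[j]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] row = {"NULL", "Name", "CURRENCY", "1.5", "2.5"};
        System.out.println(from(row, 3));
    }
}
```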

The first element discarded while sorting text file using arrays in java

I have this code to sort a text file using arrays in Java, but it always misplaces the first line of the text while sorting.
Here is my code:
import java.io.*;
public class Main {
public static int count(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return count;
} finally {
is.close();
}
}
public static String[] getContents(File aFile) throws IOException {
String[] words = new String[count(aFile.getName()) + 1];
BufferedReader input = new BufferedReader(new FileReader(aFile));
String line = null; //not declared within while loop
int i = 0;
while ((line = input.readLine()) != null) {
words[i] = line;
i++;
}
java.util.Arrays.sort(words);
for (int k = 0; k < words.length; k++) {
System.out.println(words[k]);
}
return words;
}
public static void main(String[] args) throws IOException {
File testFile = new File("try.txt");
getContents(testFile);
}
}
Here is the text file try.txt:
Daisy
Jane
Amanda
Barbara
Alexandra
Ezabile
the output is:
Alexandra
Amanda
Barbara
Ezabile
Jane
Daisy
To solve this problem I have to insert an empty line in the beginning of the text file, is there a way not to do that? I don't know what goes wrong?
I compiled your code (on a Mac) and it works for me. Try opening the file in a hexeditor and see if there is some special character at the beginning of your file. That might be causing the sorting to be incorrect for the first line.
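You probably have a BOM (byte order mark) at the beginning of the file. It is decoded as the character U+FEFF (zero-width no-break space), which is invisible but still takes part in string comparison.
For example, the comparison in the following snippet evaluates to false, because U+FEFF sorts after every ASCII letter: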
String textA = new String(new byte[] { (byte)0xef, (byte)0xbb, (byte) 0xbf, 65}, "UTF-8");
String textB = new String(new byte[] { 66}, "UTF-8");
System.err.println(textA + " < " + textB + " = " + (textA.compareTo(textB) < 0));
The character should show up in the length of the strings, so try printing the length of each line.
System.out.println(words[k] + " " + words[k].length());
And use a list or some other structure so you don't have to read the file twice.
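
If the length check confirms a leading U+FEFF, one possible fix is to strip it from the first line before sorting. The sketch below assumes the file was decoded as UTF-8 and that the BOM, if present, is the first character of the first line; BomSort, stripBom and sortedLines are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: removes a leading byte order mark, then sorts the lines.
class BomSort {
    static final char BOM = '\uFEFF';

    static String stripBom(String line) {
        return (!line.isEmpty() && line.charAt(0) == BOM) ? line.substring(1) : line;
    }

    static List<String> sortedLines(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        if (!copy.isEmpty()) {
            // Only the first line of a file can carry the BOM.
            copy.set(0, stripBom(copy.get(0)));
        }
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedLines(List.of("\uFEFFDaisy", "Jane", "Amanda")));
    }
}
```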
Try something simpler, like this:
public static String[] getContents(File aFile) throws IOException {
List<String> words = new ArrayList<String>();
BufferedReader input = new BufferedReader(new FileReader(aFile));
String line;
while ((line = input.readLine()) != null)
words.add(line);
Collections.sort(words);
return words.toArray(new String[words.size()]);
}
public static void main(String[] args) throws IOException {
File testFile = new File("try.txt");
String[] contents = getContents(testFile);
for (int k = 0; k < contents.length; k++) {
System.out.println(contents[k]);
}
}
Notice that you don't have to iterate over the file to determine how many lines it has, instead I'm adding the lines to an ArrayList, and at the end, converting it to an array.
Use List and the add() method to read your file contents.
Then use Collections.sort() to sort the List.
