ArrayIndexOutOfBoundsException - when parsing a csv file - java

I want to transform a csv file. My file looks like that:
I am using the opencsv library to parse my CSV. This is my run method, which parses the file:
public void run() throws Exception {
CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
String [] nextLine;
int i = -1;
String fileName = "";
String companyName = "";
String currency = "";
String writerPath;
List<String> returnList = null;
List<String> dateList = null;
while ((nextLine = reader.readNext()) != null && i < 10) {
String[] line = nextLine;
System.out.println(line[0]);
System.out.println(line);
i++;
//fileName of the String
if(!line[0].contains("NULL")) {
fileName = line[0];
}
writerPath = "C:\\Users\\Desktop\\CSVOutput\\" + fileName + ".csv";
//write csv file
CSVWriter writer = new CSVWriter(new FileWriter(writerPath), ';');
//write Header
String[] entries = "Name;Date;TotalReturn;Currency".split(";");
writer.writeNext(entries);
//create Content
//companyName of the String
if(!line[1].contains("Name")) {
companyName = line[1];
System.out.println(companyName);
}
//currency
if(!line[2].contains("CURRENCY")) {
currency = line[2];
}
//total returns
returnList = new ArrayList<String>();
if(line[0].contains("NULL")) {
for(int j = 3; j <= line.length; j++) {
returnList.add(line[j]); // EXCEPTION COMES HERE!
}
}
//"Name;Date;TotalReturn;Currency"
List<String[]> data = new ArrayList<String[]>();
for(int m = 0; m <= line.length; m++) {
data.add(new String[] {companyName, "lolo", "hereComesTheDateLater", currency});
}
writer.writeAll(data);
//close Writer
writer.close();
}
System.out.println("Done");
}
}
I am getting this exception:
java.lang.ArrayIndexOutOfBoundsException: 3039
at com.TransformCSV.main.ParseCSV.run(ParseCSV.java:78)
at com.TransformCSV.main.ParseCSV.main(ParseCSV.java:20)
It is thrown at this line: returnList.add(line[j]);
Why? What are possible ways to fix that?
I really appreciate your answer!

You want j < line.length, not <=. If an array has 10 elements, there is no item at index 10; the valid indices are 0-9.
Further, declaring loads of variables and assigning them one by one is not the preferred way to parse CSV. Java is an object-oriented language.
Use an object to represent each line and bind the line using the opencsv JavaBean API.
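As a rough illustration of the "one object per line" idea, here is a minimal plain-Java sketch. It does not use the opencsv bean-binding API. The class name ReturnRow, its fields, and the assumed column order (file name, company name, currency, then the values) are guesses based on the question's code, not something the question confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical value class: one instance per parsed CSV line.
class ReturnRow {
    final String fileName;
    final String companyName;
    final String currency;
    final List<String> values;

    ReturnRow(String fileName, String companyName, String currency, List<String> values) {
        this.fileName = fileName;
        this.companyName = companyName;
        this.currency = currency;
        this.values = values;
    }

    // Builds a row from the String[] that a CSV reader returns for one line.
    // Assumed layout: columns 0-2 are fixed fields, columns 3 and up are values.
    static ReturnRow fromLine(String[] line) {
        if (line.length < 3) {
            throw new IllegalArgumentException("expected at least 3 columns, got " + line.length);
        }
        // subList(3, line.length) excludes index line.length, so it cannot run past the end.
        List<String> values = new ArrayList<>(Arrays.asList(line).subList(3, line.length));
        return new ReturnRow(line[0], line[1], line[2], values);
    }

    public static void main(String[] args) {
        ReturnRow row = fromLine(new String[] {"file1", "Acme", "EUR", "1.0", "2.0"});
        System.out.println(row.companyName + " " + row.currency + " " + row.values);
    }
}
```

With a class like this, the loop body only has to call ReturnRow.fromLine(nextLine) instead of juggling several loose variables.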

Your loop condition uses <=, so it runs up to index line.length, which does not exist. Use < instead, so that the last index accessed is line.length - 1.
Replace the loop with this:
for(int j = 3; j <line.length; j++) {
returnList.add(line[j]);
}
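To see the bound in isolation, here is a small self-contained sketch. The class and method names are made up for illustration, and the sample row is invented.

```java
import java.util.ArrayList;
import java.util.List;

class TailColumns {
    // Collects line[from], line[from + 1], ... up to and including the last element.
    static List<String> collect(String[] line, int from) {
        List<String> result = new ArrayList<>();
        // Valid indices are 0 .. line.length - 1, so the condition must be '<'.
        for (int j = from; j < line.length; j++) {
            result.add(line[j]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] line = {"NULL", "Name", "CURRENCY", "1.5", "2.5"};
        System.out.println(collect(line, 3)); // prints [1.5, 2.5]
    }
}
```

If a row has fewer than four columns, the loop simply does not run and the result is an empty list.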


BufferedReader saving sentence, String array, then 2D int array

I have a .txt file that contains text that I would like to save part of it in a String, part of it in a String array, and then the last part in a 2D int array, and am faced with two issues:
How to read and save both of the arrays when their size is not known ahead of time?
2D array is not reading/saving properly
Here is the text file for reference:
This is the sentence to be read.
This
is
a
String
array.
90 47 110 95 95
101 87
54 0 38 12
Here is part of my method that is supposed to read and save the three data types:
BufferedReader br = new BufferedReader(new FileReader(fileName));
sentence = br.readLine();
stringArr = new String[5]; //how to initialize without set number of elements?
for(int i = 0; i<stringArr.length; i++){
stringArr[i] = br.readLine();
}
int2DArr = new int[3][5]; //how to initialize with any amount of rows and columns?
for(int i = 0; i<int2DArr.length; i++){
for(int j = 0; j<int2DArr[i].length; j++){
int2DArr[i][j] = br.read();
//how to read the space after each int ?
}
}
How would I "grab" the size of the arrays by reading the text file, so that when I initialize both arrays, I have the proper sizes? Any help would be greatly appreciated!
Instead of trying to do everything in a single pass, we can read the file twice and get neater code.
It takes roughly twice as long, of course, but it shows how to break a bigger problem into smaller ones and deal with them one by one.
Here are the steps:
First pass: determine the sizes of stringArr and int2DArr
Second pass: fill the respective arrays with values
If you are wondering how the number of columns of int2DArr is determined: we don't determine it ourselves. We use a jagged array, in which each row can have its own length.
Read more here: How do I create a jagged 2d array in Java?
import java.util.*;
import java.io.*;
class ReadFileIntoArr {
public static void main(String args[]) throws IOException {
String fileName = "test.txt";
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = br.readLine();
int strSize = 0;
int intSize = 0;
boolean isDigit = false;
while (line != null && line.trim().length() != 0) {
if (!isDigit && Character.isDigit(line.charAt(0)))
isDigit = true;
if (isDigit)
intSize++;
else
strSize++;
line = br.readLine();
}
br = new BufferedReader(new FileReader(fileName));
String[] stringArr = new String[strSize];
for (int i = 0; i < stringArr.length; i++)
stringArr[i] = br.readLine();
int[][] int2DArr = new int[intSize][];
for (int i = 0; i < int2DArr.length; i++)
int2DArr[i] = Arrays.stream(br.readLine().split(" ")).mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(stringArr));
System.out.println(Arrays.deepToString(int2DArr));
}
}
Note: This could also be done in a single pass by collecting the values in ArrayLists and copying them into the respective arrays afterwards.
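The single-pass idea could look roughly like the sketch below. The holder class ParsedFile and its field names are invented for illustration. It also classifies every line on its own (a line starting with a digit is treated as a row of ints), which is an assumption that happens to fit the sample file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the three parts of the file.
class ParsedFile {
    String sentence;
    String[] words;
    int[][] numbers;

    static ParsedFile read(Reader source) {
        ParsedFile result = new ParsedFile();
        List<String> wordList = new ArrayList<>();   // grows as needed, size unknown up front
        List<int[]> rowList = new ArrayList<>();     // one int[] per numeric line (jagged)
        try (BufferedReader br = new BufferedReader(source)) {
            result.sentence = br.readLine();         // first line is the sentence
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;                        // skip blank lines
                }
                if (Character.isDigit(line.charAt(0))) {
                    String[] parts = line.split("\\s+");
                    int[] row = new int[parts.length];
                    for (int i = 0; i < parts.length; i++) {
                        row[i] = Integer.parseInt(parts[i]);
                    }
                    rowList.add(row);
                } else {
                    wordList.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Convert the lists to arrays once the sizes are known.
        result.words = wordList.toArray(new String[0]);
        result.numbers = rowList.toArray(new int[0][]);
        return result;
    }

    public static void main(String[] args) {
        String text = "This is the sentence to be read.\nThis\nis\na\nString\narray.\n"
                + "90 47 110 95 95\n101 87\n54 0 38 12\n";
        ParsedFile p = read(new StringReader(text));
        System.out.println(p.sentence);
        System.out.println(Arrays.toString(p.words));
        System.out.println(Arrays.deepToString(p.numbers));
    }
}
```

For a real file, pass new FileReader(fileName) instead of the StringReader; the FileReader constructor can throw FileNotFoundException, which the caller has to handle.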
Update: After understanding the constraints of your problem, here is another version that uses only plain loops and helper methods:
import java.util.*;
import java.io.*;
class ReadFileIntoArr {
public static void main(String args[]) throws IOException {
String fileName = "test.txt";
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = br.readLine();
int strSize = 0;
int intSize = 0;
boolean isDigit = false;
while (line != null && line.trim().length() != 0) {
if (!isDigit && isDigit(line.charAt(0)))
isDigit = true;
if (isDigit)
intSize++;
else
strSize++;
line = br.readLine();
}
br = new BufferedReader(new FileReader(fileName));
String[] stringArr = new String[strSize];
for (int i = 0; i < stringArr.length; i++)
stringArr[i] = br.readLine();
int[][] int2DArr = new int[intSize][];
for (int i = 0; i < int2DArr.length; i++)
int2DArr[i] = convertStringArrToIntArr(br.readLine().split(" "));
System.out.println(Arrays.toString(stringArr));
System.out.println(Arrays.deepToString(int2DArr));
}
public static boolean isDigit(char c) {
return c >= '0' && c <= '9';
}
public static int[] convertStringArrToIntArr(String[] strArr) {
int[] intArr = new int[strArr.length];
for (int i = 0; i < strArr.length; i++)
intArr[i] = Integer.parseInt(strArr[i]);
return intArr;
}
}
Path path = Paths.get(fileName);
List<String> lines = Files.readAllLines(path, Charset.defaultCharset());
String title = lines.get(0);
List<String> words = new ArrayList<>();
for (int i = 1; i < lines.size(); ++i) {
String word = lines.get(i);
if (!word.isEmpty() && Character.isDigit(word.codePointAt(0))) {
break;
}
words.add(word);
}
String[] wordArray = words.toArray(new String[0]);
int i0 = 1 + words.size();
int n = lines.size() - i0;
int[][] numbers = new int[n][];
for (int i = i0; i < lines.size(); ++i) {
String[] values = lines.get(i).trim().split("\\s+");
int m = values.length;
int[] row = new int[m];
for (int j = 0; j < m; ++j) {
row[j] = Integer.parseInt(values[j]);
}
numbers[i - i0] = row;
}
Path is a generalisation of File; it can also refer to other file systems, for example via URIs.
Files is a treasure trove of file utility methods.
One could do without the dynamically sized List, but then one would need a first pass to determine the sizes; normally one would use a List anyhow.
split("\\s+") splits on runs of one or more whitespace characters.
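A small sketch of the trim-then-split step on its own; the class and method names are made up for illustration.

```java
import java.util.Arrays;

class WhitespaceSplit {
    // Trims first so leading whitespace does not produce an empty first token,
    // then splits on runs of spaces and tabs.
    // A blank line would make Integer.parseInt throw NumberFormatException.
    static int[] parseRow(String line) {
        String[] values = line.trim().split("\\s+");
        int[] row = new int[values.length];
        for (int j = 0; j < values.length; ++j) {
            row[j] = Integer.parseInt(values[j]);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseRow("  90  47\t110 95 95 "))); // prints [90, 47, 110, 95, 95]
    }
}
```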

Converting ArrayLists in Java

I have the following code which counts and displays the number of times each word occurs in the whole text document.
try {
List<String> list = new ArrayList<String>();
int totalWords = 0;
int uniqueWords = 0;
File fr = new File("filename.txt");
Scanner sc = new Scanner(fr);
while (sc.hasNext()) {
String words = sc.next();
String[] space = words.split(" ");
for (int i = 0; i < space.length; i++) {
list.add(space[i]);
}
totalWords++;
}
System.out.println("Words with their frequency..");
Set<String> uniqueSet = new HashSet<String>(list);
for (String word : uniqueSet) {
System.out.println(word + ": " + Collections.frequency(list,word));
}
} catch (Exception e) {
System.out.println("File not found");
}
Is it possible to modify this code to make it so it only counts each occurrence once per line rather than in the entire document?
One can read the contents per line and then apply logic per line to count the words:
File fr = new File("filename.txt");
FileReader fileReader = new FileReader(fr);
BufferedReader br = new BufferedReader(fileReader);
// Read the line in the file
String line = null;
while ((line = br.readLine()) != null) {
//Code to count the occurrences of the words
}
Yes. A Set is similar to an ArrayList, with the key difference that it holds no duplicates.
So read one line at a time and put that line's words into a set before adding them to the list.
In your while loop:
while (sc.hasNextLine()) {
String line = sc.nextLine().trim();
if (line.isEmpty()) continue;
String[] space = line.split("\\s+");
//convert the array of words on this line -> set (drops duplicates within the line)
Set<String> set = new HashSet<String>(Arrays.asList(space));
list.addAll(set);
totalWords += space.length;
}
Rest of the code should remain the same.
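Counting each word at most once per line can also be sketched as a standalone method. The class name and the use of a TreeMap (chosen only to get a stable, sorted output) are illustrative choices, not part of the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class PerLineWordCount {
    // For each word, counts the number of lines on which it appears.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // a blank line contributes nothing
            }
            // The set removes duplicates within this one line.
            Set<String> unique = new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
            for (String word : unique) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b a", "b c", "a a a");
        System.out.println(count(lines)); // prints {a=2, b=2, c=1}
    }
}
```

The lines could come from Scanner.nextLine() or BufferedReader.readLine(), collected into a list or processed one by one.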

Chinese character garbled for one line

I have a table with 6 columns, one of which contains Chinese characters, and the table holds 200 records.
I have written code to save the records to a text file. When I fetch all records, the Chinese text in the file is displayed correctly. But when I fetch only one record, the Chinese text is garbled.
I am using the code below.
public static void main(String args[]){
String outputFile = fileNameEncode("C:\\a\\a.txt");
FileOutputStream os = new FileOutputStream(outputFile);
writeToFile(os);
}
private static String fileNameEncode(String name) {
String file;
try {
byte[] utf_byte = name.getBytes("UTF-8");
StringBuilder sb = new StringBuilder(1024);
for (byte b : utf_byte) {
int integer = b & 0xFF; // drop the minus sign
sb.append((char) integer);
}
file = sb.toString();
} catch (Exception e) {
file = name;
}
return file;
}
public void writeToFile(FileOutputStream os) {
PrintWriter pw = new PrintWriter(new BufferedWriter(new OutputStreamWriter(os, "GBK")));
for (int rowNum = 0; rowNum < arrayList.size(); rowNum++) {//arrayList contains data from db
ArrayList list = arrayList.get(rowNum);
for(int k = 0; k < list.size(); k++) {
String[] data = new String[6];
for (int colNum = 0; colNum < 6; colNum++) {
data[colNum] = list.get(i).toString();
}
String outLine = composeLine(data, ctlInfo);
// write the line
pw.print(outLine);
pw.println();
}
}
}
private static String composeLine(String[] data, ControlInfo ctl) {
StringBuilder line = new StringBuilder();
String delim = ",";
int elemCount = data.length;
for (int i = 0; i < elemCount; i++) {
if (i > 0)
line.append(delim);
if (data[i] != null && (data[i].contains("\n") || data[i].contains("\r") ||
data[i].contains("\r\n"))){
data[i] = data[i].replaceAll("(\\t|\\r?\\n)+", " ");
}
else {
line.append(data[i]);
}
}
return line.toString();
}
Could you please let me know where I am going wrong?
I found the issue: the code is fine, the problem is in Notepad++. If the character set in Notepad++ is set to Chinese (GB2312), the text is displayed correctly. Notepad++ auto-detects GB2312 when the file has two lines, but with only one line it does not.
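The effect can be reproduced without any editor: the bytes in the file are the same, and only the charset used to decode them decides whether the text looks right. The sketch below is an illustration under assumptions: the class name is made up, and it falls back to UTF-8 if the running JRE does not ship the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetGuessDemo {
    // Encodes the text with one charset and decodes the bytes with another,
    // which is what happens when a viewer guesses the wrong encoding.
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        byte[] bytes = text.getBytes(writeAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // Assumption: GBK is usually available, but it is not guaranteed on every JRE.
        Charset gbk = Charset.isSupported("GBK") ? Charset.forName("GBK") : StandardCharsets.UTF_8;
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        // Same charset on both sides: the text survives.
        System.out.println(roundTrip(text, gbk, gbk).equals(text)); // prints true
        // Wrong guess on the reading side: the text is garbled.
        System.out.println(roundTrip(text, gbk, StandardCharsets.ISO_8859_1).equals(text)); // prints false
    }
}
```

An editor that has to guess the encoding has less evidence to go on when the file is very short, which would fit the one-line case described above.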

reading content from file, then copying it to another file (updated code)

Here's the code:
FileReader fr = new FileReader("datos_clientes.txt");
BufferedReader br = new BufferedReader(fr);
while ((line = br.readLine()) != null) {
String nameMark = "#n";
String addressMark = "#d";
int nameStart = line.indexOf(nameMark) + nameMark.length();
int addressStart = line.indexOf(addressMark) + addressMark.length();
String name = line.substring(nameStart, addressStart - addressMark.length());
String address = line.substring(addressStart, line.length());
if (line.startsWith("tipo1.")) {
FileWriter fw = new FileWriter(name +".txt");
char[] vector = name.toCharArray();
char[] vector2 = address.toCharArray();
int index = 0;
while (index < vector.length) {
fw.write(vector[index]+vector2[index]);
index++;
}
fw.close();
} else if (line.startsWith("tipo2.")) {
FileWriter fw = new FileWriter(name +".txt");
char[] vector = name.toCharArray();
char[] vector2 = address.toCharArray();
int index = 0;
while (index < vector.length) {
fw.write(vector[index]+vector2[index]);
index++;
}
fw.close();
}
else if (line.startsWith("tipo3.")) {
FileWriter fw = new FileWriter(name +".txt");
char[] vector = name.toCharArray();
char[] vector2 = address.toCharArray();
int index = 0;
while (index < vector.length) {
fw.write(vector[index]+vector2[index]);
index++;
}
fw.close();
}
}
What I want from this code is to create each new file with the name of the recipient and their address.
Instead, the new files just show a combination of random alphabetical characters.
Then I have three pre-made files which I have to include in each new file. For example, if one of the new files is "Maria Roberts.txt" and this person will receive a "type 1" letter, I want the file (Maria Roberts.txt) to include her name, her address and the contents of the file "type1.txt".
I don't know how to do that.
I know I add things in every new question... sorry, I think it will be easier for me to understand this way.
Thanks again!
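You're adding one character from the name array to one character from the address array, then outputting the result. In Java, char + char is an int, so fw.write receives the sum of the two character codes and writes whatever character that sum maps to.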
fw.write(vector[index]+vector2[index]);
Instead, you want to write the entire name array, then (in a different loop) write the entire address array.
int index = 0;
while (index < vector.length) {
fw.write(vector[index]);
index++;
}
index = 0;
while (index < vector2.length) {
fw.write(vector2[index]);
index++;
}
That will just stick them together, but you can use your imagination and figure out how to separate them the way you want.
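One possible way to separate them is to put the name and the address on their own lines and write the whole string at once. This is only a sketch: the class name is made up, and the line layout "tipo1.#n<name>#d<address>" is inferred from the question's indexOf calls.

```java
class LetterHeader {
    // Extracts name and address from a line of the assumed form
    // "tipo1.#n<name>#d<address>" and returns them on separate lines.
    static String format(String line) {
        String nameMark = "#n";
        String addressMark = "#d";
        int namePos = line.indexOf(nameMark);
        int addressPos = line.indexOf(addressMark);
        if (namePos < 0 || addressPos < namePos) {
            throw new IllegalArgumentException("missing #n or #d marker: " + line);
        }
        String name = line.substring(namePos + nameMark.length(), addressPos);
        String address = line.substring(addressPos + addressMark.length());
        String newline = System.lineSeparator();
        return name + newline + address + newline;
    }

    public static void main(String[] args) {
        // Prints the name on the first line and the address on the second.
        System.out.print(format("tipo1.#nMaria Roberts#dMain Street 1"));
    }
}
```

In the question's loop this would replace the character-by-character writing with a single fw.write(LetterHeader.format(line)) call, since FileWriter can write a whole String.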

Parsing in Java - unexpected quotes

I'm trying to understand a piece of code. I didn't write it, I'm just trying to make it work.
It's meant to transform a .csv file.
The code is this:
import java.io.*;
import au.com.bytecode.opencsv.*;
public class StockParser
{
public static void main(String[] args) throws FileNotFoundException, IOException
{
CSVReader reader = new CSVReader(new FileReader("/home/cloudera/Desktop/training.csv"));
String [] nextLine;
String [] previousLine;
String [] headernew = new String [reader.readNext().length +1];
CSVWriter writer = new CSVWriter(new FileWriter("/home/cloudera/Desktop/final.csv"), ',');
nextLine = reader.readNext();
for (int i = 0; i < nextLine.length;i++)
{
headernew[i] = nextLine[i];
}
headernew[headernew.length-1] = "action";
writer.writeNext(headernew);
previousLine = reader.readNext();
while ((nextLine = reader.readNext()) != null)
{
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + " etc...");
headernew = new String [nextLine.length + 1];
for (int i = 0; i < headernew.length-1;i++)
{
headernew[i] = nextLine[i];
}
if (Double.parseDouble(previousLine[4]) < Double.parseDouble(nextLine[4]))
{
headernew[headernew.length-1] = "SELL";
}
else
{
headernew[headernew.length-1] = "BUY";
}
writer.writeNext(headernew);
previousLine = nextLine;
}
reader.close();
writer.close();
}
}
It works generally, but there's a problem: the input file's first line is
Open,High,Low,Close,Volume,Adj Close
followed by lines like
59.30,60.05,58.88,59.41,3373800,59.41
The output file should have the same first line, + action, and then similar lines, + BUY or SELL, but when I run this code, it somehow loses the
Open,High,Low,Close,Volume,Adj Close,action
line,
and the next lines look like
"59.64","60.26","58.88","59.83","3069100","59.83","BUY"
Where did the quotes come from, and what should I do to get rid of them?
This is an answer to:
"Where did the quotes come from, and what should I do to get rid of them?"
The constructor described in the documentation of CSVWriter allows specifying a quote-character. Try the following:
CSVWriter writer = new CSVWriter(new FileWriter("/home/cloudera/Desktop/final.csv"), ',', CSVWriter.NO_QUOTE_CHARACTER);
The last parameter should suppress all quoting characters.
The short answer is to use:
CSVWriter writer = new CSVWriter(new FileWriter(
"/home/cloudera/Desktop/final.csv"), ',',CSVWriter.NO_QUOTE_CHARACTER);
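But there is another issue with your code, which I tried to correct under my own assumptions. The header line is lost because reader.readNext() is called once just to size headernew and then again to fill it, so the first data row gets written as the header. The version below reads the header only once: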
public static void main(String[] args) throws FileNotFoundException,
IOException {
CSVReader reader = new CSVReader(new FileReader(
"training.csv"));
String[] nextLine;
String[] previousLine;
nextLine = reader.readNext();
String[] headernew = new String[nextLine.length + 1];
CSVWriter writer = new CSVWriter(new FileWriter(
"final.csv"), ',', CSVWriter.NO_QUOTE_CHARACTER);
for (int i = 0; i < nextLine.length; i++) {
headernew[i] = nextLine[i];
}
headernew[headernew.length - 1] = "action";
writer.writeNext(headernew);
previousLine = reader.readNext();
while ((nextLine = reader.readNext()) != null) {
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + " etc...");
headernew = new String[nextLine.length + 1];
for (int i = 0; i < headernew.length - 1; i++) {
headernew[i] = nextLine[i];
}
if (Double.parseDouble(previousLine[4]) < Double
.parseDouble(nextLine[4])) {
headernew[headernew.length - 1] = "SELL";
} else {
headernew[headernew.length - 1] = "BUY";
}
writer.writeNext(headernew);
previousLine = nextLine;
}
reader.close();
writer.close();
}
OK, EddyG pointed me in the right direction for finding the problem. The CSVWriter class in the opencsv jar has several constructor variants, among them public CSVWriter(Writer writer, char separator, char quotechar). Adjusting quotechar improved things.
Source: opencsv documentation
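To make the intended transformation easier to follow, here is a plain-Java sketch of the same logic without opencsv. It is a simplification under stated assumptions: it splits on commas naively (no quoted fields or embedded commas, which is what a CSV library is for), and the class and method names are made up. Like the code above, it compares column index 4 and uses the first data row only as the initial "previous" row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ActionColumn {
    // Takes the raw lines of the input file and returns the output lines:
    // the header plus ",action", then each later row plus ",SELL" or ",BUY".
    static List<String> addAction(List<String> lines) {
        List<String> out = new ArrayList<>();
        if (lines.isEmpty()) {
            return out;
        }
        out.add(lines.get(0) + ",action"); // the header is read exactly once
        // Row 1 only serves as the first "previous" row, as in the code above.
        for (int i = 2; i < lines.size(); i++) {
            double previous = Double.parseDouble(lines.get(i - 1).split(",")[4]);
            double current = Double.parseDouble(lines.get(i).split(",")[4]);
            out.add(lines.get(i) + "," + (previous < current ? "SELL" : "BUY"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Open,High,Low,Close,Volume,Adj Close",
                "59.30,60.05,58.88,59.41,3373800,59.41",
                "59.64,60.26,58.88,59.83,3069100,59.83");
        for (String line : addAction(input)) {
            System.out.println(line);
        }
    }
}
```

Because the output lines are built with plain string concatenation, no quote characters are added, which is the same effect as passing CSVWriter.NO_QUOTE_CHARACTER.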
