Optimising CSV parsing to be faster - java

I'm working on a program that reads data from two large CSV files line by line, compares one array element from each file and, when a match is found, writes the data I need to a third file. The only problem is that it is very slow: it processes 1-2 lines per second, and I have millions of records. Any ideas on how I could make it faster? Here's my code:
public class ReadWriteCsv {
public static void main(String[] args) throws IOException {
FileInputStream inputStream = null;
FileInputStream inputStream2 = null;
Scanner sc = null;
Scanner sc2 = null;
String csvSeparator = ",";
String line;
String line2;
String path = "D:/test1.csv";
String path2 = "D:/test2.csv";
String path3 = "D:/newResults.csv";
String[] columns;
String[] columns2;
Boolean matchFound = false;
int count = 0;
StringBuilder builder = new StringBuilder();
FileWriter writer = new FileWriter(path3);
try {
// specifies where to take the files from
inputStream = new FileInputStream(path);
inputStream2 = new FileInputStream(path2);
// creating scanners for files
sc = new Scanner(inputStream, "UTF-8");
// while there is another line available do:
while (sc.hasNextLine()) {
count++;
// storing the current line in the temporary variable "line"
line = sc.nextLine();
System.out.println("Number of lines read so far: " + count);
// defines the columns[] as the line being split by ","
columns = line.split(",");
inputStream2 = new FileInputStream(path2);
sc2 = new Scanner(inputStream2, "UTF-8");
// checks if there is a line available in File2 and goes in the
// while loop, reading file2
while (!matchFound && sc2.hasNextLine()) {
line2 = sc2.nextLine();
columns2 = line2.split(",");
if (columns[3].equals(columns2[1])) {
matchFound = true;
builder.append(columns[3]).append(csvSeparator);
builder.append(columns[1]).append(csvSeparator);
builder.append(columns2[2]).append(csvSeparator);
builder.append(columns2[3]).append("\n");
String result = builder.toString();
writer.write(result);
}
}
builder.setLength(0);
sc2.close();
matchFound = false;
}
if (sc.ioException() != null) {
throw sc.ioException();
}
} finally {
//then I close my inputStreams, scanners and writer

Use an existing CSV library rather than rolling your own. It will be far more robust than what you have now.
However, your problem is not CSV parsing speed; it is that your algorithm is O(n^2): for each line in the first file, you scan the second file. This kind of algorithm explodes very quickly with the size of the data, and with millions of rows you will run into problems. You need a better algorithm.
The other problem is that you re-read and re-parse the second file on every scan. You should at least read it into memory (an ArrayList, for example) once at the start of the program, so you only need to load and parse it once.
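One way to get away from the O(n^2) scan is a hash join: index the second file by its key column once, then stream the first file and look each key up. The following is only a minimal sketch under stated assumptions. The class name CsvHashJoin and the method join are hypothetical, the column positions (3 in the first file, 1 in the second) are taken from the question's code, plain split(",") is kept for brevity even though a CSV library would be more robust, and the second file is assumed to fit in memory.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class and method names, used only for illustration.
class CsvHashJoin {

    // Writes file1[3], file1[1], file2[2], file2[3] for every row of file 1
    // whose column 3 equals column 1 of some row in file 2.
    // Returns the number of matches written.
    static int join(Reader file1, Reader file2, Writer out) {
        Map<String, String[]> index = new HashMap<>();
        int matches = 0;
        try (BufferedReader right = new BufferedReader(file2);
             BufferedReader left = new BufferedReader(file1);
             BufferedWriter writer = new BufferedWriter(out)) {
            // Pass 1: read file 2 once and index its rows by the key column.
            for (String line; (line = right.readLine()) != null; ) {
                String[] cols = line.split(",");
                if (cols.length > 3) {
                    // Keep the first row per key, as the original inner loop did.
                    index.putIfAbsent(cols[1], cols);
                }
            }
            // Pass 2: stream file 1 and look each key up in the map.
            for (String line; (line = left.readLine()) != null; ) {
                String[] cols = line.split(",");
                String[] other = cols.length > 3 ? index.get(cols[3]) : null;
                if (other != null) {
                    writer.write(cols[3] + "," + cols[1] + "," + other[2] + "," + other[3] + "\n");
                    matches++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }

    public static void main(String[] args) {
        // In-memory stand-ins for D:/test1.csv and D:/test2.csv.
        StringWriter out = new StringWriter();
        int n = join(new StringReader("a,b,c,K1\nd,e,f,K9\n"),
                     new StringReader("x,K1,y,z\n"), out);
        System.out.print(n + " match(es)\n" + out);
    }
}
```

With real files you would pass a FileReader for each input and a FileWriter for the output. Each file is then read exactly once, at the cost of holding the indexed rows of the second file in memory.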

Use univocity-parsers' CSV parser as it won't take much longer than a couple of seconds to process two files with 1 million rows each:
public void diff(File leftInput, File rightInput) {
CsvParserSettings settings = new CsvParserSettings(); //many config options here, check the tutorial
CsvParser leftParser = new CsvParser(settings);
CsvParser rightParser = new CsvParser(settings);
leftParser.beginParsing(leftInput);
rightParser.beginParsing(rightInput);
String[] left;
String[] right;
int row = 0;
while ((left = leftParser.parseNext()) != null && (right = rightParser.parseNext()) != null) {
row++;
if (!Arrays.equals(left, right)) {
System.out.println(row + ":\t" + Arrays.toString(left) + " != " + Arrays.toString(right));
}
}
leftParser.stopParsing();
rightParser.stopParsing();
}
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

Related

Reading a text file into multiple arrays in Java

I'm currently working on a program that reads in a preset text file and then manipulates the data in various ways. I've got the data manipulation to work with some dummy data but I still need to get the text file read in correctly.
The test file looks like this for 120 lines:
Aberdeen,Scotland,57,9,N,2,9,W,5:00,p.m. Adelaide,Australia,34,55,S,138,36,E,2:30,a.m. Algiers,Algeria,36,50,N,3,0,E,6:00,p.m.(etc etc)
So each of these needs to be read into its own array, in order: String[] CityName, String[] Country, int[] LatDeg, int[] LatMin, String[] NorthSouth, int[] LongDeg, int[] LongMin, String[] EastWest, int[] Time, String[] AMPM
So the problem is that while I'm reasonably comfortable with buffered readers, designing this particular function has proven difficult. In fact, I've been drawing a blank for the past few hours. It seems like it would need multiple loops and counters, but I can't figure out precisely how.
I am assuming that your file has one city per line. If it does not, the following solution will need a bit of tweaking.
Since you say you are comfortable with BufferedReader, I would do it the following way:
List<List<String>> addresses = new ArrayList<List<String>>();
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
for (String line; (line = br.readLine()) != null; ) {
// split() returns a String[], so wrap it to match List<String>
addresses.add(Arrays.asList(line.split(",")));
}
}
Later, if you want to retrieve the country for, say, 'Adelaide', you can try the following:
for (List<String> cityInfo : addresses) {
if ("Adelaide".equals(cityInfo.get(0))) {
country = cityInfo.get(1);
}
}
Instead of creating different arrays (String[] CityName, String[] Country, etc.), try using a domain object.
Here, you can have a domain object, a custom class Location, with attributes:
public class Location
{
private String cityName;
private String country;
private String latDeg;
// ... remaining fields, plus getters and setters
}
Then you can write a file reader in which each line of the file becomes a Location. The result will be
Location[] locations;
or
List<Location> locations;
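To make the domain-object idea concrete, here is a minimal sketch, assuming one city per line with exactly ten comma-separated fields as in the question's sample. The static parse method, the choice of int for the degree and minute fields, and keeping the time as text are assumptions made for illustration; they are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain class; field names follow the arrays named in the question.
class Location {
    final String cityName;
    final String country;
    final int latDeg;
    final int latMin;
    final String northSouth;
    final int longDeg;
    final int longMin;
    final String eastWest;
    final String time;   // kept as text because values such as "5:00" are not integers
    final String amPm;

    private Location(String[] f) {
        cityName = f[0];
        country = f[1];
        latDeg = Integer.parseInt(f[2]);
        latMin = Integer.parseInt(f[3]);
        northSouth = f[4];
        longDeg = Integer.parseInt(f[5]);
        longMin = Integer.parseInt(f[6]);
        eastWest = f[7];
        time = f[8];
        amPm = f[9];
    }

    // Parses one comma-separated line of the form shown in the question.
    static Location parse(String line) {
        String[] f = line.split(",");
        if (f.length != 10) {
            throw new IllegalArgumentException("Expected 10 fields: " + line);
        }
        return new Location(f);
    }

    public static void main(String[] args) {
        List<Location> locations = new ArrayList<>();
        locations.add(Location.parse("Aberdeen,Scotland,57,9,N,2,9,W,5:00,p.m."));
        Location first = locations.get(0);
        System.out.println(first.cityName + ", " + first.country
                + " at " + first.time + " " + first.amPm);
    }
}
```

In a file-reading loop, each line returned by readLine() would go through Location.parse and be added to the list, so all ten values for a city stay together in one object.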
To carry out this task, the first thing you want to do is establish how many lines of data actually exist in the data file. You say it is 120 lines, but what if it turns out to be more or fewer? We want to know the exact number so that we can properly initialize all our arrays. We can use a simple method to accomplish this; let's call it getFileLinesCount(). It returns an int holding the number of text lines in the data file:
private int getFileLinesCount(final String filePath) {
int lines = 0;
try{
File file =new File(filePath);
if(file.exists()){
FileReader fr = new FileReader(file);
try (LineNumberReader lnr = new LineNumberReader(fr)) {
while (lnr.readLine() != null){ lines++; }
}
}
else {
throw new IllegalArgumentException("GetFileLinesCount() Method Error!\n"
+ "The supplied file path does not exist!\n(" + filePath + ")");
}
}
catch(IOException e){ e.printStackTrace(); }
return lines;
}
Place this method somewhere within your main class. Now you need to Declare and initialize all your Arrays:
String filePath = "C:\\My Files\\MyDataFile.txt";
int lines = getFileLinesCount(filePath);
String[] CityName = new String[lines];
String[] Country = new String[lines];
int[] LatDeg = new int[lines];
int[] LatMin = new int[lines];
String[] NorthSouth = new String[lines];
int[] LongDeg = new int[lines];
int[] LongMin = new int[lines];
String[] EastWest = new String[lines];
int[] Time = new int[lines];
String[] AMPM = new String[lines];
Now to fill up all those Arrays:
public static void main(String args[]) {
loadUpArrays();
// Do whatever you want to do
// with all those Arrays.....
}
private void loadUpArrays() {
// Read in the data file.
try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
String sCurrentLine;
int x = 0;
// Read in one line at a time and Fill the Arrays...
while ((sCurrentLine = br.readLine()) != null) {
// Split each line read into an array upon itself.
String[] fileLine = sCurrentLine.split(",");
// Fill our required Arrays...
CityName[x] = fileLine[0];
Country[x] = fileLine[1];
LatDeg[x] = Integer.parseInt(fileLine[2]);
LatMin[x] = Integer.parseInt(fileLine[3]);
NorthSouth[x] = fileLine[4];
LongDeg[x] = Integer.parseInt(fileLine[5]);
LongMin[x] = Integer.parseInt(fileLine[6]);
EastWest[x] = fileLine[7];
Time[x] = Integer.parseInt(fileLine[8]);
AMPM[x] = fileLine[9];
x++;
}
br.close();
}
catch (IOException ex) { ex.printStackTrace(); }
}
Now, I haven't tested this, I just quickly punched it out, but I think you can get the gist of it.
EDIT:
As @Mad Physicist has so graciously pointed out in his comment below, a List can be used to eliminate the need to count file lines, and therefore the need to read the data file twice. All the file lines can be placed into the List, and the number of valid file lines is then simply the size of the List. Your arrays can then be filled by iterating through the List elements and processing the data accordingly. Everything can be achieved with a single method we'll call fillArrays(). Your array declarations will be a little different, however:
String[] CityName;
String[] Country;
int[] LatDeg;
int[] LatMin;
String[] NorthSouth;
int[] LongDeg;
int[] LongMin;
String[] EastWest;
String[] Time;
String[] AMPM;
public static void main(String args[]) {
fillArrays("C:\\My Files\\MyDataFile.txt");
// Whatever you want to do with all
// those Arrays...
}
private void fillArrays(final String filePath) {
List<String> fileLinesList = new ArrayList<>();
try{
File file = new File(filePath);
if(file.exists()){
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String strg;
while((strg = br.readLine()) != null){
// Make sure there is no blank line. If not
// then add line to List.
if (!strg.equals("")) { fileLinesList.add(strg); }
}
br.close();
}
}
else {
throw new IllegalArgumentException("fillArrays() Method Error!\n"
+ "The supplied file path does not exist!\n(" + filePath + ")");
}
// Initialize all the Arrays...
int lines = fileLinesList.size();
CityName = new String[lines];
Country = new String[lines];
LatDeg = new int[lines];
LatMin = new int[lines];
NorthSouth = new String[lines];
LongDeg = new int[lines];
LongMin = new int[lines];
EastWest = new String[lines];
Time = new String[lines];
AMPM = new String[lines];
// Fill all the Arrays...
for (int i = 0; i < fileLinesList.size(); i++) {
String[] lineArray = fileLinesList.get(i).split(",");
CityName[i] = lineArray[0];
Country[i] = lineArray[1];
LatDeg[i] = Integer.parseInt(lineArray[2]);
LatMin[i] = Integer.parseInt(lineArray[3]);
NorthSouth[i] = lineArray[4];
LongDeg[i] = Integer.parseInt(lineArray[5]);
LongMin[i] = Integer.parseInt(lineArray[6]);
EastWest[i] = lineArray[7];
Time[i] = lineArray[8];
AMPM[i] = lineArray[9];
}
}
catch(IOException e){ e.printStackTrace(); }
}
On another note: your Time array cannot be an int[] because the time values in the data contain a colon (:), which is not a digit. That is why (in case you haven't noticed) I changed its declaration to String[].

read from file and write some parts in another file

I have to read from a text file and format the input. I'm new to reading from files in Java, and I don't know how to work with just some parts of what I read.
Here is the initial file: http://pastebin.com/D0paWtAd
And I have to write in another file the following output:
Average,Joe,44,31,18,12,9,10
I've managed just to take everything from the file and print it to output. I would need help just in taking the output I need and print it to the screen. Any help is appreciated.
This is what I wrote up to now:
public class FileParsing {
public static String
read(String filename) throws IOException {
BufferedReader in = new BufferedReader(new FileReader("C:\\Users\\Bogdi\\Desktop\\example.txt"));
String s;
StringBuilder sb = new StringBuilder();
while((s = in.readLine())!= null) sb.append(s + "\n");
in.close();
return sb.toString();
}
If your goal is to produce the specified output in another file, you don't need to first load the content of your file into a StringBuilder before processing it. You can append the processed data directly to a StringBuilder and then write the result to a file. Here is an example that works for the given file, but you may have to modify it if the keys change in the future:
The following method will correctly process the data from your file
public static String read(String filename) throws IOException {
BufferedReader in = new BufferedReader(new FileReader(filename));
String s;
StringBuilder sb = new StringBuilder();
while((s = in.readLine())!= null) {
String[] split1 = s.split("=");
if (split1[0].equals("name")) {
StringTokenizer tokenizer = new StringTokenizer(split1[1]);
sb.append(tokenizer.nextToken());
sb.append(",");
sb.append(tokenizer.nextToken());
sb.append(",");
} else if (split1[0].equals("index")) {
sb.append(split1[1] + ",");
} else if (split1[0].equals("FBid")) {
sb.append(split1[1]);
} else {
StringTokenizer tokenizer = new StringTokenizer(split1[1]);
String wasted = tokenizer.nextToken();
sb.append(tokenizer.nextToken() + ",");
}
}
in.close();
return sb.toString();
}
The next method will write any string to a file
public static void writeStringToFile(String string, String filePath) throws IOException {
BufferedWriter writer = new BufferedWriter(
new FileWriter(
new File(filePath)
)
);
writer.write(string);
writer.newLine();
writer.flush();
writer.close();
}
And here is a simple test (File1.txt contains the data from the file you shared on pastebin, and I write the result to another file)
public static void main(String[] args) throws Exception {
String datas = read("C:\\Tests\\File1.txt");
System.out.println(datas);
writeStringToFile(datas, "C:\\Tests\\FileOuput.txt" );
}
It will produce the exact output that you are expecting
[EDIT] @idk, apparently you get an exception when executing my example, while it works fine for me. That can only mean there is a problem in the data. Here is the data sample that I used (I believe I copied exactly the data you shared)
And here is the result:
Good to know you are using StringBuilder instead of concatenating your String values, way to go :).
More than knowledge of the java.io API for working with files, you will need some logic to get the results you expect. Here is an approach that could help you; it is not perfect, but it can point you towards how to tackle this problem.
//Reference to your file
String myFilePath = "c:/dev/myFile.txt";
File myFile = new File(myFilePath);
//Create a buffered reader, which is a good start
BufferedReader breader = new BufferedReader(new FileReader(myFile));
//Define this variable called line that will evaluate each line of our file
String line = null;
//I will use a StringBuilder to append the information I need
StringBuilder appender = new StringBuilder();
while ((line = breader.readLine()) != null) {
//First, I will obtain the characters after "equals" sign
String afterEquals = line.substring(line.indexOf("=") + 1, line.length());
//Then, if it contains digits...
if (afterEquals.matches(".*\\d+.*")) {
//I will just get the digits from the line
afterEquals = afterEquals.replaceAll("\\D+","");
}
//Finally, append the contents
appender.append(afterEquals);
appender.append(",");//This is the comma you want to include
}
//I will delete the last comma
appender.deleteCharAt(appender.length() - 1);
//Close the reader...
breader.close();
//Then create a process to write the content
BufferedWriter myWriter = new BufferedWriter(new FileWriter(new File("myResultFile.txt")));
//Write the full contents I get from my appender :)
myWriter.write(appender.toString());
//Close the writer
myWriter.close();
Hope this can help you. Happy coding!

how to read from a huge file and write to a new file by java

I am reading a file line by line, formatting every line, then writing it to a new file. The problem is that the file is huge, nearly 178 MB, and I always get the error message: IO console updater error, java heap space. Here is my code:
public class fileFormat {
public static void main(String[] args) throws IOException{
String strLine;
FileInputStream fstream = new FileInputStream("train_final.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fstream));
BufferedWriter writer = new BufferedWriter(new FileWriter("newOUTPUT.txt"));
while((strLine = reader.readLine()) != null){
List<String> numberBox = new ArrayList<String>();
StringTokenizer st = new StringTokenizer(strLine);
while(st.hasMoreTokens()){
numberBox.add(st.nextToken());
}
for (int i=1; i< numberBox.size(); i++){
String head = numberBox.get(0);
String tail = numberBox.get(i);
String line = head + " "+tail ;
System.out.println(line);
writer.write(line);
writer.newLine();
}
numberBox.clear();
}
reader.close();
writer.close();
}
}
How can I avoid this error message? Moreover, I have set the VM preference: -xms1024m
Remove the line
System.out.println(line);
This works around the failing console updater, which otherwise runs out of memory.
The program looks okay. I suspect the problem is that you run this inside of Eclipse, and System.out is collected by Eclipse in memory (to be displayed in that Console window).
System.out.println(line);
Try to run it outside of Eclipse, change Eclipse settings to pipe System.out somewhere, or remove the line.
This part of the code:
for (int i=1; i< numberBox.size(); i++){
String head = numberBox.get(0);
String tail = numberBox.get(i);
String line = head + " "+tail ;
System.out.println(line);
writer.write(line);
writer.newLine();
}
Can be translated to:
String head = numberBox.get(0);
for (int i=1; i< numberBox.size(); i++){
String tail = numberBox.get(i);
System.out.print(head);
System.out.print(" ");
System.out.println(tail);
writer.write(head);
writer.write(" ");
writer.write(tail);
writer.newLine();
}
This may add a little code duplication, but it avoids creating a lot of objects.
Also, if you merge this for loop with the loop constructing numberBox, you won't need the numberBox structure at all.
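The merged loop could look roughly like this: the first token is kept as the head and every later token is written out as soon as it is read, so no intermediate list is built. This is only a sketch. The class name PairWriter and the method writePairs are hypothetical, and a PrintWriter is used instead of the question's BufferedWriter purely to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.StringTokenizer;

// Hypothetical helper, used only for illustration.
class PairWriter {

    // Emits "head tail" for each token after the first, without an intermediate list.
    static void writePairs(String line, PrintWriter out) {
        StringTokenizer st = new StringTokenizer(line);
        if (!st.hasMoreTokens()) {
            return; // blank line: nothing to write
        }
        String head = st.nextToken();
        while (st.hasMoreTokens()) {
            out.print(head);
            out.print(' ');
            out.println(st.nextToken());
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for train_final.txt and newOUTPUT.txt.
        BufferedReader reader = new BufferedReader(new StringReader("1 10 20 30\n2 40\n"));
        StringWriter sink = new StringWriter();
        try (PrintWriter out = new PrintWriter(sink)) {
            for (String line; (line = reader.readLine()) != null; ) {
                writePairs(line, out);
            }
        }
        System.out.print(sink);
    }
}
```

Only one line of input is held in memory at a time, and nothing is echoed to System.out, which matches the advice above about the console.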
If you read the whole file at once, it will fill up the heap, so a better option is to read the file in chunks. See my code below. It starts reading from the line offset given as an argument and returns the end offset. You need to pass the number of lines to be read.
Please remember: you can use any collection to store the lines read, and you should clear that collection before calling this method to read the next chunk.
FileInputStream fis = new FileInputStream(file);
InputStreamReader streamReader = new InputStreamReader(fis, "UTF-8");
LineNumberReader reader = new LineNumberReader(streamReader);
//call the method below repeatedly until the end of the file is reached
public int getParsedLines(LineNumberReader reader, int iLineNumber_Start, int iNumberOfLinesToBeRead) {
int iLineNumber_End = 0;
int iReadUptoLines = iLineNumber_Start + iNumberOfLinesToBeRead;
try {
reader.mark(iLineNumber_Start);
reader.setLineNumber(iLineNumber_Start);
do {
String str = reader.readLine();
if (str == null) {
break;
}
// your code
iLineNumber_End = reader.getLineNumber();
} while (iLineNumber_End != iReadUptoLines);
} catch (Exception ex) {
// exception handling
}
return iLineNumber_End;
}

How to read and store data from a text file in which the first line are titles, and the other lines are related data

I have a text file with 300 lines or so. And the format is like:
Name Amount Unit CountOfOrder
A 1 ml 5000
B 1 mgm 4500
C 4 gm 4200
// more data
I need to read the text file line by line because each line of data should be together for further processing.
Now I just use string array for each line and access the data by index.
for each line in file:
array[0] = {data from the 'Name' column}
array[1] = {data from the 'Amount' column}
array[2] = {data from the 'Unit' column}
array[3] = {data from the 'CountOfOrder' column}
....
someOtherMethods(array);
....
However, I realized that if the text file changes its format (e.g. switch two columns, or insert another column), it would break my program (accessing through index might be wrong or even cause exception).
So I would like to use the title as reference to access each column. Maybe HashMap is a good option, but since I have to keep each line of data together, if I build a HashMap for each line, that would be too expensive.
Does anyone have any thought on this? Please help!
You only need a single hash map, mapping your column names to the proper column index. You fill the arrays by indexing with integers as you did before; to retrieve a column by name you'd use array[hashmap.get("Amount")].
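A minimal sketch of that single map follows, assuming the columns are separated by whitespace (spaces or tabs), which is how the sample in the question appears. The class name HeaderIndex and the method fromHeader are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, used only for illustration.
class HeaderIndex {

    // Builds the single name -> position map from the header line.
    static Map<String, Integer> fromHeader(String headerLine) {
        Map<String, Integer> index = new HashMap<>();
        String[] names = headerLine.trim().split("\\s+");
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> col = fromHeader("Name Amount Unit CountOfOrder");
        // Each data line is still a plain String[]; only the lookup changes.
        String[] row = "B 1 mgm 4500".split("\\s+");
        System.out.println(row[col.get("Unit")]);
    }
}
```

The map is built once from the first line, so each data row remains a cheap String[] and the per-row cost is one map lookup per named access. If the file swaps or inserts columns, only the header line changes and the lookups keep working.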
You can read the file using opencsv.
CSVReader reader = new CSVReader(new FileReader("yourfile.txt"), '\t');
List<String[]> lines = reader.readAll();
The first line contains the headers.
You can read each line of the file and, assuming that the first line of the file has the column headers, parse that line to get all the column names.
String[] column_headers = firstline.split("\t");
This gives you the names of all the columns. Now you just read through the rest, splitting on tabs, and the values will all line up with the headers.
You could do something like this:
BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream(FILE)));
String line = null;
String[] headers = null;
String[] data = null;
Map<String, List<String>> contents = new HashMap<String, List<String>>();
if ((line = in.readLine()) != null) {
headers = line.split("\t");
}
for(String h : headers){
contents.put(h, new ArrayList<String>());
}
while ((line = in.readLine()) != null) {
data = line.split("\t");
if(data.length != headers.length){
throw new Exception();
}
for(int i = 0; i < data.length; i++){
contents.get(headers[i]).add(data[i]);
}
}
It would give you flexibility, and would only require making the map once. You can then get the data lists from the map, so it should be a convenient data structure for the rest of your program to use.
This will give you individual list of columns.
public static void main(String args[]) throws FileNotFoundException, IOException {
List<String> headerList = new ArrayList<String>();
List<String> column1 = new ArrayList<String>();
List<String> column2 = new ArrayList<String>();
List<String> column3 = new ArrayList<String>();
List<String> column4 = new ArrayList<String>();
int lineCount=0;
BufferedReader br = new BufferedReader(new FileReader("file.txt"));
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
String tokens[];
while (line != null) {
tokens = line.split("\t");
if (lineCount == 0) {
// first line: remember the column headers
for (int count = 0; count < tokens.length; count++) {
headerList.add(tokens[count]);
}
} else {
// data line: one value per column list
column1.add(tokens[0]);
column2.add(tokens[1]);
column3.add(tokens[2]);
column4.add(tokens[3]);
}
lineCount++;
// advance to the next line, otherwise the loop never terminates
line = br.readLine();
}
} catch (IOException e) {
} finally {
br.close();
}
}
Using the standard java.util.Scanner:
String aa = " asd 9 1 3 \n d -1 4 2";
Scanner ss = new Scanner(aa);
ss.useDelimiter("\n");
while ( ss.hasNext()){
String line = ss.next();
Scanner fs = new Scanner(line);
System.out.println( "1>"+ fs.next()+" " +fs.nextInt() +" " +fs.nextLong()+" " +fs.nextBigDecimal());
}
Using a bunch of HashMaps is OK... I wouldn't be afraid ;)
If you need to process a lot of data, try to translate your problem into a data-processing transformation.
For example:
read all of your data into HashMaps, but store them in a database using some JPA implementation... then you can go round and round your data ;)

Java : Resizing a multidimensional array

I have a multidimensional array of Strings that is initially created with the size [50][50]. This is too big, so the array is full of null values, which I am trying to remove. I have managed to resize the array to [requiredSize][50] but cannot shrink it any further. Could anyone help me with this? I have scoured the internet for an answer but cannot find one.
Here is my complete code too (I realise there may be some very unclean parts in my code, I am yet to clean anything up)
import java.io.*;
import java.util.*;
public class FooBar
{
public static String[][] loadCSV()
{
FileInputStream inStream;
InputStreamReader inFile;
BufferedReader br;
String line;
int lineNum, tokNum, ii, jj;
String [][] CSV, TempArray, TempArray2;
lineNum = tokNum = ii = jj = 0;
TempArray = new String[50][50];
try
{
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Please enter the file path of the CSV");
String fileName = in.readLine();
inStream = new FileInputStream(fileName);
inFile = new InputStreamReader(inStream);
br = new BufferedReader(inFile);
StringTokenizer tok,tok2;
lineNum = 0;
line = br.readLine();
tokNum = 0;
tok = new StringTokenizer(line, ",");
while( tok.hasMoreTokens())
{
TempArray[tokNum][0] = tok.nextToken();
tokNum++;
}
tokNum = 0;
lineNum++;
while( line != null)
{
line = br.readLine();
if (line != null)
{
tokNum = 0;
tok2 = new StringTokenizer(line, ",");
while(tok2.hasMoreTokens())
{
TempArray[tokNum][lineNum] = tok2.nextToken();
tokNum++;
}
}
lineNum++;
}
}
catch(IOException e)
{
System.out.println("Error file may not be accessible, check the path and try again");
}
CSV = new String[tokNum][50];
for (ii=0; ii<tokNum-1 ;ii++)
{
System.arraycopy(TempArray[ii],0,CSV[ii],0,TempArray[ii].length);
}
return CSV;
}
public static void main (String args[])
{
String [][] CSV;
CSV = loadCSV();
System.out.println(Arrays.deepToString(CSV));
}
}
The CSV file looks as follows
Height,Weight,Age,TER,Salary
163.9,46.8,37,72.6,53010.68
191.3,91.4,32,92.2,66068.51
166.5,51.1,27,77.6,42724.34
156.3,55.7,21,81.1,50531.91
It can take any size obviously but this is just a sample file.
I just need to resize the array so that it will not contain any null values.
I also understand a list would be a better option here but it is not possible due to outside constraints. It can only be an multi dimensional array.
I think you need 3 changes to your program
After your while loop lineNum will be 1 more than the number of lines in the file so instead of declaring CSV to String[tokNum][50] declare it as CSV = new String[tokNum][lineNum-1];
tokNum will be the number of fields in a row so your for loop condition should be ii<tokNum rather than ii<tokNum-1
The last parameter for your arraycopy should be lineNum-1
i.e. the modified code to build your CSV array is:
CSV = new String[tokNum][lineNum-1];
for (ii=0; ii<tokNum ;ii++)
{
System.arraycopy(TempArray[ii],0,CSV[ii],0,lineNum-1);
}
and the output will then be:
[[Height, 163.9, 191.3, 166.5, 156.3], [Weight, 46.8, 91.4, 51.1, 55.7],
[Age, 37, 32, 27, 21], [TER, 72.6, 92.2, 77.6, 81.1],
[Salary, 53010.68, 66068.51, 42724.34, 50531.91]]
Notice that you don't really need to handle the first line of the file separately from the others but that is something you can cover as part of your cleanup.
Ten to one this is a homework assignment. However, it looks like you've put some thought into it.
Don't make the TempArray variable. Make a "List of List of Strings". Something like:
List<List<String>> rows = new ArrayList<List<String>>();
while(file.hasMoreRows()) { //not valid syntax...but you get the gist
String rowIText = file.nextRow(); //not valid syntax...but you get the gist
List<String> rowI = new ArrayList<String>();
//parse rowIText to build rowI --> this is your homework
rows.add(rowI);
}
//now build String[][] using fully constructed rows variable
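The final step, building the String[][] from the finished rows, might be sketched as below. It assumes every row has the same number of fields as the first row, and it produces the column-first layout ([field][line]) that the question's code uses. The class name CsvGrid and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, used only for illustration.
class CsvGrid {

    // Splits each line on commas into a row of fields.
    static List<List<String>> parse(List<String> lines) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(Arrays.asList(line.split(",")));
        }
        return rows;
    }

    // Copies the rows into an array sized exactly [columns][rows].
    // Assumes all rows are as long as the first one.
    static String[][] toColumnMajor(List<List<String>> rows) {
        int cols = rows.isEmpty() ? 0 : rows.get(0).size();
        String[][] grid = new String[cols][rows.size()];
        for (int r = 0; r < rows.size(); r++) {
            for (int c = 0; c < cols; c++) {
                grid[c][r] = rows.get(r).get(c);
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Height,Weight,Age,TER,Salary",
                "163.9,46.8,37,72.6,53010.68",
                "191.3,91.4,32,92.2,66068.51");
        System.out.println(Arrays.deepToString(toColumnMajor(parse(lines))));
    }
}
```

Because the array is allocated only after all rows are known, its dimensions match the data exactly and there is nothing to shrink afterwards.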
Here's an observation and a suggestion.
Observation: Working with (multidimensional) arrays is difficult in Java.
Suggestion: Don't use arrays to represent complex data types in Java.
Create classes for your data. Create a List of people:
class Person {
String height; //should eventually be changed to a double probably
String weight; // "
//...
public Person( String height, String weight /*, ... */ ) {
this.height = height;
this.weight = weight;
//...
}
}
List<Person> people = new ArrayList<Person>();
String line;
while ( (line = reader.readLine()) != null ) { // reader is assumed to be a BufferedReader
String[] records = line.split(",");
people.add(new Person (records[0], records[1] /*, ... */));
}
