Java: Improving speed of a reader program - java

Hey, so I am working on this program that reads CSV files, and I need to write a method that returns one entire column of values.
Currently I do it like this:
List<String> data = new LinkedList<>();
for(int i = 0; i < getRowCount(); i++){
data.add(getRow(i).get(column));
}
Where getRow() is this:
List<String> data = new LinkedList<>();
String column;
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
for(int i = 0; i < row; i++){
bufferedReader.readLine();
}
column = bufferedReader.readLine();
for(String col: column.split(columnSeparator.toString())){
data.add(col);
}
} catch (IOException e) {
e.printStackTrace();
}
and it works. But the flaw is that if the file is large, it takes way too long. It takes 27 seconds on 7500 lines and 9 columns, and over 10 minutes on 35000 lines and 16 columns. Do you know how I could make it faster?

Try to read the file once:
List<String> getColumn(int column) {
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
List<String> data = new LinkedList<>();
String line = bufferedReader.readLine();
while (line != null) {
String cols[] = line.split(columnSeparator.toString());
data.add(cols[column]);
line = bufferedReader.readLine();
}
return data;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}

What you are doing is the following:
Prepare to read the file (Creating ReaderObject, ...), read first line
Prepare to read the file, read first line, read second line
Prepare to read the file, read first line, read second line, read third line
... And so on.
This is not very efficient (you're doing work in O(n²), with n = number of lines).
You could improve your code vastly by doing something like this:
Prepare to read the file
Read the first line
Read the second line
... And so on.
So first read all the lines at once:
List<String> lines = new LinkedList<>();
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line;
while ((line = br.readLine()) != null)
lines.add(line);
} catch (IOException e) {
e.printStackTrace();
}
You can then iterate over the lines to split them into columns and extract the data you're interested in:
List<String> data = new LinkedList<>();
for(String line : lines)
data.add(line.split(columnSeparator.toString())[column]);
Of course this still needs a little bit of error handling :)
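One way that error handling might look, as a sketch rather than a drop-in replacement. The class name ColumnReader, the Reader parameter, and the decision to skip rows that are too short are assumptions made for illustration; they are not part of the original program. The sketch also quotes the separator with Pattern.quote, because String.split treats its argument as a regular expression, and it collects into an ArrayList.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; all names here are illustrative only.
class ColumnReader {

    // Reads the input once and returns the requested column (0-based).
    // Rows that do not have that many columns are skipped.
    static List<String> getColumn(Reader source, char separator, int column) {
        // split() expects a regex, so quote separators such as '|' or '.'.
        Pattern splitter = Pattern.compile(Pattern.quote(String.valueOf(separator)));
        List<String> data = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Limit -1 keeps trailing empty cells instead of dropping them.
                String[] cols = splitter.split(line, -1);
                if (column >= 0 && column < cols.length) {
                    data.add(cols[column]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return data;
    }

    public static void main(String[] args) {
        String csv = "a|b|c\nd|e\nf|g|h\n";
        System.out.println(getColumn(new StringReader(csv), '|', 2)); // prints [c, h]
    }
}
```

For a real file you would pass new FileReader(file) instead of the StringReader; note that this constructor throws FileNotFoundException, which the caller then has to handle.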

I would suggest you try this:
int rowCount = getRowCount();
for(int i = 0; i < rowCount; i++)
{
data.add(getRow(i).get(column));
}
The condition of a for statement is re-evaluated on every iteration, so getRowCount() is called once per row. If it has to read the file to count the rows, the file is read that many extra times, which you probably don't want. Storing the result in a local variable means it is called only once.

Related

BufferedReader returning columns instead of row

When I do this...
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(filename))) {
String line;
while ((line = bufferedReader.readLine()) != null) {
String[] data = line.split(",");
}
} catch (IOException e) {
e.printStackTrace();
}
If I print data[0], I get the first column instead of the first row. How can I modify it to return rows when I do data[0]?
When I did...
List<String> data;
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(filename))) {
data = bufferedReader.lines().collect(Collectors.toList());
} catch (IOException e) {
e.printStackTrace();
}
And do data.get(0), I get rows as expected, so why not with the first method?
In your first example, you read each line with bufferedReader.readLine(), which already gives you a row. Then you split the row at the , char, which gives you columns for the current row.
Your second example, using bufferedReader.lines(), returns a Stream of rows, which you collect with .collect(Collectors.toList()). Each of those rows in your List is still simply a string with all the commas in it. So what you probably want is a 2D array or a List<List<String>>.
You can achieve this as follows:
try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
final List<List<String>> table = br.lines()
.map(row -> Stream.of(row.split(","))
.collect(Collectors.toList()))
.collect(Collectors.toList());
System.out.println(table.get(0));
} catch (IOException e) {
e.printStackTrace();
}
while ((line = bufferedReader.readLine()) != null) {
String[] data = line.split(",");
}
Here you're fetching a single line, splitting it into parts at each comma, and assigning the result to the data variable. At that point you have processed a single line, so there is no way to refer to row x because you only have one row. The variable is also declared inside the while loop, so its state is only accessible during one given iteration.
In a CSV file, each line is a row. Splitting the lines is what gives you the columns. So if you want the rows, you do not need to split the lines, just print them out the way they are. For example to print out just the first row, all you need to do is
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(filename))) {
String firstRow = bufferedReader.readLine();
System.out.println(firstRow);
} catch (IOException e) {
e.printStackTrace();
}

How can I read a specific column from a text file and calculate the average of this column?

I am a little stuck with a java exercise I am currently working on. I have a text file in this format:
Quio Kla,2221,3.6
Wow Pow,3332,9.3
Zou Tou,5556,9.7
Flo Po,8766,8.1
Andy Candy,3339,6.8
I now want to calculate the average of the whole third column, but I believe I have to extract the data first and store it in an array. I was able to read all the data with a buffered reader and print the entire file to the console, but that did not get me any closer to getting it into an array. Any suggestions on how I can read a specific column of a text file into an array with a buffered reader would be highly appreciated.
Thank you very much in advance.
You can split your text file by using this portion of code:
BufferedReader in = null;
try {
in = new BufferedReader(new FileReader("textfile.txt"));
String read = null;
while ((read = in.readLine()) != null) {
String[] splited = read.split(",");
for (String part : splited) {
System.out.println(part);
}
}
} catch (IOException e) {
System.out.println("There was a problem: " + e);
e.printStackTrace();
} finally {
try {
in.close();
} catch (Exception e) {
}
}
And then you'll have all the columns of the current line in the array splited.
It's definitely not the best solution, but it should be sufficient for you:
BufferedReader input = new BufferedReader(new FileReader("/file"));
int numOfColumn = 3; // 1-based number of the column to average
String line;
ArrayList<Double> values = new ArrayList<>();
while ((line = input.readLine()) != null) {
values.add(Double.valueOf(line.split(",")[numOfColumn-1]));
}
input.close();
double sum = 0;
for(double v : values){
sum += v;
}
// Floating-point division, so the fractional part of the average is kept
double avg = sum / values.size();
I'm going to assume each data set is separated by newline characters in your text file.
ArrayList<Double> thirdColumn = new ArrayList<>();
BufferedReader in = null;
String line=null;
//initialize your reader here
while ((line = in.readLine())!=null){
String[] split = line.split(",");
if (split.length>2)
thirdColumn.add(Double.parseDouble(split[2]));
}
By the end of the while loop, you should have the thirdColumn ArrayList ready and populated with the required data.
The assumption is made that your data set has the following standard format.
String,Integer,Double
So naturally a split by a comma should give a String array of length 3, Where the String at index 2 contains your third column data.
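To finish the exercise, the extracted values still have to be averaged. A minimal sketch of one way to do that with streams, assuming the lines have already been read into a List<String>; the class name ColumnAverage and the method average are made up for this example. It returns an OptionalDouble so that an input with no usable rows does not produce a division by zero.

```java
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical helper; names are illustrative only.
class ColumnAverage {

    // Averages the given 0-based column; lines with too few fields are ignored.
    static OptionalDouble average(List<String> lines, int column) {
        return lines.stream()
                .map(line -> line.split(","))
                .filter(fields -> fields.length > column)
                .mapToDouble(fields -> Double.parseDouble(fields[column].trim()))
                .average();
    }

    public static void main(String[] args) {
        // The sample data from the question.
        List<String> lines = List.of(
                "Quio Kla,2221,3.6",
                "Wow Pow,3332,9.3",
                "Zou Tou,5556,9.7",
                "Flo Po,8766,8.1",
                "Andy Candy,3339,6.8");
        // Prints the average of the third column if there was any data.
        average(lines, 2).ifPresent(System.out::println);
    }
}
```

With a real file, the list could come from Files.readAllLines(Paths.get("textfile.txt")), which throws IOException.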

Reading textfile line by line and put in object array

I have to make an EPG app using java, but I am kind of new in programming and it's due tomorrow and it's still not working properly.
I have a question about a small part: I have to read the programs from a text file. Each line contains multiple things, the channel, the title of the program, a subtitle, a category, etcetera.
I have to make sure that I can read the separate parts of each line, but it's not really working, it's only printing the parts from the first line.
I am trying, but I can't find out why it's not printing all the parts from all the lines instead of only the parts from the first line. Here's the code:
BufferedReader reader = new BufferedReader(new FileReader(filepath));
while (true) {
String line = reader.readLine();
if (line == null) {
break;
}
}
String[] parts = line.split("\\|", -1);
for(int i = 0; i < parts.length; i++) {
System.out.println(parts[i]);
}
reader.close();
Does anybody know how to get all the lines instead of only the first?
Thank you!
readLine() only reads one line, so you need to loop it, as you said.
But by reading into the String inside the while loop, you overwrite that String on every iteration.
You would also need to declare the String above the while loop so that you can access it outside the loop.
BTW, it seems that your braces for the if don't match.
Anyway, I'd fill the information into an ArrayList, look below:
List<String> list = new ArrayList<>();
String content;
// readLine() and close() may throw errors, so they require you to catch it…
try {
while ((content = reader.readLine()) != null) {
list.add(content);
}
reader.close();
} catch (IOException e) {
// This just prints the error log to the console if something goes wrong
e.printStackTrace();
}
// Now proceed with your list, e.g. retrieve first item and split
String[] parts = list.get(0).split("\\|", -1);
// You can simplify the for loop like this,
// you call this for each:
for (String s : parts) {
System.out.println(s);
}
Use the Apache Commons IO library:
File file = new File("test.txt");
List<String> lines = FileUtils.readLines(file);
Since an ArrayList grows dynamically, try:
private static List<String> readFile(String filepath) {
List<String> list = new ArrayList<String>();
// try-with-resources closes the reader even if reading fails
try (BufferedReader reader = new BufferedReader(new FileReader(filepath))) {
String line;
while((line = reader.readLine()) != null){
list.add(line);
}
} catch (IOException e) {
e.printStackTrace();
}
return list;
}
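The answers above collect the lines but only split the first one, while the question asks for the parts of every line. A small sketch of that last step, assuming the lines are already in a list; ProgramLineParser and the sample lines are invented for illustration, since the real file format is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and sample data are illustrative only.
class ProgramLineParser {

    // Splits each line on '|' and returns one String[] per line.
    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            // "\\|" escapes the pipe, which is a regex metacharacter;
            // the -1 limit keeps empty trailing fields.
            rows.add(line.split("\\|", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("BBC|News|Evening|Info", "CNN|Talk||");
        for (String[] parts : parse(lines)) {
            System.out.println(String.join(" / ", parts));
        }
    }
}
```

Each String[] could then be handed to a constructor of whatever program class the app uses.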

Detect first line of text file separately?

I am designing a program that will load a text file into different media file classes (Media > Audio > mp3, Media > Video > Avi, etc).
Now the first line of my text file is how many files there are in total, as in
3
exmaple.mp3,fawg,gseges
test.gif,wfwa,rgeg
ayylmao.avi,awf,gesg
Now that is what is in my text file, I want to first get the first line separately, then loop through the rest of the files.
Now I understand I can simply count how many files there are by using an int that grows as I loop, but I want it clear in the file as well, and I'm not sure how to go about this.
static public Media[] importMedia(String fileName)
{
try {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line = reader.readLine();
while(line != null)
{
//Get the first line of the text file separately? (Then maybe remove it? idk)
//Split string, create a temp media file and add it to a list for the rest of the lines
}
//String[] split = s.next().split(",");
} catch (Exception ex) { System.out.println(ex.getMessage()); }
return null;
}
I hope my question is clear. TL;DR: I want to get the first line of a text file separately, then loop through the rest.
I wouldn't advise using a for-loop here, since the file might contain additional lines (e.g. comments or blank lines) to make it more human-readable. By examining the content of each line, you can make your processing more robust against this sort of thing.
static public Media[] importMedia(String fileName)
{
try {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
// Get and process first line:
String line = reader.readLine(); // <-- Get the first line. You could consider reader as a queue (sort-of), where readLine() dequeues the first element in the reader queue.
int numberOfItems = Integer.valueOf(line); // <-- Create an int of that line.
// Do the rest:
while((line = reader.readLine()) != null) // <-- Each call to reader.readLine() will get the next line in the buffer, so the first time around this will give you the second line, etc. until there are no lines left to read.
{
// You will not get the header here, only the rest.
if(!line.isEmpty() && !line.startsWith("#")) {
// If the line is not empty and doesn't start with a comment character (I chose # here).
String[] split = line.split(",");
String mediaFileName = split[0]; // renamed so it doesn't clash with the fileName parameter
// etc...
}
}
} catch (Exception ex) { System.out.println(ex.getMessage()); }
return null;
}
You don't need a while loop that reads to the end of the file. Read the first line, convert it to an int, then loop that many times.
static public Media[] importMedia(String fileName)
{
try {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
// Get and process first line:
int lineNo=Integer.parseInt(reader.readLine());
// Now read up to lineNo lines
for(int i=0; i < lineNo; i++){
//Do what you need with other lines.
String[] values = reader.readLine().split(",");
}
} catch (Exception e) {
//Your exception handling goes here
}
return null; // placeholder, as in the question; return the real Media[] here
}
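The two approaches above can also be combined: loop until the end of the file, then compare the number of records found with the count declared on the first line. The following is only a sketch under assumptions. HeaderCountReader is a made-up name, it returns the raw String[] fields rather than Media objects (whose constructors are not shown), and the '#' comment marker is the one chosen in the earlier answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
class HeaderCountReader {

    // Reads the declared count from the first line, then the remaining records.
    // Throws IllegalStateException if the declared and actual counts differ.
    static List<String[]> readRecords(Reader source) {
        List<String[]> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String header = reader.readLine();
            if (header == null) {
                throw new IllegalStateException("empty input");
            }
            int declared = Integer.parseInt(header.trim());
            String line;
            while ((line = reader.readLine()) != null) {
                // Skip blank lines and '#' comments (the marker is an assumption).
                if (line.trim().isEmpty() || line.startsWith("#")) {
                    continue;
                }
                records.add(line.split(","));
            }
            if (records.size() != declared) {
                throw new IllegalStateException(
                        "header says " + declared + " but found " + records.size());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String text = "3\nexmaple.mp3,fawg,gseges\ntest.gif,wfwa,rgeg\n\nayylmao.avi,awf,gesg\n";
        System.out.println(readRecords(new StringReader(text)).size()); // prints 3
    }
}
```

Whether a mismatch should be an error or just a warning depends on how much the file is trusted; throwing is simply the strictest choice.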

Using BufferedReader to count rows in a multi-dimensional array

I'm using BufferedReader to read a .csv file. I have no problem reading the file and extracting the data. However, the problem that I do have is that I have to hard-code my array declaration. For example:
String[][] numbers=new String[5258][16];
The .csv file I was using had 5258 rows and 16 columns. I'd like to be able to do something like this though:
String[][] numbers=new String[rowsInFile][16];
In other words, I want the variable 'rowsInFile' to be equivalent to the amount of rows in the file (I don't want to count the columns, because every .csv file I will be running through this program has 16 columns).
Here's the code I have so far:
int row = 0;
int col = 0;
String fileInput = JOptionPane.showInputDialog(null,
"Please enter the path of the CSV file to read:");
File file = new File(fileInput);
BufferedReader bufRdr;
bufRdr = new BufferedReader(new FileReader(file));
String line = null;
//get rows in the file
int rowsInFile = 0;
while(bufRdr.readLine() != null) {
rowsInFile++;
row++;
}
String[][] numbers=new String[rowsInFile][16];
//read each line of text file
row = 0;
while((line = bufRdr.readLine()) != null) {
StringTokenizer st = new StringTokenizer(line,",");
col=0;
while (st.hasMoreTokens()) {
//get next token and store it in the array
numbers[row][col] = st.nextToken();
col++;
}
row++;
}
However, I'm getting a null pointer exception. Any ideas of what I should do?
P.S. Yes, this code is surrounded by a try/catch statement.
The problem is, once you go through the BufferedReader once, you can't go back through it again. In other words, you have to use a new BufferedReader.
bufRdr = new BufferedReader(new FileReader(file));
row = 0;
while((line = bufRdr.readLine()) != null) {
Alternatively, you could use a dynamic array structure like an ArrayList<String[]> or a LinkedList<String[]> to store the rows.
LinkedList<String[]> numbers = new LinkedList<String[]>();
while( (line = bufRdr.readLine()) != null ) {
numbers.add(line.split(","));
}
Then instead of doing numbers[i][j], you use numbers.get(i)[j].
Instead of an array use something dynamic like a List. For example:
List<String[]> data = new ArrayList<String[]>();
Also using String's split() method will simplify the loading of the row.
Your problem is that BufferedReaders work by reading until the end of a file, and then they get stuck there. Your code requires reading through the file twice, but because you already reached an EOF, the BufferedReader is stuck returning null. I tend to solve this by stuffing the lines into an ArrayList, and using the size() method to get the number of lines. The source code looks something like this:
int rowsInFile=0;
ArrayList<String> lines = new ArrayList<String>();
String tmp;
while((tmp = bufRdr.readLine()) != null)
{
lines.add(tmp);
}
rowsInFile = lines.size();
String[][] numbers = new String[rowsInFile][16];
int row = 0;
for(String line : lines)
{
StringTokenizer st = new StringTokenizer(line,",");
col=0;
while (st.hasMoreTokens()) {
//get next token and store it in the array
numbers[row][col] = st.nextToken();
col++;
}
row++;
}
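If a String[][] is still needed at the end, the list of rows can be converted directly, so the row count never has to be known up front. A minimal sketch, assuming plain comma-separated values without quoted fields; CsvTable and load are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
class CsvTable {

    // Reads every line once and returns the cells as a String[rows][].
    static String[][] load(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line.split(","));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The list knows its own size, so no separate counting pass is needed.
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] numbers = load(new StringReader("1,2,3\n4,5,6\n"));
        System.out.println(numbers.length + " rows, first cell " + numbers[0][0]);
    }
}
```

Unlike new String[rowsInFile][16], each row here is only as long as the number of fields on that line, so code that assumes 16 columns should check numbers[i].length first.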
