Read CSV and convert string to double - java

I am new to Java and am reading a CSV file using OpenCSV. My code is below:
import java.io.FileReader;
import java.util.Arrays;
import au.com.bytecode.opencsv.CSVReader;
public class ParseCSVLineByLine
{
static double[] arr = new double[10];
public static void main(String[] args) throws Exception
{
//Build reader instance
//Read data.csv
//Default separator is comma
//Default quote character is double quote
//Skip the first line: start reading at line index 1 (line numbers start from zero)
CSVReader reader = new CSVReader(new FileReader("data.csv"), ',' , '"' , 1);
//Read CSV line by line and use the string array as you want
String[] nextLine;
int i=0;
while ((nextLine = reader.readNext()) != null) {
if (nextLine != null && i<10) {
//Verifying the read data here
arr[i]=Double.parseDouble(Arrays.toString(nextLine).toString());
System.out.println(arr[i]);
}
i++;
}
}
}
But this does not work. When I only print
Arrays.toString(nextLine).toString()
it prints
[1]
[2]
[3]
.
.
.
.
[10]
I think the conversion is the problem. Any help is appreciated.

The thing is:
"[1]"
is not a string that can be parsed as a number!
Your problem is that you turn the array as a whole into one string.
So instead of calling
Arrays.toString(nextLine).toString()
iterate nextLine and give each array member to parseDouble()!
Besides: I am pretty sure you receive a NumberFormatException or something similar. The JVM is already telling you that you are trying to convert an invalid string into a number. You have to learn to read those exception messages and understand what they mean!
Long story short: your code probably wants to parse "[1]", but you should have it parse "1" instead!
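The advice above can be sketched roughly as follows. This is a minimal illustration, not the asker's final code: the class name ParseCells, the helper parseRow, and the hard-coded rows (standing in for what reader.readNext() returns) are assumptions made for the example.

```java
import java.util.Arrays;

class ParseCells {
    // Parses each cell of one CSV row on its own instead of
    // stringifying the whole array first.
    static double[] parseRow(String[] row) {
        double[] values = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            values[i] = Double.parseDouble(row[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical rows standing in for the arrays returned by reader.readNext().
        String[][] rows = { {"1"}, {"2"}, {"3"} };
        for (String[] row : rows) {
            System.out.println(Arrays.toString(parseRow(row)));
        }
    }
}
```

Passing "[1]" to Double.parseDouble throws a NumberFormatException, while passing the individual cell "1" succeeds.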

Related

Reading single backslash from a text file using java code

There are a lot of posts on this, and everyone suggests changing the text file content.
My requirement is this: I am parsing a C++ source file. During parsing I may need to merge multiple lines together when I find a backslash at the end of a line.
Example:
char line[100]="hello join the multiple lines.\
Oh, dont ask me to edit CPP source file.";
How do I read this text from the xyz.cpp file and figure out that a line has a backslash at the end?
I used FileInputReader to read line by line, but the backslash is missing when I get the line in Java.
I hope you will not suggest that I change my C++ source code to replace \ with \\
Thanks in advance.
The backslash is an escape character in Java string and character literals. So if you want to match a real backslash (\) in your Java source, you have to write \\.
You can use contains() or indexOf() for string literals.
Or read character by character and check the condition:
if (c == '\\')
Hope this helps!
On the simplest level you can just split the file data by newlines (data.split("\n") where data is a String) and then check whether each line ends in a backslash (line.endsWith("\\") where line is a String).
The following code loads the file and prints each line:
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
public class Test {
public static void main(String args[]) throws Exception{
Scanner scan = new Scanner(new File("file.txt"));
scan.useDelimiter("\\Z");
String content = scan.next();
String[] lines = content.split("\n");
for (String value : lines) {
System.out.println(value);
}
}
}
I created a file "file.txt" containing the following lines
line 1
line 2\
line 2cnt
line 3
The code will output
line 1
line 2\
line 2cnt
line 3
To join the lines you can run the following code
public static void main(String args[]) throws Exception{
Scanner scan = new Scanner(new File("file.txt"));
scan.useDelimiter("\\Z");
String content = scan.next();
String[] lines = content.split("\n");
for (String value : lines) {
if (value.endsWith("\\")) {
value = value.substring(0, value.length()-1);
System.out.print(value);
} else {
System.out.println(value);
}
}
}
which will output
line 1
line 2line 2cnt
line 3
Edited as per your comment:
public static void main(String args[]) throws Exception {
FileInputStream fileInputStream = new FileInputStream("file.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
}
}
outputs
line 1
line 2\
line 2cnt
line 3
and
public static void main(String args[]) throws Exception {
FileInputStream fileInputStream = new FileInputStream("file.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream));
String line;
while ((line = reader.readLine()) != null) {
if (line.endsWith("\\")) {
line= line.substring(0, line.length()-1);
System.out.print(line);
} else {
System.out.println(line);
}
}
}
outputs
line 1
line 2line 2cnt
line 3
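If the merged lines are needed for further parsing rather than just printed, the same check can collect them into a list. This is a sketch under assumptions: the class ContinuationJoiner and the method joinContinuations are names invented for this example, and the physical lines are assumed to have been read already (for instance with BufferedReader.readLine() as above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContinuationJoiner {
    // Merges every physical line that ends in a backslash with the line after it.
    static List<String> joinContinuations(List<String> physicalLines) {
        List<String> logicalLines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean pending = false;
        for (String line : physicalLines) {
            if (line.endsWith("\\")) {
                // Drop the trailing backslash and keep accumulating.
                current.append(line, 0, line.length() - 1);
                pending = true;
            } else {
                current.append(line);
                logicalLines.add(current.toString());
                current.setLength(0);
                pending = false;
            }
        }
        if (pending) {
            // The last line ended in a backslash; keep what was collected.
            logicalLines.add(current.toString());
        }
        return logicalLines;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("line 1", "line 2\\", "line 2cnt", "line 3");
        System.out.println(joinContinuations(lines));
    }
}
```

With the four sample lines from file.txt, this prints [line 1, line 2line 2cnt, line 3].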
I am not sure why you don't see the backslash on your machine. Can you post the complete code and the file content? What platform are you running it on? What is the encoding of the file? Maybe you need to pass the encoding to the InputStreamReader like this:
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream, "UTF-8"));
It was a compilation issue in the Eclipse IDE. I restarted Eclipse and did a clean build, and now everything is working as expected. Thank you for your time.

My Java program doesn't iterate through my entire CSV file

Code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Scanner;
public class dataReader {
@SuppressWarnings("rawtypes")
public static void main(String[] args) throws Exception {
String splitBy =",";
BufferedReader br = new BufferedReader(new FileReader("C:/practice/testData.csv"));
String line = br.readLine();
int counter = 0;
while ((line = br.readLine()) != null){
counter++;
String[] b = line.split(splitBy);
for (int x = 0; x < b.length; x++){
System.out.println(b[x]);
}
}
System.out.println(counter);
br.close();
}
}
When I run it, it goes through the CSV and displays some output but it starts with product ID 4000+. It basically only outputs results for the last thousand rows in the CSV. I'm wrangling with a really ugly CSV file to try to get some useful data out of it so that I can write all of it into a new database later on. I'd appreciate any tips.
As in the comments, the issue is that System.out.println prints to your console. The console will have a limit to the number of lines it can display, which is why you see the last ~1000 lines.
At first glance, your code skips the first line by assigning the result of readLine() to your string line:
String line = br.readLine();
You should change that, because you read the first line and then enter the loop, which reads the second line before any operation on the first line has taken place.
Try something like
String line = "";
Your code won't handle more complicated CSVs, eg a row:
"Robert, John", 25, "Chicago, IL"
is not going to get parsed correctly. Just splitting on ',' is going to separate "Robert, John" into multiple cells even though it should be just 1 cell.
The point is, you shouldn't be writing a CSV parser. I highly recommend downloading the Apache Commons CSV library and using that! Reading CSVs is a solved problem; why reinvent the wheel?
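To see the failure concretely, here is a small sketch that applies the naive split to the row above. The class NaiveSplitDemo and the method naiveSplit are names made up for this illustration; it uses only the standard library and does not show the Apache Commons CSV API.

```java
class NaiveSplitDemo {
    // Splits on every comma, ignoring quotes, as the question's code does.
    static String[] naiveSplit(String row) {
        return row.split(",");
    }

    public static void main(String[] args) {
        String row = "\"Robert, John\", 25, \"Chicago, IL\"";
        String[] cells = naiveSplit(row);
        // The row has 3 logical cells, but the split yields 5 fragments.
        System.out.println(cells.length);
        for (String cell : cells) {
            System.out.println("<" + cell + ">");
        }
    }
}
```

The quoted commas are treated as separators, so both quoted fields are cut in half.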
Without seeing the structure of the file I can only speculate, but this line: String line = br.readLine(); needs to be changed to String line = "" or String line = null. As it currently stands, you are discarding your first line.
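A minimal sketch of the loop with that change, assuming the first line should be processed like any other. The class RowCounter, the method countRows, and the in-memory sample data are assumptions for the example; the real program would pass a FileReader instead of a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class RowCounter {
    // Counts non-empty lines without discarding the first one.
    static int countRows(Reader source) {
        int counter = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line; // declared only; nothing is read before the loop
            while ((line = br.readLine()) != null) {
                if (!line.isEmpty()) {
                    counter++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for testData.csv.
        System.out.println(countRows(new StringReader("id,name\n1,apple\n2,pear\n")));
    }
}
```

If the first line is a header that should be skipped on purpose, keeping the extra readLine() before the loop is fine, but then the counter will be one less than the number of lines in the file.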

Array index out of bound in java error

package com.testing;
import java.io.BufferedReader;
import java.io.FileReader;
public class CsvParser {
public static void main(String[] args) throws Exception {
String csvFile = "D:/code-home/SentimentAnalysis/test_data/Sentiment Analysis Dataset.csv";
BufferedReader br = null;
String line = "";
String cvsSplitBy = "\t"; // data is tab-separated
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
// use tab as separator
String[] tweet = line.split(cvsSplitBy);
System.out.println(tweet[1]);
System.out.println(tweet[3]);
}
}
}
The program's purpose is to parse the CSV format. I used a BufferedReader.
The program compiles fine. When I run it, output is printed but there is an exception:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at com.testing.CsvParser.main(CsvParser.java:34)
First of all: arrays in Java are zero-indexed, which means that the first element of an array is array[0], not array[1]. Since your ArrayIndexOutOfBoundsException is thrown at index 1, the array has at most one element in it (you should check the length of the array before accessing it). Because you access index 3 (the fourth element) on the very next line, I suspect you expect at least four elements in each line. Since there is at most one element, either you are using the wrong split character or your file is not formatted the way you expect. I hope this helps you. Kind regards
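The length check mentioned above could look like the following sketch. The class SafeColumns and the helper cellOrNull are hypothetical names for this example; returning null for a missing column is just one possible policy.

```java
class SafeColumns {
    // Returns the cell at the given index, or null when the row has too few columns.
    static String cellOrNull(String line, String separator, int index) {
        String[] cells = line.split(separator);
        return (index >= 0 && index < cells.length) ? cells[index] : null;
    }

    public static void main(String[] args) {
        String tabbed = "0\tid\tsource\ttext";
        String commas = "0,id,source,text";
        System.out.println(cellOrNull(tabbed, "\t", 3)); // column exists
        System.out.println(cellOrNull(commas, "\t", 1)); // wrong separator, no such column
    }
}
```

A null result on most lines is a strong hint that the separator does not match the file.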

Can I do this - token=str.split(" "||",");

import java.io.*;
import java.text.DecimalFormat;
import java.text.NumberFormat;
public class TrimTest{
public static void main(String args[]) throws IOException{
String[] token = new String[0];
String opcode;
String strLine="";
String str="";
try{
// Open and read the file
FileInputStream fstream = new FileInputStream("a.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
//Read file line by line and storing data in the form of tokens
if((strLine = br.readLine()) != null){
token = strLine.split(" ");// split w.r.t spaces
token = strLine.split(" "||",") // split if there is a space or comma encountered
}
br.close();//Close the input stream
}
catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
int i;
int n = token.length;
for(i=0;i<n;i++){
System.out.println(token[i]);
}
}
}
If the input is MOVE R1,R2,R3
I want to split it on spaces or commas and save the pieces into an array token[].
I want output as:
MOVE
R1
R2
R3
Thanks in Advance.
Try token = strLine.split(" |,").
split takes a regex as its argument, and "or" in regex is |. You can also use a character class like [\\s,], which is equivalent to \\s|, and means: \\s (any whitespace, such as a normal space, tab, or newline) OR a comma.
You want
token = strLine.split("[ ,]"); // split if there is a space or comma encountered
Square brackets denote a character class. This class contains a space and a comma and the regex will match on any character of the character class.
Change it to strLine.split(" |,"), or maybe even strLine.split("\\s+|,").
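As a quick check of the suggestions above, here is a small sketch using the character-class form on the sample input. The class SplitDemo and the method tokenize are names chosen for the example.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on a single space or a single comma.
    static String[] tokenize(String line) {
        return line.split("[ ,]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("MOVE R1,R2,R3")));
    }
}
```

This prints [MOVE, R1, R2, R3]. If the input may contain a comma followed by a space, as in "R1, R2", a pattern such as "[ ,]+" avoids empty tokens between the two separators.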

Reading CSV file in Java and storing the values in an int array

I have a CSV file of strings in this format:
14/10/2011 422 391.6592 394.52324 0.039215686
13/10/2011 408.43 391.7612 395.0686031 0.039215686
12/10/2011 402.19 391.834 395.3478736 0.039215686
All I want to do is read in the CSV file and then store the data from the 3rd and 4th columns in integer arrays.
This is the code I have written:
BufferedReader CSVFile =
new BufferedReader(new FileReader("appleData.csv"));
String dataRow = CSVFile.readLine();
int count = 0;
while (dataRow != null){
String[] dataArray = dataRow.split(",");
EMA[count] = dataArray[2];
SMA[count] = dataArray[3];
dataRow = CSVFile.readLine(); // Read next line of data.
}
// Close the file once all data has been read.
CSVFile.close();
I want to end up with two arrays: EMA, which contains all the values from the 3rd column, and SMA, which contains the values from the 4th column.
I am getting a null pointer exception. Can someone please tell me what mistake I am making?
Your file appears to use whitespace/tab as a delimiter, but you're splitting at commas. That makes no sense to me.
You assume that the data row has a certain length without checking it. That makes no sense to me.
This code will show you how to do it better:
package cruft;
import org.apache.commons.lang3.StringUtils;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
/**
* CsvParser
* @author Michael
* @link http://stackoverflow.com/questions/14114358/reading-csv-file-in-java-and-storing-the-values-in-an-int-array/14114365#14114365
* @since 1/1/13 4:26 PM
*/
public class CsvParser {
public static void main(String[] args) {
try {
FileReader fr = new FileReader((args.length > 0) ? args[0] : "resources/test.csv");
Map<String, List<String>> values = parseCsv(fr, "\\s+", true);
System.out.println(values);
} catch (IOException e) {
e.printStackTrace();
}
}
public static Map<String, List<String>> parseCsv(Reader reader, String separator, boolean hasHeader) throws IOException {
Map<String, List<String>> values = new LinkedHashMap<String, List<String>>();
List<String> columnNames = new LinkedList<String>();
BufferedReader br = null;
br = new BufferedReader(reader);
String line;
int numLines = 0;
while ((line = br.readLine()) != null) {
if (StringUtils.isNotBlank(line)) {
if (!line.startsWith("#")) {
String[] tokens = line.split(separator);
if (tokens != null) {
for (int i = 0; i < tokens.length; ++i) {
if (numLines == 0) {
columnNames.add(hasHeader ? tokens[i] : ("row_"+i));
} else {
List<String> column = values.get(columnNames.get(i));
if (column == null) {
column = new LinkedList<String>();
}
column.add(tokens[i]);
values.put(columnNames.get(i), column);
}
}
}
++numLines;
}
}
}
return values;
}
}
Here's the input file I used to test it:
# This shows that comments, headers and blank lines work fine, too.
date value1 value2 value3 value4
14/10/2011 422 391.6592 394.52324 0.039215686
13/10/2011 408.43 391.7612 395.0686031 0.039215686
12/10/2011 402.19 391.834 395.3478736 0.039215686
Here's the output I got:
{date=[14/10/2011, 13/10/2011, 12/10/2011], value1=[422, 408.43, 402.19], value2=[391.6592, 391.7612, 391.834], value3=[394.52324, 395.0686031, 395.3478736], value4=[0.039215686, 0.039215686, 0.039215686]}
Process finished with exit code 0
[1] There should be a count++ inside the while loop
[2] You have not defined/initialized the arrays EMA and SMA - causing the exception.
[3] If you split() by comma and the file is space-separated, the result will be an array of length 1, and accessing indices 2 and 3 will throw an ArrayIndexOutOfBoundsException, even if you initialize the arrays properly.
I suggest reading the numbers by adding them to a List (like ArrayList or Vector) in the loop, since you do not know the size in advance. Once you get out of the loop, create 2 arrays of the appropriate size and copy the data into them (Vector has copyInto() for this). Let the garbage collector deal with the lists.
The problem with your code is that int[] EMA is a declaration, not an initialization. It only defines EMA as a reference to an array of integers without actually creating the array.
My advice would be to change EMA and SMA to ArrayLists and, instead of using assignments, add the current elements to the lists.
After the loop, you can get the number of elements in each ArrayList with the size() method and convert the lists to arrays with the toArray method, for whatever you need next.
Of course, I am assuming that you left the commas out of your example. Otherwise, you should change the delimiter to whitespace.
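The list-then-array approach suggested in these answers might be sketched as follows. Several things here are assumptions: the class ColumnReader and its methods are invented for the example, the file is treated as whitespace-separated (as the sample in the question appears to be), and the values are stored as double rather than int because the sample data has fractional parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnReader {
    // Returns { EMA values, SMA values } taken from the 3rd and 4th columns.
    static double[][] readColumns(List<String> lines) {
        List<Double> ema = new ArrayList<>();
        List<Double> sma = new ArrayList<>();
        for (String line : lines) {
            String[] cells = line.trim().split("\\s+");
            if (cells.length < 4) {
                continue; // skip blank or short rows instead of failing
            }
            ema.add(Double.parseDouble(cells[2]));
            sma.add(Double.parseDouble(cells[3]));
        }
        return new double[][] { toArray(ema), toArray(sma) };
    }

    // Converts a list of boxed values into a primitive array.
    static double[] toArray(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "14/10/2011 422 391.6592 394.52324 0.039215686",
            "13/10/2011 408.43 391.7612 395.0686031 0.039215686",
            "12/10/2011 402.19 391.834 395.3478736 0.039215686");
        double[][] columns = readColumns(lines);
        System.out.println(Arrays.toString(columns[0]));
        System.out.println(Arrays.toString(columns[1]));
    }
}
```

In a real program the lines would come from the BufferedReader loop in the question; the lists grow as needed, so the number of rows does not have to be known in advance.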
