Reading in a CSV file, ArrayIndexOutOfBounds error - Java

I'm attempting to read in a CSV file. I've created a test file with 9 entries and their values, but my code won't read past the second line: it reports that the file wasn't found after the second key. I've tried tweaking it as much as possible. Can someone help me? Sample input would look like this, each entry on its own line in the CSV file:
Diego,2
Maria,2
Armando,5
Ken, 1
public static void main(String[] args) {
    HashMap<String, Integer> h = new HashMap<String, Integer>(511);
    try
    {
        Scanner readIn = new Scanner(new File("test1.csv"));
        System.out.println("I'm here 1");
        while (readIn.hasNext())
        {
            System.out.print(readIn.next()); // for testing purposes only
            System.out.println("Check 2");   // for testing purposes only
            String line = readIn.nextLine();
            String str[] = line.split(",");
            for (int i = 0; i < str.length; i++)
            {
                String k = str[0];
                int v = Integer.parseInt(str[1]);
                h.insert(k, v);
            }
            System.out.println(h.toString());
        }
        readIn.close();
    }
    catch (ArrayIndexOutOfBoundsException ob)
    {
        System.out.println(" - The file wasn't found.");
    }
    catch (FileNotFoundException e)
    {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

A call to next() or nextLine() should be preceded by a call to hasNext(). But in your code you check hasNext() once in the while condition and then invoke both next() and nextLine() inside the loop, so the second read of each iteration is never guarded by a check.
You can modify your code as below:
while (readIn.hasNext())
{
    String line = readIn.nextLine();
    String str[] = line.split(",");
    for (int i = 0; i < str.length; i++)
    {
        String k = str[0];
        int v = Integer.parseInt(str[1]);
        h.put(k, v);
    }
    System.out.println(h.toString());
}

Your for loop isn't actually serving a purpose; you will notice that you never reference i inside it. Prior to the OP's edit I believe you were trying to split a string that didn't have a comma in it, but your code assumes one will be there, hence the out-of-bounds exception. This relates to why I was telling you that the println() calls were problematic: the next() inside the print consumes the first token of each line before nextLine() reads the rest.
As far as your question about hasNext() goes, checking it is the only way you will know that you can read another line from the file. If you try to read past the end you will run into problems.
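Putting both points together, here is a minimal sketch of the corrected loop, assuming the test1.csv file and the h map from the question:
Scanner readIn = new Scanner(new File("test1.csv"));
while (readIn.hasNextLine()) {
    String line = readIn.nextLine();     // exactly one read per check, no stray next()
    String[] str = line.split(",");
    if (str.length < 2) {
        continue;                        // skip blank or malformed lines instead of throwing
    }
    String k = str[0].trim();
    int v = Integer.parseInt(str[1].trim()); // trim() copes with entries like "Ken, 1"
    h.put(k, v);
}
readIn.close();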

Rather than writing code to read the CSV file on your own, I'd suggest you use a standard library like Apache Commons CSV. It provides methods to deal with CSV, tab-separated files, etc.
import java.io.FileReader;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

public class SO35859431 {
    public static void main(String[] args) {
        String filePath = "D:\\user.csv";
        try {
            List<CSVRecord> listRecords = CSVFormat.DEFAULT.parse(new FileReader(filePath)).getRecords();
            for (CSVRecord csvRecord : listRecords) {
                /* Get record using index/position */
                System.out.println(csvRecord.get(0));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
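If you need the name-to-count map from the original question, each CSVRecord can be unpacked by index; a sketch assuming the two-column test1.csv layout:
Map<String, Integer> h = new HashMap<>();
for (CSVRecord record : listRecords) {
    // column 0 is the name, column 1 the value; trim() tolerates a space after the comma
    h.put(record.get(0).trim(), Integer.parseInt(record.get(1).trim()));
}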

First, there is no insert() method in the HashMap class; the correct one is put(k, v). And the loop should perform exactly one read per check. Following is my code, using a BufferedReader as an alternative.
package read;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;

/**
 * @author Isuru Rangana Jr
 */
public class Read {

    /**
     * @param args the command line arguments
     */
    public static void main(String args[]) throws FileNotFoundException, IOException {
        HashMap<String, Integer> h = new HashMap<String, Integer>();
        try {
            BufferedReader readIn = new BufferedReader(new FileReader(new File("test.csv")));
            String line;
            // readLine() returns null at end of file, which is the reliable end-of-input check
            while ((line = readIn.readLine()) != null) {
                String str[] = line.split(",");
                String k = str[0];
                int v = Integer.parseInt(str[1].trim()); // trim() copes with entries like "Ken, 1"
                h.put(k, v);
                System.out.println(h.toString());
            }
            readIn.close();
        } catch (ArrayIndexOutOfBoundsException ob) {
            System.out.println(" - Malformed line: expected name,value.");
        }
    }
}


Java: check CSV file on duplicate lines using ArrayList

I have a CSV file with this content:
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
2017-10-29 00:00:00.0,"1010",-125529,0,0,0,420743,0,0,256,420743,256,420743
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
2017-10-29 00:00:00.0,"1013",-10625,0,0,-687,599098,0,0,379,599098,379,599098
2017-10-29 00:00:00.0,"1604",-1794.9,0,0,-3.99,4081.07,0,0,361,4081.07,361,4081.07
So lines 1 and 3 are duplicates.
Now I want to read the file in and print out duplicate lines in the console.
I set up this Java code reading the file in and throwing it line by line into an ArrayList. Then I create an immutable copy, loop through the ArrayList, and in the binarySearch I use the immutable copy of the ArrayList:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ReadValidationFile {
    public static void main(String[] args) {
        List<String> validationFile = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"));) {
            String line;
            while ((line = br.readLine()) != null) {
                validationFile.add(line);
            }
        } catch (FileNotFoundException e) {
            //e.printStackTrace();
            System.out.println("file not found " + e.getMessage());
        } catch (IOException e) {
            e.printStackTrace();
        }

        List<String> validationFileCopy = Collections.unmodifiableList(validationFile);
        for (String line : validationFile) {
            int comp = Collections.binarySearch(validationFileCopy, line, new ComparatorLine());
            if (comp <= 0) {
                System.out.println(line);
            }
        }
    }
}
Comparator Class:
import java.util.Comparator;

public class ComparatorLine implements Comparator<String> {
    @Override
    public int compare(String s1, String s2) {
        return s1.compareToIgnoreCase(s2);
    }
}
I expect this line to be printed:
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
But the output I get is this:
2017-10-29 00:00:00.0,"1010",-125529,0,0,0,420743,0,0,256,420743,256,420743
Can you please help me see what I am doing wrong? I think my comparator is okay. What is wrong with my ArrayLists?
The other answer(s) correctly state that you should be using Set instead of List. But for the sake of learning, let's have a look at your code and see where you went wrong.
public class ReadValidationFile {
    public static void main(String[] args) {
        List<String> validationFile = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"));) {
Semicolon is unnecessary.
            String line;
            while ((line = br.readLine()) != null) {
                validationFile.add(line);
            }
This can all be achieved in just one line: List<String> validationFile = Files.readAllLines(Paths.get("validation_small.csv"), StandardCharsets.UTF_8);
        } catch (FileNotFoundException e) {
            //e.printStackTrace();
            System.out.println("file not found " + e.getMessage());
        } catch (IOException e) {
            e.printStackTrace();
        }
        List<String> validationFileCopy = Collections.unmodifiableList(validationFile);
Actually, this is not a copy. It is just an unmodifiable view of the same list.
        for (String line : validationFile) {
            int comp = Collections.binarySearch(validationFileCopy, line, new ComparatorLine());
You might as well just search validationFile itself. However, you are calling binarySearch, which only works on sorted lists, and your list is not sorted. See the documentation.
            if (comp <= 0) {
                System.out.println(line);
            }
You are printing when comp <= 0, but a successful binarySearch returns the non-negative index of the match and a failed one returns a negative number, so the test is inverted (and off by one at index 0). But another problem is that you are searching the whole list for each of its own elements, so the search would always succeed anyway (that is, if your list were sorted).
Save yourself all the trouble and use a Set instead. Using Java 8 streams, the whole program can be reduced to the following:
public static void main(String[] args) throws Exception {
    Set<String> uniqueLines = new HashSet<>();
    Files.lines(Paths.get("validation_small.csv"))
         .filter(line -> !uniqueLines.add(line))
         .forEach(System.out::println);
}
If you really need to ignore case when comparing strings (from your given data it looks like it makes no difference, since the values are just numbers), then store each unique line after first uppercasing and then lowercasing it. This apparently cumbersome technique is necessary because lowercasing alone is not enough when dealing with non-English text; the equalsIgnoreCase method does the same thing internally.
public static void main(String[] args) throws Exception {
    Set<String> uniqueLines = new HashSet<>();
    Files.lines(Paths.get("validation_small.csv"))
         .filter(line -> !uniqueLines.add(line.toUpperCase().toLowerCase()))
         .forEach(System.out::println);
}
Create a Set while reading lines from the input CSV file; any time add() returns false for a line, print that line, as it is a duplicate.
If you want a list of all duplicate lines, create a List holding the lines for which add() to the Set returned false.
NOTE:
I have simulated your file reading by using static data.
Small note: if your data contains only numbers and no letters, then you do not need case-insensitive comparison.
If your data does contain letters, you still do not need a special Comparator: you can insert the data into the Set using add(line.toLowerCase()), which ensures that all lines are lowercased before being compared and added to the Set.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ReadValidationFile {

    static List<String> validationFile = new ArrayList<>();

    static {
        validationFile.add("2017-10-29 00:00:00.0,\"1005\",-10227,0,0,0,332894,0,0,222,332894,222,332894");
        validationFile.add("2017-10-29 00:00:00.0,\"1010\",-125529,0,0,0,420743,0,0,256,420743,256,420743");
        validationFile.add("2017-10-29 00:00:00.0,\"1005\",-10227,0,0,0,332894,0,0,222,332894,222,332894");
        validationFile.add("2017-10-29 00:00:00.0,\"1013\",-10625,0,0,-687,599098,0,0,379,599098,379,599098");
        validationFile.add("2017-10-29 00:00:00.0,\"1604\",-1794.9,0,0,-3.99,4081.07,0,0,361,4081.07,361,4081.07");
    }

    public static void main(String[] args) {
        // Option 1 : unique lines only
        Set<String> uniqueLinesOnly = new HashSet<>(validationFile);

        // Option 2 : unique lines and duplicate lines
        Set<String> uniqueLines = new HashSet<>();
        Set<String> duplicateLines = new HashSet<>();
        for (String line : validationFile) {
            if (!uniqueLines.add(line.toLowerCase())) {
                duplicateLines.add(line.toLowerCase());
            }
        }

        // Option 3 : unique lines and duplicate lines by Java Streams
        Set<String> uniquesJava8 = new HashSet<>();
        List<String> duplicatesJava8 = validationFile
                .stream()
                .filter(element -> !uniquesJava8.add(element.toLowerCase()))
                .map(element -> element.toLowerCase())
                .collect(Collectors.toList());
    }
}
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ReadValidationFile {
    public static void main(String[] args) {
        List<String> validationFile = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"))) {
            String line;
            while ((line = br.readLine()) != null) {
                validationFile.add(line);
            }
        } catch (FileNotFoundException e) {
            System.out.println("file not found " + e.getMessage());
        } catch (IOException e) {
            e.printStackTrace();
        }
        Set<String> uniques = new HashSet<>();
        List<String> duplicates = validationFile.stream().filter(i -> !uniques.add(i)).collect(Collectors.toList());
        System.out.println(duplicates);
    }
}

Reading from a file and excluding words using Scanner

I'm currently trying to write a program that counts the number of times different words are used in a text and then attaches the values to a HashMap. In the main part of the program I use a Scanner to read in the file with the text, and I initiate the GeneralWordCounter with another Scanner that's supposed to read in a file with words I want excluded (words like "this", "her", "that"). I've made sure that the string sent to op.process is lowercased; however, when I run the program it still counts all the words that I want excluded from the statistics. What am I doing wrong? I know the main program works; I've tried it with single words.
TL;DR: I want words excluded using a Scanner that reads in a text, but for some reason they aren't being excluded in the "process" operation of my program.
package textproc;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class Holgersson {

    public static final String[] REGIONS = { "blekinge", "bohuslän", "dalarna", "dalsland", "gotland", "gästrikland",
            "halland", "hälsingland", "härjedalen", "jämtland", "lappland", "medelpad", "närke", "skåne", "småland",
            "södermanland", "uppland", "värmland", "västerbotten", "västergötland", "västmanland", "ångermanland",
            "öland", "östergötland" };

    public static void main(String[] args) throws FileNotFoundException {
        Scanner s = new Scanner(new File("../lab1/nilsholg.txt"));
        Scanner stopwords = new Scanner(new File("undantagsord.txt"));
        s.useDelimiter("(\\s|,|\\.|:|;|!|-|\\?|'|\\\")+"); // see the lab instructions

        TextProcessor gen = new GeneralWordCounter(stopwords);

        while (s.hasNext()) {
            String word = s.next().toLowerCase();
            gen.process(word);
        }
        s.close();
        gen.report();
    }
}
package textproc;

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class GeneralWordCounter implements TextProcessor {
    private Map<String, Integer> m;
    private Scanner excep;

    GeneralWordCounter(Scanner r) {
        Map<String, Integer> m = new HashMap<String, Integer>();
        this.m = m;
        excep = r;
    }

    @Override
    public void process(String word) {
        // TODO Auto-generated method stub
        boolean bin = false;
        while (excep.hasNext() && bin == false) {
            if (word.equals(excep.next().toLowerCase())) {
                bin = true;
            }
        }
        if (!bin) {
            if (m.containsKey(word)) {
                m.put(word, (m.get(word) + 1));
            } else {
                m.put(word, 1);
            }
        }
    }

    @Override
    public void report() {
        // TODO Auto-generated method stub
        for (String key : m.keySet()) {
            if (m.get(key) >= 200) {
                System.out.println(key + " - " + m.get(key));
            }
        }
    }
}
You are using the same Scanner instance for the stopwords inside the loop, and it gets exhausted after the first few iterations of the loop below.
TextProcessor gen = new GeneralWordCounter(stopwords);
while (s.hasNext()) {
    String word = s.next().toLowerCase();
    gen.process(word);
}
Think of it this way: you started the loop above and passed in the Scanner instance, and the first call to the process method looped that Scanner all the way to the end of the stopwords file. On the next iteration you call process again, but this time the read position is already at the end of the file, because it is the same instance. So you won't get the expected output.
Instead, you need to create a new instance of Scanner for each process method call:
public void process(String word) {
    Scanner excep = new Scanner(new File("undantagsord.txt"));
    // your code.
}
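Alternatively, and this avoids re-reading the file on every call, the constructor could drain the Scanner into a Set once. A sketch of that restructuring of GeneralWordCounter:
private final Set<String> stopwords = new HashSet<>();

GeneralWordCounter(Scanner r) {
    m = new HashMap<String, Integer>();
    while (r.hasNext()) {
        stopwords.add(r.next().toLowerCase()); // read the exclusion list once, up front
    }
}

@Override
public void process(String word) {
    if (!stopwords.contains(word)) {
        m.put(word, m.getOrDefault(word, 0) + 1); // count only non-excluded words
    }
}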

Write CSV file column-by-column

I was searching for an answer to this but didn't find one. Does anyone have a solution for this kind of problem? I have a set of text variables that I have to write into a .CSV file using Java. I am currently doing a project with JavaScript that calls into Java. This is the function I have right now; it does the job well and writes the text into the .CSV line by line:
function writeFile(filename, data)
{
    try
    {
        //write the data
        out = new java.io.BufferedWriter(new java.io.FileWriter(filename, true));
        out.newLine();
        out.write(data);
        out.close();
        out = null;
    }
    catch(e) //catch and report any errors
    {
        alert("" + e);
    }
}
But now I have to write the parts of the text one by one, like the example below.
first0,second0,third0
first1,second1,third1
first2,second2,third2
.
.
.
first9,second9,third9
So the algorithm goes like this: the function writes first0 followed by a comma, then goes to the next line and writes first1, goes to the next line and writes first2, and so on until first9. After that part is done, the script goes back to the beginning of the file and writes second0 behind the comma, goes to the next line and writes second1 behind the comma, and so on. You get the idea.
So now I need Java code that can fill in the file column by column like that.
You might want to consider using Super CSV to write the CSV file. As well as taking care of escaping embedded double-quotes and commas, it offers a range of writing implementations that write from arrays/Lists, Maps or even POJOs, which means you can easily try out your ideas.
If you wanted to keep it really simple, you can assemble your CSV file in a two-dimensional array. This allows you to assemble it column-first, and then write the whole thing to CSV when it's ready.
package example;

import java.io.FileWriter;
import java.io.IOException;

import org.supercsv.io.CsvListWriter;
import org.supercsv.io.ICsvListWriter;
import org.supercsv.prefs.CsvPreference;

public class ColumnFirst {

    public static void main(String[] args) {
        // you can assemble this 2D array however you want
        final String[][] csvMatrix = new String[3][3];
        csvMatrix[0][0] = "first0";
        csvMatrix[0][1] = "second0";
        csvMatrix[0][2] = "third0";
        csvMatrix[1][0] = "first1";
        csvMatrix[1][1] = "second1";
        csvMatrix[1][2] = "third1";
        csvMatrix[2][0] = "first2";
        csvMatrix[2][1] = "second2";
        csvMatrix[2][2] = "third2";
        writeCsv(csvMatrix);
    }

    private static void writeCsv(String[][] csvMatrix) {
        ICsvListWriter csvWriter = null;
        try {
            csvWriter = new CsvListWriter(new FileWriter("out.csv"),
                    CsvPreference.STANDARD_PREFERENCE);
            for (int i = 0; i < csvMatrix.length; i++) {
                csvWriter.write(csvMatrix[i]);
            }
        } catch (IOException e) {
            e.printStackTrace(); // TODO handle exception properly
        } finally {
            try {
                csvWriter.close();
            } catch (IOException e) {
            }
        }
    }
}
Output:
first0,second0,third0
first1,second1,third1
first2,second2,third2
Here is my solution to the problem. You don't need to keep the whole data in the buffer thanks to the low-level random access file mechanisms. You would still need to load your records one by one:
package file.csv;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.util.Arrays;
import java.util.List;

public class CsvColumnWriter {

    public static void main(String args[]) throws Exception {
        CsvColumnWriter csvWriter = new CsvColumnWriter(new File("d:\\csv.txt"), new File("d:\\csv.work.txt"), 3);
        csvWriter.writeNextCol(Arrays.asList(new String[]{"first0", "first1", "first2"}));
        csvWriter.writeNextCol(Arrays.asList(new String[]{"second0", "second1", "second2"}));
        csvWriter.writeNextCol(Arrays.asList(new String[]{"third0", "third1", "third2"}));
    }

    public void writeNextCol(List<String> colOfValues) throws IOException {
        // we are going to create a new target file so we have to first
        // create a duplicated version
        copyFile(targetFile, workFile);
        this.targetStream = new BufferedOutputStream(new FileOutputStream(targetFile));

        int lineNo = 0;
        for (String nextColValue : colOfValues) {
            String nextChunk = nextColValue + ",";

            // before we add the next chunk to the current line,
            // we must retrieve the line from the duplicated file based on its offset and length
            int lineOffset = findLineOffset(lineNo);
            workRndAccFile.seek(lineOffset);
            int bytesToRead = lineInBytes[lineNo];
            byte[] curLineBytes = new byte[bytesToRead];
            workRndAccFile.read(curLineBytes);

            // now, we write the previous version of the line fetched from the
            // duplicated file plus the new chunk plus a 'new line' character
            targetStream.write(curLineBytes);
            targetStream.write(nextChunk.getBytes());
            targetStream.write("\n".getBytes());

            // update the length of the line
            lineInBytes[lineNo] += nextChunk.getBytes().length;
            lineNo++;
        }
        // Though I have not done it myself, obviously some code should be added here to care for the cases where
        // fewer column values are provided to this method than the total number of lines
        targetStream.flush();
        workFile.delete();
        firstColWritten = true;
    }

    // finds the byte offset of the given line in the duplicated file
    private int findLineOffset(int lineNo) {
        int offset = 0;
        for (int i = 0; i < lineNo; i++)
            offset += lineInBytes[i] +       // sum the lengths of all preceding lines
                    (firstColWritten ? 1 : 0); // 1 byte is added for '\n' if at least one column has been written
        return offset;
    }

    // helper method for file copy operation
    public static void copyFile(File from, File to) throws IOException {
        FileChannel in = new FileInputStream(from).getChannel();
        FileChannel out = new FileOutputStream(to).getChannel();
        out.transferFrom(in, 0, in.size());
    }

    public CsvColumnWriter(File targetFile, File workFile, int lines) throws Exception {
        this.targetFile = targetFile;
        this.workFile = workFile;
        workFile.createNewFile();
        this.workRndAccFile = new RandomAccessFile(workFile, "rw");
        lineInBytes = new int[lines];
        for (int i = 0; i < lines; i++)
            lineInBytes[i] = 0;
        firstColWritten = false;
    }

    private File targetFile;
    private File workFile;
    private int[] lineInBytes;
    private OutputStream targetStream;
    private RandomAccessFile workRndAccFile;
    private boolean firstColWritten;
}
I'm just going to go ahead and assume that you have some freedom in how to fulfill this task. To my knowledge, you can't 'insert' text into a file; you can only read the file completely, change it in memory, and then write the result back to the file.
So it would be better to invert your data structure in memory and then write it out. If your data object is a matrix, just transpose it, so that it is in the format you want to write; see the sketch below.
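A minimal sketch of that transpose-then-write idea, assuming the whole matrix fits in memory and is rectangular (all names here are illustrative):
import java.io.PrintWriter;

public class TransposeWrite {
    // columns[c][r] is the r-th value of column c; the result flips that to rows[r][c]
    static String[][] transpose(String[][] columns) {
        String[][] rows = new String[columns[0].length][columns.length];
        for (int c = 0; c < columns.length; c++)
            for (int r = 0; r < columns[c].length; r++)
                rows[r][c] = columns[c][r];
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String[][] columns = {
            {"first0", "first1", "first2"},
            {"second0", "second1", "second2"},
            {"third0", "third1", "third2"}
        };
        try (PrintWriter out = new PrintWriter("out.csv")) {
            for (String[] row : transpose(columns))
                out.println(String.join(",", row)); // no quoting: assumes values contain no commas
        }
    }
}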
How about this (note that the lines array must be pre-filled with empty strings, otherwise the += concatenation starts every line with the text "null"):
Scanner input = new Scanner(System.in);
String[] lines = new String[9];
Arrays.fill(lines, ""); // avoid "null" prefixes from concatenating onto null elements
for (int j = 0; j < 2; j++) {
    for (int i = 0; i < 9; i++) {
        lines[i] += input.nextLine() + ",";
    }
}
for (int i = 0; i < 9; i++) {
    lines[i] += input.nextLine();
}
Based on your requirement of not losing any data if an error occurs, perhaps you should rethink the design and use an embedded database (there is a discussion of the merits of various embedded databases at Embedded java databases). You would just need a single table in the database.
I suggest this because in your original question it sounds like you are trying to use a CSV file like a database, where you can update the columns of any row in any order. In that case, why not just bite the bullet and use a real database?
Anyhow, once you have all the columns and rows of your table filled in, export the database to a CSV file in "text file order": row1-col1, row1-col2 ... row2-col1, etc. A rough sketch of that export step follows below.
If an error occurs during the building of the database or the exporting of the CSV file, at least you will still have all the data from the previous run and can try again.
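To make the export step concrete, here is a rough sketch over plain JDBC; the table and column names are invented for illustration, the Connection is assumed to be already open, and values are assumed to contain no embedded commas:
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class CsvExport {
    // dumps a (hypothetical) three-column table to CSV in row-major order
    static void exportToCsv(Connection connection, String file) throws Exception {
        try (Statement st = connection.createStatement();
             ResultSet rs = st.executeQuery("SELECT col1, col2, col3 FROM data ORDER BY row_id");
             PrintWriter out = new PrintWriter(file)) {
            while (rs.next()) {
                out.println(rs.getString(1) + "," + rs.getString(2) + "," + rs.getString(3));
            }
        }
    }
}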

Best way to process strings in Java

I want to make a game-of-life clone that uses text files to specify the starting board, then writes the finished board to a new text file in a similar format. E.g., load this:
Board in:
wbbbw
bbbbb
wwwww
wbwbw
bwbwb
from a file, then output something like this:
Board out:
wbbbw
bbbbb
wwwww
wwwww
bwbwb
I do this by making a 2D array of characters (char[][] board), reading the file line by line into a string, and using String.charAt() to access each character and store it in the array.
Afterward, I convert each row of board (i.e., board[0], board[1], etc.) back to a string using String.valueOf() and write it to a new line in the second file.
Please tell me I'm an idiot for doing this and that there is a better way to go through the file -> string -> array -> string -> file process.
You can use String.toCharArray() for each line while reading the file.
char[][] board = new char[5][];
int i = 0;
while ((line = buffRdr.readLine()) != null) {
    board[i++] = line.toCharArray();
}
And while writing, either String.valueOf() or java.util.Arrays.toString().
for (int i = 0; i < board.length; i++) {
    //write Arrays.toString(board[i]);
}
// Remember to handle whitespace chars in the array
char[] row = "wbbbw bbbbb wwwww wbwbw bwbwb".toCharArray();
Everything else seems good.
Why not use an already existing text format, such as JSON, instead of inventing your own?
There are tons of JSON parsers out there that can read and write two-dimensional arrays.
You get both the benefit of easy reading directly from the file (as with your original method) and the benefit of not having to parse an annoying string format.
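For example, a sketch using Jackson (any JSON library would do; the file names and the array-of-row-strings layout are my assumptions):
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;

public class JsonBoard {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // board.json holds something like ["wbbbw","bbbbb","wwwww","wbwbw","bwbwb"]
        String[] rows = mapper.readValue(new File("board.json"), String[].class);
        char[][] board = new char[rows.length][];
        for (int i = 0; i < rows.length; i++)
            board[i] = rows[i].toCharArray();

        // ... run the simulation on board ...

        String[] out = new String[board.length];
        for (int i = 0; i < board.length; i++)
            out[i] = String.valueOf(board[i]);
        mapper.writeValue(new File("board_out.json"), out); // writes the finished board back as JSON
    }
}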
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;

public class GameOfLife
{
    private String mFilename;
    private ArrayList<String> mLines;

    public GameOfLife(String filename)
    {
        mFilename = filename;
        read();
    }

    public char get(int x, int y)
    {
        String line = mLines.get(y);
        return line.charAt(x);
    }

    public void set(char c, int x, int y)
    {
        String line = mLines.get(y);
        String replacement = line.substring(0, x) + c + line.substring(x + 1, line.length());
        mLines.set(y, replacement);
    }

    private void read()
    {
        mLines = new ArrayList<String>();
        try
        {
            BufferedReader in = new BufferedReader(new FileReader(mFilename));
            String line = in.readLine();
            while (line != null)
            {
                mLines.add(line);
                line = in.readLine();
            }
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    private void write()
    {
        try
        {
            BufferedWriter out = new BufferedWriter(new FileWriter(mFilename));
            for (String line : mLines)
            {
                out.write(line + "\n");
            }
            out.close(); // without closing, the buffered output may never be flushed to disk
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}
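Usage would look something like the sketch below. Note that as posted, write() is private and never called, so you would need to invoke it somewhere (for example at the end of set(), or via a public save method) before the changes reach the file:
GameOfLife game = new GameOfLife("board.txt"); // the constructor reads the board in
char cell = game.get(0, 3);  // the cell at column 0, row 3
game.set('w', 0, 3);         // overwrite that cell in memory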

Converting an CSV file to a JSON object in Java

Is there an open source java library to convert a CSV (or XLS) file to a JSON object?
I tried using json.cdl, but somehow it does not seem to work for large CSV strings.
I'm trying to find something like http://www.cparker15.com/code/utilities/csv-to-json/, but written in Java.
You can use Open CSV to map CSV to a Java Bean, and then use JAXB to convert the Java Bean into a JSON object.
http://opencsv.sourceforge.net/#javabean-integration
http://jaxb.java.net/guide/Mapping_your_favorite_class.html
Here is my Java program; I hope somebody finds it useful. The format needs to be like this:
"SYMBOL,DATE,CLOSE_PRICE,OPEN_PRICE,HIGH_PRICE,LOW_PRICE,VOLUME,ADJ_CLOSE
AAIT,2015-02-26 00:00:00.000,-35.152,0,35.152,35.12,679,0
AAL,2015-02-26 00:00:00.000,49.35,50.38,50.38,49.02,7572135,0"
The first line is the column headers. No quotation marks anywhere. Separate with commas, not semicolons. You get the deal.
/* Summary: Converts a CSV file to a JSON file. */
//import java.util.*;
import java.io.*;
import javax.swing.*;
import javax.swing.filechooser.FileNameExtensionFilter;

public class CSVtoJSON extends JFrame {

    private static final long serialVersionUID = 1L;
    private static File CSVFile;
    private static BufferedReader read;
    private static BufferedWriter write;

    public CSVtoJSON() {
        FileNameExtensionFilter filter = new FileNameExtensionFilter("comma separated values", "csv");
        JFileChooser choice = new JFileChooser();
        choice.setFileFilter(filter); //limit the files displayed
        int option = choice.showOpenDialog(this);
        if (option == JFileChooser.APPROVE_OPTION) {
            CSVFile = choice.getSelectedFile();
        } else {
            JOptionPane.showMessageDialog(this, "Did not select file. Program will exit.", "System Dialog", JOptionPane.PLAIN_MESSAGE);
            System.exit(1);
        }
    }

    public static void main(String args[]) {
        CSVtoJSON parse = new CSVtoJSON();
        parse.convert();
        System.exit(0);
    }

    private void convert() {
        /* Converts a .csv file to .json. Assumes first line is header with columns. */
        try {
            read = new BufferedReader(new FileReader(CSVFile));
            String outputName = CSVFile.toString().substring(0,
                    CSVFile.toString().lastIndexOf(".")) + ".json";
            write = new BufferedWriter(new FileWriter(new File(outputName)));
            String line;
            String columns[]; //contains column names
            int num_cols;
            String tokens[];
            int progress = 0; //check progress

            //initialize columns
            line = read.readLine();
            columns = line.split(",");
            num_cols = columns.length;

            write.write("["); //begin file as array
            line = read.readLine();
            while (line != null) { //guard against a header-only file
                tokens = line.split(",");
                if (tokens.length == num_cols) { //if number of columns equals number of entries
                    write.write("{");
                    for (int k = 0; k < num_cols; ++k) { //for each column
                        if (tokens[k].matches("^-?[0-9]*\\.?[0-9]*$")) { //if a number
                            write.write("\"" + columns[k] + "\": " + tokens[k]);
                            if (k < num_cols - 1) write.write(", ");
                        } else { //if a string
                            write.write("\"" + columns[k] + "\": \"" + tokens[k] + "\"");
                            if (k < num_cols - 1) write.write(", ");
                        }
                    }
                    ++progress; //progress update
                    if (progress % 10000 == 0) System.out.println(progress); //print progress
                    if ((line = read.readLine()) != null) { //if not last line
                        write.write("},");
                        write.newLine();
                    } else {
                        write.write("}]"); //if last line
                        write.newLine();
                        break;
                    }
                } else {
                    //line = read.readLine(); //read next line if you wish to continue parsing despite the error
                    JOptionPane.showMessageDialog(this, "ERROR: Formatting error line " + (progress + 2)
                            + ". Failed to parse.",
                            "System Dialog", JOptionPane.PLAIN_MESSAGE);
                    System.exit(-1); //error message
                }
            }
            JOptionPane.showMessageDialog(this, "File converted successfully to " + outputName,
                    "System Dialog", JOptionPane.PLAIN_MESSAGE);
            write.close();
            read.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
Requires Swing, but it comes with a nifty little GUI, so those who know absolutely no Java can use it once it is packaged into an executable .jar. Feel free to improve upon it. Thank you StackOverflow for helping me out all these years.
@Mouscellaneous basically answered this for you, so please give him the credit.
Here is what I came up with:
package edu.apollogrp.csvtojson;

import au.com.bytecode.opencsv.bean.CsvToBean;
import au.com.bytecode.opencsv.bean.HeaderColumnNameMappingStrategy;
import org.codehaus.jackson.map.ObjectMapper;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

public class ConvertCsvToJson {

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        if (args.length > 1) {
            String pathToCsvFile = args[0];
            String javaBeanClassName = "edu.apollogrp.csvtojson.bean." + args[1];

            final File file = new File(pathToCsvFile);
            if (!file.exists()) {
                System.out.println("The file you specified does not exist. path=" + pathToCsvFile);
            }

            Class<?> type = null;
            try {
                type = Class.forName(javaBeanClassName);
            } catch (ClassNotFoundException e) {
                System.out.println("The java bean you specified does not exist. className=" + javaBeanClassName);
            }

            HeaderColumnNameMappingStrategy strat = new HeaderColumnNameMappingStrategy();
            strat.setType(type);
            CsvToBean csv = new CsvToBean();
            List list = csv.parse(strat, new InputStreamReader(new FileInputStream(file)));
            System.out.println(new ObjectMapper().writeValueAsString(list));
        } else {
            System.out.println("Please specify the path to the csv file.");
        }
    }
}
I used maven to include the dependencies, but you could also download them manually and include them in your classpath.
<dependency>
    <groupId>net.sf.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>2.0</version>
</dependency>
<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>1.9.12</version>
</dependency>
<dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-core-asl</artifactId>
    <version>1.9.12</version>
</dependency>
I have used an Excel file in this code; you can use CSV. I wrote this class for a particular Excel/CSV format that is known to me.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.ArrayList;

import jxl.Cell;
import jxl.Sheet;
import jxl.Workbook;
import jxl.read.biff.BiffException;

import org.json.JSONObject;

// Contact and ContactList are the author's own bean classes (not shown in the post)
public class ReadExcel {

    private String inputFile;

    public void setInputFile(String inputFile) {
        this.inputFile = inputFile;
    }

    public void read() throws IOException {
        File inputWorkbook = new File(inputFile);
        Workbook w;
        try {
            w = Workbook.getWorkbook(inputWorkbook);
            // Get the first sheet
            Sheet sheet = w.getSheet(0);
            // Loop over the columns and rows
            int columns = sheet.getColumns();
            int rows = sheet.getRows();
            ContactList clist = new ContactList();
            ArrayList<Contact> contacts = new ArrayList<Contact>();
            for (int j = 1; j < rows; j++) {
                Contact contact = new Contact();
                for (int i = 0; i < columns; i++) {
                    Cell cell = sheet.getCell(i, j);
                    switch (i) {
                        case 0:
                            if (!cell.getContents().equalsIgnoreCase("")) {
                                contact.setSrNo(Integer.parseInt(cell.getContents()));
                            } else {
                                contact.setSrNo(j);
                            }
                            break;
                        case 1:
                            contact.setName(cell.getContents());
                            break;
                        case 2:
                            contact.setAddress(cell.getContents());
                            break;
                        case 3:
                            contact.setCity(cell.getContents());
                            break;
                        case 4:
                            contact.setContactNo(cell.getContents());
                            break;
                        case 5:
                            contact.setCategory(cell.getContents());
                            break;
                    }
                }
                contacts.add(contact);
            }
            System.out.println("done");
            clist.setContactList(contacts);
            JSONObject jsonlist = new JSONObject(clist);
            File f = new File("/home/vishal/Downloads/database.txt");
            FileOutputStream fos = new FileOutputStream(f, true);
            PrintStream ps = new PrintStream(fos);
            ps.append(jsonlist.toString());
        } catch (BiffException e) {
            e.printStackTrace();
            System.out.println("error");
        }
    }

    public static void main(String[] args) throws IOException {
        ReadExcel test = new ReadExcel();
        test.setInputFile("/home/vishal/Downloads/database.xls");
        test.read();
    }
}
I have used jxl.jar for the Excel reading.
If your CSV is simple, then this is easy to write by hand, but CSV can include nasty edge cases with quoting, missing values, etc.:
load the file using BufferedReader.readLine()
use String.split(",") to get the values from each line. NB: this approach will only work correctly if your values don't have commas in them!
write each value to the output using BufferedWriter, with the necessary JSON braces and quoting
You might want to use a CSV library and then convert to JSON 'by hand'; a minimal sketch of the by-hand route follows.
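A minimal sketch of that by-hand route, assuming a header row and no embedded commas or quotes (file names are placeholders):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;

public class HandRolledCsvToJson {
    public static void main(String[] args) throws Exception {
        try (BufferedReader in = new BufferedReader(new FileReader("in.csv"));
             PrintWriter out = new PrintWriter("out.json")) {
            String[] headers = in.readLine().split(","); // assumes at least a header line
            out.print("[");
            String line;
            boolean first = true;
            while ((line = in.readLine()) != null) {
                String[] values = line.split(","); // breaks if a value contains a comma
                out.print(first ? "{" : ",{");
                for (int i = 0; i < headers.length && i < values.length; i++) {
                    out.print((i > 0 ? "," : "") + "\"" + headers[i] + "\":\"" + values[i] + "\"");
                }
                out.print("}");
                first = false;
            }
            out.print("]");
        }
    }
}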
Here is a class I wrote that returns a JSONArray, instead of just printing to a file.
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import org.json.simple.JSONArray;
import org.json.simple.parser.JSONParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.util.List;
import java.util.Map;

public class CsvToJson {

    private static final Logger log = LoggerFactory.getLogger(CsvToJson.class);

    public static JSONArray convert(File input) throws Exception {
        JSONParser parser = new JSONParser();

        CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
        CsvMapper csvMapper = new CsvMapper();

        // Read data from CSV file
        List<? extends Object> readAll = csvMapper.readerFor(Map.class).with(csvSchema).readValues(input).readAll();

        ObjectMapper mapper = new ObjectMapper();
        JSONArray jsonObject = (JSONArray) parser.parse(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readAll));
        System.out.print(jsonObject.toString());
        return jsonObject; // return the parsed array, not an empty one
    }
}
With Java 8, writing JSON is at hand.
You didn't specify what JSON API you want, so I assume by "JSON object" you mean a string with a serialized JSON object.
What I did in the CSV Cruncher project:
Load the CSV using HSQLDB. That's a relatively small (~2 MB) library, which actually implements a SQL 2008 database.
Query that database using JDBC.
Build a JDK JSON object (javax.json.JsonObject) and serialize it.
Here's how to do it:
static void convertResultToJson(ResultSet resultSet, Path destFile, boolean printAsArray) throws Exception
{
    OutputStream outS = new BufferedOutputStream(new FileOutputStream(destFile.toFile()));
    Writer outW = new OutputStreamWriter(outS, StandardCharsets.UTF_8);

    ResultSetMetaData metaData = resultSet.getMetaData(); // needed for the column loop below

    // javax.json way
    JsonObjectBuilder builder = Json.createObjectBuilder();
    // Columns
    for (int colIndex = 1; colIndex <= metaData.getColumnCount(); colIndex++) {
        addTheRightTypeToJavaxJsonBuilder(resultSet, colIndex, builder);
    }
    JsonObject jsonObject = builder.build();
    JsonWriter writer = Json.createWriter(outW);
    writer.writeObject(jsonObject);
}
The whole implementation is here. (Originally I wrote my own CSV parsing and JSON writing, but figured out both are complicated enough to reach for a tested off-the-shelf library.)
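The addTheRightTypeToJavaxJsonBuilder helper isn't shown above; the following is my guess at its shape, not the actual CSV Cruncher code (which covers more SQL types), sketched over the standard JDBC and javax.json APIs:
import javax.json.JsonObjectBuilder;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

static void addTheRightTypeToJavaxJsonBuilder(ResultSet rs, int colIndex, JsonObjectBuilder builder)
        throws SQLException {
    String name = rs.getMetaData().getColumnLabel(colIndex);
    switch (rs.getMetaData().getColumnType(colIndex)) {
        case Types.INTEGER:
        case Types.BIGINT:
            builder.add(name, rs.getLong(colIndex));   // integral columns as JSON numbers
            break;
        case Types.DOUBLE:
        case Types.DECIMAL:
            builder.add(name, rs.getDouble(colIndex)); // floating-point columns as JSON numbers
            break;
        default:
            String value = rs.getString(colIndex);     // everything else as a JSON string
            if (value == null) builder.addNull(name);
            else builder.add(name, value);
    }
}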
If you're using Java 8, you can do something like this. No Libraries or complicated logic required.
Firstly, create a POJO representing your new JSON object. In my example it's called 'YourJSONObject' and has a constructor taking two strings.
What the code does is initially read the file, then create a stream of String-based lines (a line is equivalent to a line in your CSV file).
We then pass the line in to the map function which splits it on a comma and then creates the YourJSONObject.
All of these objects are then collected to a list which we pass in to the JSONArray constructor.
You now have an Array of JSONObjects. You can then call toString() on this object if you want to see the text representation of this.
JSONArray objects = new JSONArray(Files.readAllLines(Paths.get("src/main/resources/your_csv_file.csv"))
        .stream()
        .map(s -> new YourJSONObject(s.split(",")[0], s.split(",")[1]))
        .collect(toList()));
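The YourJSONObject POJO itself isn't shown above; it could be as simple as the sketch below (field names are placeholders; org.json's JSONArray serializes beans via their getters):
public class YourJSONObject {
    private final String firstField;   // placeholder names; match them to your CSV columns
    private final String secondField;

    public YourJSONObject(String firstField, String secondField) {
        this.firstField = firstField;
        this.secondField = secondField;
    }

    public String getFirstField() { return firstField; }
    public String getSecondField() { return secondField; }
}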
Old post, but I thought I'd share my own solution. It assumes quotation marks are used around any value that contains a comma, and it removes all quotation marks afterwards.
This method accepts a String in CSV format, so it assumes you've already read the CSV file into a string. Make sure you didn't strip the newline characters ('\n') while reading.
This method is in no way perfect, but it might be the quick one-method solution in pure Java that you are looking for.
public String CSVtoJSON(String output) {
    String[] lines = output.split("\n");
    StringBuilder builder = new StringBuilder();
    builder.append('[');
    String[] headers = new String[0];

    //CSV TO JSON
    for (int i = 0; i < lines.length; i++) {
        // NOTE: '۞' is the author's stand-in delimiter; the description above implies
        // in-value commas are replaced with it by a pre-processing step not shown here
        String[] values = lines[i].replaceAll("\"", "").split("۞");

        if (i == 0) //INDEX LIST
        {
            headers = values;
        } else {
            builder.append('{');
            for (int j = 0; j < values.length && j < headers.length; j++) {
                String jsonvalue = "\"" + headers[j] + "\":\"" + values[j] + "\"";
                if (j != values.length - 1) { //if not last value of values...
                    jsonvalue += ',';
                }
                builder.append(jsonvalue);
            }
            builder.append('}');
            if (i != lines.length - 1) {
                builder.append(',');
            }
        }
    }
    builder.append(']');
    output = builder.toString();
    return output;
}
