Join two CSV files into one in Java

I have two CSV files with multiple columns from multiple tables. I am using opencsv to create the CSV files.
I want to make one CSV file containing all the columns from both files.
There is one common column in both files.
But the number of records is not the same.
Please suggest something. Any help would be appreciated.
P.S.: Joining the two files simply means I want to put all the columns in one file. It is not a database join.
I want the combined CSV file so I can feed it to a tool that generates a PDF.

Load one file into a dictionary keyed by the common column value, then append all the records of the 2nd file to the respective entry in the dictionary (again by common column value).
Finally, write all the dictionary's key/value pairs to a new file.
A rough example:
CSVReader r1 = ...; // reader of 1st file
CSVReader r2 = ...; // reader of 2nd file

HashMap<String, String[]> dic = new HashMap<String, String[]>();

int commonCol = 1; // index of the common column in the 1st file
r1.readNext(); // skip header
String[] line;
while ((line = r1.readNext()) != null)
{
    dic.put(line[commonCol], line);
}

commonCol = 2; // index of the common column in the 2nd file
r2.readNext(); // skip header
while ((line = r2.readNext()) != null)
{
    if (dic.containsKey(line[commonCol]))
    {
        // append line to the existing entry
    }
    else
    {
        // create a new entry and prepend it with default values
        // for the columns of file1
    }
}

for (String[] merged : dic.values())
{
    // write merged to the output file
}
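The two commented branches can be filled in by concatenating row arrays. A minimal sketch, assuming a helper like the one below and a String[] named defaultsForFile1 holding placeholder values for rows that appear only in the 2nd file (both names are mine, not part of the original answer):

import java.util.Arrays;

// Hypothetical helper: concatenate two CSV rows into one wider row.
static String[] concat(String[] left, String[] right) {
    String[] merged = Arrays.copyOf(left, left.length + right.length);
    System.arraycopy(right, 0, merged, left.length, right.length);
    return merged;
}

Inside the loop, the two branches then become dic.put(line[commonCol], concat(dic.get(line[commonCol]), line)) and dic.put(line[commonCol], concat(defaultsForFile1, line)) respectively.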

We can do something like this if we know which columns hold the duplicate data:
int n1, n2; // 1-based positions of the duplicated column in the 1st and 2nd file
BufferedReader br1 = new BufferedReader(new InputStreamReader(new FileInputStream(f1)));
BufferedReader br2 = new BufferedReader(new InputStreamReader(new FileInputStream(f2)));
String line1, line2;
while ((line1 = br1.readLine()) != null && (line2 = br2.readLine()) != null) {
    String line = line1 + "," + line2;
    int cols1 = line1.split(",").length; // column count of the 1st file
    // use an array rather than StringTokenizer: countTokens() shrinks as
    // tokens are consumed, so it cannot be used as a stable loop bound
    String[] tokens = line.split(",");
    StringBuilder newL = new StringBuilder();
    for (int i = 1; i <= tokens.length; i++) {
        // skip both copies of the duplicated column; note the 2nd file's copy
        // sits at cols1 + n2, not n1 + n2 (drop the i == n1 test to keep one copy)
        if (i == n1 || i == cols1 + n2)
            continue;
        newL.append(",").append(tokens[i - 1]);
    }
    String l = newL.substring(1);
    // write this line to the output file
}
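For the write step left as a comment above, a minimal sketch (the output name joined.csv is an assumption): open the writer once before the loop, print each merged line inside it, and close it afterwards.

PrintWriter out = new PrintWriter(new FileWriter("joined.csv")); // before the while loop
// ... inside the loop, after building l:
out.println(l);
// ... after the loop:
out.close();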

Related

How to truncate a CSV file to n rows without reading the whole file

I have a big CSV (12 GB), so I can't read it into memory, and I need only the first 100 rows of it saved back (truncated). Does Java have such an API?
The other answers create a new file from the original file. As I understand it, you want to truncate the original file instead. You can do that quite easily using RandomAccessFile:
try (RandomAccessFile file = new RandomAccessFile(FILE, "rw")) {
    for (int i = 0; i < N && file.readLine() != null; i++)
        ; // just keep reading
    file.setLength(file.getFilePointer());
}
The caveat is that this will truncate after N lines, which is not necessarily the same thing as N rows, because CSV files can have rows that span multiple lines. For example, here is one CSV record that has a name, address, and phone number, and spans multiple lines:
Joe Bloggs, "1 Acacia Avenue,
Naboo Town,
Naboo", 01-234 56789
If you are sure all your rows only span one line, then the above code will work. But if there is any possibility that your CSV rows may span multiple lines, then you should first parse the file with a suitable CSV reader to find out how many lines you need to retain before you truncate the file. OpenCSV makes this quite easy:
final long numLines;
try (CSVReader csvReader = new CSVReader(new FileReader(FILE))) {
    csvReader.skip(N); // Skips N rows, not lines
    numLines = csvReader.getLinesRead(); // Gives number of lines, not rows
}
try (RandomAccessFile file = new RandomAccessFile(FILE, "rw")) {
    for (int i = 0; i < numLines && file.readLine() != null; i++)
        ; // just keep reading
    file.setLength(file.getFilePointer());
}
You should stream the file: read it line by line.
For example:
CSVReader reader = new CSVReader(new FileReader("myfile.csv"));
String[] nextLine;
// readNext() reads the next line from the buffer and converts it to a string array
while ((nextLine = reader.readNext()) != null) {
    System.out.println(Arrays.toString(nextLine)); // nextLine is a String[], so print it element-wise
}
If you need just a hundred lines, reading only that small portion of the file into memory is quick and cheap. In Kotlin, you could use the standard library file APIs to achieve this quite easily:
val firstHundredLines = File("test.csv").useLines { lines ->
    lines.take(100).joinToString(separator = System.lineSeparator())
}
File("test.csv").writeText(firstHundredLines)
Possible solution
File file = new File(fileName);

// collect first N lines
String newContent = null;
try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
    newContent = reader.lines().limit(N).collect(Collectors.joining(System.lineSeparator()));
}

// replace original file with collected content
Files.write(file.toPath(), newContent.getBytes(), StandardOpenOption.TRUNCATE_EXISTING);

How to delete a specific row from a CSV file using a search string

I am working on a Java project where I have to delete a specific row from a CSV file using Java. Currently I am using opencsv. I am trying to achieve the scenario below, where I have to delete the 3rd row from the list given two strings as input.
String 1: cat
String 2: mars
I am able to get the exact row and its number with my current code. How can I delete this row?
Here is my code:
private static void updateCsv(String string1, String string2) throws IOException {
    try {
        CSVReader reader = new CSVReader(new FileReader(OUTPUTFILE), ',');
        List<String[]> myEntries = reader.readAll();
        reader.close();
        // Iterate through my array to find the row the user input is located on
        int i = 1;
        for (String[] line : myEntries) {
            String textLine = Arrays.toString(line).replaceAll("\\[|\\]", "");
            // here I am checking for the two strings
            if (textLine.contains(string1) && textLine.contains(string2)) {
                // here I am able to get the row number, which is 3
                System.out.println("Found - Your item is on row: ...:" + i);
                // how can I delete the row that I have now?
            } else {
                //System.out.println("Not found");
            }
            i++;
        }
    } catch (IOException e) {
        System.out.println(e);
    }
}
List<String[]> filtered = myEntries.stream()
        .filter(entry -> !(entry[1].equals(string1) &&
                           entry[2].equals(string2)))
        .collect(Collectors.toList());

try (CSVWriter w = new CSVWriter(new FileWriter("result.csv"))) {
    filtered.forEach(w::writeNext);
}
You can't delete a line from a file in place in Java.
In the code in your question you load the entire contents of the CSV file into a List. What you want is to write all the List entries to a file except the line that contains string1 and string2.
According to the sample data in your question, string1 is compared with column B and string2 with column C. Column B corresponds to index 1 in the String array that holds a single line from the CSV file; similarly, column C corresponds to index 2.
Using the stream API introduced in Java 8, you simply filter out the unwanted line. The resulting list contains all the lines you want to keep, so just write them to another file. After that, if you like, you can delete the original CSV file and rename the resulting file (which I named "result.csv" in my code above).
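For that final delete-and-rename step, a minimal sketch using java.nio.file (the name original.csv is a hypothetical stand-in for your OUTPUTFILE):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Replace the original CSV with the filtered result in a single move.
Path original = Paths.get("original.csv"); // hypothetical name of the source file
Path result = Paths.get("result.csv");
Files.move(result, original, StandardCopyOption.REPLACE_EXISTING);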

How to import a txt file into a JTable and make the first column auto-increment according to the number of lines in the file

I'm doing some Java coding and I have to import a file into a JTable that has 4 columns, while my file has 3 (separated by whitespace). I need the first column of each line to auto-increment. Here is my code:
try {
    FileReader files = new FileReader(file);
    BufferedReader buf = new BufferedReader(files);
    String line = null;
    String tokens[] = null;
    while ((line = buf.readLine()) != null) {
        tokens = line.split("\\p{javaWhitespace}+");
        //System.out.println( Arrays.toString( tokens ));
        model.addRow(tokens);
    }
}
[screenshots of the resulting JTable and of the input file omitted]
Simply add an additional token to the front of your data. This is easiest using a Vector rather than an array: the first item in the Vector is your row index, and the following items are filled from your tokens array. For example:
try {
    FileReader files = new FileReader(file);
    BufferedReader buf = new BufferedReader(files);
    String line = null;
    String tokens[] = null;
    int count = 0;
    while ((line = buf.readLine()) != null) {
        tokens = line.split("\\p{javaWhitespace}+");
        Vector<Object> row = new Vector<>();
        row.add(count);
        count++;
        for (String text : tokens) {
            row.add(text);
        }
        model.addRow(row); // add the Vector, not the tokens array
    }
}
There are other ways, including extending the table model so that it does this automatically, and that may be necessary depending on your needs -- for example, should the rows renumber if a row is deleted or added while the program is running? If so, the logic needs to live in the table model, as in the sketch below.
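A minimal sketch of that model-based approach (the class name is mine; it assumes column 0 is reserved for the row number):

import javax.swing.table.DefaultTableModel;

// Derive the first column from the current row index, so numbering stays
// correct when rows are inserted or removed.
class AutoNumberedTableModel extends DefaultTableModel {
    AutoNumberedTableModel(Object[] columnNames, int rowCount) {
        super(columnNames, rowCount);
    }

    @Override
    public Object getValueAt(int row, int column) {
        if (column == 0) {
            return row + 1; // computed on the fly, never stored
        }
        return super.getValueAt(row, column);
    }

    @Override
    public boolean isCellEditable(int row, int column) {
        return column != 0; // the auto-number column is read-only
    }
}

With this model the reading loop can add any placeholder (for example null) in column 0; the displayed number always comes from the current row index.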

Java csv header iteration dynamically

I have a requirement like this:
My input will be a CSV file with values like below:
action, userId, firstName, email, lastName
1,2,sample,abc@gmail.com,test
2,3,justtest,def@gmail.com,test
I have to read this CSV file based on the headers. Say, for example: if action = 1 and email is null, then I have to change the action to 2, or set the email to null, or something like this.
I have no idea how to read and parse the values based on the headers. This is the code I tried:
String csvFile = "C:\\Test.csv";
// create BufferedReader to read csv file
BufferedReader br;
String line = "";
br = new BufferedReader(new FileReader(csvFile));
br.readLine();
// Read the file line by line starting from the second line
while ((line = br.readLine()) != null) {
    // Get all tokens available in line
    String[] tokens = line.split(",");
    if (tokens.length > 0) {
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("All Rows ------->" + tokens[i]);
        }
    }
}
This just prints all the values on new lines, like below:
All Rows ------->1411184866
All Rows ------->category
All Rows ------->123456
All Rows ------->Test
All Rows ------->TestFullName
All Rows ------->rsap@gmail.com
All Rows ------->3423131
Please help me complete this code. Thanks in advance.
For parsing the CSV file into a post-processable format, I would first create an enum that models the columns of the file, in correct order:
enum Column {
    ACTION, USERID, FIRSTNAME, EMAIL, LASTNAME;

    public static final Column[] VALUES = values();
}
Then, I would read the file into a mapping of columns into lists of column values (this data structure is also called a "multimap"):
Map<Column, List<String>> columns = new LinkedHashMap<>();

// initialize the map
for (Column c : Column.VALUES) {
    columns.put(c, new ArrayList<>());
}

String csvFile = "C:\\Test.csv";
String line = "";

// create BufferedReader to read csv file
BufferedReader br = new BufferedReader(new FileReader(csvFile));
br.readLine();

// Read the file line by line starting from the second line
while ((line = br.readLine()) != null) {
    // Get all tokens available in line
    String[] tokens = line.split(",");
    if (tokens.length > 0) {
        for (int i = 0; i < tokens.length; i++) {
            columns.get(Column.VALUES[i]).add(tokens[i].trim());
        }
    }
}
Now that you have the data ordered, you can start post-processing the values based on your business rules. Of course you can also apply the rules while reading the file, but this might hurt readability of the code. Accessing individual cells in the file is easy; e.g. the email address on the second row can be retrieved using columns.get(Column.EMAIL).get(1).
Running
System.out.println(columns);
with your example file outputs this:
{ACTION=[1, 2], USERID=[2, 3], FIRSTNAME=[sample, justtest],
EMAIL=[abc@gmail.com, def@gmail.com], LASTNAME=[test, test]}
Use the Apache Commons CSV library (or any other suitable library) to read the CSV file, then apply the business logic in your program.
https://commons.apache.org/proper/commons-csv/
It will give you the list of rows in the form of CSVRecord objects, and then you just need to apply the business logic based on the values by iterating over the list. The first element in the list will be your header.
Reader in = new FileReader("File Name");
CSVParser parser = new CSVParser(in, CSVFormat.EXCEL);
List<CSVRecord> csvRecords = parser.getRecords();
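If you tell Commons CSV to treat the first record as the header, you can read columns by name instead of by index. A hedged sketch (the file name and the action/email rule are taken from the question; this is one way to express it, not the only one):

import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

try (Reader in = Files.newBufferedReader(Paths.get("C:\\Test.csv"));
     CSVParser parser = new CSVParser(in,
             CSVFormat.DEFAULT.withFirstRecordAsHeader().withIgnoreSurroundingSpaces())) {
    for (CSVRecord record : parser) {
        String action = record.get("action");
        String email = record.get("email");
        // business rule from the question: action 1 with a missing email
        if ("1".equals(action) && email.isEmpty()) {
            System.out.println("Row " + record.getRecordNumber() + ": change action to 2");
        }
    }
}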

I want to read a text file, split it, and store the results in an array

I have a text file which has 10 fields (columns), each separated by a tab, and several such rows. I wish to read the text file, split every line on the tab delimiter, and store the result in an array of 10 columns and unlimited rows. Can that be done?
An array can't have "unlimited rows" - you have to specify the number of elements on construction. You might want to use a List of some description instead, e.g. an ArrayList.
As for the reading and parsing, I'd suggest using Guava, particularly:
Files.newReaderSupplier
CharStreams.readLines
Splitter
(That lets you split the lines as you go... alternatively you could use Files.readLines to get a List<String>, and then process that list separately, again using Splitter.)
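A rough sketch of that approach, assuming Guava is on the classpath and a tab-separated file named data.txt (both assumptions):

import com.google.common.base.Splitter;
import com.google.common.io.Files;
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Read all lines, then split each line on tabs into a list of columns.
List<String> lines = Files.readLines(new File("data.txt"), StandardCharsets.UTF_8);
List<List<String>> rows = new ArrayList<>();
for (String line : lines) {
    rows.add(Splitter.on('\t').splitToList(line));
}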
BufferedReader buf = new BufferedReader(new FileReader(fileName));
String line = null;
List<String[]> rows = new ArrayList<String[]>();
while ((line = buf.readLine()) != null) {
    String[] row = line.split("\t");
    rows.add(row);
}
for (String[] row : rows) {
    System.out.println(Arrays.toString(row)); // rows is a List of String[], so print each row element-wise
}
// use rows.toArray(new String[0][]) to convert to an array if necessary
Here is a simple way to load a .txt file and store a set number of its lines in an array.
import java.io.*;

public class TestPrograms {
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        String content;
        String[] daf = new String[4]; // size = the number of lines to keep
        try {
            String fileName = "Filepath you have to the file";
            File file2 = new File(fileName);
            FileInputStream fstream = new FileInputStream(file2);
            BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
            int i = 0;
            while (i < daf.length && (content = br.readLine()) != null) {
                daf[i] = content; // store each line at the next free index
                i++;
            }
            br.close();
            System.out.println(daf[0]);
            System.out.println(daf[1]);
            System.out.println(daf[2]);
            System.out.println(daf[3]);
        } catch (IOException ioe) {
            System.out.print(ioe);
        }
    }
}
