I am currently writing an algorithm that reads a .txt file into an ArrayList and checks it for duplicates in a loop. The loop should work like this:
Line 1 is written to the new .txt file, and the boolean found is set to true when the string was already found.
Line 2 is written to the new .txt file, and so on.
But if two strings are identical, the duplicate (i.e. the second string) should simply be ignored, and the loop should continue with the next one.
public class test {
public static void main(String[] args) throws IOException {
String suche = "88 BETRAG-MINUS VALUE 'M'.";
String suche2 = "88 BETRAG-PLUS VALUE 'P'";
boolean gefunden = false;
File neueDatei = new File("C:\\Dev\\xx.txt");
if (neueDatei.createNewFile()) {
System.out.println("Datei wurde erstellt");
}
if (gefunden == false) {
dateiEinlesen(null, gefunden);
ArrayList<String> arr = null;
inNeueDateischreiben(neueDatei, gefunden, arr, suche, suche2);
}
}
public static void dateiEinlesen(File neueDatei, boolean gefunden) {
BufferedReader reader;
String zeile = null;
try {
reader = new BufferedReader(new FileReader("C:\\Dev\\Test.txt"));
zeile = reader.readLine();
ArrayList<String[]> arr = new ArrayList<String[]>();
while (zeile != null) {
arr.add(zeile.split(" "));
zeile = reader.readLine();
}
System.out.println(arr);
} catch (IOException e) {
System.err.println("Error2 :" + e);
}
}
public static void inNeueDateischreiben(File neueDatei, boolean gefunden, ArrayList<String> arr, String suche2,
String suche22) throws IOException {
FileWriter writer = new FileWriter(suche22);
String lastValue = null;
for (Iterator<String> i = arr.iterator(); i.hasNext();) {
String currentValue = i.next();
if (lastValue != null && currentValue.equals(lastValue)) {
i.remove();
{
writer.write(suche2.toString());
gefunden = true;
}
}
writer.close();
}
}
}
Your variable names (suche2, suche22) make the code difficult to read.
Other than that, your writing algorithm has several problems. You only compare adjacent lines, while duplicate lines could be anywhere in the file. In addition, writer.write is only reached when you find a duplicate. The way you call the methods is also wrong: dateiEinlesen discards the list it builds, and inNeueDateischreiben is passed a null list.
Here are some general steps to write this correctly:
1. Open the file so you can read it line by line.
2. Create a file writer.
3. Create a set or dictionary-like data structure that enables you to look up items in constant time.
4. For each line that you read, do the following:
   - Look if the line exists in the dictionary.
   - If not, write it to the new file.
   - If it already exists in the dictionary, skip to step 4.
   - Add that line to the dictionary for later comparisons and go to step 4.
5. When the lines are exhausted, close both files.
I suggest you rewrite your code completely, as the current version is very difficult to amend.
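The steps listed in this answer could be sketched roughly as follows. This is only a minimal illustration, not the asker's final program: the class name RemoveDuplicateLines and the method copyWithoutDuplicates are made up for this example, and the input and output paths are expected as command-line arguments.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, named only for this sketch
class RemoveDuplicateLines {

    // Copies source to target, writing each distinct line only once
    // (the first occurrence is kept). Returns the number of lines written.
    static int copyWithoutDuplicates(Path source, Path target) throws IOException {
        Set<String> seen = new HashSet<>(); // constant-time lookups
        int written = 0;
        // try-with-resources closes both files, even if an exception occurs
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // add() returns false when the line is already in the set
                if (seen.add(line)) {
                    writer.write(line);
                    writer.newLine();
                    written++;
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: RemoveDuplicateLines <input file> <output file>");
            return;
        }
        int written = copyWithoutDuplicates(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(written + " lines written");
    }
}
```

Because Set.add() both checks for the line and stores it, the lookup step and the add step from the list collapse into a single call.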
I am trying to write a program that checks two files and prints the common contents from both the files.
Example of the file 1 content would be:
James 1
Cody 2
John 3
Example of the file 2 content would be:
1 Computer Science
2 Chemistry
3 Physics
So the final output printed on the console would be:
James Computer Science
Cody Chemistry
John Physics
Here is what I have so far in my code:
public class Filereader {
public static void main(String[] args) throws Exception {
File file = new File("file.txt");
File file2 = new File("file2.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
BufferedReader reader2 = new BufferedReader(new FileReader(file2));
String st, st2;
while ((st = reader.readLine()) != null) {
System.out.println(st);
}
while ((st2 = reader2.readLine()) != null) {
System.out.println(st2);
}
reader.close();
reader2.close();
}
}
I am having trouble figuring out how to match the file contents and print only each student's name and major by matching the student ID in the two files. Thanks for all the help.
You can build on the other answers and create a class for each file, like tables in a database.
public class Person{
Long id;
String name;
//getters and setters
}
public class Course{
Long id;
String name;
//getters and setters
}
Then you have more control over your columns, and it is simple to use.
Further, you can use an ArrayList<Person> and an ArrayList<Course>, and the relation can be a field inside your objects, such as courseId in the Person class.
if (person.getCourseId().equals(course.getId())) {
...
}
If instead the match is on the number in each file, use person.getId().equals(course.getId()). Because the ids are Long objects, compare them with equals() rather than ==.
P.S. Do not use split(" ") without a limit in your case, because a value can itself contain spaces: "1 Computer Science" would be split into three parts.
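To make the object-based idea concrete, here is a small sketch under some assumptions: the class StudentMajorJoin and its methods are invented for illustration, the lines are passed in as lists instead of being read from files, and every line is assumed to contain at least one space. The id is cut off with lastIndexOf/indexOf instead of split(" "), so a multi-word major stays intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, named only for this sketch
class StudentMajorJoin {

    static class Person {
        final long id;
        final String name;
        Person(long id, String name) { this.id = id; this.name = name; }
    }

    static class Course {
        final long id;
        final String name;
        Course(long id, String name) { this.id = id; this.name = name; }
    }

    // "James 1": the id comes after the LAST space
    static Person parsePerson(String line) {
        int cut = line.lastIndexOf(' ');
        return new Person(Long.parseLong(line.substring(cut + 1).trim()),
                          line.substring(0, cut).trim());
    }

    // "1 Computer Science": the id comes before the FIRST space
    static Course parseCourse(String line) {
        int cut = line.indexOf(' ');
        return new Course(Long.parseLong(line.substring(0, cut)),
                          line.substring(cut + 1).trim());
    }

    // Returns "name major" for every person whose id also appears in the course lines
    static List<String> join(List<String> personLines, List<String> courseLines) {
        Map<Long, Course> coursesById = new HashMap<>();
        for (String line : courseLines) {
            Course course = parseCourse(line);
            coursesById.put(course.id, course);
        }
        List<String> result = new ArrayList<>();
        for (String line : personLines) {
            Person person = parsePerson(line);
            Course course = coursesById.get(person.id);
            if (course != null) { // skip persons without a matching course
                result.add(person.name + " " + course.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> persons = Arrays.asList("James 1", "Cody 2", "John 3");
        List<String> courses = Arrays.asList("1 Computer Science", "2 Chemistry", "3 Physics");
        join(persons, courses).forEach(System.out::println);
    }
}
```

With the sample data from the question, this prints the three name/major pairs in the order of the first file.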
What you want is to organize the data from each text file into a map, then merge the two maps. This works even if the lines of the two files are not in the same order.
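import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.LinkedHashMap;
import java.util.Map;
public class Filereader {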
public static void main(String[] args) throws Exception {
File file = new File("file.txt");
File file2 = new File("file2.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
BufferedReader reader2 = new BufferedReader(new FileReader(file2));
String st, st2;
Map<Integer, String> nameMap = new LinkedHashMap<>();
Map<Integer, String> majorMap = new LinkedHashMap<>();
while ((st = reader.readLine()) != null) {
System.out.println(st);
String[] parts = st.split(" "); // Here you got ["James", "1"]
String name = parts[0];
Integer id = Integer.parseInt(parts[1]);
nameMap.put(id, name);
}
while ((st2 = reader2.readLine()) != null) {
System.out.println(st2);
String[] parts = st2.split(" ", 2); // limit 2 keeps "Computer Science" in one piece
String name = parts[1];
Integer id = Integer.parseInt(parts[0]);
majorMap.put(id, name);
}
reader.close();
reader2.close();
// Combine and print
nameMap.keySet().stream().forEach(id -> {
System.out.println(nameMap.get(id) + " " + majorMap.get(id));
});
}
}
You should read these files at the same time in sequence. This is easy to accomplish with a single while statement.
while ((st = reader.readLine()) != null && (st2 = reader2.readLine()) != null) {
// print both st and st2
}
The way your code is written now, it reads one file at a time, printing data to the console from each individual file. If you want to meld the results together, you have to combine the output of the files in a single loop.
Given that the intention may also be that you have an odd-sized file in one batch but you do have numbers to correlate across, or the numbers may come in a nonsequential order, you may want to store these results into a data structure instead, like a List, since you know the specific index of each of these values and know where they should fit in.
Combining the NIO Files and Stream API, it's a little simpler:
public static void main(String[] args) throws Exception {
Map<String, List<String[]>> f1 = Files
.lines(Paths.get("file1"))
.map(line -> line.split(" "))
.collect(Collectors.groupingBy(arr -> arr[1]));
Map<String, List<String[]>> f2 = Files
.lines(Paths.get("file2"))
.map(line -> line.split(" ", 2)) // limit 2 keeps a multi-word major such as "Computer Science" intact
.collect(Collectors.groupingBy(arr -> arr[0]));
Stream.concat(f1.keySet().stream(), f2.keySet().stream())
.distinct()
.map(key -> f1.get(key).get(0)[0] + " " + f2.get(key).get(0)[1])
.forEach(System.out::println);
}
As can easily be seen in the code, it assumes valid data and consistency between the two files. If this doesn't hold, you may need to first run a filter to exclude entries missing from either file:
Stream.concat(f1.keySet().stream(), f2.keySet().stream())
.filter(key -> f1.containsKey(key) && f2.containsKey(key))
.distinct()
...
If you change the order such that the number comes first in both files, you can read both files into a HashMap, then create a Set of common keys. Then loop through the set of common keys and grab the associated value from each HashMap to print:
My solution is verbose but I wrote it that way so that you can see exactly what's happening.
import java.util.Set;
import java.util.HashSet;
import java.util.Map;
import java.util.HashMap;
import java.io.File;
import java.util.Scanner;
class J {
public static Map<String, String> fileToMap(File file) throws Exception {
// TODO - Make sure the file exists before opening it
// Scans the input file
Scanner scanner = new Scanner(file);
// Create the map
Map<String, String> map = new HashMap<>();
String line;
String name;
String code;
String[] parts = new String[2];
// Scan line by line
while (scanner.hasNextLine()) {
// Get next line
line = scanner.nextLine();
// TODO - Make sure the string has at least 1 space
// Split line by index of first space found
parts = line.split(" ", line.indexOf(' ') - 1);
// Get the class code and string val
code = parts[0];
name = parts[1];
// Insert into map
map.put(code, name);
}
// Close input stream
scanner.close();
// Give the map back
return map;
}
public static Set<String> commonKeys(Map<String, String> nameMap,
Map<String, String> classMap) {
Set<String> commonSet = new HashSet<>();
// Get a set of keys for both maps
Set<String> nameSet = nameMap.keySet();
Set<String> classSet = classMap.keySet();
// Loop through one set
for (String key : nameSet) {
// Make sure the other set has it
if (classSet.contains(key)) {
commonSet.add(key);
}
}
return commonSet;
}
public static Map<String, String> joinByKey(Map<String, String> namesMap,
Map<String, String> classMap,
Set<String> commonKeys) {
Map<String, String> map = new HashMap<String, String>();
// Loop through common keys
for (String key : commonKeys) {
// TODO - check for nulls if get() returns nothing
// Fetch the associated value from each map
map.put(namesMap.get(key), classMap.get(key));
}
return map;
}
public static void main(String[] args) throws Exception {
// Surround in try catch
File names = new File("names.txt");
File classes = new File("classes.txt");
Map<String, String> nameMap = fileToMap(names);
Map<String, String> classMap = fileToMap(classes);
Set<String> commonKeys = commonKeys(nameMap, classMap);
Map<String, String> nameToClass = joinByKey(nameMap, classMap, commonKeys);
System.out.println(nameToClass);
}
}
names.txt
1 James
2 Cody
3 John
5 Max
classes.txt
1 Computer Science
2 Chemistry
3 Physics
4 Biology
Output:
{Cody=Chemistry, James=Computer, John=Physics}
Notes:
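I added keys in classes.txt and names.txt that purposely do not match, so you can see that they do not show up in the output. That is because those keys never make it into the commonKeys set, so they never get inserted into the joined map.
You can loop through the HashMap if you want by calling map.entrySet().
Note that "Computer Science" appears as "Computer" in the output: the split call in fileToMap does not stop at the first space, so parts[1] holds only the first word. Using line.split(" ", 2) instead keeps the rest of the line together.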
I have a CSV file with this content:
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
2017-10-29 00:00:00.0,"1010",-125529,0,0,0,420743,0,0,256,420743,256,420743
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
2017-10-29 00:00:00.0,"1013",-10625,0,0,-687,599098,0,0,379,599098,379,599098
2017-10-29 00:00:00.0,"1604",-1794.9,0,0,-3.99,4081.07,0,0,361,4081.07,361,4081.07
So lines 1 and 3 are duplicates.
Now I want to read the file in and print out duplicate lines in the console.
I set up this Java code, which reads the file and adds it line by line to an ArrayList. Then I create an immutable copy, loop through the ArrayList, and pass the immutable copy to binarySearch:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class ReadValidationFile {
public static void main(String[] args) {
List<String> validationFile = new ArrayList<>();
try(BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"));){
String line;
while((line = br.readLine())!= null){
validationFile.add(line);
}
} catch (FileNotFoundException e) {
//e.printStackTrace();
System.out.println("file not found " + e.getMessage());
} catch (IOException e) {
e.printStackTrace();
}
List<String> validationFileCopy = Collections.unmodifiableList(validationFile);
for(String line : validationFile){
int comp = Collections.binarySearch(validationFileCopy,line,new ComparatorLine());
if (comp <= 0){
System.out.println(line);
}
}
}
}
Comparator Class:
import java.util.Comparator;
public class ComparatorLine implements Comparator<String> {
@Override
public int compare(String s1, String s2) {
return s1.compareToIgnoreCase(s2);
}
}
I expect this line to be printed:
2017-10-29 00:00:00.0,"1005",-10227,0,0,0,332894,0,0,222,332894,222,332894
But the output I get is this:
2017-10-29 00:00:00.0,"1010",-125529,0,0,0,420743,0,0,256,420743,256,420743
Can you please help me see what I am doing wrong? I think my comparator is okay. What is wrong with my ArrayLists?
The other answer(s) correctly state that you should be using Set instead of List. But for the sake of learning, let's have a look at your code and see where you went wrong.
public class ReadValidationFile {
public static void main(String[] args) {
List<String> validationFile = new ArrayList<>();
try(BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"));){
Semicolon is unnecessary.
String line;
while((line = br.readLine())!= null){
validationFile.add(line);
}
This can all be achieved in just one line: List<String> validationFile = Files.readAllLines(Paths.get("validation_small.csv"), StandardCharsets.UTF_8);
} catch (FileNotFoundException e) {
//e.printStackTrace();
System.out.println("file not found " + e.getMessage());
} catch (IOException e) {
e.printStackTrace();
}
List<String> validationFileCopy = Collections.unmodifiableList(validationFile);
Actually, this is not a copy. It is just an unmodifiable view of the same list.
for(String line : validationFile){
int comp = Collections.binarySearch(validationFileCopy,line,new ComparatorLine());
You might as well just search validationFile itself. More importantly, you are calling binarySearch, which only works on sorted lists, but your list is not sorted. See the documentation of Collections.binarySearch.
if (comp <= 0){
System.out.println(line);
}
You are printing when the line is not found (a negative result) or when it is found at index 0. If the search succeeds, binarySearch returns the index of the match, which is non-negative (comp >= 0); if it fails, the result is negative. Another problem is that you are searching the whole list for each of its own elements, so the search would always succeed (that is, if your list were sorted).
Save yourself all the trouble and use a Set instead. And, using Java 8 streams, the whole program can be reduced to the following:
public static void main(String[] args) throws Exception {
Set<String> uniqueLines = new HashSet<>();
Files.lines(Paths.get("validation_small.csv"), StandardCharsets.UTF_8)
.filter(line -> !uniqueLines.add(line))
.forEach(System.out::println);
}
If you really need to ignore case when comparing strings (from your given data, it looks like it doesn't make any difference since it's just numbers), then store each unique line by first uppercasing and then lowercasing it. This apparently cumbersome technique is necessary because just lowercasing is not enough if dealing with non-English language text. The equalsIgnoreCase method also does this.
public static void main(String[] args) throws Exception {
Set<String> uniqueLines = new HashSet<>();
Files.lines(Paths.get("validation_small.csv"), StandardCharsets.UTF_8)
.filter(line -> !uniqueLines.add(line.toUpperCase().toLowerCase()))
.forEach(System.out::println);
}
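As an alternative to normalizing the case by hand, a sorted set with a case-insensitive comparator can do the comparison. This is a different technique from the one in the answer, shown only as a sketch: the class CaseInsensitiveDuplicates and the method findDuplicates are invented names, and the lines are passed in as a list instead of being read from the file. String.CASE_INSENSITIVE_ORDER compares strings the same way compareToIgnoreCase does, so it behaves like the ComparatorLine class from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class, named only for this sketch
class CaseInsensitiveDuplicates {

    // Returns every line that repeats an earlier line, ignoring case,
    // in the order in which the repeats occur
    static List<String> findDuplicates(List<String> lines) {
        // The comparator decides what counts as "the same element"
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        List<String> duplicates = new ArrayList<>();
        for (String line : lines) {
            if (!seen.add(line)) { // false: an equal line is already in the set
                duplicates.add(line);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("abc", "ABC", "x");
        System.out.println(findDuplicates(lines)); // prints [ABC]
    }
}
```

A TreeSet costs O(log n) per insertion instead of the expected constant time of a HashSet, which should only matter for very large files.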
Create a Set while reading lines from the input CSV file; any time add() returns false, print the line, because it is a duplicate.
If you want a list of all duplicate lines, create a List that collects the lines for which add() returned false.
NOTE:
I have simulated your file reading by using static data.
If your data contains only numbers and no letters, you do not need a case-insensitive comparison.
If your data does contain letters, you still do not need a special Comparator: you can insert lines into the Set using add(line.toLowerCase()), which ensures that all lines are compared in lower case.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
public class ReadValidationFile {
static List<String> validationFile = new ArrayList<>();
static {
validationFile.add("2017-10-29 00:00:00.0,\"1005\",-10227,0,0,0,332894,0,0,222,332894,222,332894");
validationFile.add("2017-10-29 00:00:00.0,\"1010\",-125529,0,0,0,420743,0,0,256,420743,256,420743");
validationFile.add("2017-10-29 00:00:00.0,\"1005\",-10227,0,0,0,332894,0,0,222,332894,222,332894");
validationFile.add("2017-10-29 00:00:00.0,\"1013\",-10625,0,0,-687,599098,0,0,379,599098,379,599098");
validationFile.add("2017-10-29 00:00:00.0,\"1604\",-1794.9,0,0,-3.99,4081.07,0,0,361,4081.07,361,4081.07");
}
public static void main(String[] args) {
// Option 1 : unique lines only
Set<String> uniqueLinesOnly = new HashSet<>(validationFile);
// Option 2 : unique lines and duplicate lines
Set<String> uniqueLines = new HashSet<>();
Set<String> duplicateLines = new HashSet<>();
for (String line : validationFile) {
if (!uniqueLines.add(line.toLowerCase())) {
duplicateLines.add(line.toLowerCase());
}
}
// Option 3 : unique lines and duplicate lines by Java Streams
Set<String> uniquesJava8 = new HashSet<>();
List<String> duplicatesJava8 = validationFile
.stream()
.filter(element -> !uniquesJava8.add(element.toLowerCase()))
.map(element -> element.toLowerCase())
.collect(Collectors.toList());
}
}
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
public class ReadValidationFile {
public static void main(String[] args){
List<String> validationFile = new ArrayList<>();
try(BufferedReader br = new BufferedReader(new FileReader("validation_small.csv"));){
String line;
while((line = br.readLine())!= null){
validationFile.add(line);
}
} catch (FileNotFoundException e) {
//e.printStackTrace();
System.out.println("file not found " + e.getMessage());
} catch (IOException e) {
e.printStackTrace();
}
Set<String> uniques = new HashSet<>();
List<String> duplicates = validationFile.stream().filter(i->!uniques.add(i)).collect(Collectors.toList());
System.out.println(duplicates);
}
}
Forgive me if this is a basic (or not very well explained) question, I am fairly new to Java and have been reading extensive material as well as trying to understand the relevant Javadoc but to no avail.
To give a brief background as to what I am trying to create, I have created a reader class which reads data in from a csv file (4 lines long) including fields such as Item ID, price, description etc. I have created a separate demo class that displays the details of this csv file (through creating an instance of my reader class) and am now trying to create a method that asks the user to input an Item ID that then displays the corresponding Item, based on the ID input by the user. The part I am stuck on is accessing specific rows/columns in a csv file and then comparing these with a given string (entered by the user which corresponds to a specific field in the csv file)
This is what I have come up with thus far:
input = new Scanner(System.in);
System.out.println("Enter a product code");
String prodC = input.next();
//Here I want to know if there is a way of accessing a field in a csv file
Any ideas would be greatly appreciated.
UPDATE
Thank you for quick responses, am currently reading through and seeing how I can try to implement the various techniques. In response to the comment asking about the file reader, this is how I have set that out:
public CatalogueReader(String filename) throws FileNotFoundException {
this.filename = filename;
this.catalogue = new Catalogue();
Scanner csvFile;
try {
csvFile = new Scanner(new File(filename));
} catch (FileNotFoundException fnf) {
throw new FileNotFoundException("File has not been found!");
}
csvFile.useDelimiter("\n");
boolean first = true;
String productCode;
double price;
String description;
double weight;
int rating;
String category;
boolean ageRestriction;
String csvRows;
while (csvFile.hasNextLine()) {
csvRows = csvFile.nextLine();
if (first) {
first = false;
continue;
}
System.out.println(csvRows);
String[] fields = csvRows.split(",");
productCode = (fields[0].trim());
price = Double.parseDouble(fields[1].trim());
description = fields[2].trim();
weight = Double.parseDouble(fields[3].trim());
rating = Integer.parseInt(fields[4].trim());
category = fields[5].trim();
ageRestriction = Boolean.parseBoolean(fields[6].trim());
catalogue.addAProduct(new Item(productCode, price, description, weight, rating, category, ageRestriction));
}
csvFile.close();
}
}
ok so for a CSV file like this:
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
Here is a sample of how to read it using only the Java standard library:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class CSVReader {
public static void main(String[] args) {
CSVReader obj = new CSVReader();
obj.run();
}
public void run() {
String csvFile = YOURFILEPATHHERE ;
BufferedReader br = null;
String line = "";
String cvsSplitBy = ",";
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
// use comma as separator
String[] country = line.split(cvsSplitBy);
System.out.println("Country [code= " + country[4]
+ " , name=" + country[5] + "]");
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
System.out.println("Done");
}
}
does this help?
If you are just doing a single look-up and then exiting then just remember the String you are looking for. As you parse the lines compare to see if you have a match and if you do then return that line.
For repeated searches that would be very inefficient though. Assuming your data set is not too large for memory you would be better off parsing the file and putting it into a Map:
Map<String, Data> dataMap = new HashMap<>();
Parse the file, putting all the lines into the map
Then the lookup just becomes:
Data d = dataMap.get(lineKey);
If d is null then there is no matching line. If it is not null then you have found your line.
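A minimal sketch of this map-based lookup might look like the following. Several things here are assumptions made for illustration: the class ProductLookup and the method index are invented, the Data type from the answer is replaced by a plain String[] of fields, the first row is treated as a header, the key is taken from column 0, and the sample rows are made-up data. The naive split(",") also assumes that no field contains a comma.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, named only for this sketch
class ProductLookup {

    // Builds a map from product code (column 0) to the fields of that row.
    // The first row is assumed to be a header and is skipped.
    static Map<String, String[]> index(List<String> csvLines) {
        Map<String, String[]> byCode = new HashMap<>();
        boolean first = true;
        for (String row : csvLines) {
            if (first) {
                first = false;
                continue;
            }
            String[] fields = row.split(","); // naive: breaks on quoted commas
            byCode.put(fields[0].trim(), fields);
        }
        return byCode;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not the asker's real file
        List<String> rows = Arrays.asList(
                "productCode,price,description",
                "A100,9.99,Teddy bear",
                "B200,24.50,Board game");
        Map<String, String[]> byCode = index(rows);

        String[] hit = byCode.get("A100");
        System.out.println(hit == null ? "no match" : hit[2].trim());

        String[] miss = byCode.get("Z999");
        System.out.println(miss == null ? "no match" : miss[2].trim());
    }
}
```

The file is parsed once, and each later lookup is a single get() call, which is what makes this approach pay off for repeated searches.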
You can create an ArrayList of objects, one object for each line in the CSV, and then search the list with your search criteria.
Use a CSVReader framework to read the CSV file. Sample code (not exactly what you want):
FileInputStream fis = new FileInputStream(file);
CSVReader reader = new CSVReader(new BufferedReader( new InputStreamReader(fis, "UTF-8" )));
ArrayList<String> row = new ArrayList<String>();
ArrayList<Entry> entries = new ArrayList<Entry>();
// a line = ID, Name, Price, Description
while (!reader.isEOF()) {
reader.readFields(row);
if( row.size() >= 4)
entries.add(new Entry(row.get(0), row.get(1), row.get(2), row.get(3)));
}
System.out.println("Size : " + entries.size());
I've been working on this code for quite some time and just want a simple heads-up if I'm heading down a dead end. Where I'm at now is matching identical cells from different .csv files and copying one row into another CSV file. The question really is: would it be possible to write at specific lines? For example, if the two cells match at row 50, I wish to write back to row 50. I'm assuming that I would maybe extract everything to a HashMap, write it in there, then write back to the .csv file? Is there an easier way?
For example, I have one CSV that has person details, and the other has property details of where the person actually lives. I wish to copy the property details to the person CSV, as well as match them up with the correct person details. Hope this makes sense.
public class Old {
public static void main(String [] args) throws IOException
{
List<String[]> cols;
List<String[]> cols1;
int row =0;
int count= 0;
boolean b;
CsvMapReader Reader = new CsvMapReader(new FileReader("file1.csv"), CsvPreference.EXCEL_PREFERENCE);
CsvMapReader Reader2 = new CsvMapReader(new FileReader("file2.csv"), CsvPreference.EXCEL_PREFERENCE);
try {
cols = readFile("file1.csv");
cols1 = readFile("fiel2.csv");
String [] headers = Reader.getCSVHeader(true);
headers = header(cols1,headers
} catch (IOException e) {
e.printStackTrace();
return;
}
for (int j =1; j<cols.size();j++) //1
{
for (int i=1;i<cols1.size();i++){
if (cols.get(j)[0].equals(cols1.get(i)[0]))
{
}
}
}
}
private static List<String[]> readFile(String fileName) throws IOException
{
List<String[]> values = new ArrayList<String[]>();
Scanner s = new Scanner(new File(fileName));
while (s.hasNextLine()) {
String line = s.nextLine();
values.add(line.split(","));
}
return values;
}
public static void csvWriter (String fileName, String [] nameMapping ) throws FileNotFoundException
{
ICsvListWriter writer = new CsvListWriter(new PrintWriter(fileName),CsvPreference.STANDARD_PREFERENCE);
try {
writer.writeHeader(nameMapping);
} catch (IOException e) {
e.printStackTrace();
}
}
public static String[] header(List<String[]> cols1, String[] headers){
List<String> list = new ArrayList<String>();
String [] add;
int count= 0;
for (int i=0;i<headers.length;i++){
list.add(headers[i]);
}
boolean c;
c= true;
while(c) {
add = cols1.get(0);
list.add(add[count]);
if (cols1.get(0)[count].equals(null))// this line is never read errpr
{
c=false;
break;
} else
count ++;
}
String[] array = new String[list.size()];
list.toArray(array);
return array;
}
Just be careful if you read all of the addresses and person details into memory first (as Thomas has suggested) - if you're only dealing with small CSV files then it's fine, but you may run out of memory if you're dealing with larger files.
As an alternative, I've put together an example that reads the addresses in first, then writes the combined person/address details while it reads in the person details.
Just a few things to note:
I've used CsvMapReader and CsvMapWriter because you were - this meant I've had to use a Map containing a Map for storing the addresses. Using CsvBeanReader/CsvBeanWriter would make this a bit more elegant.
The code from your question doesn't actually use Super CSV to read the CSV (you're using Scanner and String.split()). You'll run into issues if your CSV contains commas in the data (which is quite possible with addresses), so it's a lot safer to use Super CSV, which will handle escaped commas for you.
Example:
package example;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import org.supercsv.io.CsvMapReader;
import org.supercsv.io.CsvMapWriter;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.io.ICsvMapWriter;
import org.supercsv.prefs.CsvPreference;
public class CombiningPersonAndAddress {
private static final String PERSON_CSV = "id,firstName,lastName\n"
+ "1,philip,fry\n2,amy,wong\n3,hubert,farnsworth";
private static final String ADDRESS_CSV = "personId,address,country\n"
+ "1,address 1,USA\n2,address 2,UK\n3,address 3,AUS";
private static final String[] COMBINED_HEADER = new String[] { "id",
"firstName", "lastName", "address", "country" };
public static void main(String[] args) throws Exception {
ICsvMapReader personReader = null;
ICsvMapReader addressReader = null;
ICsvMapWriter combinedWriter = null;
final StringWriter output = new StringWriter();
try {
// set up the readers/writer
personReader = new CsvMapReader(new StringReader(PERSON_CSV),
CsvPreference.STANDARD_PREFERENCE);
addressReader = new CsvMapReader(new StringReader(ADDRESS_CSV),
CsvPreference.STANDARD_PREFERENCE);
combinedWriter = new CsvMapWriter(output,
CsvPreference.STANDARD_PREFERENCE);
// map of personId -> address (inner map is address details)
final Map<String, Map<String, String>> addresses =
new HashMap<String, Map<String, String>>();
// read in all of the addresses
Map<String, String> address;
final String[] addressHeader = addressReader.getCSVHeader(true);
while ((address = addressReader.read(addressHeader)) != null) {
final String personId = address.get("personId");
addresses.put(personId, address);
}
// write the header
combinedWriter.writeHeader(COMBINED_HEADER);
// read each person
Map<String, String> person;
final String[] personHeader = personReader.getCSVHeader(true);
while ((person = personReader.read(personHeader)) != null) {
// copy address details to person if they exist
final String personId = person.get("id");
final Map<String, String> personAddress = addresses.get(personId);
if (personAddress != null) {
person.putAll(personAddress);
}
// write the combined details
combinedWriter.write(person, COMBINED_HEADER);
}
} finally {
personReader.close();
addressReader.close();
combinedWriter.close();
}
// print the output
System.out.println(output);
}
}
Output:
id,firstName,lastName,address,country
1,philip,fry,address 1,USA
2,amy,wong,address 2,UK
3,hubert,farnsworth,address 3,AUS
From your comment, it seems like you have the following situation:
File 1 contains persons
File 2 contains addresses
You then want to match persons and addresses by some key (one or more fields) and write the combination back to a CSV file.
Thus the simplest approach might be something like this:
//use a LinkedHashMap to preserve the order of the persons as found in file 1
Map<PersonKey, String[]> persons = new LinkedHashMap<>();
//fill in the persons from file 1 here
Map<PersonKey, String[]> addresses = new HashMap<>();
//fill in the addresses from file 2 here
List<String[]> outputLines = new ArrayList<>(persons.size());
for( Map.Entry<PersonKey, String[]> personEntry: persons.entrySet() ) {
String[] person = personEntry.getValue();
String[] address = addresses.get( personEntry.getKey() );
//merge the two arrays and put them into outputLines
}
//write outputLines to a file
Note that PersonKey might just be a String or a wrapper object (Integer etc.) if you can match persons and addresses by one field. If you have more fields you might need a custom PersonKey object with equals() and hashCode() properly overridden.
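The merge step left as a comment in the outline could be filled in roughly like this. It is a sketch under assumptions: PersonKey is simplified to a plain String id, the class MergePersonsAndAddresses and its methods are invented names, persons without an address are kept unchanged, and the sample rows are borrowed from the Super CSV example earlier in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, named only for this sketch
class MergePersonsAndAddresses {

    // Appends the columns of b after the columns of a
    static String[] concat(String[] a, String[] b) {
        String[] merged = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, merged, a.length, b.length);
        return merged;
    }

    // Produces one output row per person, in the iteration order of the persons map
    static List<String[]> merge(Map<String, String[]> persons, Map<String, String[]> addresses) {
        List<String[]> outputLines = new ArrayList<>(persons.size());
        for (Map.Entry<String, String[]> personEntry : persons.entrySet()) {
            String[] person = personEntry.getValue();
            String[] address = addresses.get(personEntry.getKey());
            // Assumption: a person without an address is written unchanged
            outputLines.add(address == null ? person : concat(person, address));
        }
        return outputLines;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves the order of the persons as found in file 1
        Map<String, String[]> persons = new LinkedHashMap<>();
        persons.put("1", new String[] { "1", "philip", "fry" });
        persons.put("2", new String[] { "2", "amy", "wong" });

        Map<String, String[]> addresses = new HashMap<>();
        addresses.put("1", new String[] { "address 1", "USA" });

        for (String[] line : merge(persons, addresses)) {
            System.out.println(String.join(",", line));
        }
    }
}
```

Joining the fields with a comma is only for display; for real output a CSV writer that escapes embedded commas would be the safer choice.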