Java program is creating extra columns in csv file

I am creating a CSV file in a JDBC program that pulls data and then writes it to the CSV file. The problem is that some of the data contains commas, like "Birmingham, AL", and when the program sees the "," it creates a new column when the value should really stay in one column. The first 15000 rows come in correctly with no commas, but then commas start appearing and creating columns where they shouldn't. Is there a way to catch, avoid, or work around this? I hope I'm explaining this well enough. Feel free to ask for more information.
EDIT: Here is the snippet of code that does the work.
while (rs.next()) {
    for (int i = 1; i < cols; i++) {
        Object value = rs.getObject(i);
        if (value == null || rs.wasNull())
            out.write("NULL" + ",");
        else
            out.write(value.toString() + ",");
    }
    out.newLine();
}
out.close();
writer.close();

Instead of
out.write(value.toString() + ",");
try
out.write("\"" + value.toString() + "\",");

Related

How can I format output values?

This is my output file. I used the standard System.out.print() for the writing operation. There is a tab between each value, but due to the differences in word size, the format looks bad. How can I fix the format of the output?
This should print an element of your String per column, one column every two tabs. If you need more tabs because the names are too long, you can increase the number of tabs at the lines marked //here, remembering that every tab has a length of 8 characters.
try {
    FileOutputStream file = new FileOutputStream("filename.txt", true);
    PrintStream out = new PrintStream(file);
    String s = line;
    String[] items = s.split("\t");
    for (String item : items) {
        if (item.length() < 8) { //here
            out.print(item + "\t\t"); //here
        } else if (item.length() < 16) { //here
            out.print(item + "\t");
        } else {
            out.print(item); // already wide enough, no padding
        }
    }
    out.println();
    out.close();
} catch (IOException e) {
    System.out.println("Error: " + e);
    System.exit(1);
}
You can write to a .csv file, which can be opened in Excel and will give you the data divided into columns.
Use System.out.format; you can specify a width to make things consistent.
Check out the Oracle Tutorial explanation.
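For example (a minimal sketch with made-up sample values), %-20s left-justifies each item in a 20-character field, so the columns line up regardless of word length:

// Each %-20s pads its argument to 20 characters, left-justified.
System.out.format("%-20s%-20s%n", "short", "value1");
System.out.format("%-20s%-20s%n", "averylongitemname", "value2");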

When copying the values from one ArrayList to another, all values get set as the last value entered in the first ArrayList

I am having some issues writing a small program for myself. I have a class that reads from a CSV and puts the data into an ArrayList of the class Farts (short for fighting arts, before you ask). This works; when I test my output I can see the values read in from the CSV.
AssetManager assetManager = context.getAssets();
ArrayList<Farts> arrayfarts = new ArrayList<Farts>();
InputStream csvStream = null;
System.out.println("right before try loop");
try {
    csvStream = assetManager.open("farts.txt");
    InputStreamReader csvStreamReader = new InputStreamReader(csvStream);
    CSVReader csvreader = new CSVReader(csvStreamReader);
    Farts fdata = new Farts();
    String[] line;
    int temp = 0;
    while ((line = csvreader.readNext()) != null) {
        System.out.println("inside of while");
        fdata.fname = line[0];
        fdata.description = line[1];
        arrayfarts.add(temp, fdata);
        System.out.println("Array iteration " + temp + " of " + arrayfarts.size());
        System.out.println(arrayfarts.get(temp).fname + " " + arrayfarts.get(temp).description + "\n");
        temp++;
    }
} catch (IOException e) {
    e.printStackTrace();
    System.out.println("ioexception" + e.toString());
}
return arrayfarts;
}
Back in my Main class I then use .addAll to add what is returned from above. However, the ArrayList is populated with only the last value entered into it.
arrayFarts.addAll(readcsv.retrieveFarts(this));
System.out.println(arrayFarts.get(0).fname + " " + arrayFarts.get(0).description + "\n");
int temp = 0;
while (temp < arrayFarts.size()) {
    cupboard().withDatabase(db).put(arrayFarts.get(temp));
    System.out.println("Array iteration " + temp + " of " + arrayFarts.size());
    System.out.println(arrayFarts.get(temp).fname + " " + arrayFarts.get(temp).description + "\n");
    temp++;
}
Am I missing something?
The problem lies in these lines:
fdata.fname=line[0];
fdata.description=line[1];
arrayfarts.add(temp,fdata);
You need to create a new Farts instance each loop, not change a single instance.
As it stands, you simply add the same instance multiple times to the list, while changing its values. This means that you will see only the last parsed values after the loop completes. You will also see it multiple times, as ArrayList allows duplicates.
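Concretely, the fix is to move the construction inside the loop:

String[] line;
int temp = 0;
while ((line = csvreader.readNext()) != null) {
    Farts fdata = new Farts(); // a fresh instance for every row
    fdata.fname = line[0];
    fdata.description = line[1];
    arrayfarts.add(fdata);
    temp++;
}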

Deleting a specific line from a text file in Java

I am setting up a rank system where each member has a username and a rank. The program reads the username and rank from a text file and assigns it to the user.
One username and rank per line, such as:
user1 1
user2 2
user3 3
I have set up a program to add usernames and ranks to the text file; however, I cannot figure out how to delete a specific user from the list. For example, I might want to delete only user2 and his/her rank while leaving the other two, and it is important that no blank line is left behind afterwards.
Just for reference here is the code for how I write it to the file in the first place:
try {
    BufferedWriter out = new BufferedWriter(new FileWriter("stafflist.txt", true));
    for (int i = 0; i < 1; i++) {
        out.newLine();
        out.write(target.getUsername() + " " + target.getRights());
    }
    out.close();
    SerializableFilesManager.savePlayer(target);
    if (loggedIn) {
        target.getPackets().sendGameMessage(modString + Utils.formatPlayerNameForDisplay(member.getUsername()) + "!", true);
    }
    member.getPackets().sendGameMessage(successString + Utils.formatMemberNameForDisplay(target.getUsername()) + " to a Moderator.", true);
    loggedIn = false;
} catch (IOException e) {
    System.out.println("GiveMod - Can't find stafflist.txt");
}
return true;
You cannot delete data from the middle of a file (without leaving null bytes); you need to rewrite at least everything that comes after it. A simple solution would be to load everything into memory, remove that line, and dump the collection back out.
An alternative solution would be to:
Open a FileChannel from a RandomAccessFile.
Read the file line by line, keeping the file pointer of each line head (fileChannel.position(); file.readLine();). Load what comes after the target line into a collection, truncate the file at that position (file.setLength(linePosition);), and then dump the collection at the end of the file. A sketch of this approach follows below.
If your data doesn't fit in memory, you can use a temp file instead of a collection: create one with File.createTempFile(...), read the remaining data line by line and write it to the temp file, truncate the original file, then read the temp file line by line and write back to the original.
OR, guess what, use a database.
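Here is a minimal sketch of the truncate-and-rewrite approach (assuming the single-byte encoding used by RandomAccessFile.readLine() is acceptable for the data):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Removes the first line starting with the given prefix by truncating the
// file at that line and rewriting only what came after it.
static void removeLine(String path, String prefix) throws IOException {
    try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
        long linePosition = -1;
        List<String> tail = new ArrayList<>();
        String line;
        while (true) {
            long head = file.getFilePointer(); // position of the next line
            line = file.readLine();
            if (line == null)
                break;
            if (linePosition < 0 && line.startsWith(prefix))
                linePosition = head; // the line to delete
            else if (linePosition >= 0)
                tail.add(line); // everything after it
        }
        if (linePosition < 0)
            return; // user not found, nothing to do
        file.setLength(linePosition); // truncate at the deleted line
        file.seek(linePosition);
        for (String t : tail) {
            file.writeBytes(t);
            file.writeBytes(System.lineSeparator());
        }
    }
}

Called as removeLine("stafflist.txt", "user2 "), this removes user2 without leaving a blank line behind.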
There seems to be an issue in your for loop: it only runs from 0 to 1, so I think the output you posted is incorrect. Anyway, if you want to write out only certain lines, you can filter them as follows:
for (int i = 0; i < 1; i++) {
    if (!target.getUsername().equals("user2")) {
        out.newLine();
        out.write(target.getUsername() + " " + target.getRights());
    }
}
Read the file into some Collection, remove desired users and rewrite the file using the modified Collection.
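A minimal sketch of that approach, with "user2" standing in for whichever username should go:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Read everything, drop the matching line, rewrite the whole file.
static void removeUser(String username) throws IOException {
    Path path = Paths.get("stafflist.txt");
    // copy into a new ArrayList in case the returned list is unmodifiable
    List<String> lines = new ArrayList<>(Files.readAllLines(path));
    lines.removeIf(l -> l.startsWith(username + " "));
    Files.write(path, lines); // no blank line is left behind
}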

creating large csv files in Java getting really slow

I have a performance problem when trying to create a CSV file starting from another CSV file.
This is how the original file looks:
country,state,co,olt,olu,splitter,ont,cpe,cpe.latitude,cpe.longitude,cpe.customer_class,cpe.phone,cpe.ip,cpe.subscriber_id
COUNTRY-0001,STATE-0001,CO-0001,OLT-0001,OLU0001,SPLITTER-0001,ONT-0001,CPE-0001,28.21487,77.451775,ALL,SIP:+674100002743#IMS.COMCAST.NET,SIP:E28EDADA06B2#IMS.COMCAST.NET,CPE_SUBSCRIBER_ID-QHLHW4
COUNTRY-0001,STATE-0002,CO-0002,OLT-0002,OLU0002,SPLITTER-0002,ONT-0002,CPE-0002,28.294018,77.068924,ALL,SIP:+796107443092#IMS.COMCAST.NET,SIP:58DD999D6466#IMS.COMCAST.NET,CPE_SUBSCRIBER_ID-AH8NJQ
Potentially there could be millions of lines like this; I have detected the problem with 1,280,000 lines.
This is the algorithm:
File csvInputFile = new File(csv_path);
int blockSize = 409600;
FileReader frCsvInputFile = new FileReader(csvInputFile); // declaration missing from the original snippet
brCsvInputFile = new BufferedReader(frCsvInputFile, blockSize);
String line = null;
StringBuilder sbIntermediate = new StringBuilder();
skipFirstLine(brCsvInputFile);
while ((line = brCsvInputFile.readLine()) != null) {
    createIntermediateStringBuffer(sbIntermediate, line.split(REGEX_COMMA));
}
private static void skipFirstLine(BufferedReader br) throws IOException {
    String line = br.readLine();
    String[] splitLine = line.split(REGEX_COMMA);
    LOGGER.debug("First line detected! ");
    createIndex(splitLine);
    createIntermediateIndex(splitLine);
}

private static void createIndex(String[] splitLine) {
    LOGGER.debug("START method createIndex.");
    for (int i = 0; i < splitLine.length; i++)
        headerIndex.put(splitLine[i], i);
    printMap(headerIndex);
    LOGGER.debug("COMPLETED method createIndex.");
}

private static void createIntermediateIndex(String[] splitLine) {
    LOGGER.debug("START method createIntermediateIndex.");
    com.tekcomms.c2d.xml.model.v2.Metadata_element[] metadata_element = null;
    String[] servicePath = newTopology.getElement().getEntity().getService_path().getLevel();
    if (newTopology.getElement().getMetadata() != null)
        metadata_element = newTopology.getElement().getMetadata().getMetadata_element();
    LOGGER.debug(servicePath.toString());
    LOGGER.debug(metadata_element.toString());
    headerIntermediateIndex.clear();
    int indexIntermediateId = 0;
    for (int i = 0; i < servicePath.length; i++) {
        String level = servicePath[i];
        LOGGER.debug("level is: " + level);
        headerIntermediateIndex.put(level, indexIntermediateId);
        indexIntermediateId++;
        // its identificator is going to be located to the next one
        headerIntermediateIndex.put(level + "ID", indexIntermediateId);
        indexIntermediateId++;
    }
    // adding cpe.latitude,cpe.longitude,cpe.customer_class, it could be
    // better if it would be metadata as well.
    String labelLatitude = newTopology.getElement().getEntity().getLatitude();
    // indexIntermediateId++;
    headerIntermediateIndex.put(labelLatitude, indexIntermediateId);
    String labelLongitude = newTopology.getElement().getEntity().getLongitude();
    indexIntermediateId++;
    headerIntermediateIndex.put(labelLongitude, indexIntermediateId);
    String labelCustomerClass = newTopology.getElement().getCustomer_class();
    indexIntermediateId++;
    headerIntermediateIndex.put(labelCustomerClass, indexIntermediateId);
    // adding metadata
    // cpe.phone,cpe.ip,cpe.subscriber_id,cpe.vendor,cpe.model,cpe.customer_status,cpe.contact_telephone,cpe.address,
    // cpe.city,cpe.state,cpe.zip,cpe.bootfile,cpe.software_version,cpe.hardware_version
    // now i need to iterate over each Metadata_element belonging to
    // topology.element.metadata
    // are there any metadata?
    if (metadata_element != null && metadata_element.length != 0)
        for (int j = 0; j < metadata_element.length; j++) {
            String label = metadata_element[j].getLabel();
            label = label.toLowerCase();
            LOGGER.debug(" ==label: " + label + " index_pos: " + j);
            indexIntermediateId++;
            headerIntermediateIndex.put(label, indexIntermediateId);
        }
    printMap(headerIntermediateIndex);
    LOGGER.debug("COMPLETED method createIntermediateIndex.");
}
Reading the entire dataset of 1,280,000 lines takes only 800 ms, so the problem is in this method:
private static void createIntermediateStringBuffer(StringBuilder sbIntermediate, String[] splitLine)
        throws ClassCastException, NullPointerException {
    LOGGER.debug("START method createIntermediateStringBuffer.");
    long start, end;
    start = System.currentTimeMillis();
    ArrayList<String> hashes = new ArrayList<String>();
    com.tekcomms.c2d.xml.model.v2.Metadata_element[] metadata_element = null;
    String[] servicePath = newTopology.getElement().getEntity().getService_path().getLevel();
    LOGGER.debug(servicePath.toString());
    if (newTopology.getElement().getMetadata() != null) {
        metadata_element = newTopology.getElement().getMetadata().getMetadata_element();
        LOGGER.debug(metadata_element.toString());
    }
    for (int i = 0; i < servicePath.length; i++) {
        String level = servicePath[i];
        LOGGER.debug("level is: " + level);
        if (splitLine.length > getPositionFromIndex(level)) {
            String name = splitLine[getPositionFromIndex(level)];
            sbIntermediate.append(name);
            hashes.add(name);
            sbIntermediate.append(REGEX_COMMA).append(HashUtils.calculateHash(hashes)).append(REGEX_COMMA);
            LOGGER.debug(" ==sbIntermediate: " + sbIntermediate.toString());
        }
    }
    // end = System.currentTimeMillis();
    // LOGGER.info("COMPLETED adding name hash. " + (end - start) + " ms. " + (end - start) / 1000 + " seg.");
    // adding cpe.latitude,cpe.longitude,cpe.customer_class, it should be
    // better if it would be metadata as well.
    String labelLatitude = newTopology.getElement().getEntity().getLatitude();
    if (splitLine.length > getPositionFromIndex(labelLatitude)) {
        String lat = splitLine[getPositionFromIndex(labelLatitude)];
        sbIntermediate.append(lat).append(REGEX_COMMA);
    }
    String labelLongitude = newTopology.getElement().getEntity().getLongitude();
    if (splitLine.length > getPositionFromIndex(labelLongitude)) {
        String lon = splitLine[getPositionFromIndex(labelLongitude)];
        sbIntermediate.append(lon).append(REGEX_COMMA);
    }
    String labelCustomerClass = newTopology.getElement().getCustomer_class();
    if (splitLine.length > getPositionFromIndex(labelCustomerClass)) {
        String customerClass = splitLine[getPositionFromIndex(labelCustomerClass)];
        sbIntermediate.append(customerClass).append(REGEX_COMMA);
    }
    // end = System.currentTimeMillis();
    // LOGGER.info("COMPLETED adding lat,lon,customer. " + (end - start) + " ms. " + (end - start) / 1000 + " seg.");
    // watch out metadata are optional, it can appear as a void chain!
    if (metadata_element != null && metadata_element.length != 0)
        for (int j = 0; j < metadata_element.length; j++) {
            String label = metadata_element[j].getLabel();
            LOGGER.debug(" ==label: " + label + " index_pos: " + j);
            if (splitLine.length > getPositionFromIndex(label)) {
                String actualValue = splitLine[getPositionFromIndex(label)];
                if (!"".equals(actualValue))
                    sbIntermediate.append(actualValue).append(REGEX_COMMA);
                else
                    sbIntermediate.append("").append(REGEX_COMMA);
            } else
                sbIntermediate.append("").append(REGEX_COMMA);
            LOGGER.debug(" ==sbIntermediate: " + sbIntermediate.toString());
        } // for
    sbIntermediate.append("\n");
    end = System.currentTimeMillis();
    LOGGER.info("COMPLETED method createIntermediateStringBuffer. " + (end - start) + " ms. ");
}
As you can see, this method appends a precalculated line to the StringBuilder: the program reads every line from the input CSV file, calculates new data from it, and finally appends the generated line to the StringBuilder, so at the end I can create the output file from that buffer.
I have run jconsole and I can see that there are no memory leaks; I can see the sawtooth pattern representing the creation of objects and the GC collecting garbage, and it never crosses the heap threshold.
One thing I have noticed is that the time needed to add a new line to the StringBuilder starts out within a very few ms (5, 6, 10), but rises over time to 100-200 ms, and I suspect it will keep growing, so this is probably the bottleneck.
I have tried to analyze the code. I know that there are 3 for loops, but they are very short; the first loop iterates over only 8 elements:
for (int i = 0; i < servicePath.length; i++) {
    String level = servicePath[i];
    LOGGER.debug("level is: " + level);
    if (splitLine.length > getPositionFromIndex(level)) {
        String name = splitLine[getPositionFromIndex(level)];
        sbIntermediate.append(name);
        hashes.add(name);
        sbIntermediate.append(REGEX_COMMA).append(HashUtils.calculateHash(hashes)).append(REGEX_COMMA);
        LOGGER.debug(" ==sbIntermediate: " + sbIntermediate.toString());
    }
}
I have measured the time needed to get the name from the splitLine and it is negligible, 0 ms; the same for the calculateHash method, 0 ms.
The other loops are practically the same, iterating from 0 to n where n is a very small int, 3 to 10 for example, so I do not understand why the method takes longer and longer to finish. The only thing I can find is that appending a new line to the buffer is slowing the process down.
I am thinking about a producer-consumer multithreaded strategy: a reader thread that reads every line and puts it into a circular buffer, other threads that take the lines one by one, process them, and append a precalculated line to the StringBuffer (which is thread safe); when the file is fully read, the reader thread sends a message to the other threads telling them to stop. Finally I have to save this buffer to a file. What do you think? Is this a good idea?
Maybe, but it's quite a lot of work, I'd try something simpler first.
line.split(REGEX_COMMA)
Your REGEX_COMMA is a String which gets compiled into a regex a million times. It's trivial, but I'd try using a Pattern instead.
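A minimal sketch of that change (COMMA is a made-up constant name):

import java.util.regex.Pattern;

// Compiled once and reused for every line; line.split(REGEX_COMMA)
// recompiles the regex on every call.
private static final Pattern COMMA = Pattern.compile(",");

// then, inside the read loop:
String[] splitLine = COMMA.split(line);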
You're producing a lot of garbage with your split. Maybe you should avoid it by manually splitting the input into a reused ArrayList<String> (it's just a few lines).
If all you need is to write the result into a file, it might be better to avoid building one huge String. Maybe a List<String> or even a List<StringBuilder> would be better; maybe writing directly to a buffered stream would do.
You seem to be working with ASCII only. Your encoding is platform dependent, which may mean you're using UTF-8, which is possibly slow. Switching to a simpler encoding could help.
Working with byte[] instead of String would most probably help. Bytes are half as big as chars and there's no conversion needed when reading a file. All the operations you do can be done with bytes equally easily.
One thing I have noticed is that the time needed to add a new line to the StringBuilder starts out within a very few ms (5, 6, 10), but rises over time to 100-200 ms, and I suspect it will keep growing, so this is probably the bottleneck.
That's resizing, which could be sped up by using the suggested ArrayList<String>, as the amount of data to be copied is much lower. Writing the data out when the buffer gets big would do as well.
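For instance, a sketch that streams each generated line straight to a buffered writer instead of accumulating one giant buffer (out_path and transformLine are hypothetical stand-ins for the output file and the per-line work of createIntermediateStringBuffer):

import java.io.*;

try (BufferedReader in = new BufferedReader(new FileReader(csv_path), blockSize);
     BufferedWriter out = new BufferedWriter(new FileWriter(out_path))) {
    String line = in.readLine(); // consume the header
    while ((line = in.readLine()) != null) {
        out.write(transformLine(line)); // hypothetical per-line transform
        out.newLine();
    }
}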
I have measured the time needed to get the name from the splitLine and it is negligible, 0 ms; the same for the calculateHash method, 0 ms.
Never use currentTimeMillis for this, as nanoTime is strictly better. Use a profiler. The problem with a profiler is that it changes what it should measure. As a poor man's profiler, you can compute the sum of all the time spent inside the suspect method and compare it with the total time.
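A sketch of that poor man's profiler applied to the read loop from the question:

long inSuspect = 0L;
long totalStart = System.nanoTime();
while ((line = brCsvInputFile.readLine()) != null) {
    long t0 = System.nanoTime();
    createIntermediateStringBuffer(sbIntermediate, line.split(REGEX_COMMA));
    inSuspect += System.nanoTime() - t0;
}
long totalNanos = System.nanoTime() - totalStart;
LOGGER.info("suspect method: " + inSuspect / 1_000_000 + " ms of " + totalNanos / 1_000_000 + " ms total");

If the suspect method accounts for nearly all of the total time, the problem really is inside it.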
What's the CPU load and what does GC do when running the program?
I used the Super CSV library in my project to handle large sets of lines; it is relatively fast compared to reading the lines manually.
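A minimal Super CSV read loop (assuming the super-csv jar is on the classpath) looks roughly like this:

import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import org.supercsv.io.CsvListReader;
import org.supercsv.prefs.CsvPreference;

// Reads each row as a list of columns; quoting and embedded commas are
// handled by the library.
static void readWithSuperCsv(String path) throws IOException {
    try (CsvListReader reader = new CsvListReader(new FileReader(path),
            CsvPreference.STANDARD_PREFERENCE)) {
        reader.getHeader(true); // consume the header row
        List<String> row;
        while ((row = reader.read()) != null) {
            // process row.get(0), row.get(1), ...
        }
    }
}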

How do I add data from a file to a JTable?

Here is my method so far:
public void readfile(JTable table) {
    try {
        BufferedReader in = new BufferedReader(new FileReader("out.txt"));
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 5; j++) {
                table.setValueAt(in.readLine(), i, j);
            }
        }
        in.close();
    } catch (Exception e) {
        System.err.println("error: " + e.getMessage());
    }
}
Here are the contents of out.txt:
test1
test2
test3
test4
test5
When I run the program and attempt to load the file into the table, nothing happens. I also get output that says the following:
error: 0 >= 0
Help me please?
I would narrow your problem down to a smaller problem, solve that smaller problem, and then widen it until you have what you want.
Think of the contents of a File as a big blob of text.
Think of the table as a Vector of Vectors.
Smaller problem: how do I convert a big blob of text into a Vector of Vectors? You need to be able to solve this problem first, before tackling file I/O or DefaultTableModels.
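A minimal sketch of that path, assuming one value per line (as in out.txt) and a single made-up column named "Value":

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Vector;
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;

// Build the Vector of Vectors first, then hand it to a DefaultTableModel,
// so the table actually has rows to put values into (calling setValueAt on
// an empty model is what produces the "0 >= 0" error).
static JTable tableFromFile(String path) throws IOException {
    Vector<Vector<Object>> rows = new Vector<>();
    try (BufferedReader in = new BufferedReader(new FileReader(path))) {
        String line;
        while ((line = in.readLine()) != null) {
            Vector<Object> row = new Vector<>();
            row.add(line);
            rows.add(row);
        }
    }
    Vector<String> columns = new Vector<>();
    columns.add("Value");
    return new JTable(new DefaultTableModel(rows, columns));
}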
