Is it possible to access a specific row in a CSV file and replace a specific field in it? E.g. like in this pseudo code:
Row row = csvFile.getRow(123);
row.getField(2).set("someString");
Tried Apache Commons CSV and OpenCSV, but can't figure out how to do it with them. The only way I can think of is to iterate through the file, store everything in a new object and write out a new file, but that doesn't seem like a good approach.
Thanks for any suggestion!
Sounds like a duplicate of this one to me.
I am not sure how to do it with Apache Commons CSV, but I am sure it has a similar mechanism. In openCSV you would create a CSVReader for your source file and a CSVWriter for your destination file, then read a record at a time, make any desired changes and write it out.
Or, if the file is small, you can do a readAll and writeAll.
But if you want to do it all in the same file read my comments in the duplicate.
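For example, here is a rough openCSV sketch of that read-modify-write copy (the file names are placeholders; row 123 and field 2 are the values from the question):

import java.io.FileReader;
import java.io.FileWriter;
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;

public class ReplaceField {
    public static void main(String[] args) throws Exception {
        try (CSVReader reader = new CSVReader(new FileReader("in.csv"));
             CSVWriter writer = new CSVWriter(new FileWriter("out.csv"))) {
            String[] row;
            int rowNumber = 0;
            while ((row = reader.readNext()) != null) {
                if (rowNumber == 123 && row.length > 2) {
                    row[2] = "someString";     // change the field while copying
                }
                writer.writeNext(row);
                rowNumber++;
            }
        }
    }
}

For a small file you could do the same with readAll() and writeAll() instead of the loop.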
Related
I would like to update a .csv file containing 1 million records. I want to update it in place, without creating another file. I am using opencsv, but I can't find any way to update the .csv file with it. If anyone can help me out, that would be great.
For clarification, take this small .csv file as an example:
Initial csv file:
a,b,c,d
1,2,3,4
5,6,7,8
12,13,14,15
Desired csv file:
a,b,c,d,e,f
1,2,3,4, ,17
5,6,7,8,16,
12,13,14,15,
Basically, you cannot do that.
As CSV does not use fixed-length records, most changes require moving data up or down in the file, which cannot be done without completely rewriting it. For example, in a file like
1,2,3,4
5,6,7,8
...
changing the 8 to a 10 would require every byte from that location onwards to be shifted one position further into the file.
To achieve the effect of editing the file, the usual approach is to copy the file under a new name, making the changes you want during the copy, and then rename the files so that the new one replaces the old one.
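A sketch of that copy-and-rename approach with plain java.nio; the file names and the per-row edit shown here (appending two empty columns for e and f) are only placeholders for whatever change you actually need:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class RewriteCsv {
    public static void main(String[] args) throws Exception {
        Path original = Paths.get("data.csv");
        Path temp = Paths.get("data.csv.tmp");

        try (BufferedReader in = Files.newBufferedReader(original);
             BufferedWriter out = Files.newBufferedWriter(temp)) {
            String header = in.readLine();
            if (header != null) {
                out.write(header + ",e,f");        // extend the header row
                out.newLine();
            }
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line + ",,");            // apply your change per record
                out.newLine();
            }
        }

        // replace the old file with the new one
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }
}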
Normally the file system does not allow you to update a file in place like this; you need to rewrite it. Even though everything is in one file, it is stored as a sequence of segments on disk, so changing the length of one record would damage the segments that follow it. So you need to rewrite everything.
I'm trying to convert a properties file into an Excel file, but I don't know how.
It looks like this in NetBeans, but I can't copy an entire column or everything at once; it only lets me copy the values one by one (and there is a lot of data).
Does anyone know how to convert this into an Excel file, or at least copy an entire column?
EDIT1: I'm looking for a non-programming answer, because I think NetBeans can do something like converting this into an Excel file. (My English is not the best; I hope you can understand what I'm trying to say.)
You could read the file like this:
Read the file to a String using Files.readAllBytes() (or straight to a list of lines with Files.readAllLines()) and then split the content into a list of Strings.
Then use a Java library like Apache POI to build the spreadsheet; here is an example using Excel:
http://poi.apache.org/spreadsheet/how-to.html#sxssf
Good luck!!
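A rough sketch of that approach with Apache POI (XSSF); the file names and the key=value split are assumptions about the .properties layout:

import java.io.FileOutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class PropertiesToExcel {
    public static void main(String[] args) throws Exception {
        // read the file into a list of Strings, one entry per line
        List<String> lines = Files.readAllLines(Paths.get("messages.properties"));

        Workbook workbook = new XSSFWorkbook();
        Sheet sheet = workbook.createSheet("properties");
        int rowNum = 0;
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) continue;   // skip blanks and comments
            String[] kv = line.split("=", 2);                       // key=value
            Row row = sheet.createRow(rowNum++);
            row.createCell(0).setCellValue(kv[0].trim());
            if (kv.length > 1) row.createCell(1).setCellValue(kv[1].trim());
        }
        try (FileOutputStream out = new FileOutputStream("messages.xlsx")) {
            workbook.write(out);
        }
    }
}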
Use JExcelApi, http://jexcelapi.sourceforge.net/
Download Link: http://sourceforge.net/projects/jexcelapi/files/jexcelapi/2.6.12/
Tutorial: http://www.andykhan.com/jexcelapi/tutorial.html
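A short JExcelApi sketch (the file name and cell contents are just examples); reading the .properties lines would work the same way as in the previous answer:

import java.io.File;
import jxl.Workbook;
import jxl.write.Label;
import jxl.write.WritableSheet;
import jxl.write.WritableWorkbook;

public class JxlExample {
    public static void main(String[] args) throws Exception {
        WritableWorkbook workbook = Workbook.createWorkbook(new File("messages.xls"));
        WritableSheet sheet = workbook.createSheet("properties", 0);
        sheet.addCell(new Label(0, 0, "some.key"));      // column, row, content
        sheet.addCell(new Label(1, 0, "some value"));
        workbook.write();
        workbook.close();
    }
}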
You can save the properties file as a csv file:
just rename it from [filename].properties to [filename].csv.
Then you can import the csv file in Excel using the import function, which can be found in the menu under Data -> From Text.
After clicking on From Text, a popup should open in which you can set the import options.
In the first step of the popup window, select Delimited and go to the next step.
Here you can choose which symbol is used as the separator of the csv file.
Just enter = under Separator -> Other and press Finish.
You can also find good documentation about how to import csv into Excel here: https://superuser.com/questions/407082/easiest-way-to-open-csv-with-commas-in-excel
I need to read an Excel (.xls) file stored on a Hadoop cluster. I did some research and found out that I need to create a custom InputFormat for that. I read many articles, but none of them is helpful from a programming point of view. It would help if someone could share sample code for writing a custom InputFormat, so that I can understand the basics of programming an InputFormat and can use the Apache POI library to read the Excel file.
I have already written a MapReduce program that reads a text file. Even if I somehow manage to code my own custom InputFormat, where does that code go in relation to the MapReduce program I have already written?
PS: converting the .xls file into a .csv file is not an option.
Yes, you should create a RecordReader to read each record from your Excel document, and inside that RecordReader use a POI-like API to read the Excel file. More precisely, do the following (a skeleton is sketched below):
Extend FileInputFormat to create your own CustomInputFormat and override getRecordReader (createRecordReader in the newer mapreduce API).
Create a CustomRecordReader by extending RecordReader; here you have to define how to generate a key/value pair from a given FileSplit.
So first read the bytes from the FileSplit, and from those buffered bytes extract the desired key and value using POI.
You can check my own CustomInputFormat and RecordReader for dealing with custom data objects here:
myCustomInputFormat
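A minimal sketch of such a skeleton, assuming the newer org.apache.hadoop.mapreduce API and Apache POI's HSSF classes; the class names, the LongWritable/Text key/value types and the crude row-to-string conversion are all assumptions:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

public class ExcelInputFormat extends FileInputFormat<LongWritable, Text> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;   // an .xls workbook cannot be processed in independent chunks
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split, TaskAttemptContext context) {
        return new ExcelRecordReader();
    }

    // Emits one key/value pair per spreadsheet row: (row number, comma-joined cells).
    public static class ExcelRecordReader extends RecordReader<LongWritable, Text> {
        private Iterator<Row> rows;
        private final LongWritable key = new LongWritable();
        private final Text value = new Text();

        @Override
        public void initialize(InputSplit split, TaskAttemptContext context) throws IOException {
            Path path = ((FileSplit) split).getPath();
            FileSystem fs = path.getFileSystem(context.getConfiguration());
            try (FSDataInputStream in = fs.open(path)) {
                // loads the whole workbook into memory; fine for modest .xls files
                rows = new HSSFWorkbook(in).getSheetAt(0).iterator();
            }
        }

        @Override
        public boolean nextKeyValue() {
            if (rows == null || !rows.hasNext()) {
                return false;
            }
            Row row = rows.next();
            StringBuilder sb = new StringBuilder();
            for (Cell cell : row) {
                if (sb.length() > 0) sb.append(',');
                sb.append(cell.toString());     // crude cell-to-string conversion
            }
            key.set(row.getRowNum());
            value.set(sb.toString());
            return true;
        }

        @Override public LongWritable getCurrentKey() { return key; }
        @Override public Text getCurrentValue() { return value; }
        @Override public float getProgress() { return rows != null && rows.hasNext() ? 0.0f : 1.0f; }
        @Override public void close() { }
    }
}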
Your research is correct. You need a custom InputFormat for Hadoop. If you are lucky, somebody already created one for your use case.
If not, I would suggest looking for a Java library that is able to read Excel files.
Since Excel is a proprietary file format, it is unlikely that you will find an implementation that works perfectly.
Once you have found a library that is able to read Excel files, integrate it with the InputFormat.
To do so, you have to extend Hadoop's FileInputFormat. The RecordReader returned by your ExcelInputFormat must return the rows from your Excel file. You probably also have to override getSplits() (or isSplitable()) to tell the framework not to split the file at all.
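Regarding where this plugs into an existing MapReduce program: assuming the newer mapreduce API and the hypothetical ExcelInputFormat sketched above, the only change is in the driver's job setup, roughly:

// inside the existing driver
Job job = Job.getInstance(new Configuration(), "read xls");
job.setInputFormatClass(ExcelInputFormat.class);                  // instead of TextInputFormat
FileInputFormat.addInputPath(job, new Path("/data/input.xls"));
// mapper, reducer and output settings stay exactly as before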
I have about 100 different text files in the same format. Each file has observations about certain events at certain time periods for a particular person. I am trying to parse out the relevant information for a few individuals from each of the text files. Ideally, I want to parse through all this information and create a CSV file (eventually to be imported into Excel) with all the information I need from ALL the text files. Any help/ideas would be greatly appreciated. I would prefer to use Java, but simpler methods are appreciated too.
The log files are structured as below (data changed to preserve private information):
|||ID||NAME||TIME||FIRSTMEASURE^DATA||SECONDMEASURE^DATA|| etc...
TIME appears as 20110825111501 for 2011-08-25 11:15:01 AM.
Here are the steps in Java (a rough sketch follows these steps):
Open the file using a FileReader.
You could also wrap the FileReader in a BufferedReader and use its readLine() method to read the file line by line.
Parse each line. You know the data definition of each line best; various String methods or Java regex can help.
You can do the same for the date; check whether DateFormat (for example SimpleDateFormat) fits.
Once you have parsed the data, you can start building your CSV file, either with a CSV library (such as Apache Commons CSV) or by writing it yourself using a FileOutputStream.
When you are ready to convert to Excel, you could use Apache POI.
Let me know if you need further clarification.
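A rough sketch of those steps, assuming the field order from the sample line (ID, NAME, TIME, then MEASURE^DATA pairs), a logs directory of .txt files, and one output row per measurement; all of those are assumptions:

import java.io.BufferedReader;
import java.io.FileWriter;
import java.io.PrintWriter;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.text.SimpleDateFormat;
import java.util.Date;

public class LogsToCsv {
    public static void main(String[] args) throws Exception {
        SimpleDateFormat inFormat = new SimpleDateFormat("yyyyMMddHHmmss");   // e.g. 20110825111501
        SimpleDateFormat outFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        try (PrintWriter csv = new PrintWriter(new FileWriter("combined.csv"));
             DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get("logs"), "*.txt")) {
            csv.println("id,name,time,measure,value");                       // CSV header
            for (Path file : files) {
                try (BufferedReader in = Files.newBufferedReader(file)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // |||ID||NAME||TIME||MEASURE^DATA||... -> drop leading pipes, split on ||
                        String[] f = line.replaceFirst("^\\|+", "").split("\\|\\|");
                        if (f.length < 3) continue;                          // skip malformed lines
                        Date time = inFormat.parse(f[2]);
                        for (int i = 3; i < f.length; i++) {                 // MEASURE^DATA pairs
                            String[] m = f[i].split("\\^", 2);
                            csv.printf("%s,%s,%s,%s,%s%n", f[0], f[1],
                                    outFormat.format(time), m[0], m.length > 1 ? m[1] : "");
                        }
                    }
                }
            }
        }
    }
}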
Just parse through the text files and use Apache Commons CSV to write the result to a csv file. Additionally, if you want to write to Excel, you can simply use Apache POI or JXL for that.
You can use Super CSV to parse the file into a bean and also to create a CSV file.
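A minimal Super CSV sketch of the bean-based read/write API; the Person bean, the file names and the column mapping are assumptions (for the log files above you would first parse each line into the bean yourself, and then only the writing half applies):

import java.io.FileReader;
import java.io.FileWriter;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class SuperCsvExample {
    // hypothetical bean: public no-arg constructor plus getters/setters matching the mapping
    public static class Person {
        private String id, name, time;
        public String getId()   { return id; }   public void setId(String id)     { this.id = id; }
        public String getName() { return name; } public void setName(String name) { this.name = name; }
        public String getTime() { return time; } public void setTime(String time) { this.time = time; }
    }

    public static void main(String[] args) throws Exception {
        String[] mapping = { "id", "name", "time" };
        try (ICsvBeanReader reader = new CsvBeanReader(new FileReader("in.csv"), CsvPreference.STANDARD_PREFERENCE);
             ICsvBeanWriter writer = new CsvBeanWriter(new FileWriter("out.csv"), CsvPreference.STANDARD_PREFERENCE)) {
            reader.getHeader(true);          // skip the input header line
            writer.writeHeader(mapping);
            Person p;
            while ((p = reader.read(Person.class, mapping)) != null) {
                writer.write(p, mapping);    // filter or modify the bean here before writing
            }
        }
    }
}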
I want to insert data at certain positions in a text file without overwriting the existing data. I tried RandomAccessFile, but that also overwrites it.
Is there any way to insert the data without overwriting?
Thanks in advance.
You have to read your file and rewrite it. During this operation you find the place where you want to put your text and write it there.
It is not possible (not with Java, and not with any other programming language) to "just" insert data in the middle of the file, without having to re-write the rest of the file. The problem is not the programming language, the problem is the way files work.
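A minimal sketch of the usual workaround with RandomAccessFile: remember everything after the insert point, write the new data there, then write the remembered tail back. The file name and offset are just examples, and the tail is held in memory, so this only suits files that fit comfortably in RAM:

import java.io.IOException;
import java.io.RandomAccessFile;

public class InsertIntoFile {
    public static void insert(String fileName, long offset, byte[] extra) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "rw")) {
            // read everything after the insert point into memory
            byte[] tail = new byte[(int) (raf.length() - offset)];
            raf.seek(offset);
            raf.readFully(tail);

            // write the new data at the insert point, then re-append the saved tail
            raf.seek(offset);
            raf.write(extra);
            raf.write(tail);
        }
    }

    public static void main(String[] args) throws IOException {
        insert("data.txt", 10, "INSERTED".getBytes());
    }
}

Note that this still rewrites everything after the insert point, which is exactly the unavoidable cost described above.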