In an existing project, users can download a report containing 10 million records. The process reads the data from the database, writes it to a CSV file using the Super CSV Java API, and then emails the file to the user as an attachment. Holding 10 million Java objects in memory while writing them to CSV takes a huge amount of heap space, and since the application has many reports like this, the server is crashing and going down. Is there a better way to handle this? I read the SXSSFWorkbook documentation, which says a specified number of records can be kept in memory while the remaining records are pushed to the hard disk, but that is used to create Excel files. Is there a similar API for creating CSV files, or can SXSSFWorkbook be used to create CSV files?
There are a few Java libraries for reading and writing CSV files. They typically support streaming, so they do not have the problem of needing to hold the source data or the generated CSV in memory.
The Apache Commons CSV library would be a good place to start; see its User Guide. It supports various flavors of CSV file, including the CSV formats generated by Microsoft Excel.
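For example, here is a minimal sketch of a streaming export using Commons CSV together with a streaming JDBC result set. The table name, column names, and connection details are hypothetical; adjust them to your schema. (The `Integer.MIN_VALUE` fetch size is the MySQL driver's convention for row streaming; other drivers use a positive fetch size.)

```java
import java.io.BufferedWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

public class StreamingCsvExport {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/reports", "user", "password");
             PreparedStatement ps = con.prepareStatement(
                 "SELECT id, name, amount FROM report_rows",
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {

            // Ask the driver to stream rows instead of buffering the
            // whole result set in memory.
            ps.setFetchSize(Integer.MIN_VALUE);

            try (ResultSet rs = ps.executeQuery();
                 BufferedWriter out = Files.newBufferedWriter(Paths.get("report.csv"));
                 CSVPrinter csv = new CSVPrinter(out, CSVFormat.EXCEL)) {

                csv.printRecord("id", "name", "amount"); // header row
                while (rs.next()) {
                    // Each row goes straight to disk; only one row
                    // is held in memory at a time.
                    csv.printRecord(rs.getLong("id"), rs.getString("name"),
                                    rs.getBigDecimal("amount"));
                }
            }
        }
    }
}
```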
However, I would suggest that sending a CSV file containing 10 million records (say 1GB of uncompressed data) is not going to make you popular with the people who run your users' email servers! Files that size should be made available via a web or file transfer service.
I want to process an Excel file with Java Spring. I am using Apache POI to process the file. The Excel file is auto-generated and keeps growing. Example: the Excel file has 20 lines on day 1. On day 2 it has 35 lines; the first 20 lines are the same, but there are 15 new lines. It is unknown how many lines are added or when the file will be uploaded.
The data from the Excel file is mapped to POJOs and saved to the database.
Is there a fast and reliable way to identify which lines are new and only process those?
Edit: I realised that this might not be an Excel processing problem but (also) a database optimisation problem.
You can use the newer Apache POI API, SXSSF, which is an API-compatible streaming extension of XSSF for use when very large spreadsheets have to be produced and heap space is limited. It consumes far less memory. Check this link.
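As a rough illustration of the windowing behaviour, here is a small SXSSF sketch that keeps only 100 rows in memory at a time while writing a million rows (the file name and row contents are made up):

```java
import java.io.FileOutputStream;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class SxssfDemo {
    public static void main(String[] args) throws Exception {
        // Keep a sliding window of 100 rows in memory; older rows are
        // flushed to a temporary file on disk.
        SXSSFWorkbook wb = new SXSSFWorkbook(100);
        try {
            Sheet sheet = wb.createSheet("data");
            for (int r = 0; r < 1_000_000; r++) {
                Row row = sheet.createRow(r);
                row.createCell(0).setCellValue(r);
                row.createCell(1).setCellValue("row " + r);
            }
            try (FileOutputStream out = new FileOutputStream("big.xlsx")) {
                wb.write(out);
            }
        } finally {
            wb.dispose(); // delete the temporary files backing the flushed rows
        }
    }
}
```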
I am using Super CSV to write data in CSV format in my code. It works absolutely fine and very efficiently, but now my requirement has changed: I need to write multiple sheets into a single xls file, which is a very time-consuming task. So is there any way in Super CSV to write multiple sheets of data into a single CSV file and send it to the client, so that when the client opens the CSV file in MS Excel he can see multiple sheets, rather than me generating the Excel file with multiple sheets and sending it to the client?
Thanks
CSV is a very simple format, and does not have the concept of a "sheet".
So, no, it's not possible directly.
The only thing that I can suggest is to send multiple CSV files to the client, perhaps as a .zip file, and have the client unzip it and import one sheet at a time into Excel.
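If it helps, here is a small sketch of the zip approach using only the standard library, with hypothetical file names (one CSV per "sheet"):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCsvFiles {
    public static void main(String[] args) throws IOException {
        Path[] csvFiles = { Paths.get("sheet1.csv"), Paths.get("sheet2.csv") };

        try (ZipOutputStream zip = new ZipOutputStream(
                Files.newOutputStream(Paths.get("report.zip")))) {
            for (Path csv : csvFiles) {
                zip.putNextEntry(new ZipEntry(csv.getFileName().toString()));
                Files.copy(csv, zip); // stream each file into the archive
                zip.closeEntry();
            }
        }
    }
}
```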
If you need it to open directly in the browser, you'll need to go with an xls file.
Take a look at the API here:
http://supercsv.sourceforge.net/apidocs/index.html
I'm not familiar with Super CSV, so please don't beat me up too badly if I'm wrong...
Can't you just set the CsvPreference to EXCEL_PREFERENCE?
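For reference, that would look something like the sketch below, using Super CSV's `CsvPreference.EXCEL_PREFERENCE` (which controls delimiters and line endings so Excel opens the file cleanly; it does not add sheets). The file name and data are made up:

```java
import java.io.FileWriter;

import org.supercsv.io.CsvListWriter;
import org.supercsv.io.ICsvListWriter;
import org.supercsv.prefs.CsvPreference;

public class ExcelFriendlyCsv {
    public static void main(String[] args) throws Exception {
        // Write a CSV file using Excel-compatible preferences.
        try (ICsvListWriter writer = new CsvListWriter(
                new FileWriter("report.csv"), CsvPreference.EXCEL_PREFERENCE)) {
            writer.writeHeader("id", "name");
            writer.write("1", "Alice");
            writer.write("2", "Bob");
        }
    }
}
```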
Is there a way to read large xls files?
I have used Apache POI to read files, but only up to a certain size.
I have a database which already has some data, and now I want to upload a large file (as I said). One of the sheets in this file contains a little more data than the database has. Now how should I update that data in the Oracle database without setting up the xls file?
For large files you should use POI's streaming support. SXSSF, the streaming extension of XSSF, keeps memory use low when writing; for reading with limited memory, POI also provides a SAX-based event API for XSSF. Either way, it should be able to handle your requirements without memory problems.
As for the database problem: I'd suggest you read the xlsx files using XSSF and create a temporary table for each file in the database (depending on your needs). Once all the rows are stored in temp tables, you can easily merge them with your existing data.
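Here is a rough sketch of that load step, with a hypothetical temp_rows table, made-up Oracle connection details, and periodic JDBC batch flushes so the heap stays small:

```java
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class XlsxToTempTable {
    public static void main(String[] args) throws Exception {
        try (Workbook wb = WorkbookFactory.create(new File("data.xlsx"));
             Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@localhost:1521:xe", "user", "password");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO temp_rows (col_a, col_b) VALUES (?, ?)")) {

            Sheet sheet = wb.getSheetAt(0);
            int batch = 0;
            for (Row row : sheet) {
                if (row.getRowNum() == 0) continue; // skip the header row
                ps.setString(1, row.getCell(0).getStringCellValue());
                ps.setDouble(2, row.getCell(1).getNumericCellValue());
                ps.addBatch();
                if (++batch % 1000 == 0) ps.executeBatch(); // flush periodically
            }
            ps.executeBatch();
            // Then merge temp_rows into the real table, e.g. with a MERGE statement.
        }
    }
}
```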
I'm trying to read an Excel file and pass all the data to a DB. I found a few code examples, but all of them require external jars. How can I read Excel files using only the standard library?
If you don't want to use a library, then you will have to download the Excel file format specs from MS and write an Excel parser yourself (which is extremely complicated and could take one developer more than 10 years). For the OpenXML format spec, see here and here.
Thus I really recommend using a library for that...
Try Apache POI - a free Java library for dealing with MS Office documents.
You can save the Excel file as *.csv, separated by ";". Then you can read the file line by line and get the columns from each token.
Microsoft Excel uses a binary format to save its data, so manually reading Excel files can be a hassle. If you can convert the Excel file (xls) to a comma-separated values (csv) file, then you can just read the file and split your input on the commas.
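Assuming the exported CSV has no quoted fields containing embedded separators, reading it needs nothing beyond the standard library. A minimal sketch (the file name and separator are assumptions):

```java
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadCsvNoLibs {
    public static void main(String[] args) throws Exception {
        try (BufferedReader in = Files.newBufferedReader(Paths.get("data.csv"))) {
            String line;
            while ((line = in.readLine()) != null) {
                // -1 keeps trailing empty columns instead of dropping them
                String[] columns = line.split(";", -1);
                // ... insert the columns into the database here ...
                System.out.println(String.join(" | ", columns));
            }
        }
    }
}
```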
This is a difficult problem. First off, it is not as simple as "adding a third party library". There are no existing Excel reading libraries that do not cost money, and the one that I know does work is very expensive AND has bugs in it.
One strategy is to create an Excel add-in that reads the data and transfers it to your application by OLE, the clipboard, or a TCP/IP port, or saves it to a temporary file. If you look in the source code for OPeNDAP.org's ODC project, you can find an Excel add-in with TCP capability that does this.
You can try referring to the reader in OpenOffice, which is open source; however, in my opinion that code is not easily refactorable into a private project, for various reasons.
Microsoft has components and tools to open Excel files and expose them via COM objects.
You can also learn the BIFF format and write your own parser. You would probably want to write a parser for BIFF5, but be forewarned: this is a BIG project, even if you only parse a limited number of data types.
I have a database in .dbf (FoxPro) format.
How to retrieve data from FoxPro using Java?
If the data can be migrated to MySQL, how do I do the conversion?
Taking the data through intermediate formats seems flawed, as there are limitations with memo fields in CSV or Excel files.
If you are interested in a more direct approach you could consider something like "VFP2MySQL Data Upload program" or "Stru2MySQL_2", both written by Visual FoxPro developers. Search for them on this download page:
http://leafe.com/dls/vfp
DB-Convert (http://dbconvert.com/convert-foxpro-to-mysql-sync.php) is a commercial product that you might find helpful.
Rick Schummer, VFP MVP
You can use XBaseJ to access (and even modify) data in FoxPro databases directly from Java with a simple API.
This would allow you to have the two applications (the old FoxPro one and the new Java one) running side by side, constantly synchronizing the data, until the new application is ready to replace the old one (customers often still hang on to and trust their old application for a while).
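A minimal read-only sketch with XBaseJ, assuming a hypothetical customers.dbf file (note that field indexes in XBaseJ are 1-based):

```java
import org.xBaseJ.DBF;
import org.xBaseJ.fields.Field;

public class ReadFoxProDbf {
    public static void main(String[] args) throws Exception {
        // Open an existing FoxPro table (hypothetical file name).
        DBF table = new DBF("customers.dbf");
        try {
            for (int rec = 1; rec <= table.getRecordCount(); rec++) {
                table.read(); // advance to the next record
                StringBuilder row = new StringBuilder();
                for (int f = 1; f <= table.getFieldCount(); f++) {
                    Field field = table.getField(f); // fields are 1-based
                    row.append(field.getName()).append('=')
                       .append(field.get()).append(' ');
                }
                System.out.println(row);
            }
        } finally {
            table.close();
        }
    }
}
```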
Do you have a copy of FoxPro? You can save the database as an HTML file, if you want. Then, from HTML, you can save to any format you want. I recently did this to save a FoxPro table as an Excel spreadsheet (not that I'd suggest using that for your Java code).
If you plan on using Java, once you have access to the data, why not use one of Java's native storage types?
I once worked on the same kind of project, long back, where a FoxPro project was migrated to Java with MySQL.
We had the data in Excel sheets or .txt files, so we created tables as an exact replica of the FoxPro data and transferred the data from the Excel/CSV/txt files to MySQL using the Import Data feature.
Once we did this, I think you can take care of the rest from within MySQL.
But remember, the work will take some time, and you need to be patient.
I suppose doing a CSV export of your FoxPro data and then writing a little Java programme that takes the CSV as input is your best bet. Writing a programme that connects to both FoxPro and MySQL in Java is needlessly complicated for a one-time migration.
By the way, PHP could also do an excellent job of inserting the data into MySQL. The main thing is that you get your data into the MySQL schema, so you can use it with your new application (which I assume is in Java).
Two steps: DBF => CSV and then CSV => MySQL.
To convert DBF (FoxPro tables) to CSV, the link below helps a lot:
http://1stopit.blogspot.com/2009/06/dbf-to-mysql-conversion-on-windows.html
CSV => MySQL
MySQL itself supports a CSV import option (LOAD DATA INFILE); alternatively, to read the CSV file in Java, this link helps:
http://www.csvreader.com/java_csv.php
I read the CSV file using the Java CsvReader and inserted the records through a program. For that I used a PreparedStatement with batching; the link below gives samples for that:
http://www.codeweblog.com/switch-to-jdbc-oracle-many-data-insert/
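Putting the pieces together, here is a rough sketch of that batch insert, using a plain BufferedReader instead of the CsvReader library to keep it self-contained; the table, columns, and connection details are made up:

```java
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CsvToMySql {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/mydb", "user", "password");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO customers (code, name) VALUES (?, ?)");
             BufferedReader in = Files.newBufferedReader(Paths.get("customers.csv"))) {

            con.setAutoCommit(false); // commit once at the end
            String line;
            int count = 0;
            while ((line = in.readLine()) != null) {
                String[] cols = line.split(",", -1);
                ps.setString(1, cols[0]);
                ps.setString(2, cols[1]);
                ps.addBatch();
                if (++count % 1000 == 0) ps.executeBatch(); // flush every 1000 rows
            }
            ps.executeBatch();
            con.commit();
        }
    }
}
```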