Excel currency format with Apache POI - java

I use Apache POI to write Excel sheets. I'm very happy with it, but I have the following problem: my program is multi-currency and internationalized, and I want the currency cells in the sheets I write to be properly formatted. For that,
I need a routine that builds the right Excel currency format (e.g. "[$$-409]#,##0.00;-[$$-409]#,##0.00") from java.util.Currency and java.util.Locale parameters.
Something like: public String getExcelCurrencyFormat(Currency currency, Locale locale);
Is anyone aware of such a library?

Couldn't you just flag the field as being a currency field, and let Excel figure out the right format to use? Or does Excel require you to specify the format to produce the relevant 'type'?
I would have suggested using java.text.NumberFormat.getCurrencyInstance(), but there appears to be no way to get the actual format used by the instance from the object itself (unless toString() returns it).
Off the top of my head I can't think of anything else, sorry.
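For what it's worth, a routine with the requested signature can be sketched with the JDK alone. This is only a minimal sketch under stated assumptions: the class name ExcelCurrencyFormats is made up, the symbol is written as a quoted literal in front of the number, and the locale tag from the question's example (the "-409" part of "[$$-409]") is not produced, because the JDK has no mapping from Locale to those Windows locale IDs.

```java
import java.util.Currency;
import java.util.Locale;

// Hypothetical helper, not part of Apache POI or the JDK.
final class ExcelCurrencyFormats {

    private ExcelCurrencyFormats() {}

    /** Builds a format string such as "$"#,##0.00;-"$"#,##0.00 (symbol always as a prefix). */
    static String getExcelCurrencyFormat(Currency currency, Locale locale) {
        // Symbol as the given locale displays it, e.g. "$" for USD in Locale.US.
        String symbol = currency.getSymbol(locale);
        // Literal text in an Excel format goes inside double quotes;
        // drop any stray quote so the literal stays well formed.
        String literal = "\"" + symbol.replace("\"", "") + "\"";

        // getDefaultFractionDigits() returns -1 for pseudo-currencies such as XXX.
        int digits = Math.max(0, currency.getDefaultFractionDigits());
        StringBuilder number = new StringBuilder("#,##0");
        if (digits > 0) {
            number.append('.');
            for (int i = 0; i < digits; i++) {
                number.append('0');
            }
        }

        // Positive section, then a negative section with a leading minus sign.
        String positive = literal + number;
        return positive + ";-" + positive;
    }

    public static void main(String[] args) {
        System.out.println(getExcelCurrencyFormat(Currency.getInstance("USD"), Locale.US));
    }
}
```

The resulting string could then be passed to POI's DataFormat.getFormat(...) when building the cell style. Locales that put the symbol after the number are not handled by this sketch.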

Related

How do I apply custom date formatting from Excel in Java?

I have an Excel 2007 file called Test10. The first row has 10 headers, test1, test2, and so on up to test10. The second row, column 3 has a date, 10-10-2020, and that cell has a custom date format: dd-mm-yyyy
The date in the cell is written as 2020-10-10.
I can read the format from the cell in Java; however, what I get is dd/-mm/-yyyy;# and the date is 2020-10-10.
I need to somehow apply that format to the date at hand.
I need something more flexible. Is there a class that applies such a format? All I have managed to find so far is how to apply formatting when creating an Excel file from Java, but not the other way around, when reading the format from the Excel file.
I can manually rewrite the format and remove the special characters that are not needed, but that is a workaround.
I need a class that will apply the strange looking format to the date, in Java. Is there anything like that?
Also the example I provided above is a very simple example.
We're using a custom handler; the date comes in as a double, and we currently use DateUtil.getJavaDate to convert it to a java.util.Date.
I did not write the handler, so it's quite hard to understand all of it, but somewhere in the handler the custom format for the cell is extracted from the Excel file and put into a variable. At the point where DateUtil.getJavaDate is called, I have access to that format variable.
Previously it used SimpleDateFormat(CONSTANT_FORMAT).format(DateUtil.getJavaDate(d)) to format the date, but I need to apply the custom format. Is there a library that parses xlsx date formats? The dd/-mm/-yyyy;# format will not work with SimpleDateFormat.
Also, I cannot hardcode a conversion of the format: if I replace mm with MM, how do I know whether the user wanted minutes or months?
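DataFormatter from Apache POI (org.apache.poi.ss.usermodel.DataFormatter) handles these unusual Excel formats. Since you already have the raw double and the cell's format string, its formatRawCellContents(value, formatIndex, formatString) method can apply the format directly.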

How to extract data from a PDF file using Tika or any other library and store it in CSV/excel format

I want to extract the data inside a PDF file and present it as a CSV/Excel sheet. I learned that this can be done using the Tika library in Java. I did find out how to extract the data as simple text, but I want to know how to store it in an Excel sheet.
If someone has done this kind of work before, please help me.
The first part (and the hard one) is to parse the original data and interpret it as a table. Apache Tika will give you an XHTML representation (or call your own handler with SAX events), but it usually won't construct a table for you, at least not from a PDF file, since PDF isn't a tabular format by itself.
So you'll have to take the paragraphs Tika produces, split them, and pass the resulting cells to some csv/xls/xlsx writer.
It might work if you have a regular table in your PDF (one line per table row, clean logical separation of cells, etc.). But it will amount to parsing plain text, of course.
In case it doesn't work, you'll have to take a PDF parser (like Apache PDFBox) and try to interpret its output.
The second part (output) is simple. If csv/ssv/tsv is suitable for you -- use your preferred library to produce it (I can recommend Apache commons-csv).
But take into account that MS Excel requires a BOM in UTF-8 and UTF-16 CSV files to understand that the file isn't in a one-byte encoding (like CP-1252).
If you want Excel xls or xlsx format -- just use Apache POI to write it.
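The BOM point can be illustrated without any CSV library. The following is a rough sketch using only the JDK; the class name BomCsv and its methods are hypothetical, and the quoting rule is the usual "double the quotes" convention rather than anything Tika or commons-csv specific.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration only.
final class BomCsv {

    private BomCsv() {}

    /** Quotes a field when it contains the separator, a double quote, or a line break. */
    static String escape(String field, char separator) {
        boolean needsQuotes = field.indexOf(separator) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        return needsQuotes ? "\"" + field.replace("\"", "\"\"") + "\"" : field;
    }

    /** Renders the rows as CSV and returns UTF-8 bytes that start with a BOM. */
    static byte[] toBytes(List<List<String>> rows, char separator) {
        StringBuilder out = new StringBuilder();
        // U+FEFF becomes the bytes EF BB BF when encoded as UTF-8.
        out.append('\uFEFF');
        for (List<String> row : rows) {
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    out.append(separator);
                }
                out.append(escape(row.get(i), separator));
            }
            out.append("\r\n");
        }
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

The returned bytes can be written as-is, for example with java.nio.file.Files.write(path, bytes). A library such as commons-csv would take care of the quoting for you; the BOM still has to be written before the first record.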

Using java to change excel date re

So I am creating a CSV report within my Java code and using Excel to open the exported CSV file. One of the columns is a date, which I format within my code as mm/dd/yyyy hh:mm:ss. This comes out as 02/10/2014 3:38:00 PM, which is exactly how I want it. However, the column in the Excel sheet displays this as 02/10/2014 3:38. When I click on a cell in the Excel sheet, it does display the full date at the top, but I want it displayed in the column itself so that it is easier to print. It doesn't seem to be a column width issue, since I have changed the column width and the full date still won't appear. I am, however, able to achieve it by changing the cell's number format to a custom one. Is this something that can be done within Java itself? Let me know if you need more information. Thanks!
Comma-separated values (CSV) stores tabular data in plain-text format, so it cannot carry formatting. To tell Excel how to format a particular column, you need to use an Excel file format. To achieve that, you can use a Java library that exports data in Excel format. One example of such a library is Apache POI - the Java API for Microsoft Documents (http://poi.apache.org/).
In addition, to work better with CSV files in Excel, use the import-from-text feature. This is a wizard in which you can specify the import settings, including column formats, field widths, etc.
I hope it helps.

Change decimal and thousands separators in excel using Apache POI

Does anyone know if using apache-poi library you can change the decimal and thousands separators for Microsoft Excel?
I need to export some data from a web application to Excel, and the numbers are formatted depending on the user's settings. So when the data is exported, the numbers should look exactly as they do on the application's page.
Thanks
You need to set the data format on your CellStyle like this (for integers with a thousands separator):
CellStyle cellStyle = workbook.createCellStyle(); // workbook: your existing Workbook (assumed)
CreationHelper creationHelper = workbook.getCreationHelper();
cellStyle.setDataFormat(creationHelper.createDataFormat().getFormat("#,##0"));
cell.setCellStyle(cellStyle);
For decimals, I think you need something like this (I didn't try it, so you may need to modify it a little): #,##0.00
Please note: it is very important that you use a comma in the format, not a dot. If your locale is set correctly, you will see a dot in Excel.
Formatting in Excel is controlled through the Tools > Options > International dialogs, and is stored in local preferences, not in a file. So you can't control this through POI.
The only solution I can think of is to provide text rather than numbers. But that will prevent the user from doing any calculation in Excel.
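If text cells are acceptable, the display string can be produced on the Java side before it is written to the sheet. A minimal sketch using only java.text; the class and method names are made up for illustration, and the "#,##0.00" pattern is just an assumption about the desired layout.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class LocalizedNumberText {

    private LocalizedNumberText() {}

    /** Formats the value with the grouping and decimal separators of the given locale. */
    static String format(double value, Locale userLocale) {
        // The pattern itself is locale-neutral: ',' means grouping and '.' means decimal.
        // The symbols object decides which characters are actually printed.
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(userLocale);
        DecimalFormat format = new DecimalFormat("#,##0.00", symbols);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.5, Locale.GERMANY)); // 1.234,50
        System.out.println(format(1234.5, Locale.US));      // 1,234.50
    }
}
```

The returned string would go into the cell as a string value, which is exactly why Excel can no longer calculate with it.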
The format string is only a pattern. In it, the comma always stands for the thousands separator and the dot for the decimal separator. Whether you use "#,##0.00" or "#,##0" does not matter here: the separators actually displayed come from Excel's local settings, which apply to the application, not to a file, and you cannot override them via the API.
Remember that cell styles are defined once and a cell only holds a reference to a style. If you change the style through one cell, you change every cell that uses it.
I have the same issue with cell formats. I think I will try the "setVBAProject" method on XSSFWorkbook.
https://social.technet.microsoft.com/Forums/office/en-US/eaa4c7f6-197a-4b33-bc5f-20896e5a7e3a/workbook-or-worksheet-specific-decimal-separator?forum=excel

What is the most _robust_ way to generate an Excel datafile from Java?

I have a situation where I have been asked to write a program that essentially does an arbitrary SQL select over JDBC, convert the ResultSet to something loadable in Excel and send it as an attachment in an email.
The question is which data format to use so that the file can be loaded by as many different versions of Excel as possible.
I have considered:
XLS - native format, the simplest way to generate seems to be with JExcel.
CSV - comma separated format, must use semicolons instead of commas to cope with European decimal commas, and then there is all the quotation stuff.
HTML - it appears that Excel knows how to read an HTML table. It should be sufficient then to set the MIME-type to be application/vnd.ms-excel
but naturally there must be other interesting ways to do it.
My major concern is incorrect interpretation of the data:
Numbers with decimal commas get misinterpreted on systems with decimal points.
Character encoding issues (We cannot rely on the recipient using ISO-Latin-#).
Date interpretation - we have earlier found that the YYYY-MM-DD format is pretty robust.
My major concern is robustness. I don't mind it being tedious to code, if I can count on the result being good.
Suggestions and experiences to share?
I am aware of JSP generating Excel spreadsheet (XLS) to download - that page does not discuss robustness.
I'd recommend Andy Khan's JExcel. It's the best library for working with Excel in Java.
Apache POI HSSF
This has always been the chosen method where I've worked in Java development.
It's an acronym for Horrible SpreadSheet Format
The quick way to generate Excel files is to write out tab-delimited text and name it <name>.xls. Excel will open any text file ending in .xls as a single worksheet.
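A rough sketch of the tab-delimited approach, using only the JDK. The class name TabDelimitedExport and the row representation (lists of plain objects rather than a ResultSet) are assumptions made for illustration. Dates are written as YYYY-MM-DD, following the observation in the question that this format is robust; the sketch does nothing about the decimal comma problem.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Hypothetical helper for illustration only.
final class TabDelimitedExport {

    private TabDelimitedExport() {}

    /** Converts one value to cell text. */
    static String cell(Object value) {
        if (value == null) {
            return "";
        }
        if (value instanceof LocalDate) {
            // ISO_LOCAL_DATE prints yyyy-MM-dd, e.g. 2020-10-10.
            return DateTimeFormatter.ISO_LOCAL_DATE.format((LocalDate) value);
        }
        // Tabs and line breaks would break the column layout, so flatten them to a space.
        return value.toString().replaceAll("[\\t\\r\\n]+", " ");
    }

    /** Renders all rows as tab-delimited text with CRLF line endings. */
    static String toText(List<List<Object>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<Object> row : rows) {
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    out.append('\t');
                }
                out.append(cell(row.get(i)));
            }
            out.append("\r\n");
        }
        return out.toString();
    }
}
```

The text would then be saved under a name ending in .xls, as described above. Newer Excel versions may warn that the file content does not match its extension before opening it.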
