I am using POI's event API to process a large volume of records without memory-footprint issues. Here is the reference for it.
When I process an XLSX sheet, I get date values in a different format from the one specified in the Excel sheet. The date format for the column in Excel is 'dd-mm-yyyy', whereas I get the value in 'mm/dd/yy' format.
Can someone tell me how to get the actual format given in the Excel sheet? The relevant code snippet is given below.
    ContentHandler handler = new XSSFSheetXMLHandler(styles, strings,
            new SheetContentsHandler() {
                public void startRow(int rowNum) {
                }

                public void endRow() {
                }

                public void cell(String cellReference, String formattedValue) {
                    // formattedValue is the cell text after POI has applied its formatting
                    System.out.println(formattedValue);
                }
            },
            false); // formulasNotResults (assumed; the original snippet is cut off here)
The formattedValue passed to the cell method for the date column looks like 'mm/dd/yy', so I cannot do the validations properly in my PL/SQL program.
Two points to keep in mind:
The original Excel cell may have a format that doesn't work for you, or may be formatted as general text.
You may want to control exactly how dates, times or numeric values are formatted.
Another way to control the formatting of dates and other numeric values is to provide your own custom DataFormatter extending org.apache.poi.ss.usermodel.DataFormatter.
You simply override the formatRawCellContents() method (or other methods depending on your needs):
Sample code constructing the parser / handler:
    public void processSheet(Styles styles, SharedStrings strings,
            SheetContentsHandler sheetHandler, InputStream sheetInputStream)
            throws IOException, SAXException {
        DataFormatter formatter = new CustomDataFormatter();
        InputSource sheetSource = new InputSource(sheetInputStream);
        try {
            XMLReader sheetParser = SAXHelper.newXMLReader();
            ContentHandler handler = new XSSFSheetXMLHandler(styles, null, strings, sheetHandler,
                    formatter, false);
            sheetParser.setContentHandler(handler);
            sheetParser.parse(sheetSource);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
        }
    }
    private class CustomDataFormatter extends DataFormatter {
        @Override
        public String formatRawCellContents(double value, int formatIndex, String formatString,
                boolean use1904Windowing) {
            // Is it a date?
            if (DateUtil.isADateFormat(formatIndex, formatString)) {
                if (DateUtil.isValidExcelDate(value)) {
                    Date d = DateUtil.getJavaDate(value, use1904Windowing);
                    try {
                        return new SimpleDateFormat("yyyyMMdd").format(d);
                    } catch (Exception e) {
                        logger.log(Level.SEVERE, "Bad date value in Excel: " + d, e);
                    }
                }
            }
            // Not a date (or formatting failed): fall back to a plain number format
            return new DecimalFormat("##0.#####").format(value);
        }
    }
I had the very same problem. After a few days of googling and research, I came up with a solution. Unfortunately, it isn't nice, but it works:
Make a copy of org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler class in your project.
Find the interface SheetContentsHandler in the class.
Add a new method definition: String overriddenFormat(String cellRef, int formatIndex, String formatString);
Find this method in the class: public void endElement(String uri, String localName, String name) throws SAXException.
It has a long switch over the cell types.
In the case NUMBER there is an if statement like this: if (this.formatString != null) {...
Before that, paste this code:
    String overriddenFormat = output.overriddenFormat(cellRef, formatIndex, formatString);
    if (overriddenFormat != null) {
        this.formatIndex = -1;
        this.formatString = overriddenFormat;
    }
Follow this article/answer: https://stackoverflow.com/a/11345859 but use your new class and interface.
Now you can apply custom date formats where needed.
My use case was:
In a given sheet I have date values in G, H, and I columns, so my implementation of SheetContentsHandler.overriddenFormat is:
    @Override
    public String overriddenFormat(String cellRef, int formatIndex, String formatString) {
        if (cellRef.matches("(G|H|I)\\d+")) { // matches all cells in columns G, H, and I
            return "yyyy-mm-dd;@"; // this is the Hungarian date format in Excel
        }
        return null;
    }
As you can see, in the endElement method I have overridden the formatIndex and formatString. The possible values of formatIndex are described in org.apache.poi.ss.usermodel.DateUtil.isInternalDateFormat(int format). If the given value is not one of these (and -1 is not), the formatString is used to format the date values. (Excel stores dates as serial numbers: the integer part counts days from roughly 1900-01-01, and any fractional part is the time of day.)
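To make the remark about serial values concrete, here is a minimal plain-Java sketch of the conversion. The class and method names are invented for illustration, it ignores the 1904 date system, and it is only meant for serial values from 1 March 1900 onward (Excel's fictitious 29 February 1900 shifts earlier values by one day). In real code, POI's DateUtil.getJavaDate does this work for you.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical helper for illustration only; not part of POI.
final class ExcelSerialDates {
    // Day zero chosen so that serial 61 maps to 1900-03-01 in the 1900 date system.
    private static final LocalDate BASE = LocalDate.of(1899, 12, 30);

    static LocalDateTime fromSerial(double serial) {
        long wholeDays = (long) Math.floor(serial);            // integer part: days
        long secondsIntoDay = Math.round((serial - wholeDays) * 86_400); // fraction: time of day
        return BASE.atStartOfDay().plusDays(wholeDays).plusSeconds(secondsIntoDay);
    }
}
```

With this sketch, fromSerial(43024.0) yields 2017-10-16T00:00 and fromSerial(43024.5) yields noon on the same day.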
Excel stores some dates with regional settings. For example in the number format dialog in Excel you will see a warning like this:
Displays date and time serial numbers as date values, according to the type and locale (location) that you specify. Date formats that begin with an asterisk (*) respond to changes in regional date and time settings that are specified in Control Panel. Formats without an asterisk are not affected by Control Panel settings.
The Excel file that you are reading may be using one of those asterisk (*) date formats, in which case POI probably falls back to a US default format.
You will probably need to add some workaround code to map the date format strings to the format that you want.
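One possible shape for such a workaround is a lookup table from the format string reported for the cell to the pattern you actually want. The sketch below is plain Java with invented names (DateFormatOverrides, register, format), and the mapping you register is up to you; check which format strings your own files actually produce before relying on any particular key.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps an Excel format string to a java.time pattern.
final class DateFormatOverrides {
    private final Map<String, DateTimeFormatter> overrides = new HashMap<>();
    // Used when no override is registered for the given Excel format string.
    private final DateTimeFormatter fallback = DateTimeFormatter.ISO_LOCAL_DATE;

    void register(String excelFormat, String javaPattern) {
        overrides.put(excelFormat, DateTimeFormatter.ofPattern(javaPattern));
    }

    String format(String excelFormat, LocalDate date) {
        return overrides.getOrDefault(excelFormat, fallback).format(date);
    }
}
```

For example, after register("m/d/yy", "dd-MM-yyyy"), a date cell reported with the format string "m/d/yy" would be rendered as 16-10-2017 for 16 October 2017.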
See also the following for a discussion of regional date settings in Excel.
I am going crazy trying to parse an Excel file from Java using Apache POI, and I have run into the following problem parsing a specific column containing dates.
This is the column in my Excel file:
As you can see, this column contains date fields, but some cells in it contain number values and others contain text values (I can see this using the TYPE() Excel function). I really don't know how that is possible, because all the cells contain date fields. Any ideas?
Anyway in my code I am trying to handle this strange situation in this way:
    if(currentRow.getCell(20) != null && !currentRow.getCell(20).toString().trim().isEmpty()) {
        if(currentRow.getCell(20).getCellType().toString().equals("STRING")) {
            System.out.println("Data sottoscrizione contratto è STRING: " + currentRow.getCell(20).toString());
        }
        else if(currentRow.getCell(20).getCellType().toString().equals("NUMERIC")) {
            System.out.println("Data sottoscrizione contratto è NUMERIC");
            String dateAsString = currentRow.getCell(20).toString();
            System.out.println("DATA SOTTOSCRIZIONE CONTRATTO: " + currentRow.getCell(20).toString());
        }
    }
This way I can handle both cases when trying to convert to a date.
And here is my problem. When the code finds an Excel numeric value, it enters the NUMERIC branch,
and printing the cell value with:
    System.out.println("DATA SOTTOSCRIZIONE CONTRATTO: " + currentRow.getCell(20).toString());
I get the value 16-ott-2017 printed for the date 16/10/2017.
This raises some doubts: why do I get this format instead of something like 16/10/2017?
16-ott-2017 looks like the Italian formatting of the date. How can I convert it into a proper Date object?
Buon giorno!
You are currently using the toString() method of the cell, which is not reliable for returning numeric values or dates. It might work sometimes, but it won't always.
Use the methods that get you a real value, like Cell.getNumericCellValue(), Cell.getDateCellValue() (outdated because it returns a java.util.Date) or Cell.getLocalDateTimeCellValue(). If your cell just contains text, like "16-ott-2020", use getStringCellValue() and convert the returned value to a LocalDate (or a LocalDateTime, depending on whether the time of day matters to you).
Here's an example of the conversion (to a LocalDate):
    public static void main(String[] args) {
        // assuming you already received the value as a String
        String cellStringValue = "16-ott-2020";
        // provide a formatter that can parse Italian month names (or days of week)
        DateTimeFormatter dtf = DateTimeFormatter.ofPattern("dd-MMM-uuuu", Locale.ITALIAN);
        // parse the String to a LocalDate
        LocalDate sediciOttobreDuemilaEVenti = LocalDate.parse(cellStringValue, dtf);
        // and print its default value
        System.out.println(sediciOttobreDuemilaEVenti);
        // alternatively use the same formatter for output
        System.out.println(sediciOttobreDuemilaEVenti.format(dtf));
    }
The output of that code is
2020-10-16
16-ott-2020
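If you are on a POI version that only offers getDateCellValue(), the returned java.util.Date can still be bridged to java.time. A small sketch with an invented helper name; the zone argument is an assumption you have to choose yourself (ZoneId.systemDefault() is the usual match when the Date was built in the JVM's default time zone).

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

// Hypothetical helper for illustration; not part of POI.
final class LegacyDates {
    // Interprets the instant held by the Date in the given zone and keeps only the date part.
    static LocalDate toLocalDate(Date date, ZoneId zone) {
        return date.toInstant().atZone(zone).toLocalDate();
    }
}
```

A call would look like LegacyDates.toLocalDate(cell.getDateCellValue(), ZoneId.systemDefault()).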
    case NUMERIC:
        if (Cell.getLocalDateTimeCellValue() != null) {
            LocalDateTime actualdate = Cell.getLocalDateTimeCellValue();
            String formattedDate = actualdate.format(DateTimeFormatter.ofPattern("MM/dd/yyyy"));
            System.out.println(formattedDate);
        }
        break;
        switch (Cell.getCellType()) {
            case STRING:
                System.out.print(Cell.getStringCellValue());
                break;
            case NUMERIC:
                DataFormatter df = new DataFormatter();
                String CellValue = df.formatCellValue(Cell);
                System.out.println(CellValue);
                break;
            case BOOLEAN:
                System.out.print(Cell.getBooleanCellValue());
                break;
            case BLANK:
                System.out.println("");
        }
        System.out.print(" | ");
    }
I have a method that exports all grid data to CSV format. If the user filters the content of the grid, only the view is affected, and the export button keeps exporting all grid data. How can I export only the filtered grid data?
    /**
     * generateCSVExportFile
     */
    public void generateCSVExportFile() {
        try { // Try
            // Actual date
            DateFormat dateFormat = new SimpleDateFormat("yyyyMMddHHmmss");
            Date date = new Date();
            // (1) Generate String buffer
            String string2csv = generateCSVBufferString();
            // (2) Generate file downloader file
            fileDownloaderCSV.setFileDownloadResource(createResourceFromString(
                    SAMPLE_CSV_FILE + dateFormat.format(date) + CONF_CSV_EXTENSION, string2csv));
            fileDownloaderCSV.extend(generateCSVFileButton);
        } catch (Exception error) { // Catch
            logger.error(error.toString(), error);
        }
    }
Thank you
best regards
One alternative is to use the fetchItemsWithRange method of DataCommunicator, which returns the list of items after sorting and filtering, so I assume it is exactly what you want:
grid.getDataCommunicator().fetchItemsWithRange(0, grid.getDataCommunicator().getDataProviderSize());
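Once you have the filtered list, the CSV text can be built from it instead of from the full data set. Below is a rough, framework-free sketch. CsvExport, toCsv and escape are names invented here (the question's generateCSVBufferString is not shown), and the quoting follows the usual CSV convention of wrapping fields that contain commas, quotes or line breaks and doubling embedded quotes.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: turns any list of items into CSV text.
final class CsvExport {
    // One line per item, one field per column extractor.
    static <T> String toCsv(List<T> items, List<Function<T, String>> columns) {
        return items.stream()
                .map(item -> columns.stream()
                        .map(column -> escape(column.apply(item)))
                        .collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }

    // Quotes a field only when it contains a separator, a quote or a line break.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }
}
```

The list returned by fetchItemsWithRange would be passed as items, with one extractor function per exported column.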
Using h2o, I trained a model on a .csv data frame that includes a column of dates, some of which are NULL. Looking at the .hex data frame that the h2o Flow UI produced after parsing the input .csv file, the null values are represented by periods (.) and the remaining dates are represented as timestamp doubles (i.e. milliseconds since the epoch).
When trying to use the model's MOJO file in a Java program to make predictions on a dataset, I get the error
Exception in thread "main" java.lang.NullPointerException
at hex.genmodel.easy.EasyPredictModelWrapper.fillRawData(EasyPredictModelWrapper.java:554)
at hex.genmodel.easy.EasyPredictModelWrapper.predict(EasyPredictModelWrapper.java:615)
at hex.genmodel.easy.EasyPredictModelWrapper.preamble(EasyPredictModelWrapper.java:489)
at hex.genmodel.easy.EasyPredictModelWrapper.predictBinomial(EasyPredictModelWrapper.java:303)
at SimpleCsvPredictor.predictCsv(SimpleCsvPredictor.java:287)
at SimpleCsvPredictor.main(SimpleCsvPredictor.java:210)
because I am handling NULL values in the dataset's date column by setting them to null in the RowData object that h2o's EasyPredictModelWrapper makes predictions on.
The problem is that, for this column, the model expects a Double value, but there is no Double value to pass in because the value is null. Note that I cannot just set these null values to 0.0 because of how the model was trained (not all the dates are null, so setting some to zero would misrepresent the particular sample to the model). So how can I fix this, or what can I put in place of a null where a Double is expected?
Thanks for the advice :)
Here is what I do to the date Strings before I row.put("date_field", "<date string>") some <date string> into a RowData object (see here) that EasyPredictModelWrapper can predict on:
    /**
     * @param str_date date string in MM/dd/yyyy form, or "NULL"
     * @return "NULL" if str_date is "NULL"; otherwise the epoch-millisecond timestamp of
     *         str_date as a string (the program exits if str_date cannot be parsed)
     */
    private String dateString2TimestampString(String str_date) {
        if (str_date.toUpperCase().equals("NULL")) {
            return "NULL";
        } else {
            try {
                // convert date (MM/dd/yyyy) string to java date
                DateFormat formatter;
                formatter = new SimpleDateFormat("MM/dd/yyyy");
                Date date = (Date) formatter.parse(str_date);
                // convert date to timestamp (milliseconds since epoch) as a double
                double epochTimestamp = (double) date.getTime();
                return new BigDecimal(epochTimestamp).toPlainString();
            } catch (Exception e) {
                System.out.println("** dateString2TimestampString: could not parse string \"" + str_date + "\" to Date object");
                System.out.println(e.getClass().getCanonicalName());
                System.err.println(e.getMessage());
                System.exit(1);
                return null;
            }
        }
    }
Be sure to set the convertInvalidNumbersToNa config (see near the top of this code) on the wrapper as well, so that it handles "NULL" strings nicely. E.g.:
    EasyPredictModelWrapper model = new EasyPredictModelWrapper(
            new EasyPredictModelWrapper.Config()
                    .setModel(MojoModel.load(MODEL_CLASS_NAME))
                    .setConvertUnknownCategoricalLevelsToNa(true)
                    .setConvertInvalidNumbersToNa(true)
    );
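The date-to-timestamp helper above could also be written with java.time. This is only a sketch under stated assumptions: the class and method names are invented, dates are interpreted at midnight UTC (this must match whatever zone the training data was parsed in), and unparseable input is mapped to "NULL" rather than terminating the program as the original does.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper for illustration; not part of h2o.
final class DateFields {
    private static final DateTimeFormatter MDY = DateTimeFormatter.ofPattern("MM/dd/uuuu");

    // Returns epoch milliseconds as text, or "NULL" for missing or unparseable input.
    static String toTimestampString(String value) {
        if (value == null || value.trim().equalsIgnoreCase("NULL")) {
            return "NULL";
        }
        try {
            LocalDate date = LocalDate.parse(value.trim(), MDY);
            // Assumption: midnight UTC; adjust the zone to match the training data.
            long millis = date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
            return Long.toString(millis);
        } catch (DateTimeParseException e) {
            return "NULL";
        }
    }
}
```

The resulting string can be put into the RowData as before; with setConvertInvalidNumbersToNa(true) the "NULL" marker is then treated as a missing value.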
I am working on a project where I need to validate multiple dates based on length and pattern. I am using SimpleDateFormat and have found many issues with it. My requirement is to accept a date string only if it strictly matches "yyyy/MM/dd" and is exactly 10 characters long.
The below code is not giving expected results for various testing input strings.
    public static boolean checkformat(String dateString){
        boolean flag = false;
        Date d1 = null;
        SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd");
        format.setLenient(false);
        try {
            d1 = format.parse(dateString);
            flag=true;
        } catch (ParseException ex) {
            ex.printStackTrace();
            return false;
        }
        return flag;
    }
The above code returns true for various inputs like "99/03/1" (should be 0099/03/01) and "99/1/1" (should be 0099/01/01). Since the input strings are not coming from a form, I can't perform validations before passing them to this method. Please suggest an implementation that is very strict about the date format ("yyyy/MM/dd").
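For reference, the behaviour is easy to reproduce in isolation. The sketch below (class and method names are made up here) reports which year SimpleDateFormat extracted, showing that the number of pattern letters is not enforced when parsing, even with setLenient(false).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class for illustration.
final class PatternWidthDemo {
    // Returns the year SimpleDateFormat parsed, or -1 if it rejected the input.
    static int parsedYear(String input) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd", Locale.ROOT);
        format.setLenient(false);
        try {
            Calendar cal = new GregorianCalendar();
            cal.setTime(format.parse(input));
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            return -1;
        }
    }
}
```

Here parsedYear("99/03/1") returns 99 instead of -1: the input is accepted and read as year 99, which is exactly the problem described.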
I suggest validating the date with a regex before parsing it.
Use the code below to validate:
    public static boolean checkformat(String dateString){
        boolean flag = false;
        Date d1 = null;
        SimpleDateFormat format = new SimpleDateFormat("yyyy/MM/dd");
        format.setLenient(false);
        try {
            if (dateString.matches("([0-9]{4})/([0-9]{2})/([0-9]{2})")) { // use this regex
                d1 = format.parse(dateString);
                flag=true;
            }
        } catch (ParseException ex) {
            ex.printStackTrace();
            return false;
        }
        return flag;
    }
Okay, first: you know what format you're expecting. So why just parse it and catch an exception rather than checking preconditions?
    if (dateString.length() > 10) {
    ...
What you are actually doing is not checking your input format but parsing it, even though the method does not express this contract.
So if your method is just for checking, you could:
1. Use a regex
2. ... ?
I know there are quite a lot of answers on the net which propose using SimpleDateFormat, but, to be frank, they are wrong.
If I am expecting a given format, e.g. because I know that conversions have already been made on some user input, I can start parsing a string and, considering that something may have gone wrong, catch the exception. If I don't know which format is passed to me, I am at the validation layer, and this layer should not try to perform a conversion but rather prove that the conversion would be valid.
You could try using the new java.time package from Java 8 and later. You could use it as so to replace the SimpleDateFormat:
    public static boolean checkformat(String dateString){
        boolean flag = false;
        try {
            // pattern taken from the question; parse the method's own parameter
            TemporalAccessor ta = DateTimeFormatter.ofPattern("yyyy/MM/dd").parse(dateString);
            flag=true;
        } catch (DateTimeParseException ex) {
            ex.printStackTrace();
            return false;
        }
        return flag;
    }
This also rejects values that make no sense (e.g. a month value of 18).
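If you also want impossible calendar dates such as 30 February rejected and the 10-character length enforced, a stricter variant is possible. This is a sketch, not the only way to do it: the class name is invented, the explicit length check is my addition to cover the "strictly 10 characters" requirement, and 'uuuu' replaces 'yyyy' because year-of-era cannot be resolved without an era under ResolverStyle.STRICT.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical validator for illustration.
final class StrictDateCheck {
    // STRICT rejects dates that do not exist in the calendar, e.g. 2020/02/30.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu/MM/dd").withResolverStyle(ResolverStyle.STRICT);

    static boolean isValid(String dateString) {
        if (dateString == null || dateString.length() != 10) {
            return false;
        }
        try {
            LocalDate.parse(dateString, FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

With this, "2020/03/01" and "0099/03/01" are accepted, while "99/03/1", "2020/18/01" and "2020/02/30" are rejected.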
    String[] removeSlashes=new String[3];
    removeSlashes = enteredDate.split("/");
    if(removeSlashes[0].length()!=4)
        throw new IncorrectDateFormatException(); // user defined exception
    if(removeSlashes[1].length()!=2)
        throw new IncorrectDateFormatException();
    if(removeSlashes[2].length()!=2)
        throw new IncorrectDateFormatException();
    //Then use SimpleDateFormat to verify
I am parsing JSON from a server in my Android application using the Jackson JSON library. However, parsing fails whenever I receive a DateTime, since it arrives in this format:
"/Date(1277931782420)/"
I know I should do something like:
ObjectMapper om = new ObjectMapper();
om.setDateFormat(new TicksSinceFormat());
But I have no idea whether I can use SimpleDateFormat at all (and what format string would I use?) or whether I need to write my own DateFormat parser. I would really appreciate it if somebody could help with a code example.
EDIT:
OK, see my answer for complete code.
This proved to be tougher than I expected:
    public class TicksSinceFormat extends DateFormat {
        @Override
        public StringBuffer format(Date date, StringBuffer buffer, FieldPosition field) {
            long millis = date.getTime();
            return new StringBuffer("/Date(" + millis + ")/");
        }

        @Override
        public Date parse(String string, ParsePosition position) {
            // extract the digits between the parentheses
            int start = string.indexOf("(") + 1;
            int end = string.indexOf(")");
            String ms = string.substring(start, end);
            Date date = new Date(Long.parseLong(ms));
            position.setIndex(string.length() - 1); // MUST SET THIS
            return date;
        }

        @Override
        public Object clone() {
            return new TicksSinceFormat(); // MUST SET THIS
        }
    }
Using the class is then extremely simple, just do:
    ObjectMapper om = new ObjectMapper();
    om.setDateFormat(new TicksSinceFormat());
I presume that this can be coded better, and that I'll need to deal with the differences between .NET ticks and Java ticks, but for now this will do. If somebody has a better solution or more insight into the problems I mentioned, feel free to post and I'll mark your answer as the correct one if it's better.
EDIT: As I've explained in this question & answer, I've switched to the ServiceStack.Text library on the server, and it returns a different, ISO 8601 format. For that format I'm using slightly different parsing (since Jackson has trouble parsing ISO 8601 that contains milliseconds). Of course, as with other code I'm posting, let me know if you have a better version (just please post code / edit this post, rather than resorting to philosophical rhetoric on how it should be done):
    @SuppressLint("SimpleDateFormat")
    public class JacksonSimpleDateFormat extends SimpleDateFormat {
        public JacksonSimpleDateFormat() {
            if (mParser == null) {
                mParser = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
                mParser.setTimeZone(TimeZone.getTimeZone("UTC"));
            }
        }

        @Override
        public StringBuffer format(Date date, StringBuffer buffer, FieldPosition field) {
            return mParser.format(date, buffer, field);
        }

        private static SimpleDateFormat mParser;

        @Override
        public Date parse(String string, ParsePosition position) {
            // drop the fractional seconds before parsing
            String str = string.split("\\.")[0];
            Date date = null;
            try {
                date = mParser.parse(str);
            } catch (ParseException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            position.setIndex(string.length() - 1);
            return date;
        }

        @Override
        public Object clone() {
            return new JacksonSimpleDateFormat();
        }
    }
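Where java.time is available (on Android that means a platform level that ships it, or a backport), the fractional seconds do not need to be cut off by hand. A minimal sketch, assuming the server sends no zone suffix and the value is meant as UTC, as in the class above; the class and method names are invented.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper for illustration.
final class IsoTimestamps {
    // Parses text such as "2013-05-01T10:15:30.1234567" (no zone suffix) as UTC.
    // LocalDateTime.parse accepts an optional fraction of up to nine digits.
    static Instant parseUtc(String text) {
        return LocalDateTime.parse(text).toInstant(ZoneOffset.UTC);
    }
}
```

Unlike the split on ".", this keeps the sub-second part of the timestamp.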
I may be wrong on this, as I haven't gotten very far into Android development, but the format you presented:
"/Date(1277931782420)/"
Appears to be Unix epoch time.
If that is the case, you would not want/need to use SimpleDateFormat. Instead, try creating a Long from it and passing to the Date constructor, accounting for whether it is seconds or milliseconds-based epoch value.
Here is a StackOverflow post that provides the code for doing so: https://stackoverflow.com/a/535017/463196
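As a rough illustration of that idea, the number can be pulled out with a regular expression and handed to java.time (or to the Date constructor). The names below are invented, the value is assumed to be milliseconds since the Unix epoch, and an optional trailing offset such as +0200 is accepted but ignored, on the assumption that the number itself is already in UTC.

```java
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration.
final class DotNetJsonDate {
    // Digits are epoch milliseconds; an optional offset like +0200 may follow.
    private static final Pattern FORMAT =
            Pattern.compile("^/Date\\((-?\\d+)([+-]\\d{4})?\\)/$");

    static Instant parse(String text) {
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a /Date(...)/ value: " + text);
        }
        // Assumption: the offset, if present, does not change the instant.
        return Instant.ofEpochMilli(Long.parseLong(m.group(1)));
    }
}
```

If a java.util.Date is needed, Date.from(DotNetJsonDate.parse(value)) converts the result.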