Elastic Search and Y10k (years with more than 4 digits) - java

I discovered this issue in connection with Elastic Search queries, but since the ES date format documentation links to the API documentation for the java.time.format.DateTimeFormatter class, the problem is not really ES specific.
Short summary: We are having problems with dates beyond year 9999, more exactly, years with more than 4 digits.
The documents stored in ES have a date field, which the index descriptor defines with format "date"; this corresponds to "yyyy-MM-dd" in the pattern language of DateTimeFormatter. We take user input, validate it using org.apache.commons.validator.DateValidator.isValid, also with the pattern "yyyy-MM-dd", and if it is valid, we create an ES query from it. This fails with an exception if the user enters something like 20202-12-03. The search term is probably not intentional, but the expected behaviour would be to find nothing, not for the software to cough up an exception.
The problem is that org.apache.commons.validator.DateValidator is internally using the older SimpleDateFormat class to verify if the input conforms to the pattern and the meaning of "yyyy" as interpreted by SimpleDateFormat is something like: Use at least 4 digits, but allow more digits if required. Creating a SimpleDateFormat with pattern "yyyy-MM-dd" will thus both parse an input like "20202-07-14" and similarly format a Date object with a year beyond 9999.
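As a rough illustration of the SimpleDateFormat behaviour described above, the following sketch round-trips a four-digit and a five-digit year through the pattern "yyyy-MM-dd". The class and method names are made up for this example, and the formatter is pinned to UTC only to keep the result independent of the default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class LenientYearDemo {
    // Parses the text with the legacy SimpleDateFormat and formats it back.
    // (Hypothetical helper, not part of commons-validator.)
    static String roundTrip(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = format.parse(text);
            return format.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not parseable: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("2020-07-14"));  // prints 2020-07-14
        System.out.println(roundTrip("20202-07-14")); // prints 20202-07-14
    }
}
```

Both inputs are accepted, which is why a validator built on SimpleDateFormat lets the five-digit year through.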
The new DateTimeFormatter class is much stricter and takes "yyyy" to mean exactly four digits. It will fail to parse an input string like "20202-07-14" and also fail to format a Temporal object with a year beyond 9999. It is worth noting that DateTimeFormatter itself is capable of handling variable-length fields. The constant DateTimeFormatter.ISO_LOCAL_DATE, for example, is not equivalent to "yyyy-MM-dd": in conformance with ISO 8601, it allows years with more than four digits while using at least four. This constant is created programmatically with a DateTimeFormatterBuilder, not from a pattern string.
ES can't be configured to use the constants defined in DateTimeFormatter, such as ISO_LOCAL_DATE, but only with a pattern string. ES also knows a list of predefined patterns. The documentation occasionally refers to the ISO standard for these, but it seems to be mistaken and to ignore that a valid ISO date string can contain a five-digit year.
I can configure ES with a list of multiple allowed date patterns, e.g "yyyy-MM-dd||yyyyy-MM-dd". That will allow both four and five digits in the year, but fail for a six digit year. I can support six digit years by adding yet another allowed pattern: "yyyy-MM-dd||yyyyy-MM-dd||yyyyyy-MM-dd", but then it fails for seven digit years and so on.
Am I overlooking something, or is it really not possible to configure ES (or a DateTimeFormatter instance using a pattern string) to have a year field with at least four digits (but potentially more) as used by the ISO standard?

Edit
ISO 8601
Since your requirement is to conform with ISO 8601, let’s first see what ISO 8601 says (quoted from the link at the bottom):
To represent years before 0000 or after 9999, the standard also
permits the expansion of the year representation but only by prior
agreement between the sender and the receiver. An expanded year
representation [±YYYYY] must have an agreed-upon number of extra year
digits beyond the four-digit minimum, and it must be prefixed with a +
or − sign instead of the more common AD/BC (or CE/BCE) notation; …
So 20202-12-03 is not a valid date in ISO 8601. If you explicitly inform your users that you accept, say, up to 6 digit years, then +20202-12-03 and -20202-12-03 are valid, and only with the + or - sign.
Accepting more than 4 digits
The format pattern uuuu-MM-dd formats and parses dates in accordance with ISO 8601, also years with more than four digits. For example:
DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern("uuuu-MM-dd");
LocalDate date = LocalDate.parse("+20202-12-03", dateFormatter);
System.out.println("Parsed: " + date);
System.out.println("Formatted back: " + date.format(dateFormatter));
Output:
Parsed: +20202-12-03
Formatted back: +20202-12-03
It works quite similarly for a prefixed minus instead of the plus sign.
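To make the role of the sign concrete, here is a small sketch that checks which inputs the pattern uuuu-MM-dd accepts. The class and the accepts helper are hypothetical names chosen for this illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class SignedYearDemo {
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    // Returns true if the text parses as a LocalDate with the pattern uuuu-MM-dd.
    static boolean accepts(String text) {
        try {
            LocalDate.parse(text, FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("2020-12-03"));   // true: four digits, no sign
        System.out.println(accepts("+20202-12-03")); // true: five digits with a plus sign
        System.out.println(accepts("-20202-12-03")); // true: five digits with a minus sign
        System.out.println(accepts("20202-12-03"));  // false: five digits need a sign
    }
}
```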
Accepting more than 4 digits without sign
yyyy-MM-dd||yyyyy-MM-dd||yyyyyy-MM-dd||yyyyyyy-MM-dd||yyyyyyyy-MM-dd||yyyyyyyyy-MM-dd
As I said, this disagrees with ISO 8601. I also agree with you that it isn’t nice. And obviously it will fail for 10 or more digits, but that would fail for a different reason anyway: java.time handles years in the interval -999 999 999 through +999 999 999. So trying yyyyyyyyyy-MM-dd (10 digit year) would get you into serious trouble except in the corner case where the user enters a year with a leading zero.
I am sorry, this is as good as it gets. DateTimeFormatter format patterns do not support all of what you are asking for. There is no (single) pattern that will give you four digit years in the range 0000 through 9999 and more digits for years after that.
The documentation of DateTimeFormatter says about formatting and parsing years:
Year: The count of letters determines the minimum field width below which padding is used. If the count of letters is two, then a
reduced two digit form is used. For printing, this outputs the
rightmost two digits. For parsing, this will parse using the base
value of 2000, resulting in a year within the range 2000 to 2099
inclusive. If the count of letters is less than four (but not two),
then the sign is only output for negative years as per
SignStyle.NORMAL. Otherwise, the sign is output if the pad width is
exceeded, as per SignStyle.EXCEEDS_PAD.
So no matter which count of pattern letters you go for, you will be unable to parse years with more digits without sign, and years with fewer digits will be formatted with this many digits with leading zeroes.
Original answer
You can probably get away with the pattern u-MM-dd. Demonstration:
String formatPattern = "u-MM-dd";
DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern(formatPattern);
LocalDate normalDate = LocalDate.parse("2020-07-14", dateFormatter);
String formattedAgain = normalDate.format(dateFormatter);
System.out.format("LocalDate: %s. String: %s.%n", normalDate, formattedAgain);
LocalDate largeDate = LocalDate.parse("20202-07-14", dateFormatter);
String largeFormattedAgain = largeDate.format(dateFormatter);
System.out.format("LocalDate: %s. String: %s.%n", largeDate, largeFormattedAgain);
Output:
LocalDate: 2020-07-14. String: 2020-07-14.
LocalDate: +20202-07-14. String: 20202-07-14.
Counter-intuitively but very practically, one format letter does not mean one digit but rather as many digits as it takes. The flip side of the above is that years before year 1000 will be formatted with fewer than 4 digits, which, as you say, disagrees with ISO 8601.
For the difference between pattern letter y and u for year see the link at the bottom.
You might also consider one M and/or one d to accept 2020-007-014, but again, this will cause formatting into just 1 digit for numbers less than 10, like 2020-7-14, which probably isn’t what you want and again disagrees with ISO.
Links
Years section of Wikipedia article: ISO 8601
Documentation of DateTimeFormatter
uuuu versus yyyy in DateTimeFormatter formatting pattern codes in Java?

Maybe this will work:
[uuuu][uuuuu][...]-MM-dd
Format specifiers placed between square brackets are optional parts. Format specifiers inside brackets can be repeated to allow for multiple options to be accepted.
This pattern will allow a year number of either four or five digits, but rejects all other cases.
Here is this pattern in action. Note that this pattern is useful for parsing a string into a LocalDate. However, to format a LocalDate instance into a string, the pattern should be uuuu-MM-dd. That is because the two optional year parts cause the year number to be printed twice.
Repeating all possible year number digit counts, is the closest you can get in order to make it work the way you expect it to work.
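A minimal sketch of the optional-section idea, assuming only four- and five-digit years need to be accepted. The class and helper names are invented for the example; parsing uses the pattern with optional sections, while formatting uses plain uuuu-MM-dd.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class OptionalYearPatternDemo {
    // Parser with two optional year sections: four digits or five digits.
    static final DateTimeFormatter PARSER = DateTimeFormatter.ofPattern("[uuuu][uuuuu]-MM-dd");
    // Separate formatter for output, so the year is printed only once.
    static final DateTimeFormatter PRINTER = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    static LocalDate parse(String text) {
        return LocalDate.parse(text, PARSER);
    }

    static boolean accepts(String text) {
        try {
            parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2020-07-14").getYear());  // 2020
        System.out.println(parse("20202-07-14").getYear()); // 20202
        System.out.println(accepts("202020-07-14"));        // false: six digits are not covered
        // Formatting with the parsing pattern prints both year sections:
        System.out.println(LocalDate.of(2020, 7, 14).format(PARSER));
        System.out.println(LocalDate.of(2020, 7, 14).format(PRINTER));
    }
}
```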
The problem with the current implementation of DateTimeFormatter is that when you specify four or more u or y letters, the resolver will try to consume exactly that number of year digits. With fewer than four letters, however, the resolver will try to consume as many digits as possible. I do not know whether this behavior is intentional.
So the intended behavior can be achieved with a formatter builder, but not with a pattern string. As JodaStephen once pointed out, "patterns are a subset of the possible formatters".
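One way such a builder-based formatter could look is sketched below. It assumes that a minimum of four and a maximum of nine year digits with SignStyle.NORMAL (a sign only for negative years) is the wanted behaviour; that choice and the class and helper names are assumptions of this example, and it is not something ES can be configured with.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.SignStyle;
import java.time.temporal.ChronoField;

class MinFourDigitYearDemo {
    // Year: at least 4 digits, at most 9, no plus sign for large years.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 4, 9, SignStyle.NORMAL)
            .appendPattern("-MM-dd")
            .toFormatter();

    static int parseYear(String text) {
        return LocalDate.parse(text, FORMATTER).getYear();
    }

    static String format(int year, int month, int day) {
        return LocalDate.of(year, month, day).format(FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parseYear("2020-07-14"));  // 2020
        System.out.println(parseYear("20202-07-14")); // 20202
        System.out.println(format(20202, 7, 14));     // 20202-07-14
        System.out.println(format(999, 1, 2));        // 0999-01-02
    }
}
```

Note that the unsigned five-digit output still disagrees with the ISO 8601 rule quoted earlier, which requires a sign for expanded years.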
Maybe the characters #, { and }, which are reserved for future use, will be useful in this regard.

Update
You can use DateTimeFormatterBuilder#appendValueReduced to restrict the number of digits in a year in the range of 4-9 digits.
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
public class Main {
public static void main(String[] args) {
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendValueReduced(ChronoField.YEAR, 4, 9, 1000)
.appendPattern("-MM-dd")
.toFormatter();
String[] dateStrArr = { "2017-10-20", "20171-10-20", "201712-10-20", "2017123-10-20" };
for (String dateStr : dateStrArr) {
System.out.println(LocalDate.parse(dateStr, formatter));
}
}
}
Output:
2017-10-20
+20171-10-20
+201712-10-20
+2017123-10-20
Original answer
You can use the pattern [uuuu][u]-MM-dd where [uuuu] conforms to a 4-digit year and [u] can cater to the requirement of any number of digits allowed for a year.
Demo:
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
public class Main {
public static void main(String[] args) {
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("[uuuu][u]-MM-dd");
String[] dateStrArr = { "2017-10-20", "20171-10-20", "201712-10-20", "2017123-10-20" };
for (String dateStr : dateStrArr) {
System.out.println(LocalDate.parse(dateStr, formatter));
}
}
}
Output:
2017-10-20
+20171-10-20
+201712-10-20
+2017123-10-20

Related

How do I parse an ISO-8601 formatted string that contains no punctuation in Java 8?

How could I parse the following String to a LocalDateTime-Object?
20200203092315000000
I always get the following exception but I didn't understand it:
java.time.format.DateTimeParseException: Text '20200203092315000000' could not be parsed at index 0
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
at java.time.LocalDateTime.parse(LocalDateTime.java:492)
at de.x.struct.type.LocalDateTimeStructField.setBytesValue(LocalDateTimeStructField.java:44)
at de.x.struct.Struct.bytesToStruct(Struct.java:110)
at de.x.struct.StructTest.testStringToStruct(StructTest.java:60)
My application code looks like:
LocalDateTime ldt = LocalDateTime.parse("20200203092315000000", DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSSSSS"));
This looks like a known issue:
bug_id=JDK-8031085
bug_id=JDK-8138676
Workaround:
DateTimeFormatter dtf = new DateTimeFormatterBuilder()
.appendPattern("yyyyMMddHHmmss")
.appendValue(ChronoField.MILLI_OF_SECOND, 3)
.toFormatter();
or
CUSTOMER SUBMITTED WORKAROUND: use the following format (mind the '.'): "yyyyMMddHHmmss.SSS"
LocalDateTime.parse("20150910121314987", DateTimeFormatter.ofPattern("yyyyMMddHHmmss.SSS"))
or alternatively use the Joda-Time library.
I have been using the following code for close to three years and it has served me well.
ISO 8601 allows two formats: basic and extended. The basic format does not contain any punctuation while the extended format does. For example, 2017-12-10T09:47:00-04:00 in extended format is equivalent to 20171210T094700-0400 in basic format.
Since the behavior in Java 8 is to only accept the extended format, I typically use a new DateTimeFormatter that accepts both.
This format is not an accurate representation of what ISO 8601 requires. The rule is that a textual representation of a timestamp either be completely basic (i.e., no punctuation) or completely extended (i.e., all punctuation present). It is not valid to have just some punctuation but I have found that some libraries do not honor this rule. Specifically, Gson produces timestamps with the colon missing in the time zone specifier. Since I wish to be liberal in what I accept, I completely ignore all punctuation.
Java 8 also fails to accurately handle years especially when the punctuation is missing. ISO 8601 requires exactly four digits in the year but allows extra digits if both parties agree on an explicit number of digits. In this case, the year MUST be preceded by a sign. However, Java 8 does not enforce the requirement to have a sign and accepts from four to nine digits. While this approach is more liberal and in keeping with what I am usually trying to accomplish, it makes it impossible to parse a year since I cannot know how many digits should be present. I tighten the format to honor ISO 8601 in this case although an alternative approach is available if needed. For example, you could pre-parse the text knowing how many digits you expect and add the sign if there are too many digits.
I recommend only using this formatter when parsing and not when serializing since I prefer to be strict in what I produce.
The code below should accept your format, with and without a timezone offset. It deviates from your format only in that it accepts up to nine digits for the fractional seconds instead of six. You can adjust it if needed.
// somewhere above
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.SignStyle;
import static java.time.temporal.ChronoField.*;
static DateTimeFormatter ISO_8601_LENIENT_FORMAT = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendValue(YEAR, 4, 4, SignStyle.EXCEEDS_PAD)
.optionalStart().appendLiteral('-').optionalEnd() // Basic format has no punctuation
.appendValue(MONTH_OF_YEAR, 2)
.optionalStart().appendLiteral('-').optionalEnd()
.appendValue(DAY_OF_MONTH, 2)
.optionalStart().appendLiteral('T').optionalEnd() // permitted to omit the 'T' character by mutual agreement
.appendValue(HOUR_OF_DAY, 2)
.optionalStart().appendLiteral(':').optionalEnd()
.appendValue(MINUTE_OF_HOUR, 2)
.optionalStart() // seconds are optional
.optionalStart().appendLiteral(':').optionalEnd()
.appendValue(SECOND_OF_MINUTE, 2)
.optionalStart().appendFraction(NANO_OF_SECOND, 0, 9, false).optionalEnd()
.optionalEnd()
.optionalStart().appendOffset("+HH:MM", "Z").optionalEnd()
.optionalStart().appendOffset("+HHMM", "Z").optionalEnd()
.optionalStart().appendOffset("+HH", "Z").optionalEnd()
.toFormatter()
;
A more strict version that rejects all punctuation, assumes non-negative years, limits fractional seconds to six digits and omits timezone information follows.
static DateTimeFormatter ISO_8601_NO_PUNCTUATION = new DateTimeFormatterBuilder()
.appendValue(YEAR, 4, 4, SignStyle.NEVER)
.appendValue(MONTH_OF_YEAR, 2)
.appendValue(DAY_OF_MONTH, 2)
.appendValue(HOUR_OF_DAY, 2)
.appendValue(MINUTE_OF_HOUR, 2)
.appendValue(SECOND_OF_MINUTE, 2)
.appendFraction(NANO_OF_SECOND, 6, 6, false)
.toFormatter()
;
Your input is not valid: it is out of range.
Use this code when your input is correct:
Calendar cal = Calendar.getInstance();
cal.setTimeInMillis(yourInput * 1000L); // yourInput: placeholder for your numeric input, as a long
System.out.println(cal.getTime());

(Re)Use DateTimeFormatter for parsing date ranges or mix DateTimeFormatter with regex

I have the following String representing a date range which I need to parse:
2018-10-20:2019-10-20
It consists of 2 ISO date strings separated by :
The string can get more complex by having repeated date ranges mixed with other text. This can be done by a Regex.
However, given that the latest Java has Date/Time support that most coders here and elsewhere are ecstatic about, is it possible to use, say, LocalDate's parser or a custom DateTimeFormatter in order to identify the bits in my String which are candidates for ISO-date and capture them?
Better yet, how can I extract the validation regex from a DateTimeFormatter (the regex which identifies an ISO-date, assuming there is one) and merge/compile it with my own regex for the rest of the String.
I just do not feel comfortable coding yet another ISO-date regex in my code when possibly there is already a regex in Java which does that and I just re-use it.
Please note that I am not asking for a regex. I can do that.
Please also note that my example String can contain other date/time formats, e.g. with timezones and milliseconds and all the whistles.
Actually, DateTimeFormatter doesn't have an internal regex. It uses a CompositePrinterParser, which in turn uses an array of DateTimePrinterParser instances (which is an inner interface of DateTimeFormatterBuilder), where each instance is responsible for parsing/formatting a specific field.
IMO, regex is not the best approach here. If you know that all dates are separated by :, why not simply split the string and try to parse the parts individually? Something like that:
String dates = // big string with dates separated by :
DateTimeFormatter parser = // create a formatter for your patterns
for (String s : dates.split(":")) {
parser.parse(s); // if "s" is invalid, it throws exception
}
If you just want to validate the strings, calling parse as above is enough - it'll throw an exception if the string is invalid.
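A runnable sketch of the split-and-parse idea, assuming the range contains plain ISO local dates separated by a colon (it would not work for values that contain a colon themselves, such as times). The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

class DateRangeSplitDemo {
    // Splits the range on ':' and parses each part as an ISO local date.
    // Throws DateTimeParseException if any part is not a valid date.
    static List<LocalDate> parseRange(String range) {
        List<LocalDate> dates = new ArrayList<>();
        for (String part : range.split(":")) {
            dates.add(LocalDate.parse(part));
        }
        return dates;
    }

    static boolean isValidRange(String range) {
        try {
            parseRange(range);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseRange("2018-10-20:2019-10-20")); // [2018-10-20, 2019-10-20]
        System.out.println(isValidRange("2018-10-20:2019-02-30")); // false: February 30 does not exist
    }
}
```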
To support multiple formats, you can use DateTimeFormatterBuilder::appendOptional. Example:
DateTimeFormatter parser = new DateTimeFormatterBuilder()
// full ISO8601 with date/time and UTC offset (ex: 2011-12-03T10:15:30+01:00)
.appendOptional(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
// date/time without UTC offset (ex: 2011-12-03T10:15:30)
.appendOptional(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
// just date (ex: 2011-12-03)
.appendOptional(DateTimeFormatter.ISO_LOCAL_DATE)
// some custom format (day/month/year)
.appendOptional(DateTimeFormatter.ofPattern("dd/MM/yyyy"))
// ... add as many you need
// create formatter
.toFormatter();
A regex to support multiple formats (as you said, "other date/time formats, e.g. with timezones and milliseconds and all the whistles") is possible, but a regex is not good at validating dates: think of day zero, day 31 in months that have only 30 days, February 29th in non-leap years, minutes > 59, etc.
A DateTimeFormatter will validate all these tricky details, while a regex will only guarantee that you have numbers and separators in the correct position and it won't validate the values. So regardless of the regex, you'll have to parse the dates anyway (which, IMHO, makes the use of regex pretty useless in this case).
Regex + date parser is the right option.
You have to write the regex yourself, since the date parser does not use regex.
It is your choice whether the regex is simple, e.g. \d{2} for the month, letting the date parser validate the number range, or more strict, e.g. (?:0[1-9]|1[0-2]) (01 - 12). Range checks like 28 vs 30 vs 31 days should not be done in regex. Let the date parser handle that, and since some value ranges are handled by the date parser anyway, you might as well let it handle them all, i.e. a simple regex is perfectly fine.
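The combination could be sketched roughly as follows: a deliberately simple regex locates candidates, and LocalDate.parse decides whether each candidate is a real date. The regex, the class name and the method name are illustrative assumptions and cover only plain yyyy-MM-dd dates.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexThenParseDemo {
    // Simple shape check only; value ranges are left to the date parser.
    private static final Pattern CANDIDATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Returns every candidate in the text that is also a valid ISO local date.
    static List<LocalDate> extractDates(String text) {
        List<LocalDate> result = new ArrayList<>();
        Matcher matcher = CANDIDATE.matcher(text);
        while (matcher.find()) {
            try {
                result.add(LocalDate.parse(matcher.group()));
            } catch (DateTimeParseException e) {
                // Looks like a date but is not one (e.g. 2019-02-30); skip it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [2018-10-20, 2019-10-20]
        System.out.println(extractDates("range 2018-10-20:2019-10-20, bogus 2019-02-30"));
    }
}
```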

Java 8 DateTimeFormatter dropping millis when they're zero?

This seems weird. Java 8 is formatting the output differently depending on whether the millis is zero. How do you force Java 8 (1.8.0_20) to always spit out the millis regardless of if they're zero or not?
public static void main(String[] args) {
TemporalAccessor zeroedMillis = DateTimeFormatter.ISO_OFFSET_DATE_TIME.parse("2015-07-14T20:50:00.000Z");
TemporalAccessor hasMillis = DateTimeFormatter.ISO_OFFSET_DATE_TIME.parse("2015-07-14T20:50:00.333Z");
System.out.println(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(zeroedMillis));
System.out.println(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(hasMillis));
}
2015-07-14T20:50:00Z
2015-07-14T20:50:00.333Z
You don't use ISO_OFFSET_DATE_TIME, basically :)
If you follow the documentation for that, you end up in the docs of ISO_LOCAL_TIME which has:
This returns an immutable formatter capable of formatting and parsing the ISO-8601 extended local time format. The format consists of:
Two digits for the hour-of-day. This is pre-padded by zero to ensure two digits.
A colon
Two digits for the minute-of-hour. This is pre-padded by zero to ensure two digits.
If the second-of-minute is not available then the format is complete.
A colon
Two digits for the second-of-minute. This is pre-padded by zero to ensure two digits.
If the nano-of-second is zero or not available then the format is complete.
A decimal point
One to nine digits for the nano-of-second. As many digits will be output as required.
If you always want exactly 3 digits, I suspect you want DateTimeFormatter.ofPattern with a pattern of yyyy-MM-dd'T'HH:mm:ss.SSSX.
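A short sketch of that suggestion, using the two inputs from the question. The class and helper names are made up for the example; OffsetDateTime is used here as a convenient type to parse into.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

class FixedMillisDemo {
    // Always prints exactly three fractional digits, even when they are zero.
    static final DateTimeFormatter FIXED_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSX");

    static String format(String isoText) {
        return OffsetDateTime.parse(isoText).format(FIXED_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(format("2015-07-14T20:50:00.000Z")); // 2015-07-14T20:50:00.000Z
        System.out.println(format("2015-07-14T20:50:00.333Z")); // 2015-07-14T20:50:00.333Z
    }
}
```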

Timestamp in ISO 8601 - the last 6 digits yyyy-MM-dd'T'HH:mm:ss.?

I have timestamps looking like this:
2015-03-21T11:08:14.859831
2015-03-21T11:07:22.956087
I read a Wiki article on ISO 8601, but did not get the meaning of the last 6 digits here.
I tried getting it down to milliseconds using "yyyy-MM-dd'T'HH:mm:ss.sss" or "yyyy-MM-dd'T'HH:mm:ss.ssssss". Is it just more precise than milliseconds - up to microseconds?
Is it just more precise than milliseconds?
Yes, it's microseconds in this case.
ISO-8601 doesn't actually specify a maximum precision. It states:
If necessary for a particular application a decimal fraction of hour, minute or second may be included. If a decimal fraction is included, lower order time elements (if any) shall be omitted and the decimal fraction shall be divided from the integer part by the decimal sign specified in ISO 31-0, i.e. the comma [,] or full stop [.]. Of these, the comma is the preferred sign. If the magnitude of the number is less than unity, the decimal sign shall be preceded by two zeros in accordance with 3.6.
The interchange parties, dependent upon the application, shall agree the number of digits in the decimal fraction. [...]
(You very rarely actually see comma as the decimal separator - at least, that's my experience.)
Unfortunately in my experience, parsing a value like this in Java 7 is tricky - there isn't a format specifier for "just consume fractional digits and do the right thing". You may find you need to manually chop the trailing 3 digits off before parsing as milliseconds.
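The manual chopping could look roughly like this. The sketch assumes the input always has exactly six fractional digits and no zone information, so it is interpreted as UTC here; both are assumptions of the example, as are the class and method names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class TruncateMicrosDemo {
    // Drops the last three of six fractional digits, then parses as milliseconds.
    static Date parseTruncatingMicros(String text) {
        String millisText = text.substring(0, text.length() - 3);
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: input is in UTC
        try {
            return format.parse(millisText);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not parseable: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date date = parseTruncatingMicros("2015-03-21T11:08:14.859831");
        System.out.println(date.getTime() % 1000); // 859, the millisecond part
    }
}
```

Without the truncation, SimpleDateFormat would read all six digits as a millisecond count and shift the result by several minutes.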
As Java 8 supports a precision of nanoseconds, it's rather simpler - and in fact, the built-in ISO formatter can parse it fine:
import java.time.*;
import java.time.format.*;
public class Test {
public static void main(String[] args) {
DateTimeFormatter formatter = DateTimeFormatter.ISO_DATE_TIME;
System.out.println(LocalDateTime.parse("2015-03-21T11:07:22.956087", formatter));
}
}
You do not need a DateTimeFormatter
java.time API is based on ISO 8601 and therefore you do not need a DateTimeFormatter to parse a date-time string which is already in ISO 8601 format (e.g. your date-time string, 2015-03-21T11:08:14.859831).
Demo:
import java.time.LocalDateTime;
class Main {
public static void main(String args[]) {
String strDateTime = "2015-03-21T11:08:14.859831";
LocalDateTime ldt = LocalDateTime.parse(strDateTime);
System.out.println(ldt);
}
}
Output:
2015-03-21T11:08:14.859831
I read a Wiki article on ISO 8601, but did not get the meaning of the
last 6 digits here.
It is a fraction of a second, to microsecond resolution.
I tried getting it down to milliseconds using
"yyyy-MM-dd'T'HH:mm:ss.sss" or "yyyy-MM-dd'T'HH:mm:ss.ssssss".
Your pattern is not correct. The correct letter for the fraction of second is S (capital S) i.e. if you want the value to the millisecond resolution, the pattern should be yyyy-MM-dd'T'HH:mm:ss.SSS. Check the DateTimeFormatter documentation to learn more about this.
Is it just more precise than milliseconds - up to microseconds?
Yes, it is. The java.time API can give you an even more precise resolution, to nanoseconds (depending on the system clock).
Learn more about the modern Date-Time API from Trail: Date Time.

Two letter year, shall it be allowed?

Recently, after a library upgrade of Apache POI, I upgraded some other APIs as well. The other library I used reads all cell contents as String, and I then have to parse this string into a Date.
The problem occurred when users started entering dates as dd-mm-yy: the year appeared as 00yy AD.
As per documentation of SimpleDateFormat
For parsing, if the number of pattern letters is more than 2, the year
is interpreted literally, regardless of the number of digits. So using
the pattern "MM/dd/yyyy", "01/11/12" parses to Jan 11, 12 A.D.
So the question is: is it better to require a four-letter year rather than a two-letter year?
The other question is: what is the best way to predict the year if it is in two-letter format?
The issue arises when parsing years like the ones below:
Bond Start Date : 12-Jan-98 (1998)
Bond End Date : 12-Jan-70 (2070)
Regards,
Hanumant
It is not clear what you are asking.
If you are asking how to specify a date format that accepts 2 digit years (only) and interprets them conventionally, then you should use "dd-MM-yy" (capital MM for the month; lowercase mm means minutes in SimpleDateFormat).
If you are asking how to specify a date format that accepts 2 digit years and interprets them conventionally, AND ALSO handles 4 (or more) digit years, then you can't. As the javadoc says, if you use "dd-MM-yyyy", 2 digit years are interpreted as years in the first century AD.
One possible solution is to use TWO formats. First attempt to parse using "dd-MM-yy", and if that fails, try "dd-MM-yyyy".
But this is a hack ... and problematic if the user might actually need to enter a historical date.
If you are asking what you should do, then I'd recommend moving away from ambiguous ad-hoc formats that force you to (effectively) guess what the user means.
If the user has to enter dates / times in a character-based form, require them to use one of the ISO 8601 formats, and be strict when parsing the user-supplied date/time strings.
Otherwise, provide the user with a date picker widget.
The other question is: what is the best way to predict the year if it is in two-letter format?
Well this is the nub of the problem isn't it! In the 20th century, we all knew what a 2 digit year meant. You just slapped 19 on the front. (Ah ... those were the good old days!)
Now it is necessary to use a different heuristic. And the heuristic that SimpleDateFormat uses is described by the javadoc thus:
"For parsing with the abbreviated year pattern ("y" or "yy"), SimpleDateFormat must interpret the abbreviated year relative to some century. It does this by adjusting dates to be within 80 years before and 20 years after the time the SimpleDateFormat instance is created. For example, using a pattern of "MM/dd/yy" and a SimpleDateFormat instance created on Jan 1, 1997, the string "01/11/12" would be interpreted as Jan 11, 2012 while the string "05/04/64" would be interpreted as May 4, 1964."
The heuristic is 80 years before to 20 years after "now". So actually 12-Jan-98 is in 1998 and 12-Jan-70 is in 1970 ... if you parse using a SimpleDateFormat with a "yy" format.
If you need the dates to mean something else, then you will need to use a different date parser. For example, if you use the Joda-time libraries, it is possible to specify the "pivot year"; i.e. the middle year of the century in which 2-digit years fall.
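If staying with SimpleDateFormat is acceptable, its set2DigitYearStart method offers a comparable knob. The sketch below assumes the dates look like 12-Jan-98 (pattern dd-MMM-yy, English month names) and that a 100-year window starting in 1998 fits the bond data; the window start, the class name and the helper name are assumptions made for the illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

class TwoDigitYearWindowDemo {
    // Parses dd-MMM-yy, mapping two-digit years into the 100 years
    // that start on 1 January of windowStartYear.
    static int parseYear(String text, int windowStartYear) {
        SimpleDateFormat format = new SimpleDateFormat("dd-MMM-yy", Locale.ENGLISH);
        format.set2DigitYearStart(
                new GregorianCalendar(windowStartYear, Calendar.JANUARY, 1).getTime());
        try {
            Calendar calendar = new GregorianCalendar();
            calendar.setTime(format.parse(text));
            return calendar.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not parseable: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseYear("12-Jan-98", 1998)); // 1998
        System.out.println(parseYear("12-Jan-70", 1998)); // 2070
    }
}
```

Any fixed window is still a guess about what the user meant, so the earlier advice about four-digit years or a date picker continues to apply.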
Reference:
Joda-time Freaky Formatters
