Parse java.time trying multiple patterns


We have a library where users can pass in dates in multiple formats. They follow the ISO but are abbreviated at times.
So we get things like "19-3-12" and "2019-03-12T13:12:45.1234" where the fractional seconds can be 1 - 7 digits long. It's a very large number of combinations.
DateTimeFormatter.parseBest doesn't work because it won't accept "yy-m-d" for a local date. The solutions here won't work because it assumes we know the pattern - we don't.
And telling people to get their string formats "correct" won't work as there's a ton of existing data (these are mostly in XML & JSON files).
My question is, how can I parse strings coming in in these various patterns without having to try 15 different explicit patterns?
Or even better, is there some way to parse a string and it will try everything possible and return a Temporal object if the string makes sense for any date[time]?

Without a full specification it is hard to give a precise recommendation. The techniques generally used for variable formats include:
Trying a number of known formats in turn.
Optional parts in the format pattern.
DateTimeFormatterBuilder.parseDefaulting() for parts that may be absent from the parsed string.
As you are aware, parseBest.
I am assuming that y-M-d always come in this order (never M-d-y or d-M-y, for example). 19-3-12 conflicts with ISO 8601 since the standard requires (at least) 4 digit year and 2 digit month. A challenge with 2-digit year is guessing the century: is this 1919 or 2019 or might it be 2119?
The good news: presence and absence of seconds and varying number of fractional digits are all built-in and pose no problem.
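To illustrate the built-in handling of optional seconds and fractions, here is a minimal sketch using DateTimeFormatter.ISO_LOCAL_TIME on its own. The class and method names are only for illustration.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

final class IsoTimeVariants {
    // ISO_LOCAL_TIME accepts HH:mm, HH:mm:ss and HH:mm:ss followed by
    // a decimal point and up to 9 fractional digits.
    static LocalTime parseTime(String text) {
        return LocalTime.parse(text, DateTimeFormatter.ISO_LOCAL_TIME);
    }

    public static void main(String[] args) {
        String[] samples = {"13:12", "13:12:45", "13:12:45.1", "13:12:45.1234567"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + parseTime(sample));
        }
    }
}
```

No pattern letters for seconds or fraction are needed; the same formatter covers all four inputs.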
From what you have told us it seems to me that the following is a fair shot.
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendPattern("[uuuu][uu]-M-d")
.optionalStart()
.appendLiteral('T')
.append(DateTimeFormatter.ISO_LOCAL_TIME)
.optionalEnd()
.toFormatter();
TemporalAccessor dt = formatter.parseBest("19-3-12", LocalDateTime::from, LocalDate::from);
System.out.println(dt.getClass());
System.out.println(dt);
Output:
class java.time.LocalDate
2019-03-12
I figure that it should work with the variations of formats that you describe. Let’s just try your other example:
dt = formatter.parseBest( "2019-03-12T13:12:45.1234", LocalDateTime::from, LocalDate::from);
System.out.println(dt.getClass());
System.out.println(dt);
class java.time.LocalDateTime
2019-03-12T13:12:45.123400
To control the interpretation of 2-digit year you may use one of the overloaded variants of DateTimeFormatterBuilder.appendValueReduced(). I recommend that you consider a range check on top of it.
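A sketch of what that could look like, assuming (purely for illustration) that 2-digit years should fall in 1950 through 2049 and that the caller supplies the acceptable date range. The base value 1950, the class name and the parse helper are my assumptions, not something from the question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

final class TwoDigitYear {
    // Assumption: a 2-digit year yy means a year in 1950..2049.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .optionalStart().appendPattern("uuuu").optionalEnd()
            .optionalStart().appendValueReduced(ChronoField.YEAR, 2, 2, 1950).optionalEnd()
            .appendPattern("-M-d")
            .toFormatter();

    // Parses the date and rejects results outside [min, max].
    static LocalDate parse(String text, LocalDate min, LocalDate max) {
        LocalDate date = LocalDate.parse(text, FORMATTER);
        if (date.isBefore(min) || date.isAfter(max)) {
            throw new IllegalArgumentException("Date out of expected range: " + date);
        }
        return date;
    }

    public static void main(String[] args) {
        LocalDate min = LocalDate.of(1950, 1, 1);
        LocalDate max = LocalDate.of(2049, 12, 31);
        System.out.println(parse("19-3-12", min, max));
        System.out.println(parse("85-3-12", min, max));
        System.out.println(parse("2019-03-12", min, max));
    }
}
```

The range check matters mostly for 4-digit input, since the reduced value can only produce years within the 100-year window starting at the base value.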

Trying all the possible formats would perform worse than trying only 15.
You can try to "normalize" to a single format but then you would be doing the work those 15 formats are supposed to do.
I think the best approach is the one described by @JB Nizet, to try only the patterns that match the string's length.
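import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;

public static TemporalAccessor parse(String openFormat) {
    // Choose only the candidate patterns that can match a string of this length.
    String[] formats = {"uuuu-MM-dd"};
    switch (openFormat.length()) {
        case 24: // e.g. 2019-03-12T13:12:45.1234
            formats = new String[] {"uuuu-MM-dd'T'HH:mm:ss.SSSS"}; // all the formats for length 24
            break;
        // ... one case for each other supported length ...
        case 10: // e.g. 2019-03-12 or 12-03-2019
            formats = new String[] {"uuuu-MM-dd", "dd-MM-uuuu"}; // all the formats for length 10
            break;
    }
    // Now try the reduced number of formats, possibly only 1 or 2.
    for (String format : formats) {
        try {
            return DateTimeFormatter.ofPattern(format)
                    .parseBest(openFormat, LocalDateTime::from, LocalDate::from);
        } catch (DateTimeParseException e) {
            // This pattern did not match; try the next one.
        }
    }
    throw new IllegalArgumentException("Invalid date: " + openFormat);
}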

Related

Create a DateTimeFormater with an Optional Section at Beginning

I have timecodes with the structure hh:mm:ss.SSS, for which I have my own class implementing the Temporal interface.
It has a custom field, TimecodeHour, that allows hour values greater than 23.
I want to parse these with DateTimeFormatter. The hour value is optional (it can be omitted, and hours can be greater than 24); as a regex: (\d*\d\d:)?\d\d:\d\d.\d\d\d
For the purpose of this question my custom field can be replaced with the normal HOUR_OF_DAY field.
My current formatter:
DateTimeFormatter UNLIMITED_HOURS = new DateTimeFormatterBuilder()
.appendValue(ChronoField.HOUR_OF_DAY, 2, 2,SignStyle.NEVER)
.appendLiteral(':')
.parseDefaulting(TimecodeHour.HOUR, 0)
.toFormatter(Locale.ENGLISH);
DateTimeFormatter TIMECODE = new DateTimeFormatterBuilder()
.appendOptional(UNLIMITED_HOURS)
.appendValue(MINUTE_OF_HOUR, 2)
.appendLiteral(':')
.appendValue(SECOND_OF_MINUTE, 2)
.appendFraction(MILLI_OF_SECOND, 3, 3, true)
.toFormatter(Locale.ENGLISH);
Timecodes with an hour value parse as expected, but values with the hours omitted throw an exception:
java.time.format.DateTimeParseException: Text '20:33.123' could not be parsed at index 5
I assume that, as hour and minute have the same pattern, the parser starts at the front and captures the minute value for the optional section.
Is this right, and how can I solve it?

I started to suspect that 20:33.123 wasn’t meant to indicate a time of day between 20 and 21 minutes past midnight. Maybe rather an amount of time, a little longer than 20 minutes. If this is correct, use a Duration for it.
Unfortunately java.time does not include means for parsing and formatting a Duration in other than ISO 8601 format. This leaves us with at least three options:
Use a third-party library. Time4J offers an elegant solution, see below. Joda-Time has its PeriodFormatter class. Apache may also offer facilities for parsing and formatting of durations.
Convert your string to ISO 8601 format before parsing with Duration.parse().
Write your own parser.
I was thinking that we’re too lazy for 3. and that Joda-Time is getting dated, so I want to pursue options 1. and 2. here, option 1. in the Time4J variant.
A regex for adapting to ISO 8601
ISO 8601 format for a duration feels unusual at first, but is straightforward. PT20M33.123S means 20 minutes 33.123 seconds.
public static Duration parse(String timeCodeString) {
String iso8601 = timeCodeString
.replaceFirst("^(\\d{2,}):(\\d{2}):(\\d{2}\\.\\d{3})$", "PT$1H$2M$3S")
.replaceFirst("^(\\d{2}):(\\d{2}\\.\\d{3})$", "PT$1M$2S");
return Duration.parse(iso8601);
}
Let’s try it out:
System.out.println(parse("20:33.123"));
System.out.println(parse("123:20:33.123"));
Output is:
PT20M33.123S
PT123H20M33.123S
My two calls to replaceFirst first handle the case with hours, then the case without hours. So either will convert a string that matches your regex to ISO 8601 format, which the Duration class then parses. And as you can see, Duration also prints ISO 8601 format back. Formatting it differently is not hard, though; search for how.
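For the opposite direction, one possible way to print a Duration back in the timecode layout is plain arithmetic on the millisecond count. This is only a sketch; the class and method names are made up, and it assumes the duration is not negative.

```java
import java.time.Duration;
import java.util.Locale;

final class TimecodeFormat {
    // Formats a non-negative Duration as hh:mm:ss.SSS; hours may exceed 23.
    static String format(Duration duration) {
        long totalMillis = duration.toMillis();
        long hours = totalMillis / 3_600_000;
        long minutes = totalMillis / 60_000 % 60;
        long seconds = totalMillis / 1_000 % 60;
        long millis = totalMillis % 1_000;
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%02d:%02d:%02d.%03d", hours, minutes, seconds, millis);
    }

    public static void main(String[] args) {
        System.out.println(format(Duration.parse("PT20M33.123S")));     // 00:20:33.123
        System.out.println(format(Duration.parse("PT123H20M33.123S"))); // 123:20:33.123
    }
}
```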
Time4J
The Time4J library offers a really elegant solution, very much along the same line of thought as yours. All we really need is this formatter:
private static final Formatter<ClockUnit> TIME_CODE_PARSER
= Duration.formatter(ClockUnit.class, "[###hh:mm:ss.fff][mm:ss.fff]");
Simply use like this:
System.out.println(TIME_CODE_PARSER.parse("20:33.123"));
System.out.println(TIME_CODE_PARSER.parse("123:20:33.123"));
PT20M33,123000000S
PT123H20M33,123000000S
The Time4J Duration class too prints ISO 8601 format. It appears that it uses comma as decimal separator as is preferred in ISO 8601, and that it prints 9 decimals on the seconds also when some of them are 0.
In the format pattern string ###hh means 2 to 5 digit hours, and fff means three digits of decimal fraction of second.
Anything wrong with your approach?
Was there anything wrong with your approach? ChronoField.HOUR_OF_DAY means that: hour of day. 0 is midnight, 12 is noon and 23 is near the end of the day. This is not what you want, so yes, you are using the wrong means. While you can probably get it to work, anyone maintaining your code after you will find it confusing and will probably have a hard time making modification in line with your intentions.
Links
Wikipedia article: ISO 8601
Joda-Time PeriodFormatter
Time4J TimeSpanFormatter

Try with two optional parts (one with hours, other without) like in:
var formatter = new DateTimeFormatterBuilder()
.optionalStart()
.appendValue(HOUR_OF_DAY, 2, 4, SignStyle.NEVER).appendLiteral(":")
.appendValue(MINUTE_OF_HOUR, 2).appendLiteral(":")
.appendValue(SECOND_OF_MINUTE, 2)
.optionalEnd()
.optionalStart()
.parseDefaulting(HOUR_OF_DAY, 0)
.appendValue(MINUTE_OF_HOUR, 2).appendLiteral(":")
.appendValue(SECOND_OF_MINUTE, 2)
.optionalEnd()
.toFormatter(Locale.ENGLISH);
I do not know about TimecodeHour, so I used HOUR_OF_DAY to test (and I was too lazy to include the fractions).

I think fundamentally the problem is that it gets stuck going down the wrong path. It sees a field of length 2, which we know is the minutes but it believes is the hours. Once it believes the optional section is present, when we know it's not, the whole thing is destined to fail.
This is provable by changing the minimum hour length to 3.
.appendValue(TimecodeHour.HOUR, 3, 4, SignStyle.NEVER)
It now knows that the "20" cannot be hours, since hours requires at least 3 digits. With this small change, it now parses correctly, whether the optional section is present or not.
So presuming that the hours field really does need to be between 2 and 4 digits, I think you're stuck with having to implement a workaround. For example, count the number of colons in the string and use a different formatter depending on which one you run into. Using a different delimiter besides a colon for the hours would also work.
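The colon-counting workaround could be sketched roughly like this. As in the question, HOUR_OF_DAY stands in for the custom TimecodeHour field, so hours are limited to 2 digits and 0-23 here; the class name and the two formatter constants are my own invention.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

final class ColonCountParser {
    // hh:mm:ss.SSS (HOUR_OF_DAY is a stand-in for the custom hour field)
    static final DateTimeFormatter WITH_HOURS = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.HOUR_OF_DAY, 2)
            .appendLiteral(':')
            .appendValue(ChronoField.MINUTE_OF_HOUR, 2)
            .appendLiteral(':')
            .appendValue(ChronoField.SECOND_OF_MINUTE, 2)
            .appendFraction(ChronoField.NANO_OF_SECOND, 3, 3, true)
            .toFormatter(Locale.ENGLISH);

    // mm:ss.SSS, with the missing hour defaulted to 0
    static final DateTimeFormatter WITHOUT_HOURS = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.MINUTE_OF_HOUR, 2)
            .appendLiteral(':')
            .appendValue(ChronoField.SECOND_OF_MINUTE, 2)
            .appendFraction(ChronoField.NANO_OF_SECOND, 3, 3, true)
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .toFormatter(Locale.ENGLISH);

    // Picks the formatter from the number of colons, so the parser never
    // has to guess whether the first field is hours or minutes.
    static LocalTime parse(String timecode) {
        long colons = timecode.chars().filter(c -> c == ':').count();
        DateTimeFormatter formatter = colons >= 2 ? WITH_HOURS : WITHOUT_HOURS;
        return LocalTime.parse(timecode, formatter);
    }

    public static void main(String[] args) {
        System.out.println(parse("20:33.123"));
        System.out.println(parse("12:20:33.123"));
    }
}
```

Since the decision is made before parsing, neither formatter needs an optional section at the beginning.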
The parser logic has undergone quite a few bug fixes over the various Java versions since it was introduced - as you can imagine, there are so many potential edge cases - so I was hopeful using a recent version of Java would make this problem disappear. Unfortunately, it seems even in Java 16, the behaviour is still the same.

Java or Scala fast way to parse dates with many different formats using java.time

I would like to have a generic and fast parser for dates that come in arbitrary formats like:
2018
2018-12-31
2018/12/31
2018 dec 31
20181231151617
2018-12-31T15:16:17
2018-12-31T15:16:17.123456
2018-12-31T15:16:17.123456Z
2018-12-31T15:16:17.123456 UTC
2018-12-31T15:16:17.123456+01:00
... so many possibilities
Is there a nice way or a "magic" function to do that?
Currently I am planning to use something like this:
val formatter = new DateTimeFormatterBuilder()
.appendPattern("[yyyy-MM-dd'T'HH:mm:ss]")
.appendPattern("[yyyy-MM-dd]")
.appendPattern("[yyyy]")
// add so many things here
.parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
.parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
.parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
.parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
.parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
.parseDefaulting(ChronoField.MICRO_OF_SECOND, 0)
.toFormatter()
val temporalAccessor = formatter.parse("2018")
val localDateTime = LocalDateTime.from(temporalAccessor)
localDateTime.getHour
val zonedDateTime = ZonedDateTime.of(localDateTime, ZoneId.systemDefault)
val result = Instant.from(zonedDateTime)
But is there a smarter way than specifying hundreds of formats?
Most of the answers I found are outdated (pre Java 8) or do not focus on performance and a lot of different formats.

No, there is no nice/magic way to do this, for two main reasons:
There are variations and ambiguities in data formats that make a generic parser very difficult. e.g. 11/11/11
You are looking for very high performance, which rules out any brute-force methods. 1us per date means only a few thousand instructions to do the full parsing.
At some level you are going to have to specify what formats are valid and how to interpret them. The best way to do this is probably one or more regular expressions that extract the appropriate fields from all the allowable combinations of characters that might form a date, and then much simpler validation of the individual fields.
Here is an example that deals with all dates you listed:
val DateMatch = """(\d\d\d\d)[-/ ]?((?:\d\d)|(?:\w\w\w))?[-/ ]?(\d\d)?T?(\d\d)?:?(\d\d)?:?(\d\d)?[\.]*(\d+)?(.*)?""".r
date match {
  case DateMatch(year, month, day, hour, min, sec, usec, timezone) =>
    // Unmatched optional groups are null, so substitute a default for each one.
    (year, Option(month).getOrElse("1"), Option(day).getOrElse("1"), Option(hour).getOrElse("0"), Option(min).getOrElse("0"), Option(sec).getOrElse("0"), Option(usec).getOrElse("0"), Option(timezone).getOrElse(""))
  case _ =>
    throw new IllegalArgumentException(s"Invalid date: $date")
}
As you can see it is going to get very hairy once all the possible dates are included. But if the regex engine can handle it then it should be efficient because the regex should compile to a state machine that looks at each character once.
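For readers on the Java side, the same extract-then-validate idea might look like the sketch below. It is deliberately cut down: it handles only numeric fields down to seconds, skipping the month names, fractions and time zones from the list above, and all names in it are hypothetical. The regex only extracts the fields; LocalDateTime.of does the value validation.

```java
import java.time.LocalDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexDateParser {
    // Year is mandatory; every later field is optional.
    private static final Pattern DATE = Pattern.compile(
            "(\\d{4})[-/ ]?(\\d{2})?[-/ ]?(\\d{2})?T?(\\d{2})?:?(\\d{2})?:?(\\d{2})?");

    static LocalDateTime parse(String text) {
        Matcher m = DATE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid date: " + text);
        }
        // LocalDateTime.of throws DateTimeException for values such as month 13.
        return LocalDateTime.of(
                Integer.parseInt(m.group(1)),
                field(m, 2, 1), field(m, 3, 1),
                field(m, 4, 0), field(m, 5, 0), field(m, 6, 0));
    }

    // Returns the numeric value of a group, or the default if it did not match.
    private static int field(Matcher m, int group, int defaultValue) {
        String value = m.group(group);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        System.out.println(parse("2018"));
        System.out.println(parse("2018/12/31"));
        System.out.println(parse("20181231151617"));
        System.out.println(parse("2018-12-31T15:16:17"));
    }
}
```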

(Re)Use DateTimeFormatter for parsing date ranges or mix DateTimeFormatter with regex

I have the following String representing a date range which I need to parse:
2018-10-20:2019-10-20
It consists of 2 ISO date strings separated by :
The string can get more complex by having repeated date ranges mixed with other text. This can be done by a Regex.
However, given that the latest Java has Date/Time support that most coders here and elsewhere are ecstatic about, is it possible to use, say, LocalDate's parser or a custom DateTimeFormatter in order to identify the bits in my String which are candidates for ISO-date and capture them?
Better yet, how can I extract the validation regex from a DateTimeFormatter (the regex which identifies an ISO-date, assuming there is one) and merge/compile it with my own regex for the rest of the String.
I just do not feel comfortable coding yet another ISO-date regex in my code when possibly there is already a regex in Java which does that and I just re-use it.
Please note that I am not asking for a regex. I can do that.
Please also note that my example String can contain other date/time formats, e.g. with timezones and milliseconds and all the whistles.

Actually, DateTimeFormatter doesn't have an internal regex. It uses a CompositePrinterParser, which in turn uses an array of DateTimePrinterParser instances (which is an inner interface of DateTimeFormatterBuilder), where each instance is responsible for parsing/formatting a specific field.
IMO, regex is not the best approach here. If you know that all dates are separated by :, why not simply split the string and try to parse the parts individually? Something like that:
String dates = // big string with dates separated by :
DateTimeFormatter parser = // create a formatter for your patterns
for (String s : dates.split(":")) {
parser.parse(s); // if "s" is invalid, it throws exception
}
If you just want to validate the strings, calling parse as above is enough - it'll throw an exception if the string is invalid.
To support multiple formats, you can use DateTimeFormatterBuilder::appendOptional. Example:
DateTimeFormatter parser = new DateTimeFormatterBuilder()
// full ISO8601 with date/time and UTC offset (ex: 2011-12-03T10:15:30+01:00)
.appendOptional(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
// date/time without UTC offset (ex: 2011-12-03T10:15:30)
.appendOptional(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
// just date (ex: 2011-12-03)
.appendOptional(DateTimeFormatter.ISO_LOCAL_DATE)
// some custom format (day/month/year)
.appendOptional(DateTimeFormatter.ofPattern("dd/MM/yyyy"))
// ... add as many you need
// create formatter
.toFormatter();
A regex to support multiple formats (as you said, "other date/time formats, e.g. with timezones and milliseconds and all the whistles") is possible, but a regex is not good at validating the dates: it would have to handle things like day zero, day values above 30 that are not valid for all months, February 29th in non-leap years, minutes above 59, etc.
A DateTimeFormatter will validate all these tricky details, while a regex will only guarantee that you have numbers and separators in the correct position and it won't validate the values. So regardless of the regex, you'll have to parse the dates anyway (which, IMHO, makes the use of regex pretty useless in this case).
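A small sketch of that difference, with made-up class and method names: the regex only checks the shape of the text, while LocalDate.parse (which uses the strict ISO_LOCAL_DATE formatter by default) also checks the values.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

final class ShapeVersusValue {
    private static final Pattern ISO_SHAPE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // True if the text has digits and separators in the right positions.
    static boolean looksLikeDate(String text) {
        return ISO_SHAPE.matcher(text).matches();
    }

    // True only if the text is also a date that exists in the calendar.
    static boolean isRealDate(String text) {
        try {
            LocalDate.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2020-02-29", "2019-02-29", "2018-00-10"}) {
            System.out.println(s + " shape=" + looksLikeDate(s) + " real=" + isRealDate(s));
        }
    }
}
```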

Regex + Date Parser is the right option.
You have to write regex yourself, since the date parser is not using regex.
Your choice if regex can be simple, e.g. \d{2} for month, and let the date parser validate number range, or if it has to be more strict, e.g. (?:0[1-9]|1[0-2]) (01 - 12). Range checks like 28 vs 30 vs 31 days should not be done in regex. Let the date parser handle that, and since some value ranges are handled by date parser, might as well let it handle them all, i.e. a simple regex is perfectly fine.
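As a rough sketch of that combination (names are hypothetical, and only plain ISO dates are handled): a deliberately loose regex finds the candidates in the text, and LocalDate.parse decides whether each one is a valid date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DateRangeExtractor {
    // Loose on purpose: value ranges are left to the date parser.
    private static final Pattern CANDIDATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Returns every ISO date found in the text, in order of appearance.
    // Throws DateTimeParseException if a candidate is not a valid date.
    static List<LocalDate> extractDates(String text) {
        List<LocalDate> dates = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            dates.add(LocalDate.parse(m.group()));
        }
        return dates;
    }

    public static void main(String[] args) {
        System.out.println(extractDates("2018-10-20:2019-10-20"));
    }
}
```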

Formatting date in Java using SimpleDateFormat

I am trying to parse a date into an appropriate format, but I keep getting the error
Unparseable date
Can anyone tell me what the mistake is?
try {
System.out.println(new SimpleDateFormat("d-MMM-Y").parse("05-03-2018").toString());
} catch (ParseException e) {
e.printStackTrace();
}
I want the date to have this format:
05-Mar-18

Since you want to change the format, first read and parse the date (from String) of your own format in a Date type object. Then use that date object by formatting it into a new (desired) format using a SimpleDateFormat.
The error in your code is with the MMM and the Y. MMM is the month as text (such as Mar), while your input has a numeric month. Also, uppercase Y in SimpleDateFormat means week year, not calendar year; use lowercase y (yy or yyyy) for the year.
So here is a code that would fix your problem.
SimpleDateFormat dateFormat = new SimpleDateFormat("d-MM-yyyy");
Date date = dateFormat.parse("05-03-2018");
dateFormat = new SimpleDateFormat("dd-MMM-yy");
System.out.println(dateFormat.format(date));
I hope this is what you're looking for.

There are some concepts about dates you should be aware of.
There's a difference between a date and a text that represents a date.
Example: today's date is March 9th 2018. That date is just a concept, an idea of "a specific point in our calendar system".
The same date, though, can be represented in many formats. It can be "graphical", in the form of a circle around a number in a piece of paper with lots of other numbers in some specific order, or it can be in plain text, such as:
09/03/2018 (day/month/year)
03/09/2018 (month/day/year)
2018-03-09 (ISO8601 format)
March, 9th 2018
9 de março de 2018 (in Portuguese)
2018年3月9日 (in Japanese)
and so on...
Note that the text representations are different, but all of them represent the same date (the same value).
With that in mind, let's see how Java works with these concepts.
a text is represented by a String. This class contains a sequence of characters, nothing more. These characters can represent anything; in this case, it's a date
a date was initially represented by java.util.Date, and then by java.util.Calendar, but those classes are full of problems and you should avoid them if possible. Today we have a better API for that.
With the java.time API (or the respective backport for versions lower than 8), you have easier and more reliable tools to deal with dates.
In your case, you have a String (a text representing a date) and you want to convert it to another format. You must do it in 2 steps:
convert the String to some date-type (transform the text to numerical day/month/year values) - that's called parsing
convert this date-type value to some format (transform the numerical values to text in a specific format) - that's called formatting
For step 1, you can use a LocalDate, a type that represents a date (day, month and year, without hours and without timezone), because that's what your input is:
String input = "05-03-2018";
DateTimeFormatter inputParser = DateTimeFormatter.ofPattern("dd-MM-yyyy");
// parse the input
LocalDate date = LocalDate.parse(input, inputParser);
That's more reliable than SimpleDateFormat because it solves lots of strange bugs and problems of the old API.
Now that we have our LocalDate object, we can do step 2:
// convert to another format
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("dd-MMM-yy", Locale.ENGLISH);
String output = date.format(formatter);
Note that I used a java.util.Locale. That's because the output you want has a month name in English, and if you don't specify a locale, it'll use the JVM's default (and who guarantees it'll always be English? it's better to tell the API which language you're using instead of relying on the default configs, because those can be changed anytime, even by other applications running in the same JVM).
And how do I know which letters must be used in DateTimeFormatter? Well, I've just read the javadoc. Many developers ignore the documentation, but we must create the habit of checking it, especially the javadoc, which tells you things like the difference between uppercase Y and lowercase y in SimpleDateFormat.
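To make the Y versus y difference concrete, here is a short sketch. It uses DateTimeFormatter, where uppercase Y also means week-based year, and picks the last day of 2018, a date where the two values differ. The class and method names are only illustrative.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.IsoFields;
import java.util.Locale;

final class WeekYearPitfall {
    // Lowercase y: the calendar year.
    static String calendarYear(LocalDate date) {
        return date.format(DateTimeFormatter.ofPattern("yyyy", Locale.US));
    }

    // Uppercase Y: the week-based year, using the locale's week definition.
    static String weekBasedYear(LocalDate date) {
        return date.format(DateTimeFormatter.ofPattern("YYYY", Locale.US));
    }

    public static void main(String[] args) {
        LocalDate lastDayOf2018 = LocalDate.of(2018, 12, 31);
        System.out.println(calendarYear(lastDayOf2018));                   // 2018
        System.out.println(weekBasedYear(lastDayOf2018));                  // 2019
        System.out.println(lastDayOf2018.get(IsoFields.WEEK_BASED_YEAR));  // 2019
    }
}
```

December 31st, 2018 already belongs to the first week of 2019, which is why a pattern with Y prints the "wrong" year for a handful of days around New Year.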

How to parse only 4 digit years

I'm using Joda-Time to parse years like this:
private DateTime attemptParse(String pattern, String date) {
DateTimeFormatter parser = DateTimeFormat.forPattern(pattern).withLocale(Locale.ENGLISH);
DateTime parsedDateTime = parser.parseLocalDateTime(date).toDateTime(WET);
return parsedDateTime;
}
I'm trying to parse multiple formats: "yyyy-MM-dd", "yyyy-MMM-dd","yyyy MMM dd-dd","yyyy MMM", (etc), "yyyy". When one doesn't work, I try the next one.
And it works like a charm when the string is indeed only 4 digits (e.g: "2016"). The problem is that I sometimes receive things like this: "201400". And Joda-Time matches this with "yyyy" pattern and returns a date with year 201400.
I wanted to avoid the ugly if to check if year > 9999. Is there any way to do this using Joda-Time?

To parse multiple formats, you can create lots of DateTimeParser instances and join all in one single formatter (instead of trying one after another).
This will require a DateTimeFormatterBuilder, which will also be used to enforce a specific number of digits in the input (unfortunately, there's no way to enforce a specific number of digits like you want using just DateTimeFormat.forPattern()).
First you create lots of org.joda.time.format.DateTimeParser instances (one for each possible pattern):
// only yyyy
DateTimeParser p1 = new DateTimeFormatterBuilder()
// year with exactly 4 digits
.appendYear(4, 4).toParser();
// yyyy-MM-dd
DateTimeParser p2 = new DateTimeFormatterBuilder()
// year with exactly 4 digits
.appendYear(4, 4)
// rest of the pattern
.appendPattern("-MM-dd").toParser();
// yyyy MMM
DateTimeParser p3 = new DateTimeFormatterBuilder()
// year with exactly 4 digits
.appendYear(4, 4)
// rest of the pattern
.appendPattern(" MMM").toParser();
Then you create an array with all these patterns and create a DateTimeFormatter with it:
// create array with all the possible patterns
DateTimeParser[] possiblePatterns = new DateTimeParser[] { p1, p2, p3 };
DateTimeFormatter parser = new DateTimeFormatterBuilder()
// append all the possible patterns
.append(null, possiblePatterns)
// use the locale you want (in case of month names and other locale sensitive data)
.toFormatter().withLocale(Locale.ENGLISH);
I also used Locale.ENGLISH (as you're also using it in your question's code). This locale indicates that the month names will be in English (so MMM can parse values like Jan and Sep). With this, you can parse the inputs:
System.out.println(parser.parseLocalDateTime("2014")); // OK
System.out.println(parser.parseLocalDateTime("201400")); // exception
System.out.println(parser.parseLocalDateTime("2014-10-10")); // OK
System.out.println(parser.parseLocalDateTime("201400-10-10")); // exception
System.out.println(parser.parseLocalDateTime("2014 Jul")); // OK
System.out.println(parser.parseLocalDateTime("201400 Jul")); // exception
When the year is 2014, the code works fine. When it's 201400, it throws a java.lang.IllegalArgumentException, such as:
java.lang.IllegalArgumentException: Invalid format: "201400" is malformed at "00"
DateTimeFormatter is immutable and thread-safe, so you don't need to create it every time your validation method is called. You can create it outside of the method (such as in a static final field).
This is better than creating one formatter every time you perform a validation, and going to the next one when an exception occurs. The formatter created already does it internally, going to the next pattern until it finds one that works (or throwing the exception if all patterns fail).
Java new Date/Time API
Joda-Time is in maintenance mode and is being replaced by the new APIs, so I don't recommend starting a new project with it. Even Joda-Time's website says: "Note that Joda-Time is considered to be a largely “finished” project. No major enhancements are planned. If using Java SE 8, please migrate to java.time (JSR-310).".
If you can't (or don't want to) migrate from Joda-Time to the new API, you can ignore this section.
If you're using Java 8, consider using the new java.time API. It's easier, less buggy and less error-prone than the old APIs.
If you're using Java 6 or 7, you can use the ThreeTen Backport, a great backport for Java 8's new date/time classes. And for Android, you'll also need the ThreeTenABP.
The code below works for both.
The only difference is the package names (in Java 8 is java.time and in ThreeTen Backport (or Android's ThreeTenABP) is org.threeten.bp), but the classes and methods names are the same.
This new API is much more strict than the previous ones, so the formatter only works with the exact number of digits (note that some classes are very similar to Joda-Time):
// 4 digits in year
DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH);
fmt.parse("2014"); // OK
fmt.parse("201400"); // exception
fmt.parse("201"); // exception
This code works with 2014, but with 201400 or 201 (or any other value without exactly 4 digits) it throws an exception:
java.time.format.DateTimeParseException: Text '201400' could not be parsed at index 0
With this, your validation code could work with the array of strings.
There's only one detail: when parsing to a date, Joda-Time sets default values when the input doesn't have some fields (like month becomes January, day becomes 1, hour/minute/second are set to zero, etc).
If you are just validating the input, then you don't need to return anything. Just check if the exception is thrown and you'll know if the input is valid or not.
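A validation-only loop over several patterns might be sketched as follows. The three patterns are taken from the question; the class and method names are invented for the example.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class PatternValidator {
    // Formatters are immutable and thread-safe, so they are created once.
    private static final DateTimeFormatter[] FORMATTERS = {
        DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ENGLISH),
        DateTimeFormatter.ofPattern("yyyy MMM", Locale.ENGLISH),
        DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH),
    };

    // Returns true as soon as one formatter accepts the whole text.
    static boolean isValid(String text) {
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                formatter.parse(text);
                return true;
            } catch (DateTimeParseException e) {
                // Not this pattern; try the next one.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2014", "201400", "2014-10-10", "2014 Jul"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```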
If you just need the year value, though, you can use the Year class:
DateTimeFormatter parser = DateTimeFormatter.ofPattern("yyyy", Locale.ENGLISH);
System.out.println(Year.parse("2014", parser)); // ok
System.out.println(Year.parse("201400", parser)); // exception
If you want the year value as an int:
Year year = Year.parse("2014", parser);
int yearValue = year.getValue(); // 2014
But if you want to get a date object, you'll need to set the default values manually: the new API is very strict and doesn't set those values automatically. You do this with a DateTimeFormatterBuilder.
I also parse it to a LocalDateTime, just as example:
DateTimeFormatter fmt = new DateTimeFormatterBuilder()
// string pattern
.appendPattern("yyyy")
// default month is January
.parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
// default day is 1
.parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
// default hour is zero
.parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
// default minute is zero
.parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
// set locale
.toFormatter(Locale.ENGLISH);
// create LocalDateTime
System.out.println(LocalDateTime.parse("2014", fmt)); // 2014-01-01T00:00
System.out.println(LocalDateTime.parse("201400", fmt)); // exception
You can choose whatever values you want as the default for the fields, and use any of the new available date types.

What you are saying is that Joda-Time should somehow guess that it should parse "201400" as 2014. I don't think that's reasonably within the scope of that library. You should pre-process the data yourself, for example by using:
String normalizedDate = String.format("%.4s", date); // keeps at most the first 4 characters
