When I parse a date with the year 0000 it appears to be stored as the year 0001.
See below for code:
String dateStr = "00000102";
System.out.println(dateStr);
DateFormat dateFormat = new SimpleDateFormat("yyyyMMdd");
Date date = dateFormat.parse("00000102");
String convertedStr = dateFormat.format(date);
System.out.println(convertedStr);
The output is as per below:
00000102
00010102
Is there a way to represent the year 0000 in Java using the standard Java API?
I don't believe it's possible, since java.util.Date is based on UTC, which is based on the Gregorian calendar, and the Gregorian calendar has no year zero.
...the traditional proleptic Gregorian calendar (like the Julian calendar) does not have a year 0 and instead uses the ordinal numbers 1, 2, … both for years AD and BC. Thus the traditional time line is 2 BC, 1 BC, AD 1, and AD 2.
(Source: The Wikipedia article on the Gregorian calendar)
I don't think the calendar is zero-based. Before 1 AD there was 1 BC. No 0.
Also: what kind of application are you building that needs to handle dates from that era? And if you need to cover that period, consider this: "Dates obtained using GregorianCalendar are historically accurate only from March 1, 4 AD onward, when modern Julian calendar rules were adopted. Before this date, leap year rules were applied irregularly, and before 45 BC the Julian calendar did not even exist."
Year 0 does not exist in the Gregorian calendar. From Year 0 at Wikipedia:
"Year zero" does not exist in the widely used Gregorian calendar or in its predecessor, the Julian calendar. Under those systems, the year 1 BC is followed by AD 1.
...
The absence of a year 0 leads to some confusion concerning the boundaries of longer decimal intervals, such as decades and centuries. For example, the third millennium of the Gregorian calendar began on Monday, 1 January, 2001, rather than the widely celebrated Saturday, 1 January, 2000. Likewise, the 20th century began on 1 January 1901.
...
java.time
I recommend that you use java.time, the modern Java date and time API, for your date work.
As others have said, the Julian/Gregorian calendar that the old Date and SimpleDateFormat classes used does not have a year zero. Before year one in the current era (CE also known as AD) came year 1 before the current era (BCE also known as BC).
A side effect of using java.time is that you do get a year zero! That’s right. java.time uses the proleptic Gregorian calendar, a modern invention that not only extends the rules of the Gregorian calendar back into times before the Gregorian calendar was invented, but also includes a year 0 before year 1, and a year -1 (minus one) before that. You may say that year 0 corresponds to 1 BCE, -1 to 2 BCE, and so on.
So parsing your string is no problem. There’s even a built-in formatter for it.
String dateStr = "00000102";
LocalDate date = LocalDate.parse(dateStr, DateTimeFormatter.BASIC_ISO_DATE);
System.out.println("Parsed date is " + date);
String convertedStr = date.format(DateTimeFormatter.BASIC_ISO_DATE);
System.out.println(convertedStr);
Output:
Parsed date is 0000-01-02
00000102
We see that in both output lines year 0000 is printed back as expected.
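To make the mapping between proleptic years and eras concrete, here is a minimal sketch. The class name ProlepticYearDemo and the method describe are made up for illustration; only the java.time calls are real API.

```java
import java.time.LocalDate;
import java.time.chrono.IsoEra;
import java.time.temporal.ChronoField;

class ProlepticYearDemo {

    /** Describes an ISO proleptic year as year-of-era plus era, for example 0 becomes "1 BCE". */
    static String describe(int prolepticYear) {
        // Any day within the year will do; January 2 mirrors the question.
        LocalDate date = LocalDate.of(prolepticYear, 1, 2);
        int yearOfEra = date.get(ChronoField.YEAR_OF_ERA);
        String era = (date.getEra() == IsoEra.BCE) ? "BCE" : "CE";
        return yearOfEra + " " + era;
    }

    public static void main(String[] args) {
        for (int year = -1; year <= 1; year++) {
            System.out.println(year + " -> " + describe(year));
        }
    }
}
```

Proleptic year 0 comes out as 1 BCE and year -1 as 2 BCE, while positive years are unchanged.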
What went wrong in your code?
Since we all agree that there was no year 0, we should have expected your parsing to fail with an exception because of the invalid year. Why didn’t it? It’s one of the many problems with the old SimpleDateFormat class: with default settings it just extrapolates and takes year 0000 to mean the year before year 0001, that is, year 1 BCE, and falsely pretends that all is well. This explains why year 0001 was printed back: it meant year 1 BCE, but since you didn’t print the era too, this was really hard to tell.
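If you want to see the hidden era for yourself, a small sketch along these lines should do it. The class and method names are invented for this example, and UTC is fixed only to keep the result independent of the JVM's default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class EraDemo {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    /** Parses with the default (lenient) settings, exactly like the code in the question. */
    static Date parseLeniently(String yyyyMMdd) {
        SimpleDateFormat parser = new SimpleDateFormat("yyyyMMdd", Locale.ENGLISH);
        parser.setTimeZone(UTC);
        try {
            return parser.parse(yyyyMMdd);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns true if the old API places the date in the BC era. */
    static boolean isBeforeCommonEra(Date date) {
        Calendar cal = new GregorianCalendar(UTC, Locale.ENGLISH);
        cal.setTime(date);
        return cal.get(Calendar.ERA) == GregorianCalendar.BC;
    }

    /** Formats with pattern letter G so the era becomes visible. */
    static String formatWithEra(Date date) {
        SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd G", Locale.ENGLISH);
        formatter.setTimeZone(UTC);
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date date = parseLeniently("00000102");
        System.out.println(formatWithEra(date));
        System.out.println("BC era? " + isBeforeCommonEra(date));
    }
}
```

The formatted year is still 0001, but with the era printed next to it you can tell it is year 1 before the common era rather than year 1 of it.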
Links
Oracle tutorial: Date Time explaining how to use java.time.
Proleptic Gregorian calendar on Wikipedia.
Related
I am trying to build a tool that converts date-times from Java to C#, but I have run into a serious problem.
In Java, I read '0001-01-01' from the SQL Server database via java.sql.Date and get the millisecond value -62135798400000.
I also take the time zone offset into account.
private static long getMilliSecondWithoutTimeZone(long origin) {
return origin + (ZonedDateTime.now().getOffset().getLong(OFFSET_SECONDS) * 1000);
}
And the final millisecond is -62135769600000.
In C#, I use this millisecond value to construct a DateTime:
var ticks = new DateTime(1970, 1, 1).Ticks + (-62135769600000 * 10000);
var date = new DateTime(ticks);
When the code runs, it will throw the exception:
System.ArgumentOutOfRangeException: 'Ticks must be between DateTime.MinValue.Ticks and DateTime.MaxValue.Ticks. (Parameter 'ticks')'
However, according to my tests the conversion is correct for dates after '1600-01-01'.
Before '1600-01-01', the result is always off by a few days.
This confuses me a lot.
I find the remarks in https://learn.microsoft.com/en-us/dotnet/api/system.globalization.juliancalendar?view=net-5.0#remarks
The Gregorian calendar was developed as a replacement for the Julian calendar (which is represented by the JulianCalendar class) and was first introduced in a small number of cultures on October 15, 1582. When working with historic dates that precede a culture's adoption of the Gregorian calendar, you should use the original calendar if it is available in the .NET Framework. For example, Denmark changed from the Julian calendar to the Gregorian calendar on February 19 (in the Julian calendar) or March 1 (in the Gregorian calendar) of 1700. In this case, for dates before the adoption of the Gregorian calendar, you should use the Julian calendar. However, note that no culture offers intrinsic support for the JulianCalendar class. You must use the JulianCalendar class as a standalone calendar. For more information, see Working with calendars.
The actual reason is:
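C# (DateTime) always uses the Gregorian calendar, even for dates before it was introduced (the proleptic Gregorian calendar).
Java's legacy classes (java.util.Date, java.sql.Date) use the Gregorian calendar from October 15, 1582 onward and the Julian calendar before that.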
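A rough sketch of how the two calendar systems disagree, using only the JDK. The class and method names are hypothetical; GregorianCalendar stands in for the hybrid Julian/Gregorian behavior of the legacy classes, and LocalDate for the proleptic Gregorian calendar that C# also assumes.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class CalendarSystemsDemo {

    /** Epoch millis of midnight UTC in the legacy hybrid calendar (Julian before 1582-10-15). */
    static long hybridMillis(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.clear();                                // drop the current time the calendar was created with
        cal.set(year, month - 1, day, 0, 0, 0);     // Calendar months are 0-based
        return cal.getTimeInMillis();
    }

    /** Epoch millis of midnight UTC in the proleptic Gregorian (ISO) calendar. */
    static long isoMillis(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        long dayInMillis = 24L * 60 * 60 * 1000;
        System.out.println((isoMillis(1, 1, 1) - hybridMillis(1, 1, 1)) / dayInMillis + " days apart in year 1");
        System.out.println((isoMillis(1600, 1, 1) - hybridMillis(1600, 1, 1)) / dayInMillis + " days apart in 1600");
    }
}
```

For 0001-01-01 the two values are two days apart; for any date after the 1582 cutover they are identical, which matches the observation in the question.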
The solution:
import java.sql.Date;
import java.time.chrono.IsoChronology;
import java.time.*;
public class Test {
public static Long getMilliSeconds(Date date) {
if (null == date) {
return null;
}
IsoChronology ISO = IsoChronology.INSTANCE;
LocalDate ld = date.toLocalDate();
return ISO.localDateTime(LocalDateTime.of(ld.getYear(), ld.getMonth(), ld.getDayOfMonth(), 0, 0, 0)).toInstant(ZoneOffset.UTC).toEpochMilli();
}
}
It seems like the millisecond value that you mention, -62_135_798_400_000, comes out of an old-fashioned java.sql.Date object created in a timezone that is assumed to be at UTC offset +08:00 back then, perhaps just Etc/GMT-8. With this assumption, the value is historically correct since it was the Julian calendar that was used back then, and Date does use that.
I don’t know the .NET classes that C# uses, but I consider it likely that the error of a few days is caused by them using the proleptic Gregorian calendar, that is, pretending that the Gregorian calendar was used throughout the past even though it didn’t come into existence before 1582. The modern Java date and time API does the same and therefore gives you millisecond values that, for dates that far back, usually differ by a few days.
long milliseconds = LocalDate.of(1, 1, 1)
.atStartOfDay(ZoneOffset.ofHours(8))
.toInstant()
.toEpochMilli();
System.out.format(Locale.ENGLISH, "%,d%n", milliseconds);
Output:
-62,135,625,600,000
It is 48 hours — or 2 days — later than the time you mentioned. See if it solves your issue.
Link
Oracle tutorial: Date Time explaining how to use java.time.
You forgot to account for time zone offset.
If we set the time zone to UTC, you'll see this:
TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
System.out.println(new Date(-62135798400000L));
Output
Fri Dec 31 16:00:00 UTC 1
It is actually year 1 BC, not year 1 AD.
The time 16:00 indicates a time zone offset of 8 hours, so if we change to GMT+8 we get:
TimeZone.setDefault(TimeZone.getTimeZone("GMT+8"));
System.out.println(new Date(-62135798400000L));
Output
Sat Jan 01 00:00:00 GMT+08:00 1
That is correctly year 1 AD.
Which means that you need to adjust the millisecond value by 8 hours, aka 28800000 milliseconds.
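For the date 0001-01-01 00:00 UTC in the Julian calendar that java.util.Date applies to that era, the millisecond value is -62135769600000. The C# DateTime class counts in the proleptic Gregorian calendar, where its minimum, 0001-01-01 00:00, corresponds to -62135596800000, so anything less than that is rejected.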
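To check a millisecond value on the Java side before handing it to C#, something like the following sketch may help. The class and method names are invented; the two constants are the .NET tick values for 1970-01-01 and for DateTime.MaxValue, written here as plain Java longs.

```java
class DotNetTicks {
    /** One .NET tick is 100 nanoseconds, so there are 10,000 ticks per millisecond. */
    static final long TICKS_PER_MILLI = 10_000L;
    /** Value of new DateTime(1970, 1, 1).Ticks in .NET. */
    static final long UNIX_EPOCH_TICKS = 621_355_968_000_000_000L;
    /** Value of DateTime.MaxValue.Ticks in .NET; DateTime.MinValue.Ticks is 0. */
    static final long MAX_TICKS = 3_155_378_975_999_999_999L;

    /** Converts milliseconds since 1970-01-01T00:00Z to .NET ticks, failing loudly on overflow. */
    static long toTicks(long epochMillis) {
        return Math.addExact(UNIX_EPOCH_TICKS, Math.multiplyExact(epochMillis, TICKS_PER_MILLI));
    }

    /** True if new DateTime(ticks) would accept the converted value. */
    static boolean fitsDateTime(long epochMillis) {
        long ticks = toTicks(epochMillis);
        return ticks >= 0 && ticks <= MAX_TICKS;
    }

    public static void main(String[] args) {
        System.out.println(toTicks(-62135769600000L)); // negative, so C# throws ArgumentOutOfRangeException
        System.out.println(toTicks(-62135596800000L)); // 0, which is DateTime.MinValue
    }
}
```

The value from the question lands exactly two days of ticks below zero, which is why the DateTime constructor refuses it.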
Exasol is transforming old dates incorrectly:
SELECT ADD_SECONDS('1970-01-01 00:00:00',-30610224000.000)
-- 0999-12-27 00:00:00
SELECT ADD_SECONDS('1970-01-01 00:00:00',-30609792000.000)
-- 1000-01-01 00:00:00
While in java:
System.out.println(Instant.ofEpochMilli(0).plus(-30610224000L, ChronoUnit.SECONDS));
System.out.println(Instant.ofEpochMilli(0).plus(-30609792000L, ChronoUnit.SECONDS));
1000-01-01T00:00:00Z
1000-01-06T00:00:00Z
Do you know why there is this difference?
Not knowing Exasol I bet it’s the difference between the Julian and the proleptic Gregorian calendar.
The Julian calendar (named after Julius Caesar) always has a leap year every 4 years. At some point in history it was discovered that this gave slightly too many leap years. So under Pope Gregory XIII the Gregorian calendar was introduced, leaving out leap years for years that are divisible by 100 but not by 400 (so 1900 was not a leap year, 2000 was and 2100 will not be). The change is known as the Gregorian change or changeover.
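The two leap year rules can be written down in a few lines. This is only an illustration with made-up names, not code from any library.

```java
class LeapYearRules {

    /** Julian rule: every fourth year is a leap year, without exception. */
    static boolean isJulianLeapYear(int year) {
        return Math.floorMod(year, 4) == 0;
    }

    /** Gregorian rule: century years are leap years only if divisible by 400. */
    static boolean isGregorianLeapYear(int year) {
        return Math.floorMod(year, 4) == 0
                && (Math.floorMod(year, 100) != 0 || Math.floorMod(year, 400) == 0);
    }

    public static void main(String[] args) {
        for (int year : new int[] {1000, 1900, 2000, 2100}) {
            System.out.println(year + ": Julian " + isJulianLeapYear(year)
                    + ", Gregorian " + isGregorianLeapYear(year));
        }
    }
}
```

Every century year where the two rules disagree moves the calendars one more day apart, which is where the offset of several days for old dates comes from.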
The proleptic Gregorian calendar is a newer invention. It extrapolates the Gregorian calendar back into times before the Gregorian change, thus using dates that are in disagreement with the calendar actually used back in those times. The advantages being that calculations are simpler, and we are free from deciding when to make the Gregorian change, which is nice since each jurisdiction had their own date for that.
Instant and the other classes from java.time use the proleptic Gregorian calendar, so they give dates for years 999 and 1000 that disagree with the calendar in use at the time. If Exasol uses the Julian calendar (which I do not know), this could be the explanation for the differences you observed.
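One way to test the Julian hypothesis without Exasol at hand is to let the old GregorianCalendar, which applies Julian rules before 1582, interpret the same second counts. This is a sketch with invented names, and it says nothing about what Exasol does internally.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class JulianVersusIso {

    /** The date in UTC according to java.time (proleptic Gregorian calendar). */
    static String isoDate(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds)
                .atOffset(ZoneOffset.UTC)
                .toLocalDate()
                .toString();
    }

    /** The date in UTC according to GregorianCalendar, which uses Julian rules before 1582-10-15. */
    static String julianDate(long epochSeconds) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.setTimeInMillis(Math.multiplyExact(epochSeconds, 1000L));
        return String.format(Locale.ROOT, "%04d-%02d-%02d",
                cal.get(Calendar.YEAR),
                cal.get(Calendar.MONTH) + 1,     // Calendar months are 0-based
                cal.get(Calendar.DAY_OF_MONTH));
    }

    public static void main(String[] args) {
        for (long seconds : new long[] {-30610224000L, -30609792000L}) {
            System.out.println(seconds + ": ISO " + isoDate(seconds) + ", Julian " + julianDate(seconds));
        }
    }
}
```

The Julian column reproduces the two dates from the Exasol queries, and the ISO column reproduces the java.time output, so the five-day gap is consistent with a Julian versus proleptic Gregorian difference.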
Links
Julian calendar
Proleptic Gregorian calendar
Related question: Java: How to get the timestamp of '1000-01-01 00:00:00' in UTC?
As per https://en.wikipedia.org/wiki/ISO_8601#Week_dates, weeks start on Monday. But in Java, if you extract the week number in two different ways, you get two different results when the date is a Sunday.
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
public class TestDate {
public static void main(String[] args) throws ParseException {
final SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd");
final SimpleDateFormat weekFormatter = new SimpleDateFormat("ww");
String date = "2018-10-21";
System.out.println(weekFormatter.format(formatter.parse(date)));
Calendar calendar = Calendar.getInstance();
calendar.setFirstDayOfWeek(Calendar.MONDAY);
calendar.setTime(formatter.parse(date));
System.out.println(calendar.get(Calendar.WEEK_OF_YEAR));
}
}
Output:
43
42
Is this an inconsistency?
This is just a test program I wrote to reproduce the issue. I originally noticed the problem in Hive, as in the following:
0: jdbc:hive2://zk0-something> select from_unixtime(t, 'ww'), weekofyear(from_unixtime(t, 'yyyy-MM-dd')) from (select 1540122033 as t) a;
+------+------+--+
| _c0 | _c1 |
+------+------+--+
| 43 | 42 |
+------+------+--+
1 row selected (0.388 seconds)
0: jdbc:hive2://zk0-something>
java.time
String date = "2018-10-21";
LocalDate ld = LocalDate.parse(date);
int weekOfYear = ld.get(WeekFields.ISO.weekOfYear());
System.out.println(weekOfYear);
Output:
42
Since you are interested in the ISO 8601 rules for week numbers, use WeekFields.ISO for getting week related data from a LocalDate. You may also use a formatter if you like:
DateTimeFormatter weekFormatter = DateTimeFormatter.ofPattern("ww", Locale.FRANCE);
System.out.println(ld.format(weekFormatter));
Output is the same:
42
The locale passed to DateTimeFormatter.ofPattern determines the week scheme. If I pass Locale.US instead, I get 43.
I recommend you use java.time, the modern Java date and time API, and stay away from the old date-time classes like SimpleDateFormat and Calendar. The old ones were poorly designed and the modern ones are much nicer to work with.
What went wrong in your code?
Both the outdated SimpleDateFormat class and the modern DateTimeFormatter take their week numbering scheme from their locale. If no locale is specified for the formatter, it uses the default locale of the JVM. So if the JVM has American locale, for example, the formatter will print 43 in your first example because in the US Sunday October 21 this year was in week 43. If the locale is French, it will print 42 because that day was in week 42 in France. France follows the ISO 8601 standard, the USA does not.
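If you would rather not depend on locale data at all, the two week schemes can be named explicitly. A small sketch with invented class and method names; WeekFields.SUNDAY_START is the JDK's constant for weeks that start on Sunday with a minimum of 1 day in week 1, which is the scheme described above for the US.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;

class WeekSchemes {

    /** ISO 8601 week number: weeks start on Monday, week 1 has at least 4 days. */
    static int isoWeek(LocalDate date) {
        return date.get(WeekFields.ISO.weekOfWeekBasedYear());
    }

    /** Week number where weeks start on Sunday and week 1 has at least 1 day. */
    static int sundayStartWeek(LocalDate date) {
        return date.get(WeekFields.SUNDAY_START.weekOfWeekBasedYear());
    }

    public static void main(String[] args) {
        LocalDate sunday = LocalDate.of(2018, 10, 21);
        System.out.println("ISO week: " + isoWeek(sunday));
        System.out.println("Sunday-start week: " + sundayStartWeek(sunday));
    }
}
```

For Sunday 2018-10-21 the two schemes give 42 and 43, while for the Saturday before they agree on 42.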
In your example, setting the Calendar’s first day of week to Monday causes the week number to be 42 as you had expected. This will not always be the case, however. Week numbers are defined not only by the first day of the week but also by the definition of week 1. From your link:
The first ISO week of a year may have up to three days that are
actually in the Gregorian calendar year that is ending; if they are
Monday, Tuesday and Wednesday. Similarly, the last ISO week of a year
may have up to three days that are actually in the Gregorian calendar
year that is starting; if they are Friday, Saturday, and Sunday. The
Thursday of each ISO week is always in the Gregorian calendar year
denoted by the ISO week-numbering year.
The American definition of which week is week 1 is different: in the US, January 1 is always in week 1. Therefore, if your Calendar is created with an American locale, setting its first day of week to Monday is not enough to make it follow ISO 8601 rules. Coincidentally, for 2018 the week numbers agree, though.
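Here is a sketch of what the second setting does on the old Calendar class. The names are made up; setMinimalDaysInFirstWeek is the real Calendar method, with 4 being the ISO 8601 value and 1 the "January 1 is always in week 1" rule.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class WeekOneDefinition {

    /** Week of year for a Monday-based calendar with the given minimum size of week 1. */
    static int weekOfYear(int year, int month, int day, int minimalDaysInFirstWeek) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.clear();
        cal.setFirstDayOfWeek(Calendar.MONDAY);
        cal.setMinimalDaysInFirstWeek(minimalDaysInFirstWeek);
        cal.set(year, month - 1, day);           // Calendar months are 0-based
        return cal.get(Calendar.WEEK_OF_YEAR);
    }

    public static void main(String[] args) {
        // Saturday 2016-01-02: the year began on a Friday, so the first week is a short one.
        System.out.println("minimal days 1: week " + weekOfYear(2016, 1, 2, 1));
        System.out.println("minimal days 4: week " + weekOfYear(2016, 1, 2, 4));
    }
}
```

With a minimum of 1 day, January 2, 2016 is in week 1; with the ISO minimum of 4 days it still belongs to week 53 of 2015. For 2018, which began on a Monday, both settings give the same numbers.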
I'm trying to get the sunday of the same week as a given date.
During this I ran into this problem:
Calendar calendar = Calendar.getInstance(Locale.GERMANY);
calendar.set(2017, 11, 11);
calendar.set(Calendar.DAY_OF_WEEK, Calendar.SUNDAY);
System.out.println(calendar.getTime().toString());
results in "Sun Jan 07 11:18:42 CET 2018"
but
Calendar calendar2 = Calendar.getInstance(Locale.GERMANY);
calendar2.set(2017, 11, 11);
calendar2.getTime();
calendar2.set(Calendar.DAY_OF_WEEK, Calendar.SUNDAY);
System.out.println(calendar2.getTime().toString());
gives me the correct Date "Sun Dec 17 11:18:42 CET 2017"
Can someone explain why the first example behaves this way? Is this really intended?
Thanks
Basically, the Calendar API is horrible, and should be avoided. It's not documented terribly clearly, but I think I see where it's going, and it's behaving as intended in this situation. By that I mean it's following the intention of the API authors, not the intention of you or anyone reading your code...
From the documentation:
The calendar field values can be set by calling the set methods. Any field values set in a Calendar will not be interpreted until it needs to calculate its time value (milliseconds from the Epoch) or values of the calendar fields. Calling the get, getTimeInMillis, getTime, add and roll involves such calculation.
And then:
When computing a date and time from the calendar fields, there may be insufficient information for the computation (such as only year and month with no day of month), or there may be inconsistent information (such as Tuesday, July 15, 1996 (Gregorian) -- July 15, 1996 is actually a Monday). Calendar will resolve calendar field values to determine the date and time in the following way.
If there is any conflict in calendar field values, Calendar gives priorities to calendar fields that have been set more recently. The following are the default combinations of the calendar fields. The most recent combination, as determined by the most recently set single field, will be used.
For the date fields:
YEAR + MONTH + DAY_OF_MONTH
YEAR + MONTH + WEEK_OF_MONTH + DAY_OF_WEEK
YEAR + MONTH + DAY_OF_WEEK_IN_MONTH + DAY_OF_WEEK
YEAR + DAY_OF_YEAR
YEAR + DAY_OF_WEEK + WEEK_OF_YEAR
In the first example, the fact that the last field set was "day of week" means it will then use the YEAR + MONTH + WEEK_OF_MONTH + DAY_OF_WEEK calculation (I think). The year and month have been set to December 2017, but the week-of-month is the current week-of-month, which is the week 5 of January 2018... so when you then say to set the day of week to Sunday, it's finding the Sunday in the "week 5" of December 2017. December only had 4 weeks, so it's effectively rolling it forward... I think. It's all messy and you shouldn't have to think about that, basically.
In the second example, calling getTime() "locks in" the year/month/day you've specified, and computes the other fields. When you set the day of week, that's then adjusting it within the existing computed fields.
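If you are stuck with Calendar, the "lock in" trick from the second example can be wrapped up as below. This is a sketch with invented names that relies on the behavior described above; the first day of the week is set explicitly to Monday (as in Germany) so the result does not depend on locale data.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

class SundayWithCalendar {

    /** Returns the Sunday of the Monday-based week containing the given date, as yyyy-MM-dd. */
    static String sundayOfSameWeek(int year, int month, int day) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.GERMANY);
        cal.setFirstDayOfWeek(Calendar.MONDAY);
        cal.set(year, month - 1, day);   // Calendar months are 0-based
        cal.getTime();                   // forces all fields to be recomputed from year/month/day
        cal.set(Calendar.DAY_OF_WEEK, Calendar.SUNDAY);
        return String.format(Locale.ROOT, "%04d-%02d-%02d",
                cal.get(Calendar.YEAR),
                cal.get(Calendar.MONTH) + 1,
                cal.get(Calendar.DAY_OF_MONTH));
    }

    public static void main(String[] args) {
        System.out.println(sundayOfSameWeek(2017, 12, 11));
    }
}
```

Without the intermediate getTime() call the week fields would still reflect the moment the calendar was created, which is the trap described in the first example.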
Basically, avoid this API as far as you possibly can. Use java.time, which is a far cleaner date/time API.
As Jon Skeet said, avoid Calendar. For your case it is truly horrible, and it’s poorly designed in general. Instead do
WeekFields weekFieldsForLocale = WeekFields.of(Locale.GERMANY);
// To find out which number Sunday has in the locale,
// grab any Sunday and get its weekFieldsForLocale.dayOfWeek()
int dayNumberOfSundayInLocale = LocalDate.now()
.with(TemporalAdjusters.nextOrSame(DayOfWeek.SUNDAY))
.get(weekFieldsForLocale.dayOfWeek());
LocalDate date = LocalDate.of(2017, Month.DECEMBER, 11);
LocalDate sunday
= date.with(weekFieldsForLocale.dayOfWeek(), dayNumberOfSundayInLocale);
System.out.println(sunday);
This prints the expected date
2017-12-17
As others have already mentioned, the solution is to use java.time, the modern Java date and time API. It is also generally much nicer to work with. One nice feature is the LocalDate class that I am using. It is a date without time of day, which seems to match your requirements more precisely than Calendar did.
If the above looks complicated, it’s because, as I think you are aware, “Sunday of the same week” means different things in different locales. In the international standard that Germany follows, weeks begin on Monday, so Sunday is the last day of the week. In the American standard, for example, Sunday is the first day of the week. WeekFields.dayOfWeek() numbers the days of the week from 1 to 7, so when we want to set the day to Sunday, we first need to find out which number Sunday has in this numbering (7 in Germany, 1 in the US). So for any Sunday, get its weekFieldsForLocale.dayOfWeek() value and later use this for setting the day of week to Sunday. The reason why this is necessary is that the with() method is so general and therefore has been designed to accept only numeric values; we can’t just pass it a DayOfWeek object.
If I substitute Locale.US into the code, I get 2017-12-10, which is the correct Sunday for a calendar where Sunday is the first day of the week. If you are sure you only want your code to work for Germany, you may of course just hardcode a 7 (please make it a constant with a very explanatory name).
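An alternative sketch that avoids looking up Sunday's number: step back to the first day of the week, then forward to the next Sunday. The class and method names are invented, and the week definitions are passed in explicitly (WeekFields.ISO for the German convention, WeekFields.SUNDAY_START for the American one) instead of being derived from a Locale.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.time.temporal.WeekFields;

class SundayOfWeek {

    /** Returns the Sunday that falls in the same week as the date, for the given week definition. */
    static LocalDate sundayOfSameWeek(LocalDate date, WeekFields weekFields) {
        // Go back to the first day of the week that contains the date ...
        LocalDate weekStart = date.with(
                TemporalAdjusters.previousOrSame(weekFields.getFirstDayOfWeek()));
        // ... and from there forward to the Sunday of that week.
        return weekStart.with(TemporalAdjusters.nextOrSame(DayOfWeek.SUNDAY));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2017, 12, 11);
        System.out.println(sundayOfSameWeek(date, WeekFields.ISO));
        System.out.println(sundayOfSameWeek(date, WeekFields.SUNDAY_START));
    }
}
```

For Monday 2017-12-11 this gives December 17 with Monday-based weeks and December 10 with Sunday-based weeks, the same results as the code above.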
Link: Oracle Tutorial Date Time explaining how to use java.time. There are other resources on the net (just avoid the outdated places that suggest java.util.Calendar :-)
Consider this code:
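// Assumed definitions: the original snippet did not show formatter and dateFormatter
SimpleDateFormat formatter = new SimpleDateFormat("MMddyyyy");
DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern("MMddyyyy");
Date date = formatter.parse("01011500");
LocalDate localDateRight = LocalDate.parse(formatter.format(date), dateFormatter);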
LocalDate localDateWrong = LocalDateTime.ofInstant(date.toInstant(), ZoneId.systemDefault()).toLocalDate();
System.out.println(date); // Wed Jan 01 00:00:00 EST 1500
System.out.println(localDateRight); // 1500-01-01
System.out.println(localDateWrong); // 1500-01-10
I know that 1582 is the cutoff between the Julian and Gregorian calendars. What I don't know is why this happens, or how to adjust for it.
Here's what I've figured out so far:
The Date object has a BaseCalendar set to JulianCalendar
date.toInstant() just returns Instant.ofEpochMilli(getTime())
date.getTime() returns -14830974000000
-14830974000000 is Wed, 10 Jan 1500 05:00:00 GMT Gregorian
So it seems like either the millis returned by getTime() is wrong (unlikely) or just different than I expect and I need to account for the difference.
LocalDate handles the proleptic Gregorian calendar only. From its javadoc:
The ISO-8601 calendar system is the modern civil calendar system used
today in most of the world. It is equivalent to the proleptic
Gregorian calendar system, in which today's rules for leap years are
applied for all time. For most applications written today, the
ISO-8601 rules are entirely suitable. However, any application that
makes use of historical dates, and requires them to be accurate will
find the ISO-8601 approach unsuitable.
In contrast, the old java.util.GregorianCalendar class (which is indirectly also used in the toString() output of java.util.Date) uses a configurable Gregorian cut-off, defaulting to 1582-10-15, as the separation date between Julian and Gregorian calendar rules.
So LocalDate is not usable for any kind of historical dates.
But bear in mind that even java.util.GregorianCalendar often fails, even when configured with the correct region-dependent cut-off date. For example, the UK started the year on March 25th before 1752. And there are many more historical deviations in many countries. Outside of Europe, even the Julian calendar is not usable before the introduction of the Gregorian calendar (or at best usable only from a colonialist perspective).
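If the goal is simply to carry the year, month and day that the old API shows over to a LocalDate, one possible workaround is to copy the calendar fields instead of converting the instant. This is only a sketch: the names are invented, UTC is fixed to keep the result reproducible, and the era field is ignored, so it is not suitable for BC dates.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class HistoricDateConversion {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    /** Parses MMddyyyy with the legacy hybrid Julian/Gregorian calendar, in UTC. */
    static Date parseLegacy(String mmddyyyy) {
        SimpleDateFormat format = new SimpleDateFormat("MMddyyyy", Locale.ROOT);
        format.setTimeZone(UTC);
        try {
            return format.parse(mmddyyyy);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Same point in time, re-expressed in the proleptic Gregorian calendar. */
    static LocalDate sameInstant(Date date) {
        return date.toInstant().atOffset(ZoneOffset.UTC).toLocalDate();
    }

    /** Same year/month/day numbers as the legacy calendar shows (assumes the AD era). */
    static LocalDate sameFields(Date date) {
        GregorianCalendar cal = new GregorianCalendar(UTC, Locale.ROOT);
        cal.setTime(date);
        return LocalDate.of(cal.get(Calendar.YEAR),
                cal.get(Calendar.MONTH) + 1,     // Calendar months are 0-based
                cal.get(Calendar.DAY_OF_MONTH));
    }

    public static void main(String[] args) {
        Date date = parseLegacy("01011500");
        System.out.println("Same instant: " + sameInstant(date));
        System.out.println("Same fields:  " + sameFields(date));
    }
}
```

Both results are "right" in their own calendar: the first names the same moment in proleptic Gregorian terms, the second keeps the Julian date label. Which one you want depends on whether the moment or the label matters to your application.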
UPDATE due to questions in comment:
To explain the value -14830974000000 let's consider following code and its output:
SimpleDateFormat format = new SimpleDateFormat("MMddyyyy", Locale.US);
format.setTimeZone(TimeZone.getTimeZone("America/New_York"));
Date d = format.parse("01011500");
long t1500 = d.getTime();
long tCutOver = format.parse("10151582").getTime();
System.out.println(t1500); // -14830974000000
System.out.println(tCutOver); // default gregorian cut off day in "epoch millis"
System.out.println((tCutOver - t1500) / 1000); // output: 2611699200 = 30228 * 86400
It should be noted that the value -12219292800000L mentioned in your earlier comment differs from tCutOver by 5 hours because of the time zone offset difference between America/New_York and UTC. So in the EST time zone (America/New_York) we have a difference of exactly 30228 days. For the timespan in question we apply the rules of the Julian calendar, that is, every fourth year is a leap year.
Between 1500 and 1582 we have 82 * 365 days + 21 leap days. Then we also have to add 273 days between 1582-01-01 and 1582-10-01, and finally 4 days until the cut-over (remember that the 4th of Oct is followed by the 15th of Oct). In total: 82 * 365 + 21 + 273 + 4 = 30228 (which was to be proved).
Please explain to me why you expected a value different from -14830974000000 ms. It looks correct to me since it handles the time zone offset of your system, the Julian calendar rules before 1582 and the jump from the 4th of Oct 1582 to the cut-over date 1582-10-15. So for me your question "how do I tell the date object to return the ms to the correct Gregorian date?" is already answered: no correction is needed. Keep in mind that this complex stuff has been in production use for a long time and can be expected to work correctly after so many years.
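The day count can also be cross-checked in code. A sketch with invented names; UTC is used on both ends so that the time zone offset cancels out, and the proleptic count from java.time is included only for comparison.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class CutoverDayCount {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Epoch millis of midnight UTC in the hybrid calendar (default cut-over 1582-10-15). */
    private static long hybridMillis(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.clear();
        cal.set(year, month - 1, day);   // Calendar months are 0-based
        return cal.getTimeInMillis();
    }

    /** Days from 1500-01-01 to 1582-10-15 as java.util.GregorianCalendar counts them. */
    static long hybridDays() {
        return (hybridMillis(1582, 10, 15) - hybridMillis(1500, 1, 1)) / MILLIS_PER_DAY;
    }

    /** Days between the same two date labels in the proleptic Gregorian calendar. */
    static long prolepticDays() {
        return ChronoUnit.DAYS.between(LocalDate.of(1500, 1, 1), LocalDate.of(1582, 10, 15));
    }

    public static void main(String[] args) {
        System.out.println("By hand:   " + (82 * 365 + 21 + 273 + 4));
        System.out.println("Hybrid:    " + hybridDays());
        System.out.println("Proleptic: " + prolepticDays());
    }
}
```

The hybrid count agrees with the hand calculation, while the proleptic count is 9 days larger, which is the same 9 days that separate 1500-01-01 from 1500-01-10 in the question.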
If you really want to use JSR-310 for that stuff, I repeat that there is no support for a Gregorian cut-over date. The best you can do is write your own workaround.
For example, you might consider the external library Threeten-Extra, which has contained a proleptic Julian calendar since release 0.9. But it will still be your job to handle the cut-over between the old Julian calendar and the new Gregorian calendar. (And don't expect such libraries to be capable of handling REAL historic dates, for many other reasons such as the start of the new year.)
Update in year 2017: Another, more powerful option would be using HistoricCalendar of my library Time4J, which handles much more than just the Julian/Gregorian cutover.