Monday, April 20, 2009

Local to UTC date conversions: Microsoft vs. Java

I was recently working on a project for a client that involves a Java-based enterprise content management system (FileNet P8). In this version of FileNet, all dates are stored as UTC. That was not true in previous versions, so as part of an upgrade all datetime fields must be converted to UTC before they are put into the new system. Otherwise, FileNet will assume that the datetime you give it, which is probably in local time, is in UTC, and the stored value will be wrong. Later, when you view that datetime in the user interface (called Workplace), it will be converted to your local time (using Java) and thus probably shift by several hours (unless your local time IS UTC). If your datetime was really meant to be just a date, then the time was probably stored as 12:00 AM. If it gets shifted back, the date will change to the prior day... probably NOT what you had expected.
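To make the shift concrete, here is a minimal Java sketch of what happens to a date-only value that was stored at midnight local time, gets treated as UTC, and is then displayed in California. The class and method names (MidnightShiftDemo, displayedLocalTime) are made up for illustration; this is not FileNet code, only the same kind of conversion done with the standard Java classes.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of FileNet.
final class MidnightShiftDemo {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Interprets storedValue as UTC (the assumption the new system makes)
    // and returns how it would be displayed in the viewer's time zone.
    static String displayedLocalTime(String storedValue, String viewerZoneId) {
        SimpleDateFormat asUtc = new SimpleDateFormat(PATTERN, Locale.US);
        asUtc.setTimeZone(TimeZone.getTimeZone("UTC"));

        SimpleDateFormat forViewer = new SimpleDateFormat(PATTERN, Locale.US);
        forViewer.setTimeZone(TimeZone.getTimeZone(viewerZoneId));

        try {
            Date instant = asUtc.parse(storedValue);
            return forViewer.format(instant);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Expected " + PATTERN + ": " + storedValue, e);
        }
    }

    public static void main(String[] args) {
        // A "date only" value that was really midnight in California,
        // loaded without first converting it to UTC.
        System.out.println(displayedLocalTime("2009-04-20 00:00:00", "America/Los_Angeles"));
        // prints 2009-04-19 17:00:00
    }
}
```

The date portion comes back as April 19 instead of April 20, which is the "prior day" problem described above.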

If you use a Microsoft-based program to help you convert, as we are, then you might assume that the .NET conversion function (TimeZone.ToUniversalTime()) would correctly convert your local datetimes to UTC datetimes. If you have dates prior to 2006, however, you would be wrong.

It looks like Microsoft decided on a pattern for daylight saving time and applied it to all time in the past... never mind that daylight saving time did not exist before 1918. The Microsoft daylight saving time conversions are wrong all the way up to 2006. Even FileNet's own FileNet Enterprise Manager, which is a Microsoft MMC snap-in, does the wrong conversion. To be fair, this conversion is very difficult: before we had standard time zones and politicians deciding what time it is, it was up to each locale (state, city, village, etc.) to decide how it would determine the time compared to UTC. How can you determine what UTC time they thought it was in Timbuktu in 1849 when somebody's birth date was recorded? To make it simple (I'm assuming), Microsoft just took the current rules (on some date) and applied those rules to all time, which makes for easy, but inaccurate, conversions.

The Java solution, however, which is really the Unix solution, is to use the Zoneinfo database (also known as the tz or Olson database), which attempts to record all the rules for all the times for all the locales. Although it's probably impossible for this database to ever be completely accurate, it's a lot better than applying a single rule for all time. This database contains the daylight saving time rules for the various locales. Here in California we use the 'America/Los_Angeles' locale.
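As a rough illustration of why per-year rules matter, the sketch below asks Java's bundled time zone data for the 'America/Los_Angeles' offset at noon on the same calendar day in different years. The class and helper names (TzOffsetByYear, offsetHoursAtNoon) are hypothetical, and the whole-hour division is a simplification that only suits zones with whole-hour offsets.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper for inspecting historical offsets.
final class TzOffsetByYear {
    // Returns the UTC offset, in whole hours, in effect at local noon on the given day.
    // month is 1-based here (1 = January); Calendar itself is 0-based.
    static int offsetHoursAtNoon(String zoneId, int year, int month, int day) {
        TimeZone zone = TimeZone.getTimeZone(zoneId);
        Calendar cal = new GregorianCalendar(zone);
        cal.clear();
        cal.set(year, month - 1, day, 12, 0, 0);
        // getOffset(long) applies the rules that were in effect at that instant.
        return zone.getOffset(cal.getTimeInMillis()) / 3_600_000;
    }

    public static void main(String[] args) {
        String zone = "America/Los_Angeles";
        // U.S. daylight time began on the last Sunday in April through 1986
        // and on the first Sunday in April from 1987.
        System.out.println(offsetHoursAtNoon(zone, 1986, 4, 15)); // prints -8
        System.out.println(offsetHoursAtNoon(zone, 1987, 4, 15)); // prints -7
    }
}
```

A single fixed rule applied to every year cannot return both of these answers for April 15, which is the kind of mismatch a historical database avoids.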

To solve this problem, I created a class in VB.NET which wraps a Java .class file. The Java .class file provides a function (using the TimeZone.getTimeZone() and DateFormat.setTimeZone() methods) for converting a local datetime string into a UTC datetime. I then wrote a small program to loop through every day from 1900 to 2009 and calculate the offset (in hours) between local time (in California) and UTC, once using the Java methods and once using the Microsoft function TimeZone.ToUniversalTime(), and saved the data in a database. The results are shown below.

If you look at the graph and read the history below (borrowed from the U.S. Navy), you can see how the Java offset is much more accurate. You can see how daylight saving time was first implemented in 1918 and then repealed. It was then re-established for a few years starting in 1942. What you cannot see in the graph is that even after it was set 'for good' in 1966, the Microsoft offset is wrong for a few days of every year around the 'spring forward' and 'fall back' dates until 2006, at which point the Microsoft functions and Java functions are completely in sync.
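For readers who want to try the Java side, a conversion function along these lines can be sketched with the same two calls mentioned above, TimeZone.getTimeZone() and DateFormat.setTimeZone(). This is not the actual class from the project: the class name, method names, and the "yyyy-MM-dd HH:mm:ss" input format are assumptions made for the example.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch; names and the input format are assumptions.
final class LocalToUtc {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Parses a local datetime string using the historical rules for zoneId.
    // Note: TimeZone.getTimeZone() silently falls back to GMT for unknown IDs.
    private static Date parseLocal(String localDateTime, String zoneId) {
        DateFormat local = new SimpleDateFormat(PATTERN, Locale.US);
        local.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return local.parse(localDateTime);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Expected " + PATTERN + ": " + localDateTime, e);
        }
    }

    // Converts a local datetime string to the equivalent UTC datetime string.
    static String toUtc(String localDateTime, String zoneId) {
        DateFormat utc = new SimpleDateFormat(PATTERN, Locale.US);
        utc.setTimeZone(TimeZone.getTimeZone("UTC"));
        return utc.format(parseLocal(localDateTime, zoneId));
    }

    // Offset of local time from UTC, in hours, at the given local datetime.
    static double offsetHours(String localDateTime, String zoneId) {
        Date instant = parseLocal(localDateTime, zoneId);
        return TimeZone.getTimeZone(zoneId).getOffset(instant.getTime()) / 3_600_000.0;
    }

    public static void main(String[] args) {
        String zone = "America/Los_Angeles";
        System.out.println(toUtc("2009-07-01 12:00:00", zone)); // prints 2009-07-01 19:00:00
        System.out.println(toUtc("2009-01-15 12:00:00", zone)); // prints 2009-01-15 20:00:00
        // A wartime date, to see what the tz data reports for the 1940s.
        System.out.println(offsetHours("1943-07-01 12:00:00", zone));
    }
}
```

Calling offsetHours once per day over a range of years would produce the kind of offset series described above for the Java side.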

 

From http://aa.usno.navy.mil/faq/docs/daylight_time.php

History of Daylight Time in the U.S.

Although standard time in time zones was instituted in the U.S. and Canada by the railroads in 1883, it was not established in U.S. law until the Act of March 19, 1918, sometimes called the Standard Time Act. The act also established daylight saving time, a contentious idea then. Daylight saving time was repealed in 1919, but standard time in time zones remained in law. Daylight time became a local matter. It was re-established nationally early in World War II, and was continuously observed from 9 February 1942 to 30 September 1945. After the war its use varied among states and localities. The Uniform Time Act of 1966 provided standardization in the dates of beginning and end of daylight time in the U.S. but allowed for local exemptions from its observance. The act provided that daylight time begin on the last Sunday in April and end on the last Sunday in October, with the changeover to occur at 2 a.m. local time.

During the "energy crisis" years, Congress enacted earlier starting dates for daylight time. In 1974, daylight time began on 6 January and in 1975 it began on 23 February. After those two years the starting date reverted back to the last Sunday in April. In 1986, a law was passed that shifted the starting date of daylight time to the first Sunday in April, beginning in 1987. The ending date of daylight time was not subject to such changes, and remained the last Sunday in October. The Energy Policy Act of 2005 changed both the starting and ending dates. Beginning in 2007, daylight time starts on the second Sunday in March and ends on the first Sunday in November.

For a very readable account of the history of standard and daylight time in the U.S., see Ian R. Bartky and Elizabeth Harrison: "Standard and Daylight-saving Time", Scientific American, May 1979 (Vol. 240, No. 5), pp. 46-53.

Although the Java solution works and we are currently using it, I've since discovered that there are several pure .NET implementations of conversion functions using the Zoneinfo database, which can be found if you know what to search for (e.g. "Zoneinfo database .net").


